(ed.info.gz) Regular Expressions
5 Regular Expressions
*********************
Regular expressions are patterns used in selecting text. For example,
the `ed' command
     g/STRING/
prints all lines containing STRING. Regular expressions are also used
by the `s' command for selecting old text to be replaced with new text.
In addition to specifying string literals, regular expressions can
represent classes of strings. Strings thus represented are said to be
matched by the corresponding regular expression. If it is possible for a
regular expression to match several strings in a line, then the
left-most longest match is the one selected.
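As a rough illustration of the two points above, the following Python
sketch selects the lines a command like `g/STRING/' would print and
shows which of several possible matches is reported. It is not how `ed'
is implemented: Python's `re' module stands in for the matcher (its
syntax and matching rules differ from those of `ed' in places), and the
helper `global_print' and the sample buffer are hypothetical.

```python
import re

def global_print(buffer, pattern):
    """Return the lines of `buffer` that contain a match for `pattern`."""
    regex = re.compile(pattern)
    return [line for line in buffer if regex.search(line)]

# Hypothetical buffer contents, one string per line.
buffer = ["first line", "a STRING here", "nothing", "STRING and STRING"]
selected = global_print(buffer, "STRING")
for line in selected:
    print(line)

# "xabbb ab" contains more than one match for ab*; the search reports
# the left-most one, extended as far as possible.
leftmost = re.search(r"ab*", "xabbb ab")
print(leftmost.group(), leftmost.span())
```

In this simple case Python's greedy matching picks the same text as the
left-most longest rule; that agreement is not guaranteed for every
pattern.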
The following symbols are used in constructing regular expressions:
`C'
Any character C not listed below, including `{', `}', `(', `)',
`<' and `>', matches itself.
`\C'
Any backslash-escaped character C, other than `{', `}', `(', `)',
`<', `>', `b', `B', `w', `W', `+' and `?', matches itself.
`.'
Matches any single character.
`[CHAR-CLASS]'
Matches any single character in CHAR-CLASS. To include a `]' in
CHAR-CLASS, it must be the first character. A range of characters
may be specified by separating the end characters of the range
with a `-', e.g., `a-z' specifies the lower case characters. The
following literal expressions can also be used in CHAR-CLASS to
specify sets of characters:
     [:alnum:]  [:cntrl:]  [:lower:]  [:space:]
     [:alpha:]  [:digit:]  [:print:]  [:upper:]
     [:blank:]  [:graph:]  [:punct:]  [:xdigit:]
If `-' appears as the first or last character of CHAR-CLASS, then
it matches itself. All other characters in CHAR-CLASS match
themselves.
Patterns in CHAR-CLASS of the form:
     [.COL-ELM.]
     [=COL-ELM=]
where COL-ELM is a "collating element" are interpreted according
to `locale (5)'. See `regex (3)' for an explanation of these
constructs.
`[^CHAR-CLASS]'
Matches any single character, other than newline, not in
CHAR-CLASS. CHAR-CLASS is defined as above.
`^'
If `^' is the first character of a regular expression, then it
anchors the regular expression to the beginning of a line.
Otherwise, it matches itself.
`$'
If `$' is the last character of a regular expression, it anchors
the regular expression to the end of a line. Otherwise, it matches
itself.
`\(RE\)'
Defines a (possibly null) subexpression RE. Subexpressions may be
nested. A subsequent backreference of the form `\N', where N is a
number in the range [1,9], expands to the text matched by the Nth
subexpression. For example, the regular expression `\(a.c\)\1'
matches the string `abcabc', but not `abcadc'. Subexpressions are
ordered relative to their left delimiter.
`*'
Matches the single character regular expression or subexpression
immediately preceding it zero or more times. If `*' is the first
character of a regular expression or subexpression, then it matches
itself. The `*' operator sometimes yields unexpected results. For
example, the regular expression `b*' matches the beginning of the
string `abbb', as opposed to the substring `bbb', since a null
match is the only left-most match.
`\{N,M\}'
`\{N,\}'
`\{N\}'
Matches the single character regular expression or subexpression
immediately preceding it at least N and at most M times. If M is
omitted, then it matches at least N times. If the comma is also
omitted, then it matches exactly N times. If any of these forms
occurs first in a regular expression or subexpression, then it is
interpreted literally (i.e., the regular expression `\{2\}'
matches the string `{2}', and so on).
`\<'
`\>'
Anchors the single character regular expression or subexpression
immediately following it to the beginning (in the case of `\<') or
ending (in the case of `\>') of a "word", i.e., in ASCII, a
maximal string of alphanumeric characters, including the
underscore (_).
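Several of the operators above can be tried out with the following
Python sketch. It is an approximation, not `ed' itself: Python's `re'
module writes grouping as `(RE)' rather than `\(RE\)' and intervals as
`{N,M}' rather than `\{N,M\}', and it does not accept the `[:alpha:]'
style class names, so those are not shown. The variable names and
sample strings are only illustrations.

```python
import re

# ed: \(a.c\)\1    Python spelling: (a.c)\1
backref = re.compile(r"(a.c)\1")
backref_hit = backref.search("abcabc") is not None   # second copy repeats "abc"
backref_miss = backref.search("abcadc") is not None  # "adc" is not a repeat of "abc"

# ed: b*    The left-most match in "abbb" is the null string at offset 0.
star_span = re.search(r"b*", "abbb").span()

# ed: ab\{2,3\}    Python spelling: ab{2,3}
interval = re.compile(r"ab{2,3}")
interval_hit = interval.search("abbbb").group()  # at most three b's are taken
interval_miss = interval.search("ab")            # fewer than two b's: no match

# Character class with `]' first and `-' last, so both match themselves.
char_class = re.compile(r"[]a-c-]")
class_hits = char_class.findall("]xb-z")

print(backref_hit, backref_miss)
print(star_span)
print(interval_hit, interval_miss)
print(class_hits)
```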
The following extended operators are preceded by a backslash `\' to
distinguish them from traditional `ed' syntax.
`\`'
`\''
Unconditionally matches the beginning `\`' or ending `\'' of a
line.
`\?'
Optionally matches the single character regular expression or
subexpression immediately preceding it. For example, the regular
expression `a[bd]\?c' matches the strings `abc', `adc' and `ac'.
If `\?' occurs at the beginning of a regular expression or
subexpression, then it matches a literal `?'.
`\+'
Matches the single character regular expression or subexpression
immediately preceding it one or more times. So the regular
expression `a\+' is shorthand for `aa*'. If `\+' occurs at the
beginning of a regular expression or subexpression, then it
matches a literal `+'.
`\b'
Matches the beginning or ending (null string) of a word. Thus the
regular expression `\bhello\b' is equivalent to `\<hello\>'.
However, `\b\b' is a valid regular expression whereas `\<\>' is
not.
`\B'
Matches (a null string) inside a word.
`\w'
Matches any character in a word.
`\W'
Matches any character not in a word.
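The extended operators can be explored with the Python sketch below.
It is an analogy under stated assumptions, not a reproduction of `ed':
Python's `re' module spells `\?' and `\+' without the backslash, and
the `re.ASCII' flag is used so that a "word" is limited to ASCII
alphanumerics and the underscore, as described above. The helper
`span_of' and the sample strings are hypothetical.

```python
import re

def span_of(pattern, text):
    """Return the (start, end) of the first match, or None if there is none."""
    m = re.search(pattern, text, re.ASCII)
    return m.span() if m else None

# ed: a[bd]\?c    Python spelling: a[bd]?c
optional = re.compile(r"a[bd]?c")
optional_hits = [s for s in ["abc", "adc", "ac", "aec"] if optional.fullmatch(s)]

# ed: a\+ versus aa*    Both should select the same text in every sample.
samples = ["", "a", "baaad", "xyz"]
plus_equals_star = all(span_of(r"a+", s) == span_of(r"aa*", s) for s in samples)

# \b matches the null string at either edge of a word.
word_hit = span_of(r"\bhello\b", "say hello there")
word_miss_prefix = span_of(r"\bhello\b", "othello")      # no word start before "hello"
word_miss_suffix = span_of(r"\bhello\b", "hello_world")  # `_' continues the word

# \B matches the null string inside a word.
inside_hit = span_of(r"\Bell\B", "hello")
inside_miss = span_of(r"\Bell\B", "ell")

# \w and \W split characters into word and non-word characters.
word_chars = re.findall(r"\w", "a_1 -?", re.ASCII)
non_word_chars = re.findall(r"\W", "a_1 -?", re.ASCII)

print(optional_hits, plus_equals_star)
print(word_hit, word_miss_prefix, word_miss_suffix)
print(inside_hit, inside_miss)
print(word_chars, non_word_chars)
```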