(gawk.info.gz) Leftmost Longest
3.7 How Much Text Matches?
==========================
Consider the following:
     echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
This example uses the `sub()' function (which we haven't discussed
yet; see String Functions) to make a change to the input record.
Here, the regexp `/a+/' indicates "one or more `a' characters," and the
replacement text is `<A>'.
The input contains four `a' characters. `awk' (and POSIX) regular
expressions always match the leftmost, _longest_ sequence of input
characters that can match. Thus, all four `a' characters are replaced
with `<A>' in this example:
     $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
     -| <A>bcd
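The rule has two parts, and a small sketch can show each one. It assumes a POSIX-compatible `awk' on the path; the shell variable names `longest' and `leftmost' are illustrative only. The first command uses `match()' to report where the match starts and how long it is. The second shows that "leftmost" takes priority over "longest": the earlier, shorter run of `a' characters is the one replaced, even though a longer run follows.

```shell
# match() sets RSTART (start position) and RLENGTH (match length).
# The match begins at character 1 and covers all four "a" characters.
longest=$(echo aaaabcd | awk '{ if (match($0, /a+/)) print RSTART, RLENGTH }')
echo "$longest"     # 1 4

# Leftmost wins over longest: the two-character run at position 2 is
# replaced, not the later four-character run.
leftmost=$(echo xaabaaaa | awk '{ sub(/a+/, "<A>"); print }')
echo "$leftmost"    # x<A>baaaa
```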
For simple match/no-match tests, this is not so important. But when
doing text matching and substitutions with the `match()', `sub()',
`gsub()', and `gensub()' functions, it is very important. See String
Functions for more information on these functions. Understanding
this principle is also important for regexp-based record and field
splitting (see Records, and also Field Separators).
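As a rough illustration of why this matters for substitution and field splitting, the following sketch assumes a POSIX-compatible `awk'; the input strings and shell variable names are made up for the example. Because each match takes the longest possible run, `gsub()' makes one replacement per run rather than one per character, and a regexp field separator such as `:+' treats a run of colons as a single separator.

```shell
# gsub() returns the number of replacements; each run of "a" counts once.
counted=$(echo "aaa b aa" | awk '{ n = gsub(/a+/, "<A>"); print n, $0 }')
echo "$counted"        # 2 <A> b <A>

# With the regexp separator ":+", the three colons match as one separator,
# so the record has two fields.
fields_run=$(echo "a:::b" | awk -F':+' '{ print NF }')
echo "$fields_run"     # 2

# With the single-character separator ":", each colon separates fields,
# which leaves two empty fields in the middle.
fields_single=$(echo "a:::b" | awk -F':' '{ print NF }')
echo "$fields_single"  # 4
```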