(gawk.info.gz) Using Constant Regexps

 
 5.2 Using Regular Expression Constants
 ======================================
 
 When used on the right-hand side of the `~' or `!~' operators, a regexp
 constant merely stands for the regexp that is to be matched.  However,
 regexp constants (such as `/foo/') may also be used like simple
 expressions.  When a regexp constant appears by itself, it has the same
 meaning as if it appeared in a pattern, i.e., `($0 ~ /foo/)' (d.c.).  See
 Expression Patterns.  This means that the following two code segments:
 
      if ($0 ~ /barfly/ || $0 ~ /camelot/)
          print "found"
 
 and:
 
      if (/barfly/ || /camelot/)
          print "found"
 
 are exactly equivalent.  One rather bizarre consequence of this rule is
 that the following Boolean expression is valid, but does not do what
 the user probably intended:
 
      # note that /foo/ is on the left of the ~
      if (/foo/ ~ $1) print "found foo"
 
 This code appears to test `$1' for a match against the regexp `/foo/'.
 But in fact, the expression `/foo/ ~ $1' means `($0 ~ /foo/) ~ $1'.  In
 other words, first match the input record against the regexp `/foo/'.
 The result is one if the match succeeds and zero if it fails.  That
 result is then matched against the first field in the record, which is
 treated as a regexp.  Because it is unlikely that you would ever really
 want to make this kind of test, `gawk' issues a warning when it sees
 this construct in a program.  Another consequence of this rule is that
 the assignment statement:
 
      matches = /foo/
 
 assigns either zero or one to the variable `matches', depending upon
 the contents of the current input record.  This feature of the language
 was not well documented before the POSIX specification.
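
 As a minimal sketch of the two consequences described above, the
 following shell commands feed three made-up records to `awk'.  The
 input lines and the shell variables `input', `assigned', and `reversed'
 are illustrative assumptions, not part of the original example.  With
 `gawk', the second program also produces the warning mentioned above on
 standard error.

```shell
# Illustrative input (an assumption for this sketch): three records.
input='foo bar
1 foo
baz'

# matches = /foo/ stores the result of ($0 ~ /foo/) for each record.
assigned=$(printf '%s\n' "$input" | awk '{ matches = /foo/; print matches }')

# /foo/ ~ $1 is parsed as ($0 ~ /foo/) ~ $1: the 0-or-1 match result is
# itself matched against $1, which is used as a dynamic regexp.
reversed=$(printf '%s\n' "$input" |
    awk '{ if (/foo/ ~ $1) print "found foo in: " $0 }')

printf '%s\n' "$assigned"
printf '%s\n' "$reversed"
```

 The record `foo bar', whose first field really is `foo', is not
 reported, because the match result `1' does not match the regexp `foo'.
 The record `1 foo' is reported only because the match result `1'
 happens to match its first field, `1'.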
 
    Constant regular expressions are also used as the first argument for
 the `gensub', `sub', and `gsub' functions, and as the second argument
 of the `match' function (see String Functions).  Modern
 implementations of `awk', including `gawk', allow the third argument of
 `split' to be a regexp constant, but some older implementations do not
 (d.c.).  Accepting regexp constants in these built-in functions can lead
 to confusion when attempting to use regexp constants as arguments to
 user-defined functions (see User-defined).  For example:
 
      function mysub(pat, repl, str, global)
      {
          if (global)
              gsub(pat, repl, str)
          else
              sub(pat, repl, str)
          return str
      }
 
      {
          ...
          text = "hi! hi yourself!"
           text = mysub(/hi/, "howdy", text, 1)
          ...
      }
 
    In this example, the programmer wants to pass a regexp constant to
 the user-defined function `mysub', which in turn passes it on to either
 `sub' or `gsub'.  However, what really happens is that the `pat'
 parameter is either one or zero, depending upon whether or not `$0'
 matches `/hi/'.  `gawk' issues a warning when it sees a regexp constant
 used as a parameter to a user-defined function, since passing a truth
 value in this way is probably not what was intended.
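
 One way to get the intended effect, sketched here on the assumption
 that a dynamic regexp is acceptable, is to pass the pattern as a string
 (`"hi"') instead of a regexp constant, and to use the value that the
 function returns.  `mysub' is the function from the example above; the
 shell wrapper, the `BEGIN' rule, and the shell variable `result' are
 illustrative additions.

```shell
result=$(awk '
function mysub(pat, repl, str, global)
{
    # pat holds the string "hi"; sub and gsub treat a string-valued
    # first argument as a dynamic regexp.
    if (global)
        gsub(pat, repl, str)
    else
        sub(pat, repl, str)
    return str
}

BEGIN {
    text = "hi! hi yourself!"
    # Pass "hi" (a string), not /hi/ (a match against $0).
    print mysub("hi", "howdy", text, 1)
}')

printf '%s\n' "$result"
```

 Because `str' is a scalar parameter, it is a local copy; the changed
 text reaches the caller only through the return value.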
 