(gawk.info.gz) Conversion

 5.4 Conversion of Strings and Numbers
 =====================================
 
 Strings are converted to numbers and numbers are converted to strings,
 if the context of the `awk' program demands it.  For example, if the
 value of either `foo' or `bar' in the expression `foo + bar' happens to
 be a string, it is converted to a number before the addition is
 performed.  If numeric values appear in string concatenation, they are
 converted to strings.  Consider the following:
 
      two = 2; three = 3
      print (two three) + 4
 
 This prints the (numeric) value 27.  The numeric values of the
 variables `two' and `three' are converted to strings and concatenated
 together.  The resulting string is converted back to the number 23, to
 which 4 is then added.
 
    If, for some reason, you need to force a number to be converted to a
 string, concatenate the empty string, `""', with that number.  To force
 a string to be converted to a number, add zero to that string.  A
 string is converted to a number by interpreting any numeric prefix of
 the string as numerals: `"2.5"' converts to 2.5, `"1e3"' converts to
 1000, and `"25fix"' has a numeric value of 25.  Strings that can't be
 interpreted as valid numbers convert to zero.
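 
    As a minimal sketch of these rules, the following shell fragment runs
 a short `awk' program and captures its output.  The shell variable name
 `converted' is ours, chosen only for illustration, and plain `awk' is
 used on the assumption that any POSIX `awk' (including `gawk') behaves
 the same way here:

```shell
# Force string-to-number conversion by adding zero, and
# number-to-string conversion by concatenating the empty string.
converted=$(awk 'BEGIN {
    print "2.5" + 0      # numeric string -> 2.5
    print "1e3" + 0      # exponent notation -> 1000
    print "25fix" + 0    # only the numeric prefix counts -> 25
    print "fix25" + 0    # no numeric prefix -> 0
    n = 42
    s = n ""             # concatenating "" yields the string "42"
    print length(s)      # length of "42" -> 2
}')
echo "$converted"
```

 This should print 2.5, 1000, 25, 0, and 2, one value per line.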
 
    The exact manner in which numbers are converted into strings is
 controlled by the `awk' built-in variable `CONVFMT' (see Built-in
 Variables).  Numbers are converted using the `sprintf' function with
 `CONVFMT' as the format specifier (see String Functions).
 
    `CONVFMT''s default value is `"%.6g"', which prints a value with at
 most six significant digits.  For some applications, you might want to
 change it to specify more precision.  On most modern machines, 17
 digits is enough to capture a floating-point number's value exactly,
 most of the time.(1)
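 
    To see the effect of raising the precision, here is a small sketch
 that converts one third to a string twice, once with the default
 `CONVFMT' and once with `"%.17g"'.  The names `short', `precise', and
 `third' are hypothetical, and plain `awk' stands in for `gawk' on the
 assumption that both behave alike here:

```shell
# Default CONVFMT ("%.6g"): at most six significant digits.
short=$(awk 'BEGIN { third = 1 / 3; s = third ""; print s }')

# CONVFMT raised to 17 significant digits before the conversion.
precise=$(awk 'BEGIN { CONVFMT = "%.17g"; third = 1 / 3; s = third ""; print s }')

echo "$short"
echo "$precise"
```

 The first line should be 0.333333.  The second carries 17 significant
 digits, which is 0.33333333333333331 on systems that use IEEE 754
 double precision.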
 
    Strange results can occur if you set `CONVFMT' to a string that
 doesn't tell `sprintf' how to format floating-point numbers in a useful
 way.  For example, if you forget the `%' in the format, `awk' converts
 all numbers to the same constant string.  As a special case, if a
 number is an integer, then the result of converting it to a string is
 _always_ an integer, no matter what the value of `CONVFMT' may be.
 Given the following code fragment:
 
      CONVFMT = "%2.2f"
      a = 12
      b = a ""
 
 `b' has the value `"12"', not `"12.00"'.  (d.c.)
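 
    A runnable variant of this fragment, sketched below, contrasts the
 integer case with a non-integer value under the same `CONVFMT'.  The
 variables `c' and `d' and the shell variable `formatted' are our own
 additions for illustration:

```shell
formatted=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12;   b = a ""    # integer value: CONVFMT is not applied
    c = 12.5; d = c ""    # non-integer value: CONVFMT is applied
    print b
    print d
}')
echo "$formatted"
```

 This should print 12 on the first line and 12.50 on the second.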
 
    Prior to the POSIX standard, `awk' used the value of `OFMT' for
 converting numbers to strings.  `OFMT' specifies the output format to
 use when printing numbers with `print'.  `CONVFMT' was introduced in
 order to separate the semantics of conversion from the semantics of
 printing.  Both `CONVFMT' and `OFMT' have the same default value:
 `"%.6g"'.  In the vast majority of cases, old `awk' programs do not
 change their behavior.  However, these semantics for `OFMT' are
 something to keep in mind if you must port your new-style program to
 older implementations of `awk'.  We recommend that, instead of changing
 your programs, you just port `gawk' itself.  See Print for more
 information on the `print' statement.
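 
    The separation between the two variables can be observed by giving
 them different values, as in this sketch.  The shell variable
 `separate' and the `awk' variables `x' and `s' are hypothetical names,
 and a POSIX-conforming `awk' is assumed:

```shell
separate=$(awk 'BEGIN {
    OFMT = "%.2f"        # used by print for numeric values
    CONVFMT = "%.3f"     # used for number-to-string conversion
    x = 3.14159
    s = x ""             # conversion: uses CONVFMT
    print x              # numeric output: uses OFMT
    print s              # already a string, printed as is
}')
echo "$separate"
```

 With a POSIX `awk' this should print 3.14 and then 3.142.  An older
 implementation that uses `OFMT' for both purposes would be expected to
 print 3.14 twice.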
 
    And, once again, where you are can matter when it comes to converting
 between numbers and strings.  In Locales, we mentioned that the
 local character set and language (the locale) can affect how `gawk'
 matches characters.  The locale also affects numeric formats.  In
 particular, for `awk' programs, it affects the decimal point character.
 The `"C"' locale, and most English-language locales, use the period
 character (`.') as the decimal point.  However, many (if not most)
 European and non-English locales use the comma (`,') as the decimal
 point character.
 
    The POSIX standard says that `awk' always uses the period as the
 decimal point when reading the `awk' program source code, and for
 command-line variable assignments (see Other Arguments).  However,
 when interpreting input data, for `print' and `printf' output, and for
 number-to-string conversion, the local decimal point character is used.
 Here are some examples indicating the difference in behavior under
 these rules, on a GNU/Linux system:
 
      $ gawk 'BEGIN { printf "%g\n", 3.1415927 }'
      -| 3.14159
      $ LC_ALL=en_DK gawk 'BEGIN { printf "%g\n", 3.1415927 }'
      -| 3,14159
      $ echo 4,321 | gawk '{ print $1 + 1 }'
      -| 5
      $ echo 4,321 | LC_ALL=en_DK gawk '{ print $1 + 1 }'
      -| 5,321
 
 The `en_DK' locale is for English in Denmark, where the comma acts as
 the decimal point separator.  In the normal `"C"' locale, `gawk' treats
 `4,321' as `4', while in the Danish locale, it's treated as the full
 number, `4.321'.
 
    For versions 3.1.3 through 3.1.5, `gawk' fully complied with this
 aspect of the standard.  However, many users in non-English locales
 complained about this behavior, since their data used a period as the
 decimal point.  Beginning in version 3.1.6, the default behavior was
 restored to use a period as the decimal point character.  You can use
 the `--use-lc-numeric' option (see Options) to force `gawk' to use
 the locale's decimal point character.  (`gawk' also uses the locale's
 decimal point character when in POSIX mode, either via `--posix', or
 the `POSIXLY_CORRECT' environment variable.)
 
    The following table describes when the locale's decimal point
 character is used and when a period is used.  Some of these features
 have not been described yet.
 
 Feature     Default        `--posix' or `--use-lc-numeric'
 ------------------------------------------------------------ 
 `%'g'       Use locale     Use locale
 `%g'        Use period     Use locale
 Input       Use period     Use locale
 `strtonum'  Use period     Use locale
 
 Table 5.1: Locale Decimal Point versus A Period
 
    Finally, modern-day formal standards and IEEE standard floating-point
 representation can have an unusual but important effect on the way
 `gawk' converts some special string values to numbers.  The details are
 presented in POSIX Floating Point Problems.
 
    ---------- Footnotes ----------
 
    (1) Pathological cases can require up to 752 digits (!), but we
 doubt that you need to worry about this.
 