 6.1.4 Conversion of Strings and Numbers
 ---------------------------------------
 
 Strings are converted to numbers and numbers are converted to strings,
 if the context of the `awk' program demands it.  For example, if the
 value of either `foo' or `bar' in the expression `foo + bar' happens to
 be a string, it is converted to a number before the addition is
 performed.  If numeric values appear in string concatenation, they are
 converted to strings.  Consider the following:
 
      two = 2; three = 3
      print (two three) + 4
 
 This prints the (numeric) value 27.  The numeric values of the
 variables `two' and `three' are converted to strings and concatenated
 together.  The resulting string is converted back to the number 23, to
 which 4 is then added.
 
    If, for some reason, you need to force a number to be converted to a
 string, concatenate that number with the empty string, `""'.  To force
 a string to be converted to a number, add zero to that string.  A
 string is converted to a number by interpreting any numeric prefix of
 the string as numerals: `"2.5"' converts to 2.5, `"1e3"' converts to
 1000, and `"25fix"' has a numeric value of 25.  Strings that can't be
 interpreted as valid numbers convert to zero.
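
    The two forcing idioms above can be tried with a minimal sketch
 like the following.  It assumes a POSIX `awk' is available on the
 search path (`gawk' can be substituted), and the variable names are
 purely illustrative.  Comparing the values before and after forcing
 them to strings shows that the conversion really changes how they are
 treated:

```shell
# Assumes a POSIX awk on PATH; gawk can be substituted for awk.
forced=$(awk 'BEGIN {
    a = 10; b = 9
    print (a < b)         # numeric comparison: 10 < 9 is false
    sa = a ""; sb = b ""  # concatenating "" forces both values to strings
    print (sa < sb)       # string comparison: "10" sorts before "9"
    print "25fix" + 0     # adding zero uses the numeric prefix
    print "fix25" + 0     # no numeric prefix, so the value is zero
}')
printf '%s\n' "$forced"
```

 This should print 0, 1, 25, and 0, one value per line.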
 
    The exact manner in which numbers are converted into strings is
 controlled by the `awk' built-in variable `CONVFMT' (see Built-in
 Variables).  Numbers are converted using the `sprintf()' function
 with `CONVFMT' as the format specifier (see String Functions).
 
    `CONVFMT''s default value is `"%.6g"', which prints a value with at
 most six significant digits.  For some applications, you might want to
 change it to specify more precision.  On most modern machines, 17
 digits is usually enough to capture a floating-point number's value
 exactly.(1)
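
    As a rough illustration of the effect of `CONVFMT' on precision,
 the sketch below converts the same value twice, first with the default
 format and then with 17 significant digits.  It assumes a POSIX `awk'
 on the search path and IEEE double-precision arithmetic; `LC_ALL=C' is
 set only to keep the decimal point predictable:

```shell
# Assumes a POSIX awk on PATH and IEEE double-precision numbers.
precise=$(LC_ALL=C awk 'BEGIN {
    x = 0.1
    a = x ""            # converted with the default CONVFMT, "%.6g"
    CONVFMT = "%.17g"
    b = x ""            # converted again, now with 17 significant digits
    print a
    print b
}')
printf '%s\n' "$precise"
```

 The first line should read 0.1 and the second 0.10000000000000001,
 which exposes the binary approximation that the default format hides.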
 
    Strange results can occur if you set `CONVFMT' to a string that
 doesn't tell `sprintf()' how to format floating-point numbers in a
 useful way.  For example, if you forget the `%' in the format, `awk'
 converts all numbers to the same constant string.
 
    As a special case, if a number is an integer, then the result of
 converting it to a string is _always_ an integer, no matter what the
 value of `CONVFMT' may be.  Given the following code fragment:
 
      CONVFMT = "%2.2f"
      a = 12
      b = a ""
 
 `b' has the value `"12"', not `"12.00"'.  (d.c.)
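
    A runnable variant of this fragment, sketched under the assumption
 that a POSIX `awk' is on the search path, contrasts the integer case
 with a nonintegral value converted under the same `CONVFMT' (the
 variables `c' and `d' are additions for illustration):

```shell
# Assumes a POSIX awk on PATH; c and d are illustrative additions.
intcase=$(LC_ALL=C awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12
    b = a ""            # integer value: CONVFMT is not applied
    c = 12.5
    d = c ""            # nonintegral value: formatted with CONVFMT
    print b
    print d
}')
printf '%s\n' "$intcase"
```

 This should print 12 followed by 12.50.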
 
    Prior to the POSIX standard, `awk' used the value of `OFMT' for
 converting numbers to strings.  `OFMT' specifies the output format to
 use when printing numbers with `print'.  `CONVFMT' was introduced in
 order to separate the semantics of conversion from the semantics of
 printing.  Both `CONVFMT' and `OFMT' have the same default value:
 `"%.6g"'.  In the vast majority of cases, old `awk' programs do not
 change their behavior.  However, these semantics for `OFMT' are
 something to keep in mind if you must port your new-style program to
 older implementations of `awk'.  Rather than changing your programs,
 we recommend that you simply port `gawk' itself.  See Print for more
 information on the `print' statement.
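
    The separation between the two variables can be observed with a
 small sketch such as the following, which assumes a POSIX-compliant
 `awk' on the search path.  The two formats are deliberately set to
 different values so that each conversion path is visible:

```shell
# Assumes a POSIX-compliant awk on PATH.
split=$(LC_ALL=C awk 'BEGIN {
    OFMT = "%.2f"       # used when print outputs a number directly
    CONVFMT = "%.4f"    # used when a number is converted to a string
    x = 3.14159265
    print x             # number printed with OFMT
    s = x ""            # number converted with CONVFMT
    print s             # s is already a string, so it prints unchanged
}')
printf '%s\n' "$split"
```

 This should print 3.14 followed by 3.1416.  An implementation that
 predates `CONVFMT' would be expected to use `OFMT' for both.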
 
    And, once again, where you are can matter when it comes to converting
 between numbers and strings.  In Locales, we mentioned that the
 local character set and language (the locale) can affect how `gawk'
 matches characters.  The locale also affects numeric formats.  In
 particular, for `awk' programs, it affects the decimal point character.
 The `"C"' locale, and most English-language locales, use the period
 character (`.') as the decimal point.  However, many (if not most)
 European and non-English locales use the comma (`,') as the decimal
 point character.
 
    The POSIX standard says that `awk' always uses the period as the
 decimal point when reading the `awk' program source code, and for
 command-line variable assignments (see Other Arguments).  However,
 when interpreting input data, for `print' and `printf' output, and for
 number-to-string conversion, the local decimal point character is used.
 Here are some examples indicating the difference in behavior, on a
 GNU/Linux system:
 
      $ export POSIXLY_CORRECT=1
      $ gawk 'BEGIN { printf "%g\n", 3.1415927 }'
      -| 3.14159
      $ LC_ALL=en_DK gawk 'BEGIN { printf "%g\n", 3.1415927 }'
      -| 3,14159
      $ echo 4,321 | gawk '{ print $1 + 1 }'
      -| 5
      $ echo 4,321 | LC_ALL=en_DK gawk '{ print $1 + 1 }'
      -| 5,321
 
 The `en_DK' locale is for English in Denmark, where the comma acts as
 the decimal point separator.  In the normal `"C"' locale, `gawk' treats
 `4,321' as `4', while in the Danish locale, it's treated as the full
 number, 4.321.
 
    Some earlier versions of `gawk' fully complied with this aspect of
 the standard.  However, many users in non-English locales complained
 about this behavior, since their data used a period as the decimal
 point, so the default behavior was restored to use a period as the
 decimal point character.  You can use the `--use-lc-numeric' option
 (see Options) to force `gawk' to use the locale's decimal point
 character.  (`gawk' also uses the locale's decimal point character when
 in POSIX mode, either via `--posix', or the `POSIXLY_CORRECT'
 environment variable.)
 
    Table 6.1 describes the cases in which the
 locale's decimal point character is used and when a period is used.
 Some of these features have not been described yet.
 
 Feature         Default        `--posix' or `--use-lc-numeric'
 ----------------------------------------------------------------
 `%'g'           Use locale     Use locale
 `%g'            Use period     Use locale
 Input           Use period     Use locale
 `strtonum()'    Use period     Use locale
 
 Table 6.1: Locale Decimal Point versus A Period
 
    Finally, modern-day formal standards and IEEE standard floating-point
 representation can have an unusual but important effect on the way
 `gawk' converts some special string values to numbers.  The details are
 presented in POSIX Floating Point Problems.
 
    ---------- Footnotes ----------
 
    (1) Pathological cases can require up to 752 digits (!), but we
 doubt that you need to worry about this.
 