(coreutils.info.gz) Squeezing

 
 9.1.3 Squeezing repeats and deleting
 ------------------------------------
 
 When given just the '--delete' ('-d') option, 'tr' removes any input
 characters that are in SET1.
 
    When given just the '--squeeze-repeats' ('-s') option, 'tr' replaces
 each input sequence of a repeated character that is in SET1 with a
 single occurrence of that character.
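 
    As a minimal sketch of the two single-option modes described so far
 (the sample strings and the variable names 'deleted' and 'squeezed'
 are illustrative assumptions, not part of the manual):

```shell
# Delete-only: every 'l' and 'o' in the input is removed.
deleted=$(printf 'hello world\n' | tr -d 'lo')
printf '%s\n' "$deleted"

# Squeeze-only: runs of 'a' and runs of 'b' collapse to one character.
# The run of 'c' is left alone because 'c' is not in SET1.
squeezed=$(printf 'aaabbbccc\n' | tr -s 'ab')
printf '%s\n' "$squeezed"
```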
 
    When given both '--delete' and '--squeeze-repeats', 'tr' first
 performs any deletions using SET1, then squeezes repeats from any
 remaining characters using SET2.
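 
    The split of roles between SET1 and SET2 can be sketched as follows.
 The input string and the variable name 'result' are assumptions chosen
 for illustration:

```shell
# SET1 is 'b' (deleted); SET2 is 'c' (squeezed after the deletion).
# Runs of 'a' and 'd' are untouched because they are in neither set.
result=$(printf 'aabbccdd\n' | tr -ds 'b' 'c')
printf '%s\n' "$result"
```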
 
    The '--squeeze-repeats' option may also be used when translating, in
 which case 'tr' first performs translation, then squeezes repeats from
 any remaining characters using SET2.
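 
    A small sketch of translating and squeezing in one pass, again with
 an illustrative input and a hypothetical variable name 'result'.  Here
 the translated 'b's and the original 'c's end up in the same run, which
 is then squeezed because 'c' is in SET2:

```shell
# Step 1 (translate): a -> x, b -> c, giving "xxcccc".
# Step 2 (squeeze):   runs of characters in SET2 ('x', 'c') collapse.
result=$(printf 'aabbcc\n' | tr -s 'ab' 'xc')
printf '%s\n' "$result"
```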
 
    Here are some examples to illustrate various combinations of options:
 
    * Remove all zero bytes:
 
           tr -d '\0'
 
    * Put all words on lines by themselves.  This converts all
      non-alphanumeric characters to newlines, then squeezes each string
      of repeated newlines into a single newline:
 
           tr -cs '[:alnum:]' '[\n*]'
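 
       As a rough illustration of this command on a sample sentence
       (the input text and the variable name 'words' are assumptions
       made for this sketch):

```shell
# Every character that is NOT alphanumeric becomes a newline, and each
# run of newlines is squeezed, leaving one word per line.
words=$(printf 'Hello, world!  tr is 1 tool.\n' | tr -cs '[:alnum:]' '[\n*]')
printf '%s\n' "$words"
```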
 
    * Convert each sequence of repeated newlines to a single newline:
 
           tr -s '\n'
 
    * Find doubled occurrences of words in a document.  For example,
      people often write "the the" with the repeated words separated by a
      newline.  The Bourne shell script below works first by converting
      each sequence of punctuation and blank characters to a single
       newline.  That puts each "word" on a line by itself.  Next it maps
       all uppercase characters to lowercase, and finally it runs 'uniq'
       with the '-d' option to print only the words that were repeated
       on adjacent lines.
 
           #!/bin/sh
           cat -- "$@" \
             | tr -s '[:punct:][:blank:]' '[\n*]' \
             | tr '[:upper:]' '[:lower:]' \
             | uniq -d
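 
       To try the pipeline without creating a script file, it can be
       wrapped in a shell function that reads standard input.  This is
       only a sketch: the function name 'find_doubled', the variable
       'doubled', and the sample text are hypothetical.

```shell
# Same pipeline as the script above, minus the leading 'cat', so that
# it filters whatever arrives on standard input.
find_doubled() {
  tr -s '[:punct:][:blank:]' '[\n*]' \
    | tr '[:upper:]' '[:lower:]' \
    | uniq -d
}

# "the" is doubled across a line break; "of" is doubled on one line.
doubled=$(printf 'This is the\nthe best of of times.\n' | find_doubled)
printf '%s\n' "$doubled"
```

       Because 'uniq' compares only adjacent lines, a word that recurs
       later in the text but not immediately after itself is not
       reported.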
 
    * Deleting a small set of characters is usually straightforward.  For
      example, to remove all 'a's, 'x's, and 'M's you would do this:
 
           tr -d axM
 
      However, when '-' is one of those characters, it can be tricky
      because '-' has special meanings.  Performing the same task as
      above but also removing all '-' characters, we might try 'tr -d
      -axM', but that would fail because 'tr' would try to interpret '-a'
      as a command-line option.  Alternatively, we could try putting the
      hyphen inside the string, 'tr -d a-xM', but that wouldn't work
       either because it would make 'tr' interpret 'a-x' as the range of
       characters 'a'...'x' rather than the three characters 'a', '-',
       and 'x'.  One way to solve the problem is to put the hyphen at the
       end of the list of characters:
 
           tr -d axM-
 
      Or you can use '--' to terminate option processing:
 
           tr -d -- -axM
 
       More generally, use the equivalence class notation '[=c=]' with
       '-' (or any other character) in place of the 'c':
 
           tr -d '[=-=]axM'
 
      Note how single quotes are used in the above example to protect the
      square brackets from interpretation by a shell.
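 
       The forms above can be compared side by side.  In this sketch the
       input string and the variable names are illustrative assumptions;
       the first command shows the unintended range interpretation, and
       the other three should all give the same result:

```shell
input='box-a-Max-y'

# Unintended: 'a-x' is read as the range a..x, so the hyphens survive
# and most lowercase letters are deleted.
range=$(printf '%s\n' "$input" | tr -d a-xM)

# Intended: delete exactly 'a', 'x', 'M', and '-'.
at_end=$(printf '%s\n' "$input" | tr -d axM-)           # hyphen last
dashdash=$(printf '%s\n' "$input" | tr -d -- -axM)      # '--' ends options
equiv=$(printf '%s\n' "$input" | tr -d '[=-=]axM')      # equivalence class

printf '%s\n' "$range" "$at_end" "$dashdash" "$equiv"
```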
 