tr

2024-01-15

Basic Text Translation: Replacing Characters

At its core, tr replaces specific characters or sets of characters. The basic syntax is:

tr 'SET1' 'SET2'

Where SET1 is the set of characters to be replaced, and SET2 is the set of characters to replace them with. The lengths of SET1 and SET2 must be equal unless you’re using character classes (explained below).

Let’s say we have a file named input.txt containing:

Hello, World!

We can change all lowercase ’l’s to uppercase ’L’s like this:

tr 'l' 'L' < input.txt

This will output:

HeLLo, WorLd!

You can replace multiple characters simultaneously. To replace all lowercase vowels with uppercase vowels:

tr 'aeiou' 'AEIOU' < input.txt

This will output:

HeLLo, World!

Character Classes and Ranges: Simplifying Transformations

tr supports character classes, making complex transformations easier. Character classes are enclosed in square brackets []. You can specify ranges using a hyphen -.

To convert all lowercase characters to uppercase:

tr '[:lower:]' '[:upper:]' < input.txt

This utilizes character classes defined by POSIX. Similarly, to convert all uppercase characters to lowercase:

tr '[:upper:]' '[:lower:]' < input.txt

To delete specific characters, use the -d option followed by the set of characters to delete:

tr -d '[:punct:]' < input.txt

This will remove all punctuation from input.txt, outputting:

HelloWorld

Squeezing Repeated Characters: The -s Option

The -s option (squeeze-repeats) is useful for removing consecutive occurrences of the same character. For example, to remove extra spaces from a line:

echo "This   has    extra    spaces" | tr -s ' '

This outputs:

This has extra spaces

Complementing Character Sets: The -c Option

The -c option (complement) selects characters not in the specified set. Combined with the -d option, this allows you to delete all characters except those in a specified set.

To keep only alphanumeric characters:

tr -cd '[:alnum:]' < input.txt

This deletes all non-alphanumeric characters, effectively leaving only letters and numbers.

Combining Options for Advanced Transformations

You can combine multiple options to achieve complex transformations. For example, to convert to lowercase, remove punctuation, and squeeze spaces:

echo "HeLLo, WorLd!  This is a test." | tr '[:upper:]' '[:lower:]' | tr -d '[:punct:]' | tr -s ' '

This outputs:

hello world this is a test

These examples demonstrate the versatility of the tr command. Its concise syntax and powerful options make it a tool for any Linux user working with text. Remember to consult the man tr page for a complete list of options and features.