2024-08-05
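comm Command
The comm command compares two sorted files line by line, revealing the lines unique to each file and the lines common to both. Its output is divided into three columns:
Column 1: lines unique to the first file.
Column 2: lines unique to the second file.
Column 3: lines common to both files.
Columns are separated by tabs, so column 2 lines are indented by one tab and column 3 lines by two. If no lines fall into a category, nothing appears at that indentation. This makes comm useful for quickly identifying differences and similarities between datasets.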
The simplest form of the comm command involves specifying the two files to compare:
comm file1.txt file2.txt
Let’s create two sample files:
file1.txt:
apple
banana
cherry
date
file2.txt:
banana
cherry
fig
grape
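Running the command above yields the following, with lines interleaved in sorted order and tab indentation marking the column (none for column 1, one tab for column 2, two tabs for column 3):
apple
		banana
		cherry
date
	fig
	grape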
This output shows:
apple and date are unique to file1.txt.
fig and grape are unique to file2.txt.
banana and cherry are common to both files.
comm Options
comm offers options to suppress specific columns from its output:
-1: Suppresses column 1 (lines unique to the first file).
-2: Suppresses column 2 (lines unique to the second file).
-3: Suppresses column 3 (lines common to both files).
Combining these options allows for highly targeted comparisons.
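The single-letter options can also be grouped, as in -12 or -23. The following is a minimal sketch of that, assuming the two sample files are recreated in a scratch directory; the directory and variable names are illustrative only.

```shell
# Recreate the sample files in a scratch directory (illustrative location).
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$tmpdir/file1.txt"
printf '%s\n' banana cherry fig grape > "$tmpdir/file2.txt"

# -23 hides columns 2 and 3, leaving the lines unique to file1.txt.
only_in_first=$(comm -23 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

# -12 is the grouped form of "-1 -2": the lines common to both files.
in_both=$(comm -12 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

printf 'only in file1:\n%s\n' "$only_in_first"
printf 'in both:\n%s\n' "$in_both"

rm -r "$tmpdir"
```

Whichever column remains after suppression is printed without leading tabs, so the result can be fed straight into other tools.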
Example: Show only lines unique to file2.txt:
comm -1 -3 file1.txt file2.txt
Output:
fig
grape
Example: Show only lines common to both files:
comm -1 -2 file1.txt file2.txt
Output:
banana
cherry
Example: Show lines unique to either file:
comm -3 file1.txt file2.txt
Output (lines unique to file2.txt stay in column 2, so they keep their one-tab indent):
apple
date
	fig
	grape
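With -3, comm still indents the lines from the second file with one tab. If a single flat list is wanted, one option is to delete the tabs afterwards. This is a minimal sketch, assuming no line in either file contains a tab of its own; the scratch directory and variable name are illustrative.

```shell
# Recreate the sample files in a scratch directory (illustrative location).
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry date > "$tmpdir/file1.txt"
printf '%s\n' banana cherry fig grape > "$tmpdir/file2.txt"

# comm -3 prints column 2 with a leading tab; tr -d '\t' strips every tab,
# merging both columns into one list of lines found in only one file.
either_only=$(comm -3 "$tmpdir/file1.txt" "$tmpdir/file2.txt" | tr -d '\t')

printf '%s\n' "$either_only"

rm -r "$tmpdir"
```

Note that flattening discards the information about which file each line came from, so it only suits cases where that distinction does not matter.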
Remember that comm expects its input files to be sorted. Attempting to compare unsorted files will lead to inaccurate results. Use the sort command to sort your files before using comm:
sort file1.txt > file1_sorted.txt
sort file2.txt > file2_sorted.txt
comm file1_sorted.txt file2_sorted.txt
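comm judges order by the collating sequence of the current locale, so sort and comm should run under the same locale settings. One way to guarantee that is to set LC_ALL=C for both commands. The sketch below assumes plain byte-order sorting is acceptable for the data; the file names are illustrative.

```shell
tmpdir=$(mktemp -d)

# Unsorted, mixed-case input, where locales commonly disagree on ordering.
printf '%s\n' banana Cherry apple > "$tmpdir/a.txt"
printf '%s\n' Cherry date apple > "$tmpdir/b.txt"

# Sort and compare under the same locale so both tools agree on the order.
LC_ALL=C sort "$tmpdir/a.txt" > "$tmpdir/a_sorted.txt"
LC_ALL=C sort "$tmpdir/b.txt" > "$tmpdir/b_sorted.txt"
common=$(LC_ALL=C comm -12 "$tmpdir/a_sorted.txt" "$tmpdir/b_sorted.txt")

printf '%s\n' "$common"

rm -r "$tmpdir"
```

In the C locale uppercase letters sort before lowercase ones, which is why Cherry precedes apple in the result.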
The power of comm extends beyond basic file comparisons. It can be integrated into shell scripts for more complex tasks, such as automating the identification of differences between data sets or logs. Combining comm with other command-line tools like grep, awk, and sed opens up further possibilities for text processing.
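As a rough illustration of that kind of scripting, the sketch below compares two snapshots of a host list, then filters and counts the additions with grep and awk. The host names, file names, and variable names are all hypothetical.

```shell
tmpdir=$(mktemp -d)

# Hypothetical snapshots of a host list taken on two different days.
printf '%s\n' web01 web02 db01 > "$tmpdir/yesterday.txt"
printf '%s\n' web02 db01 db02 web03 > "$tmpdir/today.txt"

# comm needs sorted input, so sort each snapshot first.
sort "$tmpdir/yesterday.txt" > "$tmpdir/old.sorted"
sort "$tmpdir/today.txt" > "$tmpdir/new.sorted"

# -13 keeps only column 2: hosts present today but not yesterday.
added=$(comm -13 "$tmpdir/old.sorted" "$tmpdir/new.sorted")

# Narrow the additions with grep, and count the non-empty lines with awk.
added_web=$(printf '%s\n' "$added" | grep '^web')
added_count=$(printf '%s\n' "$added" | awk 'NF { n++ } END { print n + 0 }')

printf 'added hosts: %s\n' "$added_count"
printf 'added web hosts: %s\n' "$added_web"

rm -r "$tmpdir"
```

Swapping -13 for -23 would report the hosts that disappeared instead, using the same pipeline.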