2024-11-01
The basic awk command structure is:
awk 'program' input-file
The program is a set of instructions written in awk’s scripting language, and input-file specifies the file to be processed. If no input file is given, awk reads from standard input (stdin).
A simple awk program consists of patterns and actions. A pattern defines which lines to process, and the action dictates what happens to those lines. If a pattern is omitted, the action is performed on every line. If an action is omitted, the matching line is printed.
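To make the two defaults concrete, here is a small sketch. The three-line sample input and the variable names are invented for illustration; the first command has a pattern but no action, and the second has an action but no pattern.

```shell
# Sample input is invented for this sketch.
input='apple 3
banana 12
cherry 7'

# Pattern only, no action: matching lines are printed unchanged.
pattern_only=$(printf '%s\n' "$input" | awk '$2 > 5')

# Action only, no pattern: the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{print $1}')

printf '%s\n' "$pattern_only"
printf '%s\n' "$action_only"
```

The first command should print only the banana and cherry lines; the second should print the first field of all three lines.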
Let’s start with some fundamental examples:
1. Printing every line:
This command prints every line of the file data.txt:
awk '{print}' data.txt
This is equivalent to using cat data.txt.
2. Printing specific fields:
Assuming data.txt contains comma-separated values (CSV), this command prints the second and third fields:
awk -F',' '{print $2, $3}' data.txt
-F',' sets the field separator to a comma. $2 and $3 refer to the second and third fields respectively.
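One detail worth seeing in action: the comma inside print joins the fields with the output field separator (OFS), which defaults to a single space, so the output is no longer comma-separated. The sketch below uses made-up CSV rows and assumes simple data; splitting on a bare comma does not handle quoted fields that contain commas.

```shell
# Hypothetical CSV rows used only for this sketch.
csv='1,alice,admin
2,bob,user'

# The comma in "print $2, $3" inserts OFS, which defaults to one space.
default_sep=$(printf '%s\n' "$csv" | awk -F',' '{print $2, $3}')

# Setting OFS with -v keeps the output comma-separated.
comma_sep=$(printf '%s\n' "$csv" | awk -F',' -v OFS=',' '{print $2, $3}')

printf '%s\n' "$default_sep" "$comma_sep"
```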
3. Conditional printing:
This command prints only lines where the first field is greater than 10:
awk -F',' '$1 > 10 {print}' data.txt
4. Using variables:
This example sums the values in the second field:
awk -F',' '{sum += $2} END {print "Sum:", sum}' data.txt
sum is an awk variable. It needs no declaration: an uninitialized variable counts as 0 in arithmetic. The END block is executed after processing all lines.
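The same idea extends naturally to an average, because the built-in variable NR holds the number of lines read so far. The following is a minimal sketch on invented data; the guard on NR is an added assumption to avoid dividing by zero on empty input.

```shell
# Invented sample rows: a label and a number per line.
csv='a,10
b,20
c,30'

summary=$(printf '%s\n' "$csv" | awk -F',' '
  { sum += $2 }                 # runs once for every input line
  END {
    # NR still holds the total line count inside END.
    if (NR > 0) print "Sum:", sum, "Average:", sum / NR
  }')

printf '%s\n' "$summary"
```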
awk’s power lies in its ability to handle more complex scenarios:
1. Regular expressions:
This command prints lines containing the string “error”. The match is case-sensitive and also hits longer words such as “errors”:
awk '/error/ {print}' log.txt
/error/ is a regular expression pattern.
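A bare /regex/ pattern is tested against the whole line. To restrict the match to one field, awk offers the ~ operator, and tolower() can fold case before matching. The sketch below assumes a made-up log layout in which the level is the second field.

```shell
# Invented log lines; the level is assumed to be field 2.
log='2024-11-01 ERROR disk full
2024-11-01 INFO no errors found
2024-11-01 error retrying'

# Whole-line, case-sensitive match: misses "ERROR" but catches "errors".
anywhere=$(printf '%s\n' "$log" | awk '/error/')

# Field-level, case-insensitive match: only lines whose level is "error".
level_only=$(printf '%s\n' "$log" | awk 'tolower($2) ~ /^error$/')

printf '%s\n' "$anywhere" "$level_only"
```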
2. Built-in functions:
awk provides numerous built-in functions. This example converts the second field to uppercase:
awk -F',' '{print toupper($2)}' data.txt
3. Multiple patterns and actions:
This command prints lines starting with “INFO” and lines containing “warning”:
awk '/^INFO/{print "Info message:", $0} /warning/{print "Warning:", $0}' log.txt
4. Custom functions:
awk allows you to define custom functions:
awk 'function square(x){return x*x} {print square($1)}' data.txt
5. Using the BEGIN block:
The BEGIN block is executed before processing any lines. This example prints a header before the data:
awk 'BEGIN {print "Data Report"} {print $1, $2}' data.txt
awk can effortlessly handle multiple input files:
awk '{print FILENAME, $1}' file1.txt file2.txt
This command prints the filename and the first field from both file1.txt and file2.txt.
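When several files are processed, two line counters behave differently: NR keeps counting across all files, while FNR restarts at 1 for each file. Here is a small sketch that creates two throwaway files in a temporary directory (the file contents are invented) to show the difference.

```shell
# Create two small throwaway files for the demonstration.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/file1.txt"
printf 'c\n' > "$tmpdir/file2.txt"

# NR counts lines across all files; FNR restarts at 1 for each file.
numbered=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR, $1}' file1.txt file2.txt)

printf '%s\n' "$numbered"
rm -rf "$tmpdir"
```

Comparing FNR with NR is a common way to tell whether awk is still reading the first file.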
Imagine you have a log file with entries like: [date] [level] [message]
This awk script summarizes the log file by level, assuming the date contains no spaces so that the level is the second whitespace-separated field:
awk '{count[$2]++} END {for (level in count) print level, count[level]}' log.txt
This script uses an associative array (count) to count the occurrences of each log level.
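To try the summary without a real log, the sketch below feeds it three invented entries. The order in which for (level in count) visits keys is unspecified in awk, so the output is piped through sort here to make it stable; that step is an addition for the example, not part of the original one-liner.

```shell
# Invented log entries: date, level, message.
log='2024-11-01 INFO started
2024-11-01 WARN slow response
2024-11-01 INFO finished'

# Count entries per level, then sort for a predictable order.
counts=$(printf '%s\n' "$log" |
  awk '{count[$2]++} END {for (level in count) print level, count[level]}' |
  sort)

printf '%s\n' "$counts"
```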
These examples showcase the versatility of awk. Its concise syntax and powerful features make it an indispensable tool for any Linux user working with text data. By understanding these fundamental concepts and expanding upon them, you’ll realize the true potential of awk for various text processing tasks.