awk MASTERY

// Extract. Analyze. Report.

DATA IS POWER.

Logs, CSVs, system output: awk transforms raw text into meaningful data. It is not just a command; it is a complete programming language for text processing.

STRUCTURE FROM CHAOS.

awk automatically splits text into fields and lets you process them individually. Need to extract column 3 from a CSV? One command. Done.

BEGIN YOUR JOURNEY

// Your Training Path

LESSON 01

Introduction to awk

What is awk? Fields and records.

Beginner
LESSON 02

Field Extraction

$1, $2, $NF and field variables.

Beginner
LESSON 03

Patterns and Actions

Pattern matching with conditions.

Beginner
LESSON 04

Built-in Variables

NR, NF, FS, OFS, and more.

Beginner
LESSON 05

Print and Format

printf for formatted output.

Beginner
LESSON 06

Operators

Arithmetic and string operators.

Intermediate
LESSON 07

Variables

User-defined variables and arrays.

Intermediate
LESSON 08

Control Structures

if, while, for loops.

Intermediate
LESSON 09

Functions

String, mathematical, time functions.

Intermediate
LESSON 10

awk with Pipes

Combine awk with other commands.

Advanced
LESSON 11

Reports and Summaries

Create reports from log files.

Advanced
LESSON 12

awk in Scripts

Use awk in bash scripts.

Advanced

// Lesson 01: Introduction to awk

What is awk?

awk is a powerful text processing language. It reads input line by line, splits each line into fields, and lets you process or extract data based on patterns.

Basic Syntax

awk 'pattern { action }' file
command | awk 'pattern { action }'

Simple Examples

# Print all lines (default action when pattern matches)
awk '1' file.txt
awk '{ print }' file.txt

# Print specific fields
awk '{ print $1 }' file.txt

# Work with pipes
ps aux | awk '{ print $1, $11 }'

Fields and Records

  • Record: A line of text (separated by newline)
  • Field: A column within a record (separated by whitespace by default)
  • $0: The entire record (whole line)
  • $1, $2, ...: First, second, etc. fields
  • $NF: Last field (NF = number of fields)
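
To make the terms above concrete, here is a minimal sketch that runs awk on a single made-up record. The sample line and the variable names `record` and `fields_demo` are assumptions for illustration only.

```shell
# Hypothetical one-line record with three whitespace-separated fields
record='alice 42 /home/alice'

# Print the whole record, the first field, the field count, and the last field
fields_demo=$(printf '%s\n' "$record" |
  awk '{ print "record:", $0; print "first:", $1; print "count:", NF; print "last:", $NF }')

printf '%s\n' "$fields_demo"
```

Because NF is 3 for this record, $NF and $3 refer to the same field.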

Quiz

1. What does $0 represent in awk?

Answers
  1. The entire record (whole line)

// Lesson 02: Field Extraction

Accessing Fields

$1     First field
$2     Second field
$3     Third field
$NF    Last field (NF = number of fields)
$(NF-1)  Second to last field
$0     Entire line

Examples

# /etc/passwd extraction
awk -F: '{ print $1, $5 }' /etc/passwd

# Print first and last fields
ls -l | awk '{ print $1, $NF }'

# Print last field of each line
awk '{ print $NF }' file.txt

# Print everything except first field (blanking $1 leaves a leading space)
awk '{ $1=""; print $0 }' file.txt

Changing Field Separator

# -F sets field separator
awk -F: '{ print $1 }' /etc/passwd

# Simple CSV processing (assumes no quoted fields that contain commas)
awk -F, '{ print $2 }' data.csv

# Space as separator (default)
awk -F' ' '{ print $2 }' file.txt
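
The default separator and a single-character separator behave differently, which the following sketch tries to show. The input strings and variable names are made up for the demonstration.

```shell
# Hypothetical input: fields padded with several spaces
padded='  one   two  three'

# Default splitting ignores leading blanks and collapses runs of blanks
default_second=$(printf '%s\n' "$padded" | awk '{ print $2 }')

# A single-character separator such as a comma does not collapse:
# the empty string between the two commas counts as a field
csv_count=$(printf 'a,,c\n' | awk -F, '{ print NF }')

echo "$default_second $csv_count"
```

This prints `two 3`: the padded line still yields "two" as its second field, while the comma-separated line has three fields, one of them empty.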

Quiz

1. How do you print the last field in awk?

Answers
  1. $NF

// Lesson 03: Patterns and Actions

Pattern Matching

Patterns filter which lines are processed. Only lines matching the pattern get the action applied.

Examples

# Print lines containing 'root'
awk '/root/ { print }' /etc/passwd

# Print username and home for the root user only
# (/root/ would also match any line that merely contains "root")
awk -F: '$1 == "root" { print $1, $6 }' /etc/passwd

# Print lines where first field equals 'admin'
awk '$1 == "admin" { print }' file.txt

# Print lines where third field > 100
awk '$3 > 100 { print }' file.txt

# Print lines NOT containing 'nobody'
awk '!/nobody/ { print }' /etc/passwd

Regex Patterns

# Lines starting with 'a'
awk '/^a/ { print }' file.txt

# Lines ending with '.log'
awk '/\.log$/ { print }' file.txt

# Lines with 4 or more fields
awk 'NF >= 4 { print }' file.txt
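
Patterns can also be combined and applied to a single field. The sketch below is one way to do that; the sample rows (user, status, bytes) and variable names are hypothetical.

```shell
# Hypothetical access data: user, status, bytes
sample='admin ok 120
guest fail 30
alice ok 80
adam fail 500'

# Combine a field-level regex match with a numeric test using &&
big_a_users=$(printf '%s\n' "$sample" | awk '$1 ~ /^a/ && $3 > 100 { print $1 }')

# A pattern with no action prints the whole matching record
failures=$(printf '%s\n' "$sample" | awk '$2 == "fail"')

printf '%s\n' "$big_a_users" "$failures"
```

Here `$1 ~ /^a/` restricts the regex to the first field, whereas a bare `/^a/` tests the entire record.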

Quiz

1. How do you negate a pattern in awk?

Answers
  1. Use ! before the pattern (e.g., !/pattern/)

// Lesson 04: Built-in Variables

Common awk Variables

NR    Number of current Record (line number)
NF    Number of Fields in current record
FS    Field Separator (default: whitespace)
RS    Record Separator (default: newline)
OFS   Output Field Separator (default: space)
ORS   Output Record Separator (default: newline)
$0    Entire current record
$1..$NF  Individual fields

Examples

# Add line numbers
awk '{ print NR, $0 }' file.txt

# Print number of fields per line
awk '{ print NF, $0 }' file.txt

# Change output separator to colon
awk -F: '{ print $1":"$3 }' /etc/passwd

# Same using OFS
awk -F: -v OFS=":" '{ print $1, $3 }' /etc/passwd

BEGIN and END Blocks

# BEGIN runs before processing
awk 'BEGIN { print "Processing..." } /pattern/ { print } END { print "Done" }' file.txt

# Initialize variables
awk 'BEGIN { sum=0 } { sum+=$1 } END { print sum }' file.txt
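
As a rough illustration of how BEGIN, the main rule, and END fit together with FS, OFS, NR, and NF, consider this sketch. The two passwd-style lines are invented sample data.

```shell
# Hypothetical passwd-style lines
entries='root:x:0:0:root:/root:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# BEGIN sets the separators before any input is read;
# END runs after the last record, when NR holds the total record count
summary=$(printf '%s\n' "$entries" |
  awk 'BEGIN { FS = ":"; OFS = "\t" } { print NR, $1, NF } END { print "records", NR }')

printf '%s\n' "$summary"
```

Setting FS inside BEGIN has the same effect as passing -F: on the command line; the commas in print are what insert OFS between the values.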

Quiz

1. What does NR represent in awk?

Answers
  1. Current record number (line number)

// Lesson 05: Print and Format

print Statement

# Basic print
awk '{ print $1 }' file.txt

# Print multiple items
awk '{ print $1, $2, $3 }' file.txt

# Concatenate without separator
awk '{ print $1 $2 }' file.txt

# Print literal text
awk -F: '{ print "User:", $1, "Home:", $6 }' /etc/passwd

printf for Formatting

# printf doesn't add newline by default
awk '{ printf "%s ", $1 }' file.txt
awk '{ printf "%s\n", $1 }' file.txt

# Format specifiers
awk '{ printf "%-15s %10d\n", $1, $2 }' file.txt
# %-15s = left-aligned string, 15 chars
# %10d  = right-aligned integer, 10 chars

Format Specifiers

%s  String
%d  Integer
%f  Floating point
%-10s  Left-aligned, 10 chars
%10s  Right-aligned, 10 chars
%.2f  2 decimal places
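
The specifiers above can be combined in one format string. The following sketch assumes a made-up two-column input (name, price) and uses `|` characters only to make the column edges visible.

```shell
# Hypothetical name and price columns
prices='apple 1.5
watermelon 12'

# %-12s left-aligns the name in a 12-character column;
# %8.2f right-aligns the number in 8 characters with 2 decimal places
table=$(printf '%s\n' "$prices" | awk '{ printf "%-12s|%8.2f|\n", $1, $2 }')

printf '%s\n' "$table"
```

Each output line has the same width, so the `|` markers line up regardless of how long the name or number is.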

Quiz

1. What does print add that printf doesn't?

Answers
  1. A trailing newline. printf prints only what the format string specifies, so you add \n yourself.

// Lesson 06: Operators

Arithmetic Operators

+  Addition
-  Subtraction
*  Multiplication
/  Division
%  Modulo
++ Increment
-- Decrement
+= Add and assign
-= Subtract and assign

Examples

# Add 10 to first field
awk '{ $1 = $1 + 10; print }' file.txt

# Calculate average (skips the division when there is no input)
awk '{ sum+=$1; count++ } END { if (count) print sum/count }' file.txt

# Integer division
awk '{ print int($1 / $2) }' file.txt

Comparison and Match Operators

~   Matches regex
!~  Does not match regex
==  Equals
!=  Not equals
<   Less than
>   Greater than
<=  Less than or equal
>=  Greater than or equal
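
The comparison operators work on both numbers and strings, and which one applies depends on the operands. The sketch below illustrates the difference, along with ~ and string concatenation (which has no operator symbol: values are simply written side by side). All inputs and variable names are hypothetical.

```shell
# Fields that look like numbers are compared numerically
numeric=$(echo '10 9' | awk '{ r = ($1 > $2) ? "greater" : "not greater"; print r }')

# Appending "" forces a string comparison, where "10" sorts before "9"
textual=$(echo '10 9' | awk '{ r = (($1 "") > ($2 "")) ? "greater" : "not greater"; print r }')

# ~ tests a field against a regex; adjacent values are concatenated
matched=$(echo 'error42 disk' | awk '$1 ~ /^error[0-9]+$/ { print $2 "-" $1 }')

echo "$numeric / $textual / $matched"
```

This prints `greater / not greater / disk-error42`.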

Quiz

1. What operator checks regex match?

Answers
  1. ~ (tilde)

// Lesson 07: Variables

User Variables

# Initialize and use
awk '{ sum = $1 + $2; print sum }' file.txt

# String variables
awk '{ name = $1; print "Hello " name }' file.txt

# Arrays
awk '{ arr[$1] = $2 } END { for (k in arr) print k, arr[k] }' file.txt

Array Examples

# Count occurrences
awk '{ count[$1]++ } END { for (item in count) print item, count[item] }' file.txt

# Sum by category
awk '{ sum[$2] += $3 } END { for (cat in sum) print cat, sum[cat] }' file.txt

# Check if key exists ("in" does not create the key)
awk '{ seen[$1] = 1 } END { if ("admin" in seen) print "exists" }' file.txt
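
One detail worth knowing about the array examples above: the order in which `for (k in arr)` visits keys is unspecified, so reports usually pipe the result through sort. A small sketch, using an invented list of user names:

```shell
# Hypothetical request log: one user name per line
users='alice
bob
alice
carol
alice
bob'

# Count per user, then sort by count because for-in order is unspecified
counts=$(printf '%s\n' "$users" |
  awk '{ count[$1]++ } END { for (u in count) print count[u], u }' | sort -rn)

# Membership test for a key that never appeared in the input
has_dave=$(printf '%s\n' "$users" |
  awk '{ seen[$1] = 1 } END { if ("dave" in seen) print "yes"; else print "no" }')

printf '%s\n' "$counts" "dave present: $has_dave"
```

The first line of output is `3 alice`, and the membership check reports `no`.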

Special getline

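# Read the next line from another file alongside each record
# (getline returns 1 on success, 0 at end of file, -1 on error)
awk '{ if ((getline line < "other.txt") > 0) print $0, line; else print $0 }' file.txt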

Quiz

1. How do you count occurrences with awk arrays?

Answers
  1. arr[$1]++

// Lesson 08: Control Structures

if Statement

awk '{
    if ($3 > 100) {
        print "High:", $0
    } else if ($3 > 50) {
        print "Medium:", $0
    } else {
        print "Low:", $0
    }
}' file.txt

while Loop

awk '{
    i = 1
    while (i <= NF) {
        if ($i ~ /error/) print "Error in field", i
        i++
    }
}' file.txt

for Loop

# C-style for loop
awk 'NF {
    sum = 0                      # reset for each record
    for (i = 1; i <= NF; i++) {
        sum += $i
    }
    print sum/NF                 # mean of the fields on this line
}' file.txt

# For iterating arrays (fill the array per record, iterate once in END)
awk '{ arr[$1] = $2 } END { for (k in arr) print k, arr[k] }' file.txt
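
The loop and if forms above can be combined. As a minimal sketch, the following finds the largest field in each record; the rows of scores are made-up sample data, and `+ 0` is used to force numeric comparison.

```shell
# Hypothetical rows of scores
scores='3 9 4
10 2 8'

# For each record, loop over the fields and remember the largest value
row_max=$(printf '%s\n' "$scores" | awk '{
    max = $1 + 0
    for (i = 2; i <= NF; i++) {
        if ($i + 0 > max) max = $i + 0
    }
    print "row", NR, "max", max
}')

printf '%s\n' "$row_max"
```

Initializing max from the first field inside the action means it is reset for every record.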

Quiz

1. Which loop structures does awk support?

Answers
  1. while, do-while, and for loops (including for-in over arrays)

// Lesson 09: Functions

String Functions

length(s)       Length of string
substr(s,i,n)   Substring from position i, length n
split(s,arr,sep)  Split string into array
gsub(r,s,t)     Global substitute
sub(r,s,t)      First substitute
tolower(s)      Convert to lowercase
toupper(s)      Convert to uppercase

Examples

# Replace first occurrence
awk '{ sub(/old/, "new"); print }' file.txt

# Replace all occurrences
awk '{ gsub(/old/, "new"); print }' file.txt

# Get substring (chars 1-5)
awk '{ print substr($1, 1, 5) }' file.txt

# Length of each line
awk '{ print length($0) }' file.txt

Math Functions

sin(x)   sine
cos(x)   cosine
sqrt(x)  square root
int(x)   integer part
rand()   random number n, 0 <= n < 1
srand(x) seed for rand
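
Several of the functions listed above can be seen working together in this sketch. The input line (a date and a `value=` field) is invented for the example.

```shell
# Hypothetical record: a date and a measurement
line='2024-03-15 value=16'

parsed=$(printf '%s\n' "$line" | awk '{
    n = split($1, parts, "-")      # parts[1]="2024", parts[2]="03", parts[3]="15"
    sub(/^value=/, "", $2)         # strip the prefix from field 2 in place
    print n, parts[1], toupper(substr("march", 1, 3)), length($1), sqrt($2)
}')

echo "$parsed"
```

This prints `3 2024 MAR 10 4`: split returns the number of pieces, and sub modifies its third argument directly rather than returning the new string.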

Quiz

1. What function does global substitution?

Answers
  1. gsub()

// Lesson 10: awk with Pipes

Combining Commands

# ps with awk
ps aux | awk '{ print $1, $11 }' | sort | uniq

# df with awk
df -h | awk '{ print $5, $6 }' | sort -rn

# du with awk
du -sh /* | awk '{ print $1 }' | sort -h

# log analysis
cat app.log | awk '/ERROR/ { print $1, $2, $NF }'

Pipeline Examples

# Find top consumers
ps aux --sort=-%cpu | awk 'NR==1 { print } NR>1 && $3>10 { print }'

# Extract listening addresses (address:port)
netstat -tuln | awk '/LISTEN/ { print $4 }'

# Sum column from output
grep "error" app.log | awk '{ sum+=$2 } END { print sum }'
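
The pipelines above depend on live system output, so here is a self-contained sketch of the same idea using fabricated log lines (timestamp, level, component). awk does the filtering and field extraction, and the standard tools do the ranking.

```shell
# Hypothetical log lines: timestamp, level, component
log='10:01 ERROR db
10:02 INFO web
10:03 ERROR web
10:04 ERROR db
10:05 WARN db'

# awk filters and extracts; sort, uniq -c, and head rank the result;
# a second awk swaps the columns into "component count" order
top_component=$(printf '%s\n' "$log" |
  awk '$2 == "ERROR" { print $3 }' | sort | uniq -c | sort -rn | head -n 1 |
  awk '{ print $2, $1 }')

echo "$top_component"
```

This prints `db 2`, the component with the most ERROR lines in the sample.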

Quiz

1. Why combine awk with other commands?

Answers
  1. To extract and process specific data from command output

// Lesson 11: Reports and Summaries

Log Report Example

awk 'BEGIN { print "=== Error Report ===" }
     /ERROR/ { errors++ }
     /WARNING/ { warnings++ }
     /INFO/ { info++ }
     END { 
         # + 0 prints 0 instead of an empty string when a counter was never set
         print "Errors:", errors + 0
         print "Warnings:", warnings + 0
         print "Info:", info + 0
     }' app.log

Sales Report

# sales.csv: product,region,amount
awk -F, '{
    product[$1] += $3
    region[$2] += $3
    total += $3
}
END {
    print "=== Sales by Product ==="
    for (p in product) print p, product[p]
    print "\n=== Sales by Region ==="
    for (r in region) print r, region[r]
    print "\nTotal:", total
}' sales.csv
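
If the CSV has a header row, it needs to be skipped before summing. The following sketch assumes a header line and invented sales figures, and adds each product's share of the total as one possible extension of the report.

```shell
# Hypothetical sales data with a header row (product,region,amount)
sales='product,region,amount
widget,east,100
gadget,west,250
widget,west,50'

# NR > 1 skips the header; END prints each product total and its share
report=$(printf '%s\n' "$sales" | awk -F, 'NR > 1 { product[$1] += $3; total += $3 }
  END { for (p in product) printf "%s %d %.1f%%\n", p, product[p], 100 * product[p] / total }' | sort)

printf '%s\n' "$report"
```

The trailing sort gives a stable order, since for-in does not guarantee one. With this sample the report lists gadget at 62.5% and widget at 37.5%.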

Quiz

1. When does the END block execute?

Answers
  1. After all input has been processed

// Lesson 12: awk in Scripts

Standalone Scripts

#!/usr/bin/awk -f
# Calculate statistics

BEGIN {
    FS=","
    print "Processing data..."
}

{
    sum += $2
    count++
    # Seed min and max from the first record so zero and negative values work
    if (count == 1 || $2 > max) max = $2
    if (count == 1 || $2 < min) min = $2
}

END {
    print "Records:", count
    print "Average:", sum/count
    print "Max:", max
    print "Min:", min
}

Bash Script Example

#!/bin/bash
# Process log file

LOGFILE=${1:-app.log}

echo "=== Log Summary ==="
awk '/ERROR/ {e++} /WARNING/ {w++} /INFO/ {i++} 
     END {print "Errors:", e+0, "Warnings:", w+0, "Info:", i+0}' "$LOGFILE"

echo "=== Top 5 Error Messages ==="
awk '/ERROR/ {print $NF}' "$LOGFILE" | sort | uniq -c | sort -rn | head -5
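
Scripts often need to pass a shell value into the awk program itself. A common way is the -v option, sketched below with a hypothetical helper named `count_level` that reads log lines from standard input; the function name and sample lines are assumptions.

```shell
# Hypothetical helper: count input lines whose second field equals the given level
count_level() {
    # -v copies the shell value into the awk variable lvl,
    # so the awk program can stay inside single quotes
    awk -v lvl="$1" '$2 == lvl { n++ } END { print n + 0 }'
}

sample_log='10:01 ERROR db
10:02 INFO web
10:03 ERROR web'

printf '%s\n' "$sample_log" | count_level ERROR
printf '%s\n' "$sample_log" | count_level DEBUG
```

The two calls print 2 and 0. One caveat: awk interprets backslash escapes in values passed with -v, so values containing backslashes may need different handling.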

Congratulations!

You've mastered awk! You now understand:

  • Field and record processing
  • Pattern matching and conditionals
  • Built-in variables (NR, NF, FS, OFS)
  • print and printf formatting
  • Arithmetic and string operations
  • Variables and arrays
  • Control structures (if, while, for)
  • String and math functions
  • Creating reports and summaries

// Why awk

awk is a complete text processing language. When you need to extract data, create reports, or transform structured text, awk handles it elegantly.

Master awk and you can turn raw log files and CSV data into meaningful summaries in a few lines.

Extract. Analyze. Report.

// Tools & References

  • awk Man Page: official documentation (man awk)
  • GAWK Manual: GNU awk guide (GAWK)
  • awk One-liners: common awk commands (QuickRef)
  • awk Tutorial: learning awk (Grymoire)