"I need to bulk-replace values across config files." "I want to extract specific columns from a CSV and calculate totals." "I need to strip certain lines from a log file."
These are classic text processing scenarios where sed and awk shine. Both are filter commands designed for pipelines, but they excel at distinctly different things.
This article walks you through sed and awk fundamentals, practical patterns, and how to combine them into powerful workflows.
sed and awk: What's the difference?
Here is the quick breakdown.
| | sed | awk |
|---|---|---|
| Full name | Stream Editor | Pattern scanning and processing language |
| Strength | Line-oriented text transformation (substitution, deletion, insertion) | Column-oriented data processing (extraction, calculation, aggregation) |
| Mental model | "Rewrite text" | "Extract data from text" |
| Typical use | Config file updates, log cleanup | CSV/TSV aggregation, access log analysis |
Both are pipeline filter commands connected with |. sed excels at transformation, awk excels at extraction and calculation. Keep this distinction in mind and you will always know which tool to reach for.
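To make the split concrete, here is the same hypothetical servers.txt (hostname, role, status columns) handled both ways: sed rewrites the text, awk pulls data out of it.
# sed transforms: rename a host everywhere it appears
sed 's/web-01/web-02/g' servers.txt
# awk extracts: print the hostname column, then report the line count
awk '{print $1} END {print NR, "servers total"}' servers.txt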
sed basics
sed — the "stream editor" — reads text line by line, applies transformations, and outputs the result.
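A minimal illustration of that model: three lines go in, each one is transformed independently, and three lines come out.
# s/o/0/ replaces the first "o" on each line; "three" has none
printf 'one\ntwo\nthree\n' | sed 's/o/0/'
# 0ne
# tw0
# three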
Substitution
The most common operation is the s (substitute) command.
# Replace the first match on each line
sed 's/old/new/' file.txt
# Replace all matches on each line (g flag)
sed 's/old/new/g' file.txt
# Edit the file in place (-i option)
sed -i 's/old/new/g' file.txt
Line and range selection
# Substitute only on line 3
sed '3s/old/new/' file.txt
# Substitute on lines 2 through 5
sed '2,5s/old/new/g' file.txt
# Substitute only on the last line
sed '$s/old/new/' file.txt
Deletion, insertion, and printing
# Delete lines matching a pattern
sed '/pattern/d' file.txt
# Insert a line after the match
sed '/pattern/a\new line' file.txt
# Insert a line before the match
sed '/pattern/i\new line' file.txt
# Print only a specific range of lines (-n suppresses default output)
sed -n '10,20p' file.txt
Multiple operations and custom delimiters
# Run multiple substitutions in one pass
sed -e 's/foo/bar/g' -e 's/baz/qux/g' file.txt
# Use an alternative delimiter (handy for paths and URLs)
sed 's|/usr/local|/opt|g' config.txt
The delimiter can be any character — |, #, @, and so on. This eliminates the need to escape / when working with file paths.
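For example, rewriting URLs with # as the delimiter (the addresses and file name here are hypothetical):
# The slashes in the URLs need no escaping
sed 's#https://old.example.com#https://new.example.com#g' links.txt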
sed in practice
Bulk config file updates
A common scenario during server migrations or deployments.
# Update DB_HOST in a .env file
NEW_HOST="db-prod.example.com"
sed -i "s/DB_HOST=.*/DB_HOST=${NEW_HOST}/" .env
# Comment out a line
sed -i 's/^PermitRootLogin yes/# PermitRootLogin yes/' /etc/ssh/sshd_config
# Uncomment a line
sed -i 's/^# *PermitRootLogin/PermitRootLogin/' /etc/ssh/sshd_config
Log file cleanup
# Remove empty lines
sed '/^$/d' file.txt
# Strip leading and trailing whitespace
sed 's/^[[:space:]]*//;s/[[:space:]]*$//' file.txt
# Remove ANSI escape sequences (color codes)
sed 's/\x1b\[[0-9;]*m//g' colored-output.txt
Batch operations with find
# Replace the year across all .txt files in the project
find . -name "*.txt" -exec sed -i 's/2025/2026/g' {} +
# Generate .env from .env.example while replacing values
sed 's/DB_PASSWORD=changeme/DB_PASSWORD=s3cur3P@ss/' .env.example > .env
awk basics
awk processes text at the field (column) level. Each line is automatically split on whitespace, and fields are accessible as $1, $2, and so on.
Note that gawk 5.4.0 switched the default regex engine to MinRX, which is fully POSIX-compliant. Patterns relying on GNU-specific regex extensions may behave differently. To use the legacy engine, set the environment variable GAWK_GNU_MATCHERS=1.
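If an existing script relies on GNU extensions such as the \< and \> word-boundary operators, you can opt back into the old engine for a single run; a sketch, assuming the variable behaves as described above (app.log is a placeholder):
# Use the legacy GNU matchers just for this invocation
GAWK_GNU_MATCHERS=1 gawk '/\<error\>/' app.log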
Column extraction
# Print the 1st and 3rd columns
awk '{print $1, $3}' data.tsv
# Specify a delimiter (for CSV)
awk -F',' '{print $1, $2}' data.csv
# Print the entire line ($0 is the whole line)
awk '{print $0}' file.txt
Pattern matching and conditions
# Print only lines containing "error"
awk '/error/ {print $0}' logfile.txt
# Print lines where the 3rd column exceeds 100
awk '$3 > 100 {print $1, $3}' data.txt
# Combine multiple conditions
awk '$3 > 100 && $2 == "active" {print $1, $3}' data.txt
Built-in variables
awk provides several useful built-in variables.
| Variable | Meaning |
|---|---|
| NR | Current line number (Number of Records) |
| NF | Number of fields on the current line (Number of Fields) |
| FS | Input field separator (Field Separator) |
| OFS | Output field separator (Output Field Separator) |
# Print with line numbers
awk '{print NR": "$0}' file.txt
# Show the field count for each line
awk '{print NR": "NF" fields"}' data.txt
# Print the last field
awk '{print $NF}' data.txt
BEGIN / END blocks and aggregation
# Add a header and footer
awk 'BEGIN {print "Name,Score"} {print $1","$3} END {print "---done---"}' data.txt
# Sum the 2nd column
awk '{sum += $2} END {print "Total:", sum}' sales.txt
# Calculate the average
awk '{sum += $2; count++} END {print "Average:", sum/count}' sales.txt
Formatted output with printf
# Left-align 20 chars, right-align 10 chars with 2 decimal places
awk '{printf "%-20s %10.2f\n", $1, $3}' data.txt
printf uses the same format specifiers as C. It is invaluable when you need neatly aligned tabular output.
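A standalone sketch of a few more specifiers (a BEGIN block needs no input file, so this runs as-is):
# %d for integers, %s for strings, %08.2f for zero-padded floats
awk 'BEGIN {printf "%d%%  %s  %08.2f\n", 42, "done", 3.14159}'
# 42%  done  00003.14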
awk in practice
CSV/TSV data aggregation
# Sales totals by category (using associative arrays)
# Input: category,product,amount CSV
awk -F',' '{
sales[$1] += $3
}
END {
for (cat in sales)
printf "%-15s %10.0f\n", cat, sales[cat]
}' sales.csv
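To see what that produces, take a hypothetical sales.csv; the per-category totals come out like this (the order of an awk for-in loop is not guaranteed):
# sales.csv:
#   fruit,apple,1200
#   fruit,banana,800
#   dairy,milk,300
# Output:
#   fruit                 2000
#   dairy                  300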
# Calculate max, min, and average in one pass
awk 'NR == 1 {max = min = $2}
{
sum += $2
count++
if ($2 > max) max = $2
if ($2 < min) min = $2
}
END {
printf "Max: %.2f Min: %.2f Avg: %.2f\n", max, min, sum/count
}' data.txt
Access log analysis
# Top 10 IP addresses by request count
awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' access.log | sort -rn | head -10
# HTTP status code breakdown
# CLF format: IP - - [date] "method path proto" status size
awk '{print $9}' access.log | sort | uniq -c | sort -rn
Formatted output
# Display user information in table format
awk -F':' 'BEGIN {
printf "%-20s %-6s %-6s %s\n", "USER", "UID", "GID", "HOME"
printf "%-20s %-6s %-6s %s\n", "----", "---", "---", "----"
}
$3 >= 1000 && $3 < 65534 {
printf "%-20s %-6s %-6s %s\n", $1, $3, $4, $6
}' /etc/passwd
Combining sed and awk
sed and awk reach their full potential when piped together. The typical pattern is sed for preprocessing (formatting, filtering) followed by awk for aggregation.
# Remove the header row from a CSV, then sum the 3rd column
sed '1d' sales.csv | awk -F',' '{sum += $3} END {print "Total:", sum}'
# Extract ERROR lines from a log, then display only the timestamp and message
sed -n '/ERROR/p' app.log | awk '{print $1, $2, substr($0, index($0,$5))}'
# Extract bash users from /etc/passwd and format the output
sed -n '/\/bash$/p' /etc/passwd | awk -F':' '{printf "%-15s UID=%-6s %s\n", $1, $3, $6}'
# Count today's errors in syslog by service
sed -n "/$(date '+%b %e')/p" /var/log/syslog | awk '/error|fail/ {count[$5]++} END {for (s in count) print count[s], s}' | sort -rn
Modern alternative: sd
sd is a Rust-powered sed alternative. Compared to sed's s/old/new/g syntax, sd requires less escaping and reads more naturally.
Installation
# Install with cargo (works the same on Linux and WSL)
cargo install sd
Basic usage
# Replace via stdin
echo "hello world" | sd 'world' 'earth'
# Edit a file in place
sd 'old' 'new' file.txt
sed vs sd comparison
| Operation | sed | sd |
|---|---|---|
| Basic replacement | sed 's/foo/bar/g' | sd 'foo' 'bar' |
| Path replacement | sed 's|/usr/local|/opt|g' | sd '/usr/local' '/opt' |
| Regex groups | sed 's/\(foo\)/[\1]/g' | sd '(foo)' '[$1]' |
| In-place edit | sed -i 's/foo/bar/g' file | sd 'foo' 'bar' file |
sd uses modern regex syntax (Rust's regex crate) by default, so there is no need for \( \) escaping around groups. Path replacements also work without changing the delimiter.
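Capture groups carry over with $1-style references and no backslashes; a quick sketch:
# Reorder an ISO date into day/month/year
echo "2026-02-14" | sd '(\d{4})-(\d{2})-(\d{2})' '$3/$2/$1'
# 14/02/2026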
# Multi-line matching (v1.1.0+)
sd --across 'start\n.*\nend' 'replaced' file.txt
If you are already comfortable with sed, there is no need to switch. However, for one-liners where regex escaping gets messy, sd significantly reduces the chance of errors. A practical rule of thumb: use sd for simple replacements, stick with sed for line addressing and complex scripts.
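For instance, a range-limited edit is natural in sed and has no direct sd equivalent (file name hypothetical):
# Only lines 10-20 are touched; sd has no line-addressing concept
sed -i '10,20s/debug/info/' app.conf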
Wrapping Up
sed and awk are the classic duo for text processing.
- sed — Line-oriented substitution, deletion, and insertion. Perfect for config updates and log cleanup
- awk — Column-oriented extraction, calculation, and aggregation. Ideal for CSV/TSV processing and access log analysis
- sed + awk — Pipe them together for preprocessing followed by aggregation. sed shapes, awk computes
- sd — A modern alternative for when sed syntax gets unwieldy. Less escaping, cleaner one-liners
Both tools have nearly 50 years of history and are installed on every Linux system. Start with sed's s command and awk's {print $1}, then build your pattern repertoire from there.
For text searching, pair these with grep and ripgrep. For JSON processing, check out jq. And for a bird's-eye view of all CLI tools, see the CLI Toolkit.