32blog by StudioMitsu

sed & awk Practical Guide: The Classic Text Processing Duo

Master sed substitution, deletion, and insertion alongside awk column extraction, aggregation, and conditional processing with real examples.

9 min read
sed · awk · CLI · Linux · bash · text-processing

"I need to bulk-replace values across config files." "I want to extract specific columns from a CSV and calculate totals." "I need to strip certain lines from a log file."

These are classic text processing scenarios where sed and awk shine. Both are filter commands designed for pipelines, but they excel at distinctly different things.

This article walks you through sed and awk fundamentals, practical patterns, and how to combine them into powerful workflows.

sed and awk: What's the difference?

Here is the quick breakdown.

|  | sed | awk |
|---|---|---|
| Full name | Stream Editor | Pattern scanning and processing language |
| Strength | Line-oriented text transformation (substitution, deletion, insertion) | Column-oriented data processing (extraction, calculation, aggregation) |
| Mental model | "Rewrite text" | "Extract data from text" |
| Typical use | Config file updates, log cleanup | CSV/TSV aggregation, access log analysis |

Both are pipeline filter commands connected with |. sed excels at transformation, awk excels at extraction and calculation. Keep this distinction in mind and you will always know which tool to reach for.
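
To make the distinction concrete, here is the same (invented) status line handled by each tool:

```bash
# A sample whitespace-separated record (hypothetical data)
line="db-01 active 120"

# sed transforms: rewrite part of the text
echo "$line" | sed 's/active/ACTIVE/'
# db-01 ACTIVE 120

# awk extracts: pull out specific fields
echo "$line" | awk '{print $1, $3}'
# db-01 120
```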

sed basics

sed — the "stream editor" — reads text line by line, applies transformations, and outputs the result.

Substitution

The most common operation is the s (substitute) command.

bash
# Replace the first match on each line
sed 's/old/new/' file.txt

# Replace all matches on each line (g flag)
sed 's/old/new/g' file.txt

# Edit the file in place (-i option)
sed -i 's/old/new/g' file.txt

Line and range selection

bash
# Substitute only on line 3
sed '3s/old/new/' file.txt

# Substitute on lines 2 through 5
sed '2,5s/old/new/g' file.txt

# Substitute only on the last line
sed '$s/old/new/' file.txt

Deletion, insertion, and printing

bash
# Delete lines matching a pattern
sed '/pattern/d' file.txt

# Append a line after the match (a command)
sed '/pattern/a\new line' file.txt

# Insert a line before the match
sed '/pattern/i\new line' file.txt

# Print only a specific range of lines (-n suppresses default output)
sed -n '10,20p' file.txt

Multiple operations and custom delimiters

bash
# Run multiple substitutions in one pass
sed -e 's/foo/bar/g' -e 's/baz/qux/g' file.txt

# Use an alternative delimiter (handy for paths and URLs)
sed 's|/usr/local|/opt|g' config.txt

The delimiter can be any character — |, #, @, and so on. This eliminates the need to escape / when working with file paths.
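
sed also supports capture groups with \( \) and the whole-match reference & in the replacement; the examples below demonstrate both on made-up input:

```bash
# \1 refers to the first capture group
echo "error: disk full" | sed 's/\(error\)/[\1]/'
# [error]: disk full

# & refers to the entire matched text
echo "port 8080" | sed 's/[0-9]*$/"&"/'
# port "8080"
```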

sed in practice

Bulk config file updates

A common scenario during server migrations or deployments.

bash
# Update DB_HOST in a .env file
NEW_HOST="db-prod.example.com"
sed -i "s/DB_HOST=.*/DB_HOST=${NEW_HOST}/" .env

# Comment out a line
sed -i 's/^PermitRootLogin yes/# PermitRootLogin yes/' /etc/ssh/sshd_config

# Uncomment a line
sed -i 's/^# *PermitRootLogin/PermitRootLogin/' /etc/ssh/sshd_config

Log file cleanup

bash
# Remove empty lines
sed '/^$/d' file.txt

# Strip leading and trailing whitespace
sed 's/^[[:space:]]*//;s/[[:space:]]*$//' file.txt

# Remove ANSI escape sequences (color codes)
sed 's/\x1b\[[0-9;]*m//g' colored-output.txt

Batch operations with find

bash
# Replace the year across all .txt files in the project
find . -name "*.txt" -exec sed -i 's/2025/2026/g' {} +

# Generate .env from .env.example while replacing values
sed 's/DB_PASSWORD=changeme/DB_PASSWORD=s3cur3P@ss/' .env.example > .env

awk basics

awk processes text at the field (column) level. Each line is automatically split on whitespace, and fields are accessible as $1, $2, and so on.
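
A quick sanity check of that splitting behavior: runs of spaces and tabs count as a single separator by default, so irregular spacing is harmless.

```bash
# Extra whitespace between fields does not create empty fields
echo "alpha   beta gamma" | awk '{print $2}'
# beta
```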

Note that gawk 5.4.0 switched the default regex engine to MinRX, which is fully POSIX-compliant. Patterns relying on GNU-specific regex extensions may behave differently. To use the legacy engine, set the environment variable GAWK_GNU_MATCHERS=1.

Column extraction

bash
# Print the 1st and 3rd columns
awk '{print $1, $3}' data.tsv

# Specify a delimiter (for CSV)
awk -F',' '{print $1, $2}' data.csv

# Print the entire line ($0 is the whole line)
awk '{print $0}' file.txt

Pattern matching and conditions

bash
# Print only lines containing "error"
awk '/error/ {print $0}' logfile.txt

# Print lines where the 3rd column exceeds 100
awk '$3 > 100 {print $1, $3}' data.txt

# Combine multiple conditions
awk '$3 > 100 && $2 == "active" {print $1, $3}' data.txt

Built-in variables

awk provides several useful built-in variables.

| Variable | Meaning |
|---|---|
| NR | Current line number (Number of Records) |
| NF | Number of fields on the current line (Number of Fields) |
| FS | Input field separator (Field Separator) |
| OFS | Output field separator |

bash
# Print with line numbers
awk '{print NR": "$0}' file.txt

# Show the field count for each line
awk '{print NR": "NF" fields"}' data.txt

# Print the last field
awk '{print $NF}' data.txt
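
FS and OFS pair naturally for format conversion. A small sketch, using invented input, that re-emits whitespace-separated data as CSV:

```bash
# OFS is applied between fields listed in print with commas
echo "alice 42 tokyo" | awk 'BEGIN {OFS=","} {print $1, $2, $3}'
# alice,42,tokyo
```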

BEGIN / END blocks and aggregation

bash
# Add a header and footer
awk 'BEGIN {print "Name,Score"} {print $1","$3} END {print "---done---"}' data.txt

# Sum the 2nd column
awk '{sum += $2} END {print "Total:", sum}' sales.txt

# Calculate the average
awk '{sum += $2; count++} END {print "Average:", sum/count}' sales.txt

Formatted output with printf

bash
# Left-align 20 chars, right-align 10 chars with 2 decimal places
awk '{printf "%-20s %10.2f\n", $1, $3}' data.txt

printf uses the same format specifiers as C. It is invaluable when you need neatly aligned tabular output.

awk in practice

CSV/TSV data aggregation

bash
# Sales totals by category (using associative arrays)
# Input: category,product,amount CSV
awk -F',' '{
    sales[$1] += $3
}
END {
    for (cat in sales)
        printf "%-15s %10.0f\n", cat, sales[cat]
}' sales.csv
bash
# Calculate max, min, and average in one pass
awk 'NR == 1 {max = min = $2}
{
    sum += $2
    count++
    if ($2 > max) max = $2
    if ($2 < min) min = $2
}
END {
    printf "Max: %.2f  Min: %.2f  Avg: %.2f\n", max, min, sum/count
}' data.txt

Access log analysis

bash
# Top 10 IP addresses by request count
awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' access.log | sort -rn | head -10
bash
# HTTP status code breakdown
# CLF format: IP - - [date] "method path proto" status size
awk '{print $9}' access.log | sort | uniq -c | sort -rn

Formatted output

bash
# Display user information in table format
awk -F':' 'BEGIN {
    printf "%-20s %-6s %-6s %s\n", "USER", "UID", "GID", "HOME"
    printf "%-20s %-6s %-6s %s\n", "----", "---", "---", "----"
}
$3 >= 1000 && $3 < 65534 {
    printf "%-20s %-6s %-6s %s\n", $1, $3, $4, $6
}' /etc/passwd

Combining sed and awk

sed and awk reach their full potential when piped together. The typical pattern is sed for preprocessing (formatting, filtering) followed by awk for aggregation.

bash
# Remove the header row from a CSV, then sum the 3rd column
sed '1d' sales.csv | awk -F',' '{sum += $3} END {print "Total:", sum}'
bash
# Extract ERROR lines from a log, then display only the timestamp and message
sed -n '/ERROR/p' app.log | awk '{print $1, $2, substr($0, index($0,$5))}'
bash
# Extract bash users from /etc/passwd and format the output
grep '/bash$' /etc/passwd | awk -F':' '{printf "%-15s UID=%-6s %s\n", $1, $3, $6}'
bash
# Count today's errors in syslog by service
sed -n "/$(date '+%b %e')/p" /var/log/syslog | awk '/error|fail/ {count[$5]++} END {for (s in count) print count[s], s}' | sort -rn

Modern alternative: sd

sd is a Rust-powered sed alternative. Compared to sed's s/old/new/g syntax, sd requires less escaping and reads more naturally.

Installation

bash
# Install via cargo (same command on Linux and WSL)
cargo install sd

Basic usage

bash
# Replace via stdin
echo "hello world" | sd 'world' 'earth'

# Edit a file in place
sd 'old' 'new' file.txt

sed vs sd comparison

| Operation | sed | sd |
|---|---|---|
| Basic replacement | `sed 's/foo/bar/g'` | `sd 'foo' 'bar'` |
| Path replacement | `sed 's\|/usr/local\|/opt\|g'` | `sd '/usr/local' '/opt'` |
| Regex groups | `sed 's/\(foo\)/[\1]/g'` | `sd '(foo)' '[$1]'` |
| In-place edit | `sed -i 's/foo/bar/g' file` | `sd 'foo' 'bar' file` |

sd uses Rust's regex syntax (close to PCRE) by default, so groups are written with plain ( ) and no \( \) escaping. And because the pattern and replacement are separate arguments, path replacements work without switching delimiters.

bash
# Multi-line matching (v1.1.0+)
sd --across 'start\n.*\nend' 'replaced' file.txt

If you are already comfortable with sed, there is no need to switch. However, for one-liners where regex escaping gets messy, sd significantly reduces the chance of errors. A practical rule of thumb: use sd for simple replacements, stick with sed for line addressing and complex scripts.

Wrapping Up

sed and awk are the classic duo for text processing.

  • sed — Line-oriented substitution, deletion, and insertion. Perfect for config updates and log cleanup
  • awk — Column-oriented extraction, calculation, and aggregation. Ideal for CSV/TSV processing and access log analysis
  • sed + awk — Pipe them together for preprocessing followed by aggregation. sed shapes, awk computes
  • sd — A modern alternative for when sed syntax gets unwieldy. Less escaping, cleaner one-liners

Both tools have nearly 50 years of history and are installed on every Linux system. Start with sed's s command and awk's {print $1}, then build your pattern repertoire from there.

For text searching, pair these with grep and ripgrep. For JSON processing, check out jq. And for a bird's-eye view of all CLI tools, see the CLI Toolkit.