32blog by Studio Mitsu

wget Complete Guide: File Downloads, Batch Processing, and Automation

Learn wget from the basics to advanced usage: batch downloads, resuming interrupted transfers, site mirroring, scripting, and when to use curl instead.

by omitsu · 12 min read

wget is a CLI tool for downloading files over HTTP, HTTPS, and FTP. Run wget URL to download a file, use -c to resume interrupted transfers, -i for batch downloads from a URL list, and --mirror to clone entire websites for offline viewing.

Whether it's a single file, a batch of hundreds, or mirroring an entire website, wget handles it from the terminal. No GUI needed. Works over SSH. Can run in the background, resume interrupted downloads, and be scripted for automation.

This guide covers everything from the basics to real-world use cases, with ready-to-run examples throughout.

What is wget?

wget is a command-line utility for downloading files over HTTP, HTTPS, and FTP. It's part of the GNU Project and comes pre-installed on virtually every Linux distribution and available on macOS via Homebrew.

Key characteristics:

  • Non-interactive: runs completely unattended, perfect for scripts and cron jobs
  • Resumable: pick up where you left off after an interrupted download
  • Recursive: can crawl and download entire websites
  • Proxy-aware: works through HTTP proxies
  • Background-capable: detach from the terminal and download continues

Installation check

Verify wget is installed:

bash
wget --version

If it's missing:

bash
# Debian / Ubuntu
sudo apt install wget

# CentOS / RHEL
sudo yum install wget

# Fedora
sudo dnf install wget

# macOS (Homebrew)
brew install wget

Basic usage

Download a single file

bash
wget https://example.com/file.zip

The file saves to the current directory. A progress bar shows download speed, amount downloaded, and estimated time remaining.

Specify the output filename or directory

bash
# Save with a different name
wget -O myfile.zip https://example.com/file.zip

# Save to a specific directory
wget -P ~/downloads/ https://example.com/file.zip

# Custom directory and filename (use full path with -O)
wget -O ~/downloads/setup.zip https://example.com/file.zip

Run in the background

When downloading large files, detach from the terminal so you can keep working:

bash
wget -b https://example.com/largefile.iso

Output goes to wget-log. Monitor progress with:

bash
tail -f wget-log

Common options reference

Option                   What it does
-O FILE                  Save as FILE
-P DIR                   Save into directory DIR
-b                       Background mode
-c                       Continue/resume interrupted download
-q                       Quiet mode (no output)
--limit-rate=RATE        Limit speed (e.g. --limit-rate=1m)
-r                       Recursive download
-l DEPTH                 Set recursion depth
--no-check-certificate   Skip SSL verification
-i FILE                  Download URLs listed in FILE
--user-agent=STRING      Set custom User-Agent
--header=STRING          Add HTTP header
-N                       Only download if newer than local copy
--tries=N                Number of retry attempts
--timeout=SECONDS        Set connection timeout

Real-world use cases

Batch download from a URL list

Create a text file with one URL per line, then pass it to wget with -i:

bash
cat > urls.txt << EOF
https://releases.ubuntu.com/24.04.4/ubuntu-24.04.4-desktop-amd64.iso
https://example.com/data/january.csv
https://example.com/data/february.csv
https://example.com/data/march.csv
EOF

wget -i urls.txt -P ~/downloads/

Each URL is downloaded in sequence. Combine with -b to run the whole batch in the background.
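When the URLs follow a predictable pattern, you can generate the list with a short loop instead of typing it by hand. A minimal sketch (the base URL and month names are made up for illustration):

```bash
#!/bin/bash
# Build a URL list programmatically, then hand it to wget -i.
BASE="https://example.com/data"   # hypothetical base URL

> urls.txt
for month in january february march; do
  echo "${BASE}/${month}.csv" >> urls.txt
done

cat urls.txt
# → https://example.com/data/january.csv
#   https://example.com/data/february.csv
#   https://example.com/data/march.csv

# wget -i urls.txt -P ~/downloads/   # uncomment to run the batch
```

Uncomment the final line to actually fetch the batch once the list looks right.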

Resume an interrupted download

If a large download gets cut off by a network hiccup, just add -c and re-run the same command:

bash
# Original download (interrupted)
wget https://example.com/bigfile.iso

# Resume from where it stopped
wget -c https://example.com/bigfile.iso

wget checks the local file size and asks the server for only the remaining bytes. This relies on the server supporting HTTP Range requests (wget --server-response --spider URL will show an Accept-Ranges: bytes header if it does). If no partial file exists, it starts fresh.

Throttle the download speed

Avoid saturating your connection or being rate-limited by the server:

bash
# Limit to 1 MB/s
wget --limit-rate=1m https://example.com/file.iso

# Limit to 500 KB/s
wget --limit-rate=500k https://example.com/file.iso

Authenticate with username and password

For HTTP Basic Auth:

bash
wget --user=myusername --password=mypassword https://example.com/protected/file.zip

Passwords passed on the command line end up in shell history. If that's a concern, replace --password with --ask-password and wget will prompt for it interactively:

bash
wget --user=myusername --ask-password https://example.com/protected/file.zip
# wget prompts for the password before connecting

For FTP:

bash
wget ftp://ftp.example.com/pub/file.tar.gz
wget --ftp-user=user --ftp-password=pass ftp://ftp.example.com/private/file.tar.gz
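A third option that keeps credentials out of shell history entirely is a ~/.netrc file, which wget consults by default when no credentials are given on the command line. The entry below is illustrative; restrict the file with chmod 600 since it holds a plaintext password.

```text
# ~/.netrc (read automatically by wget; also used by curl -n and ftp)
machine example.com
login myusername
password mypassword
```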

Mirror an entire website

Download a complete copy of a site for offline viewing or archiving:

bash
wget --mirror \
     --convert-links \
     --adjust-extension \
     --page-requisites \
     --no-parent \
     https://example.com/

What each flag does:

  • --mirror (or -m): recursive download + timestamps + infinite depth
  • --convert-links: rewrite links to work offline (absolute → relative)
  • --adjust-extension: add .html to pages that need it
  • --page-requisites: fetch CSS, images, JS — everything needed to render the page
  • --no-parent: don't go above the specified path

The downloaded site will be in a directory named after the domain.

Only download if the file has changed

Poll a URL and only download when the server's version is newer than your local copy:

bash
wget -N https://example.com/data.csv

This is great for keeping local data files in sync. Pair it with cron for scheduled updates:

bash
# crontab -e: run daily at 3 AM
0 3 * * * wget -N -q -P /var/data/ https://example.com/data.csv

Set a custom User-Agent

Some servers block requests from wget. Impersonate a browser:

bash
wget --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" \
     https://example.com/file.zip

Add custom HTTP headers

Useful for APIs that require an Authorization token:

bash
wget --header="Authorization: Bearer YOUR_API_TOKEN" \
     --header="Accept: application/json" \
     https://api.example.com/export/data.json

wget vs curl: which one to use?

Both wget and curl download content from URLs, but they have different strengths:

Use case                       wget             curl
Simple file download           Great            Fine
Recursive / site mirroring     Great            Not supported
Resume interrupted downloads   Great            Great
API requests                   Limited          Great
Handling response data         Limited          Great
Background download            Built-in (-b)    Needs & and disown
Batch from URL list            Built-in (-i)    Needs a script
HTTP method control            Limited          Full control

Rule of thumb:

  • Downloading files → wget
  • Calling APIs, inspecting responses, complex HTTP → curl
  • Scripting with flexible output handling → curl

Advanced techniques

Check links with --spider

Verify URLs without actually downloading anything. This is useful for finding broken links on a site.

bash
# Check a single URL
wget --spider https://example.com/page.html
bash
# Batch check from a URL list
wget --spider -i urls.txt 2>&1 | grep -E "broken|200|404"
bash
# Recursively check an entire site (no downloads)
wget --spider --recursive --no-directories --level=2 \
     https://example.com/ 2>&1 | grep -B1 "broken"

--spider sends HEAD requests where possible (falling back to GET when a server doesn't support HEAD), so it uses almost no bandwidth. Useful in CI/CD pipelines for automated broken-link detection on documentation sites.
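Because wget's exit status reflects whether the URL was reachable, --spider drops neatly into shell logic. A minimal link-checker sketch (check_url and the sample list are made up; example.invalid is a reserved domain that never resolves, so this particular check is guaranteed to fail):

```bash
#!/bin/bash
# Report each URL in a list as OK or BROKEN based on wget --spider's exit status.
check_url() {
  wget --spider -q --tries=1 --timeout=5 "$1"
}

cat > check-urls.txt << 'EOF'
https://example.invalid/missing.html
EOF

while read -r url; do
  if check_url "$url"; then
    echo "OK      $url"
  else
    echo "BROKEN  $url"
  fi
done < check-urls.txt
```

Against the sample list this prints a BROKEN line; swap in your own URL file for real checks.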

Persist settings with wgetrc

Stop typing the same options every time by saving defaults in ~/.wgetrc.

text
# ~/.wgetrc
# Preserve timestamps
timestamping = on

# Retry count
tries = 3

# Connection timeout (seconds)
timeout = 30

# Default bandwidth limit
limit_rate = 2m

# Log file
logfile = /tmp/wget.log

# Respect robots.txt
robots = on
bash
# wgetrc settings apply automatically
wget https://example.com/file.zip

# Temporarily override settings
wget --no-config https://example.com/file.zip
wget --limit-rate=0 https://example.com/file.zip

Use --no-config to ignore wgetrc entirely. For team-wide settings, use /etc/wgetrc.

Log rejected URLs with --rejected-log

When downloading many files, track what failed for later review.

bash
# Log rejected URLs
wget --recursive --level=1 \
     --rejected-log=rejected.log \
     https://example.com/downloads/

# Retry only the failed URLs (skip the header row; the URL is the second tab-separated column)
awk -F'\t' 'NR > 1 { print $2 }' rejected.log | wget -i -
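The rejected log is tab-separated with a header row, and the URL sits in the second column (layout assumed from wget 1.x; check the header of your own log before relying on column positions). A synthetic sample, with made-up REASON values, to demonstrate extracting the URLs:

```bash
#!/bin/bash
# Build a tiny file in the rejected-log layout, then extract just the URLs.
printf 'REASON\tU_URL\tU_SCHEME\n'                            >  sample-rejected.log
printf 'BLACKLIST\thttps://example.com/a.zip\tSCHEME_HTTPS\n' >> sample-rejected.log
printf 'SUPPRESS\thttps://example.com/b.zip\tSCHEME_HTTPS\n'  >> sample-rejected.log

# Skip the header (NR > 1) and print the second tab-separated field
awk -F'\t' 'NR > 1 { print $2 }' sample-rejected.log
# → https://example.com/a.zip
#   https://example.com/b.zip
```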

Download large files in the background

bash
# -b runs in background (log goes to wget-log)
wget -b https://example.com/large-file.iso

# Monitor progress
tail -f wget-log

# Specify a custom log file
wget -b -o download.log https://example.com/large-file.iso

If you need downloads to survive SSH disconnections, combine with tmux or use nohup.

bash
nohup wget https://example.com/large-file.iso &

Scripting examples

wget pairs naturally with shell scripts. Here are some practical patterns.

Download multiple versions of a file

bash
#!/bin/bash

BASE_URL="https://example.com/releases"
VERSIONS=("1.0.0" "1.1.0" "1.2.0" "2.0.0")
DEST_DIR="./downloads"

mkdir -p "$DEST_DIR"

for VERSION in "${VERSIONS[@]}"; do
  FILE="myapp-${VERSION}.tar.gz"
  URL="${BASE_URL}/${VERSION}/${FILE}"

  echo "Downloading ${FILE}..."
  wget -q --show-progress -P "$DEST_DIR" "$URL"

  if [ $? -eq 0 ]; then
    echo "  OK: ${FILE}"
  else
    echo "  FAILED: ${FILE}"
  fi
done

echo "Done."

Download and verify checksum

bash
#!/bin/bash

# Check the official Ubuntu release page for the current version and SHA256 hash:
# https://releases.ubuntu.com/
URL="https://releases.ubuntu.com/24.04.4/ubuntu-24.04.4-desktop-amd64.iso"
EXPECTED_SHA256="<get the current hash from the official Ubuntu site>"

echo "Downloading..."
wget -q --show-progress -O ubuntu.iso "$URL"

echo "Verifying checksum..."
# Linux: sha256sum / macOS: shasum -a 256
if command -v sha256sum &>/dev/null; then
  ACTUAL_SHA256=$(sha256sum ubuntu.iso | awk '{print $1}')
else
  ACTUAL_SHA256=$(shasum -a 256 ubuntu.iso | awk '{print $1}')
fi

if [ "$ACTUAL_SHA256" = "$EXPECTED_SHA256" ]; then
  echo "Checksum OK — file is valid"
else
  echo "Checksum mismatch! File may be corrupted."
  exit 1
fi

Troubleshooting

SSL certificate errors

bash
# Skip verification (fine for testing, avoid in production)
wget --no-check-certificate https://example.com/file.zip

Connection timeouts or flaky servers

bash
# 30 second timeout, retry up to 5 times with a 10 second wait between retries
wget --timeout=30 --tries=5 --waitretry=10 https://example.com/file.zip

# Retry indefinitely (useful for very large downloads on unreliable connections)
wget --tries=0 https://example.com/file.zip

Check redirect chain without downloading

Inspect headers and see where a URL redirects to, without actually downloading anything:

bash
wget --server-response --spider https://example.com/file.zip

Download stalls at 0 bytes

Sometimes servers send a response but no data. Try adding a User-Agent or check if the URL requires authentication.

Security Considerations

When downloading files from external sources, security matters.

wget2 vulnerability (CVE-2025-69194): A high-severity vulnerability (CVSS 8.8) was found in GNU Wget2's Metalink document processing. When using --force-metalink, a path traversal flaw allows remote attackers to overwrite local files. If you use wget2, update to 2.2.1 or later.

Best practices:

  • Never use --no-check-certificate in production — limit it to testing with self-signed certs
  • Always verify checksums for files downloaded from untrusted sources (see the script example above)
  • User-Agent spoofing with --user-agent may violate a site's terms of service — check before scraping
  • When using cron for automated downloads, keep logs and monitor them periodically

FAQ

When should I use wget instead of curl?

Use wget for straightforward file downloads, batch processing with URL lists (-i), recursive site mirroring, and background downloads. Use curl when you need fine-grained control over HTTP methods, response handling, or API interactions.

How do I resume an interrupted wget download?

Run wget -c URL with the same URL. wget checks the local file size and requests only the remaining bytes from the server. This requires the server to support HTTP Range requests.

How do I mirror an entire website with wget?

Use wget --mirror --convert-links --adjust-extension --page-requisites --no-parent URL. The --mirror flag enables recursive downloading with infinite depth and timestamp checking, while --convert-links rewrites URLs for offline browsing.

What's the difference between wget2 and wget 1.x?

wget2 is the next-generation rewrite of GNU Wget. It adds HTTP/2 support, parallel downloads, multi-threading, and improved performance. However, it's not fully compatible with wget 1.x scripts, so test before migrating.

How do I use wget through a proxy?

Set the http_proxy and https_proxy environment variables, or add http_proxy / https_proxy directives to ~/.wgetrc. Use --no-proxy to temporarily bypass proxy settings.

How do I limit wget download speed?

Use --limit-rate=1m (1 MB/s) or --limit-rate=500k (500 KB/s). To set a default limit, add limit_rate = 2m to your ~/.wgetrc file.

What's the difference between wget -O and -P?

-O filename specifies the exact output filename. -P directory specifies the destination directory, and the filename is derived from the URL automatically.

Wrapping Up

wget is one of those tools you reach for daily once you're comfortable with it. The basics take five minutes to learn, and the advanced features are there when you need them. For the full option reference, check the official GNU Wget manual.

Quick summary of the most useful options:

  • Basic download: wget URL
  • Rename output: wget -O filename URL
  • Save to directory: wget -P /path/ URL
  • Background: wget -b URL
  • Resume: wget -c URL
  • Batch from list: wget -i urls.txt
  • Mirror a site: wget --mirror --convert-links --page-requisites URL
  • Conditional update: wget -N URL

Start with the basics, then reach for the advanced flags as specific needs come up. And when you find yourself repeatedly downloading files in a workflow, consider wrapping wget in a short shell script — it compounds nicely.