32blog by Studio Mitsu

fqmpeg Audio Processing: Levels, EQ & Dynamics (26 Verbs)

Twenty-six fqmpeg verbs for audio: levels, loudness, EQ, dynamics, pitch, sync, and silence — source-verified defaults, dry-run output, and recipes.

by omitsu · 23 min read

fqmpeg's C9 cluster is the audio engine — 26 verbs that touch the audio track of any media file. Five do the basics (extract, strip, mute, volume, fade). Three handle loudness measurement and normalization (normalize, audio-normalize-peak, loudness-meter). Three are dynamics processors (audio-compressor, limiter, audio-gate). Six are EQ and frequency filters (audio-eq and four single-band filters plus audio-bandpass). Four are time / pitch / sync verbs (audio-pitch, audio-speed, audio-reverse, audio-delay). Five deal with noise and silence (audio-noise-reduce, detect-silence, trim-silence, silence-insert, censor).

This guide walks each verb against its source in src/commands/ of fqmpeg 3.0.3 — the underlying FFmpeg filter, the defaults, the output filename, and the gotchas you can't see from --help alone (normalize is EBU R128 loudnorm with fixed TP/LRA; audio-pitch uses rubberband so the dependency is non-trivial on some builds; audio-delay with a negative value isn't a delay at all but an atrim that advances audio by cutting the start; censor is an audio bleep — there's no video mosaic here; silence-insert is built from apad + adelay and only inserts at the head of the segment that follows the insertion point, not at an arbitrary midpoint splice).

What you'll get out of this guide

  • A decision matrix for the 26 verbs by task (basics / loudness / dynamics / EQ / time / silence)
  • Exact FFmpeg invocation each verb generates (verified --dry-run output)
  • Defaults, ranges, units (dB vs multiplier vs Hz vs semitones), and output filenames for every command
  • Three end-to-end recipes — podcast mastering, dialogue cleanup, and music-bed ducking

The 26 Verbs at a Glance

The cluster splits into six task groups. Pick the group, then the verb.

| Group | Verbs | What they do |
| --- | --- | --- |
| Basics | audio, strip-audio, mute, volume, audio-fade | Extract MP3, remove audio, mute a range, change gain, fade in/out |
| Loudness | normalize, audio-normalize-peak, loudness-meter | EBU R128 loudnorm, peak-normalize to dB target, measure LUFS |
| Dynamics | audio-compressor, limiter, audio-gate | Compress range, prevent clipping, gate below threshold |
| EQ & filters | audio-eq, audio-bass-boost, audio-treble-boost, audio-bandpass, audio-highpass, audio-lowpass | 3-band EQ, single-band boost, pass / cut filters |
| Time / pitch / sync | audio-pitch, audio-speed, audio-reverse, audio-delay | Shift pitch without speed, change tempo, reverse, A/V sync offset |
| Noise & silence | audio-noise-reduce, detect-silence, trim-silence, silence-insert, censor | FFT denoise, find silent ranges, cut them out, insert silence, bleep a range |

Five things to know before reading on:

  1. Most verbs use -c:v copy to preserve video. If the input is a video file, the video stream is stream-copied without re-encoding. Only audio (extracts to MP3) and loudness-meter / detect-silence (no output file) deviate.
  2. normalize is one-pass loudnorm with fixed TP=-1.5 and LRA=11. Only the integrated loudness target (--target, default -16 LUFS) is exposed. For two-pass measurement-then-apply, use loudness-meter to read the values first, then drop to raw FFmpeg.
  3. volume accepts both multipliers and dB. 0.5 is half, 2.0 is double, 3dB and -5dB work too. The parser is parseVolumeLevel in utils.js — anything else is rejected before FFmpeg runs. Negative dB (-5dB) needs -- as end-of-options separator on the shell — see the volume section below. The same applies to negative audio-pitch semitones (-3) and negative audio-delay milliseconds (-200).
  4. audio-delay is asymmetric. Positive values use adelay to push audio later. Negative values use atrim + asetpts=PTS-STARTPTS to advance audio by cutting the start — so the output is shorter than the input by exactly that amount.
  5. audio-eq uses equalizer=f=1000:t=h:width=500:g=N for the mid band. That's a hardcoded mid-band centered at 1 kHz with a 500 Hz half-width — fine for vocal presence but not a true 3-band shelving EQ. If you need precise mid control, drop to raw FFmpeg with equalizer and pick your own center.

Basics

audio — Extract audio as MP3

Strips the video stream and re-encodes audio to MP3 at VBR quality 2 (roughly 190 kbps).

bash
$ npx fqmpeg audio input.mp4 --dry-run

  ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 input-audio.mp3

The output extension is always .mp3 regardless of the source codec — fqmpeg re-encodes. If you want a lossless copy of an AAC track, use raw FFmpeg with ffmpeg -i input.mp4 -vn -c:a copy input.m4a instead.

strip-audio — Remove the audio track

Drops audio, copies the video stream unchanged. Fast, no re-encode.

bash
$ npx fqmpeg strip-audio input.mp4 --dry-run

  ffmpeg -i input.mp4 -an -c:v copy input-noaudio.mp4

Useful for B-roll prep before stacking or concat — mixed audio tracks across clips cause container-level mismatches that strip-audio sidesteps cleanly.

mute — Mute a time range

Silences audio between two timestamps while keeping the rest at full volume. Video stream is copied.

| Option | Default | Notes |
| --- | --- | --- |
| -s, --start <seconds> | 0 | Start of mute window |
| -e, --end <seconds> | required | End of mute window |
| -o, --output <path> | <input-stem>-muted.<ext> | |
bash
$ npx fqmpeg mute input.mp4 -s 10 -e 15 --dry-run

  ffmpeg -i input.mp4 -af volume=enable='between(t,10,15)':volume=0 -c:v copy input-muted.mp4

--end is required (errors out if missing). The values are seconds only — no HH:MM:SS parsing here.

volume — Adjust audio gain

Multiplies amplitude or applies dB gain across the whole file.

  • Source: src/commands/volume.js
  • Filter: volume=<level> where <level> is a multiplier (0.5, 2.0) or dB (3dB, -5dB)
  • Validation: parseVolumeLevel rejects anything that isn't <number> or <number>dB
  • Output: <input-stem>-vol<sanitized-level>.<ext> (e.g. input-vol2.mp4, input-vol-5dB.mp4)
bash
$ npx fqmpeg volume input.mp4 2.0 --dry-run

  ffmpeg -i input.mp4 -af volume=2.0 -c:v copy input-vol2.0.mp4

$ npx fqmpeg volume input.mp4 --dry-run -- -5dB

  ffmpeg -i input.mp4 -af volume=-5dB -c:v copy input-vol-5dB.mp4

Negative dB values start with -, which commander.js (the CLI parser fqmpeg uses) treats as an option flag, so bare volume input.mp4 -5dB errors with unknown option '-5dB'. Pass -- as the end-of-options separator and place every --* option (like --dry-run) before --; the <level> positional then accepts the dash-prefixed value. Multipliers above ~3.0 (or dB above ~+10dB) on already-loud audio will clip — chain with limiter or use normalize instead when you want headroom.
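
If you wrap fqmpeg in scripts, you can reject bad levels before the command ever runs. This is a hedged sketch of the shape parseVolumeLevel presumably accepts (a plain number, or a number with a dB suffix) — the regex and helper are mine, not fqmpeg's source:

```shell
# Hypothetical pre-check mirroring parseVolumeLevel's accept/reject rule:
# a plain number (0.5, 2.0, -3) or a number with a dB suffix (3dB, -5dB).
is_valid_level() {
  printf '%s' "$1" | grep -Eq '^-?[0-9]+(\.[0-9]+)?(dB)?$'
}

is_valid_level 2.0  && echo "2.0: ok"
is_valid_level -5dB && echo "-5dB: ok"
is_valid_level loud || echo "loud: rejected"
```

Checking up front gives you a clearer error message than commander.js's unknown option '-5dB'.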

audio-fade — Fade audio in and/or out

Applies an afade=in at the start, an afade=out near the end, or both. Video stream is copied.

| Option | Default | Notes |
| --- | --- | --- |
| --in <seconds> | 0 | Fade-in duration (omit or 0 for no fade-in) |
| --out <seconds> | 0 | Fade-out duration (requires --duration) |
| --duration <seconds> | required if --out > 0 | Total audio length so the out-fade can start at the right offset |
bash
$ npx fqmpeg audio-fade input.mp4 --in 2 --out 3 --duration 60 --dry-run

  ffmpeg -i input.mp4 -af afade=t=in:st=0:d=2,afade=t=out:st=57:d=3 -c:v copy input-audiofade.mp4

The duration requirement is structural — FFmpeg's afade=t=out needs an explicit start time, not "end minus X." Run npx fqmpeg duration input.mp4 first if you don't know the length.
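
The out-fade start is just duration minus fade length, so you can reproduce the filter string from the dry-run above by hand. The awk one-liner here is mine, not fqmpeg's code — it only illustrates the st = duration - out arithmetic:

```shell
# Derive the afade out-filter for --out 3 --duration 60:
# the fade must start at duration minus fade length (60 - 3 = 57 s).
awk -v dur=60 -v fout=3 'BEGIN { printf "afade=t=out:st=%g:d=%g\n", dur - fout, fout }'
# prints: afade=t=out:st=57:d=3
```

Feed the measured duration from npx fqmpeg duration into dur and the computed st always lands at the true end of the file.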

Loudness Measurement & Normalization

normalize — One-pass EBU R128 loudnorm

Applies the loudnorm filter in single-pass mode targeting a configurable integrated loudness (LUFS). True peak and loudness range are hardcoded for safety.

  • Source: src/commands/normalize.js
  • Filter: loudnorm=I=<target>:TP=-1.5:LRA=11
  • Range: --target accepts -70 to -5 (LUFS), default -16
bash
$ npx fqmpeg normalize input.mp4 --target -14 --dry-run

  ffmpeg -i input.mp4 -af loudnorm=I=-14:TP=-1.5:LRA=11 -c:v copy input-normalized.mp4

Common targets: -23 LUFS (EBU broadcast), -16 LUFS (podcast / streaming, default), -14 LUFS (Spotify / YouTube), -10 LUFS (loud commercial reference). One-pass loudnorm is approximate — for broadcast-grade accuracy run a two-pass workflow.
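
If you script deliveries for several platforms, the targets above can live in one small lookup. The helper and platform names are illustrative (not part of fqmpeg); the LUFS values are the ones listed above:

```shell
# Map a delivery platform to a loudnorm --target value.
# The mapping is mine; fqmpeg only exposes the raw --target number.
target_for() {
  case "$1" in
    broadcast-eu) printf '%s\n' -23 ;;
    podcast)      printf '%s\n' -16 ;;
    streaming)    printf '%s\n' -14 ;;
    *)            printf '%s\n' -16 ;;  # fall back to fqmpeg's own default
  esac
}

target_for streaming   # -14
```

A call like npx fqmpeg normalize input.mp4 --target "$(target_for streaming)" then expands to the -14 invocation shown in the dry-run above.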

audio-normalize-peak — Peak-normalize to a dB target

Uses linear loudnorm (true-peak-only mode) to scale audio so the loudest sample lands at --peak dB. Faster than loudnorm because no perceptual modeling runs.

bash
$ npx fqmpeg audio-normalize-peak input.mp4 --peak -1 --dry-run

  ffmpeg -i input.mp4 -af loudnorm=I=-24:TP=-1:LRA=7:linear=true -c:v copy input-peak-norm.mp4

Use --peak -1 or --peak -3 for safer headroom — true 0 dBFS peaks survive lossless but can clip after MP3/AAC re-encode due to inter-sample reconstruction.

loudness-meter — Measure EBU R128 loudness

Read-only verb. Runs ebur128 and prints the integrated loudness, true peak, and loudness range to stderr. No output file.

bash
$ npx fqmpeg loudness-meter input.mp4 --dry-run

  ffmpeg -i input.mp4 -af ebur128 -f null -

Read the final Summary: block on stderr after running it for real — it shows I:, LRA:, LRA low:, LRA high:, and True peak:, plus a Threshold: line under both the integrated-loudness and loudness-range sections (the label appears twice; that's normal ebur128 output, not a glitch). Pipe stderr through grep -E '(I|LRA|True peak):' to keep just the numbers.
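
For scripting, the integrated value can be pulled out with awk instead of eyeballing the block. The summary text below is an illustrative sample laid out in ebur128's format, not captured output:

```shell
# Extract the I: value from an ebur128 summary (sample text, illustrative):
summary='  Integrated loudness:
    I:         -18.3 LUFS
    Threshold: -28.6 LUFS'

i_lufs=$(printf '%s\n' "$summary" | awk '$1 == "I:" { print $2 }')
echo "$i_lufs"   # -18.3
```

Substitute the real stderr capture for the sample variable and the same awk match works unchanged.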

Dynamics Processing

audio-compressor — Dynamic range compression

Reduces the difference between quiet and loud parts using acompressor.

| Option | Default | Notes |
| --- | --- | --- |
| --threshold <dB> | -20 | Compression starts above this level |
| --ratio <n> | 4 | 4 = 4:1, 2 = 2:1, etc. |
| --attack <ms> | 20 | How fast compression engages |
| --release <ms> | 250 | How fast compression releases |
bash
$ npx fqmpeg audio-compressor input.mp4 --threshold -18 --ratio 3 --dry-run

  ffmpeg -i input.mp4 -af acompressor=threshold=-18dB:ratio=3:attack=20:release=250 -c:v copy input-compressed-audio.mp4

Output is <input>-compressed-audio.<ext> — note the suffix differs from the video-compression compress verb's <input>-compressed.<ext> to avoid collisions when you chain both on the same file.
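
A quick way to sanity-check a ratio is the textbook compressor relation out = threshold + (in - threshold) / ratio. This is the standard formula, not something read out of fqmpeg's source:

```shell
# Predicted level for a -6 dB peak through threshold -18 dB at 3:1:
# out = thr + (in - thr) / ratio  ->  -18 + 12/3 = -14 dB
awk -v in_db=-6 -v thr=-18 -v ratio=3 \
  'BEGIN { printf "%g dB\n", thr + (in_db - thr) / ratio }'
# prints: -14 dB
```

So a 12 dB overshoot above threshold comes out as 4 dB — useful for judging whether 3:1 is enough before you commit to a render.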

limiter — Brick-wall limiter (alimiter)

Prevents audio from exceeding a ceiling. Faster attack and release than the compressor; used as the last stage before delivery.

  • Source: src/commands/limiter.js
  • Filter: alimiter=limit=<l>dB:attack=<a>:release=<r>
  • Defaults: --limit -1 dB, --attack 5 ms, --release 50 ms
  • Output: <input-stem>-limited.<ext>
bash
$ npx fqmpeg limiter input.mp4 --limit -1 --dry-run

  ffmpeg -i input.mp4 -af alimiter=limit=-1dB:attack=5:release=50 -c:v copy input-limited.mp4

-1 dB is the standard delivery ceiling for streaming (gives headroom for inter-sample peaks). For broadcast TV use -2 dB; for DVD/Blu-ray you can run to -0.3.

audio-gate — Noise gate

Silences audio that falls below a threshold — useful for cleaning up dialog with constant low-level room noise.

  • Source: src/commands/audio-gate.js
  • Filter: agate=threshold=<t>dB:attack=<a>:release=<r>
  • Defaults: --threshold -30 dB, --attack 20 ms, --release 250 ms
  • Output: <input-stem>-gated.<ext>
bash
$ npx fqmpeg audio-gate input.mp4 --threshold -35 --dry-run

  ffmpeg -i input.mp4 -af agate=threshold=-35dB:attack=20:release=250 -c:v copy input-gated.mp4

Set threshold a few dB above your noise floor (run loudness-meter or look at a waveform first). Too high and you'll cut into quiet dialog; too low and the gate doesn't help.

EQ & Frequency Filters

audio-eq — 3-band equalizer

Bass / mid / treble in one shot. Internally chains bass, equalizer (mid), and treble.

  • Source: src/commands/audio-eq.js

  • Filter assembly (only nonzero bands emit a filter):

    text
    bass=g=<bass>,
    treble=g=<treble>,
    equalizer=f=1000:t=h:width=500:g=<mid>
    
| Option | Default | Range |
| --- | --- | --- |
| --bass <dB> | 0 | -20 to 20 |
| --mid <dB> | 0 | -20 to 20 (hardcoded center 1 kHz, width 500 Hz) |
| --treble <dB> | 0 | -20 to 20 |
bash
$ npx fqmpeg audio-eq input.mp4 --bass 3 --treble 2 --dry-run

  ffmpeg -i input.mp4 -af bass=g=3,treble=g=2 -c:v copy input-eq.mp4

If all three values are zero the command errors out — at least one of --bass, --mid, --treble must be non-zero.

audio-bass-boost — Single-band bass boost

Wraps bass=g=<gain>:f=<freq> for precise center-frequency control.

bash
$ npx fqmpeg audio-bass-boost input.mp4 --gain 6 --freq 80 --dry-run

  ffmpeg -i input.mp4 -af bass=g=6:f=80 -c:v copy input-bass-boost.mp4

Center frequencies in the 60-120 Hz range hit kick-drum / male-voice fundamentals; 200+ Hz starts coloring midrange.

audio-treble-boost — Single-band treble boost

Wraps treble=g=<gain>:f=<freq>.

bash
$ npx fqmpeg audio-treble-boost input.mp4 --gain 4 --freq 5000 --dry-run

  ffmpeg -i input.mp4 -af treble=g=4:f=5000 -c:v copy input-treble-boost.mp4

3-5 kHz adds vocal presence; 8-12 kHz adds "air"; above 16 kHz mostly affects perceived brightness on younger ears.

audio-bandpass — Band-pass filter

Keeps a narrow frequency band, attenuates everything outside.

  • Source: src/commands/audio-bandpass.js
  • Filter: bandpass=f=<freq>:w=<width>
  • Defaults: --freq 1000 Hz, --width 200 Hz
  • Output: <input-stem>-bandpass.<ext>
bash
$ npx fqmpeg audio-bandpass input.mp4 --freq 1500 --width 400 --dry-run

  ffmpeg -i input.mp4 -af bandpass=f=1500:w=400 -c:v copy input-bandpass.mp4

Use for telephone / walkie-talkie effects (300-3000 Hz band) or for isolating a problem frequency before notching it out.

audio-highpass — High-pass (remove low frequencies)

Cuts everything below the cutoff. Useful for removing rumble / HVAC / handling noise.

bash
$ npx fqmpeg audio-highpass input.mp4 --freq 80 --dry-run

  ffmpeg -i input.mp4 -af highpass=f=80 -c:v copy input-highpass.mp4

80 Hz is a safe default that removes most rumble without thinning male vocals; 120 Hz is aggressive (cuts low male voice); 200 Hz is "telephone bandwidth" territory.

audio-lowpass — Low-pass (remove high frequencies)

Cuts everything above the cutoff. Useful for taking the edge off sibilant / hissy material.

bash
$ npx fqmpeg audio-lowpass input.mp4 --freq 8000 --dry-run

  ffmpeg -i input.mp4 -af lowpass=f=8000 -c:v copy input-lowpass.mp4

The default 3000 Hz is dramatic (muffled / underwater feel). For gentle de-hissing on hot mics try --freq 10000 or 12000.

Time, Pitch & Sync

audio-pitch — Shift pitch without changing speed

Uses the Rubber Band library (must be compiled into your FFmpeg).

  • Source: src/commands/audio-pitch.js
  • Filter: rubberband=pitch=<factor> where factor = 2^(semitones/12)
  • Argument: <semitones> — positional, accepts positive or negative numbers (e.g. 2, -3, -1.5)
  • Output: <input-stem>-pitch+<n>.<ext> or <input-stem>-pitch<n>.<ext> (the sign is in the filename)
bash
$ npx fqmpeg audio-pitch input.mp4 2 --dry-run

  ffmpeg -i input.mp4 -af rubberband=pitch=1.122462 -c:v copy input-pitch+2.mp4

$ npx fqmpeg audio-pitch input.mp4 --dry-run -- -3

  ffmpeg -i input.mp4 -af rubberband=pitch=0.840896 -c:v copy input-pitch-3.mp4

If you get No such filter: 'rubberband', your FFmpeg was built without --enable-librubberband. Static builds from BtbN and gyan.dev include it; the Homebrew default does too.
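
The factor in the dry-run output is reproducible by hand from the formula above (factor = 2^(semitones/12)) — handy when you want to verify a build or pre-compute values for raw FFmpeg:

```shell
# Compute the rubberband pitch factor for a given semitone shift
# (factor = 2^(semitones/12), per the filter description above):
pitch_factor() {
  awk -v n="$1" 'BEGIN { printf "%.6f\n", 2 ^ (n / 12) }'
}

pitch_factor 2    # 1.122462
pitch_factor -3   # 0.840896
```

Both values match the dry-run outputs shown above for +2 and -3 semitones.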

audio-speed — Change playback tempo without changing pitch

Wraps atempo=<factor>. The native filter accepts 0.5 to 2.0 in a single instance — for stronger changes FFmpeg supports chaining (atempo=2.0,atempo=2.0 for 4x), but fqmpeg doesn't auto-chain. Stay within 0.5 to 2.0.

bash
$ npx fqmpeg audio-speed input.mp4 1.5 --dry-run

  ffmpeg -i input.mp4 -af atempo=1.5 -c:v copy input-aspeed.mp4

Note: only the audio is sped up — video timing is unchanged because -c:v copy keeps the original frame timestamps. To change both, use the C4 speed verb instead.
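
If you do need more than 2x, the chaining FFmpeg supports can be generated by a wrapper. The decomposition loop below is my sketch (fqmpeg doesn't auto-chain); feed the result to raw ffmpeg -af:

```shell
# Decompose a tempo factor outside [0.5, 2.0] into a chain of atempo stages:
atempo_chain() {
  awk -v f="$1" 'BEGIN {
    s = ""
    while (f > 2.0) { s = s "atempo=2.0,"; f /= 2.0 }   # halve until in range
    while (f < 0.5) { s = s "atempo=0.5,"; f *= 2.0 }   # double until in range
    printf "%satempo=%g\n", s, f
  }'
}

atempo_chain 3     # atempo=2.0,atempo=1.5
atempo_chain 0.25  # atempo=0.5,atempo=0.5
```

Each stage stays inside atempo's per-instance range, so ffmpeg -i input.mp4 -af "$(atempo_chain 3)" out.mp4 gets you a clean 3x.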

audio-reverse — Reverse audio only

Plays audio back-to-front while keeping video forward (creates a striking glitch effect).

bash
$ npx fqmpeg audio-reverse input.mp4 --dry-run

  ffmpeg -i input.mp4 -af areverse -c:v copy input-audio-reversed.mp4

areverse buffers the entire audio stream in RAM — keep this in mind for multi-hour files. For both audio + video reverse, use the C4 reverse verb.

audio-delay — A/V sync offset (asymmetric semantics)

Delays or advances the audio relative to the video. Positive values delay; negative values advance.

  • Source: src/commands/audio-delay.js
  • Positive (delay): adelay=<ms>|<ms> — pads silence at the start of the audio track
  • Negative (advance): atrim=start=<sec>,asetpts=PTS-STARTPTS cuts audio from the start, shortening the output by exactly that amount
  • Argument: <ms> — integer milliseconds, positive or negative
  • Output: <input-stem>-synced.<ext>
bash
$ npx fqmpeg audio-delay input.mp4 200 --dry-run

  ffmpeg -i input.mp4 -af adelay=200|200 -c:v copy input-synced.mp4

$ npx fqmpeg audio-delay input.mp4 --dry-run -- -200

  ffmpeg -i input.mp4 -af atrim=start=0.2,asetpts=PTS-STARTPTS -c:v copy input-synced.mp4

The asymmetry matters: if you advance audio by 200 ms, the first 200 ms of audio is gone, not just shifted. That's fine for fixing capture-side latency where the early audio is silence anyway, but it's destructive if the start contained content.

Noise & Silence

audio-noise-reduce — FFT-based denoising

Wraps afftdn (FFT denoiser). Works best on steady-state noise (fan hum, traffic, tape hiss).

bash
$ npx fqmpeg audio-noise-reduce input.mp4 --strength 18 --dry-run

  ffmpeg -i input.mp4 -af afftdn=nr=18 -c:v copy input-denoised.mp4

--strength 12 is gentle (audible noise reduction without artifacts on speech); --strength 24 is aggressive (cleaner but starts to add "underwater" artifacts). Run on a 5-second sample first to find the right level for your material.

detect-silence — Find silent ranges (analyze-only)

Read-only verb. Runs silencedetect and prints silence_start / silence_end / silence_duration lines to stderr. No output file.

| Option | Default | Notes |
| --- | --- | --- |
| --threshold <dB> | -30 | Audio quieter than this counts as silence |
| --duration <sec> | 2 | Minimum gap length to report |
bash
$ npx fqmpeg detect-silence input.mp4 --threshold -40 --duration 1 --dry-run

  ffmpeg -i input.mp4 -af silencedetect=noise=-40dB:d=1 -f null -

Pipe stderr through grep silence_ to extract just the timing data. Useful for scripted scene-detection before chopping interviews into segments.
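
The stderr lines are regular enough to slice into key/value pairs with grep -o. The log text here is an illustrative sample in silencedetect's line format, not captured output:

```shell
# Pull 'key: value' pairs out of silencedetect stderr (sample, illustrative):
log='[silencedetect @ 0x1] silence_start: 12.52
[silencedetect @ 0x1] silence_end: 15.94 | silence_duration: 3.42'

printf '%s\n' "$log" | grep -o 'silence_[a-z]*: [0-9.]*'
```

Each match lands on its own line, which is easy to pair up into (start, end) ranges for a later trim or splice script.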

trim-silence — Remove silent sections

Uses silenceremove to cut out silence above a minimum duration. Both leading silence and inter-segment silence are removed.

  • Source: src/commands/trim-silence.js

  • Filter:

    text
    silenceremove=start_periods=1:
    start_duration=0:
    start_threshold=<t>dB:
    stop_periods=-1:
    stop_duration=<d>:
    stop_threshold=<t>dB
    
  • Defaults: --threshold -30 dB, --min-duration 0.5 sec

  • Output: <input-stem>-trimmed-silence.<ext>

bash
$ npx fqmpeg trim-silence input.mp4 --threshold -35 --min-duration 1 --dry-run

  ffmpeg -i input.mp4 -af silenceremove=start_periods=1:start_duration=0:start_threshold=-35dB:stop_periods=-1:stop_duration=1:stop_threshold=-35dB -c:v copy input-trimmed-silence.mp4

stop_periods=-1 means "remove all silent sections" (not just the first one). The video stream is still copied — A/V will drift, since audio is shorter than video after removal. For an interview cut where you want video to follow, drop to raw FFmpeg with -filter_complex and re-time the video together.

silence-insert — Insert silence at a position

Pads silence at an offset from the start using apad + adelay.

  • Source: src/commands/silence-insert.js
  • Filter: apad=pad_dur=<dur>:pad_len=0,adelay=<atMs>|<atMs>
  • Arguments: <at> (seconds offset) and <duration> (seconds of silence)
  • Output: <input-stem>-silence-ins.<ext>
bash
$ npx fqmpeg silence-insert input.mp4 10 3 --dry-run

  ffmpeg -i input.mp4 -af apad=pad_dur=3:pad_len=0,adelay=10000|10000 -c:v copy input-silence-ins.mp4

Caveat: this isn't a clean "splice" at time <at>. The way apad + adelay chain, the pad is appended to the end and then the entire audio is shifted later by <at> seconds — so the original audio plays starting at <at>, with <duration> seconds of pad at the very end. If you need a true midpoint splice (original 0…10 + silence + original 10…end), use a multi-input filter graph in raw FFmpeg.
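
One way such a raw-FFmpeg splice graph could be assembled — a sketch under the assumption that anullsrc generates the gap and concat rejoins the three pieces; the variable names are mine:

```shell
# Build a filter_complex for a true splice: audio 0..at, then dur seconds of
# silence, then the rest. Assumes the source is 44.1 kHz stereo -- match
# anullsrc's rate/layout to your input or concat will reject the segments.
at=10
dur=3
fc="[0:a]atrim=end=${at},asetpts=PTS-STARTPTS[a1];\
anullsrc=channel_layout=stereo:sample_rate=44100,atrim=duration=${dur}[sil];\
[0:a]atrim=start=${at},asetpts=PTS-STARTPTS[a2];\
[a1][sil][a2]concat=n=3:v=0:a=1[aout]"

echo "$fc"
# then: ffmpeg -i input.mp4 -filter_complex "$fc" -map 0:v? -map "[aout]" -c:v copy out.mp4
```

Note this only splices the audio; the copied video keeps its original timing, so the picture will lead the sound after the insertion point.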

censor — Bleep-tone over a range (audio, not video)

Replaces audio in a time range with a sine-wave beep. Note: this is an audio censor, not a video mosaic — see blur or pixelate for visual blurring.

  • Source: src/commands/censor.js

  • Filter: muting the original audio in the range and mixing in a generated sine wave for the same range:

    text
    [0:a]volume=enable='between(t,<s>,<e>)':volume=0[a0];
    [1:a]volume=enable='between(t,<s>,<e>)':volume=1:eval=frame[a1];
    [a0][a1]amix=inputs=2:duration=first[aout]
    
| Option | Default | Notes |
| --- | --- | --- |
| -s, --start <seconds> | 0 | Start of bleep window |
| -e, --end <seconds> | required | End of bleep window |
| --freq <Hz> | 1000 | Bleep frequency |
bash
$ npx fqmpeg censor input.mp4 -s 10 -e 12 --dry-run

  ffmpeg -i input.mp4 -f lavfi -i sine=frequency=1000 -filter_complex [0:a]volume=enable='between(t,10,12)':volume=0[a0];[1:a]volume=enable='between(t,10,12)':volume=1:eval=frame[a1];[a0][a1]amix=inputs=2:duration=first[aout] -map 0:v? -map [aout] -c:v copy -shortest input-censored.mp4

The output uses -map 0:v? (optional video) so it works on audio-only inputs too. -shortest ensures the sine generator doesn't push past the source duration.

Real-world Recipes

Three end-to-end audio pipelines that chain multiple verbs.

1. Podcast mastering (gate → compress → normalize → limit)

A standard delivery chain for a recorded interview. The order matters — gate first to clean up room noise, compress to even out levels, normalize to a target loudness, then limit to prevent any final overshoot.

bash
# 1. Gate out room noise below -45 dB
npx fqmpeg audio-gate raw.wav --threshold -45 -o step1.wav

# 2. Compress dynamics (3:1, gentle attack)
npx fqmpeg audio-compressor step1.wav --threshold -18 --ratio 3 -o step2.wav

# 3. Normalize to podcast standard (-16 LUFS)
npx fqmpeg normalize step2.wav --target -16 -o step3.wav

# 4. Brick-wall limit at -1 dB
npx fqmpeg limiter step3.wav --limit -1 -o final.wav

For high-volume production, replace the intermediate .wav files with named pipes or move to a single ffmpeg -af agate,acompressor,loudnorm,alimiter invocation. fqmpeg's --dry-run on each step gives you the building blocks.

2. Dialogue cleanup (highpass → denoise → de-ess by lowpass-bandpass split)

For dialog with mic rumble, room hiss, and harsh sibilance:

bash
# 1. Remove rumble below 80 Hz
npx fqmpeg audio-highpass dialogue.wav --freq 80 -o step1.wav

# 2. Reduce broadband hiss (steady-state noise)
npx fqmpeg audio-noise-reduce step1.wav --strength 15 -o step2.wav

# 3. Inspect loudness before final stages
npx fqmpeg loudness-meter step2.wav 2>&1 | grep -E '(I|LRA|True peak):'

# 4. Tame harsh sibilance with a gentle lowpass at 10 kHz
npx fqmpeg audio-lowpass step2.wav --freq 10000 -o final.wav

loudness-meter between steps is the audio-equivalent of crop-detect in the geometry cluster — measure, decide, then act.

3. Music-bed sync (delay correction + normalize for ducking prep)

When the music track was captured separately and needs to sit "under" a voice:

bash
# 1. Fix sync drift (music starts 350 ms early relative to voice)
npx fqmpeg audio-delay music.wav 350 -o synced.wav

# 2. Normalize music to a quieter target so it doesn't fight voice (-22 LUFS)
npx fqmpeg normalize synced.wav --target -22 -o quiet-music.wav

# 3. Fade in / out cleanly (3 s tails)
npx fqmpeg audio-fade quiet-music.wav --in 3 --out 3 --duration 120 -o bed.wav

# 4. Mix with voice (drop to raw ffmpeg or the C11 mix-audio verb)
ffmpeg -i voice.wav -i bed.wav -filter_complex amix=inputs=2:duration=longest:weights=1.5\ 1 final.wav

The chain stays in fqmpeg until the final mix — at the mix stage you either step up to raw FFmpeg or use the C11 mix-audio verb (covered in the audio-routing deep dive).

Frequently Asked Questions

How is normalize different from volume?

volume is a fixed gain multiplier or dB offset — it shifts every sample by the same amount and doesn't think about loudness. normalize runs the loudnorm (EBU R128) filter and shifts gain so the measured integrated loudness matches your target in LUFS. For "make this sound like everything else on the platform" you want normalize; for "make this exactly 6 dB louder" you want volume.

Which loudness target should I use for normalize?

-23 LUFS for EBU broadcast TV (Europe). -24 LUFS for ATSC A/85 (US broadcast). -16 LUFS for podcasts (most podcast hosts target this — and it's fqmpeg's default). -14 LUFS for streaming music (Spotify, YouTube, Apple Music all normalize incoming masters to around this level). -10 LUFS for "loudness war" commercial reference (don't actually deliver this — it'll lose detail to platform normalization).

Why does audio-pitch fail with "No such filter: rubberband"?

Your FFmpeg was built without --enable-librubberband. Static builds from BtbN and gyan.dev include it. On macOS the Homebrew ffmpeg formula includes Rubber Band; on Linux the official Debian package does too. If you can't replace your FFmpeg, fall back to asetrate=44100*1.122462,aresample=44100,atempo=1/1.122462 for a 2-semitone shift — it's lower quality but uses only built-in filters.

What's the difference between audio-fade and the C4 fade verb?

The C4 fade is video-only (visual cross-fade or dip-to-black). audio-fade only touches the audio track and copies the video stream. For both at once, run them in sequence: npx fqmpeg fade ... then npx fqmpeg audio-fade ... on the result.

How do I capture loudness-meter and detect-silence output?

Both verbs print to stderr and don't write a file. Redirect with 2> to capture, or pipe with 2>&1 to filter:

bash
npx fqmpeg loudness-meter input.mp4 2>&1 | grep -E '(I|LRA|True peak):'
npx fqmpeg detect-silence input.mp4 --threshold -40 2>&1 | grep silence_

In a CI or script, parse the stderr lines for the values you need.

Does trim-silence keep audio and video in sync?

No. The video stream is copied unchanged while audio gets shorter as silent sections are removed — so the output will drift. For an "interview cut" where the video should follow the audio cuts, you need a single-pass filter_complex with matched select and aselect filters. fqmpeg doesn't expose this directly; drop to raw FFmpeg or pre-edit with the C4 trim verb at the timestamps detect-silence reports.

Is censor a video mosaic?

No. censor replaces audio with a sine-tone bleep. For visual censoring use the C6 blur or pixelate verbs on a cropped region, or compose with drawbox (C8) for an opaque rectangle.

How do I chain multiple audio filters without intermediate files?

For two or three steps, fqmpeg's intermediate-file approach is fine — disk is fast and SSDs make the I/O cost negligible. For longer chains where you want a single FFmpeg invocation, run each verb with --dry-run, copy the -af value, and combine them with commas:

bash
ffmpeg -i input.wav -af "highpass=f=80,afftdn=nr=15,loudnorm=I=-16:TP=-1.5:LRA=11,alimiter=limit=-1dB:attack=5:release=50" -c:v copy out.wav

This is exactly what fqmpeg's single-step commands do internally — combining them just removes the disk round-trips.

What audio codec does fqmpeg use for output?

For most verbs, fqmpeg doesn't set -c:a explicitly, so FFmpeg picks its default encoder for the output container (typically AAC for MP4, Vorbis/Opus for WebM, MP3 for .mp3). The exception is the audio verb, which always outputs MP3 via libmp3lame -q:a 2.

Wrapping Up

fqmpeg's C9 cluster covers the audio engineer's everyday toolbox: 26 verbs you can chain to build a podcast master, clean up dialog, or sync a music bed without writing a single filter graph. The non-obvious bits — audio-delay's asymmetric semantics, audio-pitch's Rubber Band dependency, audio-eq's hardcoded mid band, trim-silence's A/V drift, silence-insert's tail-shifting behavior — are documented above so you don't hit them in production.

Two next steps:

  1. Run npx fqmpeg <verb> --help for any verb above to see the live option list.
  2. For creative effects (reverb, echo, chorus, phaser, etc.), see the C10 deep dive (creative audio effects). For routing, channels, and visualization (waveforms, spectrograms), see C11 (audio routing & visualization).

The next time you reach for a -af filter, check the verb list first.