fqmpeg's C9 cluster is the audio engine — 26 verbs that touch the audio track of any media file. Five do the basics (extract, strip, mute, volume, fade). Three handle loudness measurement and normalization (normalize, audio-normalize-peak, loudness-meter). Three are dynamics processors (audio-compressor, limiter, audio-gate). Six are EQ and frequency filters (audio-eq and four single-band filters plus audio-bandpass). Four are time / pitch / sync verbs (audio-pitch, audio-speed, audio-reverse, audio-delay). Five deal with noise and silence (audio-noise-reduce, detect-silence, trim-silence, silence-insert, censor).
This guide walks each verb against its source in src/commands/ of fqmpeg 3.0.3 — the underlying FFmpeg filter, the defaults, the output filename, and the gotchas you can't see from --help alone (normalize is EBU R128 loudnorm with fixed TP/LRA; audio-pitch uses rubberband so the dependency is non-trivial on some builds; audio-delay with a negative value isn't a delay at all but an atrim that advances audio by cutting the start; censor is an audio bleep — there's no video mosaic here; silence-insert is built from apad + adelay and only inserts at the head of the segment that follows the insertion point, not at an arbitrary midpoint splice).
What you'll get out of this guide
- A decision matrix for the 26 verbs by task (basics / loudness / dynamics / EQ / time / silence)
- Exact FFmpeg invocation each verb generates (verified `--dry-run` output)
- Defaults, ranges, units (dB vs multiplier vs Hz vs semitones), and output filenames for every command
- Three end-to-end recipes — podcast mastering, dialogue cleanup, and music-bed ducking
The 26 Verbs at a Glance
The cluster splits into six task groups. Pick the group, then the verb.
| Group | Verbs | What they do |
|---|---|---|
| Basics | audio, strip-audio, mute, volume, audio-fade | Extract MP3, remove audio, mute a range, change gain, fade in/out |
| Loudness | normalize, audio-normalize-peak, loudness-meter | EBU R128 loudnorm, peak-normalize to dB target, measure LUFS |
| Dynamics | audio-compressor, limiter, audio-gate | Compress range, prevent clipping, gate below threshold |
| EQ & filters | audio-eq, audio-bass-boost, audio-treble-boost, audio-bandpass, audio-highpass, audio-lowpass | 3-band EQ, single-band boost, pass / cut filters |
| Time / pitch / sync | audio-pitch, audio-speed, audio-reverse, audio-delay | Shift pitch without speed, change tempo, reverse, A/V sync offset |
| Noise & silence | audio-noise-reduce, detect-silence, trim-silence, silence-insert, censor | FFT denoise, find silent ranges, cut them out, insert silence, bleep a range |
Five things to know before reading on:
- Most verbs use `-c:v copy` to preserve video. If the input is a video file, the video stream is stream-copied without re-encoding. Only `audio` (extracts to MP3) and `loudness-meter` / `detect-silence` (no output file) deviate.
- `normalize` is one-pass loudnorm with fixed `TP=-1.5` and `LRA=11`. Only the integrated loudness target (`--target`, default `-16` LUFS) is exposed. For two-pass measurement-then-apply, use `loudness-meter` to read the values first, then drop to raw FFmpeg.
- `volume` accepts both multipliers and dB. `0.5` is half, `2.0` is double, `3dB` and `-5dB` work too. The parser is `parseVolumeLevel` in `utils.js` — anything else is rejected before FFmpeg runs. Negative dB (`-5dB`) needs `--` as the end-of-options separator on the shell — see the `volume` section below. The same applies to negative `audio-pitch` semitones (`-3`) and negative `audio-delay` milliseconds (`-200`).
- `audio-delay` is asymmetric. Positive values use `adelay` to push audio later. Negative values use `atrim` + `asetpts=PTS-STARTPTS` to advance audio by cutting the start — so the output is shorter than the input by exactly that amount.
- `audio-eq` uses `equalizer=f=1000:t=h:width=500:g=N` for the mid band. That's a hardcoded mid band centered at 1 kHz with a 500 Hz half-width — fine for vocal presence but not a true 3-band shelving EQ. If you need precise mid control, drop to raw FFmpeg with `equalizer` and pick your own center.
Basics
audio — Extract audio as MP3
Strips the video stream and re-encodes audio to MP3 at VBR quality 2 (roughly 190 kbps).
- Source: `src/commands/audio.js`
- Flags: `-vn -c:a libmp3lame -q:a 2`
- Output: `<input-stem>-audio.mp3`

```shell
$ npx fqmpeg audio input.mp4 --dry-run
ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 input-audio.mp3
```
The output extension is always .mp3 regardless of the source codec — fqmpeg re-encodes. If you want a lossless copy of an AAC track, use raw FFmpeg with ffmpeg -i input.mp4 -vn -c:a copy input.m4a instead.
strip-audio — Remove the audio track
Drops audio, copies the video stream unchanged. Fast, no re-encode.
- Source: `src/commands/strip-audio.js`
- Flags: `-an -c:v copy`
- Output: `<input-stem>-noaudio.<ext>`

```shell
$ npx fqmpeg strip-audio input.mp4 --dry-run
ffmpeg -i input.mp4 -an -c:v copy input-noaudio.mp4
```
Useful for B-roll prep before stacking or concat — mixed audio tracks across clips cause container-level mismatches that strip-audio sidesteps cleanly.
mute — Mute a time range
Silences audio between two timestamps while keeping the rest at full volume. Video stream is copied.
- Source: `src/commands/mute.js`
- Filter: `volume=enable='between(t,<s>,<e>)':volume=0`

| Option | Default | Notes |
|---|---|---|
| `-s, --start <seconds>` | 0 | Start of mute window |
| `-e, --end <seconds>` | required | End of mute window |
| `-o, --output <path>` | `<input-stem>-muted.<ext>` | — |

```shell
$ npx fqmpeg mute input.mp4 -s 10 -e 15 --dry-run
ffmpeg -i input.mp4 -af volume=enable='between(t,10,15)':volume=0 -c:v copy input-muted.mp4
```
--end is required (errors out if missing). The values are seconds only — no HH:MM:SS parsing here.
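The enable-expression pattern generalizes to any windowed filter. A minimal sketch of the filter string mute generates — a hypothetical helper, not fqmpeg's actual source:

```javascript
// Build the windowed-mute filter: volume=0 applied only while t is in [start, end].
function muteFilter(start, end) {
  return `volume=enable='between(t,${start},${end})':volume=0`;
}
```

`muteFilter(10, 15)` reproduces the `-af` value in the dry-run output above.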
volume — Adjust audio gain
Multiplies amplitude or applies dB gain across the whole file.
- Source: `src/commands/volume.js`
- Filter: `volume=<level>` where `<level>` is a multiplier (`0.5`, `2.0`) or dB (`3dB`, `-5dB`)
- Validation: `parseVolumeLevel` rejects anything that isn't `<number>` or `<number>dB`
- Output: `<input-stem>-vol<sanitized-level>.<ext>` (e.g. `input-vol2.0.mp4`, `input-vol-5dB.mp4`)

```shell
$ npx fqmpeg volume input.mp4 2.0 --dry-run
ffmpeg -i input.mp4 -af volume=2.0 -c:v copy input-vol2.0.mp4
$ npx fqmpeg volume input.mp4 --dry-run -- -5dB
ffmpeg -i input.mp4 -af volume=-5dB -c:v copy input-vol-5dB.mp4
```
Negative dB values start with -, which commander.js (the CLI parser fqmpeg uses) treats as an option flag, so bare volume input.mp4 -5dB errors with unknown option '-5dB'. Pass -- as the end-of-options separator and place every --* option (like --dry-run) before --; the <level> positional then accepts the dash-prefixed value. Multipliers above ~3.0 (or dB above ~+10dB) on already-loud audio will clip — chain with limiter or use normalize instead when you want headroom.
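The accepted grammar is small enough to capture in one regular expression. A sketch of the kind of check `parseVolumeLevel` performs — illustrative, not fqmpeg's actual implementation:

```javascript
// Accept <number> or <number>dB (optionally negative / fractional); reject everything else.
function parseVolumeLevel(level) {
  if (!/^-?\d+(\.\d+)?(dB)?$/.test(level)) {
    throw new Error(`invalid volume level: ${level}`);
  }
  return level;
}
```

`2.0`, `3dB`, and `-5dB` all pass; `loud` or `5 dB` (with a space) are rejected before FFmpeg ever runs.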
audio-fade — Fade audio in and/or out
Applies an afade=in at the start, an afade=out near the end, or both. Video stream is copied.
- Source: `src/commands/audio-fade.js`
- Filter: `afade=t=in:st=0:d=<in>` and `afade=t=out:st=<duration - out>:d=<out>`, chained with `,`

| Option | Default | Notes |
|---|---|---|
| `--in <seconds>` | 0 | Fade-in duration (omit or 0 for no fade-in) |
| `--out <seconds>` | 0 | Fade-out duration (requires `--duration`) |
| `--duration <seconds>` | required if `--out` > 0 | Total audio length so the out-fade can start at the right offset |

```shell
$ npx fqmpeg audio-fade input.mp4 --in 2 --out 3 --duration 60 --dry-run
ffmpeg -i input.mp4 -af afade=t=in:st=0:d=2,afade=t=out:st=57:d=3 -c:v copy input-audiofade.mp4
```
The duration requirement is structural — FFmpeg's afade=t=out needs an explicit start time, not "end minus X." Run npx fqmpeg duration input.mp4 first if you don't know the length.
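The out-fade arithmetic is just `st = duration - out`. A hypothetical sketch of the chain assembly (not the actual `audio-fade.js`):

```javascript
// Build the afade chain: in-fade at t=0, out-fade starting at duration - out.
function audioFadeFilter(fadeIn, fadeOut, duration) {
  const parts = [];
  if (fadeIn > 0) parts.push(`afade=t=in:st=0:d=${fadeIn}`);
  if (fadeOut > 0) {
    if (duration === undefined) throw new Error('--duration is required when --out > 0');
    parts.push(`afade=t=out:st=${duration - fadeOut}:d=${fadeOut}`);
  }
  return parts.join(',');
}
```

For `--in 2 --out 3 --duration 60` this yields the `st=57` out-fade shown in the dry-run output above.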
Loudness Measurement & Normalization
normalize — One-pass EBU R128 loudnorm
Applies the loudnorm filter in single-pass mode targeting a configurable integrated loudness (LUFS). True peak and loudness range are hardcoded for safety.
- Source: `src/commands/normalize.js`
- Filter: `loudnorm=I=<target>:TP=-1.5:LRA=11`
- Range: `--target` accepts `-70` to `-5` (LUFS), default `-16`

```shell
$ npx fqmpeg normalize input.mp4 --target -14 --dry-run
ffmpeg -i input.mp4 -af loudnorm=I=-14:TP=-1.5:LRA=11 -c:v copy input-normalized.mp4
```
Common targets: -23 LUFS (EBU broadcast), -16 LUFS (podcast / streaming, default), -14 LUFS (Spotify / YouTube), -10 LUFS (loud commercial reference). One-pass loudnorm is approximate — for broadcast-grade accuracy run a two-pass workflow.
audio-normalize-peak — Peak-normalize to a dB target
Uses linear loudnorm (true-peak-only mode) to scale audio so the loudest sample lands at --peak dB. Faster than loudnorm because no perceptual modeling runs.
- Source: `src/commands/audio-normalize-peak.js`
- Filter: `loudnorm=I=-24:TP=<peak>:LRA=7:linear=true`
- Default: `--peak 0` (peak hits 0 dBFS — risk of inter-sample peaks)

```shell
$ npx fqmpeg audio-normalize-peak input.mp4 --peak -1 --dry-run
ffmpeg -i input.mp4 -af loudnorm=I=-24:TP=-1:LRA=7:linear=true -c:v copy input-peak-norm.mp4
```
Use --peak -1 or --peak -3 for safer headroom — true 0 dBFS peaks survive lossless but can clip after MP3/AAC re-encode due to inter-sample reconstruction.
loudness-meter — Measure EBU R128 loudness
Read-only verb. Runs ebur128 and prints the integrated loudness, true peak, and loudness range to stderr. No output file.
- Source: `src/commands/loudness-meter.js`
- Filter: `-af ebur128 -f null -`
- No `--output`, no `-y` — analyze only

```shell
$ npx fqmpeg loudness-meter input.mp4 --dry-run
ffmpeg -i input.mp4 -af ebur128 -f null -
```
Read the last `Summary:` block on stderr after running it for real — it shows `I:`, `LRA:`, `LRA low:`, `LRA high:`, and `True peak:` (plus a `Threshold:` line under both the integrated-loudness and loudness-range sections). Pipe stderr through `grep -E '(I|LRA|True peak):'` to keep just the numbers.
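If grep is too coarse, the summary parses easily in a script. A Node sketch — the sample text in the test mimics the shape of ebur128's summary block, so treat the exact spacing as an assumption:

```javascript
// Pull the three headline numbers out of an ebur128 Summary block read from stderr.
function parseEbur128Summary(stderr) {
  const grab = (re) => {
    const m = stderr.match(re);
    return m ? parseFloat(m[1]) : null;
  };
  return {
    integrated: grab(/I:\s+(-?[\d.]+) LUFS/), // integrated loudness
    range: grab(/LRA:\s+(-?[\d.]+) LU/),      // loudness range
    truePeak: grab(/Peak:\s+(-?[\d.]+) dBFS/) // true peak
  };
}
```

Feed it the captured stderr and compare `integrated` against your delivery target before deciding whether to run `normalize`.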
Dynamics Processing
audio-compressor — Dynamic range compression
Reduces the difference between quiet and loud parts using acompressor.
- Source: `src/commands/audio-compressor.js`
- Filter: `acompressor=threshold=<t>dB:ratio=<r>:attack=<a>:release=<rel>`

| Option | Default | Notes |
|---|---|---|
| `--threshold <dB>` | -20 | Compression starts above this level |
| `--ratio <n>` | 4 | 4 = 4:1, 2 = 2:1, etc. |
| `--attack <ms>` | 20 | How fast compression engages |
| `--release <ms>` | 250 | How fast compression releases |

```shell
$ npx fqmpeg audio-compressor input.mp4 --threshold -18 --ratio 3 --dry-run
ffmpeg -i input.mp4 -af acompressor=threshold=-18dB:ratio=3:attack=20:release=250 -c:v copy input-compressed-audio.mp4
```
Output is <input>-compressed-audio.<ext> — note the suffix differs from the video-compression compress verb's <input>-compressed.<ext> to avoid collisions when you chain both on the same file.
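To build intuition for the threshold/ratio pair: above the threshold, output level rises only 1 dB for every `ratio` dB of input. A worked sketch of that static curve (the defaults mirror the table above; attack/release are ignored since this is the steady-state view):

```javascript
// Static compression curve: unity below the threshold, slope 1/ratio above it.
function compressedLevel(inputDb, thresholdDb = -20, ratio = 4) {
  if (inputDb <= thresholdDb) return inputDb;
  return thresholdDb + (inputDb - thresholdDb) / ratio;
}
```

A -8 dB peak sits 12 dB over the -20 dB threshold; at 4:1 it comes out at -17 dB — 9 dB of gain reduction.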
limiter — Brick-wall limiter (alimiter)
Prevents audio from exceeding a ceiling. Faster attack and release than the compressor; used as the last stage before delivery.
- Source: `src/commands/limiter.js`
- Filter: `alimiter=limit=<l>dB:attack=<a>:release=<r>`
- Defaults: `--limit -1` dB, `--attack 5` ms, `--release 50` ms
- Output: `<input-stem>-limited.<ext>`

```shell
$ npx fqmpeg limiter input.mp4 --limit -1 --dry-run
ffmpeg -i input.mp4 -af alimiter=limit=-1dB:attack=5:release=50 -c:v copy input-limited.mp4
```
-1 dB is the standard delivery ceiling for streaming (gives headroom for inter-sample peaks). For broadcast TV use -2 dB; for DVD/Blu-ray you can run to -0.3.
audio-gate — Noise gate
Silences audio that falls below a threshold — useful for cleaning up dialog with constant low-level room noise.
- Source: `src/commands/audio-gate.js`
- Filter: `agate=threshold=<t>dB:attack=<a>:release=<r>`
- Defaults: `--threshold -30` dB, `--attack 20` ms, `--release 250` ms
- Output: `<input-stem>-gated.<ext>`

```shell
$ npx fqmpeg audio-gate input.mp4 --threshold -35 --dry-run
ffmpeg -i input.mp4 -af agate=threshold=-35dB:attack=20:release=250 -c:v copy input-gated.mp4
```
Set threshold a few dB above your noise floor (run loudness-meter or look at a waveform first). Too high and you'll cut into quiet dialog; too low and the gate doesn't help.
EQ & Frequency Filters
audio-eq — 3-band equalizer
Bass / mid / treble in one shot. Internally chains bass, equalizer (mid), and treble.
- Source: `src/commands/audio-eq.js`
- Filter assembly (only nonzero bands emit a filter):

```text
bass=g=<bass>, treble=g=<treble>, equalizer=f=1000:t=h:width=500:g=<mid>
```

| Option | Default | Range |
|---|---|---|
| `--bass <dB>` | 0 | -20 to 20 |
| `--mid <dB>` | 0 | -20 to 20 (hardcoded center 1 kHz, width 500 Hz) |
| `--treble <dB>` | 0 | -20 to 20 |

```shell
$ npx fqmpeg audio-eq input.mp4 --bass 3 --treble 2 --dry-run
ffmpeg -i input.mp4 -af bass=g=3,treble=g=2 -c:v copy input-eq.mp4
```
If all three values are zero the command errors out — at least one of --bass, --mid, --treble must be non-zero.
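The assembly logic — skip zero bands, error when all are zero — can be sketched like this (hypothetical helper, not the actual `audio-eq.js`):

```javascript
// Emit one filter per nonzero band, in the order the dry-run output shows.
function eqFilter({ bass = 0, mid = 0, treble = 0 } = {}) {
  const parts = [];
  if (bass) parts.push(`bass=g=${bass}`);
  if (treble) parts.push(`treble=g=${treble}`);
  if (mid) parts.push(`equalizer=f=1000:t=h:width=500:g=${mid}`); // hardcoded mid band
  if (parts.length === 0) {
    throw new Error('at least one of --bass, --mid, --treble must be non-zero');
  }
  return parts.join(',');
}
```

`eqFilter({ bass: 3, treble: 2 })` reproduces the `-af` value above; an all-zero call throws, matching the CLI behavior.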
audio-bass-boost — Single-band bass boost
Wraps `bass=g=<gain>:f=<freq>` for precise center-frequency control.
- Source: `src/commands/audio-bass-boost.js`
- Defaults: `--gain 10` dB, `--freq 100` Hz
- Output: `<input-stem>-bass-boost.<ext>`

```shell
$ npx fqmpeg audio-bass-boost input.mp4 --gain 6 --freq 80 --dry-run
ffmpeg -i input.mp4 -af bass=g=6:f=80 -c:v copy input-bass-boost.mp4
```
Center frequencies in the 60-120 Hz range hit kick-drum / male-voice fundamentals; 200+ Hz starts coloring midrange.
audio-treble-boost — Single-band treble boost
Wraps `treble=g=<gain>:f=<freq>`.
- Source: `src/commands/audio-treble-boost.js`
- Defaults: `--gain 5` dB, `--freq 3000` Hz
- Output: `<input-stem>-treble-boost.<ext>`

```shell
$ npx fqmpeg audio-treble-boost input.mp4 --gain 4 --freq 5000 --dry-run
ffmpeg -i input.mp4 -af treble=g=4:f=5000 -c:v copy input-treble-boost.mp4
```
3-5 kHz adds vocal presence; 8-12 kHz adds "air"; above 16 kHz mostly affects perceived brightness on younger ears.
audio-bandpass — Band-pass filter
Keeps a narrow frequency band, attenuates everything outside.
- Source: `src/commands/audio-bandpass.js`
- Filter: `bandpass=f=<freq>:w=<width>`
- Defaults: `--freq 1000` Hz, `--width 200` Hz
- Output: `<input-stem>-bandpass.<ext>`

```shell
$ npx fqmpeg audio-bandpass input.mp4 --freq 1500 --width 400 --dry-run
ffmpeg -i input.mp4 -af bandpass=f=1500:w=400 -c:v copy input-bandpass.mp4
```
Use for telephone / walkie-talkie effects (300-3000 Hz band) or for isolating a problem frequency before notching it out.
audio-highpass — High-pass (remove low frequencies)
Cuts everything below the cutoff. Useful for removing rumble / HVAC / handling noise.
- Source: `src/commands/audio-highpass.js`
- Filter: `highpass=f=<freq>`
- Default: `--freq 200` Hz
- Output: `<input-stem>-highpass.<ext>`

```shell
$ npx fqmpeg audio-highpass input.mp4 --freq 80 --dry-run
ffmpeg -i input.mp4 -af highpass=f=80 -c:v copy input-highpass.mp4
```
80 Hz is a safe default that removes most rumble without thinning male vocals; 120 Hz is aggressive (cuts low male voice); 200 Hz is "telephone bandwidth" territory.
audio-lowpass — Low-pass (remove high frequencies)
Cuts everything above the cutoff. Useful for taking the edge off sibilant / hissy material.
- Source: `src/commands/audio-lowpass.js`
- Filter: `lowpass=f=<freq>`
- Default: `--freq 3000` Hz
- Output: `<input-stem>-lowpass.<ext>`

```shell
$ npx fqmpeg audio-lowpass input.mp4 --freq 8000 --dry-run
ffmpeg -i input.mp4 -af lowpass=f=8000 -c:v copy input-lowpass.mp4
```
The default 3000 Hz is dramatic (muffled / underwater feel). For gentle de-hissing on hot mics try --freq 10000 or 12000.
Time, Pitch & Sync
audio-pitch — Shift pitch without changing speed
Uses the Rubber Band library (must be compiled into your FFmpeg).
- Source: `src/commands/audio-pitch.js`
- Filter: `rubberband=pitch=<factor>` where `factor = 2^(semitones/12)`
- Argument: `<semitones>` — positional, accepts positive or negative numbers (e.g. `2`, `-3`, `-1.5`)
- Output: `<input-stem>-pitch+<n>.<ext>` or `<input-stem>-pitch<n>.<ext>` (the sign is in the filename)

```shell
$ npx fqmpeg audio-pitch input.mp4 2 --dry-run
ffmpeg -i input.mp4 -af rubberband=pitch=1.122462 -c:v copy input-pitch+2.mp4
$ npx fqmpeg audio-pitch input.mp4 --dry-run -- -3
ffmpeg -i input.mp4 -af rubberband=pitch=0.840896 -c:v copy input-pitch-3.mp4
```
If you get No such filter: 'rubberband', your FFmpeg was built without --enable-librubberband. Static builds from BtbN and gyan.dev include it; the Homebrew default does too.
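The semitone-to-factor conversion is plain equal-temperament math:

```javascript
// Convert semitones to the rubberband pitch factor: 2^(n/12).
function pitchFactor(semitones) {
  return Math.pow(2, semitones / 12);
}
```

+2 semitones gives ≈1.122462 and -3 gives ≈0.840896, matching the dry-run output above; +12 is exactly 2.0 (one octave up).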
audio-speed — Change playback tempo without changing pitch
Wraps `atempo=<factor>`. A single atempo instance historically accepts 0.5 to 2.0 (newer FFmpeg builds extend the per-instance range) — for stronger changes FFmpeg supports chaining (`atempo=2.0,atempo=2.0` for 4x), but fqmpeg doesn't auto-chain. Stay within 0.5 to 2.0 for portability.
- Source: `src/commands/audio-speed.js`
- Argument: `<factor>` (positive number)
- Output: `<input-stem>-aspeed.<ext>`

```shell
$ npx fqmpeg audio-speed input.mp4 1.5 --dry-run
ffmpeg -i input.mp4 -af atempo=1.5 -c:v copy input-aspeed.mp4
```
Note: only the audio is sped up — video timing is unchanged because -c:v copy keeps the original frame timestamps. To change both, use the C4 speed verb instead.
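Since fqmpeg doesn't auto-chain, here's a sketch of the splitting you'd do by hand for factors outside the conservative 0.5–2.0 window, to paste into raw FFmpeg's `-af`:

```javascript
// Split an arbitrary tempo factor into atempo stages, each within [0.5, 2.0].
function atempoChain(factor) {
  const stages = [];
  while (factor > 2.0) { stages.push(2.0); factor /= 2.0; }
  while (factor < 0.5) { stages.push(0.5); factor /= 0.5; }
  stages.push(Number(factor.toFixed(6))); // remainder stage
  return stages.map((s) => `atempo=${s}`).join(',');
}
```

`atempoChain(4)` yields a two-stage chain of doublings; `atempoChain(1.5)` is a single stage, same as what fqmpeg emits.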
audio-reverse — Reverse audio only
Plays audio back-to-front while keeping video forward (creates a striking glitch effect).
- Source: `src/commands/audio-reverse.js`
- Filter: `areverse`
- Output: `<input-stem>-audio-reversed.<ext>`

```shell
$ npx fqmpeg audio-reverse input.mp4 --dry-run
ffmpeg -i input.mp4 -af areverse -c:v copy input-audio-reversed.mp4
```
areverse buffers the entire audio stream in RAM — keep this in mind for multi-hour files. For both audio + video reverse, use the C4 reverse verb.
audio-delay — A/V sync offset (asymmetric semantics)
Delays or advances the audio relative to the video. Positive values delay; negative values advance.
- Source: `src/commands/audio-delay.js`
- Positive (delay): `adelay=<ms>|<ms>` — pads silence at the start of the audio track
- Negative (advance): `atrim=start=<sec>,asetpts=PTS-STARTPTS` — cuts audio from the start, shortening the output by exactly that amount
- Argument: `<ms>` — integer milliseconds, positive or negative
- Output: `<input-stem>-synced.<ext>`

```shell
$ npx fqmpeg audio-delay input.mp4 200 --dry-run
ffmpeg -i input.mp4 -af adelay=200|200 -c:v copy input-synced.mp4
$ npx fqmpeg audio-delay input.mp4 --dry-run -- -200
ffmpeg -i input.mp4 -af atrim=start=0.2,asetpts=PTS-STARTPTS -c:v copy input-synced.mp4
```
The asymmetry matters: if you advance audio by 200 ms, the first 200 ms of audio is gone, not just shifted. That's fine for fixing capture-side latency where the early audio is silence anyway, but it's destructive if the start contained content.
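The asymmetry boils down to one branch. A hypothetical sketch of the filter selection (not the actual `audio-delay.js`):

```javascript
// Positive ms: pad silence on both channels with adelay.
// Negative ms: cut that many milliseconds off the start and re-zero the timestamps.
function audioDelayFilter(ms) {
  if (ms >= 0) return `adelay=${ms}|${ms}`;
  return `atrim=start=${Math.abs(ms) / 1000},asetpts=PTS-STARTPTS`;
}
```

`audioDelayFilter(200)` and `audioDelayFilter(-200)` reproduce the two dry-run outputs above.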
Noise & Silence
audio-noise-reduce — FFT-based denoising
Wraps afftdn (FFT denoiser). Works best on steady-state noise (fan hum, traffic, tape hiss).
- Source: `src/commands/audio-noise-reduce.js`
- Filter: `afftdn=nr=<strength>`
- Default: `--strength 12` dB
- Output: `<input-stem>-denoised.<ext>`

```shell
$ npx fqmpeg audio-noise-reduce input.mp4 --strength 18 --dry-run
ffmpeg -i input.mp4 -af afftdn=nr=18 -c:v copy input-denoised.mp4
```
--strength 12 is gentle (audible noise reduction without artifacts on speech); --strength 24 is aggressive (cleaner but starts to add "underwater" artifacts). Run on a 5-second sample first to find the right level for your material.
detect-silence — Find silent ranges (analyze-only)
Read-only verb. Runs silencedetect and prints silence_start / silence_end / silence_duration lines to stderr. No output file.
- Source: `src/commands/detect-silence.js`
- Filter: `-af silencedetect=noise=<t>dB:d=<dur> -f null -`

| Option | Default | Notes |
|---|---|---|
| `--threshold <dB>` | -30 | Audio quieter than this counts as silence |
| `--duration <sec>` | 2 | Minimum gap length to report |

```shell
$ npx fqmpeg detect-silence input.mp4 --threshold -40 --duration 1 --dry-run
ffmpeg -i input.mp4 -af silencedetect=noise=-40dB:d=1 -f null -
```
Pipe stderr through grep silence_ to extract just the timing data. Useful for scripted scene-detection before chopping interviews into segments.
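For scripted use, the stderr lines pair up into ranges. A parsing sketch — the sample log lines in the test mimic silencedetect's output format, so treat the exact prefix as an assumption:

```javascript
// Pair silence_start / silence_end values from silencedetect's stderr log.
function parseSilences(stderr) {
  const nums = (re) => [...stderr.matchAll(re)].map((m) => parseFloat(m[1]));
  const starts = nums(/silence_start: (-?[\d.]+)/g);
  const ends = nums(/silence_end: (-?[\d.]+)/g);
  // A file that ends in silence has one more start than end; pad with null.
  return starts.map((s, i) => [s, ends[i] !== undefined ? ends[i] : null]);
}
```

Each `[start, end]` pair is a cut candidate you can feed to a trim step.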
trim-silence — Remove silent sections
Uses silenceremove to cut out silence above a minimum duration. Both leading silence and inter-segment silence are removed.
- Source: `src/commands/trim-silence.js`
- Filter:

```text
silenceremove=start_periods=1:start_duration=0:start_threshold=<t>dB:stop_periods=-1:stop_duration=<d>:stop_threshold=<t>dB
```

- Defaults: `--threshold -30` dB, `--min-duration 0.5` sec
- Output: `<input-stem>-trimmed-silence.<ext>`

```shell
$ npx fqmpeg trim-silence input.mp4 --threshold -35 --min-duration 1 --dry-run
ffmpeg -i input.mp4 -af silenceremove=start_periods=1:start_duration=0:start_threshold=-35dB:stop_periods=-1:stop_duration=1:stop_threshold=-35dB -c:v copy input-trimmed-silence.mp4
```
stop_periods=-1 means "remove all silent sections" (not just the first one). The video stream is still copied — A/V will drift, since audio is shorter than video after removal. For an interview cut where you want video to follow, drop to raw FFmpeg with -filter_complex and re-time the video together.
silence-insert — Insert silence at a position
Pads silence at an offset from the start using apad + adelay.
- Source: `src/commands/silence-insert.js`
- Filter: `apad=pad_dur=<dur>:pad_len=0,adelay=<atMs>|<atMs>`
- Arguments: `<at>` (seconds offset) and `<duration>` (seconds of silence)
- Output: `<input-stem>-silence-ins.<ext>`

```shell
$ npx fqmpeg silence-insert input.mp4 10 3 --dry-run
ffmpeg -i input.mp4 -af apad=pad_dur=3:pad_len=0,adelay=10000|10000 -c:v copy input-silence-ins.mp4
```
Caveat: this isn't a clean "splice" at time <at>. The way apad + adelay chain, the entire audio gets shifted by <at> ms after appending the pad — so the original audio plays starting at <at>, with <duration> seconds of pad at the very end. If you need a true midpoint splice (original 0…10 + silence + original 10…end), use a multi-input filter graph in raw FFmpeg.
censor — Bleep-tone over a range (audio, not video)
Replaces audio in a time range with a sine-wave beep. Note: this is an audio censor, not a video mosaic — see blur or pixelate for visual blurring.
- Source: `src/commands/censor.js`
- Filter: muting the original audio in the range and mixing in a generated sine wave for the same range:

```text
[0:a]volume=enable='between(t,<s>,<e>)':volume=0[a0];
[1:a]volume=enable='between(t,<s>,<e>)':volume=1:eval=frame[a1];
[a0][a1]amix=inputs=2:duration=first[aout]
```

| Option | Default | Notes |
|---|---|---|
| `-s, --start <seconds>` | 0 | Start of bleep window |
| `-e, --end <seconds>` | required | End of bleep window |
| `--freq <Hz>` | 1000 | Bleep frequency |

```shell
$ npx fqmpeg censor input.mp4 -s 10 -e 12 --dry-run
ffmpeg -i input.mp4 -f lavfi -i sine=frequency=1000 -filter_complex [0:a]volume=enable='between(t,10,12)':volume=0[a0];[1:a]volume=enable='between(t,10,12)':volume=1:eval=frame[a1];[a0][a1]amix=inputs=2:duration=first[aout] -map 0:v? -map [aout] -c:v copy -shortest input-censored.mp4
```
The output uses -map 0:v? (optional video) so it works on audio-only inputs too. -shortest ensures the sine generator doesn't push past the source duration.
Real-world Recipes
Three end-to-end audio pipelines that chain multiple verbs.
1. Podcast mastering (gate → compress → normalize → limit)
A standard delivery chain for a recorded interview. The order matters — gate first to clean up room noise, compress to even out levels, normalize to a target loudness, then limit to prevent any final overshoot.
```shell
# 1. Gate out room noise below -45 dB
npx fqmpeg audio-gate raw.wav --threshold -45 -o step1.wav
# 2. Compress dynamics (3:1, gentle attack)
npx fqmpeg audio-compressor step1.wav --threshold -18 --ratio 3 -o step2.wav
# 3. Normalize to podcast standard (-16 LUFS)
npx fqmpeg normalize step2.wav --target -16 -o step3.wav
# 4. Brick-wall limit at -1 dB
npx fqmpeg limiter step3.wav --limit -1 -o final.wav
```
For high-volume production, replace the intermediate .wav files with named pipes or move to a single `ffmpeg -af agate,acompressor,loudnorm,alimiter` invocation. fqmpeg's `--dry-run` on each step gives you the building blocks.
2. Dialogue cleanup (highpass → denoise → de-ess by lowpass-bandpass split)
For dialog with mic rumble, room hiss, and harsh sibilance:
```shell
# 1. Remove rumble below 80 Hz
npx fqmpeg audio-highpass dialogue.wav --freq 80 -o step1.wav
# 2. Reduce broadband hiss (steady-state noise)
npx fqmpeg audio-noise-reduce step1.wav --strength 15 -o step2.wav
# 3. Inspect loudness before final stages
npx fqmpeg loudness-meter step2.wav 2>&1 | grep -E '(I|LRA|True peak):'
# 4. Tame harsh sibilance with a gentle lowpass at 10 kHz
npx fqmpeg audio-lowpass step2.wav --freq 10000 -o final.wav
```
loudness-meter between steps is the audio-equivalent of crop-detect in the geometry cluster — measure, decide, then act.
3. Music-bed sync (delay correction + normalize for ducking prep)
When the music track was captured separately and needs to sit "under" a voice:
```shell
# 1. Fix sync drift (music starts 350 ms early relative to voice)
npx fqmpeg audio-delay music.wav 350 -o synced.wav
# 2. Normalize music to a quieter target so it doesn't fight voice (-22 LUFS)
npx fqmpeg normalize synced.wav --target -22 -o quiet-music.wav
# 3. Fade in / out cleanly (3 s tails)
npx fqmpeg audio-fade quiet-music.wav --in 3 --out 3 --duration 120 -o bed.wav
# 4. Mix with voice (drop to raw ffmpeg or the C11 mix-audio verb)
ffmpeg -i voice.wav -i bed.wav -filter_complex amix=inputs=2:duration=longest:weights=1.5\ 1 final.wav
```
The chain stays in fqmpeg until the final mix — at the mix stage you either step up to raw FFmpeg or use the C11 mix-audio verb (covered in the audio-routing deep dive).
Frequently Asked Questions
How is normalize different from volume?
volume is a fixed gain multiplier or dB offset — it shifts every sample by the same amount and doesn't think about loudness. normalize runs the loudnorm (EBU R128) filter and shifts gain so the measured integrated loudness matches your target in LUFS. For "make this sound like everything else on the platform" you want normalize; for "make this exactly 6 dB louder" you want volume.
Which loudness target should I use for normalize?
-23 LUFS for EBU broadcast TV (Europe). -24 LUFS for ATSC A/85 (US broadcast). -16 LUFS for podcasts (most podcast hosts target this — and it's fqmpeg's default). -14 LUFS for streaming music (Spotify, YouTube, Apple Music all normalize incoming masters to around this level). -10 LUFS for "loudness war" commercial reference (don't actually deliver this — it'll lose detail to platform normalization).
Why does audio-pitch fail with "No such filter: rubberband"?
Your FFmpeg was built without --enable-librubberband. Static builds from BtbN and gyan.dev include it. On macOS the Homebrew ffmpeg formula includes Rubber Band; on Linux the official Debian package does too. If you can't replace your FFmpeg, fall back to asetrate=44100*1.122462,aresample=44100,atempo=1/1.122462 for a 2-semitone shift — it's lower quality but uses only built-in filters.
What's the difference between audio-fade and the C4 fade verb?
The C4 fade is video-only (visual cross-fade or dip-to-black). audio-fade only touches the audio track and copies the video stream. For both at once, run them in sequence: npx fqmpeg fade ... then npx fqmpeg audio-fade ... on the result.
How do I capture loudness-meter and detect-silence output?
Both verbs print to stderr and don't write a file. Redirect with 2> to capture, or pipe with 2>&1 to filter:
```shell
npx fqmpeg loudness-meter input.mp4 2>&1 | grep -E '(I|LRA|True peak):'
npx fqmpeg detect-silence input.mp4 --threshold -40 2>&1 | grep silence_
```
In a CI or script, parse the stderr lines for the values you need.
Does trim-silence keep audio and video in sync?
No. The video stream is copied unchanged while audio gets shorter as silent sections are removed — so the output will drift. For an "interview cut" where the video should follow the audio cuts, you need a single-pass filter_complex with matched select and aselect filters. fqmpeg doesn't expose this directly; drop to raw FFmpeg or pre-edit with the C4 trim verb at the timestamps detect-silence reports.
Is censor a video mosaic?
No. censor replaces audio with a sine-tone bleep. For visual censoring use the C6 blur or pixelate verbs on a cropped region, or compose with drawbox (C8) for an opaque rectangle.
How do I chain multiple audio filters without intermediate files?
For two or three steps, fqmpeg's intermediate-file approach is fine — disk is fast and SSDs make the I/O cost negligible. For longer chains where you want a single FFmpeg invocation, run each verb with --dry-run, copy the -af value, and combine them with commas:
```shell
ffmpeg -i input.wav -af "highpass=f=80,afftdn=nr=15,loudnorm=I=-16:TP=-1.5:LRA=11,alimiter=limit=-1dB:attack=5:release=50" -c:v copy out.wav
```
This is exactly what fqmpeg's single-step commands do internally — combining them just removes the disk round-trips.
What audio codec does fqmpeg use for output?
For most verbs, fqmpeg doesn't explicitly set `-c:a`, so FFmpeg re-encodes with the default codec for the output container (typically AAC for MP4, Opus or Vorbis for WebM, MP3 for `.mp3`). The exception is the `audio` verb, which always outputs MP3 via `libmp3lame -q:a 2`.
Wrapping Up
fqmpeg's C9 cluster covers the audio engineer's everyday toolbox: 26 verbs you can chain to build a podcast master, clean up dialog, or sync a music bed without writing a single filter graph. The non-obvious bits — audio-delay's asymmetric semantics, audio-pitch's Rubber Band dependency, audio-eq's hardcoded mid band, trim-silence's A/V drift, silence-insert's tail-shifting behavior — are documented above so you don't hit them in production.
Two next steps:
- Run `npx fqmpeg <verb> --help` for any verb above to see the live option list.
- For creative effects (reverb, echo, chorus, phaser, etc.), see the C10 deep dive (creative audio effects). For routing, channels, and visualization (waveforms, spectrograms), see C11 (audio routing & visualization).
The next time you reach for a -af filter, check the verb list first.