32blog by Studio Mitsu

fqmpeg Audio Processing: Levels, EQ & Dynamics (26 Verbs)

Twenty-six fqmpeg verbs for audio: levels, loudness, EQ, dynamics, pitch, sync, and silence — source-verified defaults, dry-run output, and recipes.

by omitsu · 23 min read

fqmpeg's C9 cluster is the audio engine — 26 verbs that touch the audio track of any media file. Five do the basics (extract, strip, mute, volume, fade). Three handle loudness measurement and normalization (normalize, audio-normalize-peak, loudness-meter). Three are dynamics processors (audio-compressor, limiter, audio-gate). Six are EQ and frequency filters (audio-eq and four single-band filters plus audio-bandpass). Four are time / pitch / sync verbs (audio-pitch, audio-speed, audio-reverse, audio-delay). Five deal with noise and silence (audio-noise-reduce, detect-silence, trim-silence, silence-insert, censor).

This guide walks each verb against its source in src/commands/ of fqmpeg 3.0.3 — the underlying FFmpeg filter, the defaults, the output filename, and the gotchas you can't see from --help alone (normalize is EBU R128 loudnorm with fixed TP/LRA; audio-pitch uses rubberband so the dependency is non-trivial on some builds; audio-delay with a negative value isn't a delay at all but an atrim that advances audio by cutting the start; censor is an audio bleep — there's no video mosaic here; silence-insert is built from apad + adelay and only inserts at the head of the segment that follows the insertion point, not at an arbitrary midpoint splice).

What you'll get out of this guide

  • A decision matrix for the 26 verbs by task (basics / loudness / dynamics / EQ / time / silence)
  • Exact FFmpeg invocation each verb generates (verified --dry-run output)
  • Defaults, ranges, units (dB vs multiplier vs Hz vs semitones), and output filenames for every command
  • Three end-to-end recipes — podcast mastering, dialogue cleanup, and music-bed ducking

The 26 Verbs at a Glance

The cluster splits into six task groups. Pick the group, then the verb.

| Group | Verbs | What they do |
| --- | --- | --- |
| Basics | audio, strip-audio, mute, volume, audio-fade | Extract MP3, remove audio, mute a range, change gain, fade in/out |
| Loudness | normalize, audio-normalize-peak, loudness-meter | EBU R128 loudnorm, peak-normalize to dB target, measure LUFS |
| Dynamics | audio-compressor, limiter, audio-gate | Compress range, prevent clipping, gate below threshold |
| EQ & filters | audio-eq, audio-bass-boost, audio-treble-boost, audio-bandpass, audio-highpass, audio-lowpass | 3-band EQ, single-band boost, pass / cut filters |
| Time / pitch / sync | audio-pitch, audio-speed, audio-reverse, audio-delay | Shift pitch without speed, change tempo, reverse, A/V sync offset |
| Noise & silence | audio-noise-reduce, detect-silence, trim-silence, silence-insert, censor | FFT denoise, find silent ranges, cut them out, insert silence, bleep a range |

Five things to know before reading on:

  1. Most verbs use -c:v copy to preserve video. If the input is a video file, the video stream is stream-copied without re-encoding. Only audio (extracts to MP3) and loudness-meter / detect-silence (no output file) deviate.
  2. normalize is one-pass loudnorm with fixed TP=-1.5 and LRA=11. Only the integrated loudness target (--target, default -16 LUFS) is exposed. For two-pass measurement-then-apply, use loudness-meter to read the values first, then drop to raw FFmpeg.
  3. volume accepts both multipliers and dB. 0.5 is half, 2.0 is double, 3dB and -5dB work too. The parser is parseVolumeLevel in utils.js — anything else is rejected before FFmpeg runs. Negative dB (-5dB) needs -- as end-of-options separator on the shell — see the volume section below. The same applies to negative audio-pitch semitones (-3) and negative audio-delay milliseconds (-200).
  4. audio-delay is asymmetric. Positive values use adelay to push audio later. Negative values use atrim + asetpts=PTS-STARTPTS to advance audio by cutting the start — so the output is shorter than the input by exactly that amount.
  5. audio-eq uses equalizer=f=1000:t=h:width=500:g=N for the mid band. That's a hardcoded mid-band centered at 1 kHz with a 500 Hz half-width — fine for vocal presence but not a true 3-band shelving EQ. If you need precise mid control, drop to raw FFmpeg with equalizer and pick your own center.

Basics

audio — Extract audio as MP3

Strips the video stream and re-encodes audio to MP3 at VBR quality 2 (roughly 190 kbps).

bash
$ npx fqmpeg audio input.mp4 --dry-run

  ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 input-audio.mp3

The output extension is always .mp3 regardless of the source codec — fqmpeg re-encodes. If you want a lossless copy of an AAC track, use raw FFmpeg with ffmpeg -i input.mp4 -vn -c:a copy input.m4a instead.

strip-audio — Remove the audio track

Drops audio, copies the video stream unchanged. Fast, no re-encode.

bash
$ npx fqmpeg strip-audio input.mp4 --dry-run

  ffmpeg -i input.mp4 -an -c:v copy input-noaudio.mp4

Useful for B-roll prep before stacking or concat — mixed audio tracks across clips cause container-level mismatches that strip-audio sidesteps cleanly.

mute — Mute a time range

Silences audio between two timestamps while keeping the rest at full volume. Video stream is copied.

| Option | Default | Notes |
| --- | --- | --- |
| -s, --start <seconds> | 0 | Start of mute window |
| -e, --end <seconds> | required | End of mute window |
| -o, --output <path> | <input-stem>-muted.<ext> | |
bash
$ npx fqmpeg mute input.mp4 -s 10 -e 15 --dry-run

  ffmpeg -i input.mp4 -af volume=enable='between(t,10,15)':volume=0 -c:v copy input-muted.mp4

--end is required (errors out if missing). The values are seconds only — no HH:MM:SS parsing here.

volume — Adjust audio gain

Multiplies amplitude or applies dB gain across the whole file.

  • Source: src/commands/volume.js
  • Filter: volume=<level> where <level> is a multiplier (0.5, 2.0) or dB (3dB, -5dB)
  • Validation: parseVolumeLevel rejects anything that isn't <number> or <number>dB
  • Output: <input-stem>-vol<sanitized-level>.<ext> (e.g. input-vol2.mp4, input-vol-5dB.mp4)
bash
$ npx fqmpeg volume input.mp4 2.0 --dry-run

  ffmpeg -i input.mp4 -af volume=2.0 -c:v copy input-vol2.0.mp4

$ npx fqmpeg volume input.mp4 --dry-run -- -5dB

  ffmpeg -i input.mp4 -af volume=-5dB -c:v copy input-vol-5dB.mp4

Negative dB values start with -, which commander.js (the CLI parser fqmpeg uses) treats as an option flag, so bare volume input.mp4 -5dB errors with unknown option '-5dB'. Pass -- as the end-of-options separator and place every --* option (like --dry-run) before --; the <level> positional then accepts the dash-prefixed value. Multipliers above ~3.0 (or dB above ~+10dB) on already-loud audio will clip — chain with limiter or use normalize instead when you want headroom.
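
If you wrap fqmpeg in scripts, you can reject bad levels before the command ever runs. This is a hedged sketch of the shape parseVolumeLevel presumably accepts (a plain number, or a number with a dB suffix) — the regex and helper are mine, not fqmpeg's source:

```shell
# Hypothetical pre-check mirroring parseVolumeLevel's accept/reject rule:
# a plain number (0.5, 2.0, -3) or a number with a dB suffix (3dB, -5dB).
is_valid_level() {
  printf '%s' "$1" | grep -Eq '^-?[0-9]+(\.[0-9]+)?(dB)?$'
}

is_valid_level 2.0  && echo "2.0: ok"
is_valid_level -5dB && echo "-5dB: ok"
is_valid_level loud || echo "loud: rejected"
```

Checking up front gives you a clearer error message than commander.js's unknown option '-5dB'.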

audio-fade — Fade audio in and/or out

Applies an afade=in at the start, an afade=out near the end, or both. Video stream is copied.

| Option | Default | Notes |
| --- | --- | --- |
| --in <seconds> | 0 | Fade-in duration (omit or 0 for no fade-in) |
| --out <seconds> | 0 | Fade-out duration (requires --duration) |
| --duration <seconds> | required if --out > 0 | Total audio length so the out-fade can start at the right offset |
bash
$ npx fqmpeg audio-fade input.mp4 --in 2 --out 3 --duration 60 --dry-run

  ffmpeg -i input.mp4 -af afade=t=in:st=0:d=2,afade=t=out:st=57:d=3 -c:v copy input-audiofade.mp4

The duration requirement is structural — FFmpeg's afade=t=out needs an explicit start time, not "end minus X." Run npx fqmpeg duration input.mp4 first if you don't know the length.
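
The out-fade start is just duration minus fade length, so you can reproduce the filter string from the dry-run above by hand. The awk one-liner here is mine, not fqmpeg's code — it only illustrates the st = duration - out arithmetic:

```shell
# Derive the afade out-filter for --out 3 --duration 60:
# the fade must start at duration minus fade length (60 - 3 = 57 s).
awk -v dur=60 -v fout=3 'BEGIN { printf "afade=t=out:st=%g:d=%g\n", dur - fout, fout }'
# prints: afade=t=out:st=57:d=3
```

Feed the measured duration from npx fqmpeg duration into dur and the computed st always lands at the true end of the file.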

Loudness Measurement & Normalization

normalize — One-pass EBU R128 loudnorm

Applies the loudnorm filter in single-pass mode targeting a configurable integrated loudness (LUFS). True peak and loudness range are hardcoded for safety.

  • Source: src/commands/normalize.js
  • Filter: loudnorm=I=<target>:TP=-1.5:LRA=11
  • Range: --target accepts -70 to -5 (LUFS), default -16
bash
$ npx fqmpeg normalize input.mp4 --target -14 --dry-run

  ffmpeg -i input.mp4 -af loudnorm=I=-14:TP=-1.5:LRA=11 -c:v copy input-normalized.mp4

Common targets: -23 LUFS (EBU broadcast), -16 LUFS (podcast / streaming, default), -14 LUFS (Spotify / YouTube), -10 LUFS (loud commercial reference). One-pass loudnorm is approximate — for broadcast-grade accuracy run a two-pass workflow.
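
If you script deliveries for several platforms, the targets above can live in one small lookup. The helper and platform names are illustrative (not part of fqmpeg); the LUFS values are the ones listed above:

```shell
# Map a delivery platform to a loudnorm --target value.
# The mapping is mine; fqmpeg only exposes the raw --target number.
target_for() {
  case "$1" in
    broadcast-eu) printf '%s\n' -23 ;;
    podcast)      printf '%s\n' -16 ;;
    streaming)    printf '%s\n' -14 ;;
    *)            printf '%s\n' -16 ;;  # fall back to fqmpeg's own default
  esac
}

target_for streaming   # -14
```

A call like npx fqmpeg normalize input.mp4 --target "$(target_for streaming)" then expands to the -14 invocation shown in the dry-run above.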

audio-normalize-peak — Peak-normalize to a dB target

Uses linear loudnorm (true-peak-only mode) to scale audio so the loudest sample lands at --peak dB. Faster than loudnorm because no perceptual modeling runs.

bash
$ npx fqmpeg audio-normalize-peak input.mp4 --peak -1 --dry-run

  ffmpeg -i input.mp4 -af loudnorm=I=-24:TP=-1:LRA=7:linear=true -c:v copy input-peak-norm.mp4

Use --peak -1 or --peak -3 for safer headroom — true 0 dBFS peaks survive lossless but can clip after MP3/AAC re-encode due to inter-sample reconstruction.

loudness-meter — Measure EBU R128 loudness

Read-only verb. Runs ebur128 and prints the integrated loudness, true peak, and loudness range to stderr. No output file.

bash
$ npx fqmpeg loudness-meter input.mp4 --dry-run

  ffmpeg -i input.mp4 -af ebur128 -f null -

Read the final Summary: block on stderr after running it for real — it shows I:, LRA:, LRA low:, LRA high:, and True peak:, plus a Threshold: line under both the integrated-loudness and loudness-range sections (the label appears twice; that's normal ebur128 output, not a glitch). Pipe stderr through grep -E '(I|LRA|True peak):' to keep just the numbers.
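
For scripting, the integrated value can be pulled out with awk instead of eyeballing the block. The summary text below is an illustrative sample laid out in ebur128's format, not captured output:

```shell
# Extract the I: value from an ebur128 summary (sample text, illustrative):
summary='  Integrated loudness:
    I:         -18.3 LUFS
    Threshold: -28.6 LUFS'

i_lufs=$(printf '%s\n' "$summary" | awk '$1 == "I:" { print $2 }')
echo "$i_lufs"   # -18.3
```

Substitute the real stderr capture for the sample variable and the same awk match works unchanged.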

Dynamics Processing

audio-compressor — Dynamic range compression

Reduces the difference between quiet and loud parts using acompressor.

| Option | Default | Notes |
| --- | --- | --- |
| --threshold <dB> | -20 | Compression starts above this level |
| --ratio <n> | 4 | 4 = 4:1, 2 = 2:1, etc. |
| --attack <ms> | 20 | How fast compression engages |
| --release <ms> | 250 | How fast compression releases |
bash
$ npx fqmpeg audio-compressor input.mp4 --threshold -18 --ratio 3 --dry-run

  ffmpeg -i input.mp4 -af acompressor=threshold=-18dB:ratio=3:attack=20:release=250 -c:v copy input-compressed-audio.mp4

Output is <input>-compressed-audio.<ext> — note the suffix differs from the video-compression compress verb's <input>-compressed.<ext> to avoid collisions when you chain both on the same file.
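
A quick way to sanity-check a ratio is the textbook compressor relation out = threshold + (in - threshold) / ratio. This is the standard formula, not something read out of fqmpeg's source:

```shell
# Predicted level for a -6 dB peak through threshold -18 dB at 3:1:
# out = thr + (in - thr) / ratio  ->  -18 + 12/3 = -14 dB
awk -v in_db=-6 -v thr=-18 -v ratio=3 \
  'BEGIN { printf "%g dB\n", thr + (in_db - thr) / ratio }'
# prints: -14 dB
```

So a 12 dB overshoot above threshold comes out as 4 dB — useful for judging whether 3:1 is enough before you commit to a render.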

limiter — Brick-wall limiter (alimiter)

Prevents audio from exceeding a ceiling. Faster attack and release than the compressor; used as the last stage before delivery.

  • Source: src/commands/limiter.js
  • Filter: alimiter=limit=<l>dB:attack=<a>:release=<r>
  • Defaults: --limit -1 dB, --attack 5 ms, --release 50 ms
  • Output: <input-stem>-limited.<ext>
bash
$ npx fqmpeg limiter input.mp4 --limit -1 --dry-run

  ffmpeg -i input.mp4 -af alimiter=limit=-1dB:attack=5:release=50 -c:v copy input-limited.mp4

-1 dB is the standard delivery ceiling for streaming (gives headroom for inter-sample peaks). For broadcast TV use -2 dB; for DVD/Blu-ray you can run to -0.3.

audio-gate — Noise gate

Silences audio that falls below a threshold — useful for cleaning up dialog with constant low-level room noise.

  • Source: src/commands/audio-gate.js
  • Filter: agate=threshold=<t>dB:attack=<a>:release=<r>
  • Defaults: --threshold -30 dB, --attack 20 ms, --release 250 ms
  • Output: <input-stem>-gated.<ext>
bash
$ npx fqmpeg audio-gate input.mp4 --threshold -35 --dry-run

  ffmpeg -i input.mp4 -af agate=threshold=-35dB:attack=20:release=250 -c:v copy input-gated.mp4

Set threshold a few dB above your noise floor (run loudness-meter or look at a waveform first). Too high and you'll cut into quiet dialog; too low and the gate doesn't help.

EQ & Frequency Filters

audio-eq — 3-band equalizer

Bass / mid / treble in one shot. Internally chains bass, equalizer (mid), and treble.

  • Source: src/commands/audio-eq.js

  • Filter assembly (only nonzero bands emit a filter):

    text
    bass=g=<bass>,
    treble=g=<treble>,
    equalizer=f=1000:t=h:width=500:g=<mid>
    
| Option | Default | Range |
| --- | --- | --- |
| --bass <dB> | 0 | -20 to 20 |
| --mid <dB> | 0 | -20 to 20 (hardcoded center 1 kHz, width 500 Hz) |
| --treble <dB> | 0 | -20 to 20 |
bash
$ npx fqmpeg audio-eq input.mp4 --bass 3 --treble 2 --dry-run

  ffmpeg -i input.mp4 -af bass=g=3,treble=g=2 -c:v copy input-eq.mp4

If all three values are zero the command errors out — at least one of --bass, --mid, --treble must be non-zero.

audio-bass-boost — Single-band bass boost

Wraps bass=g=<gain>:f=<freq> for precise center-frequency control.

bash
$ npx fqmpeg audio-bass-boost input.mp4 --gain 6 --freq 80 --dry-run

  ffmpeg -i input.mp4 -af bass=g=6:f=80 -c:v copy input-bass-boost.mp4

Center frequencies in the 60-120 Hz range hit kick-drum / male-voice fundamentals; 200+ Hz starts coloring midrange.

audio-treble-boost — Single-band treble boost

Wraps treble=g=<gain>:f=<freq>.

bash
$ npx fqmpeg audio-treble-boost input.mp4 --gain 4 --freq 5000 --dry-run

  ffmpeg -i input.mp4 -af treble=g=4:f=5000 -c:v copy input-treble-boost.mp4

3-5 kHz adds vocal presence; 8-12 kHz adds "air"; above 16 kHz mostly affects perceived brightness on younger ears.

audio-bandpass — Band-pass filter

Keeps a narrow frequency band, attenuates everything outside.

  • Source: src/commands/audio-bandpass.js
  • Filter: bandpass=f=<freq>:w=<width>
  • Defaults: --freq 1000 Hz, --width 200 Hz
  • Output: <input-stem>-bandpass.<ext>
bash
$ npx fqmpeg audio-bandpass input.mp4 --freq 1500 --width 400 --dry-run

  ffmpeg -i input.mp4 -af bandpass=f=1500:w=400 -c:v copy input-bandpass.mp4

Use for telephone / walkie-talkie effects (300-3000 Hz band) or for isolating a problem frequency before notching it out.

audio-highpass — High-pass (remove low frequencies)

Cuts everything below the cutoff. Useful for removing rumble / HVAC / handling noise.

bash
$ npx fqmpeg audio-highpass input.mp4 --freq 80 --dry-run

  ffmpeg -i input.mp4 -af highpass=f=80 -c:v copy input-highpass.mp4

80 Hz is a safe default that removes most rumble without thinning male vocals; 120 Hz is aggressive (cuts low male voice); 200 Hz is "telephone bandwidth" territory.

audio-lowpass — Low-pass (remove high frequencies)

Cuts everything above the cutoff. Useful for taking the edge off sibilant / hissy material.

bash
$ npx fqmpeg audio-lowpass input.mp4 --freq 8000 --dry-run

  ffmpeg -i input.mp4 -af lowpass=f=8000 -c:v copy input-lowpass.mp4

The default 3000 Hz is dramatic (muffled / underwater feel). For gentle de-hissing on hot mics try --freq 10000 or 12000.

Time, Pitch & Sync

audio-pitch — Shift pitch without changing speed

Uses the Rubber Band library (must be compiled into your FFmpeg).

  • Source: src/commands/audio-pitch.js
  • Filter: rubberband=pitch=<factor> where factor = 2^(semitones/12)
  • Argument: <semitones> — positional, accepts positive or negative numbers (e.g. 2, -3, -1.5)
  • Output: <input-stem>-pitch+<n>.<ext> or <input-stem>-pitch<n>.<ext> (the sign is in the filename)
bash
$ npx fqmpeg audio-pitch input.mp4 2 --dry-run

  ffmpeg -i input.mp4 -af rubberband=pitch=1.122462 -c:v copy input-pitch+2.mp4

$ npx fqmpeg audio-pitch input.mp4 --dry-run -- -3

  ffmpeg -i input.mp4 -af rubberband=pitch=0.840896 -c:v copy input-pitch-3.mp4

If you get No such filter: 'rubberband', your FFmpeg was built without --enable-librubberband. Static builds from BtbN and gyan.dev include it; the Homebrew default does too.
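
The factor in the dry-run output is reproducible by hand from the formula above (factor = 2^(semitones/12)) — handy when you want to verify a build or pre-compute values for raw FFmpeg:

```shell
# Compute the rubberband pitch factor for a given semitone shift
# (factor = 2^(semitones/12), per the filter description above):
pitch_factor() {
  awk -v n="$1" 'BEGIN { printf "%.6f\n", 2 ^ (n / 12) }'
}

pitch_factor 2    # 1.122462
pitch_factor -3   # 0.840896
```

Both values match the dry-run outputs shown above for +2 and -3 semitones.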

audio-speed — Change playback tempo without changing pitch

Wraps atempo=<factor>. The native filter accepts 0.5 to 2.0 in a single instance — for stronger changes FFmpeg supports chaining (atempo=2.0,atempo=2.0 for 4x), but fqmpeg doesn't auto-chain. Stay within 0.5 to 2.0.

bash
$ npx fqmpeg audio-speed input.mp4 1.5 --dry-run

  ffmpeg -i input.mp4 -af atempo=1.5 -c:v copy input-aspeed.mp4

Note: only the audio is sped up — video timing is unchanged because -c:v copy keeps the original frame timestamps. To change both, use the C4 speed verb instead.
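
If you do need more than 2x, the chaining FFmpeg supports can be generated by a wrapper. The decomposition loop below is my sketch (fqmpeg doesn't auto-chain); feed the result to raw ffmpeg -af:

```shell
# Decompose a tempo factor outside [0.5, 2.0] into a chain of atempo stages:
atempo_chain() {
  awk -v f="$1" 'BEGIN {
    s = ""
    while (f > 2.0) { s = s "atempo=2.0,"; f /= 2.0 }   # halve until in range
    while (f < 0.5) { s = s "atempo=0.5,"; f *= 2.0 }   # double until in range
    printf "%satempo=%g\n", s, f
  }'
}

atempo_chain 3     # atempo=2.0,atempo=1.5
atempo_chain 0.25  # atempo=0.5,atempo=0.5
```

Each stage stays inside atempo's per-instance range, so ffmpeg -i input.mp4 -af "$(atempo_chain 3)" out.mp4 gets you a clean 3x.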

audio-reverse — Reverse audio only

Plays audio back-to-front while keeping video forward (creates a striking glitch effect).

bash
$ npx fqmpeg audio-reverse input.mp4 --dry-run

  ffmpeg -i input.mp4 -af areverse -c:v copy input-audio-reversed.mp4

areverse buffers the entire audio stream in RAM — keep this in mind for multi-hour files. For both audio + video reverse, use the C4 reverse verb.

audio-delay — A/V sync offset (asymmetric semantics)

Delays or advances the audio relative to the video. Positive values delay; negative values advance.

  • Source: src/commands/audio-delay.js
  • Positive (delay): adelay=<ms>|<ms> — pads silence at the start of the audio track
  • Negative (advance): atrim=start=<sec>,asetpts=PTS-STARTPTS cuts audio from the start, shortening the output by exactly that amount
  • Argument: <ms> — integer milliseconds, positive or negative
  • Output: <input-stem>-synced.<ext>
bash
$ npx fqmpeg audio-delay input.mp4 200 --dry-run

  ffmpeg -i input.mp4 -af adelay=200|200 -c:v copy input-synced.mp4

$ npx fqmpeg audio-delay input.mp4 --dry-run -- -200

  ffmpeg -i input.mp4 -af atrim=start=0.2,asetpts=PTS-STARTPTS -c:v copy input-synced.mp4

The asymmetry matters: if you advance audio by 200 ms, the first 200 ms of audio is gone, not just shifted. That's fine for fixing capture-side latency where the early audio is silence anyway, but it's destructive if the start contained content.

Noise & Silence

audio-noise-reduce — FFT-based denoising

Wraps afftdn (FFT denoiser). Works best on steady-state noise (fan hum, traffic, tape hiss).

bash
$ npx fqmpeg audio-noise-reduce input.mp4 --strength 18 --dry-run

  ffmpeg -i input.mp4 -af afftdn=nr=18 -c:v copy input-denoised.mp4

--strength 12 is gentle (audible noise reduction without artifacts on speech); --strength 24 is aggressive (cleaner but starts to add "underwater" artifacts). Run on a 5-second sample first to find the right level for your material.

detect-silence — Find silent ranges (analyze-only)

Read-only verb. Runs silencedetect and prints silence_start / silence_end / silence_duration lines to stderr. No output file.

| Option | Default | Notes |
| --- | --- | --- |
| --threshold <dB> | -30 | Audio quieter than this counts as silence |
| --duration <sec> | 2 | Minimum gap length to report |
bash
$ npx fqmpeg detect-silence input.mp4 --threshold -40 --duration 1 --dry-run

  ffmpeg -i input.mp4 -af silencedetect=noise=-40dB:d=1 -f null -

Pipe stderr through grep silence_ to extract just the timing data. Useful for scripted scene-detection before chopping interviews into segments.
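
The stderr lines are regular enough to slice into key/value pairs with grep -o. The log text here is an illustrative sample in silencedetect's line format, not captured output:

```shell
# Pull 'key: value' pairs out of silencedetect stderr (sample, illustrative):
log='[silencedetect @ 0x1] silence_start: 12.52
[silencedetect @ 0x1] silence_end: 15.94 | silence_duration: 3.42'

printf '%s\n' "$log" | grep -o 'silence_[a-z]*: [0-9.]*'
```

Each match lands on its own line, which is easy to pair up into (start, end) ranges for a later trim or splice script.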

trim-silence — Remove silent sections

Uses silenceremove to cut out silence above a minimum duration. Both leading silence and inter-segment silence are removed.

  • Source: src/commands/trim-silence.js

  • Filter:

    text
    silenceremove=start_periods=1:
    start_duration=0:
    start_threshold=<t>dB:
    stop_periods=-1:
    stop_duration=<d>:
    stop_threshold=<t>dB
    
  • Defaults: --threshold -30 dB, --min-duration 0.5 sec

  • Output: <input-stem>-trimmed-silence.<ext>

bash
$ npx fqmpeg trim-silence input.mp4 --threshold -35 --min-duration 1 --dry-run

  ffmpeg -i input.mp4 -af silenceremove=start_periods=1:start_duration=0:start_threshold=-35dB:stop_periods=-1:stop_duration=1:stop_threshold=-35dB -c:v copy input-trimmed-silence.mp4

stop_periods=-1 means "remove all silent sections" (not just the first one). The video stream is still copied — A/V will drift, since audio is shorter than video after removal. For an interview cut where you want video to follow, drop to raw FFmpeg with -filter_complex and re-time the video together.

silence-insert — Insert silence at a position

Pads silence at an offset from the start using apad + adelay.

  • Source: src/commands/silence-insert.js
  • Filter: apad=pad_dur=<dur>:pad_len=0,adelay=<atMs>|<atMs>
  • Arguments: <at> (seconds offset) and <duration> (seconds of silence)
  • Output: <input-stem>-silence-ins.<ext>
bash
$ npx fqmpeg silence-insert input.mp4 10 3 --dry-run

  ffmpeg -i input.mp4 -af apad=pad_dur=3:pad_len=0,adelay=10000|10000 -c:v copy input-silence-ins.mp4

Caveat: this isn't a clean "splice" at time <at>. The way apad + adelay chain, the pad is appended to the end and then the entire audio is shifted later by <at> seconds — so the original audio plays starting at <at>, with <duration> seconds of pad at the very end. If you need a true midpoint splice (original 0…10 + silence + original 10…end), use a multi-input filter graph in raw FFmpeg.
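
One way such a raw-FFmpeg splice graph could be assembled — a sketch under the assumption that anullsrc generates the gap and concat rejoins the three pieces; the variable names are mine:

```shell
# Build a filter_complex for a true splice: audio 0..at, then dur seconds of
# silence, then the rest. Assumes the source is 44.1 kHz stereo -- match
# anullsrc's rate/layout to your input or concat will reject the segments.
at=10
dur=3
fc="[0:a]atrim=end=${at},asetpts=PTS-STARTPTS[a1];\
anullsrc=channel_layout=stereo:sample_rate=44100,atrim=duration=${dur}[sil];\
[0:a]atrim=start=${at},asetpts=PTS-STARTPTS[a2];\
[a1][sil][a2]concat=n=3:v=0:a=1[aout]"

echo "$fc"
# then: ffmpeg -i input.mp4 -filter_complex "$fc" -map 0:v? -map "[aout]" -c:v copy out.mp4
```

Note this only splices the audio; the copied video keeps its original timing, so the picture will lead the sound after the insertion point.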

censor — Bleep-tone over a range (audio, not video)

Replaces audio in a time range with a sine-wave beep. Note: this is an audio censor, not a video mosaic — see blur or pixelate for visual blurring.

  • Source: src/commands/censor.js

  • Filter: muting the original audio in the range and mixing in a generated sine wave for the same range:

    text
    [0:a]volume=enable='between(t,<s>,<e>)':volume=0[a0];
    [1:a]volume=enable='between(t,<s>,<e>)':volume=1:eval=frame[a1];
    [a0][a1]amix=inputs=2:duration=first[aout]
    
| Option | Default | Notes |
| --- | --- | --- |
| -s, --start <seconds> | 0 | Start of bleep window |
| -e, --end <seconds> | required | End of bleep window |
| --freq <Hz> | 1000 | Bleep frequency |
bash
$ npx fqmpeg censor input.mp4 -s 10 -e 12 --dry-run

  ffmpeg -i input.mp4 -f lavfi -i sine=frequency=1000 -filter_complex [0:a]volume=enable='between(t,10,12)':volume=0[a0];[1:a]volume=enable='between(t,10,12)':volume=1:eval=frame[a1];[a0][a1]amix=inputs=2:duration=first[aout] -map 0:v? -map [aout] -c:v copy -shortest input-censored.mp4

The output uses -map 0:v? (optional video) so it works on audio-only inputs too. -shortest ensures the sine generator doesn't push past the source duration.

Real-world Recipes

Three end-to-end audio pipelines that chain multiple verbs.

1. Podcast mastering (gate → compress → normalize → limit)

A standard delivery chain for a recorded interview. The order matters — gate first to clean up room noise, compress to even out levels, normalize to a target loudness, then limit to prevent any final overshoot.

bash
# 1. Gate out room noise below -45 dB
npx fqmpeg audio-gate raw.wav --threshold -45 -o step1.wav

# 2. Compress dynamics (3:1, gentle attack)
npx fqmpeg audio-compressor step1.wav --threshold -18 --ratio 3 -o step2.wav

# 3. Normalize to podcast standard (-16 LUFS)
npx fqmpeg normalize step2.wav --target -16 -o step3.wav

# 4. Brick-wall limit at -1 dB
npx fqmpeg limiter step3.wav --limit -1 -o final.wav

For high-volume production, replace the intermediate .wav files with named pipes or move to a single ffmpeg -af agate,acompressor,loudnorm,alimiter invocation. fqmpeg's --dry-run on each step gives you the building blocks.

2. Dialogue cleanup (highpass → denoise → de-ess by lowpass-bandpass split)

For dialog with mic rumble, room hiss, and harsh sibilance:

bash
# 1. Remove rumble below 80 Hz
npx fqmpeg audio-highpass dialogue.wav --freq 80 -o step1.wav

# 2. Reduce broadband hiss (steady-state noise)
npx fqmpeg audio-noise-reduce step1.wav --strength 15 -o step2.wav

# 3. Inspect loudness before final stages
npx fqmpeg loudness-meter step2.wav 2>&1 | grep -E '(I|LRA|True peak):'

# 4. Tame harsh sibilance with a gentle lowpass at 10 kHz
npx fqmpeg audio-lowpass step2.wav --freq 10000 -o final.wav

loudness-meter between steps is the audio-equivalent of crop-detect in the geometry cluster — measure, decide, then act.

3. Music-bed sync (delay correction + normalize for ducking prep)

When the music track was captured separately and needs to sit "under" a voice:

bash
# 1. Fix sync drift (music starts 350 ms early relative to voice)
npx fqmpeg audio-delay music.wav 350 -o synced.wav

# 2. Normalize music to a quieter target so it doesn't fight voice (-22 LUFS)
npx fqmpeg normalize synced.wav --target -22 -o quiet-music.wav

# 3. Fade in / out cleanly (3 s tails)
npx fqmpeg audio-fade quiet-music.wav --in 3 --out 3 --duration 120 -o bed.wav

# 4. Mix with voice (drop to raw ffmpeg or the C11 mix-audio verb)
ffmpeg -i voice.wav -i bed.wav -filter_complex amix=inputs=2:duration=longest:weights=1.5\ 1 final.wav

The chain stays in fqmpeg until the final mix — at the mix stage you either step up to raw FFmpeg or use the C11 mix-audio verb (covered in the audio-routing deep dive).

Frequently Asked Questions

How is normalize different from volume?

volume is a fixed gain multiplier or dB offset — it shifts every sample by the same amount and doesn't think about loudness. normalize runs the loudnorm (EBU R128) filter and shifts gain so the measured integrated loudness matches your target in LUFS. For "make this sound like everything else on the platform" you want normalize; for "make this exactly 6 dB louder" you want volume.

Which loudness target should I use for normalize?

-23 LUFS for EBU broadcast TV (Europe). -24 LUFS for ATSC A/85 (US broadcast). -16 LUFS for podcasts (most podcast hosts target this — and it's fqmpeg's default). -14 LUFS for streaming music (Spotify, YouTube, Apple Music all normalize incoming masters to around this level). -10 LUFS for "loudness war" commercial reference (don't actually deliver this — it'll lose detail to platform normalization).

Why does audio-pitch fail with "No such filter: rubberband"?

Your FFmpeg was built without --enable-librubberband. Static builds from BtbN and gyan.dev include it. On macOS the Homebrew ffmpeg formula includes Rubber Band; on Linux the official Debian package does too. If you can't replace your FFmpeg, fall back to asetrate=44100*1.122462,aresample=44100,atempo=1/1.122462 for a 2-semitone shift — it's lower quality but uses only built-in filters.

What's the difference between audio-fade and the C4 fade verb?

The C4 fade is video-only (visual cross-fade or dip-to-black). audio-fade only touches the audio track and copies the video stream. For both at once, run them in sequence: npx fqmpeg fade ... then npx fqmpeg audio-fade ... on the result.

How do I capture loudness-meter and detect-silence output?

Both verbs print to stderr and don't write a file. Redirect with 2> to capture, or pipe with 2>&1 to filter:

bash
npx fqmpeg loudness-meter input.mp4 2>&1 | grep -E '(I|LRA|True peak):'
npx fqmpeg detect-silence input.mp4 --threshold -40 2>&1 | grep silence_

In a CI or script, parse the stderr lines for the values you need.

Does trim-silence keep audio and video in sync?

No. The video stream is copied unchanged while audio gets shorter as silent sections are removed — so the output will drift. For an "interview cut" where the video should follow the audio cuts, you need a single-pass filter_complex with matched select and aselect filters. fqmpeg doesn't expose this directly; drop to raw FFmpeg or pre-edit with the C4 trim verb at the timestamps detect-silence reports.

Is censor a video mosaic?

No. censor replaces audio with a sine-tone bleep. For visual censoring use the C6 blur or pixelate verbs on a cropped region, or compose with drawbox (C8) for an opaque rectangle.

How do I chain multiple audio filters without intermediate files?

For two or three steps, fqmpeg's intermediate-file approach is fine — disk is fast and SSDs make the I/O cost negligible. For longer chains where you want a single FFmpeg invocation, run each verb with --dry-run, copy the -af value, and combine them with commas:

bash
ffmpeg -i input.wav -af "highpass=f=80,afftdn=nr=15,loudnorm=I=-16:TP=-1.5:LRA=11,alimiter=limit=-1dB:attack=5:release=50" -c:v copy out.wav

This is exactly what fqmpeg's single-step commands do internally — combining them just removes the disk round-trips.

What audio codec does fqmpeg use for output?

For most verbs, fqmpeg doesn't set -c:a explicitly, so FFmpeg picks its default encoder for the output container (typically AAC for MP4, Vorbis/Opus for WebM, MP3 for .mp3). The exception is the audio verb, which always outputs MP3 via libmp3lame -q:a 2.

Wrapping Up

fqmpeg's C9 cluster covers the audio engineer's everyday toolbox: 26 verbs you can chain to build a podcast master, clean up dialog, or sync a music bed without writing a single filter graph. The non-obvious bits — audio-delay's asymmetric semantics, audio-pitch's Rubber Band dependency, audio-eq's hardcoded mid band, trim-silence's A/V drift, silence-insert's tail-shifting behavior — are documented above so you don't hit them in production.

Two next steps:

  1. Run npx fqmpeg <verb> --help for any verb above to see the live option list.
  2. For creative effects (reverb, echo, chorus, phaser, etc.), see the C10 deep dive (creative audio effects). For routing, channels, and visualization (waveforms, spectrograms), see C11 (audio routing & visualization).

The next time you reach for a -af filter, check the verb list first.