fqmpeg's C10 cluster is the effects-pedal box of the toolkit — nine verbs that color audio rather than fix it. Two are time-based echoes (reverb, echo-effect). Five are LFO modulation effects (chorus, phaser, flanger, tremolo, vibrato). Two are stereo-field tricks (audio-karaoke, audio-stereo-widen).
Compared with the C9 dynamics/EQ verbs, C10 is small and the implementations are short — but every one of them wraps a multi-parameter FFmpeg filter behind a 2-or-3-option surface. This guide walks each verb against src/commands/ of fqmpeg 3.0.3 and is honest about what's hardcoded and why. (Some hardcodings are sensible — they hide DSP coefficients that would only invite footguns. Others are limitations that you should know about before you ship a render.)
What you'll get out of this guide
- A decision matrix for the 9 verbs by sonic effect (time-based / modulation / stereo)
- Exact FFmpeg invocation each verb generates (verified --dry-run output)
- Defaults, units, output filenames, and the filter coefficients fqmpeg fixes for you
- Three recipes — vocal warm-up, lo-fi guitar, AM-radio dialogue — and the escape hatches when the simplified surface isn't enough
The 9 Verbs at a Glance
All nine verbs preserve video with -c:v copy — drop a video file in and you get the same video with the processed audio.
| Group | Verbs | What they do |
|---|---|---|
| Time-based | reverb, echo-effect | Delay/reflection simulation via aecho |
| Modulation | chorus, phaser, flanger, tremolo, vibrato | LFO-driven pitch / time / amplitude modulation |
| Stereo | audio-karaoke, audio-stereo-widen | Center-channel removal, Haas-style widening |
Five things to know before reading on:
- reverb is not a true reverb. It's a single-tap aecho filter with hardcoded in_gain=0.8 and out_gain=0.88. Real reverb (impulse-response convolution or a Schroeder network) needs afir or the freeverb filter; for that, drop to raw FFmpeg. reverb here is "a touch of room" depth, not a cathedral.
- chorus ships a 3-voice preset that is not configurable. The voices have hardcoded delays 50|60|70 ms and decays 0.4|0.32|0.28. Only the modulation depth and speed are exposed. The rationale: well-tuned chorus needs intuition about voice spacing, and exposing all 6 parameters in a CLI would invite settings that sound broken. If you need a custom multi-voice arrangement, run the FFmpeg chorus filter directly (it accepts up to 32 voices).
- flanger --mix is internally remapped to width. FFmpeg's flanger filter uses width=0-100 for wet/dry blend, not mix=0-1. fqmpeg accepts the more conventional --mix 0.0-1.0 and multiplies by 100 before passing it in. So --mix 0.7 becomes width=70. (This was a B7 bugfix: earlier fqmpeg released a broken mix= form that the filter silently ignored.)
- tremolo and vibrato have identical option surfaces but completely different effects. Tremolo modulates volume (an LFO multiplies amplitude). Vibrato modulates pitch (an LFO shifts frequency). Same --freq/--depth flags, same defaults (5 Hz, 0.5), same range, but you cannot substitute one for the other.
- audio-karaoke only works on dead-center, dry vocals. The filter is the classic pan=stereo|c0=c0-c1|c1=c1-c0 phase-cancellation trick. It assumes the vocal is panned identically into both channels with no stereo widening, reverb, or chorus on the vocal bus. Modern pop mixes break all three of those assumptions. Expect residual artifacts.
Time-Based: Reverb & Echo
Both verbs use the same FFmpeg filter (aecho); the difference is configuration. reverb is one short tap (40 ms by default) used as ambience. echo-effect is a chain of decaying repeats used as a distinct musical effect.
reverb — Add reverb-like ambience to audio
A single-tap echo masquerading as reverb. Good for adding a touch of space, not for emulating a hall.
- Source: src/commands/reverb.js
- Filter: aecho=0.8:0.88:<delay>:<decay>
- Output: <input-stem>-reverb.<ext>
| Option | Default | Notes |
|---|---|---|
| --delay <ms> | 40 | Delay between dry and wet tap |
| --decay <n> | 0.5 | Decay factor (0.0-1.0) |
| -o, --output <path> | <input-stem>-reverb.<ext> | - |
$ npx fqmpeg reverb input.mp4 --dry-run
ffmpeg -i input.mp4 -af aecho=0.8:0.88:40:0.5 -c:v copy input-reverb.mp4
What's hardcoded and why: in_gain=0.8 and out_gain=0.88 are fixed in the aecho filter string. These are dry/wet attenuation — they don't change the character of the echo, only its loudness relative to the source. fqmpeg's choice is a safe mid-blend that doesn't clip on typical inputs.
When you outgrow it: real reverb is multiple decorrelated delay lines (a Schroeder network) or impulse-response convolution. For a credible hall/plate sound, switch to freeverb (if your FFmpeg build has it) or afir with an impulse-response WAV:
ffmpeg -i input.mp4 -i hall_ir.wav -filter_complex "[0:a][1:a]afir=dry=10:wet=10[a]" \
-map 0:v -map "[a]" -c:v copy hall-reverb.mp4
echo-effect — Add echo / delay with multiple taps
Generates a chain of echoes at delay, 2×delay, 3×delay, ..., with tap i attenuated by decay^i, so each repeat is quieter than the previous one by a factor of decay.
- Source: src/commands/echo-effect.js
- Filter: aecho=0.8:0.88:<d>|<2d>|<3d>...:<c>|<c²>|<c³>...
- Output: <input-stem>-echo.<ext>
| Option | Default | Notes |
|---|---|---|
| --delay <ms> | 500 | Base delay; subsequent taps are 2×, 3×, ... |
| --decay <n> | 0.3 | Decay factor; subsequent taps decay geometrically |
| --repeats <n> | 3 | Number of echo taps |
| -o, --output <path> | <input-stem>-echo.<ext> | - |
$ npx fqmpeg echo-effect input.mp4 --dry-run
ffmpeg -i input.mp4 -af aecho=0.8:0.88:500|1000|1500:0.300|0.090|0.027 -c:v copy input-echo.mp4
What's hardcoded and why: like reverb, the in_gain/out_gain are fixed at 0.8:0.88. The geometric decay (decay^i for tap i) is not "hardcoded" in a footgun sense — it's the natural physical model for a single reflective surface losing energy on each bounce. Exposing per-tap delays/decays would let you build pathological combs, so fqmpeg ties them together.
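The tap lists in the dry-run line above follow directly from the three options. As a rough sketch of that arithmetic (not fqmpeg's actual source; the function name echo_taps is hypothetical, and the three-decimal formatting is inferred from the dry-run output), tap i gets delay i×delay and decay decay^i:

```shell
# Hypothetical helper: rebuild the aecho argument that echo-effect prints.
# Usage: echo_taps <delay_ms> <decay> <repeats>
echo_taps() {
  awk -v d="$1" -v c="$2" -v n="$3" 'BEGIN {
    delays = ""; decays = ""
    for (i = 1; i <= n; i++) {
      sep = (i > 1) ? "|" : ""
      delays = delays sep (d * i)                 # i-th tap: i * base delay
      decays = decays sep sprintf("%.3f", c ^ i)  # i-th tap: decay to the i-th power
    }
    printf "aecho=0.8:0.88:%s:%s\n", delays, decays
  }'
}

echo_taps 500 0.3 3   # aecho=0.8:0.88:500|1000|1500:0.300|0.090|0.027
```

If you paste such a string into a raw FFmpeg command, quote it (for example -af "$(echo_taps 500 0.3 3)"), because | is a shell metacharacter.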
When you outgrow it: if you want irregular tap spacing (a slap-back into a long tail, or stereo ping-pong), drop straight to aecho with |-separated lists, or use adelay for an exact-millisecond stereo offset:
ffmpeg -i input.mp3 -af "aecho=0.8:0.9:60|300|800:0.5|0.3|0.15" out.mp3
Modulation Effects
All five modulation verbs are driven by a low-frequency oscillator (LFO) that varies some property of the signal over time. The shared mental model: pick a speed (how fast the LFO cycles, in Hz) and a depth (how much it varies).
chorus — Add a chorus / thickening effect
Layers 3 slightly detuned, slightly delayed copies on top of the dry signal. Sounds like multiple performers playing the same line.
- Source: src/commands/chorus.js
- Filter: chorus=0.5:0.9:50|60|70:0.4|0.32|0.28:<depth>|<depth>|<depth>:<speed>|<speed>|<speed>
- Output: <input-stem>-chorus.<ext>
| Option | Default | Notes |
|---|---|---|
| --depth <ms> | 2 | Modulation depth (sweep range, applied to all 3 voices) |
| --speed <Hz> | 0.5 | Modulation speed (applied to all 3 voices) |
| -o, --output <path> | <input-stem>-chorus.<ext> | - |
$ npx fqmpeg chorus input.mp4 --dry-run
ffmpeg -i input.mp4 -af chorus=0.5:0.9:50|60|70:0.4|0.32|0.28:2|2|2:0.5|0.5|0.5 -c:v copy input-chorus.mp4
What's hardcoded and why: quite a lot. The 3-voice configuration is fixed: per-voice delays 50|60|70 ms, per-voice decays 0.4|0.32|0.28, in/out gain 0.5:0.9. Only --depth and --speed are surfaced, and both are applied uniformly to all three voices.
This is a deliberate "preset" choice. Well-tuned chorus depends on the spread between voices — if all three have the same delay, you get a single thicker echo, not chorus. If the delays are too close (e.g. 50|51|52), it sounds like a comb filter. The fqmpeg preset (50|60|70 ms with descending decays) is a "warm pop chorus" that works on vocals, electric guitar, and synth pads. It will not give you ethereal pads with wide stereo spread — that needs different voice spacing and different per-voice modulation rates.
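To make the "two knobs, six fixed coefficients" shape concrete, here is a minimal sketch that assembles the same filter string from a depth and a speed. It mirrors the documented dry-run output only; chorus_filter is a hypothetical name, not a function from fqmpeg.

```shell
# Hypothetical helper: the fixed 3-voice preset with depth and speed
# repeated once per voice. Usage: chorus_filter <depth_ms> <speed_hz>
chorus_filter() {
  printf 'chorus=0.5:0.9:50|60|70:0.4|0.32|0.28:%s|%s|%s:%s|%s|%s\n' \
    "$1" "$1" "$1" "$2" "$2" "$2"
}

chorus_filter 2 0.5   # chorus=0.5:0.9:50|60|70:0.4|0.32|0.28:2|2|2:0.5|0.5|0.5
```

Seen this way, a custom chorus is mostly a matter of replacing the two fixed lists (delays and decays) and letting depth and speed differ per voice.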
When you outgrow it: invoke the chorus filter directly. It accepts arbitrary voice counts via |-separated lists:
ffmpeg -i input.wav -af "chorus=0.6:0.9:30|45|60|80:0.3|0.25|0.2|0.15:1.5|2|2.5|3:0.3|0.4|0.5|0.6" wide-chorus.wav
phaser — Apply a sweeping phaser effect
Combines the signal with a phase-shifted copy of itself, producing the classic "whoosh" sweep.
- Source: src/commands/phaser.js
- Filter: aphaser=speed=<speed>:decay=<decay>
- Output: <input-stem>-phaser.<ext>
| Option | Default | Notes |
|---|---|---|
| --speed <Hz> | 0.5 | LFO speed |
| --decay <n> | 0.4 | Decay factor (0.0-1.0) controls feedback intensity |
| -o, --output <path> | <input-stem>-phaser.<ext> | - |
$ npx fqmpeg phaser input.mp4 --dry-run
ffmpeg -i input.mp4 -af aphaser=speed=0.5:decay=0.4 -c:v copy input-phaser.mp4
The FFmpeg aphaser filter has additional parameters (in_gain, out_gain, delay, and type, which selects a triangular or sinusoidal LFO) that fqmpeg leaves at the filter's own defaults. Pass them directly to raw aphaser=... if you want a different LFO shape or base delay.
flanger — Apply a flanger effect
Like phaser but with a much shorter, modulating delay — gives the metallic "jet engine" sweep familiar from late-70s rock.
- Source: src/commands/flanger.js
- Filter: flanger=speed=<speed>:depth=<depth>:width=<mix×100>
- Output: <input-stem>-flanger.<ext>
| Option | Default | Notes |
|---|---|---|
| --speed <Hz> | 0.5 | LFO speed |
| --depth <ms> | 2 | Modulation depth |
| --mix <n> | 0.7 | Dry/wet mix (0.0-1.0); fqmpeg multiplies by 100 internally |
| -o, --output <path> | <input-stem>-flanger.<ext> | - |
$ npx fqmpeg flanger input.mp4 --dry-run
ffmpeg -i input.mp4 -af flanger=speed=0.5:depth=2:width=70 -c:v copy input-flanger.mp4
The --mix → width mapping: FFmpeg's flanger filter calls its wet/dry parameter width and accepts 0-100. fqmpeg uses the more conventional --mix 0.0-1.0 and silently multiplies by 100. This was a 3.0 bugfix: earlier fqmpeg passed mix=0.7 directly to the filter, which has no mix option; the value was ignored, so the effect ran at the filter's own default width (71 per the FFmpeg documentation) rather than the requested blend. Run with --dry-run to confirm your --mix 0.7 produces width=70.
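The mapping itself is one multiplication. A minimal sketch, assuming nothing beyond the documented behavior (flanger_filter is a hypothetical name, and the range check is an illustration rather than something fqmpeg is known to do):

```shell
# Hypothetical helper: turn a 0.0-1.0 mix into FFmpeg's 0-100 width.
# Usage: flanger_filter <speed_hz> <depth_ms> <mix>
flanger_filter() {
  awk -v s="$1" -v d="$2" -v m="$3" 'BEGIN {
    if (m < 0 || m > 1) {
      print "mix must be between 0.0 and 1.0" > "/dev/stderr"
      exit 1
    }
    # %g prints 70 rather than 70.000000 for whole-number widths
    printf "flanger=speed=%s:depth=%s:width=%g\n", s, d, m * 100
  }'
}

flanger_filter 0.5 2 0.7   # flanger=speed=0.5:depth=2:width=70
```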
tremolo — Apply tremolo (volume oscillation)
Modulates output volume by an LFO. Classic surf-rock guitar amp effect.
- Source: src/commands/tremolo.js
- Filter: tremolo=f=<freq>:d=<depth>
- Output: <input-stem>-tremolo.<ext>
| Option | Default | Notes |
|---|---|---|
| --freq <Hz> | 5 | LFO frequency |
| --depth <n> | 0.5 | Modulation depth (0-1); higher = more pronounced volume swell |
| -o, --output <path> | <input-stem>-tremolo.<ext> | - |
$ npx fqmpeg tremolo input.mp4 --dry-run
ffmpeg -i input.mp4 -af tremolo=f=5:d=0.5 -c:v copy input-tremolo.mp4
vibrato — Apply vibrato (pitch oscillation)
Modulates pitch (not volume) by an LFO. Same option surface as tremolo — be careful not to confuse them.
- Source: src/commands/vibrato.js
- Filter: vibrato=f=<freq>:d=<depth>
- Output: <input-stem>-vibrato.<ext>
| Option | Default | Notes |
|---|---|---|
| --freq <Hz> | 5 | LFO frequency |
| --depth <n> | 0.5 | Modulation depth (0-1); higher = wider pitch swing |
| -o, --output <path> | <input-stem>-vibrato.<ext> | - |
$ npx fqmpeg vibrato input.mp4 --dry-run
ffmpeg -i input.mp4 -af vibrato=f=5:d=0.5 -c:v copy input-vibrato.mp4
Tremolo vs vibrato: identical CLI, opposite effect. If you ran tremolo and the result sounds like the source went seasick (pitch wobbling) instead of swelling (volume rising and falling), you accidentally called vibrato. Quick test: at --depth 1.0 --freq 0.5, tremolo cycles between silent and loud once every 2 seconds; vibrato cycles between low and high pitch.
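To see what "an LFO multiplies amplitude" means numerically, the sketch below evaluates a textbook tremolo gain curve, gain(t) = 1 - d/2 + (d/2)·sin(2πft). This is an assumed generic model for illustration, not a transcription of FFmpeg's tremolo source, and it says nothing about vibrato, which modulates a delay line instead of a gain. The name tremolo_gain is hypothetical.

```shell
# Textbook tremolo LFO (assumed model): gain swings between 1-d and 1.
# Usage: tremolo_gain <freq_hz> <depth> <time_s>
tremolo_gain() {
  awk -v f="$1" -v d="$2" -v t="$3" 'BEGIN {
    pi = atan2(0, -1)
    printf "%.2f\n", 1 - d / 2 + (d / 2) * sin(2 * pi * f * t)
  }'
}

tremolo_gain 0.5 1.0 0.5   # quarter cycle, LFO peak:      1.00
tremolo_gain 0.5 1.0 1.5   # three-quarter cycle, trough:  0.00
tremolo_gain 5 0.5 0.15    # default settings at a trough: 0.50
```

Under this model, --depth 1.0 --freq 0.5 reaches full level and silence once per 2-second cycle, matching the quick test described above, while the default depth of 0.5 only dips to half level.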
Stereo Manipulation
audio-karaoke — Remove center-panned vocals
Subtracts the right channel from the left and vice versa, cancelling anything panned identically into both channels. Classic karaoke trick.
- Source: src/commands/audio-karaoke.js
- Filter: pan=stereo|c0=c0-c1|c1=c1-c0
- Options: none, just an input and an optional -o
- Output: <input-stem>-karaoke.<ext>
$ npx fqmpeg audio-karaoke song.mp3 --dry-run
ffmpeg -i song.mp3 -af pan=stereo|c0=c0-c1|c1=c1-c0 -c:v copy song-karaoke.mp3
The honest limitations:
- Works only on dead-center, dry vocals. If the vocal has reverb, doubler, chorus, or any stereo widening on its own bus, those wet components survive the subtraction.
- Kills anything center-panned, including kick drum, bass, and snare. Most pop mixes pan all three of those to the center alongside the vocal, so you lose the rhythm section along with the vocal.
- Modern streaming masters are heavily processed, and the "center channel" assumption breaks down — you'll typically hear residual vocal at -10 to -15 dB rather than full removal.
For credible vocal isolation/removal on modern tracks, the only reliable approach is ML-based source separation (Spleeter, Demucs) — that's outside FFmpeg's scope.
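The subtraction is easy to check by hand on a single stereo sample. The numbers below are made up for illustration (they are not measurements of any real mix), and karaoke_demo is a hypothetical name; the point is which terms survive c0=c0-c1.

```shell
# Toy numbers, not real audio: one stereo sample built from four sources.
# vocal and kick are centered (equal in L and R); the guitar is panned left
# and the vocal's stereo reverb differs between the two channels.
karaoke_demo() {
  awk 'BEGIN {
    vocal = 0.50; kick = 0.30
    guitar_l = 0.20; guitar_r = 0.05
    verb_l = 0.04;  verb_r = -0.03

    left  = vocal + kick + guitar_l + verb_l
    right = vocal + kick + guitar_r + verb_r

    # pan=stereo|c0=c0-c1|c1=c1-c0
    out_l = left - right
    out_r = right - left

    printf "c0=%.2f c1=%.2f\n", out_l, out_r
  }'
}

karaoke_demo   # c0=0.22 c1=-0.22
```

The vocal and kick cancel completely, while the guitar difference (0.15) and the reverb difference (0.07) remain. Note also that c1 is always the negative of c0, so the result is effectively one signal with inverted polarity on the right channel rather than a true stereo instrumental.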
audio-stereo-widen — Widen the stereo image
Adds a Haas-style short delay between channels to push perceived width out past the speakers.
- Source: src/commands/audio-stereo-widen.js
- Filter: stereowiden=delay=<delay>
- Output: <input-stem>-wide.<ext>
| Option | Default | Notes |
|---|---|---|
| --delay <ms> | 20 | Inter-channel delay; higher = wider but more phasey |
| -o, --output <path> | <input-stem>-wide.<ext> | - |
$ npx fqmpeg audio-stereo-widen input.mp4 --dry-run
ffmpeg -i input.mp4 -af stereowiden=delay=20 -c:v copy input-wide.mp4
Mono summation warning: the Haas trick relies on small inter-channel delays, so if your output is summed to mono (radio broadcast, phone speaker, Bluetooth headset in mono mode), the delay becomes a comb filter and the audio sounds thin and hollow. Check mono compatibility by rendering a short mono downmix and listening to it: ffmpeg -i input-wide.mp4 -ac 1 -t 10 mono-test.mp3. If the widening is for an online video and you don't care about mono playback, ignore this.
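For a feel of where the comb filter bites, the sketch below lists the first notch frequencies under an idealized model: the mono sum is a signal plus a copy of itself delayed by τ, which nulls at odd multiples of 1/(2τ). This is an assumption made for illustration; the real stereowiden filter has further parameters beyond delay, so actual notch positions and depths will differ. The name comb_notches is hypothetical.

```shell
# Idealized model: mono = x(t) + x(t - tau).
# Notches fall at (2k+1) / (2*tau) for k = 0, 1, 2, ...
# Usage: comb_notches <delay_ms> <count>
comb_notches() {
  awk -v ms="$1" -v n="$2" 'BEGIN {
    for (k = 0; k < n; k++)
      printf "%s%g", (k ? " " : ""), (2 * k + 1) * 1000 / (2 * ms)
    printf "\n"
  }'
}

comb_notches 20 4   # 25 75 125 175   (Hz, default --delay 20)
comb_notches 10 4   # 50 150 250 350  (Hz, --delay 10)
```

In this model a shorter delay spaces the notches further apart and moves the first one higher, which is one reason a lower --delay tends to survive mono summation better.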
Real-World Recipes
Vocal warm-up: lift a dry voice track
A dry voice recording sounds clinical. A touch of reverb and a light chorus adds the production polish typical of podcast intros and YouTube voice-overs — without sounding processed.
# Step 1: subtle chorus for body (very light depth/speed)
npx fqmpeg chorus voice.wav --depth 1.5 --speed 0.3 -o voice-chorus.wav
# Step 2: short reverb tail for room sense
npx fqmpeg reverb voice-chorus.wav --delay 60 --decay 0.3 -o voice-ready.wav
Why this order: chorus first thickens the source, then reverb places the thickened result in a small room. Reverse the order and the chorus voices each get their own reverb tail — muddier.
Lo-fi guitar layer: phaser + tremolo
For a chillhop guitar bed, layer phaser sweep onto tremolo pulse:
# Slow phaser sweep (long cycle)
npx fqmpeg phaser guitar.wav --speed 0.2 --decay 0.5 -o guitar-phaser.wav
# Slow tremolo pulse on top (1 cycle per second)
npx fqmpeg tremolo guitar-phaser.wav --freq 1 --depth 0.4 -o guitar-lofi.wav
The phaser provides the textural movement; the tremolo provides the rhythmic pulse. Both at slow rates — fast modulation pushes this from "lo-fi" to "broken cassette."
AM-radio dialogue effect
The classic "voice through a telephone" effect needs bandpass filtering (in C9, not here) plus distortion or echo. Combining audio-bandpass with echo-effect is a credible quick approximation:
# Step 1: telephone-band filter (300-3400 Hz) — C9 verb
npx fqmpeg audio-bandpass voice.wav --low 300 --high 3400 -o voice-band.wav
# Step 2: short tinny echo
npx fqmpeg echo-effect voice-band.wav --delay 60 --decay 0.5 --repeats 2 -o voice-radio.wav
Real AM-radio dialogue also adds amplitude clipping and noise — for those, you'd reach for raw FFmpeg's acrusher and anoisesrc. fqmpeg doesn't currently expose either.
Frequently Asked Questions
Why is reverb so different from a real DAW reverb plugin?
Because under the hood it's a single-tap aecho filter, not an impulse-response or Schroeder network reverb. With --delay 40 --decay 0.5 you get one discrete reflection at 40 ms, attenuated to 50% — that's enough to suggest "small room" if mixed lightly, but it lacks the dense early-reflection cluster and diffuse tail that defines a real space. For credible reverb, switch to ffmpeg ... -af freeverb=... (if your build has it) or convolution via afir with an impulse-response WAV.
Can I tune chorus to sound less "warm" and more "ethereal"?
Not via fqmpeg — the 3-voice configuration (delays 50|60|70 ms, decays 0.4|0.32|0.28) is hardcoded. You can change only the LFO depth and speed, which controls how much the existing 3 voices wobble, not how they're spaced. To get an ethereal/wide chorus (e.g. 8 voices spread 20-200 ms with low decay), call FFmpeg's chorus filter directly with custom |-separated lists. Use npx fqmpeg chorus input --dry-run to see the format, then edit the lists in a manual FFmpeg invocation.
flanger --mix 0.7 looks like it's producing width=70 — is that a bug?
No, that's the intended behavior. FFmpeg's underlying flanger filter expects width=0-100 for wet/dry, but the usual CLI convention for "mix" is 0-1. fqmpeg accepts the 0-1 form and multiplies by 100. Earlier (pre-v3.0) versions passed mix=0.7 directly to the filter, which has no mix option; the value was silently ignored and the filter ran at its own default width (71 per the FFmpeg documentation) instead of the requested blend. The multiplication is the fix.
How is vibrato different from audio-pitch (in C9)?
audio-pitch shifts pitch by a fixed number of semitones, applied uniformly to the whole track. vibrato oscillates pitch up and down around the original at a chosen rate — the average pitch is unchanged. Use audio-pitch to transpose a melody to a new key; use vibrato to make a sustained note "shimmer."
Why does audio-karaoke leave the vocal partly audible?
It assumes vocals are panned identically into both stereo channels (which makes them cancel when you subtract one from the other). Modern pop production breaks this assumption: vocals often have stereo widening, doublers, reverb, and chorus on a stereo bus — none of which cancel. Drums, bass, and other center-panned elements are also cancelled, so what's left is a thin instrumental + ghosted vocal. For real vocal removal, use ML-based source separation tools (Spleeter, Demucs, Moises) outside FFmpeg.
Will audio-stereo-widen break on mono playback?
Yes, that's its main risk. The Haas-style inter-channel delay (default 20 ms) creates phase relationships that summed to mono become a comb filter — the audio sounds hollow and notched. If your final output might be played on a single speaker (smart speakers, phone speakerphone, mono Bluetooth, AM radio simulcast), test the mono downmix first: ffmpeg -i input-wide.mp4 -ac 1 -t 10 mono-test.mp3. If it sounds significantly worse than the source, lower --delay (try 8-12 ms) or skip widening for that delivery.
Can I chain multiple C10 verbs in a single FFmpeg pass to avoid generation loss?
Not via fqmpeg directly: each verb produces its own output and re-encodes the audio. Passing -c:a copy won't help, because filtered audio has to be re-encoded. Instead, either encode each step to a lossless format like FLAC or WAV with raw FFmpeg, or copy the filter strings from each --dry-run and combine them into one FFmpeg invocation:
# Combine chorus + reverb in one pass (filters from --dry-run)
ffmpeg -i voice.wav -af "chorus=0.5:0.9:50|60|70:0.4|0.32|0.28:1.5|1.5|1.5:0.3|0.3|0.3,aecho=0.8:0.88:60:0.3" voice-warm.wav
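Copying filter strings by hand gets tedious with longer chains. As a rough sketch, the helper below pulls the -af argument out of a dry-run line and joins two of them with a comma. It assumes the single-line output format shown throughout this guide and a filter string without spaces; af_of is a hypothetical name, and the two sample lines are written out by hand to match that format.

```shell
# Hypothetical helper: print the argument that follows -af in a
# dry-run line read from stdin.
af_of() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "-af") { print $(i + 1); exit } }'
}

# Sample dry-run lines in the format this guide shows
chorus_line='ffmpeg -i voice.wav -af chorus=0.5:0.9:50|60|70:0.4|0.32|0.28:1.5|1.5|1.5:0.3|0.3|0.3 -c:v copy voice-chorus.wav'
reverb_line='ffmpeg -i voice.wav -af aecho=0.8:0.88:60:0.3 -c:v copy voice-reverb.wav'

# Join the two filters into one comma-separated chain
chain="$(printf '%s\n' "$chorus_line" | af_of),$(printf '%s\n' "$reverb_line" | af_of)"
printf '%s\n' "$chain"
```

The printed chain can then be passed as a single quoted -af "$chain" argument. If --dry-run writes to standard output, piping it straight into af_of should work the same way, but that is an assumption worth checking on your install.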
Tremolo and vibrato have the same flags — how do I remember which is which?
A mnemonic: tremolo modulates volume (think of "trembling loudness" — a held note that pulses); vibrato modulates pitch (think of "vibrating string" — a held note that wobbles in tone). Same --freq and --depth, opposite musical sensation.
Wrapping Up
The nine C10 verbs cover the most common creative-effect operations you'd reach for between EQ/dynamics (C9) and final delivery:
- reverb, echo-effect for time-based depth (single-tap reverb is honest about being a 1-tap aecho; echo geometric decay is the natural physical model)
- chorus, phaser, flanger, tremolo, vibrato for LFO modulation (chorus has the most hidden machinery: the 3-voice preset is hardcoded and intentional; tremolo and vibrato share an option surface but do completely different things)
- audio-karaoke, audio-stereo-widen for stereo-field tricks (karaoke only works on dry, center-panned vocals; widen breaks on mono playback)
Every verb prints its underlying FFmpeg invocation under --dry-run, so when the simplified surface isn't enough (custom 8-voice chorus, ping-pong stereo echoes, a different phaser LFO shape), copy the filter, edit the parameters, and call FFmpeg directly. For the broader fqmpeg map, see the fqmpeg complete guide.