fqmpeg's C8 cluster is the composition toolbox — 13 verbs that put something on top of (or next to) your video. Five draw static graphics or text (watermark, text, drawbox, timecode, border). Four combine multiple videos into one canvas (pip, pip-grid, blend, stack). One creates a video from a still image and audio (picture). Three burn analytical information into the frame for review (video-info-overlay, histogram-overlay, progress).
This guide checks each verb against its source in `src/commands/` of fqmpeg 3.0.3 — the underlying FFmpeg filter, the defaults, the output filename, and the gotchas you can't see from `--help` alone (`text` auto-escapes `:` and `'` so you can pass them literally; `pip-grid` auto-detects 2x1/2x2/3x3 from the input count; `progress` needs a duration argument because FFmpeg can't reference the source duration inside a filter expression; `histogram-overlay`'s position and size are hardcoded).
## What you'll get out of this guide
- A decision matrix for the 13 verbs by task (graphics / multi-video / image-to-video / analytical)
- Exact FFmpeg invocation each verb generates (verified `--dry-run` output)
- Defaults, ranges, position keywords, and output filenames for every command
- Three end-to-end recipes — brand a YouTube clip, build a comparison reel, document a screen recording
## The 13 Verbs at a Glance
The cluster splits into four task groups. Pick the group, then the verb.
| Group | Verbs | What they do |
|---|---|---|
| Static graphics & text | watermark, text, drawbox, timecode, border | Burn a logo image, text string, rectangle, running timestamp, or solid frame around your video |
| Multi-video composition | pip, pip-grid, blend, stack | Picture-in-picture, 2-9-input grids, opacity blends, or side-by-side / over-under stacks |
| Image-to-video | picture | Promote a still image + audio into a playable MP4 (podcast, song-art video) |
| Analytical overlays | video-info-overlay, histogram-overlay, progress | Burn frame number / PTS / picture-type, a histogram or waveform, or a time-driven progress bar |
Three things to know before reading on:
1. **`text` auto-escapes `:` and `'`.** The `drawtext` filter treats `:` as an option separator and `'` as a string delimiter, so a literal `Time: 12:34` would normally need manual escaping. fqmpeg pre-escapes these in the JS wrapper (`'` → `'\\\\''`, `:` → `\:`), so you can pass the string as-is. The flip side: if you try to inject `drawtext` options through the string, fqmpeg blocks that — by design.
2. **`pip-grid` auto-detects layout from input count.** Pass 2 inputs and you get `2x1`; 3-4 gives `2x2`; 5-9 gives `3x3`. Empty grid slots are filled by repeating the last input — pass 5 videos to a `3x3` and slots 6-9 are clones of input 5. Override with `--layout 2x2` (etc.) when you want to force a shape.
3. **`progress` needs a `<duration>` positional argument.** FFmpeg's `drawbox` can't reference `total_duration` inside an expression — only `t` (the running time). fqmpeg works around this by taking the duration as an explicit argument and baking it into the formula `iw*t/<duration>`. If you don't know the duration, run `npx fqmpeg duration input.mp4` first.
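The escaping rule can be sketched in Python (illustrative only — fqmpeg's real transform lives in its JS wrapper and layers shell quoting on top, so the exact escape sequences differ from this single-backslash form):

```python
def escape_drawtext(text: str) -> str:
    # drawtext treats ':' as an option separator and "'" as a string
    # delimiter, so both must be escaped before inlining into the filter.
    # (Sketch of the behavior described above, not fqmpeg's source.)
    return text.replace("\\", "\\\\").replace(":", "\\:").replace("'", "\\'")

print(escape_drawtext("Time: 12:34"))  # Time\: 12\:34
print(escape_drawtext("it's live"))    # it\'s live
```

Because the whole string is escaped before inlining, a user string can never terminate the `text=` option early and smuggle in extra `drawtext` options — which is exactly the injection the guide says fqmpeg blocks.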
## Static Graphics & Text

### watermark — Image overlay (PNG logo, etc.)
Overlays an image on a video at one of nine preset positions. Audio is preserved unchanged.
- Source: `src/commands/watermark.js`
- Filter: `overlay=<x>:<y>` where `<x>:<y>` is computed from `--pos` and `--margin`
- Audio: `-c:a copy` (untouched)
| Argument / Option | Default | Choices / Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<image>` | required | Watermark image (PNG with alpha recommended) |
| `--pos <position>` | bottom-right | top-left, top, top-right, left, center, right, bottom-left, bottom, bottom-right |
| `--margin <n>` | 10 | Pixels from the chosen edge |
| `-o, --output <path>` | `<input-stem>-watermarked.<ext>` | — |
```
$ npx fqmpeg watermark input.mp4 logo.png --pos bottom-right --margin 10 --dry-run
ffmpeg -i input.mp4 -i logo.png -filter_complex overlay=W-w-10:H-h-10 -c:a copy input-watermarked.mp4
```
The image's native size is used — there's no --scale here. If your logo is too large, resize it first with an image editor or pre-process with ffmpeg -i logo.png -vf scale=120:-1 logo-small.png. Center positions (top, bottom, center, left, right) compute the offset from the frame center, so --margin only affects the perpendicular axis (e.g. bottom uses margin from the bottom edge but is horizontally centered regardless of margin).
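The nine-keyword position logic can be sketched as a small mapping onto `overlay` expressions (`W`/`H` are the main frame's size, `w`/`h` the overlay's; a hypothetical re-implementation, not fqmpeg's source):

```python
def overlay_xy(pos: str, margin: int = 10) -> str:
    # Map a position keyword to an overlay x:y expression. Single-axis
    # keywords ("top", "left", ...) center the other axis, which is why
    # --margin only affects the perpendicular axis for them.
    m = str(margin)
    xs = {"left": m, "center": "(W-w)/2", "right": f"W-w-{m}"}
    ys = {"top": m, "center": "(H-h)/2", "bottom": f"H-h-{m}"}
    parts = pos.split("-")          # "bottom-right" -> ["bottom", "right"]
    if len(parts) == 1:             # "top", "left", "center", ...
        parts = [parts[0], "center"] if parts[0] in ys else ["center", parts[0]]
    return f"{xs[parts[1]]}:{ys[parts[0]]}"

print(overlay_xy("bottom-right", 10))  # W-w-10:H-h-10  (matches the dry-run)
print(overlay_xy("top", 20))           # (W-w)/2:20
```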
### text — Draw text / title onto video
Burns a text string at one of seven preset positions, with optional background box and time-window enable.
- Source: `src/commands/text.js`
- Filter: `drawtext=text='<escaped>':fontsize=N:fontcolor=C:<pos>[:box=1:boxcolor=...][:enable='...']`
- Auto-escaping: `'` and `:` in the input string are escaped before being inlined
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<string>` | required | Text to draw (positional, quote shell-special characters yourself) |
| `--pos <position>` | center | top-left, top, top-right, center, bottom-left, bottom, bottom-right |
| `--font-size <n>` | 48 | Positive integer |
| `--color <name>` | white | Any FFmpeg color name or `0xRRGGBB` |
| `--bg <color>` | (none) | Background box color, e.g. `black@0.5` for 50% opacity |
| `--start <sec>` | (none) | Show text from this time |
| `--end <sec>` | (none) | Hide text after this time |
| `-o, --output <path>` | `<input-stem>-text.<ext>` | — |
```
$ npx fqmpeg text input.mp4 "Hello, World" --pos top --font-size 64 --color yellow --bg black@0.5 --dry-run
ffmpeg -i input.mp4 -vf "drawtext=text='Hello, World':fontsize=64:fontcolor=yellow:x=(w-text_w)/2:y=20:box=1:boxcolor=black@0.5:boxborderw=8" -c:a copy input-text.mp4
```
Pass strings with colons and apostrophes literally — --bg "black@0.5" and text "Time: 12:34" both work without manual escaping. The boxborderw=8 (8 pixels of padding inside the box) is hardcoded; if you need a tighter or looser box, copy the --dry-run and edit. For multi-line text, FFmpeg's drawtext doesn't wrap — use \n literally inside the string and the filter will honor it on most builds, but spacing between lines may need a manually-tuned line_spacing=N appended (again, copy the --dry-run).
--start and --end enable time-windowed display via drawtext's enable expression. Both alone use gte(t,N) or lte(t,N); both together produce between(t,A,B).
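The rule above reduces to a few lines (a sketch of the documented behavior; the function name is illustrative):

```python
def enable_expr(start=None, end=None):
    # Build drawtext's enable expression from optional --start/--end:
    # both together -> between(t,A,B); one alone -> gte/lte; neither -> none.
    if start is not None and end is not None:
        return f"between(t,{start},{end})"
    if start is not None:
        return f"gte(t,{start})"
    if end is not None:
        return f"lte(t,{end})"
    return None

print(enable_expr(0, 3))  # between(t,0,3)
```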
### drawbox — Draw a rectangle / border onto video
Draws an outline (or filled) rectangle at a fixed position. Useful for highlighting an area in a tutorial, censoring a fixed region, or marking up a screen recording.
- Source: `src/commands/drawbox.js`
- Filter: `drawbox=x=X:y=Y:w=W:h=H:color=C:t=T[:enable='...']`
- Region format: `x:y:w:h` strict regex — anything else exits with `Error: region must be x:y:w:h`
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<region>` | required | `x:y:w:h` in pixels (top-left origin) |
| `--color <name>` | red | FFmpeg color name |
| `--thickness <n>` | 3 | Positive integer, or the literal string `fill` for a solid-filled box |
| `--start <sec>` | (none) | Show from this time |
| `--end <sec>` | (none) | Hide after this time |
| `-o, --output <path>` | `<input-stem>-boxed.<ext>` | — |
```
$ npx fqmpeg drawbox input.mp4 100:50:200:150 --color yellow --thickness fill --dry-run
ffmpeg -i input.mp4 -vf drawbox=x=100:y=50:w=200:h=150:color=yellow:t=fill -c:a copy input-boxed.mp4
```
--thickness fill is the censorship pattern — solid color box covers the region. For an outline-only highlight, use a positive integer (3–8 reads well at 1080p). Combine with --start/--end to highlight only the seconds where something matters in a tutorial. The region uses absolute pixel coordinates — for a coordinate-percentage approach you'd need to know the frame size first (npx fqmpeg info input.mp4).
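The strict region validation is easy to model (a sketch; fqmpeg's actual regex lives in the JS wrapper):

```python
import re

REGION = re.compile(r"^(\d+):(\d+):(\d+):(\d+)$")

def parse_region(s: str):
    # Accept only the exact x:y:w:h integer form; reject everything else
    # with the same message the CLI prints.
    m = REGION.match(s)
    if not m:
        raise ValueError("region must be x:y:w:h")
    return tuple(int(g) for g in m.groups())

print(parse_region("100:50:200:150"))  # (100, 50, 200, 150)
```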
### timecode — Burn timecode / running timestamp
Overlays the running play position as HH:MM:SS.mmm text in a black semi-transparent box at a corner.
- Source: `src/commands/timecode.js`
- Filter: `drawtext=text='%{pts\\:hms}':fontsize=N:fontcolor=C:<pos>:box=1:boxcolor=black@0.5:boxborderw=4`
- Format: `%{pts:hms}` — FFmpeg's built-in HMS time formatter (hardcoded; no other formats exposed)
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `--pos <position>` | top-left | top-left, top-right, bottom-left, bottom-right (corners only) |
| `--font-size <n>` | 24 | Positive integer |
| `--color <name>` | white | FFmpeg color |
| `-o, --output <path>` | `<input-stem>-timecode.<ext>` | — |
```
$ npx fqmpeg timecode input.mp4 --pos top-right --font-size 32 --dry-run
ffmpeg -i input.mp4 -vf drawtext=text='%{pts\:hms}':fontsize=32:fontcolor=white:x=w-text_w-10:y=10:box=1:boxcolor=black@0.5:boxborderw=4 -c:a copy input-timecode.mp4
```
Useful for review reels — clients reference times in the burned-in code so feedback like "at 0:01:22.500 the cut is hard" lands precisely. The black box is hardcoded for legibility on bright sources. If you want a different format (frame number, source timecode, custom prefix), use text with FFmpeg's %{pts} / %{n} / %{localtime} expressions — see video-info-overlay below for the multi-field variant.
### border — Add a decorative frame around video
Pads the frame on all four sides with a solid color, enlarging the output dimensions by 2 * width on each axis.
- Source: `src/commands/border.js`
- Filter: `pad=iw+W*2:ih+W*2:W:W:color=C`
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `--width <px>` | 20 | Border thickness in pixels (added to each side) |
| `--color <name>` | white | FFmpeg color |
| `-o, --output <path>` | `<input-stem>-bordered.<ext>` | — |
```
$ npx fqmpeg border input.mp4 --width 30 --color black --dry-run
ffmpeg -i input.mp4 -vf pad=iw+30*2:ih+30*2:30:30:color=black -c:a copy input-bordered.mp4
```
Output dimensions grow — a 1920x1080 source with `--width 30` becomes 1980x1140. Some platforms (Instagram square posts, YouTube Shorts at fixed aspect) re-crop nonstandard sizes, so verify the target spec before adding a border. Note that `border` pads all four sides equally, so it grows both axes at once and can't by itself turn a portrait clip into a square — for that, `crop` first, or call FFmpeg's `pad` directly with explicit output dimensions, which gives more control.
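The size arithmetic is worth making explicit — each axis grows by twice the border width:

```python
def border_dims(w: int, h: int, width: int = 20):
    # pad adds `width` pixels on every side, so each axis grows by 2*width.
    return (w + 2 * width, h + 2 * width)

print(border_dims(1920, 1080, 30))  # (1980, 1140)
```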
## Multi-Video Composition

### pip — Picture-in-picture overlay
Scales a second "small" video and overlays it on a "main" video at one of four corners.
- Source: `src/commands/pip.js`
- Filter: `[1]scale=iw*S:ih*S[pip];[0][pip]overlay=<xy>`
- Audio: not handled explicitly — the main video's audio passes through (FFmpeg's default first-input behavior)
| Argument / Option | Default | Range | Notes |
|---|---|---|---|
| `<main>` | required | — | Background video |
| `<small>` | required | — | Overlay video |
| `--pos <position>` | bottom-right | — | top-left, top-right, bottom-left, bottom-right |
| `--scale <n>` | 0.25 | 0.1–1.0 | Overlay size relative to main |
| `--margin <n>` | 10 | positive integer | Pixels from edge |
| `-o, --output <path>` | `<main-stem>-pip.<ext>` | — | — |
```
$ npx fqmpeg pip main.mp4 webcam.mp4 --pos top-right --scale 0.3 --dry-run
ffmpeg -i main.mp4 -i webcam.mp4 -filter_complex [1]scale=iw*0.3:ih*0.3[pip];[0][pip]overlay=main_w-overlay_w-10:10 main-pip.mp4
```
The output filename is derived from the main input — outputName(main, "pip") = main-pip.mp4. The classic use is a screen recording main with a face-cam small at 25% scale in a corner.
The output uses the main video's duration — if webcam.mp4 is shorter, it freezes on its last frame; if longer, it gets cut. To extend with the shorter clip's last frame, pre-process with tpad=stop_mode=clone or use the repeat-frame verb on the small input first.
### pip-grid — Multi-input grid layout
Arranges 2-9 videos in a grid (2x1, 1x2, 2x2, or 3x3). Layout is auto-detected from input count if --layout is omitted.
- Source: `src/commands/pip-grid.js`
- Filter: per-input `scale=iw/cols:ih/rows`, then `hstack` per row, then `vstack` of rows
- Padding: empty slots are filled by repeating the last input
- Audio: copied from input 0 (`-c:a copy`)
| Argument / Option | Default | Notes |
|---|---|---|
| `<inputs...>` | required | 2-9 input video files (positional, variadic) |
| `--layout <grid>` | auto | 2x1, 1x2, 2x2, 3x3. Auto: 2→2x1, 3-4→2x2, 5-9→3x3 |
| `-o, --output <path>` | `<input0-stem>-grid<layout>.<ext>` | — |
```
$ npx fqmpeg pip-grid clip1.mp4 clip2.mp4 clip3.mp4 clip4.mp4 --layout 2x2 --dry-run
ffmpeg -i clip1.mp4 -i clip2.mp4 -i clip3.mp4 -i clip4.mp4 \
  -filter_complex '[0:v]scale=iw/2:ih/2[v0];
                   [1:v]scale=iw/2:ih/2[v1];
                   [2:v]scale=iw/2:ih/2[v2];
                   [3:v]scale=iw/2:ih/2[v3];
                   [v0][v1]hstack=inputs=2[row0];
                   [v2][v3]hstack=inputs=2[row1];
                   [row0][row1]vstack=inputs=2[out]' \
  -map '[out]' -c:a copy clip1-grid2x2.mp4
```
Pass 5 inputs to 3x3 and slots 6-9 are clones of input 5 — useful for "5 takes of a single line, 9-up grid for review" but unintended on accidental over-/under-supply. Use --layout explicitly if you want 3x3 with deliberate empty corners (you'd add black-screen videos to fill them; fqmpeg won't synthesize blanks for you). All inputs are scaled by iw/cols:ih/rows, so different aspect ratios end up stretched non-uniformly — pre-normalize with resize or crop if that matters.
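The auto-detect-and-pad rule can be sketched in a few lines (illustrative only — validation of forced layouts that are too small for the input count is omitted):

```python
def plan_grid(inputs, layout=None):
    # Auto-detect the grid from input count (2 -> 2x1, 3-4 -> 2x2,
    # 5-9 -> 3x3), then fill empty slots by cloning the last input.
    n = len(inputs)
    if layout is None:
        layout = "2x1" if n == 2 else ("2x2" if n <= 4 else "3x3")
    cols, rows = (int(p) for p in layout.split("x"))
    slots = list(inputs) + [inputs[-1]] * (cols * rows - n)
    return layout, slots

print(plan_grid(["a", "b", "c", "d", "e"]))
# ('3x3', ['a', 'b', 'c', 'd', 'e', 'e', 'e', 'e', 'e'])
```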
### blend — Opacity blend two videos
Combines two videos by drawing the second on top of the first with a configurable alpha.
- Source: `src/commands/blend.js`
- Filter: `[1]format=yuva420p,colorchannelmixer=aa=O[over];[0][over]overlay`
- Audio: re-encoded (no explicit `-c:a copy`)
| Argument / Option | Default | Range | Notes |
|---|---|---|---|
| `<input1>` | required | — | Base (under) video |
| `<input2>` | required | — | Overlay (top) video |
| `--opacity <n>` | 0.5 | 0–1 | Alpha of input 2 |
| `-o, --output <path>` | `<input1-stem>-blend.<ext>` | — | — |
```
$ npx fqmpeg blend input1.mp4 input2.mp4 --opacity 0.5 --dry-run
ffmpeg -i input1.mp4 -i input2.mp4 -filter_complex [1]format=yuva420p,colorchannelmixer=aa=0.5[over];[0][over]overlay input1-blend.mp4
```
The two videos must align spatially — blend overlays at 0:0, so different resolutions will produce a misaligned composite. Resize the overlay to match the base first if needed. The duration of the output follows FFmpeg's overlay default — usually the shorter of the two. For "ghost trail" effects, set --opacity 0.3-0.4. Higher than 0.7 and the base video is barely visible.
### stack — Side-by-side or top/bottom
Joins two videos into one frame, either horizontally (side-by-side) or vertically (over/under). The two videos must have matching dimensions on the joining axis (matching height for horizontal, matching width for vertical).
- Source: `src/commands/stack.js`
- Filter: `[0:v][1:v]hstack=inputs=2` or `vstack=inputs=2`
- Audio: `-c:a copy` (input 0's track)
| Argument / Option | Default | Choices | Notes |
|---|---|---|---|
| `<input1>` | required | — | Left/top video |
| `<input2>` | required | — | Right/bottom video |
| `--direction <dir>` | horizontal | horizontal, vertical | — |
| `-o, --output <path>` | `<input1-stem>-stacked.<ext>` | — | — |
```
$ npx fqmpeg stack original.mp4 graded.mp4 --direction horizontal --dry-run
ffmpeg -i original.mp4 -i graded.mp4 -filter_complex [0:v][1:v]hstack=inputs=2 -c:a copy original-stacked.mp4
```
The classic before/after comparison reel — color-graded vs ungraded, stabilized vs source. If the two clips don't match in height (for hstack), FFmpeg errors with Input 1 height 720 does not match input 0 height 1080. Pre-resize the smaller one with npx fqmpeg resize input.mp4 --height 1080. Audio comes from input1 only — the second video's audio is dropped.
## Image-to-Video

### picture — Still image + audio → video
Loops a still image for the duration of an audio file and encodes the result as MP4. Common use: turning a podcast episode or song into a YouTube-uploadable video.
- Source: `src/commands/picture.js`
- Codec: `libx264` + `aac` @192k + `yuv420p` + `-tune stillimage` + `-shortest`
- Output stem: derived from the audio filename (not the image), as `<audio-stem>-video.mp4`
| Argument / Option | Default | Notes |
|---|---|---|
| `<image>` | required | JPG / PNG cover art |
| `<audio>` | required | MP3 / WAV / etc. |
| `-o, --output <path>` | `<audio-dir>/<audio-stem>-video.mp4` | Note the stem source |
```
$ npx fqmpeg picture cover.png episode.mp3 --dry-run
ffmpeg -loop 1 -i cover.png -i episode.mp3 -c:v libx264 -tune stillimage -c:a aac -b:a 192k -pix_fmt yuv420p -shortest episode-video.mp4
```
-tune stillimage tells x264 to optimize for the static-content case (very high quality at very low bitrate, since only one frame matters). -shortest ends the output when the shorter input ends — since the looped image is infinite, that's always the audio length. The output stem mirrors the audio filename, which matches the typical workflow ("convert this episode into a video"). If you need a different name, override with -o.
For a podcast cover with chapter markers, run picture first then chain embed-thumbnail and metadata work — fqmpeg keeps these as separate verbs because they're independent decisions.
## Analytical Overlays
These verbs burn diagnostic data into the frame — useful for QC review, color-correction sessions, or making a tutorial that shows what's happening on a frame-by-frame basis. They're not intended for final-delivery output.
### video-info-overlay — Frame number, PTS, picture type
Burns a per-frame text overlay showing the frame index (%{n}), presentation timestamp (%{pts:hms}), and picture type (%{pict_type} — I/P/B for inter/predicted/bidirectional frames in H.264).
- Source: `src/commands/video-info-overlay.js`
- Filter: `drawtext=text='frame %{n} | pts %{pts\\:hms} | type %{pict_type}':x=10:y=10:fontsize=16:fontcolor=white:box=1:boxcolor=black@0.5`
- Position, font, color: all hardcoded (no options)
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `-o, --output <path>` | `<input-stem>-info-overlay.<ext>` | — |
```
$ npx fqmpeg video-info-overlay input.mp4 --dry-run
ffmpeg -i input.mp4 -vf "drawtext=text='frame %{n} | pts %{pts\:hms} | type %{pict_type}':x=10:y=10:fontsize=16:fontcolor=white:box=1:boxcolor=black@0.5" -c:a copy input-info-overlay.mp4
```
Useful for QC ("this glitch is at frame 4523, type P") and for showing GOP structure visually in tutorials. The overlay is fixed top-left at 16px font — small but legible at 1080p. For larger or repositioned variants, copy the --dry-run and edit, or build a custom invocation with text and the same %{...} expressions.
### histogram-overlay — RGB / waveform / parade overlay
Adds a 320x240 histogram (or waveform/parade) display in the bottom-right corner — useful for color-correction work where you want to see the levels distribution alongside the frame.
- Source: `src/commands/histogram-overlay.js`
- Filter (levels): `split[main][hist];[hist]histogram,scale=320:240[h];[main][h]overlay=W-w-10:H-h-10`
- Filter (waveform): `split[main][wave];[wave]waveform=mode=column,scale=320:240[w];[main][w]overlay=W-w-10:H-h-10`
- Filter (parade): `split[main][par];[par]waveform=mode=column:display=parade,scale=320:240[p];[main][p]overlay=W-w-10:H-h-10`
- Position & size: hardcoded — bottom-right with 10px margin, 320x240 panel
| Argument / Option | Default | Choices | Notes |
|---|---|---|---|
| `<input>` | required | — | Input video |
| `--mode <mode>` | levels | levels, waveform, parade | — |
| `-o, --output <path>` | `<input-stem>-histogram.<ext>` | — | — |
```
$ npx fqmpeg histogram-overlay input.mp4 --mode parade --dry-run
ffmpeg -i input.mp4 -filter_complex split[main][par];[par]waveform=mode=column:display=parade,scale=320:240[p];[main][p]overlay=W-w-10:H-h-10 -c:a copy input-histogram.mp4
```
levels is the classic 256-bin RGB histogram. waveform shows the luma vs horizontal position as a column scope (broadcast-style). parade is waveform split per channel — three columns (R/G/B) side by side. For DIY color grading review where you want to see the distribution change as you scrub, parade is the most informative.
### progress — Time-driven progress bar
Burns a horizontal bar at the top or bottom of the frame whose width grows linearly from 0 to full as time advances.
- Source: `src/commands/progress.js`
- Filter: `drawbox=x=0:y=<y>:w='iw*t/<duration>':h=H:color=C:t=fill`
- Why duration is positional: FFmpeg's `drawbox` expressions can reference `t` (current time) but not the total duration of the source — fqmpeg compensates by baking the duration into the formula
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<duration>` | required | Total video duration in seconds (find with `npx fqmpeg duration input.mp4`) |
| `--pos <position>` | bottom | top, bottom |
| `--color <name>` | red | FFmpeg color |
| `--height <px>` | 5 | Positive integer |
| `-o, --output <path>` | `<input-stem>-progress.<ext>` | — |
```
$ npx fqmpeg progress input.mp4 90 --pos bottom --color cyan --height 8 --dry-run
ffmpeg -i input.mp4 -vf drawbox=x=0:y=ih-8:w='iw*t/90':h=8:color=cyan:t=fill -c:a copy input-progress.mp4
```
Pass the duration in seconds — 90 for 1m30s, 3600 for an hour. The bar reaches full width exactly at t = duration; if you pass a wrong duration the bar over- or under-shoots. A typical wrapper script: dur=$(npx fqmpeg duration input.mp4) && npx fqmpeg progress input.mp4 $dur. Useful for tutorial videos where viewers want a visual cue of how much remains.
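The baked-in formula makes the over/undershoot behavior easy to check by hand:

```python
def bar_width(iw: int, t: float, duration: float) -> float:
    # drawbox evaluates w='iw*t/<duration>' every frame: 0 at t=0,
    # exactly full frame width at t=duration. A wrong duration shifts
    # where "full" lands, so the bar over- or under-shoots.
    return iw * t / duration

print(bar_width(1920, 45, 90))  # 960.0 — half width at the midpoint
print(bar_width(1920, 90, 80))  # 2160.0 — overshoots past the frame edge
```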
## Real-World Recipes
Each recipe chains multiple verbs into a workflow you'd actually use.
### Recipe 1: Brand a YouTube clip
You have a finished video and want to add a corner watermark, an opening title that fades after 3 seconds, and a thin progress bar at the bottom.
```
# Step 1: corner watermark (logo PNG with alpha)
npx fqmpeg watermark final.mp4 logo.png --pos bottom-right --margin 20
# → final-watermarked.mp4

# Step 2: opening title shown for the first 3 seconds
npx fqmpeg text final-watermarked.mp4 "Episode 12 — Color Grading" \
  --pos top --font-size 56 --color white --bg "black@0.5" \
  --start 0 --end 3
# → final-watermarked-text.mp4

# Step 3: thin progress bar (assuming 8-minute episode = 480s)
npx fqmpeg progress final-watermarked-text.mp4 480 --pos bottom --color "white@0.6" --height 4
# → final-watermarked-text-progress.mp4
```
Three passes means three rounds of H.264 re-encoding, which is tolerable but adds generation loss. If you want one pass, copy the three filter strings from the --dry-run outputs and assemble them with ;/, in a single FFmpeg invocation. For frequent re-runs (a weekly podcast), the three-step workflow is fine — the convenience beats the negligible quality drop.
### Recipe 2: Side-by-side comparison reel with timecode
You want to show a "before/after" of a color-graded clip, with a timecode overlay so reviewers can reference exact moments.
```
# Step 1: stack the two clips horizontally (must match height)
npx fqmpeg stack original.mp4 graded.mp4 --direction horizontal
# → original-stacked.mp4 (3840x1080 for two 1920x1080 sources)

# Step 2: burn timecode in the top-left for reference
npx fqmpeg timecode original-stacked.mp4 --pos top-left --font-size 32
# → original-stacked-timecode.mp4

# Step 3 (optional): label each side with a static text overlay
npx fqmpeg text original-stacked-timecode.mp4 "ORIGINAL                              GRADED" \
  --pos bottom --font-size 48 --color white --bg "black@0.5"
```
The label trick in step 3 uses spaces to push "GRADED" to the right side — crude but effective. For pixel-perfect labels you'd run two text invocations with custom positions (copy the --dry-run and replace the --pos-derived x= with absolute coordinates).
### Recipe 3: Document a screen recording with face-cam and progress
You have a 12-minute screen recording (screen.mp4) and a webcam (webcam.mp4) and want a tutorial deliverable: face-cam in the corner, time-driven progress bar, and a half-transparent title for the first 5 seconds.
```
# Step 1: face-cam picture-in-picture
npx fqmpeg pip screen.mp4 webcam.mp4 --pos bottom-right --scale 0.2 --margin 20
# → screen-pip.mp4

# Step 2: progress bar (12 minutes = 720 seconds)
npx fqmpeg progress screen-pip.mp4 720 --pos top --color "red@0.7" --height 6
# → screen-pip-progress.mp4

# Step 3: opening title
npx fqmpeg text screen-pip-progress.mp4 "Setting up Vercel + Next.js" \
  --pos center --font-size 64 --color white --bg "black@0.7" \
  --start 0 --end 5
# → screen-pip-progress-text.mp4
```
If the webcam is a different aspect ratio than the screen capture (square webcam, 16:9 screen), the pip overlay scales relative to the main video's dimensions — a 0.2 scale on a 1920-wide screen produces a 384-wide overlay regardless of the webcam's source dimensions. To preserve the webcam's own aspect, pre-process it with npx fqmpeg crop webcam.mp4 ... before the pip step.
## Frequently Asked Questions
### How do I pass colons or apostrophes inside a text string?
Just include them — npx fqmpeg text input.mp4 "Time: it's 12:34" works. fqmpeg's wrapper escapes : and ' before inlining into the FFmpeg drawtext filter (' becomes '\\\\'' and : becomes \:). You don't need to escape anything yourself except shell-special characters (the surrounding quotes in your shell already handle that).
### pip-grid accepted my 5 inputs and the bottom row is duplicated — is that intended?
Yes — when input count doesn't fill the grid, fqmpeg pads empty slots with the last input. Pass 5 to a 3x3 (auto-detected) and slots 6-9 are clones of input 5. To use 2x2 for 4 inputs (and skip the padding), pass 4 inputs (auto-detects 2x2) or specify --layout 2x2 explicitly. To leave deliberate blank slots, supply a black-screen input — fqmpeg won't synthesize one for you.
### progress requires a duration — why doesn't it auto-detect?
FFmpeg's drawbox filter expression can reference t (the running time of the current frame) but has no built-in expression for "total duration of the source." The only ways to get total duration into the filter are (1) bake it in as a literal number, which is what fqmpeg does; (2) probe the source first, which would mean running ffprobe from inside the JS wrapper — fqmpeg keeps verbs single-purpose and doesn't probe transparently. Use dur=$(npx fqmpeg duration input.mp4) then pass $dur as the argument.
### What's the difference between watermark and picture?
watermark overlays a static image on top of an existing video — the input is a video and the watermark is a small image. picture does the opposite — it creates a video from a still image and an audio track (looping the image for the audio's full duration). They share libavfilter plumbing internally but solve different problems: branding (watermark) vs single-image-video creation (picture).
### Can I change histogram-overlay's position or panel size?
Not via fqmpeg's flags — both are hardcoded (overlay=W-w-10:H-h-10 for bottom-right with 10px margin, size=320x240). For a custom layout, copy the --dry-run filter string and edit the overlay=... and size=... parameters before running FFmpeg directly. Common edits: overlay=10:H-h-10 for bottom-left, size=480x270 for a larger panel.
### stack errored with "Input 1 height ... does not match" — what now?
hstack requires all inputs to share the same height; vstack requires the same width. Pre-resize the mismatched input: npx fqmpeg resize smaller.mp4 --height 1080 (matching the other clip's height). If the aspect ratios differ and you don't want to crop, pad with pad first to match dimensions, then stack — using border is a quick hack since it adds a frame of configurable width on each side.
### Adding a border makes my output non-standard size — does that matter?
Yes, on platforms with strict aspect-ratio requirements (Instagram square posts at 1:1, YouTube Shorts at 9:16). A 1920x1080 source with --width 30 becomes 1980x1140, which most players will letterbox and some social platforms will re-crop. Two workarounds: (1) use crop first to shrink the inner video, then border to bring it back to the original size; (2) let the platform do its thing and accept that the border may be cropped or letterboxed.
### Can I avoid the H.264 generation loss when chaining multiple overlay verbs?
Each fqmpeg invocation re-encodes — the loss accumulates. Two ways to mitigate: (1) bump intermediate quality with -crf 18 or lossless by editing the --dry-run output; (2) bypass fqmpeg for the chain by combining filter strings from each verb's --dry-run into one FFmpeg invocation with -vf "filter1,filter2,filter3" (or -filter_complex if any verb uses multiple inputs). For 2-3 passes the loss is rarely noticeable on real content; for 5+ passes it adds up.
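Mechanically, combining per-verb filters into one pass is just string assembly — simple single-input filters chain with commas in one `-vf` filtergraph (a sketch; the filenames and filter strings are illustrative, and fqmpeg doesn't do this merging for you):

```python
def chain_vf(filters):
    # Single-input -vf filters run in sequence when joined with commas;
    # multi-input verbs (pip, stack, ...) need -filter_complex and ';' instead.
    return ",".join(filters)

vf = chain_vf([
    "drawtext=text='Episode 12':fontsize=56:fontcolor=white",
    "drawbox=x=0:y=ih-4:w='iw*t/480':h=4:color=white@0.6:t=fill",
])
print(f'ffmpeg -i final.mp4 -vf "{vf}" -c:a copy final-branded.mp4')
```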
## Wrapping Up
The 13 C8 verbs cover the composition operations you reach for after editing but before final output:
- Static graphics & text (`watermark`, `text`, `drawbox`, `timecode`, `border`) — burn logos, titles, highlights, timestamps, frames
- Multi-video composition (`pip`, `pip-grid`, `blend`, `stack`) — picture-in-picture, 2-9-input grids, opacity blends, before/after stacks
- Image-to-video (`picture`) — promote a still image + audio into a playable MP4
- Analytical overlays (`video-info-overlay`, `histogram-overlay`, `progress`) — burn frame metadata, color scopes, and time progress for review
Every verb prints its underlying FFmpeg invocation under --dry-run, so when fqmpeg's defaults don't fit (a custom histogram-overlay position, a custom drawtext format, a single-pass multi-filter chain to skip generation loss), copy the command, customize, and run FFmpeg directly. For the broader fqmpeg map, see the fqmpeg complete guide.