fqmpeg's C8 cluster is the composition toolbox — 13 verbs that put something on top of (or next to) your video. Five draw static graphics or text (watermark, text, drawbox, timecode, border). Four combine multiple videos into one canvas (pip, pip-grid, blend, stack). One creates a video from a still image and audio (picture). Three burn analytical information into the frame for review (video-info-overlay, histogram-overlay, progress).
This guide checks each verb against its source in `src/commands/` of fqmpeg 3.0.3 — the underlying FFmpeg filter, the defaults, the output filename, and the gotchas you can't see from `--help` alone (`text` auto-escapes `:` and `'` so you can pass them literally; `pip-grid` auto-detects 2x1/2x2/3x3 from the input count; `progress` needs a duration argument because FFmpeg can't reference the source duration inside a filter expression; `histogram-overlay`'s position and size are hardcoded).
## What you'll get out of this guide
- A decision matrix for the 13 verbs by task (graphics / multi-video / image-to-video / analytical)
- Exact FFmpeg invocation each verb generates (verified `--dry-run` output)
- Defaults, ranges, position keywords, and output filenames for every command
- Three end-to-end recipes — brand a YouTube clip, build a comparison reel, document a screen recording
## The 13 Verbs at a Glance
The cluster splits into four task groups. Pick the group, then the verb.
| Group | Verbs | What they do |
|---|---|---|
| Static graphics & text | watermark, text, drawbox, timecode, border | Burn a logo image, text string, rectangle, running timestamp, or solid frame around your video |
| Multi-video composition | pip, pip-grid, blend, stack | Picture-in-picture, 2-9-input grids, opacity blends, or side-by-side / over-under stacks |
| Image-to-video | picture | Promote a still image + audio into a playable MP4 (podcast, song-art video) |
| Analytical overlays | video-info-overlay, histogram-overlay, progress | Burn frame number / PTS / picture-type, a histogram or waveform, or a time-driven progress bar |
Three things to know before reading on:
1. **`text` auto-escapes `:` and `'`.** The `drawtext` filter treats `:` as an option separator and `'` as a string delimiter, so a literal `Time: 12:34` would normally need manual escaping. fqmpeg pre-escapes these in the JS wrapper (`'` → `'\\\\''`, `:` → `\:`), so you can pass the string as-is. The flip side: if you try to inject `drawtext` options through the string, fqmpeg blocks that — by design.
2. **`pip-grid` auto-detects layout from input count.** Pass 2 inputs and you get `2x1`; 3-4 gives `2x2`; 5-9 gives `3x3`. Empty grid slots are filled by repeating the last input — pass 5 videos to a `3x3` and slots 6-9 are clones of input 5. Override with `--layout 2x2` (etc.) when you want to force a shape.
3. **`progress` needs a `<duration>` positional argument.** FFmpeg's `drawbox` can't reference `total_duration` inside an expression — only `t` (the running time). fqmpeg works around this by taking the duration as an explicit argument and baking it into the formula `iw*t/<duration>`. If you don't know the duration, run `npx fqmpeg duration input.mp4` first.
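The escaping rule can be sketched in Python (illustrative only — fqmpeg's real transform lives in its JS wrapper and layers shell quoting on top, so the exact escape sequences differ from this single-backslash form):

```python
def escape_drawtext(text: str) -> str:
    # drawtext treats ':' as an option separator and "'" as a string
    # delimiter, so both must be escaped before inlining into the filter.
    # (Sketch of the behavior described above, not fqmpeg's source.)
    return text.replace("\\", "\\\\").replace(":", "\\:").replace("'", "\\'")

print(escape_drawtext("Time: 12:34"))  # Time\: 12\:34
print(escape_drawtext("it's live"))    # it\'s live
```

Because the whole string is escaped before inlining, a user string can never terminate the `text=` option early and smuggle in extra `drawtext` options — which is exactly the injection the guide says fqmpeg blocks.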
## Static Graphics & Text

### watermark — Image overlay (PNG logo, etc.)
Overlays an image on a video at one of nine preset positions. Audio is preserved unchanged.
- Source: `src/commands/watermark.js`
- Filter: `overlay=<x>:<y>` where `<x>:<y>` is computed from `--pos` and `--margin`
- Audio: `-c:a copy` (untouched)
| Argument / Option | Default | Choices / Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<image>` | required | Watermark image (PNG with alpha recommended) |
| `--pos <position>` | bottom-right | top-left, top, top-right, left, center, right, bottom-left, bottom, bottom-right |
| `--margin <n>` | 10 | Pixels from the chosen edge |
| `-o, --output <path>` | `<input-stem>-watermarked.<ext>` | — |
```
$ npx fqmpeg watermark input.mp4 logo.png --pos bottom-right --margin 10 --dry-run
ffmpeg -i input.mp4 -i logo.png -filter_complex overlay=W-w-10:H-h-10 -c:a copy input-watermarked.mp4
```
The image's native size is used — there's no --scale here. If your logo is too large, resize it first with an image editor or pre-process with ffmpeg -i logo.png -vf scale=120:-1 logo-small.png. Center positions (top, bottom, center, left, right) compute the offset from the frame center, so --margin only affects the perpendicular axis (e.g. bottom uses margin from the bottom edge but is horizontally centered regardless of margin).
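The nine-keyword position logic can be sketched as a small mapping onto `overlay` expressions (`W`/`H` are the main frame's size, `w`/`h` the overlay's; a hypothetical re-implementation, not fqmpeg's source):

```python
def overlay_xy(pos: str, margin: int = 10) -> str:
    # Map a position keyword to an overlay x:y expression. Single-axis
    # keywords ("top", "left", ...) center the other axis, which is why
    # --margin only affects the perpendicular axis for them.
    m = str(margin)
    xs = {"left": m, "center": "(W-w)/2", "right": f"W-w-{m}"}
    ys = {"top": m, "center": "(H-h)/2", "bottom": f"H-h-{m}"}
    parts = pos.split("-")          # "bottom-right" -> ["bottom", "right"]
    if len(parts) == 1:             # "top", "left", "center", ...
        parts = [parts[0], "center"] if parts[0] in ys else ["center", parts[0]]
    return f"{xs[parts[1]]}:{ys[parts[0]]}"

print(overlay_xy("bottom-right", 10))  # W-w-10:H-h-10  (matches the dry-run)
print(overlay_xy("top", 20))           # (W-w)/2:20
```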
### text — Draw text / title onto video
Burns a text string at one of seven preset positions, with optional background box and time-window enable.
- Source: `src/commands/text.js`
- Filter: `drawtext=text='<escaped>':fontsize=N:fontcolor=C:<pos>[:box=1:boxcolor=...][:enable='...']`
- Auto-escaping: `'` and `:` in the input string are escaped before being inlined
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<string>` | required | Text to draw (positional, quote shell-special characters yourself) |
| `--pos <position>` | center | top-left, top, top-right, center, bottom-left, bottom, bottom-right |
| `--font-size <n>` | 48 | Positive integer |
| `--color <name>` | white | Any FFmpeg color name or `0xRRGGBB` |
| `--bg <color>` | (none) | Background box color, e.g. `black@0.5` for 50% opacity |
| `--start <sec>` | (none) | Show text from this time |
| `--end <sec>` | (none) | Hide text after this time |
| `-o, --output <path>` | `<input-stem>-text.<ext>` | — |
```
$ npx fqmpeg text input.mp4 "Hello, World" --pos top --font-size 64 --color yellow --bg black@0.5 --dry-run
ffmpeg -i input.mp4 -vf "drawtext=text='Hello, World':fontsize=64:fontcolor=yellow:x=(w-text_w)/2:y=20:box=1:boxcolor=black@0.5:boxborderw=8" -c:a copy input-text.mp4
```
Pass strings with colons and apostrophes literally — --bg "black@0.5" and text "Time: 12:34" both work without manual escaping. The boxborderw=8 (8 pixels of padding inside the box) is hardcoded; if you need a tighter or looser box, copy the --dry-run and edit. For multi-line text, FFmpeg's drawtext doesn't wrap — use \n literally inside the string and the filter will honor it on most builds, but spacing between lines may need a manually-tuned line_spacing=N appended (again, copy the --dry-run).
--start and --end enable time-windowed display via drawtext's enable expression. Both alone use gte(t,N) or lte(t,N); both together produce between(t,A,B).
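The rule above reduces to a few lines (a sketch of the documented behavior; the function name is illustrative):

```python
def enable_expr(start=None, end=None):
    # Build drawtext's enable expression from optional --start/--end:
    # both together -> between(t,A,B); one alone -> gte/lte; neither -> none.
    if start is not None and end is not None:
        return f"between(t,{start},{end})"
    if start is not None:
        return f"gte(t,{start})"
    if end is not None:
        return f"lte(t,{end})"
    return None

print(enable_expr(0, 3))  # between(t,0,3)
```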
### drawbox — Draw a rectangle / border onto video
Draws an outline (or filled) rectangle at a fixed position. Useful for highlighting an area in a tutorial, censoring a fixed region, or marking up a screen recording.
- Source: `src/commands/drawbox.js`
- Filter: `drawbox=x=X:y=Y:w=W:h=H:color=C:t=T[:enable='...']`
- Region format: `x:y:w:h` strict regex — anything else exits with `Error: region must be x:y:w:h`
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<region>` | required | `x:y:w:h` in pixels (top-left origin) |
| `--color <name>` | red | FFmpeg color name |
| `--thickness <n>` | 3 | Positive integer, or the literal string `fill` for a solid-filled box |
| `--start <sec>` | (none) | Show from this time |
| `--end <sec>` | (none) | Hide after this time |
| `-o, --output <path>` | `<input-stem>-boxed.<ext>` | — |
```
$ npx fqmpeg drawbox input.mp4 100:50:200:150 --color yellow --thickness fill --dry-run
ffmpeg -i input.mp4 -vf drawbox=x=100:y=50:w=200:h=150:color=yellow:t=fill -c:a copy input-boxed.mp4
```
--thickness fill is the censorship pattern — solid color box covers the region. For an outline-only highlight, use a positive integer (3–8 reads well at 1080p). Combine with --start/--end to highlight only the seconds where something matters in a tutorial. The region uses absolute pixel coordinates — for a coordinate-percentage approach you'd need to know the frame size first (npx fqmpeg info input.mp4).
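The strict region validation is easy to model (a sketch; fqmpeg's actual regex lives in the JS wrapper):

```python
import re

REGION = re.compile(r"^(\d+):(\d+):(\d+):(\d+)$")

def parse_region(s: str):
    # Accept only the exact x:y:w:h integer form; reject everything else
    # with the same message the CLI prints.
    m = REGION.match(s)
    if not m:
        raise ValueError("region must be x:y:w:h")
    return tuple(int(g) for g in m.groups())

print(parse_region("100:50:200:150"))  # (100, 50, 200, 150)
```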
### timecode — Burn timecode / running timestamp
Overlays the running play position as HH:MM:SS.mmm text in a black semi-transparent box at a corner.
- Source: `src/commands/timecode.js`
- Filter: `drawtext=text='%{pts\\:hms}':fontsize=N:fontcolor=C:<pos>:box=1:boxcolor=black@0.5:boxborderw=4`
- Format: `%{pts:hms}` — FFmpeg's built-in HMS time formatter (hardcoded; no other formats exposed)
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `--pos <position>` | top-left | top-left, top-right, bottom-left, bottom-right (corners only) |
| `--font-size <n>` | 24 | Positive integer |
| `--color <name>` | white | FFmpeg color |
| `-o, --output <path>` | `<input-stem>-timecode.<ext>` | — |
```
$ npx fqmpeg timecode input.mp4 --pos top-right --font-size 32 --dry-run
ffmpeg -i input.mp4 -vf drawtext=text='%{pts\:hms}':fontsize=32:fontcolor=white:x=w-text_w-10:y=10:box=1:boxcolor=black@0.5:boxborderw=4 -c:a copy input-timecode.mp4
```
Useful for review reels — clients reference times in the burned-in code so feedback like "at 0:01:22.500 the cut is hard" lands precisely. The black box is hardcoded for legibility on bright sources. If you want a different format (frame number, source timecode, custom prefix), use text with FFmpeg's %{pts} / %{n} / %{localtime} expressions — see video-info-overlay below for the multi-field variant.
### border — Add a decorative frame around video
Pads the frame on all four sides with a solid color, enlarging the output dimensions by 2 * width on each axis.
- Source: `src/commands/border.js`
- Filter: `pad=iw+W*2:ih+W*2:W:W:color=C`
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `--width <px>` | 20 | Border thickness in pixels (added to each side) |
| `--color <name>` | white | FFmpeg color |
| `-o, --output <path>` | `<input-stem>-bordered.<ext>` | — |
```
$ npx fqmpeg border input.mp4 --width 30 --color black --dry-run
ffmpeg -i input.mp4 -vf pad=iw+30*2:ih+30*2:30:30:color=black -c:a copy input-bordered.mp4
```
Output dimensions grow — a 1920x1080 source with `--width 30` becomes 1980x1140. Some platforms (Instagram square posts, YouTube Shorts at fixed aspect) re-crop nonstandard sizes, so verify the target spec before adding a border. Note that `border` pads all four sides equally, so it grows both axes at once and can't by itself turn a portrait clip into a square — for that, `crop` first, or call FFmpeg's `pad` directly with explicit output dimensions, which gives more control.
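The size arithmetic is worth making explicit — each axis grows by twice the border width:

```python
def border_dims(w: int, h: int, width: int = 20):
    # pad adds `width` pixels on every side, so each axis grows by 2*width.
    return (w + 2 * width, h + 2 * width)

print(border_dims(1920, 1080, 30))  # (1980, 1140)
```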
## Multi-Video Composition

### pip — Picture-in-picture overlay
Scales a second "small" video and overlays it on a "main" video at one of four corners.
- Source: `src/commands/pip.js`
- Filter: `[1]scale=iw*S:ih*S[pip];[0][pip]overlay=<xy>`
- Audio: not handled explicitly — the main video's audio passes through (FFmpeg's default first-input behavior)
| Argument / Option | Default | Range | Notes |
|---|---|---|---|
| `<main>` | required | — | Background video |
| `<small>` | required | — | Overlay video |
| `--pos <position>` | bottom-right | — | top-left, top-right, bottom-left, bottom-right |
| `--scale <n>` | 0.25 | 0.1–1.0 | Overlay size relative to main |
| `--margin <n>` | 10 | positive integer | Pixels from edge |
| `-o, --output <path>` | `<main-stem>-pip.<ext>` | — | — |
```
$ npx fqmpeg pip main.mp4 webcam.mp4 --pos top-right --scale 0.3 --dry-run
ffmpeg -i main.mp4 -i webcam.mp4 -filter_complex [1]scale=iw*0.3:ih*0.3[pip];[0][pip]overlay=main_w-overlay_w-10:10 main-pip.mp4
```
The output filename is derived from the main input — outputName(main, "pip") = main-pip.mp4. The classic use is a screen recording main with a face-cam small at 25% scale in a corner.
The output uses the main video's duration — if webcam.mp4 is shorter, it freezes on its last frame; if longer, it gets cut. To extend with the shorter clip's last frame, pre-process with tpad=stop_mode=clone or use the repeat-frame verb on the small input first.
### pip-grid — Multi-input grid layout
Arranges 2-9 videos in a grid (2x1, 1x2, 2x2, or 3x3). Layout is auto-detected from input count if --layout is omitted.
- Source: `src/commands/pip-grid.js`
- Filter: per-input `scale=iw/cols:ih/rows`, then `hstack` per row, then `vstack` of rows
- Padding: empty slots are filled by repeating the last input
- Audio: copied from input 0 (`-c:a copy`)
| Argument / Option | Default | Notes |
|---|---|---|
| `<inputs...>` | required | 2-9 input video files (positional, variadic) |
| `--layout <grid>` | auto | 2x1, 1x2, 2x2, 3x3. Auto: 2→2x1, 3-4→2x2, 5-9→3x3 |
| `-o, --output <path>` | `<input0-stem>-grid<layout>.<ext>` | — |
```
$ npx fqmpeg pip-grid clip1.mp4 clip2.mp4 clip3.mp4 clip4.mp4 --layout 2x2 --dry-run
ffmpeg -i clip1.mp4 -i clip2.mp4 -i clip3.mp4 -i clip4.mp4 \
  -filter_complex '[0:v]scale=iw/2:ih/2[v0];
                   [1:v]scale=iw/2:ih/2[v1];
                   [2:v]scale=iw/2:ih/2[v2];
                   [3:v]scale=iw/2:ih/2[v3];
                   [v0][v1]hstack=inputs=2[row0];
                   [v2][v3]hstack=inputs=2[row1];
                   [row0][row1]vstack=inputs=2[out]' \
  -map '[out]' -c:a copy clip1-grid2x2.mp4
```
Pass 5 inputs to 3x3 and slots 6-9 are clones of input 5 — useful for "5 takes of a single line, 9-up grid for review" but unintended on accidental over-/under-supply. Use --layout explicitly if you want 3x3 with deliberate empty corners (you'd add black-screen videos to fill them; fqmpeg won't synthesize blanks for you). All inputs are scaled by iw/cols:ih/rows, so different aspect ratios end up stretched non-uniformly — pre-normalize with resize or crop if that matters.
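The auto-detect-and-pad rule can be sketched in a few lines (illustrative only — validation of forced layouts that are too small for the input count is omitted):

```python
def plan_grid(inputs, layout=None):
    # Auto-detect the grid from input count (2 -> 2x1, 3-4 -> 2x2,
    # 5-9 -> 3x3), then fill empty slots by cloning the last input.
    n = len(inputs)
    if layout is None:
        layout = "2x1" if n == 2 else ("2x2" if n <= 4 else "3x3")
    cols, rows = (int(p) for p in layout.split("x"))
    slots = list(inputs) + [inputs[-1]] * (cols * rows - n)
    return layout, slots

print(plan_grid(["a", "b", "c", "d", "e"]))
# ('3x3', ['a', 'b', 'c', 'd', 'e', 'e', 'e', 'e', 'e'])
```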
### blend — Opacity blend two videos
Combines two videos by drawing the second on top of the first with a configurable alpha.
- Source: `src/commands/blend.js`
- Filter: `[1]format=yuva420p,colorchannelmixer=aa=O[over];[0][over]overlay`
- Audio: re-encoded (no explicit `-c:a copy`)
| Argument / Option | Default | Range | Notes |
|---|---|---|---|
| `<input1>` | required | — | Base (under) video |
| `<input2>` | required | — | Overlay (top) video |
| `--opacity <n>` | 0.5 | 0–1 | Alpha of input 2 |
| `-o, --output <path>` | `<input1-stem>-blend.<ext>` | — | — |
```
$ npx fqmpeg blend input1.mp4 input2.mp4 --opacity 0.5 --dry-run
ffmpeg -i input1.mp4 -i input2.mp4 -filter_complex [1]format=yuva420p,colorchannelmixer=aa=0.5[over];[0][over]overlay input1-blend.mp4
```
The two videos must align spatially — blend overlays at 0:0, so different resolutions will produce a misaligned composite. Resize the overlay to match the base first if needed. The duration of the output follows FFmpeg's overlay default — usually the shorter of the two. For "ghost trail" effects, set --opacity 0.3-0.4. Higher than 0.7 and the base video is barely visible.
### stack — Side-by-side or top/bottom
Joins two videos into one frame, either horizontally (side-by-side) or vertically (over/under). The two videos must have matching dimensions on the joining axis (matching height for horizontal, matching width for vertical).
- Source: `src/commands/stack.js`
- Filter: `[0:v][1:v]hstack=inputs=2` or `vstack=inputs=2`
- Audio: `-c:a copy` (input 0's track)
| Argument / Option | Default | Choices | Notes |
|---|---|---|---|
| `<input1>` | required | — | Left/top video |
| `<input2>` | required | — | Right/bottom video |
| `--direction <dir>` | horizontal | horizontal, vertical | — |
| `-o, --output <path>` | `<input1-stem>-stacked.<ext>` | — | — |
```
$ npx fqmpeg stack original.mp4 graded.mp4 --direction horizontal --dry-run
ffmpeg -i original.mp4 -i graded.mp4 -filter_complex [0:v][1:v]hstack=inputs=2 -c:a copy original-stacked.mp4
```
The classic before/after comparison reel — color-graded vs ungraded, stabilized vs source. If the two clips don't match in height (for hstack), FFmpeg errors with Input 1 height 720 does not match input 0 height 1080. Pre-resize the smaller one with npx fqmpeg resize input.mp4 --height 1080. Audio comes from input1 only — the second video's audio is dropped.
## Image-to-Video

### picture — Still image + audio → video
Loops a still image for the duration of an audio file and encodes the result as MP4. Common use: turning a podcast episode or song into a YouTube-uploadable video.
- Source: `src/commands/picture.js`
- Codec: `libx264` + `aac` @192k + `yuv420p` + `-tune stillimage` + `-shortest`
- Output stem: derived from the audio filename (not the image), as `<audio-stem>-video.mp4`
| Argument / Option | Default | Notes |
|---|---|---|
| `<image>` | required | JPG / PNG cover art |
| `<audio>` | required | MP3 / WAV / etc. |
| `-o, --output <path>` | `<audio-dir>/<audio-stem>-video.mp4` | Note the stem source |
```
$ npx fqmpeg picture cover.png episode.mp3 --dry-run
ffmpeg -loop 1 -i cover.png -i episode.mp3 -c:v libx264 -tune stillimage -c:a aac -b:a 192k -pix_fmt yuv420p -shortest episode-video.mp4
```
-tune stillimage tells x264 to optimize for the static-content case (very high quality at very low bitrate, since only one frame matters). -shortest ends the output when the shorter input ends — since the looped image is infinite, that's always the audio length. The output stem mirrors the audio filename, which matches the typical workflow ("convert this episode into a video"). If you need a different name, override with -o.
For a podcast cover with chapter markers, run picture first then chain embed-thumbnail and metadata work — fqmpeg keeps these as separate verbs because they're independent decisions.
## Analytical Overlays
These verbs burn diagnostic data into the frame — useful for QC review, color-correction sessions, or making a tutorial that shows what's happening on a frame-by-frame basis. They're not intended for final-delivery output.
### video-info-overlay — Frame number, PTS, picture type
Burns a per-frame text overlay showing the frame index (%{n}), presentation timestamp (%{pts:hms}), and picture type (%{pict_type} — I/P/B for inter/predicted/bidirectional frames in H.264).
- Source: `src/commands/video-info-overlay.js`
- Filter: `drawtext=text='frame %{n} | pts %{pts\\:hms} | type %{pict_type}':x=10:y=10:fontsize=16:fontcolor=white:box=1:boxcolor=black@0.5`
- Position, font, color: all hardcoded (no options)
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `-o, --output <path>` | `<input-stem>-info-overlay.<ext>` | — |
```
$ npx fqmpeg video-info-overlay input.mp4 --dry-run
ffmpeg -i input.mp4 -vf "drawtext=text='frame %{n} | pts %{pts\:hms} | type %{pict_type}':x=10:y=10:fontsize=16:fontcolor=white:box=1:boxcolor=black@0.5" -c:a copy input-info-overlay.mp4
```
Useful for QC ("this glitch is at frame 4523, type P") and for showing GOP structure visually in tutorials. The overlay is fixed top-left at 16px font — small but legible at 1080p. For larger or repositioned variants, copy the --dry-run and edit, or build a custom invocation with text and the same %{...} expressions.
### histogram-overlay — RGB / waveform / parade overlay
Adds a 320x240 histogram (or waveform/parade) display in the bottom-right corner — useful for color-correction work where you want to see the levels distribution alongside the frame.
- Source: `src/commands/histogram-overlay.js`
- Filter (levels): `split[main][hist];[hist]histogram,scale=320:240[h];[main][h]overlay=W-w-10:H-h-10`
- Filter (waveform): `split[main][wave];[wave]waveform=mode=column,scale=320:240[w];[main][w]overlay=W-w-10:H-h-10`
- Filter (parade): `split[main][par];[par]waveform=mode=column:display=parade,scale=320:240[p];[main][p]overlay=W-w-10:H-h-10`
- Position & size: hardcoded — bottom-right with 10px margin, 320x240 panel
| Argument / Option | Default | Choices | Notes |
|---|---|---|---|
| `<input>` | required | — | Input video |
| `--mode <mode>` | levels | levels, waveform, parade | — |
| `-o, --output <path>` | `<input-stem>-histogram.<ext>` | — | — |
```
$ npx fqmpeg histogram-overlay input.mp4 --mode parade --dry-run
ffmpeg -i input.mp4 -filter_complex split[main][par];[par]waveform=mode=column:display=parade,scale=320:240[p];[main][p]overlay=W-w-10:H-h-10 -c:a copy input-histogram.mp4
```
levels is the classic 256-bin RGB histogram. waveform shows the luma vs horizontal position as a column scope (broadcast-style). parade is waveform split per channel — three columns (R/G/B) side by side. For DIY color grading review where you want to see the distribution change as you scrub, parade is the most informative.
### progress — Time-driven progress bar
Burns a horizontal bar at the top or bottom of the frame whose width grows linearly from 0 to full as time advances.
- Source: `src/commands/progress.js`
- Filter: `drawbox=x=0:y=<y>:w='iw*t/<duration>':h=H:color=C:t=fill`
- Why duration is positional: FFmpeg's `drawbox` expressions can reference `t` (current time) but not the total duration of the source — fqmpeg compensates by baking the duration into the formula
| Argument / Option | Default | Notes |
|---|---|---|
| `<input>` | required | Input video |
| `<duration>` | required | Total video duration in seconds (find with `npx fqmpeg duration input.mp4`) |
| `--pos <position>` | bottom | top, bottom |
| `--color <name>` | red | FFmpeg color |
| `--height <px>` | 5 | Positive integer |
| `-o, --output <path>` | `<input-stem>-progress.<ext>` | — |
```
$ npx fqmpeg progress input.mp4 90 --pos bottom --color cyan --height 8 --dry-run
ffmpeg -i input.mp4 -vf drawbox=x=0:y=ih-8:w='iw*t/90':h=8:color=cyan:t=fill -c:a copy input-progress.mp4
```
Pass the duration in seconds — 90 for 1m30s, 3600 for an hour. The bar reaches full width exactly at t = duration; if you pass a wrong duration the bar over- or under-shoots. A typical wrapper script: dur=$(npx fqmpeg duration input.mp4) && npx fqmpeg progress input.mp4 $dur. Useful for tutorial videos where viewers want a visual cue of how much remains.
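The baked-in formula makes the over/undershoot behavior easy to check by hand:

```python
def bar_width(iw: int, t: float, duration: float) -> float:
    # drawbox evaluates w='iw*t/<duration>' every frame: 0 at t=0,
    # exactly full frame width at t=duration. A wrong duration shifts
    # where "full" lands, so the bar over- or under-shoots.
    return iw * t / duration

print(bar_width(1920, 45, 90))  # 960.0 — half width at the midpoint
print(bar_width(1920, 90, 80))  # 2160.0 — overshoots past the frame edge
```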
## Real-World Recipes
Each recipe chains multiple verbs into a workflow you'd actually use.
### Recipe 1: Brand a YouTube clip
You have a finished video and want to add a corner watermark, an opening title that fades after 3 seconds, and a thin progress bar at the bottom.
```
# Step 1: corner watermark (logo PNG with alpha)
npx fqmpeg watermark final.mp4 logo.png --pos bottom-right --margin 20
# → final-watermarked.mp4

# Step 2: opening title shown for the first 3 seconds
npx fqmpeg text final-watermarked.mp4 "Episode 12 — Color Grading" \
  --pos top --font-size 56 --color white --bg "black@0.5" \
  --start 0 --end 3
# → final-watermarked-text.mp4

# Step 3: thin progress bar (assuming 8-minute episode = 480s)
npx fqmpeg progress final-watermarked-text.mp4 480 --pos bottom --color "white@0.6" --height 4
# → final-watermarked-text-progress.mp4
```
Three passes means three rounds of H.264 re-encoding, which is tolerable but adds generation loss. If you want one pass, copy the three filter strings from the --dry-run outputs and assemble them with ;/, in a single FFmpeg invocation. For frequent re-runs (a weekly podcast), the three-step workflow is fine — the convenience beats the negligible quality drop.
### Recipe 2: Side-by-side comparison reel with timecode
You want to show a "before/after" of a color-graded clip, with a timecode overlay so reviewers can reference exact moments.
```
# Step 1: stack the two clips horizontally (must match height)
npx fqmpeg stack original.mp4 graded.mp4 --direction horizontal
# → original-stacked.mp4 (3840x1080 for two 1920x1080 sources)

# Step 2: burn timecode in the top-left for reference
npx fqmpeg timecode original-stacked.mp4 --pos top-left --font-size 32
# → original-stacked-timecode.mp4

# Step 3 (optional): label each side with a static text overlay
npx fqmpeg text original-stacked-timecode.mp4 "ORIGINAL                              GRADED" \
  --pos bottom --font-size 48 --color white --bg "black@0.5"
```
The label trick in step 3 uses spaces to push "GRADED" to the right side — crude but effective. For pixel-perfect labels you'd run two text invocations with custom positions (copy the --dry-run and replace the --pos-derived x= with absolute coordinates).
### Recipe 3: Document a screen recording with face-cam and progress
You have a 12-minute screen recording (screen.mp4) and a webcam (webcam.mp4) and want a tutorial deliverable: face-cam in the corner, time-driven progress bar, and a half-transparent title for the first 5 seconds.
```
# Step 1: face-cam picture-in-picture
npx fqmpeg pip screen.mp4 webcam.mp4 --pos bottom-right --scale 0.2 --margin 20
# → screen-pip.mp4

# Step 2: progress bar (12 minutes = 720 seconds)
npx fqmpeg progress screen-pip.mp4 720 --pos top --color "red@0.7" --height 6
# → screen-pip-progress.mp4

# Step 3: opening title
npx fqmpeg text screen-pip-progress.mp4 "Setting up Vercel + Next.js" \
  --pos center --font-size 64 --color white --bg "black@0.7" \
  --start 0 --end 5
# → screen-pip-progress-text.mp4
```
If the webcam is a different aspect ratio than the screen capture (square webcam, 16:9 screen), the pip overlay scales relative to the main video's dimensions — a 0.2 scale on a 1920-wide screen produces a 384-wide overlay regardless of the webcam's source dimensions. To preserve the webcam's own aspect, pre-process it with npx fqmpeg crop webcam.mp4 ... before the pip step.
## Frequently Asked Questions
### How do I pass colons or apostrophes inside a text string?
Just include them — npx fqmpeg text input.mp4 "Time: it's 12:34" works. fqmpeg's wrapper escapes : and ' before inlining into the FFmpeg drawtext filter (' becomes '\\\\'' and : becomes \:). You don't need to escape anything yourself except shell-special characters (the surrounding quotes in your shell already handle that).
### pip-grid accepted my 5 inputs and the bottom row is duplicated — is that intended?
Yes — when input count doesn't fill the grid, fqmpeg pads empty slots with the last input. Pass 5 to a 3x3 (auto-detected) and slots 6-9 are clones of input 5. To use 2x2 for 4 inputs (and skip the padding), pass 4 inputs (auto-detects 2x2) or specify --layout 2x2 explicitly. To leave deliberate blank slots, supply a black-screen input — fqmpeg won't synthesize one for you.
### progress requires a duration — why doesn't it auto-detect?
FFmpeg's drawbox filter expression can reference t (the running time of the current frame) but has no built-in expression for "total duration of the source." The only ways to get total duration into the filter are (1) bake it in as a literal number, which is what fqmpeg does; (2) probe the source first, which would mean running ffprobe from inside the JS wrapper — fqmpeg keeps verbs single-purpose and doesn't probe transparently. Use dur=$(npx fqmpeg duration input.mp4) then pass $dur as the argument.
### What's the difference between watermark and picture?
watermark overlays a static image on top of an existing video — the input is a video and the watermark is a small image. picture does the opposite — it creates a video from a still image and an audio track (looping the image for the audio's full duration). They share libavfilter plumbing internally but solve different problems: branding (watermark) vs single-image-video creation (picture).
### Can I change histogram-overlay's position or panel size?
Not via fqmpeg's flags — both are hardcoded (overlay=W-w-10:H-h-10 for bottom-right with 10px margin, size=320x240). For a custom layout, copy the --dry-run filter string and edit the overlay=... and size=... parameters before running FFmpeg directly. Common edits: overlay=10:H-h-10 for bottom-left, size=480x270 for a larger panel.
### stack errored with "Input 1 height ... does not match" — what now?
hstack requires all inputs to share the same height; vstack requires the same width. Pre-resize the mismatched input: npx fqmpeg resize smaller.mp4 --height 1080 (matching the other clip's height). If the aspect ratios differ and you don't want to crop, pad with pad first to match dimensions, then stack — using border is a quick hack since it adds a frame of configurable width on each side.
### Adding a border makes my output non-standard size — does that matter?
Yes, on platforms with strict aspect-ratio requirements (Instagram square posts at 1:1, YouTube Shorts at 9:16). A 1920x1080 source with --width 30 becomes 1980x1140, which most players will letterbox and some social platforms will re-crop. Two workarounds: (1) use crop first to shrink the inner video, then border to bring it back to the original size; (2) let the platform do its thing and accept that the border may be cropped or letterboxed.
### Can I avoid the H.264 generation loss when chaining multiple overlay verbs?
Each fqmpeg invocation re-encodes — the loss accumulates. Two ways to mitigate: (1) bump intermediate quality with -crf 18 or lossless by editing the --dry-run output; (2) bypass fqmpeg for the chain by combining filter strings from each verb's --dry-run into one FFmpeg invocation with -vf "filter1,filter2,filter3" (or -filter_complex if any verb uses multiple inputs). For 2-3 passes the loss is rarely noticeable on real content; for 5+ passes it adds up.
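Mechanically, combining per-verb filters into one pass is just string assembly — simple single-input filters chain with commas in one `-vf` filtergraph (a sketch; the filenames and filter strings are illustrative, and fqmpeg doesn't do this merging for you):

```python
def chain_vf(filters):
    # Single-input -vf filters run in sequence when joined with commas;
    # multi-input verbs (pip, stack, ...) need -filter_complex and ';' instead.
    return ",".join(filters)

vf = chain_vf([
    "drawtext=text='Episode 12':fontsize=56:fontcolor=white",
    "drawbox=x=0:y=ih-4:w='iw*t/480':h=4:color=white@0.6:t=fill",
])
print(f'ffmpeg -i final.mp4 -vf "{vf}" -c:a copy final-branded.mp4')
```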
## Wrapping Up
The 13 C8 verbs cover the composition operations you reach for after editing but before final output:
- Static graphics & text (`watermark`, `text`, `drawbox`, `timecode`, `border`) — burn logos, titles, highlights, timestamps, frames
- Multi-video composition (`pip`, `pip-grid`, `blend`, `stack`) — picture-in-picture, 2-9-input grids, opacity blends, before/after stacks
- Image-to-video (`picture`) — promote a still image + audio into a playable MP4
- Analytical overlays (`video-info-overlay`, `histogram-overlay`, `progress`) — burn frame metadata, color scopes, and time progress for review
Every verb prints its underlying FFmpeg invocation under --dry-run, so when fqmpeg's defaults don't fit (a custom histogram-overlay position, a custom drawtext format, a single-pass multi-filter chain to skip generation loss), copy the command, customize, and run FFmpeg directly. For the broader fqmpeg map, see the fqmpeg complete guide.