32blog by Studio Mitsu

Automate Video Processing with FFmpeg and Python

Learn how to call FFmpeg from Python to automate batch video processing. Covers subprocess, ffmpeg-python, progress bars, parallel encoding, and error handling.

by omitsu · 12 min read

This article contains affiliate links.


The simplest way to call FFmpeg from Python is subprocess.run(["ffmpeg", "-i", "input.mov", "-c:v", "libx264", "output.mp4"]). For batch processing, combine this with pathlib's Path.glob() to iterate over files, tqdm for progress bars, and concurrent.futures to encode in parallel across CPU cores.

If you're converting videos by hand, automating the process with Python is well worth the effort. Whether you need to batch-convert 100 MP4 files to WebM or resize a folder of recordings all at once, a single script can handle it. Once you set it up, you never go back to doing it manually.

This article walks through calling FFmpeg from Python — starting with the basics using subprocess, then moving to the ffmpeg-python library, a full folder batch converter, progress bars, parallel encoding, and a robust error handling pattern with retries.

Why Use Python with FFmpeg

FFmpeg is a complete and powerful CLI tool on its own. But when you're dealing with large numbers of files or complex conditional logic, shell scripts start to show their limits.

Combining it with Python gives you several advantages.

  • File enumeration and filtering: target only files matching certain extensions, sizes, or naming patterns
  • Error handling and retries: log failures and reprocess them automatically
  • Progress display: know in real time how many files are done and how many remain
  • Structured logging: save conversion results in a format you can review later

Shell scripts can handle some of this, but Python keeps the logic readable as complexity grows. The FFmpeg documentation covers the CLI flags, while Python handles the orchestration layer around it.

Pipeline overview: input folder → glob filter → Python iterates → subprocess calls FFmpeg → encode → write to output folder → done / failed.

The Basics with subprocess

The simplest way to call FFmpeg from Python is subprocess.run(). Since FFmpeg is a CLI tool, you pass the command-line arguments as a list and it just works.

python
import subprocess
from pathlib import Path


def convert_to_mp4(input_path: Path, output_path: Path) -> bool:
    """Convert a file to MP4. Returns True on success, False on failure."""
    cmd = [
        "ffmpeg",
        "-i", str(input_path),
        "-c:v", "libx264",
        "-crf", "23",
        "-preset", "medium",
        "-c:a", "aac",
        "-b:a", "128k",
        "-y",               # overwrite without asking
        str(output_path),
    ]

    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )

    return result.returncode == 0


if __name__ == "__main__":
    src = Path("input.mov")
    dst = Path("output.mp4")

    if convert_to_mp4(src, dst):
        print(f"Success: {dst}")
    else:
        print("Conversion failed")

A few things worth noting.

  • The "-y" flag overwrites the output file if it already exists — useful for batch processing
  • stdout=subprocess.PIPE and stderr=subprocess.PIPE capture FFmpeg's output instead of letting it flood the terminal
  • FFmpeg writes its logs to stderr even on success. Access error output via result.stderr.decode()
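Putting that last point into practice, a small general-purpose helper (a sketch, not tied to FFmpeg specifically) can surface the captured stderr whenever the exit code is non-zero:

```python
import subprocess
import sys


def run_and_report(cmd: list[str]) -> bool:
    """Run a command; return True on success, print captured stderr on failure."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode != 0:
        print(f"Command failed (exit {result.returncode}):")
        print(result.stderr.decode(errors="replace"))
        return False
    return True


if __name__ == "__main__":
    # With FFmpeg this would be e.g.:
    # run_and_report(["ffmpeg", "-i", "input.mov", "-c:v", "libx264", "-y", "output.mp4"])
    ok = run_and_report([sys.executable, "-c", "pass"])
    print("ok" if ok else "failed")
```

The errors="replace" argument keeps the decode step from crashing on any stray bytes in FFmpeg's log output.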

Using the ffmpeg-python Library

Calling subprocess directly is straightforward, but the command list gets unwieldy when you add many options. The ffmpeg-python library offers a cleaner approach.

Install it with pip.

shell
pip install ffmpeg-python

You build the FFmpeg pipeline using method chaining.

python
import ffmpeg


def transcode(input_path: str, output_path: str) -> None:
    """Transcode using ffmpeg-python."""
    (
        ffmpeg
        .input(input_path)
        .output(
            output_path,
            vcodec="libx264",
            crf=23,
            preset="medium",
            acodec="aac",
            audio_bitrate="128k",
        )
        .overwrite_output()
        .run(capture_stdout=True, capture_stderr=True)
    )


if __name__ == "__main__":
    transcode("input.mov", "output.mp4")

When an error occurs, ffmpeg.Error is raised. The e.stderr attribute contains FFmpeg's standard error output, which is useful for debugging.

python
import ffmpeg


def transcode_safe(input_path: str, output_path: str) -> None:
    try:
        (
            ffmpeg
            .input(input_path)
            .output(output_path, vcodec="libx264", crf=23)
            .overwrite_output()
            .run(capture_stdout=True, capture_stderr=True)
        )
    except ffmpeg.Error as e:
        print("FFmpeg error:")
        print(e.stderr.decode())
        raise

Batch Converting a Folder of Videos

Now for the main event. The script below converts all MOV files in a folder to MP4. It uses Path.glob() to enumerate files and processes them one by one.

python
import subprocess
from pathlib import Path
from typing import List


INPUT_DIR = Path("./originals")
OUTPUT_DIR = Path("./converted")
INPUT_EXT = ".mov"
OUTPUT_EXT = ".mp4"


def convert_file(src: Path, dst: Path) -> bool:
    """Convert a single file to MP4."""
    dst.parent.mkdir(parents=True, exist_ok=True)

    cmd = [
        "ffmpeg",
        "-i", str(src),
        "-c:v", "libx264",
        "-crf", "23",
        "-preset", "fast",
        "-c:a", "aac",
        "-b:a", "128k",
        "-movflags", "+faststart",  # optimize for web playback
        "-y",
        str(dst),
    ]

    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0


def batch_convert(input_dir: Path, output_dir: Path) -> None:
    """Convert all INPUT_EXT files in input_dir to OUTPUT_EXT."""
    files = list(input_dir.glob(f"**/*{INPUT_EXT}"))

    if not files:
        print(f"No {INPUT_EXT} files found in {input_dir}")
        return

    print(f"Found {len(files)} file(s) to convert")
    success_count = 0
    fail_list: List[Path] = []

    for i, src in enumerate(files, start=1):
        # preserve the relative folder structure in the output
        relative = src.relative_to(input_dir)
        dst = output_dir / relative.with_suffix(OUTPUT_EXT)

        print(f"[{i}/{len(files)}] {src.name} → {dst.name}", end=" ... ")

        if convert_file(src, dst):
            print("OK")
            success_count += 1
        else:
            print("FAILED")
            fail_list.append(src)

    print(f"\nDone: {success_count}/{len(files)} succeeded")

    if fail_list:
        print("Failed files:")
        for f in fail_list:
            print(f"  {f}")


if __name__ == "__main__":
    batch_convert(INPUT_DIR, OUTPUT_DIR)

The recursive glob pattern processes subfolders automatically. The output mirrors the original directory structure, so even deeply nested files stay organized.

Adding a Progress Bar

When processing a large number of files, a progress bar makes the wait much more manageable. tqdm adds one in a single line.

python
import subprocess
from pathlib import Path
from typing import List
from tqdm import tqdm


# pip install tqdm


def convert_file(src: Path, dst: Path) -> bool:
    dst.parent.mkdir(parents=True, exist_ok=True)
    cmd = [
        "ffmpeg", "-i", str(src),
        "-c:v", "libx264", "-crf", "23", "-preset", "fast",
        "-c:a", "aac", "-b:a", "128k",
        "-y", str(dst),
    ]
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0


def batch_convert_with_progress(input_dir: Path, output_dir: Path) -> None:
    files = list(input_dir.glob("**/*.mov"))
    fail_list: List[Path] = []

    with tqdm(total=len(files), unit="file", ncols=80) as pbar:
        for src in files:
            relative = src.relative_to(input_dir)
            dst = output_dir / relative.with_suffix(".mp4")

            pbar.set_description(src.name[:30])  # show filename on the left

            if not convert_file(src, dst):
                fail_list.append(src)

            pbar.update(1)

    if fail_list:
        print(f"\nFailed: {len(fail_list)} file(s)")
        for f in fail_list:
            print(f"  {f}")


if __name__ == "__main__":
    batch_convert_with_progress(Path("./originals"), Path("./converted"))

set_description() shows the current filename to the left of the bar. Truncating with [:30] prevents long filenames from breaking the layout.

Error Handling and Retry Logic

One tricky aspect of FFmpeg error handling is that a successful exit code does not guarantee the output file is valid. If the input has data corruption partway through, FFmpeg may finish with exit code 0 while producing an unplayable file.

The script below combines retry logic with a basic output validation step using ffprobe.

python
import subprocess
import time
from pathlib import Path


MAX_RETRIES = 3
RETRY_DELAY = 2  # seconds


def verify_output(path: Path) -> bool:
    """Validate the output file using ffprobe."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if result.returncode != 0:
        return False
    output = result.stdout.decode().strip()
    try:
        return float(output) > 0
    except ValueError:  # ffprobe may print "N/A" for broken containers
        return False


def convert_with_retry(src: Path, dst: Path, retries: int = MAX_RETRIES) -> bool:
    """Convert with up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        dst.parent.mkdir(parents=True, exist_ok=True)

        cmd = [
            "ffmpeg", "-i", str(src),
            "-c:v", "libx264", "-crf", "23", "-preset", "fast",
            "-c:a", "aac", "-b:a", "128k",
            "-y", str(dst),
        ]

        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

        if result.returncode != 0:
            print(f"  Attempt {attempt}/{retries}: FFmpeg error (exit {result.returncode})")
            if attempt < retries:
                time.sleep(RETRY_DELAY)
            continue

        if not verify_output(dst):
            print(f"  Attempt {attempt}/{retries}: output validation failed")
            dst.unlink(missing_ok=True)  # remove the broken file
            if attempt < retries:
                time.sleep(RETRY_DELAY)
            continue

        return True

    return False


def batch_convert_robust(input_dir: Path, output_dir: Path) -> None:
    files = list(input_dir.glob("**/*.mov"))
    success, failed = 0, []

    for i, src in enumerate(files, start=1):
        relative = src.relative_to(input_dir)
        dst = output_dir / relative.with_suffix(".mp4")
        print(f"[{i}/{len(files)}] {src.name}")

        if convert_with_retry(src, dst):
            print(f"  Done: {dst}")
            success += 1
        else:
            print(f"  Skipped after {MAX_RETRIES} failed attempts: {src}")
            failed.append(src)

    print(f"\nResult: {success} succeeded / {len(failed)} failed / {len(files)} total")

    if failed:
        log_path = output_dir / "failed.txt"
        log_path.write_text("\n".join(str(f) for f in failed))
        print(f"Failure log saved to: {log_path}")


if __name__ == "__main__":
    batch_convert_robust(Path("./originals"), Path("./converted"))

Failed files are written to failed.txt in the output directory so you can review and reprocess them later.

For a solid grounding in FFmpeg itself, check out the FFmpeg usage tutorial.

Parallel Encoding with concurrent.futures

All the scripts above process files one at a time. If your machine has multiple CPU cores — and most do — you're leaving performance on the table. Python's concurrent.futures module makes it straightforward to encode multiple files simultaneously.

python
import subprocess
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path


MAX_WORKERS = 4  # adjust to your CPU core count


def convert_file(src: Path, dst: Path) -> tuple[Path, bool]:
    """Convert a single file. Returns (source_path, success)."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    cmd = [
        "ffmpeg", "-i", str(src),
        "-c:v", "libx264", "-crf", "23", "-preset", "fast",
        "-c:a", "aac", "-b:a", "128k",
        "-y", str(dst),
    ]
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return src, result.returncode == 0


def batch_convert_parallel(input_dir: Path, output_dir: Path) -> None:
    files = list(input_dir.glob("**/*.mov"))
    tasks: dict = {}
    success, failed = 0, []

    with ProcessPoolExecutor(max_workers=MAX_WORKERS) as executor:
        for src in files:
            relative = src.relative_to(input_dir)
            dst = output_dir / relative.with_suffix(".mp4")
            future = executor.submit(convert_file, src, dst)
            tasks[future] = src

        for future in as_completed(tasks):
            src_path, ok = future.result()
            if ok:
                success += 1
                print(f"  Done: {src_path.name}")
            else:
                failed.append(src_path)
                print(f"  FAILED: {src_path.name}")

    print(f"\nResult: {success} succeeded / {len(failed)} failed / {len(files)} total")


if __name__ == "__main__":
    batch_convert_parallel(Path("./originals"), Path("./converted"))

A few things to keep in mind with parallel encoding.

  • MAX_WORKERS should match your available cores. FFmpeg itself can use multiple threads per encode, so setting this too high will cause contention. For x264 with -preset fast, 2-4 workers is a reasonable starting point on an 8-core machine
  • Memory usage scales linearly with the number of parallel encodes. If you're working with 4K footage, monitor RAM usage
  • File I/O can become the bottleneck — especially on spinning disks. SSDs handle parallel reads/writes much better

For a batch of 50 screen recordings (1080p, 2-5 minutes each), 4 workers can deliver a 3-4x speedup over single-threaded processing. It's not a perfect N-times improvement due to I/O overhead, but the difference is significant.
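One way to tame the thread contention mentioned above is to cap each FFmpeg process with the -threads flag so that workers × threads roughly matches your core count. This is a sketch; the right split is workload-dependent:

```python
import os
from pathlib import Path


def build_cmd(src: Path, dst: Path, threads: int) -> list[str]:
    """Build an FFmpeg command capped at `threads` encoder threads."""
    return [
        "ffmpeg", "-i", str(src),
        "-c:v", "libx264", "-crf", "23", "-preset", "fast",
        "-threads", str(threads),   # cap x264's thread pool per process
        "-c:a", "aac", "-b:a", "128k",
        "-y", str(dst),
    ]


if __name__ == "__main__":
    workers = 4
    cores = os.cpu_count() or workers
    threads_per_worker = max(1, cores // workers)  # e.g. 2 threads each on 8 cores
    print(build_cmd(Path("in.mov"), Path("out.mp4"), threads_per_worker))
```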

For even larger batches where your local machine isn't enough, consider offloading the work to a VPS encoding server. A cloud instance with 16+ cores can chew through hundreds of files while you keep working locally.

Kamatera

Enterprise-grade cloud VPS with global data centers

  • 13 data centers (US, EU, Asia, Middle East)
  • Starting at $4/month for 1GB RAM — pay-as-you-go
  • 30-day free trial available

FAQ

What's the best way to call FFmpeg from Python?

Use subprocess.run() with the command as a list of strings. It's the most reliable and gives you full control over FFmpeg's options. The ffmpeg-python library is a convenient wrapper but adds a dependency.

Does ffmpeg-python still work? It hasn't been updated since 2019.

Yes. The latest release is v0.2.0 from 2019, but the library is stable because it simply wraps the FFmpeg CLI. As long as FFmpeg is installed and the CLI interface doesn't change (it rarely does), ffmpeg-python continues to work. If you want a more actively maintained alternative, look at python-ffmpeg or typed-ffmpeg.

How do I show a progress bar for each individual file's encoding?

The examples in this article show file-level progress (file 1 of N). For frame-level progress within a single encode, parse FFmpeg's stderr output line by line looking for frame= and time= fields. The better-ffmpeg-progress package does this out of the box.
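If you want to roll the parsing yourself rather than add a dependency, the core of it is a regex over FFmpeg's status lines (read via subprocess.Popen with stderr piped; note FFmpeg separates updates with carriage returns, not newlines). The sample line below is illustrative, not live output:

```python
import re
from typing import Optional

# Typical FFmpeg status line:
# frame=  240 fps= 60 q=28.0 size=    2048kB time=00:00:08.00 bitrate=2097.2kbits/s speed=2.0x
TIME_RE = re.compile(r"time=(\d+):(\d+):(\d+(?:\.\d+)?)")


def parse_time(line: str) -> Optional[float]:
    """Extract the time= field from an FFmpeg status line as seconds, or None."""
    m = TIME_RE.search(line)
    if not m:
        return None
    h, mnt, s = m.groups()
    return int(h) * 3600 + int(mnt) * 60 + float(s)


if __name__ == "__main__":
    sample = "frame=  240 fps= 60 q=28.0 size=    2048kB time=00:00:08.00 bitrate=2097.2kbits/s speed=2.0x"
    print(parse_time(sample))  # 8.0
```

Divide the parsed time by the source duration (from ffprobe) to get a percentage for the bar.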

Can FFmpeg exit with code 0 but produce a broken file?

Yes. This happens when the input has partial corruption or disk space runs out mid-encode. Always validate the output with ffprobe — check that the duration is greater than zero and matches the expected value.
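Extending the earlier verify_output function to that duration-matching check could look like this sketch (the 0.5-second tolerance is an arbitrary choice; container duration and stream duration can differ slightly):

```python
import subprocess
from pathlib import Path


def probe_duration(path: Path) -> float:
    """Return the container duration in seconds via ffprobe (raises on failure)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(path)],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True,
    )
    return float(result.stdout.decode().strip())


def durations_match(src_seconds: float, dst_seconds: float, tolerance: float = 0.5) -> bool:
    """Check that the encoded duration is within `tolerance` seconds of the source."""
    return abs(src_seconds - dst_seconds) <= tolerance


if __name__ == "__main__":
    # usage: durations_match(probe_duration(src), probe_duration(dst))
    print(durations_match(120.0, 119.8))  # True
```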

How many parallel FFmpeg processes should I run?

Start with the number of physical CPU cores divided by 2. FFmpeg's x264 encoder uses multiple threads per process, so running too many in parallel causes thread contention and actually slows things down. Monitor CPU usage and adjust.
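That heuristic is one line of code; note that os.cpu_count() reports logical cores, so on hyperthreaded machines this is only an approximation of half the physical count:

```python
import os


def default_workers() -> int:
    """Half the reported core count, but at least 1."""
    return max(1, (os.cpu_count() or 2) // 2)


if __name__ == "__main__":
    print(f"Suggested MAX_WORKERS: {default_workers()}")
```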

Should I use subprocess.run() or subprocess.Popen()?

Use subprocess.run() for most batch processing — it blocks until FFmpeg finishes, which makes error handling simple. Use Popen() only when you need to stream stderr in real time (e.g., for frame-level progress parsing).

What Python version do I need?

Python 3.8 or later. The scripts use pathlib, keyword arguments for subprocess.run(), and type hints. The parallel processing example uses tuple[Path, bool] syntax which requires Python 3.9+.

How do I handle FFmpeg encoding on a remote server?

Upload your files via rsync or scp, run the Python script over SSH (use nohup or tmux so it survives disconnection), then download the results. A VPS with dedicated CPU cores is ideal for this.

Wrapping Up

Here's a summary of what this article covered.

  • subprocess.run() is the simplest way to call FFmpeg from Python, using the same options you'd pass on the command line
  • ffmpeg-python lets you build pipelines with method chaining, which is especially useful for complex filter graphs
  • Batch conversion uses Path.glob() to enumerate files and preserves the original folder structure in the output
  • tqdm adds a progress bar in one line — essential for long-running batch jobs
  • Robust error handling means not trusting exit code alone. Validate output with ffprobe and retry on failure
  • Parallel encoding with concurrent.futures cuts processing time roughly in proportion to your CPU cores

To take these scripts further, consider adding argparse for CLI arguments and logging to write structured logs to a file.
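As a starting point for the argparse idea, a minimal wrapper around the batch converter might look like this (the flag names here are illustrative, not from the scripts above):

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse CLI arguments for the batch converter."""
    parser = argparse.ArgumentParser(description="Batch-convert videos with FFmpeg")
    parser.add_argument("input_dir", type=Path, help="folder containing source files")
    parser.add_argument("output_dir", type=Path, help="folder for converted files")
    parser.add_argument("--ext", default=".mov", help="input extension to match")
    parser.add_argument("--crf", type=int, default=23, help="x264 quality (lower = better)")
    parser.add_argument("--workers", type=int, default=1, help="parallel FFmpeg processes")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.input_dir, args.output_dir, args.ext, args.crf, args.workers)
    # then hand off to one of the batch_convert_* functions from this article
```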

