---
title: "Performance Recipes: Concurrency, GPU, Prefetch & Encoding"
type: synthesis
tags: [performance, concurrency, gpu, hardware-acceleration, rendering, decision-guide]
created: 2026-06-11
updated: 2026-06-11
confidence: high
sources: ["raw/github_doc-packages-docs-docs-performance-mdx.md", "raw/github_doc-packages-docs-docs-gpu-mdx.md", "raw/github_doc-packages-docs-docs-hardware-acceleration-mdx.md", "raw/github_doc-packages-docs-docs-scaling-mdx.md", "raw/github_doc-packages-docs-docs-slow-method-to-extract-frame-mdx.md", "raw/github_doc-packages-docs-docs-quality-mdx.md", "raw/github_doc-packages-docs-docs-prefetch-mdx.md", "raw/github_issue-parallel-frame-capturing-and-encoding-for-better-performance.md"]
---

# Performance Recipes: Concurrency, GPU, Prefetch & Encoding

The levers that actually move render speed (and Player smoothness), with verbatim flags and values.

## Comparison

| Lever | How | Effect | Applies to |
|---|---|---|---|
| Concurrency | `--concurrency` flag; find optimum with `npx remotion benchmark` | Too high AND too low are both counterproductive | CLI/SSR render |
| Image format | default JPEG (quality 80); `--image-format=png` only for transparency; tune `--jpeg-quality` | PNG screenshots are significantly slower than JPEG | Render |
| GPU (`--gl`) | enable an OpenGL backend during rendering (headless Chrome disables GPU by default) | Big speedup ONLY for GPU content: WebGL (Three.js/Skia/P5/Mapbox), 2D canvas, `box-shadow`, `text-shadow`, `linear-gradient`/`radial-gradient`, `filter: blur()`, `filter: drop-shadow()`, `transform` | Render on GPU machines; **not Lambda (no GPU)** |
| Hardware-accelerated encoding | `--hardware-acceleration if-possible` (CLI) or `hardwareAcceleration: 'if-possible'` in `renderMedia()`; also `Config.setHardwareAcceleration('if-possible')` | Faster encoding; **macOS only (VideoToolbox)**; ProRes from v4.0.228, H.264/H.265 from v4.0.236; not supported on Lambda/Cloud Run | Encoding step |
| Bitrate w/ HW accel | `--video-bitrate=8M` | HW-encoded files are much larger by default; 8M ≈ software-encoded size for H.264 Full HD; `crf` is incompatible with HW encoders | Encoding |
| Codec choice | avoid `vp8`/`vp9` (WebM) unless needed | WebM codecs are very slow to encode | Encoding |
| Resolution | `--scale` (e.g. `--scale=0.5`; max 16; since 4.0.328 non-integer/odd values auto-round) | Lower resolution = faster; higher scale = sharper text on hi-DPI but slower | Render |
| Video tag choice | use `<Video>` from `@remotion/media` | `<Html5Video>` and `<OffthreadVideo>` are "not optimized" per official docs | Render + Player |
| OffthreadVideo flags | leave `transparent` off; set `toneMapped={false}` if color accuracy is secondary | PNG extraction (~40% slower) and tone mapping both cost time | Render |
| Slow frame extraction (pre-v4) | re-encode: `npx remotion ffmpeg -i inputvideo.mp4 outputvideo.mp4`; or prefer VP9 over VP8 / JPEG over PNG | Eliminates "Using a slow method to extract the frame at 1000ms of [video]" | Render (Remotion < 4.0 only) |
| Asset prefetch (Player) | `prefetch(src, {method: 'blob-url'})`; `free()` to release; `base64` for Safari audio hiccups; set `contentType` if server headers are wrong | Media ready before playback; Player-only (no-op when rendering) | Player/Studio |
| Slow JS / data fetching | `--log=verbose` lists slowest frames; `console.time`; `useMemo`/`useCallback`; pre-compute and pass as input props; cache in localStorage | Removes per-frame bottlenecks | Everything |
| Silent video audio | add `muted` to videos without sound | Skips downloading the file for audio mixing | Render |

## Analysis

- **Concurrency is empirical, not "more is better."** The benchmark command exists precisely because the optimum depends on composition and hardware. History: Remotion once ran one browser with many tabs, which serialized rendering (issue #584 measured 67s → 43s frame time by switching to multiple browser instances; 39.7s → 22.2s on a 5900X). Modern Remotion parallelizes, but the lesson stands — measure with `npx remotion benchmark`.
- **Studio/Player feel fast, renders feel slow — because the GPU disappears.** In Studio and Player the GPU is on; headless Chromium disables it. That's why a composition full of `box-shadow`, gradients, and blurs renders far slower than it previews. Options: enable `--gl` on a GPU machine, or replace GPU-heavy CSS with precomputed images. Note what the GPU does NOT accelerate: `<Html5Video>`, `<OffthreadVideo>`, canvas pixel manipulation, and video encoding itself.
- **Two different "hardware" levers get conflated:** GPU rendering of frames (`--gl`, content-dependent) vs hardware-accelerated *encoding* (VideoToolbox on macOS only). They're independent; on Lambda you get neither.
- **Quality and speed share knobs.** CRF controls quality for software encoding; `--video-bitrate` replaces it under hardware acceleration (mutually exclusive with `--crf`). `--x264-preset` trades speed/compression. Sharp text on hi-DPI displays needs output scaling (`--scale=2`), which multiplies render cost — and canvas/WebGL content won't upscale unless components support `pixelDensity` (`<Solid>`, `<HtmlInCanvas>` with `usePixelDensity()`).
- **Cloud renders pay for everything twice:** GPU effects render slowly on CPU-only Lambda AND big assets cost egress. Performance work and cost work overlap heavily — see [[syntheses/cost-estimation]].

## Recommendations

- **Always start with `npx remotion benchmark`** before hand-tuning `--concurrency`.
- **Rendering locally on a Mac with H.264/H.265/ProRes output:** add `--hardware-acceleration if-possible --video-bitrate=8M`.
- **Composition is CSS-effect heavy (shadows/gradients/blur):** either render on a GPU machine with `--gl`, or bake effects into images before rendering on Lambda.
- **Slow renders with embedded videos:** switch to `<Video>` from `@remotion/media` ([[syntheses/media-tag-picker]]); mute silent videos; keep `transparent` off.
- **Player startup jank with remote media:** `prefetch()` assets on mount with `blob-url`; switch to `base64` only if Safari audio stutters.
- **Diagnose before tuning:** `--log=verbose` to find the slowest frames; if early frames are slow that's thread initialization, not your code.
- **Need 4K output of a 1080p comp:** `--scale=2` and accept the slowdown; don't redesign the composition.

## Pages Compared

- [[concepts/rendering-ssr]] — render pipeline these levers act on
- [[concepts/cli-reference]] — `benchmark`, `render` flags
- [[concepts/lambda]] — what's unavailable there (GPU, HW encoding)
- [[concepts/player]] — prefetch applies here only
- [[syntheses/media-tag-picker]] — tag choice as a performance lever
- [[syntheses/cost-estimation]] — speed savings = cost savings on Lambda
- [[summaries/casebook-rendering]] — the multi-tab concurrency history (issue #584)
