fix(engine): auto-normalize VFR video inputs to CFR before frame extraction by jrusso1020 · Pull Request #360 · heygen-com/hyperframes

jrusso1020 · 2026-04-21T07:00:37Z

What

Auto-normalize variable-frame-rate (VFR) video inputs to constant-frame-rate (CFR) before the frame-extraction stage, eliminating the "frozen screen recording" class of bugs in rendered compositions.

Why

A user reported on X that their screen-recording scenes were freezing in hyperframes renders. Investigation:

- macOS ScreenCaptureKit, QuickTime screen recordings, and phone videos all emit VFR by default (frames written only on content change).
- The extractor at packages/engine/src/services/videoFrameExtractor.ts:142 runs ffmpeg -ss <start> -i <video> -t <dur> -vf fps=N on the input.
- On VFR inputs, the fps filter exhibits two failure modes:
  1. Frame-count shortfall: for a 4-second segment at 30fps starting mid-file, output was ~90 frames instead of 120. FrameLookupTable.getFrameAtTime (line 440) returns null for out-of-range indices, so the compositor holds the last valid frame and the user sees a freeze.
  2. Duplicate-frame runs: 32-39% of output frames were duplicates on realistic VFR fixtures, because the fps filter snaps multiple outputs to the same nearest source frame.
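
To make the frame-count shortfall concrete, here is a minimal sketch of how a short extraction turns into a visible freeze. It is a toy model, not the engine's `FrameLookupTable`: the function names are hypothetical, and the "hold the last valid frame" fallback is modeled only as the description above states it.

```typescript
// Hypothetical model of the freeze; none of these names exist in the engine.

/** Frames the compositor expects for a segment. */
function expectedFrameCount(durationSec: number, fps: number): number {
  return Math.round(durationSec * fps);
}

/**
 * Index of the extracted frame shown at each output tick.
 * When the requested index is past the end of what was extracted,
 * the last valid frame is held (the assumed fallback behavior).
 */
function displayedFrames(extractedCount: number, durationSec: number, fps: number): number[] {
  const total = expectedFrameCount(durationSec, fps);
  const lastValid = Math.max(0, extractedCount - 1);
  const shown: number[] = [];
  for (let i = 0; i < total; i++) {
    shown.push(i < extractedCount ? i : lastValid);
  }
  return shown;
}

/** Seconds at the end of the segment during which the picture does not change. */
function frozenTailSeconds(extractedCount: number, durationSec: number, fps: number): number {
  const missing = Math.max(0, expectedFrameCount(durationSec, fps) - extractedCount);
  return missing / fps;
}
```

With the numbers from the report (90 frames extracted for a 4-second segment at 30fps), this model gives 30 missing frames, which is the 25% shortfall and a one-second frozen tail.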

The engine already detects VFR via metadata.isVFR (packages/engine/src/utils/ffprobe.ts:257) but never acted on it — the compiler only logged a warning telling users to manually re-encode.
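
The PR does not show how `metadata.isVFR` is computed. One common heuristic, sketched here purely as an assumption, compares the `r_frame_rate` and `avg_frame_rate` fields that ffprobe reports per stream: on a CFR file they agree, while on a VFR capture they can differ widely. The helper names and the 1% tolerance are illustrative, not taken from ffprobe.ts.

```typescript
// Assumed heuristic; the engine's actual isVFR logic may differ.

/** Parse an ffprobe rational such as "30000/1001" or "120/1". */
function parseRational(value: string): number {
  const [numText, denText = "1"] = value.split("/");
  const num = Number(numText);
  const den = Number(denText);
  if (!Number.isFinite(num) || !Number.isFinite(den) || den === 0) return NaN;
  return num / den;
}

/**
 * Treat a stream as likely VFR when its nominal and average frame
 * rates differ by more than a relative tolerance (1% is a guess).
 */
function isLikelyVfr(rFrameRate: string, avgFrameRate: string, relativeTolerance = 0.01): boolean {
  const nominal = parseRational(rFrameRate);
  const average = parseRational(avgFrameRate);
  // Unknown or zero rates: do not guess.
  if (!(nominal > 0) || !(average > 0)) return false;
  return Math.abs(nominal - average) / Math.max(nominal, average) > relativeTolerance;
}
```

A stream reporting a nominal 120fps but averaging about 36fps would be flagged by this check, while a 30000/1001 CFR file would not.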

How

Mirrored the existing SDR→HDR normalization pattern in the same file. When metadata.isVFR === true, re-encode the used segment to CFR before extraction:

ffmpeg -ss <mediaStart> -i <video> -t <duration> \
       -fps_mode cfr -r <targetFps> \
       -c:v libx264 -preset fast -crf 18 \
       -c:a copy -y <normalized.mp4>

Scoping to [mediaStart, mediaStart+duration] rather than full-file means a 30-second clip cut from a 60-minute screen recording pays ~1s of transcode cost, not 18s.
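
As an illustration of the segment-scoped command above, the following sketch assembles the same flags as an argument array. `buildCfrNormalizeArgs` and its options type are hypothetical names for this example; the engine's own `convertVfrToCfr` may build its arguments differently.

```typescript
// Hypothetical helper; only the ffmpeg flags come from the PR description.
interface CfrNormalizeOptions {
  inputPath: string;
  outputPath: string;
  mediaStart: number; // seconds into the source where the used segment begins
  duration: number;   // seconds of source actually used by the composition
  targetFps: number;  // composition frame rate
}

function buildCfrNormalizeArgs(options: CfrNormalizeOptions): string[] {
  const { inputPath, outputPath, mediaStart, duration, targetFps } = options;
  return [
    // Seek before -i so only the used segment is decoded and re-encoded.
    "-ss", String(mediaStart),
    "-i", inputPath,
    "-t", String(duration),
    // Force constant frame rate at the target fps.
    "-fps_mode", "cfr",
    "-r", String(targetFps),
    // Fast, visually near-lossless intermediate encode.
    "-c:v", "libx264", "-preset", "fast", "-crf", "18",
    // Pass audio through untouched; overwrite any stale output.
    "-c:a", "copy",
    "-y", outputPath,
  ];
}
```

The returned array can be handed to a process spawner as the arguments for `ffmpeg`. Passing an array rather than a shell string avoids quoting problems with paths that contain spaces.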

Benchmarked locally on a synthesized VFR fixture (10s of testsrc2 with ~40% frames dropped, sparse keyframes every 10s):

Strategy	Dupe rate	Frame-count shortfall
Baseline (before)	32-39%	25% shortfall on mid-segments
Flag changes only (`-fps_mode cfr`, `-r` vs `-vf fps=N`, accurate seek)	~same	still short
CFR preflight (this PR)	1.7-6%	perfect frame count

The flag-only tweaks were tested and insufficient — the fps filter's VFR handling is the underlying issue, not the seek mode or output-rate flag. Pre-encoding is the fix experiment-framework already uses internally for the same reason (see worker/celery/movio/utils/ffmpeg_utils.py reencode_video_if_potentially_vfr).
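
The PR does not say how the duplicate rate in the table was measured. One plausible way, sketched here under that caveat, is to hash each extracted frame and count frames identical to their predecessor. `duplicateStats` is a hypothetical helper, and the hashes are assumed to come from any per-frame content hash.

```typescript
// Hypothetical measurement helper; the benchmark's actual method is not shown in the PR.
interface DuplicateStats {
  duplicateRate: number; // fraction of frames identical to the previous frame
  longestRun: number;    // longest run of consecutive identical frames
}

function duplicateStats(frameHashes: readonly string[]): DuplicateStats {
  if (frameHashes.length === 0) return { duplicateRate: 0, longestRun: 0 };
  let duplicates = 0;
  let longestRun = 1;
  let currentRun = 1;
  for (let i = 1; i < frameHashes.length; i++) {
    if (frameHashes[i] === frameHashes[i - 1]) {
      duplicates++;
      currentRun++;
    } else {
      currentRun = 1;
    }
    longestRun = Math.max(longestRun, currentRun);
  }
  return { duplicateRate: duplicates / frameHashes.length, longestRun };
}
```

Tracking the longest run alongside the rate is useful because a few scattered duplicates are harmless, whereas one long run is what a viewer perceives as a freeze.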

The compiler warning that used to tell users to manually re-encode VFR videos is downgraded from console.warn to console.info since the engine now handles it; the message still mentions the pre-encode command for users who want to skip the per-render transcode cost.

Test plan

Unit tests added: packages/engine/src/services/videoFrameExtractor.test.ts
- Detects the fixture as VFR.
- Regression test: mid-file segment produces expected frame count (120 @ 4s × 30fps, previously ~90).
- Full-file case produces expected frame count (300 @ 10s × 30fps).
All 311 existing engine tests still pass.
Typecheck, lint, format clean.
Manual validation against the user's actual screen recording (once Miguel gets the repro file from the reporter).

— Rames Jusso

…action

Screen recordings (macOS ScreenCaptureKit, QuickTime, phone videos) are commonly variable-frame-rate. When such inputs hit the extractor's `-ss <start> -i <video> -t <dur> -vf fps=N` pipeline, the fps filter can emit fewer frames than requested: for a 4-second 30fps segment starting mid-file, the output was ~90 frames instead of 120. `FrameLookupTable.getFrameAtTime` returns null for out-of-range indices, so the compositor held the last valid frame and the user perceived the video as freezing. This matches the bug report from an X community post where a user said "all of them freezes" on their screen recording scenes.

The engine already detects VFR via `metadata.isVFR` in ffprobe.ts but never acted on it; the compiler only logged a warning. This change mirrors the existing SDR→HDR normalization pattern: when a source is detected as VFR, re-encode only the used segment with `-fps_mode cfr -r <fps> -preset fast -crf 18` before extraction. Scoping the re-encode to `[mediaStart, mediaStart+duration]` means a 30-second clip cut from a 60-minute screen recording pays ~1s of transcode cost, not 18s.

Benchmarked locally:
- Baseline (current): 32-39% duplicate frames, 25% frame-count shortfall on mid-file segments.
- Tier 1 (flag changes only): ~same; the fps filter issue is not flag-fixable.
- Tier 2 (CFR preflight): 1.7-6% duplicate frames, correct frame count in every scenario tested.

The compiler warning that previously told users to manually re-encode is downgraded to `console.info` since the engine now handles it.

— Rames Jusso

- Drop the `vfrNormDirCreated` flag; `mkdirSync({recursive:true})` is idempotent and cheap.
- Don't re-wrap the `VFR→CFR conversion failed` prefix: `convertVfrToCfr` already throws a message with that label; adding it again in the catch produced "VFR→CFR conversion failed: VFR→CFR conversion failed (exit 1)".
- Shorten the Phase 2b header comment; the function docstring above `convertVfrToCfr` already explains the failure modes and rationale.
- Note which frame windows the VFR fixture's select filter drops so the magic numbers are scannable.

No behavior change; 311/311 engine tests still pass.

— Rames Jusso
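
The commit above fixes the doubled error prefix by not re-wrapping in the catch. For cases where a caller cannot know whether an error is already labeled, a defensive alternative is to add the prefix only when it is missing. This is a sketch of that alternative, not what the commit does, and `describeConversionError` is a hypothetical name.

```typescript
// Alternative to the commit's approach: label an error at most once.
const VFR_PREFIX = "VFR→CFR conversion failed";

function describeConversionError(error: unknown): string {
  const message = error instanceof Error ? error.message : String(error);
  // Already labeled by the thrower: pass through unchanged.
  if (message.startsWith(VFR_PREFIX)) return message;
  return `${VFR_PREFIX}: ${message}`;
}
```

Given an error whose message is already "VFR→CFR conversion failed (exit 1)", this returns the message unchanged instead of producing the doubled form quoted in the commit.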

miguel-heygen

I'd add a regression test just in case

Adds a describe block that synthesizes a VFR fixture via ffmpeg and asserts that the extractor produces the expected frame count (no shortfall) and no long runs of duplicate frames, which are the user-visible "frozen screen recording" symptom. Covers both a mid-file segment and the full-file case. Guarded with describe.skipIf(!HAS_FFMPEG) because the CI Test job on ubuntu-24.04 and the Windows test-windows job don't install ffmpeg. The producer-level regression test in packages/producer/tests/vfr-screen-recording/ runs inside Dockerfile.test (which has ffmpeg) and is the primary CI signal for this bug; these unit tests are supplementary coverage for local runs and any ffmpeg-equipped CI environment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

End-to-end CI regression coverage for PR #360 via the existing regression-harness: renders a 3s composition containing a real macOS ScreenCaptureKit clip (r_frame_rate=120, avg≈36fps) seeked to mediaStart=1, then PSNR-compares against a committed output.mp4. Fixture src/clip.mp4 (108 KB) is a 5-second excerpt downscaled to 480×332 with -fps_mode passthrough to preserve the VFR timestamps. Content is the public hyperframes OSS repo root page — see NOTICE.md for provenance. With the fix applied, all 100 PSNR checkpoints pass. With the fix reverted, 66 of 100 fail (PSNR drops from ~43 dB to ~20 dB in the duplicate-frame windows). Tagged "regression,video,vfr" so it runs in the fast shard of .github/workflows/regression.yml automatically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
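
For readers unfamiliar with the PSNR figures quoted above, the standard definition for 8-bit samples is 10·log10(255² / MSE). The sketch below computes it for two equally sized buffers. `psnrDb` is a hypothetical helper; the regression harness's own implementation and pass threshold are not shown in this PR.

```typescript
// Standard PSNR for 8-bit samples; not the harness's actual code.
function psnrDb(reference: Uint8Array, candidate: Uint8Array): number {
  if (reference.length === 0 || reference.length !== candidate.length) {
    throw new Error("buffers must be non-empty and the same length");
  }
  let sumSquaredError = 0;
  for (let i = 0; i < reference.length; i++) {
    const diff = reference[i] - candidate[i];
    sumSquaredError += diff * diff;
  }
  const mse = sumSquaredError / reference.length;
  // Identical buffers have zero error, so PSNR is unbounded.
  if (mse === 0) return Infinity;
  return 10 * Math.log10((255 * 255) / mse);
}
```

Because the scale is logarithmic, the drop from ~43 dB to ~20 dB reported above corresponds to roughly a 200-fold increase in mean squared error, which is consistent with a frame being compared against different content rather than against a slightly different encode of the same content.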

The committed golden output.mp4 was initially rendered on the host machine; CI runs the renderer inside Dockerfile.test with a different Chrome + ffmpeg build, producing pixel-level drift that failed PSNR at 54/100 checkpoints (~20 dB vs 41 dB in the VFR sparse-content windows). Both renders are valid — the VFR source has inherent sampling ambiguity in static segments, and different Chrome/ffmpeg builds make different valid choices. Regenerated the baseline via `bun run docker:test:update vfr-screen-recording` so it matches the Docker environment CI actually uses. Matches the flow the existing sub-composition-video, hdr-pq, etc. baselines were captured with. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Hit this on 2026-04-21 with the vfr-screen-recording regression test: the host-generated output.mp4 baseline tripped 54/100 PSNR checkpoints in CI because of Chrome + ffmpeg drift between the host and Dockerfile.test. Document the `bun run --cwd packages/producer docker:test:update <name>` flow so future contributors don't repeat the mistake. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jrusso1020 added 2 commits April 21, 2026 07:00

miguel-heygen requested changes Apr 21, 2026


jrusso1020 requested a review from miguel-heygen April 21, 2026 17:49

miguel-heygen approved these changes Apr 21, 2026


jrusso1020 and others added 4 commits April 21, 2026 18:01

jrusso1020 merged commit ffc0682 into main Apr 21, 2026
25 checks passed

This was referenced Apr 23, 2026

perf(producer): phase-level render instrumentation #430

Closed

perf(engine): segment-scope convertSdrToHdr re-encode #432

Closed

jrusso1020 merged 6 commits into main from fix/vfr-screen-recording-freeze
