perf(engine): content-addressed extraction cache for video frames #433
Closed
jrusso1020 wants to merge 1 commit into perf/producer-segment-scope-hdr-preflight from
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
This was referenced Apr 23, 2026

What
Adds a content-addressed cache for pre-extracted video frames. When `extractCacheDir` is configured, `extractAllVideoFrames` skips the Phase 3 ffmpeg extract on inputs whose `(source, window, fps, format)` tuple has been seen before.

Opt-in via `EngineConfig.extractCacheDir` or `HYPERFRAMES_EXTRACT_CACHE_DIR`. Disabled by default.

Why
PR 3 of the 5-PR stack from `hyperframes-notes/producer-render-architecture-review-2026-04-21.md`. The doc calls this the biggest wall-clock win for iteration workflows:

Typical impact: the second render of an unchanged composition skips Phase 3 entirely, so `videoExtractBreakdown.extractMs` drops to ~0 (just an ffprobe on the source) and `cacheHits` equals the video count. This is visible in the `RenderPerfSummary` from PR 1.

How
Key layout (`extractionCache.ts`)
- `v1-<sha256-prefix-32>`: schema-version prefix + 128-bit hash of `path | mtime-ms | size | mediaStart | duration | fps | format`.
- `mtime+size` is a good proxy for "the same file on disk."
- … `v2`.

Completeness sentinel
- `.hf-complete` written after a successful extraction.

Lifecycle
- `ExtractedFrames.ownedByLookup?: boolean` field. When `false` (cache-owned paths), `FrameLookupTable.cleanup()` skips `rmSync` so the cache dir survives the render.
- … `ExtractedFrames` built from the cache entry's frame files + the source's ffprobe metadata.
- … `outputDirOverride` param on `extractVideoFramesRange`), then mark complete. The per-render `work/video-frames/<id>/` dir isn't created on the cache path.

Counters
- `ExtractionPhaseBreakdown.cacheHits` / `cacheMisses` counters flow through `RenderPerfSummary.videoExtractBreakdown`.

Known limitations (deliberate for v1)
- … `work/` dir (different path+mtime per render), so its cache key differs across runs. An optimization for later; the common SDR+CFR case is covered.

Test plan
- `extractionCache.test.ts`: key determinism, schema-version prefix, per-field invalidation (path / mtime / size / window / fps / format), float stability, source probing, sentinel presence/absence, format-mismatch miss, multi-frame hit.
- `videoFrameExtractor.test.ts` (skipped when ffmpeg unavailable): `cacheHits=1`, wall time under 2s, identical frame count.

Stack
Built on #432 (segment-scope HDR preflight) which is built on #430 (phase-level instrumentation). Each PR in the stack adds to `RenderPerfSummary` so improvements are measurable.

Follow-ups (remaining PRs in the stack)
- … `CACHE_SCHEMA_VERSION` to 2)
- … `!hasAlpha` and segment-length floor
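
Illustrative sketch

The key layout, completeness sentinel, and hit/miss flow described under How can be sketched roughly as follows. This is a minimal TypeScript sketch under stated assumptions, not the code in `extractionCache.ts`: the names `CacheKeyInput`, `computeCacheKey`, `isComplete`, `markComplete`, and `resolveEntry` are hypothetical, as are the `|` separator encoding, the 6-decimal float rounding, and the placeholder frame file name. Only the `v1-` prefix, the 32-hex-character (128-bit) SHA-256 prefix, the key fields, and the `.hf-complete` sentinel come from the description.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical input shape; the fields mirror the key layout in the description.
interface CacheKeyInput {
  path: string;
  mtimeMs: number;
  size: number;
  mediaStart: number;
  duration: number;
  fps: number;
  format: string;
}

const CACHE_SCHEMA_VERSION = 1;
const SENTINEL = ".hf-complete";

// Schema-version prefix + first 32 hex chars (128 bits) of a SHA-256 digest.
// Assumption: floats are rounded to 6 decimals so equal windows hash equally.
function computeCacheKey(k: CacheKeyInput): string {
  const material = [
    k.path,
    Math.floor(k.mtimeMs),
    k.size,
    k.mediaStart.toFixed(6),
    k.duration.toFixed(6),
    k.fps.toFixed(6),
    k.format,
  ].join("|");
  const digest = createHash("sha256").update(material).digest("hex");
  return `v${CACHE_SCHEMA_VERSION}-${digest.slice(0, 32)}`;
}

// An entry only counts as a hit once the sentinel file exists.
function isComplete(entryDir: string): boolean {
  return existsSync(join(entryDir, SENTINEL));
}

// Written last, so an interrupted extraction is never mistaken for a hit.
function markComplete(entryDir: string): void {
  writeFileSync(join(entryDir, SENTINEL), "");
}

// Hit: reuse the entry dir. Miss: extract straight into it, then mark complete.
function resolveEntry(
  cacheDir: string,
  input: CacheKeyInput,
  extract: (outDir: string) => void,
): { dir: string; hit: boolean } {
  const dir = join(cacheDir, computeCacheKey(input));
  if (isComplete(dir)) return { dir, hit: true };
  mkdirSync(dir, { recursive: true });
  extract(dir);
  markComplete(dir);
  return { dir, hit: false };
}

// Usage with a stand-in extractor instead of ffmpeg.
const cacheDir = mkdtempSync(join(tmpdir(), "hf-extract-cache-"));
const input: CacheKeyInput = {
  path: "/videos/clip.mp4",
  mtimeMs: 1713830400000,
  size: 1048576,
  mediaStart: 0,
  duration: 2.5,
  fps: 30,
  format: "jpg",
};
let extractions = 0;
const fakeExtract = (outDir: string): void => {
  extractions += 1;
  writeFileSync(join(outDir, "frame_00001.jpg"), ""); // placeholder frame file
};
const first = resolveEntry(cacheDir, input, fakeExtract);
const second = resolveEntry(cacheDir, input, fakeExtract);
console.log(first.hit, second.hit, extractions); // false true 1
```

Writing the sentinel only after the extractor returns is what lets a later render distinguish a finished entry from a partially written one; changing any key field (for example `fps`) yields a different entry directory rather than overwriting the old one.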