fix(engine,producer): preserve template-wrapped sub-composition media offsets#476
Conversation
|
Apparently I'm working on the same issue in #475 |
jrusso1020
left a comment
There was a problem hiding this comment.
Verdict: approve — real silent-misrender bug, reproduced against main, fix cleanly addresses both root causes
Local verification
Bug reproduced on main. Cherry-picked the new htmlCompiler.test.ts cases onto main (3b8de7a5) and ran them:
(fail) template-wrapped sub-composition media offsets > offsets template-wrapped media to the host start during compile
expect(compiled.videos[0]?.start).toBe(2);
Expected: 2
Received: 0
(fail) template-wrapped sub-composition media offsets > preserves first-pass media offsets when durations are resolved after inlining
That's the real failure mode: a <video> authored scene-local at data-start="0" inside a template-wrapped sub-composition with host data-start="2" is being scheduled at t=0s instead of t=2s. In a render, the child video collides with the intro on the timeline — exactly what the PR description says.
Fix verified against 5d23706e:
bun test packages/engine/src/utils/htmlTemplate.test.ts→ 7/7 passbun test packages/producer/src/services/htmlCompiler.test.ts→ 30/30 pass (including both new regression cases)bun run --filter @hyperframes/engine test→ 490/490 passbun run --filter @hyperframes/engine typecheck→ cleanbun run --filter @hyperframes/producer typecheck→ clean
Staff review — architecture + correctness
The PR identifies TWO distinct root causes and fixes each at the precise site. Not a shotgun fix.
Root cause 1 — <template> contents invisible to media scrapers. LinkeDOM follows spec: <template> contents live in a DocumentFragment that querySelectorAll() on the outer document never traverses. Sub-compositions in this codebase commonly wrap their body in a single top-level <template> (see the skill contract — every sub-composition is a <template> wrapping a <div data-composition-id=…>). So parseVideoElements / parseImageElements / parseAudioElements silently returned [] for the most common sub-composition shape in the codebase. That's a huge correctness-scoped bug that likely was masked because many renders happened to use non-template sub-composition shapes or used root-level media.
The unwrapTemplate helper is precisely scoped. Reading htmlTemplate.ts:
- DOM-based, not regex-based. Miguel explicitly called out avoiding the CodeQL backtracking warning from a regex approach. Good — once you're in LinkeDOM, use LinkeDOM.
- Only unwraps the exact single-wrapper shape.
getSingleMeaningfulChildwalks past whitespace-only text nodes (nodeType 3) and comments (nodeType 8) but returnsnullon ANY of: multiple element children, mixed text + elements, no element children. If the walk doesn't find exactly one meaningful element AND that element isn't<template>, the helper returns the input HTML unchanged. That defensive shape means: multi-sibling-templates, mixed content, empty fragments all pass through untouched. - Short-circuits cheaply. The string-level
!lowered.includes("<template") || !lowered.includes("</template>")skip avoids LinkeDOM parsing for any HTML that doesn't plausibly contain a template. Zero cost on the common path. - Wrapping strategy is correct.
parseHTMLContentdetects<!doctype|<htmlto decide whether to wrap fragments in a synthetic<!doctype html><html><head></head><body>…</body></html>, so both full-document and fragment inputs go through one codepath. LinkeDOM'sparseHTMLthen gives a body we can walk. template.innerHTML ?? html— defensive fallback. In practiceinnerHTMLreturns""for empty templates (the test confirms this), and since""is defined,??returns the empty string, nothtml. Correct.
Nitty correctness check on each test case in htmlTemplate.test.ts — all 7 encode meaningful invariants:
returns the input unchanged when there is no template wrapper— short-circuit worksunwraps a bare top-level template fragment— the happy pathunwraps a full document whose body only contains a template— full-doc shape worksreturns the input unchanged when the closing template tag is missing— malformed input is passthrough, not a crashreturns an empty string for an empty template— matches the??semantics, not a passthroughpreserves nested templates inside the outer wrapper— only the OUTER wrapper unwrapsleaves multiple sibling templates unchanged— doesn't heuristically pick one
That's a solid coverage profile for a string→string helper.
Root cause 2 — recompileWithResolutions was clobbering first-pass offsets. The change:
const hasSubMedia = subVideos.length > 0 || subAudios.length > 0 || subImages.length > 0;
const videos = hasSubMedia ? dedupeElementsById([...mainVideos, ...subVideos]) : compiled.videos;The semantics: after the browser resolves composition durations, recompileWithResolutions reparses the already-inlined HTML. By this point the original [data-composition-src] hosts are gone (they got inlined into the main HTML), so parseSubCompositions legitimately returns empty. The bug was that mainVideos re-parsed from the inlined DOM would now find the scene-local video at its SCENE-LOCAL data-start="0" — losing the host offset that was correctly applied in the first pass. The fix preserves the first-pass compiled.videos when there's no second-pass sub-media.
The correctness argument: between the two passes, the set of media elements doesn't change — duration resolution just replaces data-composition-duration on hosts. So preserving the first-pass arrays when no fresh sub-media is discovered is safe — we aren't dropping any legitimately-new elements.
Non-blocking observations
-
unwrapTemplatelowercases for the short-circuit but the DOM check usestagName === "TEMPLATE". LinkeDOM normalizes tagName to uppercase, which matches the comparison — correct. No issue. -
BOM handling.
html.trimStart().toLowerCase()handlesas a whitespace code point in most engines (String.prototype.trimtreats it as whitespace per ES2019). If an author source-saved a<template>as UTF-8-BOM, the check still works. Minor trivia; not an action item. -
Documentation for the
hasSubMediapreservation. The conditional is subtle and the change is 4 lines. A single-line comment at the site explaining "after inlining, re-parsing the merged DOM loses the sub-composition offsets — preserve first-pass arrays when no fresh sub-media is discovered" would save the next reader the archaeology. Not a blocker; tests and commit message cover it. -
Parallel fix for
parseSubCompositions? The PR appliesunwrapTemplateto the three media parsers inengine/services/audioMixer.tsandengine/services/videoFrameExtractor.ts. If anywhere else in the engine usesparseHTML(rawSubCompositionHtml)and thenquerySelectorAllagainst a template-wrapped fragment, it would have the same bug.parseSubCompositionsitself handles the host-level traversal via its own regex + offset logic and doesn't hit this issue. Out of scope; flagging for future awareness.
CI state
On 5d23706e — all green except 2 regression-shards (styles-b, styles-e) still in-progress. Completed green: Format / Typecheck / Lint / Test / Build / Test: runtime contract / Smoke / Tests on windows-latest / Render on windows-latest / CodeQL / player-perf / regression-shards (render-compat, styles-a, styles-g). Perf suite skipped on this PR (Detect changes gated it out as non-perf-relevant).
Ship it.
Review by Rames Jusso
|
Nice catch on the DOM-based unwrap — that's cleaner than the regex I used in #475. One case I think is still broken after this PR: the browser-metadata reconciliation in executeRenderJob (renderOrchestrator.ts) overwrites existing.end whenever |existing.end - el.end| > 0.0001. For sub-composition media, existing.end is the compiled value (scene-local end + host offset, so e.g. 89.94), and el.end comes from the inlined DOM where the clip's data-end is still scene-local (85.52). The 4.417 gap triggers the reconcile and clips the tail off every sub-composition video/audio at render time. I hit this in my render: the host was data-start="4.417" and the last ~4.4s of every clip was cut. Only fixing the parse / recompile paths (the two in this PR) isn't enough — the reconciler undoes it. Repro just needs a host with data-start > 0 + a scene-local clip that runs to the end of the scene, then check whether the final audio.aac contains the full clip or is short. I have a fix for this in #475 as a second commit (link to commit). Happy to extract it into a separate PR on top of this one if that's cleaner. |
Merge activity
|
Problem
Template-wrapped sub-compositions could still lose correct parent timing during render in more than one place.
In the validated repros, a host sub-composition starting after the intro (and in one follow-up repro, starting at
20safter earlier compositions) contained scene-local media inside it. On the broken paths:recompileWithResolutions()executeRenderJob()could still overwrite a compiled globalendwith a scene-localdata-endfrom the inlined DOM, clipping the tail off late-start sub-composition mediaWhat this fixes
Template-wrapped media discovery
parseVideoElements,parseImageElements, andparseAudioElementsnow unwrap a single top-level<template>wrapper before scraping mediaOffset preservation after duration resolution
recompileWithResolutions()now preserves the first-pass sub-composition media arrays when the already-inlined HTML no longer contains[data-composition-src]hostsBrowser metadata reconciliation in the compiled time origin
data-start/data-endfrom the merged DOM after inliningendvalues into the compiled element's time origin before reconciling them back intocomposition.videos/composition.audiosRegression coverage
recompileWithResolutions()patht≈20) with scene-local media inside themRoot cause
There were three distinct renderer failures behind the bug:
1. Template contents were invisible to the media scrapers
parseSubCompositions()reads raw sub-composition HTML and applies the host offset to discovered media. But the engine media helpers were querying the parsed document directly, and linkedom follows browser semantics here: top-level<template>contents live in aDocumentFragment, soquerySelectorAll()never saw those<video>/<audio>/<img>nodes.That meant template-wrapped sub-compositions could silently produce zero discovered media during the first pass.
2. The duration-resolution recompile could clobber already-correct offsets
After the browser resolves composition durations,
recompileWithResolutions()reparses the already-inlined HTML. By that point the original[data-composition-src]hosts are gone, soparseSubCompositions()legitimately returns no nested media.The old code still rebuilt the deduped media arrays from the merged DOM, which let scene-local media parsed from the inlined HTML overwrite the correctly offset first-pass metadata.
3. The browser probe reconcile path mixed two timing coordinate systems
discoverMediaFromBrowser()readsdata-start/data-enddirectly from the live DOM after sub-compositions are already inlined. For nested media, those attributes can still be scene-local even though the compiled metadata has already been offset into the parent host timeline.The old reconcile path compared those values directly and overwrote
existing.endwhenever the numbers differed. For a late-start sub-composition, that could replace a correct global end like25.5with a scene-local end like5.5, cutting the clip off during render.Verification
Local checks
bun test packages/engine/src/utils/htmlTemplate.test.tsbun test packages/producer/src/services/htmlCompiler.test.tsbunx vitest run packages/producer/src/services/renderOrchestrator.test.tsbun run --filter @hyperframes/engine testbun run --filter @hyperframes/engine typecheckbun run --filter @hyperframes/producer typecheckbunx oxlint packages/engine/src/utils/htmlTemplate.ts packages/engine/src/utils/htmlTemplate.test.ts packages/producer/src/services/renderOrchestrator.ts packages/producer/src/services/renderOrchestrator.test.ts packages/producer/src/services/htmlCompiler.test.tsbunx oxfmt --check packages/engine/src/utils/htmlTemplate.ts packages/engine/src/utils/htmlTemplate.test.tsbun run build:producerRender / browser verification
Verified against two local repros:
Early offset repro
2s0-4sstart: 2agent-browserconfirmed the2.2sframe still shows the child clip active in the host timelineLate offset repro
20s1.5sand should remain visible through24.5sstart: 21.5,end: 25.5agent-browserconfirmed the24.5sframe still shows the late clip visible, which is the exact tail-clipping case the old reconcile path could breakNotes
/tmp/hf-pr475-reproand/tmp/hf-pr476-late-offset-reproprojects plus their browser-proof artifacts are verification-only and are not part of this PR