
perf(producer): hdr benchmark harness — --tags filter, peak heap/RSS tracking, bench:hdr script #382

Open
vanceingalls wants to merge 1 commit into vance/frame-dir-cache-isolation-tests from vance/hdr-benchmark-harness

vanceingalls commented Apr 21, 2026

Summary

Make the existing benchmark harness genuinely useful for HDR perf work: positive --tags filter, peak heap/RSS sampling, a bench:hdr script, and a perf README documenting the captured April-2026 baseline. Lands first in the Chunk 8 sub-stack so subsequent perf PRs can be measured against a known starting point.

Why

Chunk 8A of plans/hdr-followups.md. Wall-clock timing alone can't catch slow memory regressions like an unbounded image cache — peak RSS does. And the existing harness only had --exclude-tags, so HDR runs had to wait for unrelated SDR fixtures.

What changed

1. Positive --tags filter in benchmark.ts. Adds --tags hdr so HDR runs don't have to wait for unrelated fixtures. Filters compose: a fixture must match --tags (if provided) AND must not match --exclude-tags.

2. Peak heap + RSS tracking in executeRenderJob. A 250 ms periodic process.memoryUsage() sampler runs alongside every render and reports peakRssMb / peakHeapUsedMb in RenderPerfSummary. Sampler is unref'd and always cleared in finally so it never keeps the event loop alive or leaks across jobs. Both fields are optional on the interface for back-compat with serialized older summaries.

3. bench:hdr convenience script plus a perf README at tests/perf/README.md documenting the harness, the new flags, and the captured April-2026 HDR baseline (PQ regression: 34.5 s / 272 MiB RSS, HLG regression: 11.5 s / 227 MiB RSS, both 1080p / 1 worker / 1 run).
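The filter composition described in item 1 can be sketched as follows. This is a minimal illustration, not the code in benchmark.ts: the `Fixture` and `TagFilters` shapes and the `selectFixtures` name are hypothetical, and "match" is assumed to mean that a fixture shares at least one tag with the filter.

```typescript
// Hypothetical fixture shape; the real harness's type may differ.
interface Fixture {
  name: string;
  tags: string[];
}

// Hypothetical parsed CLI filters.
interface TagFilters {
  tags?: string[]; // from --tags; undefined or empty means no positive filter
  excludeTags?: string[]; // from --exclude-tags
}

// Keep a fixture only if it matches --tags (when provided)
// AND does not match --exclude-tags.
function selectFixtures(fixtures: Fixture[], filters: TagFilters): Fixture[] {
  const include = filters.tags ?? [];
  const exclude = filters.excludeTags ?? [];
  return fixtures.filter((fixture) => {
    const matchesInclude =
      include.length === 0 || fixture.tags.some((t) => include.includes(t));
    const matchesExclude = fixture.tags.some((t) => exclude.includes(t));
    return matchesInclude && !matchesExclude;
  });
}
```

Under these assumptions, a fixture tagged both `hdr` and `slow` is dropped by `--tags hdr --exclude-tags slow`, because the exclusion still applies after the positive match.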

The benchmark output table is widened and gains PeakRSS / PeakHeap columns. A new avgOrNull helper preserves null in the JSON when no run reported memory (avoids silently coercing missing data to 0 in older snapshots).
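One way an `avgOrNull` helper could behave is shown below. The signature and the handling of partially missing data are assumptions; the description above only states that `null` is preserved when no run reported memory.

```typescript
// Average the runs that reported a value; return null (not 0) when none did.
// Assumed signature: the real helper in benchmark.ts may differ.
function avgOrNull(values: Array<number | null | undefined>): number | null {
  const present = values.filter(
    (v): v is number => typeof v === "number" && Number.isFinite(v),
  );
  if (present.length === 0) return null;
  return present.reduce((sum, v) => sum + v, 0) / present.length;
}
```

Returning `null` keeps "no data" distinguishable from "zero MiB" once the result is serialized with `JSON.stringify`.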

No functional change for non-benchmark renders: the sampler does run in every executeRenderJob, but its only cost is one process.memoryUsage() call every 250 ms, well below measurement noise.
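The sampler pattern from item 2 can be sketched roughly as follows, assuming a Node.js runtime. The `withPeakMemory` wrapper and `PeakMemory` type are hypothetical names for illustration; only `peakRssMb`, `peakHeapUsedMb`, the 250 ms interval, the `unref` call, and the `finally` cleanup come from the description above. The conversion to MiB without rounding is also an assumption.

```typescript
interface PeakMemory {
  peakRssMb: number;
  peakHeapUsedMb: number;
}

const BYTES_PER_MIB = 1024 * 1024;

// Hypothetical wrapper: runs `job` while sampling process memory periodically.
async function withPeakMemory<T>(
  job: () => Promise<T>,
  intervalMs = 250,
): Promise<{ result: T; peak: PeakMemory }> {
  let peakRss = 0;
  let peakHeapUsed = 0;
  const sample = (): void => {
    const { rss, heapUsed } = process.memoryUsage();
    peakRss = Math.max(peakRss, rss);
    peakHeapUsed = Math.max(peakHeapUsed, heapUsed);
  };

  sample(); // take one sample up front so short jobs still report a value
  const timer = setInterval(sample, intervalMs);
  timer.unref(); // the sampler alone must not keep the event loop alive

  try {
    const result = await job();
    sample(); // final sample after the job completes
    return {
      result,
      peak: {
        peakRssMb: peakRss / BYTES_PER_MIB,
        peakHeapUsedMb: peakHeapUsed / BYTES_PER_MIB,
      },
    };
  } finally {
    clearInterval(timer); // always cleared, so nothing leaks across jobs
  }
}
```

Because sampling is periodic, a spike shorter than the interval can be missed; the reported peak is a lower bound on the true peak.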

Test plan

  • bunx tsc --noEmit -p packages/producer — clean.
  • bunx oxlint / bunx oxfmt --check on changed files — clean.
  • bun test src/services/ — 60/60 pass (frameDirCache, orchestrator, etc.).
  • bunx tsx src/benchmark.ts --tags hdr --runs 1 — both HDR fixtures render successfully, summary table prints PeakRSS/PeakHeap columns, per-run output shows new memory line.
  • bunx tsx src/benchmark.ts --tags nonexistent — exits 1 with a helpful message naming the active filters.

Stack

Chunk 8A of plans/hdr-followups.md. First PR in the Chunk 8 perf sub-stack; subsequent PRs (image cache, logger gating) measured against this baseline.


vanceingalls commented Apr 21, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 0d2175b to cd40e4b Compare April 22, 2026 22:53
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from b171d31 to aefd2fc Compare April 22, 2026 23:26
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from cd40e4b to 5594d94 Compare April 22, 2026 23:26
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from aefd2fc to 3be90e5 Compare April 23, 2026 00:06
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 5594d94 to 4ea0021 Compare April 23, 2026 00:06
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from 3be90e5 to ede0291 Compare April 23, 2026 00:10
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 4ea0021 to c95ffe5 Compare April 23, 2026 00:11
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from ede0291 to a4717db Compare April 23, 2026 00:45
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from c95ffe5 to 6cf7b33 Compare April 23, 2026 00:45
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from a4717db to c76bbb2 Compare April 23, 2026 01:58
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 6cf7b33 to 23df022 Compare April 23, 2026 01:58
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from c76bbb2 to 1cb6854 Compare April 23, 2026 02:58
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 23df022 to 53e0f64 Compare April 23, 2026 02:59
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from 1cb6854 to adfcf6f Compare April 23, 2026 03:21
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 53e0f64 to 9d3aa62 Compare April 23, 2026 03:22
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from adfcf6f to 56d9997 Compare April 23, 2026 03:43
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 9d3aa62 to 3500be4 Compare April 23, 2026 03:43
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from 56d9997 to b8fa66f Compare April 23, 2026 04:51
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 3500be4 to f572ea8 Compare April 23, 2026 04:52
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from b8fa66f to d9a7c43 Compare April 23, 2026 05:11
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from f572ea8 to 4a1a749 Compare April 23, 2026 05:11
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from d9a7c43 to dc034ec Compare April 23, 2026 05:46
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 4a1a749 to bbaef03 Compare April 23, 2026 05:47
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from dc034ec to 39201e6 Compare April 23, 2026 06:07
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from bbaef03 to 0b163b9 Compare April 23, 2026 06:07
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from 39201e6 to cdb1508 Compare April 23, 2026 06:59
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 0b163b9 to 2d59a61 Compare April 23, 2026 07:00
…tracking, bench:hdr script

Makes the existing benchmark harness genuinely useful for HDR perf work
before landing image-cache and debug-logging optimizations in the rest of
Chunk 8.

Three tightly-related changes:

1. **Positive --tags filter** in `benchmark.ts`. Existing harness only had
   `--exclude-tags` (which defaults to `slow`). Adds `--tags hdr` so HDR runs
   don't have to wait for unrelated SDR fixtures. Filters compose: a fixture
   must match `--tags` (if provided) AND must not match `--exclude-tags`.

2. **Peak heap + RSS tracking** in `executeRenderJob`. A 250ms periodic
   `process.memoryUsage()` sampler runs alongside every render and reports
   `peakRssMb` / `peakHeapUsedMb` in `RenderPerfSummary`. Wall-clock alone
   can't catch slow memory regressions like an unbounded image cache —
   peak RSS does. Sampler is `unref`'d and always cleared in `finally` so
   it never keeps the event loop alive or leaks across jobs. Both fields
   are optional on the interface for back-compat with serialized older
   summaries.

3. **bench:hdr convenience script** plus a perf README at
   `tests/perf/README.md` documenting the harness, the new flags, and the
   captured April-2026 HDR baseline (PQ regression: 34.5s / 272 MiB RSS,
   HLG regression: 11.5s / 227 MiB RSS, both 1080p / 1 worker / 1 run).

The benchmark output table is widened and gains PeakRSS / PeakHeap columns.
A new `avgOrNull` helper preserves `null` in the JSON when no run reported
memory (avoids silently coercing missing data to 0 in older snapshots).

No behavior change for non-benchmark renders — the sampler runs in every
`executeRenderJob` but its overhead is a single `process.memoryUsage()`
call every 250ms, well below noise.

Verification:
- `bunx tsc --noEmit -p packages/producer` — clean
- `bunx oxlint` / `bunx oxfmt --check` on changed files — clean
- `bun test src/services/` — 60/60 pass (frameDirCache, orchestrator, etc.)
- `bunx tsx src/benchmark.ts --tags hdr --runs 1` — both HDR fixtures
  render successfully, summary table prints PeakRSS/PeakHeap columns,
  per-run output shows new memory line.
- `bunx tsx src/benchmark.ts --tags nonexistent` — exits 1 with a
  helpful message naming the active filters.

Refs: plans/hdr-followups.md Chunk 8A.
@vanceingalls vanceingalls force-pushed the vance/frame-dir-cache-isolation-tests branch from cdb1508 to dea33f9 Compare April 23, 2026 15:33
@vanceingalls vanceingalls force-pushed the vance/hdr-benchmark-harness branch from 2d59a61 to 23c6d5c Compare April 23, 2026 15:33