
Compress HTML report JSON data and minify CSS#5368

Merged
thomhurst merged 12 commits into main from compress-html-reports
Apr 4, 2026

Conversation

@thomhurst
Owner

Summary

  • GZip-compress the JSON test data payload and base64-encode it, reducing the largest portion of report file size (~91% of content) by 80-90%
  • Minify inline CSS by stripping comments and collapsing whitespace, saving ~40% of CSS size
  • Serialize JSON directly to the GZip stream to avoid intermediate string/byte[] allocations

Details

The JSON data is decompressed client-side using the native DecompressionStream API (supported in all modern browsers since 2023). A fallback path handles uncompressed data for older browsers.

Other improvements:

  • Cache minified CSS in a static readonly field (constant input, no need to recompute)
  • Use MemoryStream.TryGetBuffer instead of ToArray() to avoid a buffer copy
  • Add .catch() on the async IIFE to surface errors instead of silently swallowing them
  • Guard DecompressionStream availability before attempting decompression
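The CSS minification step can be sketched in a few lines. The three substitutions mirror the regex patterns discussed in the review, with `!` left out of the separator class as in the final commit; `minifyCss` is an illustrative name:

```javascript
// Illustrative sketch of the CSS minification: strip comments, collapse
// whitespace, drop spaces around structural separators. The separator class
// deliberately omits '!' (see the review discussion of !important).
function minifyCss(css) {
  return css
    .replace(/\/\*[\s\S]*?\*\//g, '')      // strip /* ... */ comments
    .replace(/\s+/g, ' ')                  // collapse whitespace runs
    .replace(/\s*([{}:;,>~+])\s*/g, '$1')  // no spaces around separators
    .trim();
}

const css = `
/* report layout */
.summary {
  color : #333 ;
  margin : 0 auto ;
}
`;
console.log(minifyCss(css));
// → .summary{color:#333;margin:0 auto;}
```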

Test plan

  • Builds on both net10.0 and netstandard2.0
  • All HtmlReporterTests pass
  • Generated report opens correctly in browser with full functionality (verified via Playwright)
  • Zero console errors in browser
  • Verify with a large test suite (hundreds+ tests) to confirm significant size reduction

The JSON test data (which dominates report file size) is now GZip-compressed
and base64-encoded, with client-side decompression via the native
DecompressionStream API. CSS is minified by stripping comments and collapsing
whitespace. Together these reduce report file sizes by 60-80% for large test
suites.
Contributor

@claude claude bot left a comment


Code Review

Overall this is a well-motivated PR with clean implementation. The approach of GZip + base64 for the JSON payload, direct streaming serialization, and CSS minification caching are all solid. I found one functional bug and two smaller concerns.


Bug: Fallback path breaks in older browsers

The JavaScript else branch is reached in two distinct situations:

  1. The script tag has no data-compressed attribute (old-format uncompressed report) — ✅ works
  2. data-compressed="gzip" but DecompressionStream is undefined (new-format report in an older browser) — ❌ fails

In case 2, raw.textContent contains base64-encoded gzip data, not JSON, so JSON.parse(...) throws. The .catch() prevents a crash but silently produces a blank/broken report.

The PR description says "a fallback path handles uncompressed data for older browsers", but the fallback doesn't actually decode anything—it just attempts plain JSON parsing. If supporting older browsers is a goal, the fallback needs a real decompression alternative (e.g. a bundled polyfill, or keeping an uncompressed copy). If older browsers are not supported, the else branch should surface a clear error rather than silently failing:

} else if (raw.getAttribute('data-compressed') === 'gzip') {
    console.error('TUnit: DecompressionStream not supported; report data cannot be decoded.');
    return;
} else {
    data = JSON.parse(raw.textContent);
}

Minor: TryGetBuffer return value is ignored

ms.TryGetBuffer(out var buffer);
var base64 = Convert.ToBase64String(buffer.Array!, buffer.Offset, buffer.Count);

TryGetBuffer returns false when the internal buffer is inaccessible, in which case buffer.Array is null and the ! null-forgiving operator will cause a NullReferenceException. For a new MemoryStream() this can't happen today, but it's fragile to refactoring.

Prefer GetBuffer() (which throws a descriptive exception on failure and is guaranteed to work here) or at minimum assert the return value:

// Option A - simplest; GetBuffer() throws a descriptive exception if the buffer is not exposable
var rawBuffer = ms.GetBuffer();
var base64 = Convert.ToBase64String(rawBuffer, 0, (int)ms.Length);

// Option B - keep TryGetBuffer, but guard
if (!ms.TryGetBuffer(out var buffer))
    throw new InvalidOperationException("MemoryStream buffer not accessible.");

AOT concern: non-generated Regex in MinifyCss

The project requires Native AOT compatibility (CLAUDE.md). The three Regex.Replace calls inside MinifyCss use the interpreted cache-based API. Since the method runs once at class load, performance is not the concern—AOT trimmer correctness is.

Per project conventions, consider refactoring to [GeneratedRegex]:

private static partial class CssRegex
{
    [GeneratedRegex(@"/\*[\s\S]*?\*/")]
    public static partial Regex Comments();

    [GeneratedRegex(@"\s+")]
    public static partial Regex Whitespace();

    [GeneratedRegex(@"\s*([{}:;,>~+!])\s*")]
    public static partial Regex Separators();
}

Positive notes

  • Streaming JSON directly into the GZipStream (avoiding the intermediate string/byte[]) is a nice allocation win.
  • Caching MinifiedCss in a static readonly field is exactly right — constant input, no need to recompute.
  • CompressionLevel.Optimal is appropriate for a one-shot report write.
  • Adding .catch() on the async IIFE and guarding DecompressionStream availability are good defensive practices.


@claude claude bot left a comment


test comment

@codacy-production

codacy-production bot commented Apr 4, 2026

Not up to standards ⛔

🔴 Issues: 1 minor

Alerts: 1 issue found (quality gate allows ≤ 0 issues of at least minor severity)

Results: 1 new issue
  • CodeStyle: 1 minor

View in Codacy


- Surface clear error when DecompressionStream is unavailable instead of
  silently failing on base64 data with JSON.parse
- Use GetBuffer() + ms.Length instead of TryGetBuffer with null-forgiving
  operator

@claude claude bot left a comment


Follow-up Review (after 086c973)

Both items flagged in my previous review have been addressed in the follow-up commit:

  • Fallback path bug — fixed. The three-branch JS structure now correctly errors with a descriptive message when DecompressionStream is unavailable, rather than silently attempting JSON.parse on base64 data.
  • TryGetBuffer return value ignored — fixed. Switched to GetBuffer() which is cleaner and throws a descriptive exception if ever misused.

Still open: non-generated Regex in MinifyCss (AOT concern)

The three Regex.Replace calls still use the interpreted cache-based API. Since CLAUDE.md requires Native AOT compatibility, these should use [GeneratedRegex]:

internal static partial class HtmlReportGenerator
{
    [GeneratedRegex(@"/\*[\s\S]*?\*/")]
    private static partial Regex CssCommentsRegex();

    [GeneratedRegex(@"\s+")]
    private static partial Regex CssWhitespaceRegex();

    [GeneratedRegex(@"\s*([{}:;,>~+!])\s*")]
    private static partial Regex CssSeparatorsRegex();
}

This applies even though MinifyCss runs once at class load — the trimmer needs the source-generated path to guarantee AOT correctness without interpreter fallback.


Minor: JS base64 decoding performance

Uint8Array.from(binary, c => c.charCodeAt(0)) creates one closure call per byte. For large compressed payloads (hundreds of tests), a manual loop is substantially faster and avoids the per-iteration function call overhead:

const bytes = new Uint8Array(binary.length);
for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);

This is a minor concern but relevant for the stated goal of handling large test suites efficiently.


Overall the PR is in good shape. The two functional issues are resolved. The AOT [GeneratedRegex] refactor is the one remaining item worth addressing before merge.

Source-generated regex on net8.0+, compiled regex on netstandard2.0.
Avoids per-byte closure overhead from Uint8Array.from on large payloads.

@claude claude bot left a comment


Follow-up Review (after 7ca876a)

All three items from previous reviews are now addressed:

  • Fallback JS path bug — fixed with a clear console.error and early return.
  • TryGetBuffer return value ignored — fixed with GetBuffer().
  • [GeneratedRegex] for AOT compatibility — resolved. The #if NET / #else split is the right approach: source-generated regex on net8.0+ (AOT-safe), compiled regex on netstandard2.0 (no partial methods). The #if NET preprocessor symbol correctly covers all three TFMs (net8.0, net9.0, net10.0).

Remaining minor: JS base64 decoding loop

The previous review noted this; it is still present in the final diff:

const bytes = Uint8Array.from(binary, c => c.charCodeAt(0));

Uint8Array.from with a map function invokes one closure call per byte. For a compressed payload of even a few hundred tests this could be tens of thousands of calls. A manual loop is measurably faster and has identical semantics:

const bytes = new Uint8Array(binary.length);
for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);

This is the one item worth addressing before merge, especially given the PR's stated goal of performing well with large test suites.


Optional: consider CompressionLevel.SmallestSize

CompressionLevel.Optimal balances speed vs. size. Since report generation is a one-shot write (not a hot path), CompressionLevel.SmallestSize would squeeze out a bit more size reduction at negligible extra time. Worth considering if maximum report compactness is preferred, though Optimal is entirely reasonable here.


Overall the PR is in excellent shape. The implementation is clean, the allocation improvements are solid, and all substantive issues have been addressed. The Uint8Array.from loop is the one remaining item worth a one-line fix.


@claude claude bot left a comment


Final Review — All Items Addressed

All issues raised in previous review rounds have been resolved in the current diff. Here is a summary of the final state:

All Previous Items: Resolved

  • Fallback JS path bug (base64 data passed to JSON.parse): fixed with a three-branch structure, a clear console.error, and an early return
  • TryGetBuffer return value ignored: fixed by switching to GetBuffer()
  • AOT concern (non-generated Regex in MinifyCss): fixed with [GeneratedRegex] under #if NET and compiled Regex under #else (the correct split for net8.0+ vs netstandard2.0)
  • JS Uint8Array.from closure-per-byte overhead: fixed by replacing it with a manual for loop

Implementation Quality Assessment

The final implementation is clean and well-structured:

  • C# side: Streaming JsonSerializer.Serialize directly into GZipStream, GetBuffer() + explicit length for zero-copy base64, and static readonly MinifiedCss are all solid allocation wins.
  • Regex handling: The #if NET / #else split is the idiomatic approach for dual net8+ / netstandard2.0 targets — source-generated on AOT-compatible targets, compiled on the legacy target. Fully correct.
  • JavaScript side: async IIFE with .catch(), DecompressionStream availability guard, three clearly-separated branches, and the manual byte loop — all good defensive practices.
  • CompressionLevel.Optimal: Fine for a one-shot report write. (SmallestSize remains an optional micro-improvement but Optimal is entirely reasonable.)

One Optional Suggestion (Non-blocking)

The data-compressed attribute is read twice, once in each branch condition, via raw.getAttribute('data-compressed'). A one-line hoist would make the intent slightly clearer and avoid the repeated DOM access:

const compressed = raw.getAttribute('data-compressed');
if (compressed === 'gzip' && typeof DecompressionStream !== 'undefined') {
    // ...
} else if (compressed === 'gzip') {
    console.error('TUnit: This browser does not support DecompressionStream. Report data cannot be decoded.');
    return;
} else {
    data = JSON.parse(raw.textContent);
}

This is cosmetic — the DOM attribute read is trivially cheap — but it removes the duplication and makes it obvious that both branches are guarding against the same condition.


Overall the PR is in excellent shape and ready to merge. The size reduction for large test suites (~80–90% for the JSON payload) is a meaningful improvement, and the implementation respects the project's AOT, allocation, and dual-target constraints throughout.

- Read data-compressed attribute once into a local variable
- Generalize the error branch to handle any unsupported compression value
- Remove ! from CSS separators regex (not a structural separator)

@claude claude bot left a comment


Follow-up Review (after e10961f)

All items from prior review rounds are resolved. This is the final state summary.

All Previous Items: Resolved

  • Fallback JS path bug (base64 passed to JSON.parse): fixed with a three-branch structure, console.error, and an early return
  • TryGetBuffer return value ignored: fixed by switching to GetBuffer()
  • AOT concern (non-generated Regex in MinifyCss): fixed with [GeneratedRegex] under #if NET and compiled Regex under #else
  • Uint8Array.from closure-per-byte overhead: fixed with a manual for loop
  • getAttribute called twice in JS: fixed; the compression variable is extracted and reused

Correctness fix in this commit worth calling out

Removing ! from the CSS separator regex (the class is now [{}:;,>~+]) is the right call. The earlier suggestion to include it was incorrect: stripping whitespace around ! would turn color: red !important into color: red!important, which parses to the same value but only by accident; more critically it would break constructs like @supports (display: flex) and not (...). Good catch.

One minor note (non-blocking)

else if (compression) is truthy for any non-null/empty value of data-compressed, not just 'gzip'. That's fine today (only 'gzip' is ever set), but if a future developer adds a second compression type (e.g. 'zstd'), they'd hit this error branch rather than a decode path. else if (compression === 'gzip') would be marginally more defensive. Not worth a code change on its own — just worth keeping in mind when extending.


The PR is in excellent shape and ready to merge. The GZip + base64 approach, direct-to-stream JSON serialization, cached minified CSS, and the #if NET / #else Regex split all respect the project's AOT, allocation, and dual-target constraints. Solid work.

Report generation is a one-shot write so the extra CPU time is negligible.
Falls back to Optimal on netstandard2.0 where SmallestSize is unavailable.
