feat(bench): add WASM benchmark harness and native allocation profiler #3153

Open
k4sperski wants to merge 13 commits into Automattic:master from k4sperski:wasm-perf-benchmarks-harper


@k4sperski k4sperski commented Apr 11, 2026

Issues

Follows up on #3025.

Description

Adds benchmarking infrastructure to measure spell-check performance in WASM and to track allocation overhead, two dimensions that the native Criterion benchmarks don't cover. fuzzy_match is cheap on modern machines, but on lower-end devices WASM is slower and allocation is costlier; these tools make that overhead visible.

Validated against #3025 (is_already_lower):

| Metric | Lowercase | Mixed | Capitalised |
| --- | --- | --- | --- |
| WASM time reduction | 49.2% | 43.4% | ~0% |
| Allocation churn reduction | 44.7% | 38.9% | ~0% |

WASM time reductions track #3025's native Criterion numbers, so the optimisation carries through the WASM pipeline. Capitalised is flat because is_already_lower doesn't trigger on uppercase starts.

What's added

  • Native allocation profiler (harper-core/examples/alloc_profile.rs): counts allocs per word in fuzzy_match via a #[global_allocator]. No extra deps. Run with just alloc-profile.
  • WASM bench wrapper (harper-wasm/src/bench.rs): feature-gated PreparedWords struct exposing fuzzy_match to JS. Constructor parses the word list and caches an Arc<FstDictionary>; the method just runs the spell-check loop. cfg(feature = "bench") keeps it out of production builds.
  • JS bench harness (harper-wasm/benches/wasm_bench.js): Node script timing fuzzy_match under WASM to verify native perf wins carry through. Run with just bench-wasm.
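As a rough illustration of the `#[global_allocator]` technique the profiler bullet describes, the sketch below wraps the system allocator and counts allocation calls around a closure. It is not the code in alloc_profile.rs: `CountingAlloc`, `count_allocs`, and `lowercase_copy` are hypothetical names, and `lowercase_copy` merely stands in for the function under test.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts allocations.
struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
static ALLOC_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // The default `realloc` and `alloc_zeroed` call `alloc`,
        // so they are counted here as well.
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        ALLOC_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `work` and returns (allocation calls, bytes requested) made while it ran.
/// Counters are process-wide, so other threads can inflate the numbers.
fn count_allocs<T>(work: impl FnOnce() -> T) -> (usize, usize) {
    let calls_before = ALLOC_CALLS.load(Ordering::Relaxed);
    let bytes_before = ALLOC_BYTES.load(Ordering::Relaxed);
    let _result = work();
    let calls = ALLOC_CALLS.load(Ordering::Relaxed) - calls_before;
    let bytes = ALLOC_BYTES.load(Ordering::Relaxed) - bytes_before;
    (calls, bytes)
}

/// Hypothetical stand-in for the function being profiled.
fn lowercase_copy(word: &str) -> String {
    word.to_lowercase()
}

fn main() {
    let words = ["Hello", "world", "WASM"];
    let (calls, bytes) = count_allocs(|| {
        words.iter().map(|w| lowercase_copy(w)).collect::<Vec<_>>()
    });
    println!(
        "{calls} allocs, {bytes} bytes, {:.1} allocs/word",
        calls as f64 / words.len() as f64
    );
}
```

Because only the standard library is involved, an approach like this needs no extra dependencies, which matches the "No extra deps" note above.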

Measurement hygiene

  • PreparedWords parses inputs and caches the dictionary once, outside bench(), so the timed region only contains fuzzy_match work, matching the native criterion pattern.
  • The alloc profiler warms the AUTOMATON_BUILDERS thread_local before measuring, so the first case doesn't absorb its one-time lazy-init cost.
  • MAX_EDIT_DISTANCE = 3 and MAX_RESULTS = 200 are duplicated across the alloc profiler, the criterion bench, and the JS harness; a comment flags the invariant.
  • just bench-wasm builds into pkg-bench/, so the shipping pkg/ stays untouched.
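The first two points can be sketched in plain Rust as follows. This is only an illustration of the pattern, not the code in bench.rs: `Prepared`, `lookup`, and `SCRATCH` are hypothetical, a `HashSet` stands in for the real dictionary, and `SCRATCH` stands in for a lazily initialised thread-local such as AUTOMATON_BUILDERS.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::sync::Arc;
use std::time::Instant;

thread_local! {
    // Hypothetical per-thread scratch buffer; initialised lazily on first use.
    static SCRATCH: RefCell<Vec<u8>> = RefCell::new(Vec::with_capacity(1024));
}

/// Hypothetical holder for everything that should be built outside the timed region.
struct Prepared {
    words: Vec<String>,
    dict: Arc<HashSet<String>>,
}

impl Prepared {
    /// Setup: parse the word list and keep a shared handle to the dictionary.
    fn new(raw_words: &str, dict: Arc<HashSet<String>>) -> Self {
        let words = raw_words.split_whitespace().map(str::to_owned).collect();
        Self { words, dict }
    }

    /// Timed work: only the lookup loop, no parsing and no dictionary construction.
    fn run(&self) -> usize {
        self.words.iter().filter(|w| lookup(&self.dict, w)).count()
    }
}

/// ASCII-lowercases `word` into the thread-local buffer, then checks the dictionary.
fn lookup(dict: &HashSet<String>, word: &str) -> bool {
    SCRATCH.with(|scratch| {
        let mut buf = scratch.borrow_mut();
        buf.clear();
        buf.extend(word.bytes().map(|b| b.to_ascii_lowercase()));
        let found = std::str::from_utf8(&buf).map_or(false, |key| dict.contains(key));
        found
    })
}

fn main() {
    let dict: Arc<HashSet<String>> =
        Arc::new(["hello", "world"].iter().map(|s| s.to_string()).collect());
    let prepared = Prepared::new("Hello there WORLD", Arc::clone(&dict));

    // Warm-up pass: triggers the thread-local's one-time initialisation.
    let _ = prepared.run();

    let start = Instant::now();
    let hits = prepared.run();
    println!("{hits} hits in {:?}", start.elapsed());
}
```

Running one untimed pass first means the measured pass does not pay for the thread-local's initial allocation, which is the effect the warm-up bullet describes.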

WASM Time: baseline vs optimised

xychart-beta
    title "WASM fuzzy_match: ms/iter min (lower is better)"
    x-axis ["mixed", "lowercase", "capitalized"]
    y-axis "ms/iter" 0 --> 100
    bar [96, 84, 80]
    bar [54, 43, 80]

Methodology

  • Baseline is the build with the early-exit guard removed; optimised is the build with the guard restored. Numbers are the minimum of 50 iterations (after 5 warmup iterations), measured on the default just bench-wasm build (wasm-opt on), with the same word lists and parameters as the native Criterion bench.
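The measurement scheme in this bullet (discard warmup runs, then keep the minimum of the timed runs) can be sketched in Rust as below. The actual harness is a Node script, so this is only an analogue; `min_time` and `reduction_percent` are hypothetical helpers, and the workload in `main` is a placeholder.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `work` `warmup` times untimed, then `iters` times timed,
/// and returns the fastest timed run.
fn min_time(warmup: usize, iters: usize, mut work: impl FnMut()) -> Duration {
    for _ in 0..warmup {
        work();
    }
    (0..iters)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .min()
        .unwrap_or(Duration::ZERO)
}

/// Percentage by which `optimised` improves on `baseline`.
fn reduction_percent(baseline: f64, optimised: f64) -> f64 {
    (baseline - optimised) / baseline * 100.0
}

fn main() {
    // Placeholder workload; the real harness would call into the WASM module here.
    let best = min_time(5, 50, || {
        black_box((0..10_000u64).sum::<u64>());
    });
    println!("min over 50 iterations: {best:?}");

    // Rounded bar heights from the chart above (mixed: 96 ms -> 54 ms).
    println!("mixed reduction: {:.2}%", reduction_percent(96.0, 54.0));
}
```

Applied to the chart's rounded bar heights, the formula gives 43.75% for the mixed case, close to the 43.4% in the table; the small gap is presumably because the table was computed from unrounded timings.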

Trade-offs

  • The JS harness isn't in CI, so its value depends on contributors running just bench-wasm before merging perf-sensitive changes. CI integration would be the natural next step if this becomes a recurring gap.
  • Running bench-wasm with the default wasm-opt build adds ~30s per rebuild. DISABLE_WASM_OPT=1 just bench-wasm skips it for faster dev iteration; the numbers that ship in this PR were measured with wasm-opt on.

Follow-ups

  • If the harness graduates to CI, pass MAX_EDIT_DISTANCE / MAX_RESULTS as args so just bench-wasm becomes the single source of truth for the duplicated constants.
  • Extend PreparedWords to cover suggest_correct_spelling and merged-dict lookup when optimisation work warrants it.

Demo

N/A

How Has This Been Tested?

  • cargo build -p harper-wasm: production build unaffected (no bench feature).
  • cargo clippy -p harper-wasm --features bench -- -Dwarnings: clean.
  • just alloc-profile: first case no longer inflated by thread-local init.
  • just bench-wasm: runs end-to-end; PreparedWords.free() called so no wasm-bindgen leaks.

Checklist

  • I have performed a self-review of my own code
  • I have added tests to cover my changes
  • I have considered splitting this into smaller pull requests.

@k4sperski k4sperski changed the title Add WASM benchmark harness and native allocation profiler feat(bench): add WASM benchmark harness and native allocation profiler Apr 11, 2026
@k4sperski k4sperski marked this pull request as draft April 16, 2026 17:48
@k4sperski
Contributor Author

Moving to draft, needs a bit more refinement!

@k4sperski k4sperski marked this pull request as ready for review April 17, 2026 13:09