Fix flaky perf tests by replacing wall-clock thresholds with relative comparisons #195
bananabot9000 left a comment
Smart fix for flaky perf tests 🍌
Relative comparisons replace brittle wall-clock thresholds. Tests 1 and 3 now measure both prime (cold) and warm renders, assert warmMs < primeMs / 10. If CI is slow, both scale together -- ratio holds. 10x threshold with typical 1000x+ actual ratio = enormous headroom. No more CI-specific process.env.CI branches.
Test 2 (resize): Flat 50ms threshold. No fast/slow pair to compare, so absolute is the right approach. 10-50x the observed range (1-5ms). Catches O(n^2) regressions without flaking.
Variable naming improved: start/end/actual -> descriptive primeStart/primeMs/warmStart/warmMs.
One minor nit: Test 2's `it` description still says "under 2ms" but the threshold is now 50ms. The description should be updated to match.
Ship it 🚢🍌
The perf tests in `terminal-perf.spec.ts` use `process.env.CI ? X : Y` wall-clock thresholds. These fail sporadically when CI is loaded; the immediate trigger was `expected 5.013728 to be less than 5` (13 microseconds of scheduler noise).
Tests 1 and 3 (append-then-render, keystroke-only) are cache-effectiveness tests: the primed render should be dramatically faster than the cold render. Changed to relative assertions (`warmMs < primeMs / 10`). The actual ratio is typically 1000x+; 10x gives massive headroom. These can never fail due to CI slowness: if the machine is slow, both measurements are slow and the ratio holds.
Test 2 (resize re-wrap) is different: it tests that re-wrapping 10K lines after a column change completes in reasonable time. There's no fast/slow pair to compare, because the resize IS the expensive operation. Changed to a single generous absolute threshold (50ms) with no CI conditional. 10K lines re-wrapping takes ~1–5ms in practice; 50ms catches catastrophic regressions (accidentally O(n²)) without being sensitive to normal CI variance.
No `process.env.CI` checks remain in the test file.
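
For readers who want the shape of the relative assertion without opening the diff, here is a minimal sketch. It is not the code from `terminal-perf.spec.ts`: `CachedRenderer`, `timeMs`, and `isCacheEffective` are hypothetical names invented for illustration, and the renderer is a stand-in that simply caches a string. Only the `warmMs < primeMs / 10` comparison comes from this PR.

```typescript
// Hypothetical stand-in for the real renderer: the first call for a key does
// the expensive work, later calls for the same key return the cached result.
class CachedRenderer {
  private cache = new Map<string, string>();

  render(key: string, lineCount: number): string {
    const hit = this.cache.get(key);
    if (hit !== undefined) return hit;
    let out = "";
    for (let i = 0; i < lineCount; i++) out += `line ${i}\n`;
    this.cache.set(key, out);
    return out;
  }
}

// Wall-clock duration of a single call, in milliseconds.
function timeMs(fn: () => void): number {
  const start = performance.now();
  fn();
  return performance.now() - start;
}

// Relative check from the PR: the warm render must be at least `factor` times
// faster than the priming render. No absolute threshold is involved, so a
// uniformly slow machine scales both sides of the comparison.
function isCacheEffective(primeMs: number, warmMs: number, factor = 10): boolean {
  return warmMs < primeMs / factor;
}

// Assumed workload size, chosen only to make the cold render measurable.
const renderer = new CachedRenderer();
const primeMs = timeMs(() => renderer.render("screen", 200000));
const warmMs = timeMs(() => renderer.render("screen", 200000));
console.log({ primeMs, warmMs, effective: isCacheEffective(primeMs, warmMs) });
```

In a spec file the last three lines would become `expect(warmMs).toBeLessThan(primeMs / 10)` or the equivalent in whatever assertion library the suite uses.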