test(e2e): event-based waits PR-4 — wait-timeout ratchet + flake harness#499
Merged
Wait-timeout ratchet + flake harness. Five tasks: baseline file, enforcer script, flake recipe, wire into check-all, README docs. Per docs/specs/2026-04-27-event-based-waits-design.md §Implementation phasing PR 4. Predecessor: PR-3 (#496).
Snapshot of the current waitForTimeout count in e2e/*.ts at PR-4 land time. Read by scripts/check-wait-timeout-count.sh (next commit). Future PRs that add new waitForTimeout calls fail the ratchet; future PRs that delete calls update this file in the same commit so it monotonically descends to zero by the 2026-09-30 sunset (per spec §Sunset). Per docs/specs/2026-04-27-event-based-waits-design.md §Implementation phasing PR 4.
Reads e2e/.wait-timeout-baseline, recounts waitForTimeout in e2e/*.ts, and fails if the count exceeds the baseline. Below-baseline counts pass with a hint to update the baseline file in the same PR so future runs lock in the improvement. Also enforces the spec's 2026-09-30 sunset cutoff: after that date, any spec under e2e/*.spec.ts that still has an eslint-disable header for no-restricted-syntax fails the gate. This is the forcing function for completing the file-by-file migration tracked at #458. Wired into 'just check-all' in the next commit.
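The behaviour described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the committed script: the function names (count_waits, check_ratchet, check_sunset), the message wording, and passing today's date as an argument are all hypothetical choices made so the logic is easy to exercise.

```shell
#!/usr/bin/env bash
# Sketch only: names and messages are illustrative, not the project's script.
set -euo pipefail

# Count waitForTimeout occurrences in *.ts files under a directory.
# The brace group with `|| true` keeps a zero-match grep (exit status 1)
# from aborting the script under `set -e` plus `pipefail`.
count_waits() {
  local dir="$1"
  { grep -roh --include='*.ts' "waitForTimeout" "$dir" 2>/dev/null || true; } \
    | wc -l | tr -d ' '
}

# Compare the current count with the number stored in the baseline file.
# Returns 0 when current <= baseline, 1 on regression.
check_ratchet() {
  local dir="$1" baseline_file="$2"
  local current baseline
  current=$(count_waits "$dir")
  baseline=$(tr -d '[:space:]' < "$baseline_file")
  if [ "$current" -gt "$baseline" ]; then
    echo "ratchet FAILED: current=$current baseline=$baseline"
    return 1
  fi
  if [ "$current" -lt "$baseline" ]; then
    echo "ratchet ok: current=$current baseline=$baseline (update the baseline file in this PR)"
    return 0
  fi
  echo "ratchet ok: current=$current baseline=$baseline"
}

# Sunset check: strictly after the cutoff, any *.spec.ts file that still has
# an eslint-disable comment for no-restricted-syntax fails the gate.
check_sunset() {
  local dir="$1" today="$2" cutoff="2026-09-30"
  local offenders
  # ISO-8601 dates sort correctly as plain strings.
  if [[ "$today" > "$cutoff" ]]; then
    offenders=$({ grep -rlE --include='*.spec.ts' \
      'eslint-disable.*no-restricted-syntax' "$dir" 2>/dev/null || true; })
    if [ -n "$offenders" ]; then
      echo "sunset FAILED: eslint-disable headers remain in:"
      echo "$offenders"
      return 1
    fi
  fi
  echo "sunset ok"
}

# Hypothetical wiring, run from the repo root:
#   check_ratchet e2e e2e/.wait-timeout-baseline
#   check_sunset  e2e "$(date +%F)"
```

Taking the date as a parameter is a choice made for this sketch so the cutoff branch can be tested without changing the system clock.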
Runs 'npx playwright test' N times in a loop and aggregates pass/fail per run. Default N=5 per docs/specs/2026-04-27-event-based-waits-design.md §Implementation phasing PR 4. Not wired into check-all (heavy + non-deterministic per spec). Opt-in:
  just test-e2e-flake        # 5 runs
  just test-e2e-flake N=10   # 10 runs
Use this when investigating intermittent failures or before merging risky changes to the e2e suite. The wait-timeout ratchet (also added in PR-4) gates static, source-level regressions; this recipe surfaces runtime-only flakes that the ratchet can't see.
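The PR text does not show the recipe body, so the following is only a sketch of the run-N-times-and-aggregate idea. The function name run_flake and the output format are assumptions; the command is passed as arguments so the loop can be tried with any command.

```shell
#!/usr/bin/env bash
# Sketch only: run_flake is a hypothetical helper, not the recipe's real body.
set -uo pipefail  # deliberately no `-e`: a failing run must not abort the loop

# Run a command N times and report pass/fail per run.
# Usage: run_flake <runs> <command> [args...]
run_flake() {
  local runs="$1"; shift
  local pass=0 fail=0 i=1
  while [ "$i" -le "$runs" ]; do
    if "$@"; then
      pass=$((pass + 1))
      echo "run $i/$runs: PASS"
    else
      fail=$((fail + 1))
      echo "run $i/$runs: FAIL"
    fi
    i=$((i + 1))
  done
  echo "summary: $pass passed, $fail failed out of $runs"
  # Exit status is non-zero if any run failed.
  [ "$fail" -eq 0 ]
}

# For the e2e suite the call would look like:
#   run_flake 5 npx playwright test
```

Continuing after a failed run is the point of the loop: a flake that appears once in five runs is only visible if the remaining runs still execute.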
Failures here block PR merge — same precedence as fmt/clippy/test/wasm. The ratchet is fast (single grep + integer compare) so it adds negligible time to the gate. Per docs/specs/2026-04-27-event-based-waits-design.md §CI gate.
Two new sections appended to e2e/README.md:
- 'Wait-timeout ratchet' explains the script, the baseline file, and how to update it on improve / fail on regress, plus the 2026-09-30 sunset cutoff.
- 'Flake harness' documents the just test-e2e-flake recipe and when to use it.
Per docs/specs/2026-04-27-event-based-waits-design.md §Implementation phasing PR 4.
Self-review caught two real bugs.
1. scripts/check-wait-timeout-count.sh — silent abort once specs are
fully migrated. With 'set -euo pipefail' active, the line:
current=$(grep -roh "waitForTimeout" e2e/ --include='*.ts' \
2>/dev/null | wc -l | tr -d ' ')
exits the script the moment grep finds zero matches (grep returns
1 → pipefail propagates → set -e kills with exit 1, no output).
That's the SUCCESS condition the spec's sunset path is driving
toward, so the gate would break the wrong way once migration
completes.
Fix: brace-group with '|| true' so the grep's exit code is
absorbed before piping to wc:
current=$( { grep -roh ... 2>/dev/null || true; } | wc -l | tr -d ' ')
Note the precedence: '||' binds looser than '|', so the naive
'grep ... || true | wc -l' would parse as
'grep ... || (true | wc -l | tr -d ...)' and disconnect the count
from grep entirely. The brace group fixes that.
Verified: zero-match path now reports 'current=0' instead of
silently exiting 1.
2. justfile test-e2e-flake — '@just setup-e2e FEATURES={{FEATURES}}'
is invalid inside a shebang recipe. The '@' is a just
line-attribute meaningful only in plain (non-shebang) recipes; under
'#!/usr/bin/env bash' it's a literal command name, so bash would
fail with 'command not found'. The existing test-e2e-full recipe
already avoids this by calling
'WILLOW_FEATURES="{{FEATURES}}" ./scripts/setup-e2e.sh' inside
its shebang body.
Fix: replace with the same bash-direct invocation.
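The pipefail and precedence behaviour from item 1 can be reproduced in isolation. The sketch below is not taken from the repository: the helper name strict is hypothetical, and grep reads from stdin instead of scanning e2e/ so the example needs no files. Each variant runs in its own bash process so that 'set -e' can abort it without ending the demo.

```shell
#!/usr/bin/env bash
# Sketch only: `strict` is a hypothetical helper for this demo.

# Run snippet $1 in a fresh bash under strict mode, with $2 as stdin.
# Prints the snippet's output, or "aborted" if it exited non-zero.
strict() {
  printf '%s' "$2" | bash -c "set -euo pipefail; $1" || echo "aborted"
}

# Original form: a zero-match grep exits 1, pipefail propagates it, set -e aborts.
original='current=$(grep -o waitForTimeout | wc -l | tr -d " "); echo "current=$current"'

# Naive fix: parses as `grep ... || (true | wc -l | tr -d " ")`.
naive='current=$(grep -o waitForTimeout || true | wc -l | tr -d " "); echo "current=$current"'

# Brace-group fix: grep's status is absorbed before the pipe into wc.
fixed='current=$({ grep -o waitForTimeout || true; } | wc -l | tr -d " "); echo "current=$current"'

two_calls=$'waitForTimeout\nwaitForTimeout\n'

strict "$original" ""          # prints: aborted
strict "$naive" "$two_calls"   # prints the raw matches after "current=", not a count
strict "$fixed" ""             # prints: current=0
strict "$fixed" "$two_calls"   # prints: current=2
```

With matching input the naive form never reaches wc at all, which is the sense in which the count is disconnected from grep.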
Summary
PR-4 — final installment of the event-based-waits initiative (#454, #495, #496). No test migration; pure enforcement infrastructure:
- scripts/check-wait-timeout-count.sh reads e2e/.wait-timeout-baseline (locked at the current count of 53) and fails CI if any PR introduces a new page.waitForTimeout in e2e/*.ts. Below-baseline counts pass with a hint to update the baseline file in the same PR. The script also enforces the spec's 2026-09-30 sunset cutoff: after that date, any spec still carrying an eslint-disable.*no-restricted-syntax header fails the gate, forcing the file-by-file migration tracked at #458 (e2e: migrate remaining specs to event-based waits) to complete.
- just test-e2e-flake N=5 runs the Playwright suite N times in sequence and aggregates pass/fail per run. Default 5, override via N=10 etc. Not wired into check-all (heavy + non-deterministic per spec); opt-in for investigating intermittent failures.
- Both are documented in new sections of e2e/README.md.
The ratchet wires into just check-all after check-no-test-hooks-in-prod.sh so any regression fails the PR gate.
Spec: docs/specs/2026-04-27-event-based-waits-design.md §"Implementation phasing PR 4" + §"CI gate" + §"Sunset". Plan: docs/plans/2026-04-30-event-based-waits-pr4-ratchet-flake-harness.md. Tracking: #458.
Baseline correction during implementation
The plan stated 54 based on a grep -roh "waitForTimeout" e2e/ (no filter), which counted a documentation reference in e2e/README.md. The script correctly filters to *.ts files only. The committed baseline is 53, which is the right number.
Self-review caught two blocking bugs (committed in eeed70f)
- Script silent-aborts at sunset success. current=$(grep ... | wc -l) under set -euo pipefail would silently exit when grep found zero matches (which is the success state the migration is driving toward). When migration completes and zero waitForTimeout calls remain, grep returns 1 → pipefail propagates → set -e kills with no output → CI thinks the ratchet failed. Fix: brace-group the grep with || true so the exit code is absorbed before the pipe, giving current=0 correctly. Naive grep ... || true | wc -l won't work due to || having lower precedence than |.
- @just setup-e2e invalid inside shebang recipe. test-e2e-flake uses #!/usr/bin/env bash, which makes the body one bash script. The @ prefix is a just line-attribute that only works in plain (non-shebang) recipes; under bash it's interpreted as a literal command name. The recipe would fail on first invocation with @just: command not found. Fix: replace with WILLOW_FEATURES="{{FEATURES}}" ./scripts/setup-e2e.sh mirroring the working pattern in the existing test-e2e-full recipe.
Both fixes verified locally:
- Zero-match path now reports current=0 instead of silently exiting 1.
- test-e2e-flake setup step now matches test-e2e-full.
Test plan
- ./scripts/check-wait-timeout-count.sh exits 0 with current=53 baseline=53 / ratchet ok
- Zero-match path: current=0 reported correctly (this exercises the post-sunset success path)
- git diff confirms the two fix-commit edits land where intended
- just check-all FEATURES=test-hooks runs the full gate plus the new ratchet step. The flake recipe is not in check-all per spec; it is opt-in only.
What this completes
The four-PR initiative (spec) ships its full skeleton:
- test-hooks cargo feature: window.__willow pull API + __willowEvent push stream
- Peer wrapper, helpers split, pilot conversion of multi-peer-sync.spec.ts
- data-state lifecycle on 5 animated components, page.clock helper
Remaining work is per-spec migration (5 specs still carry eslint-disable headers: cross-browser-sync, multi-peer-mobile, mobile-actions, permissions, mobile), tracked file-by-file via #458. The ratchet ensures each migration only ever decreases the count.
Out of scope (deferred)
- Flake harness in check-all (explicitly opt-in per spec: heavy + non-deterministic).
- { timeout: 30_000 } overrides: the ESLint rule blocks new waitForTimeout calls already; the 30s overrides are a different convention. Defer until a regression case justifies it.
https://claude.ai/code/session_01AKogx2HEvgHw41aPHyp1Va
Generated by Claude Code