Skip to content

test(e2e): event-based waits PR-4 — wait-timeout ratchet + flake harness#499

Merged
intendednull merged 7 commits into
mainfrom
claude/event-based-waits-pr4
May 1, 2026
Merged

test(e2e): event-based waits PR-4 — wait-timeout ratchet + flake harness#499
intendednull merged 7 commits into
mainfrom
claude/event-based-waits-pr4

Conversation

@intendednull
Copy link
Copy Markdown
Owner

Summary

PR-4 — final installment of the event-based-waits initiative (#454, #495, #496). No test migration; pure enforcement infrastructure:

  1. Wait-timeout ratchet. scripts/check-wait-timeout-count.sh reads e2e/.wait-timeout-baseline (locked at the current count of 53) and fails CI if any PR introduces a new page.waitForTimeout in e2e/*.ts. Below-baseline counts pass with a hint to update the baseline file in the same PR. The script also enforces the spec's 2026-09-30 sunset cutoff — after that date, any spec still carrying an eslint-disable.*no-restricted-syntax header fails the gate, forcing the file-by-file migration tracked at e2e: migrate remaining specs to event-based waits #458 to complete.
  2. Flake harness. just test-e2e-flake N=5 runs the Playwright suite N times in sequence and aggregates pass/fail per run. Default 5, override via N=10 etc. Not wired into check-all (heavy + non-deterministic per spec); opt-in for investigating intermittent failures.
  3. README docs for both, in e2e/README.md.

The ratchet wires into just check-all after check-no-test-hooks-in-prod.sh so any regression fails the PR gate.

Spec: docs/specs/2026-04-27-event-based-waits-design.md §"Implementation phasing PR 4" + §"CI gate" + §"Sunset". Plan: docs/plans/2026-04-30-event-based-waits-pr4-ratchet-flake-harness.md. Tracking: #458.

Baseline correction during implementation

The plan stated 54 based on a grep -roh "waitForTimeout" e2e/ (no filter), which counted a documentation reference in e2e/README.md. The script correctly filters to *.ts files only. The committed baseline is 53 — this is the right number.

Self-review caught two blocking bugs (committed in eeed70f)

  1. Script silent-aborts at sunset success. current=$(grep ... | wc -l) under set -euo pipefail would silently exit when grep found zero matches (which is the success state the migration is driving toward). When migration completes and zero waitForTimeout calls remain, grep returns 1 → pipefail propagates → set -e kills with no output → CI thinks the ratchet failed. Fix: brace-group the grep with || true so the exit code is absorbed before the pipe, giving current=0 correctly. Naive grep ... || true | wc -l won't work due to || having lower precedence than |.

  2. @just setup-e2e invalid inside shebang recipe. test-e2e-flake uses #!/usr/bin/env bash, which makes the body one bash script. The @ prefix is a just line-attribute that only works in plain (non-shebang) recipes — under bash it's interpreted as a literal command name. The recipe would fail on first invocation with @just: command not found. Fix: replace with WILLOW_FEATURES="{{FEATURES}}" ./scripts/setup-e2e.sh mirroring the working pattern in the existing test-e2e-full recipe.

Both fixes verified locally:

  • Zero-match self-test now reports current=0 instead of silently exiting 1.
  • The flake recipe's setup line now matches the same shape as test-e2e-full.

Test plan

  • ./scripts/check-wait-timeout-count.sh exits 0 with current=53 baseline=53 / ratchet ok
  • Regression self-test (baseline=52, current=53): script exits non-zero with the "regressed" error block
  • Below-baseline self-test (baseline=54, current=53): script exits 0 with the "note: ... below baseline" hint
  • Zero-match self-test (using a pattern that doesn't exist): current=0 reported correctly (this exercises the post-sunset success path)
  • git diff confirms the two fix-commit edits land where intended
  • CI: just check-all FEATURES=test-hooks runs the full gate plus the new ratchet step. The flake recipe is not in check-all per spec — opt-in only.

What this completes

The four-PR initiative (spec) ships its full skeleton:

PR Delivered
#454 test-hooks cargo feature: window.__willow pull API + __willowEvent push stream
#495 JS Peer wrapper, helpers split, pilot conversion of multi-peer-sync.spec.ts
#496 data-state lifecycle on 5 animated components, page.clock helper
#this wait-timeout ratchet + flake harness + sunset enforcement

Remaining work is per-spec migration (5 specs still carry eslint-disable headers — cross-browser-sync, multi-peer-mobile, mobile-actions, permissions, mobile), tracked file-by-file via #458. The ratchet ensures each migration only ever decreases the count.

Out of scope (deferred)

  • Migration of any of the 5 remaining specs.
  • Adding the flake harness to check-all (explicitly opt-in per spec — heavy + non-deterministic).
  • A separate ratchet for { timeout: 30_000 } overrides — the ESLint rule blocks new waitForTimeout calls already; the 30s overrides are a different convention. Defer until a regression case justifies it.

https://claude.ai/code/session_01AKogx2HEvgHw41aPHyp1Va


Generated by Claude Code

claude added 7 commits April 30, 2026 00:43
Wait-timeout ratchet + flake harness. Five tasks: baseline file,
enforcer script, flake recipe, wire into check-all, README docs.

Per docs/specs/2026-04-27-event-based-waits-design.md §Implementation
phasing PR 4. Predecessor: PR-3 (#496).
Snapshot of the current count in e2e/*.ts at PR-4 land time. Read by
scripts/check-wait-timeout-count.sh (next commit). Future PRs that add
new waitForTimeout calls fail the ratchet; future PRs that delete
calls update this file in the same commit so it monotonically
descends to zero by the 2026-09-30 sunset (per spec §Sunset).

Per docs/specs/2026-04-27-event-based-waits-design.md §Implementation
phasing PR 4.
Reads e2e/.wait-timeout-baseline, recounts waitForTimeout in e2e/*.ts,
and fails if the count exceeds the baseline. Below-baseline counts
pass with a hint to update the baseline file in the same PR so future
runs lock in the improvement.

Also enforces the spec's 2026-09-30 sunset cutoff: after that date,
any spec under e2e/*.spec.ts that still has an eslint-disable header
for no-restricted-syntax fails the gate. This is the forcing function
for completing the file-by-file migration tracked at #458.

Wired into 'just check-all' in the next commit.
Runs 'npx playwright test' N times in a loop and aggregates pass/fail
per run. Default N=5 per docs/specs/2026-04-27-event-based-waits-design.md
§Implementation phasing PR 4.

Not wired into check-all (heavy + non-deterministic per spec). Opt-in:
  just test-e2e-flake             # 5 runs
  just test-e2e-flake N=10        # 10 runs

Use this when investigating intermittent failures or before merging
risky changes to the e2e suite. The wait-timeout ratchet (also added
in PR-4) gates compile-time regressions; this recipe surfaces
runtime-only flakes that the ratchet can't see.
Failures here block PR merge — same precedence as fmt/clippy/test/wasm.
The ratchet is fast (single grep + integer compare) so it adds
negligible time to the gate.

Per docs/specs/2026-04-27-event-based-waits-design.md §CI gate.
Two new sections appended to e2e/README.md:
- 'Wait-timeout ratchet' explains the script, the baseline file, and
  how to update it on improve / fail on regress, plus the 2026-09-30
  sunset cutoff.
- 'Flake harness' documents the just test-e2e-flake recipe and when
  to use it.

Per docs/specs/2026-04-27-event-based-waits-design.md §Implementation
phasing PR 4.
Self-review caught two real bugs.

1. scripts/check-wait-timeout-count.sh — silent abort once specs are
   fully migrated. With 'set -euo pipefail' active, the line:

     current=$(grep -roh "waitForTimeout" e2e/ --include='*.ts' \
                  2>/dev/null | wc -l | tr -d ' ')

   exits the script the moment grep finds zero matches (grep returns
   1 → pipefail propagates → set -e kills with exit 1, no output).
   That's the SUCCESS condition the spec's sunset path is driving
   toward, so the gate would break the wrong way once migration
   completes.

   Fix: brace-group with '|| true' so the grep's exit code is
   absorbed before piping to wc:

     current=$( { grep -roh ... 2>/dev/null || true; } | wc -l | tr -d ' ')

   Note the precedence: '||' binds looser than '|', so the naive
   'grep ... || true | wc -l' would parse as
   'grep ... || (true | wc -l | tr -d ...)' and disconnect the count
   from grep entirely. The brace group fixes that.

   Verified: zero-match path now reports 'current=0' instead of
   silently exiting 1.

2. justfile test-e2e-flake — '@just setup-e2e FEATURES={{FEATURES}}'
   is invalid inside a shebang recipe. The '@' is a just line-
   attribute meaningful only in plain (non-shebang) recipes; under
   '#!/usr/bin/env bash' it's a literal command name, so bash would
   fail with 'command not found'. Mirrors the bug pattern from
   test-e2e-full (which correctly does
   'WILLOW_FEATURES="{{FEATURES}}" ./scripts/setup-e2e.sh' inside
   its shebang body).

   Fix: replace with the same bash-direct invocation.
@intendednull intendednull merged commit 1cee5c4 into main May 1, 2026
7 checks passed
@intendednull intendednull deleted the claude/event-based-waits-pr4 branch May 1, 2026 22:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants