Skip to content

test(e2e): event-based waits PR-2 — Peer wrapper, helpers split, pilot conversion#495

Merged
intendednull merged 12 commits into
mainfrom
claude/event-based-waits-pr2
Apr 29, 2026
Merged

test(e2e): event-based waits PR-2 — Peer wrapper, helpers split, pilot conversion#495
intendednull merged 12 commits into
mainfrom
claude/event-based-waits-pr2

Conversation

@intendednull
Copy link
Copy Markdown
Owner

Summary

PR-2 of the event-based testing initiative (PR-1 = #454). Adds the JS-side test infrastructure on top of PR-1's window.__willow foundation:

  • e2e/test-hooks.ts — typed Peer wrapper around the WASM pull API + push stream, plus a peer Playwright fixture that lazily wires __willowEvent / __willowOverflow bindings per BrowserContext.
  • e2e/helpers/{peers,ui,touch}.ts — the 703-LOC e2e/helpers.ts split into focused modules. e2e/helpers.ts is now a re-export barrel; the seven un-migrated specs continue to import from ./helpers with zero diff.
  • e2e/helpers.barrel.test.ts — build-time TS coverage so an accidental rename in any focused module fails tsc with TS2305 before Playwright runs.
  • e2e/multi-peer-sync.spec.ts — pilot conversion. Eight { timeout: 30_000 } cross-peer DOM overrides + one expect.poll({ timeout: 30_000 }) are replaced with await peer.waitUntilHeadsEqual(other) followed by default-5s assertions. waitForMessage(..., 30_000) becomes await peer.nextEvent(MessageReceived && !isLocal) then default-timeout waitForMessage.
  • e2e/test-hooks.spec.ts — five smoke tests covering snapshot shape, eventCount growth, nextEvent push wiring, waitUntilHeadsEqual convergence, and the timeout error path.

Spec: docs/specs/2026-04-27-event-based-waits-design.md. Plan: docs/plans/2026-04-29-event-based-waits-pr2-peer-wrapper.md. Tracking: #458.

Deviations from the plan

Two surfaced during execution; both are documented inline in commit bodies.

1. peer() factory is async + lazy-wires its context. The plan's Task 3 fixture wired exposeBinding('__willowEvent') on the fixture's default context, but Task 6's smoke tests + setupTwoPeers(browser) create additional contexts via browser.newContext() that the binding never reached. Fix in 14c0c5f: peer(page, label) is async and idempotently wires the page's BrowserContext on first call per context (tracked via WeakSet<BrowserContext>). Trade-off: events that fire before the first peer() call on a context are lost (no addInitScript buffer); subsequent gossip activity reaches the queue. Documented in the README.

2. waitUntilHeadsEqual matcher target was stale-frozen. The plan's pattern was:

await expect.poll(async () => {
  lastSelf = await this.heads();
  lastOther = await other.heads();
  return canonicalHeads(lastSelf);
}, { timeout }).toBe(canonicalHeads(lastOther));

The .toBe(canonicalHeads(lastOther)) matcher arg is evaluated once at chain construction, when lastOther is still {} (so the target is '[]'). The poll therefore checked "self has no heads", not "self converged with other" — tests would silently mask real convergence failures and only pass when self was empty. Fix in 24765ef: poll a boolean (canonicalHeads(self) === canonicalHeads(other)) and assert .toBe(true). The matcher target is now constant; both sides re-fetch on every tick, so neither freezes.

Also in 24765ef cleanup: removed dead PeerInternals type, consolidated imports at top, moved class block before fixture so PeerFactory's referent is in scope when readers reach the fixture.

Test plan

  • npx tsc --noEmit clean across all e2e/*.ts, e2e/*.spec.ts, e2e/helpers/*.ts, e2e/helpers.barrel.test.ts.
  • npx eslint e2e/ clean (zero warnings, zero errors).
  • Pilot spec (e2e/multi-peer-sync.spec.ts) contains zero waitForTimeout and zero { timeout: 30_000 overrides.
  • Playwright correctly skips e2e/helpers.barrel.test.ts (filename ends .test.ts, default testMatch is *.spec.ts).
  • CI: just check-all FEATURES=test-hooks runs the full Playwright suite (smoke spec + pilot + 7 un-migrated specs) and the symbol-leak guard (scripts/check-no-test-hooks-in-prod.sh).
  • CI: just test-e2e-sync verifies the converted pilot against a live just dev FEATURES=test-hooks stack.

Out of scope (deferred)

https://claude.ai/code/session_01AKogx2HEvgHw41aPHyp1Va


Generated by Claude Code

claude added 12 commits April 29, 2026 07:27
14 tasks (0-13) ship the JS-side test infrastructure on top of PR-1's
window.__willow foundation:

- e2e/test-hooks.ts: Peer wrapper with snapshot/heads/eventCount/lastEvent,
  nextEvent(predicate), waitUntilHeadsEqual + waitUntilAllHeadsEqual,
  plus a Playwright `peer` fixture wiring exposeBinding for
  __willowEvent + __willowOverflow.
- e2e/helpers/{peers,ui,touch}.ts: split of legacy 703-LOC helpers.ts
  behind a re-export barrel so the 7 un-migrated specs work unchanged.
- e2e/helpers.barrel.test.ts: build-time TS coverage of every
  legacy export (TS2305 fails the build before any spec runs).
- e2e/test-hooks.spec.ts: 5 smoke tests covering pull/push/convergence.
- e2e/multi-peer-sync.spec.ts: pilot conversion replacing 8x
  { timeout: 30_000 } cross-peer assertions with peer.waitUntilHeadsEqual
  followed by default 5s assertions; MessageReceived push events gate
  the cross-peer message tests.
- e2e/README.md: documents helpers layout + Peer API + ESLint sunset.

Out of scope (deferred to PR-3/PR-4): data-state lifecycle, page.clock,
migration of the other 7 specs (#458), flake harness ratchet,
removal of magic-number sleeps inside helpers/.
Type-only mirror of crates/web/src/test_hooks/{wire,snapshot}.rs. Keeps
the wire-shape contract co-located with the wrapper that consumes it.
Runtime Peer class lands in subsequent commits.
snapshot / heads / eventCount / lastEvent each round-trip through
window.__willow.* (PR-1's wasm_bindgen surface). Push-API methods land
in the next commit once the per-page event queue is wired.
Per-page event queue keyed via WeakMap<Page, ClientEvent[]>; binding
callback's source.page is the lookup key. Overflow binding fails the
test on any droppedCount > 0 (PR-1 dispatcher only calls it on the
65k-buffer overflow path).

Specs opt in via 'import { test, expect } from "./test-hooks";'
— legacy specs continue using the default '@playwright/test' import.
50ms polling loop; consumes the first matching event from the queue
and leaves non-matches in place. Failure message names the peer and
shows the queue tail to debug 'why didn't my event arrive' cases.
Pairwise CRDT convergence check via expect.poll on canonicalised heads
maps. Failure message appends a per-author-key diff so 'A missing
authors: [x]; B missing authors: []' is visible without a manual
console.log round-trip.

waitUntilAllHeadsEqual fans out N-1 pairwise checks for true N-peer
convergence per the spec's 'Partial-equality footgun' section.
Erratum to Task 3: the original fixture wired __willowEvent on the
fixture's default context only. Tests that create extra contexts via
'browser.newContext()' or setupTwoPeers(browser) — including the smoke
test in this commit — would never see events from those new contexts.

Fix: peer(page, label) is now async and idempotently wires the page's
BrowserContext on first call. Bindings still install before any new
goto on that context, and exposeBinding takes effect on existing pages
too, so the read path recovers events as soon as wiring lands.

Trade-off vs. eager wiring: events that fire on a context BEFORE the
first peer() call on that context are dropped (no buffer initialised).
For setupTwoPeers callers that means the very first SyncCompleted may
be missed; the smoke test relies on subsequent gossip events arriving
within the 5s timeout, which matches normal mesh activity.

Smoke spec covers: snapshot shape, eventCount growth, nextEvent push
wiring, waitUntilHeadsEqual convergence, and the timeout error path.
Three-peer waitUntilAllHeadsEqual is deferred to issue #458 — no
current spec needs it.
Per spec §'Helpers redesign' and PR-2 plan Tasks 7-10. Three focused
modules behind a re-export barrel — legacy specs import-from-helpers
unchanged, new specs import directly from the focused module they need.

Behaviour preserved verbatim. Magic-number sleeps and 30s timeout
overrides stay where they were; PR-2 only deletes them from the pilot
spec (multi-peer-sync.spec.ts) where the Peer wrapper replaces them.
Other 7 specs migrate file-by-file via tracking issue #458.

The barrel drops its eslint-disable header — re-exports contain no
waitForTimeout calls, and a lingering unused-disable warning would
trip CI.
Imports every name used by any un-migrated spec from './helpers' and
references it once. If a name disappears from helpers/{peers,ui,touch}
(e.g. accidental rename during the next migration), tsc fails here
with TS2305 before any Playwright test runs.

Filename ends with .test.ts so Playwright's default testMatch skips
it — this is a build-time TypeScript check, not a runtime spec.
Pilot for PR-2 per docs/specs/2026-04-27-event-based-waits-design.md.

- 8 'toBeVisible({ timeout: 30_000 })' cross-peer assertions replaced
  with 'await peerB.waitUntilHeadsEqual(peerA);' followed by default
  5s assertions.
- 'waitForMessage(page, text, 30_000)' replaced with
  'await peerB.nextEvent(e => MessageReceived && !isLocal);' then a
  default-timeout waitForMessage.
- 'expect.poll(..., { timeout: 30_000 })' on member-list count drops
  the override after convergence.

Signs the test on the Peer fixture from ./test-hooks. The other 7
specs continue to import from @playwright/test directly; they migrate
file-by-file via tracking issue #458.

Acceptance: smoke spec + pilot type-check and lint clean. Live runs
against 'just dev FEATURES=test-hooks' will exercise both at the PR
acceptance gate; the full N=10 flake harness ships in PR-4 alongside
the wait-timeout ratchet baseline.
Three new sections:
- Helpers layout (peers/ui/touch + barrel)
- Event-based waits (Peer pull/push/convergence API + import pattern)
- Anti-patterns blocked by ESLint (waitForTimeout + sunset date)

Code sample documents the lazy-wire async peer() factory introduced
in this PR — peer(page, label) is awaited and wires the binding on
the page's context idempotently, so contexts created via
browser.newContext() or setupTwoPeers(browser) work without per-spec
setup.

Points readers at the design spec + tracking issue #458 for the
remaining-specs migration.
Self-review found the assertion was logically broken:

    await expect.poll(async () => {
        lastSelf = await this.heads();
        lastOther = await other.heads();
        return canonicalHeads(lastSelf);
      }, { timeout })
      .toBe(canonicalHeads(lastOther));

The matcher arg 'canonicalHeads(lastOther)' is evaluated ONCE when the
expect chain is constructed — before any poll runs and while lastOther
is still {} (canonicalHeads({}) === '[]'). The poll therefore checked
canonicalHeads(self) === '[]', not 'self converges with other'. Tests
would only have 'passed' when self had no heads — masking real
convergence failures.

Fix: poll a boolean (canonicalHeads(self) === canonicalHeads(other))
and assert it's true. The matcher target is now constant; both heads
maps are re-fetched on every tick, so neither side freezes.

Also in this cleanup pass:
- Removed dead PeerInternals type (declared but unused).
- Consolidated imports at top of file (test, expect, Page, BrowserContext
  from @playwright/test); 'export { expect }' moved next to its import
  rather than mid-file between fixture and class.
- Class block sits before the fixture now so PeerFactory's referent is
  defined when the fixture's PeerFactory generic is parsed by readers.
- Reworded the Peer doc comment to drop the 'Task 3' plan reference.
@intendednull intendednull merged commit 5cc9efb into main Apr 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants