Complete the WebSocket close handshake in webSocketClose #393
Merged
When a client initiated a clean close, PartyServer's `webSocketClose` forwarded to user `onClose` but never reciprocated the peer's Close frame, leaving the client in CLOSING until it timed out and reported 1006 (abnormal closure).

The Hibernation API contract requires the application to reciprocate on every compat date; the standard `accept()` API requires it on compat dates before 2026-04-07 (where the runtime's `web_socket_auto_reply_to_close` flag isn't yet active).

Both the hibernating and non-hibernating paths now reciprocate via a shared `closeQuietly` helper in a `finally` block, after `onClose` has run (and even when `onClose` throws synchronously or asynchronously). Reserved close codes (1005, 1006, 1015) are normalized to 1000 so the reciprocation can never throw `InvalidAccessError`. Calling `close()` on an already-closed socket is a silent no-op, so user code that already calls `connection.close()` from `onClose` is unaffected.

Adds 10 regression tests (5 hibernating + 4 non-hibernating + 1 cross-cutting) covering the headline #389 repro, peer code/reason delivery to `onClose`, throwing-`onClose` recovery, idempotent reciprocation when user code closes from `onClose`, and reserved-code normalization. The tests fail loudly without the fix with a clear "server-side WebSocket never reciprocated the peer's Close frame" timeout message.

Fixes #389

Made-with: Cursor
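The reciprocation described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not PartyServer's actual source: `ClosableSocket`, `normalizeCloseCode`, and `handlePeerClose` are hypothetical names introduced here; only `closeQuietly`, `onClose`, and the reserved codes come from the description.

```typescript
// Minimal structural type so the sketch runs without a Workers runtime (assumption).
interface ClosableSocket {
  close(code?: number, reason?: string): void;
}

// Codes that can appear in a close event but must never be sent in a Close frame.
const RESERVED_CLOSE_CODES = new Set<number>([1005, 1006, 1015]);

// Hypothetical helper: map reserved codes to 1000 (normal closure).
function normalizeCloseCode(code: number): number {
  return RESERVED_CLOSE_CODES.has(code) ? 1000 : code;
}

// Reciprocate the peer's Close frame without ever throwing.
function closeQuietly(ws: ClosableSocket, code: number, reason: string): void {
  try {
    ws.close(normalizeCloseCode(code), reason);
  } catch {
    // Swallow errors (for example an oversize reason); the socket is closing anyway.
  }
}

// Hypothetical stand-in for the framework's close path: run the user's
// handler first, then reciprocate even if the handler throws or rejects.
async function handlePeerClose(
  ws: ClosableSocket,
  code: number,
  reason: string,
  onClose: (code: number, reason: string) => void | Promise<void>
): Promise<void> {
  try {
    await onClose(code, reason);
  } finally {
    closeQuietly(ws, code, reason);
  }
}
```

Because the reciprocation sits in `finally`, an error from `onClose` still propagates to the caller, but only after the Close frame has been sent back.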
🦋 Changeset detected

Latest commit: 5fe2f55

The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
threepointone added a commit that referenced this pull request on Apr 28, 2026
Followup to #393.

The close handshake fix that landed in 0.5.4 normalized the reserved synthetic codes (1005 NoStatusReceived, 1006 AbnormalClosure, 1015 TLSHandshake) to 1000 and reciprocated anyway, on the theory that calling `ws.close(...)` was always safe. In practice it isn't: those codes are precisely the runtime's signal that there was no real Close frame from the peer; the underlying transport is already gone. The reciprocating `ws.close(...)` succeeds synchronously but schedules an outbound write on a dead transport, which the runtime later rejects asynchronously with `Network connection lost` / `WebSocket peer disconnected`. That rejection can't be observed from `closeQuietly`'s synchronous try/catch (the call returns void, so there is no Promise to attach a `.catch` to), and it surfaces as an unhandled promise rejection in tests and production logs.

Surfaced by Cloudflare Agents sub-agent routing, where the WebSocket pair is tunneled across Durable Object RPC boundaries: the runtime delivered `webSocketClose(ws, 1005, "", true)` reliably, our reciprocation tried to write a Close frame back through an already-severed RPC link, and vitest reported `Errors 11` on an otherwise-passing 21-test suite.

The fix is to recognize the reserved-code shape and skip the reciprocation entirely. There is no peer left to acknowledge. Both the hibernating `webSocketClose` path and the non-hibernating `#attachSocketEventHandlers` close listener pick this up via the shared `closeQuietly` helper.

Adds one regression test under `Close handshake (#389) > hibernating` that exercises a client `ws.close()` with no code (the cleanest way to drive a code-1005 arrival on the server in the in-process test runner) and asserts that `onClose` still runs with the reserved code while the framework no longer attempts a reciprocation.

User-visible behavior change: a client that calls `ws.close()` with no code on a server running a compatibility date `< 2026-04-07` (where the runtime's `web_socket_auto_reply_to_close` flag isn't yet active) will now observe a non-clean close instead of the previously fabricated 1000 reciprocation. Clients that pass an explicit close code, and any client on compatibility dates `>= 2026-04-07` (where auto-reply does the work), are unaffected.

Verification:
- `npm run check:test -w partyserver`: 73/73 pass.
- `npm run check:type`, `check:lint`, `check:format`: clean.
- Repro suite in cloudflare/agents (examples/assistant): 21/21 pass with 0 errors. Was 21/21 + 11 unhandled rejection errors before this fix.

Made-with: Cursor
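The revised behavior from this follow-up commit might look roughly like the sketch below. It is an illustration only, assuming a structural `ClosableSocket` type and a boolean return value that are both hypothetical and added here so the outcome is observable; the real helper's signature may differ.

```typescript
// Minimal structural type so the sketch runs without a Workers runtime (assumption).
interface ClosableSocket {
  close(code?: number, reason?: string): void;
}

// Reserved codes signal that no real Close frame arrived from the peer.
const RESERVED_CLOSE_CODES = new Set<number>([1005, 1006, 1015]);

// Returns true when a reciprocating close was attempted, false when skipped.
// The return value is hypothetical, added only for illustration.
function closeQuietly(ws: ClosableSocket, code: number, reason: string): boolean {
  if (RESERVED_CLOSE_CODES.has(code)) {
    // Nothing to acknowledge: writing a Close frame here is what produced
    // the asynchronous, uncatchable rejection described above.
    return false;
  }
  try {
    ws.close(code, reason);
  } catch {
    // Swallow synchronous errors; the socket is going away regardless.
  }
  return true;
}
```

Skipping the call entirely, rather than catching its failure, matters because the failure is asynchronous and never reaches the `try/catch`.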
Summary

Fixes #389: clients that initiate a clean WebSocket close get stuck in `CLOSING` and time out with `1006` (abnormal closure) because PartyServer's `webSocketClose` never reciprocates the peer's Close frame.

The Hibernation API documents this contract explicitly: the application is responsible for reciprocating the peer's Close frame. PartyServer's wrapper was forwarding to user `onClose` but never calling `ws.close()`, so every clean client-initiated disconnect ended up dangling.

The non-hibernating path (`#attachSocketEventHandlers`) had the same omission. It was masked on compat dates `>= 2026-04-07` by the runtime's `web_socket_auto_reply_to_close` flag, but broken on older compat dates.

Changes

- `packages/partyserver/src/index.ts`
  - `closeQuietly(ws, code, reason)` helper that wraps `ws.close()` in a `try/catch` and normalizes reserved close codes (`1005`, `1006`, `1015` → `1000`) so the framework's reciprocation can never throw `InvalidAccessError`.
  - `Server.webSocketClose` now `await`s `onClose` and reciprocates via `closeQuietly` in a `finally`. The headline fix.
  - `#attachSocketEventHandlers`'s close listener restructured to handle both synchronous and asynchronous `onClose` throws, then unconditionally reciprocate. On compat dates `>= 2026-04-07` this is a silent no-op (runtime already auto-replied); on older compat dates it's the only way the client gets a clean close.
- `packages/partyserver/src/tests/`: 10 new tests under a `Close handshake (#389)` describe block, plus 6 fixture DOs (hibernating + in-memory variants of three scenarios):
  - `onClose` receives the peer's code/reason/wasClean
  - `onClose` still completes the handshake when it throws (regression guard for sync vs async throws)
  - `connection.close()` from `onClose` is idempotent with framework reciprocation

Why this is safe

- `close()` on an already-closed socket is a documented silent no-op (changelog), so the new reciprocation doesn't conflict with user code that already calls `connection.close()` from `onClose`.
- `closeQuietly`'s `try/catch` covers the remaining edge cases (oversize `reason`, future runtime invariants).
- Downstream packages (`y-partyserver`, `partysub`, `partysync`, `partywhen`) extend `Server` but don't override `webSocketClose` / `#attachSocketEventHandlers`, so they pick up the fix transparently. y-partyserver's existing awareness cleanup in `onClose` benefits directly: `onClose` is now `await`ed before reciprocation, so awareness state is fully cleaned up before the handshake completes.

Verification

The tests fail loudly without the fix, verified by stashing the source change and re-running the suite: 7 of the 10 new tests fail without the fix; all 10 pass with it.

Test plan

- `npm run check:test`: 72/72 partyserver tests pass (62 existing + 10 new)
- `npm run check:type`: all packages typecheck
- `npm run check:format`: clean
- `npm run check:lint`: 0 warnings, 0 errors
- `y-partyserver` test suite still passes (23/23)
- `pnpm client` (no `?ack=1` workaround required)

Made with Cursor
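The close-listener restructuring listed under Changes has to cope with a handler that throws synchronously as well as one that returns a rejected Promise, since an event listener is not awaited by its caller. One way to do that is sketched below. `makeCloseListener`, `reportError`, and `ClosableSocket` are hypothetical names, and swallowing the error after reporting it is an assumption, not necessarily what PartyServer does.

```typescript
// Minimal structural type so the sketch runs without a Workers runtime (assumption).
interface ClosableSocket {
  close(code?: number, reason?: string): void;
}

type OnClose = (code: number, reason: string, wasClean: boolean) => void | Promise<void>;

// Hypothetical factory for a close listener that always reciprocates.
function makeCloseListener(
  ws: ClosableSocket,
  onClose: OnClose,
  reportError: (err: unknown) => void
): (code: number, reason: string, wasClean: boolean) => Promise<void> {
  return (code, reason, wasClean) => {
    const reciprocate = (): void => {
      try {
        ws.close(code, reason);
      } catch {
        // Ignore: the socket may already be closed.
      }
    };
    // Calling onClose inside the Promise executor turns a synchronous throw
    // into a rejection, so both failure modes flow through the same chain.
    return new Promise<void>((resolve) => resolve(onClose(code, reason, wasClean)))
      .catch(reportError)
      .finally(reciprocate);
  };
}
```

The returned Promise is there so callers and tests can wait for the reciprocation; a real event listener would typically ignore it.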