Fix chat replay recovery during WebSocket reconnect (resubscribe)#1718

Closed
hwanseoc wants to merge 5 commits into pingdotgg:main from hwanseoc:codex/fix-orchestration-reconnect-recovery


@hwanseoc
Contributor

@hwanseoc hwanseoc commented Apr 3, 2026

Why

When the t3code web app is served over an unstable network with frequent reconnects, chat replay recovery could fail to run, leaving the UI stale until a manual refresh.

This change starts replay recovery on WebSocket resubscribe so missed chat/orchestration events are recovered as soon as the stream comes back.

What Changed

This fixes a blind spot in chat replay recovery after WebSocket interruption.

The client already handled sequence gaps and snapshot fallback for orchestration events, but it only recovered missed events when a later domain event arrived and exposed the gap. If the WebSocket dropped, the client missed the last chat/orchestration events, and the thread then went quiet, the UI could stay stale until a manual refresh.

This change adds reconnect-aware replay on orchestration stream re-subscribe so the client asks for missed events as soon as the stream is re-established.
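A minimal sketch of the blind spot described above, assuming events carry a monotonically increasing sequence number. `SketchEvent`, `ReplayTracker`, and `replaySince` are hypothetical names for illustration only, not the project's actual code. A gap is visible only when a later event arrives, so a quiet thread never exposes it; asking for everything after the last applied sequence on resubscribe closes that hole.

```typescript
// Hypothetical types and helpers for illustration; not taken from the PR.
type SketchEvent = { sequence: number; payload: string };

class ReplayTracker {
  private lastApplied = 0;
  readonly applied: SketchEvent[] = [];

  // A gap can only be observed when a newer event actually arrives.
  hasGap(event: SketchEvent): boolean {
    return event.sequence > this.lastApplied + 1;
  }

  // Apply an event, ignoring duplicates that were already applied.
  apply(event: SketchEvent): void {
    if (event.sequence <= this.lastApplied) return;
    this.applied.push(event);
    this.lastApplied = event.sequence;
  }

  // The sequence number of the last applied event.
  get cursor(): number {
    return this.lastApplied;
  }
}

// Stand-in for a server replay request: everything after the cursor.
function replaySince(log: SketchEvent[], cursor: number): SketchEvent[] {
  return log.filter((event) => event.sequence > cursor);
}
```

With this shape, a reconnect handler can call `replaySince(log, tracker.cursor)` immediately instead of waiting for `hasGap` to fire on some future event.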

Checklist

  • This PR is small and focused
  • I explained what changed and why

Note

Medium Risk
Touches the WebSocket subscription/retry path and orchestration recovery flow; incorrect resubscribe signaling could cause redundant replays or missed updates, but changes are scoped and covered by new tests.

Overview
Ensures orchestration/chat state catches up immediately after a WebSocket reconnect by triggering replay recovery on stream resubscribe (new recovery reason "resubscribe").

Adds an optional onResubscribe callback plumbed through contracts (NativeApi), wsNativeApi, WsRpcClient, and WsTransport.subscribe, with WsTransport only firing the hook after the stream has emitted at least one value; updates and extends tests to verify option forwarding and correct hook behavior.
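The "only after the stream has emitted at least one value" rule can be sketched roughly as follows. This is an assumption-laden toy, not the real `WsTransport`: `SketchTransport`, `emit`, and `reconnect` are hypothetical stand-ins that simulate the socket synchronously. The point is that a reconnect before any value was delivered is just an initial connect, so there is nothing to recover yet.

```typescript
// Hypothetical stand-in for a transport; the real WsTransport API may differ.
type SketchListener<T> = (value: T) => void;

interface SketchSubscribeOptions {
  onResubscribe?: () => void;
}

class SketchTransport<T> {
  private listener: SketchListener<T> | undefined;
  private options: SketchSubscribeOptions = {};
  private hasEmitted = false;

  subscribe(listener: SketchListener<T>, options: SketchSubscribeOptions = {}): () => void {
    this.listener = listener;
    this.options = options;
    this.hasEmitted = false;
    // Return an unsubscribe function.
    return () => {
      this.listener = undefined;
    };
  }

  // Simulates the server pushing a value down the stream.
  emit(value: T): void {
    if (!this.listener) return;
    this.hasEmitted = true;
    this.listener(value);
  }

  // Simulates the stream being re-established after a drop.
  reconnect(): void {
    // Skip the hook if nothing was ever delivered or nobody is subscribed.
    if (!this.listener || !this.hasEmitted) return;
    try {
      this.options.onResubscribe?.();
    } catch {
      // Hook errors are swallowed so the stream itself keeps working.
    }
  }
}
```

Keeping the hook optional means existing callers of `subscribe` are unaffected, which matches the additive, backwards-compatible framing of the change.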

Reviewed by Cursor Bugbot for commit 9fbb588. Bugbot is set up for automated code reviews on this repo.

Note

Fix chat replay recovery during WebSocket reconnect by triggering resubscribe recovery

  • Adds onResubscribe callback support to WsTransport.subscribe, invoked on reconnection after at least one value has been received; errors are swallowed so recovery proceeds.
  • Propagates onResubscribe through the RPC client stream methods and up to the NativeApi.orchestration.onDomainEvent contract.
  • In the EventRouter component, the onResubscribe handler flushes pending domain events and triggers replay recovery with the new 'resubscribe' reason, mirroring the existing 'sequence-gap' path.
  • Adds 'resubscribe' as a valid OrchestrationRecoveryReason and refactors sequence-gap recovery into a shared runReplayRecovery(reason) function.
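The flush-then-replay flow in the bullets above might look something like this. It is a simplified sketch under stated assumptions: `createSketchRouter`, `SketchRouterEvent`, `SketchRecoveryReason`, and `fetchSince` are hypothetical, the real `OrchestrationRecoveryReason` may include other members, and the actual replay is presumably asynchronous rather than the synchronous call used here.

```typescript
// Hypothetical router sketch; names and shapes are assumptions, not the PR's code.
type SketchRouterEvent = { sequence: number };
type SketchRecoveryReason = "sequence-gap" | "resubscribe";

function createSketchRouter(fetchSince: (cursor: number) => SketchRouterEvent[]) {
  let cursor = 0;
  let pending: SketchRouterEvent[] = [];
  const reasons: SketchRecoveryReason[] = [];

  // Advance the cursor; older or duplicate events are ignored.
  const apply = (event: SketchRouterEvent): void => {
    if (event.sequence > cursor) cursor = event.sequence;
  };

  // Shared recovery path used by both triggers.
  const runReplayRecovery = (reason: SketchRecoveryReason): void => {
    reasons.push(reason);
    fetchSince(cursor).forEach(apply);
  };

  // Drain queued events, recovering if one of them exposes a gap.
  const flushPending = (): void => {
    const batch = pending;
    pending = [];
    for (const event of batch) {
      if (event.sequence > cursor + 1) runReplayRecovery("sequence-gap");
      apply(event);
    }
  };

  return {
    enqueue: (event: SketchRouterEvent): void => {
      pending.push(event);
    },
    flushPending,
    // On resubscribe: settle what is already queued, then ask for the rest.
    onResubscribe: (): void => {
      flushPending();
      runReplayRecovery("resubscribe");
    },
    getCursor: (): number => cursor,
    getReasons: (): readonly SketchRecoveryReason[] => reasons,
  };
}
```

Flushing before replaying means the replay request starts from the newest cursor the client can establish locally, which keeps the replayed range as small as possible.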

Macroscope summarized 9fbb588.

@coderabbitai

coderabbitai bot commented Apr 3, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@github-actions github-actions bot added size:M 30-99 changed lines (additions + deletions). vouch:trusted PR author is trusted by repo permissions or the VOUCHED list. labels Apr 3, 2026
@macroscopeapp
Contributor

macroscopeapp bot commented Apr 3, 2026

Approvability

Verdict: Approved

Bug fix that triggers existing replay recovery when WebSocket reconnects, ensuring chat events aren't lost. Changes are additive (new callback option), backwards-compatible, and well-tested. The author has recent commit history in the modified files.


hwanseoc and others added 3 commits April 3, 2026 13:55
Resolve the replay recovery merge conflict by keeping main's bounded replay retry logic and the PR's reconnect-triggered resubscribe recovery path.

Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge
Member

picked your commits and extended the scope a bit in #1730 to improve the visibility of the connection state

@hwanseoc hwanseoc deleted the codex/fix-orchestration-reconnect-recovery branch April 10, 2026 23:14