Add WebSocket disconnect recovery and slow RPC toast UX#1730

Merged
juliusmarminge merged 10 commits into main from t3code/websocket-error-toast
Apr 5, 2026

Conversation

@juliusmarminge
Member

@juliusmarminge juliusmarminge commented Apr 4, 2026

Summary

  • Adds a WebSocket connection surface that blocks the app while the initial connection is unavailable and shows reconnect/offline/exhausted-retry states.
  • Introduces automatic reconnect coordination on browser online/focus events, plus a manual retry action and recovery toast when the connection returns.
  • Tracks slow RPC acknowledgements and surfaces a warning toast when requests exceed the ack threshold.
  • Sanitizes transport-level WebSocket errors so they do not leak into thread-level error surfaces.
  • Tightens toast rendering for long messages and hides the error copy button for connection-related toasts.
  • Adds unit coverage for reconnect decisions, slow request tracking, transport error sanitization, and WebSocket transport behavior.

Testing

  • Not run: bun fmt
  • Not run: bun lint
  • Not run: bun typecheck
  • Not run: bun run test

Note

Medium Risk
Touches core WebSocket transport/protocol and subscription recovery logic; regressions could impact connectivity, retries, and event replay during disconnects, though changes are mostly additive with tests.

Overview
Improves WebSocket resiliency and UX by adding a blocking WebSocketConnectionSurface during initial connection failures plus coordinators that auto-reconnect on browser online/focus, show reconnect/offline/exhausted/recovered toasts, and allow manual retry.
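
As a rough illustration of the reconnect decision described above, the check can be written as a pure function that is easy to unit test. This is only a sketch under stated assumptions: `ReconnectTrigger`, `ConnectionSnapshot`, and `shouldReconnect` are hypothetical names, and the rule that a manual retry bypasses the browser's offline flag is an assumption, not something the PR description confirms.

```typescript
// Hypothetical names throughout; the PR description does not show the coordinator's internals.
type ReconnectTrigger = "online" | "focus" | "manual";

interface ConnectionSnapshot {
  connected: boolean; // a session is currently open
  attemptInFlight: boolean; // a reconnect attempt is already running
  browserOnline: boolean; // last known browser online/offline state
}

/** Decides whether a trigger should start a new reconnect attempt. */
function shouldReconnect(trigger: ReconnectTrigger, snapshot: ConnectionSnapshot): boolean {
  // Nothing to do when already connected, and never stack concurrent attempts.
  if (snapshot.connected || snapshot.attemptInFlight) return false;
  // Assumption: an explicit user retry is honored even if the browser reports offline.
  if (trigger === "manual") return true;
  // Automatic triggers (online/focus) only fire while the browser reports online.
  return snapshot.browserOnline;
}
```

Keeping the decision free of DOM access means the event listeners only need to build a snapshot and call the function.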

Adds global WebSocket connection state tracking with exponential backoff and a WsTransport.reconnect() path that swaps sessions without disposing the transport; stream subscriptions now support onResubscribe callbacks and orchestration recovery triggers replay on resubscribe.
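
The exponential backoff mentioned above might look roughly like the following. This is a minimal sketch, not the PR's code: the constant `WS_RECONNECT_MAX_RETRIES` is named in the PR's review summary, but its value here, along with `BASE_DELAY_MS`, `MAX_DELAY_MS`, and `nextReconnectDelayMs`, are assumptions chosen for illustration.

```typescript
// Assumed values: the PR names WS_RECONNECT_MAX_RETRIES but does not state what it is set to.
const WS_RECONNECT_MAX_RETRIES = 5;
const BASE_DELAY_MS = 1000; // hypothetical first-retry delay
const MAX_DELAY_MS = 30000; // hypothetical ceiling on any single delay

/**
 * Returns the delay before the given 1-based retry attempt,
 * or null when the attempt number falls outside the retry budget.
 */
function nextReconnectDelayMs(attempt: number): number | null {
  if (attempt < 1 || attempt > WS_RECONNECT_MAX_RETRIES) return null;
  // Double the delay on every attempt, clamped to the ceiling.
  return Math.min(BASE_DELAY_MS * Math.pow(2, attempt - 1), MAX_DELAY_MS);
}
```

A `null` result would correspond to the exhausted-retry state, at which point only a manual retry or a browser event restarts the cycle.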

Introduces slow RPC acknowledgment tracking and a persistent warning toast when unary RPCs exceed a threshold, and sanitizes transport-level connection errors out of thread error surfaces (store + ChatView) while allowing toasts to suppress the error copy button and improving long-message wrapping.
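
Slow acknowledgment tracking of this kind can be sketched as a map from request ID to start time. The 2.5s threshold comes from this PR's commit notes; the class name `RequestLatencyTracker` and its methods are hypothetical and are not taken from `requestLatencyState.ts`.

```typescript
// 2.5s, per the commit note that lowered the threshold from 15s.
const SLOW_RPC_ACK_THRESHOLD_MS = 2500;

/** Hypothetical tracker for outbound requests that have not been acknowledged yet. */
class RequestLatencyTracker {
  private readonly pending = new Map<string, number>();

  /** Records when a request was sent. */
  track(requestId: string, startedAtMs: number): void {
    this.pending.set(requestId, startedAtMs);
  }

  /** Removes a request once its response arrives. */
  acknowledge(requestId: string): void {
    this.pending.delete(requestId);
  }

  /** Drops every pending request, e.g. after a transport error. */
  clearAll(): void {
    this.pending.clear();
  }

  /** Lists requests that have waited longer than the threshold. */
  slowRequestIds(nowMs: number): string[] {
    const slow: string[] = [];
    this.pending.forEach((startedAtMs, requestId) => {
      if (nowMs - startedAtMs > SLOW_RPC_ACK_THRESHOLD_MS) slow.push(requestId);
    });
    return slow;
  }
}
```

Passing the clock in as `nowMs` rather than calling `Date.now()` inside keeps the tracker deterministic under test; a toast coordinator could show the warning whenever `slowRequestIds` is non-empty.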

Reviewed by Cursor Bugbot for commit 9782434.

Note

Add WebSocket disconnect recovery UI and slow RPC acknowledgment warning toasts

  • Adds WebSocketConnectionCoordinator to manage automatic reconnection on browser online/focus events, with exponential backoff capped at WS_RECONNECT_MAX_RETRIES, and lifecycle toasts for reconnecting, offline, exhausted, and recovered states.
  • Adds WebSocketConnectionSurface to gate the main layout behind server config availability, showing a blocking status screen until connected.
  • Adds SlowRpcAckToastCoordinator to display a persistent warning toast when RPC requests exceed SLOW_RPC_ACK_THRESHOLD_MS without acknowledgment.
  • Tracks outbound RPC request latency in requestLatencyState.ts; acknowledges requests on response and clears all on transport error.
  • Adds WsTransport.reconnect() for explicit session replacement without disposal, and wires onResubscribe callbacks into stream subscriptions to trigger replay recovery after reconnect.
  • Sanitizes transport-layer connection errors (socket close/open, ping timeout) from thread error surfaces via transportError.ts so they no longer appear in the UI.
  • Risk: reconnect backoff state, connection lifecycle atoms, and slow-RPC tracking are new global state; existing tests required updates to the reset/disposal sequencing.
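
The transport-error sanitization in the list above could be approximated as below. This is a sketch under assumptions: the regular expressions and the function names `isTransportConnectionError` and `sanitizeThreadError` are hypothetical, and the actual matching rules in `transportError.ts` are not shown in this PR conversation.

```typescript
// Hypothetical patterns covering the categories the summary names:
// socket close/open failures and ping timeouts.
const TRANSPORT_ERROR_PATTERNS: readonly RegExp[] = [
  /socket.*\b(closed?|open)\b/i,
  /ping timeout/i,
];

/** True when the message looks like a transport-level connection error. */
function isTransportConnectionError(message: string): boolean {
  return TRANSPORT_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}

/** Returns null for transport-level errors so thread-level UI never renders them. */
function sanitizeThreadError(message: string | null): string | null {
  if (message === null) return null;
  return isTransportConnectionError(message) ? null : message;
}
```

Connection problems would then be reported only through the connection surface and toasts, while other thread errors pass through unchanged.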

Macroscope summarized 9782434.

@coderabbitai

coderabbitai bot commented Apr 4, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@github-actions github-actions bot added the size:XXL (1,000+ changed lines, additions + deletions) and vouch:trusted (PR author is trusted by repo permissions or the VOUCHED list) labels Apr 4, 2026
@macroscopeapp
Contributor

macroscopeapp bot commented Apr 4, 2026

Approvability

Verdict: Needs human review

This PR introduces substantial new functionality: WebSocket disconnect recovery with automatic reconnection, slow RPC toast notifications, and blocking connection states. The scope includes new state management (~500 lines), new UI components, modified transport behavior, and changes to error propagation throughout the app. While well-tested and from a trusted author, the feature's breadth and runtime impact warrant human review.

Resolve the replay recovery merge conflict by keeping main's bounded replay retry logic and the PR's reconnect-triggered resubscribe recovery path.

Co-authored-by: codex <codex@users.noreply.github.com>
Contributor

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared a fix for 1 of the 2 issues found in the latest run.

  • ✅ Fixed: Dead conditional returns identical value in both branches
    • Changed the null-nextRetryAt branch to return 'Reconnecting to T3 Server' to differentiate the actively-retrying state from the waiting-for-next-retry state.

Preview (c69586f797)
diff --git a/apps/web/src/components/WebSocketConnectionSurface.tsx b/apps/web/src/components/WebSocketConnectionSurface.tsx
--- a/apps/web/src/components/WebSocketConnectionSurface.tsx
+++ b/apps/web/src/components/WebSocketConnectionSurface.tsx
@@ -60,7 +60,7 @@
 
 function buildReconnectTitle(status: WsConnectionStatus): string {
   if (status.nextRetryAt === null) {
-    return "Disconnected from T3 Server";
+    return "Reconnecting to T3 Server";
   }
 
   return "Disconnected from T3 Server";

Reviewed by Cursor Bugbot for commit 4cd078a.

Dead conditional returns identical value in both branches

Low Severity

In buildReconnectTitle, both branches of the check on status.nextRetryAt return "Disconnected from T3 Server", so the conditional is effectively dead code. This suggests an incomplete implementation in which different titles were intended for different reconnection states.

juliusmarminge and others added 3 commits April 3, 2026 22:28
- Show a blocking connection surface when the socket is unavailable
- Add reconnect/offline toasts and slow RPC ack warnings
- Hide transport errors from thread-level error displays
- Drop the slow-ack warning threshold from 15s to 2.5s
- Extend the websocket transport test timeout to avoid flake
Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge juliusmarminge force-pushed the t3code/websocket-error-toast branch from 4cd078a to 6e70a96 Compare April 4, 2026 05:32
@hwanseoc
Copy link
Copy Markdown
Contributor

hwanseoc commented Apr 4, 2026

It'd definitely help if we add UI that shows websocket errors!
Thanks @juliusmarminge!

juliusmarminge and others added 2 commits April 4, 2026 13:13
Co-authored-by: codex <codex@users.noreply.github.com>
Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge juliusmarminge merged commit f2cd53f into main Apr 5, 2026
12 checks passed
@juliusmarminge juliusmarminge deleted the t3code/websocket-error-toast branch April 5, 2026 00:12
aaditagrawal pushed a commit to aaditagrawal/t3code that referenced this pull request Apr 5, 2026
Co-authored-by: Hwanseo Choi <hwanseoc@nvidia.com>
Co-authored-by: codex <codex@users.noreply.github.com>
aaditagrawal added a commit to aaditagrawal/t3code that referenced this pull request Apr 5, 2026
…nnect-recovery

Merge upstream: Add WebSocket disconnect recovery and slow RPC toast UX (pingdotgg#1730)
gigq pushed a commit to gigq/t3code that referenced this pull request Apr 6, 2026
Co-authored-by: Hwanseo Choi <hwanseoc@nvidia.com>
Co-authored-by: codex <codex@users.noreply.github.com>
Chrono-byte pushed a commit to Chrono-byte/t3code that referenced this pull request Apr 7, 2026
Co-authored-by: Hwanseo Choi <hwanseoc@nvidia.com>
Co-authored-by: codex <codex@users.noreply.github.com>