fix: prevent unhandled rejections on close and detect stdin EOF#1853
fix: prevent unhandled rejections on close and detect stdin EOF#1853EtanHey wants to merge 3 commits intomodelcontextprotocol:mainfrom
Conversation
Two fixes for the "Connection closed" crash (MCP error -32000): 1. Protocol._onclose() now defers pending response handler rejections to the next microtask via Promise.resolve().then(), giving callers time to attach .catch() handlers. This prevents Node.js from triggering unhandled promise rejections when the transport closes unexpectedly while requests are in-flight. 2. StdioServerTransport now listens for the stdin 'end' event (EOF) and triggers a clean close(). Previously, when the client process exited, the server had no way to detect the disconnection, leaving it running indefinitely with pending requests that would never resolve. Together these fixes address the crash reported in modelcontextprotocol#1049 where stdio-based MCP servers would die with "MCP error -32000: Connection closed" unhandled rejections when the client process exits. Also addresses the root cause described in modelcontextprotocol#392 (promise/async handling causes unhandled rejections). Fixes modelcontextprotocol#1049 Refs modelcontextprotocol#392
🦋 Changeset detectedLatest commit: 10c6e6c The changes in this PR will be included in the next version bump. This PR includes changesets to release 6 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
The handler variable using PromiseRejectionEvent was dead code — the
test already uses the Node.js process.on('unhandledRejection') handler.
Removing it fixes the TS2304 type error in CI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note: I noticed PR #1814 also touches stdin EOF detection and |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
travisbreaks
left a comment
There was a problem hiding this comment.
Good fix for a real production issue. The microtask deferral pattern and stdin EOF detection are both needed. Notes:
1. Microtask timing changes observable behavior
The Promise.resolve().then() deferral means pending request rejections now happen asynchronously after _onclose() returns. Any code that previously assumed rejection was synchronous with close will now have a subtle ordering difference. The AbortController.abort() calls in the same finally block still run synchronously, so there's a timing gap between abort and rejection. Worth documenting this in the changeset.
2. Error swallowing in _onend
this.close().catch(() => {
// Ignore errors during close
});Silent error swallowing is understandable on the shutdown path, but routing to _onerror instead of () => {} would preserve observability for server authors who have error handlers configured:
this.close().catch((err) => {
this.onerror?.(err instanceof Error ? err : new Error(String(err)));
});3. Timer-based microtask flush in tests
The setTimeout(resolve, 50) approach works but is technically racy under heavy CI load. await Promise.resolve(); await Promise.resolve(); (chained) would be a more deterministic microtask flush. Minor, but avoids flaky test failures.
4. Missing test for handler-throws path
The try/catch in the microtask routes handler errors to _onerror:
try {
handler(error);
} catch (handlerError) {
this._onerror(handlerError instanceof Error ? handlerError : new Error(String(handlerError)));
}But no test exercises this path (where handler(error) itself throws). A test that installs a throwing handler and verifies _onerror receives the thrown error would complete the coverage.
The 5 tests are well-structured, especially the idempotent close test and the message-processing-before-EOF test. Good work.
|
@claude review |
|
Thanks for this. Closing in favor of #1814 which covers the stdin EOF detection and has been through a review round already so this looks like a dupe. |
There was a problem hiding this comment.
This fixes a real production issue (Node.js 24 unhandled rejections), but the author mentions overlap with PR #1814 which also touches Protocol._onclose() and stdin EOF detection — maintainers should coordinate to avoid duplicate or conflicting behavior before merging.
Extended reasoning...
Overview
This PR makes two targeted changes: (1) defers handler(error) calls in Protocol._onclose() to the next microtask via Promise.resolve().then(), and (2) adds a _onend handler for stdin end events in StdioServerTransport. Both are accompanied by new tests.
Security risks
No security-sensitive code paths are touched. No injection vectors, auth changes, or permission logic involved.
Level of scrutiny
The core change in protocol.ts is subtle but well-understood — microtask deferral to avoid synchronous rejection before .catch() can be attached is a standard Node.js pattern. The stdio.ts change is mechanical, following the existing _onstdouterror pattern exactly.
However, the author explicitly flags overlap with PR #1814, which also touches Protocol._onclose() and stdin EOF. A human maintainer needs to decide whether to land this first, land #1814 first, or merge them — this is a coordination decision that goes beyond code correctness.
Other factors
The bug hunter flagged a test weakness: the unhandled-rejection test pre-attaches .catch() before close, so it validates nothing about the microtask deferral. This is a nit on test coverage rather than a production bug, but it means the fix has weaker regression protection than it appears. The production fix itself is correct.
| // Create a pending request and attach .catch() to prevent the expected rejection | ||
| // from triggering the handler | ||
| const requestPromise = testRequest(protocol, { method: 'example', params: {} }, mockSchema); | ||
| requestPromise.catch(() => { | ||
| // Expected — the request was rejected due to connection close | ||
| }); | ||
|
|
||
| // Close transport | ||
| await transport.close(); | ||
|
|
||
| // Wait for microtasks to flush | ||
| await new Promise(resolve => setTimeout(resolve, 50)); | ||
|
|
||
| // No unhandled rejections should have occurred | ||
| expect(unhandledRejections).toHaveLength(0); | ||
| } finally { | ||
| process.off('unhandledRejection', processHandler); | ||
| } | ||
| }); |
There was a problem hiding this comment.
🟡 The test pre-attaches requestPromise.catch() synchronously before calling transport.close(), so the rejection handler is already in place when the rejection fires regardless of whether it fires synchronously (old code) or via deferred microtask (new code). This means the test passes equally well with the old synchronous rejection code and does not actually validate that the microtask deferral prevents unhandled rejections. Consider restructuring the test to close the transport first and then observe that a request made immediately afterward does not produce an unhandled rejection without an explicit .catch().
Extended reasoning...
What the test does vs. what it should validate
The test is intended to verify that deferring handler(error) calls to the next microtask (via Promise.resolve().then()) prevents unhandled promise rejections. The race condition the PR fixes is: a transport closes and synchronously rejects all pending promises before the caller has had a chance to attach a .catch() handler. On Node.js 24 with --unhandled-rejections=throw, such an unhandled rejection kills the process.
Why the test gives false assurance
The test attaches requestPromise.catch(() => {}) synchronously and immediately after creating the request promise — well before transport.close() is called. Because .catch() is already registered, no unhandled rejection can occur regardless of when the underlying reject() call fires:
- Old synchronous code path:
transport.close()→_onclose()→handler(error)→reject(error)fires synchronously. But.catch()was attached two lines earlier, so Node.js sees the rejection as handled. ✓ test passes. - New deferred code path: same, except
reject(error)fires in the next microtask..catch()was still attached before close, so the rejection is still handled. ✓ test passes.
Both implementations pass the test with identical results; the deferral change is entirely unobservable by this test.
The actual race condition is not exercised
The true race occurs when: (1) a request promise is created, (2) the transport closes synchronously in the same JS tick, and (3) the caller hasn't yet attached .catch() — e.g., because the close happens inside the same synchronous block before the caller returns to user code. The test eliminates this scenario by pre-attaching the handler.
Impact
The production code change is correct and the fix is real. However, the test provides false confidence that the fix is validated. If someone reverts the deferral change (or introduces a regression), this test would still pass, making it ineffective as a regression guard.
How to fix the test
A correct test would:
- Call
transport.close()without any pending request. - Then synchronously call
protocol.request()to get a promise in a state where the transport is already closed. - Do NOT attach
.catch()before verifying the unhandledRejection listener count stays at zero.
Alternatively: capture the promise, do NOT attach .catch(), flush microtasks, then attach .catch() and assert no unhandledRejection fired. This directly reproduces the race because the rejection fires (in the next microtask) while no handler is attached.
Summary
Fixes two related bugs that together cause the "MCP error -32000: Connection closed" crash (#1049):
Defer pending response handler rejections in
Protocol._onclose()— When the transport closes with in-flight requests,_onclose()rejects all pending promises. Previously this happened synchronously, which could trigger unhandled promise rejections if callers hadn't attached.catch()handlers yet (especially on Node.js 24's strict unhandled-rejection behavior). Now rejections are deferred to the next microtask viaPromise.resolve().then(), giving callers time to attach handlers.Detect stdin EOF in
StdioServerTransport— The server transport only listened fordataanderrorevents on stdin, but neverend. When the client process exits, stdin emitsend(EOF), but the server had no listener for it — leaving the server running indefinitely with pending requests that would never resolve. NowStdioServerTransportlistens for theendevent and triggers a cleanclose().What
packages/core/src/shared/protocol.tshandler(error)calls in_onclose()to next microtask; wrap in try/catch routing toonerrorpackages/server/src/server/stdio.ts_onendhandler for stdinendevent; register/unregister instart()/close()Why
When a stdio-based MCP server's client process exits unexpectedly:
.catch()attached yet, Node.js triggersunhandledRejection--unhandled-rejections=throw), this kills the entire processThis crashes BrainLayer, VoiceLayer, and other MCP servers in production environments daily.
Test Plan
should reject pending requests with ConnectionClosed when transport closesshould not cause unhandled promise rejections when transport closes with pending requestsshould close transport when stdin emits end (EOF)should not fire onclose twice when stdin EOF followed by explicit close()should process remaining messages before closing on stdin EOFRelated
🤖 Generated with Claude Code