feat(mcp): Implement MCP sampling support with transport memory leak fix #12801

cbcoutinho wants to merge 13 commits into google-gemini:main
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes

Hello @cbcoutinho, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces significant enhancements to the Model Context Protocol (MCP) integration by adding support for MCP sampling, which enables external MCP servers to request LLM completions from the client. A key aspect of this feature is the implementation of a user approval dialog, ensuring that users retain control over these requests. Concurrently, a critical memory leak related to unclosed MCP server transports has been addressed, improving the stability and performance of the system. These changes are supported by new configuration options, a dedicated UI component, and comprehensive integration tests.
Code Review
This pull request introduces MCP sampling support, a significant feature that allows MCP servers to request LLM completions from the client. The implementation is well-structured, including a user approval dialog in the CLI and a corresponding --auto-confirm-mcp-sampling flag for automated testing.
Crucially, this PR also addresses a critical memory leak by ensuring MCP server transports are properly closed on disconnection or error. The changes to return and manage the transport object from connectToMcpServer are clean and effectively solve the reported "JavaScript heap out of memory" issue.
The addition of an integration test for the new sampling flow is excellent and ensures the feature is robust.
My review has identified one high-severity issue related to duplicated code in the command-line argument parsing, which should be addressed.
Force-pushed 1051c8b to 8790b2f
Fixed the duplicated command-line argument parsing code.
Thanks for this @cbcoutinho 👏 For changes this large we usually require an issue first to discuss the best approach. Is there an open issue or FR for MCP sampling that this PR links to? Are you also able to provide a screencast of this in action? Would be great to see how it works with an example 😄
Hi @jackwotherspoon, thanks! I've now linked the relevant open issues for MCP sampling in the PR description. Since they were quiet, I took a shot at the implementation. Happy to move the technical discussion to whichever issue you'd like to consolidate on. I'll add a screencast later this evening. My use case is an MCP server that supports sampling for RAG (summarizing results from a vector DB query), and the screencast will show gemini-cli handling the sampled prompt requests from it.
Howdy @jackwotherspoon, the attached link shows a screencast demonstrating the MCP sampling feature introduced by this PR. This enables MCP servers to leverage the existing LLM within `gemini-cli`. In the example asciinema recording, I have an MCP server that includes a vector database to support semantic queries, and pre-loaded it with some documents about the National Park Service and various National Parks in the US.

https://asciinema.org/a/YlEZapOSLEV4RbHii9uyXLJ5E

Flow Overview

This demonstrates how Gemini supports MCP sampling, enabling AI-powered synthesis of retrieved context and providing users with coherent answers rather than just document references.

```mermaid
sequenceDiagram
    participant Client as MCP Client<br/>(Gemini CLI)
    participant Server as MCP Server<br/>(Nextcloud)
    participant VectorDB as Vector DB<br/>(Qdrant)
    participant LLM as LLM<br/>(Gemini 2.5 Flash)
    Client->>Server: (1) nc_semantic_search_answer("What are my Q1 goals?")
    Server->>VectorDB: (2) Vector search for similar docs
    VectorDB-->>Server: (3) Top K docs<br/>[note, event, card, file]
    Server->>Client: (4) MCP Sampling Request<br/>create_message(messages=[{<br/>role: "user",<br/>content: "Query + Docs"<br/>}])
    Client->>LLM: (5) Forward prompt + docs to LLM
    Note over LLM: (6) Generate answer<br/>with citations
    LLM-->>Client: (7) Generated answer<br/>"Based on Doc 1 (note: Project)...<br/>Doc 2 (event)..."
    Client-->>Server: (8) Sampling Response<br/>(answer + model info)
    Server-->>Client: (9) Final Response<br/>{<br/>generated_answer,<br/>sources: [...],<br/>model_used<br/>}
```
Adds comprehensive support for MCP (Model Context Protocol) sampling, allowing MCP servers to request LLM completions through Gemini CLI. This enables MCP tools to leverage AI capabilities for processing and analyzing data.

Key Features:
- Implement sampling request/response handling in MCP client
- Add McpSamplingDialog component for user consent and progress
- Integrate sampling workflow with event system and UI state management
- Add comprehensive integration tests for sampling scenarios

UI Improvements:
- Make consent screen scrollable for better UX with long prompts
- Fix MarkdownDisplay width to ensure proper text wrapping
- Implement proper cleanup of consent prompts on dialog close

Bug Fixes:
- Close sampling dialog properly on server timeout
- Fix build errors related to upstream merges
- Ensure proper event handler cleanup to prevent memory leaks

Technical Changes:
- Enhanced mcp-client.ts with sampling support and event emission
- Updated events.ts with new sampling-related event types
- Modified UIStateContext to manage sampling dialog state
- Added McpSamplingDialog component with progress indicators
- Created integration tests covering various sampling scenarios

Files Modified:
- packages/core/src/tools/mcp-client.ts (sampling logic)
- packages/core/src/utils/events.ts (event types)
- packages/cli/src/ui/components/McpSamplingDialog.tsx (new)
- integration-tests/mcp-sampling.test.ts (new)
- Various UI components for integration
Force-pushed 9bb2577 to e971fdf
Hi @jackwotherspoon, I've updated this branch to correctly handle large prompt requests and provide scrolling.

Screencast.From.2025-11-16.10-01-31.mp4
Hi @jackwotherspoon - is there still interest in this feature? I'm trying to keep the branch up to date with `main`. Any guidance about getting this merged into `main` would be appreciated.
Hey @cbcoutinho, sorry for the delay here! Definitely still interested in this 👍 It's been a busy couple of weeks with most of the team working towards the Gemini 3 launch, and now with it being US Thanksgiving week a lot of folks are on holiday. I can take a quick look today or tomorrow and then have an expert take a look next week 😄
Hi! Wanted to chime in that over on Dart & Flutter DevTools we are eagerly watching this PR and hoping it lands. This would be very helpful for us to begin integrating AI assistance into DevTools that works with Gemini CLI. See flutter/devtools#9570 for details.
# Conflicts:
#	packages/core/src/tools/mcp-client-manager.ts
#	packages/core/src/tools/mcp-client.test.ts
#	packages/core/src/tools/mcp-client.ts
#	packages/core/src/utils/events.ts
Merge upstream/main into MCP sampling branch
Add the 'type' field to SETTINGS_SCHEMA_DEFINITIONS for MCPServerConfig to match the actual TypeScript type definition in core/config.ts.

This field was added to the MCPServerConfig class in commit bdbbe92 but was not included in the JSON schema definition. The field allows users to specify the transport type ('sse' or 'http') when using the 'url' field, with auto-detection as fallback.
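For illustration, a hypothetical `settings.json` entry using the `type` field described above (the server name and URL here are made up, not from the PR):

```json
{
  "mcpServers": {
    "my-server": {
      "url": "https://example.com/mcp",
      "type": "sse"
    }
  }
}
```

Omitting `type` falls back to auto-detection between SSE and HTTP transports.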
Force-pushed 11d6c1c to e207c47
Hi @jackwotherspoon, I've updated this branch based on upstream/main so it should now be ready for review. Looking forward to your feedback 🙏
- Add 5-minute timeout to sampling consent promise to prevent memory leaks when dialog is closed without user response
- Improve McpSamplingDialog formatting to show messages with role labels instead of raw JSON
- Fix MCP role mapping: map 'assistant' to 'model' for Gemini compatibility
- Fix incomplete test mocks by adding required Config methods

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
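The consent-timeout idea from the commit above can be sketched as follows. This is a hedged illustration, not the PR's actual code: `waitForConsent` and `PendingConsent` are hypothetical names.

```typescript
// Sketch of a consent promise with a 5-minute timeout, so that a dialog
// closed without a response cannot leak a forever-pending promise.
// Names here are illustrative assumptions, not the PR's identifiers.
const SAMPLING_CONSENT_TIMEOUT_MS = 5 * 60 * 1000;

interface PendingConsent {
  resolve: () => void;
  reject: (reason?: unknown) => void;
}

function waitForConsent(
  register: (consent: PendingConsent) => void,
  timeoutMs: number = SAMPLING_CONSENT_TIMEOUT_MS,
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    // If the dialog is dismissed without an answer, reject after the
    // timeout so the pending promise and its captured state can be
    // garbage-collected instead of leaking.
    const timeoutId = setTimeout(
      () => reject(new Error('Sampling consent timed out')),
      timeoutMs,
    );
    register({
      resolve: () => {
        clearTimeout(timeoutId);
        resolve();
      },
      // Forward the caller's reason instead of silently discarding it.
      reject: (reason?: unknown) => {
        clearTimeout(timeoutId);
        reject(reason ?? new Error('Sampling request denied'));
      },
    });
  });
}
```

Either path clears the timer, so a normal approve/deny does not leave a stray timeout keeping the process alive.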
Previously, the Scrollable component would auto-scroll to the bottom on initial render because the isAtBottom check triggered when the size changed from its initial zero values. This caused the MCP sampling dialog to show the end of long prompts first, requiring users to scroll up.

Add hasInitializedRef to track the first render and only maintain the scroll position at the bottom after the initial render. This allows dialogs to start at the top so users can read prompts from the beginning and scroll down as needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The register callback was creating a new Map on every call, even when re-registering the same entry ID. This caused React to see a state change and re-render all children, which in turn triggered more register calls, creating an infinite re-render loop.

Fix by checking if the entry ID already exists; if so, update the entry in place without creating a new Map reference. This prevents unnecessary state updates and stops the flickering.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
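The fix described above can be sketched framework-free. The `Entry` shape and function names below are assumptions for illustration, not the PR's actual code; only the re-register/new-id distinction is taken from the commit message.

```typescript
// Sketch of the register-callback fix: mutate in place for a known id
// (same Map reference, so React sees no state change), copy-on-write
// only for a genuinely new id. Names are hypothetical.
interface Entry {
  height: number;
}

function makeRegister(
  getState: () => Map<string, Entry>,
  setState: (next: Map<string, Entry>) => void,
) {
  return (id: string, entry: Entry): void => {
    const current = getState();
    if (current.has(id)) {
      // Re-registering an existing id: update in place without creating
      // a new Map reference, avoiding the re-render loop.
      current.set(id, entry);
      return;
    }
    // New id: copy so the state setter triggers exactly one re-render.
    const next = new Map(current);
    next.set(id, entry);
    setState(next);
  };
}
```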
- Wrap mouse event handlers in useCallback to prevent constant re-subscription in the useMouse hook, which was causing UI flickering
- Increase scroll step from 1 to 3 lines per scroll event for smoother trackpad and mouse wheel scrolling
- Update tests to account for the new scroll step

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Hi @cbcoutinho, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this. We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines. Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed. Thank you for your understanding and for being a part of our community!
Hi @bdmorgan, this PR is associated with an issue, but it was closed by your bot a few days ago due to inactivity. It's my understanding that this feature is generally desired by both users and colleagues of yours. Please confirm whether this is still the case; otherwise I can close it.
@cbcoutinho Yes, I am going to review again; we definitely want this 😄 Thanks for the patience. Do you mind trying to fix the merge conflicts and sign the CLA? Will take a look.
Resolve merge conflicts:
- packages/cli/src/config/config.ts: Keep auto-confirm-mcp-sampling option
- packages/cli/src/gemini.tsx: Use non-nullable settings access with type assertion
- packages/cli/src/ui/AppContainer.tsx: Keep MCP sampling handler
- packages/cli/src/ui/contexts/UIStateContext.tsx: Add AuthState and ActiveHook imports
- packages/core/src/tools/mcp-client.ts: Keep cliConfig parameter for sampling support
- packages/core/src/tools/mcp-client.test.ts: Add sanitizationConfig to mockConfig
- packages/cli/src/config/settingsSchema.ts: Remove duplicate 'type' key in MCP config

Co-Authored-By: Claude <noreply@anthropic.com>
…th headers

Two fixes for MCP client:

1. Race condition fix: MCP servers start asynchronously during initialization and are not awaited. If tool discovery completes before geminiClient.initialize(), the setTools() call in the finally block is skipped because isInitialized() returns false. This causes MCP tools to be registered but never sent to the Gemini API. Fix by calling setTools() after geminiClient.initialize() completes.

2. Auth header precedence: When an Authorization header is explicitly configured (e.g., via --header flag), respect it instead of overwriting it with stored OAuth tokens. This allows users to manually configure authentication for MCP servers that require custom auth mechanisms.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
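The header-precedence rule from fix (2) above can be sketched as a small pure function. The function name and shapes are hypothetical, not the PR's actual code.

```typescript
// Sketch of Authorization-header precedence: an explicitly configured
// header (e.g. via --header) wins; a stored OAuth token is only a
// fallback. Names and shapes here are illustrative assumptions.
function resolveAuthHeaders(
  configuredHeaders: Record<string, string>,
  oauthToken?: string,
): Record<string, string> {
  const headers = { ...configuredHeaders };
  // HTTP header names are case-insensitive, so check every casing.
  const hasExplicitAuth = Object.keys(headers).some(
    (k) => k.toLowerCase() === 'authorization',
  );
  // Only fall back to the stored OAuth token when the user has not
  // configured an Authorization header explicitly.
  if (!hasExplicitAuth && oauthToken !== undefined) {
    headers['Authorization'] = `Bearer ${oauthToken}`;
  }
  return headers;
}
```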
Resolved conflict in packages/core/src/config/config.ts:
- Kept the autoConfirmMcpSampling feature from our branch
- Adopted upstream's refactored hooks handling that extracts the disabled property

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Force-pushed d35d497 to b93e9c6
Hi @jackwotherspoon, the PR is merge-conflict free and ready for review. The CLA check is failing because the CI workflow is detecting Claude Code as a co-author. I will update the commits shortly so that they only contain my username, but in terms of content I consider this PR complete.

cc @bdmorgan
jackwotherspoon
left a comment
Alright I took a first pass here.
Thanks for this awesome PR @cbcoutinho 👍
Couple things of note:
- Seems like a lot of changes to the scrollable UI components; are those necessary?
- Can we extract the sampling logic from `mcp-client.ts` into its own named function so we don't bloat `connectToMcpServer`?
- Not sure we want to be adding a top-level flag for auto-approving sampling (I think we'd rather follow up with integrating it into our policy engine or something of the like)
can we remove the package-lock.json changes, they aren't needed
```typescript
})
.option('auto-confirm-mcp-sampling', {
  type: 'boolean',
  description: 'Automatically confirm all MCP sampling requests.',
  hidden: true,
```
We try to keep our top-level CLI flags clean. I see this is hidden, yet I would still prefer we not add this.

I think ideally we find a way to add sampling to our policy engine to give users a way to blanket-allow.

Probably a follow-up PR though. I would just remove the flag from the initial implementation unless you have reservations about doing so.
https://geminicli.com/docs/core/policy-engine/#special-syntax-for-mcp-tools
This is a good point regarding keeping top-level CLI flags clean, even if hidden. I agree that integrating sampling into the policy engine for blanket allowance is a more robust and scalable solution. I'll remove this flag for now and create a follow-up task to explore the policy engine integration. Thanks for the suggestion and the link to the policy engine documentation!
Suggested change (remove the flag):

```diff
  })
- .option('auto-confirm-mcp-sampling', {
-   type: 'boolean',
-   description: 'Automatically confirm all MCP sampling requests.',
-   hidden: true,
- })
```
```typescript
    incrementalRendering:
      settings.merged.ui.incrementalRendering !== false && useAlternateBuffer,
  },
} as RenderOptions & { incrementalRendering?: boolean },
```
is this related to sample...?
mind explaining the changes to this file?
maybe this was added as part of a merge conflict or is there reason for the scrollable?
again this seems kind of odd to change in this PR...
```typescript
});

// Timeout for sampling consent dialog (5 minutes)
const SAMPLING_CONSENT_TIMEOUT_MS = 5 * 60 * 1000;
```
maybe move this to top of file alongside MCP_DEFAULT_TIMEOUT_MSEC
```typescript
  clearTimeout(timeoutId);
  resolve();
},
reject: (_reason?: unknown) => {
```
Any reason for ignoring the outer reason here?
```typescript
// Timeout for sampling consent dialog (5 minutes)
const SAMPLING_CONSENT_TIMEOUT_MS = 5 * 60 * 1000;

mcpClient.setRequestHandler(CreateMessageRequestSchema, async (req) => {
```
Can we maybe move this to a named function so it keeps connectToMcpServer a little more lean?
That way we can test more easily in isolation.
Maybe something like:
```typescript
// At top of file or in a logical section
async function handleSamplingRequest(
  req: CreateMessageRequest,
  serverName: string,
  cliConfig: Config,
): Promise<CreateMessageResult> {
  // All the sampling logic here
}
```
Thank you for your submission to the Gemini CLI project. At this time, we are closing this pull request in order to allow us to better triage and support more recent pull requests against the latest code changes.

If you feel this pull request is a critical contribution to the Gemini CLI project, please associate the pull request with an existing GitHub issue (instructions here: https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue) before reopening. After Monday January 26 2026, any pull requests submitted by contributors without an associated issue will be automatically closed (more information here: #16706).

If you do choose to reopen and submit this pull request, please ensure you rebase your changes onto the current main branch before resubmitting. This will help avoid merge conflicts and ensure your contribution is compatible with the latest codebase.
- Remove --auto-confirm-mcp-sampling CLI flag (use env var for testing)
- Extract sampling handler to named function createSamplingRequestHandler()
- Move SAMPLING_CONSENT_TIMEOUT_MS constant to top of file
- Fix ignored reason parameter in reject callback
- Update integration test to use GEMINI_AUTO_CONFIRM_MCP_SAMPLING env var
- Remove autoConfirmMcpSampling from Config interface and implementation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
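The env-var gate that replaces the removed CLI flag could look roughly like this; the exact parsing in the PR may differ, and the function name is an assumption.

```typescript
// Hypothetical sketch of checking GEMINI_AUTO_CONFIRM_MCP_SAMPLING in
// place of the removed --auto-confirm-mcp-sampling flag.
function autoConfirmSamplingEnabled(
  env: Record<string, string | undefined>,
): boolean {
  // Treat only an explicit "true" as opting in to auto-confirmation.
  return env['GEMINI_AUTO_CONFIRM_MCP_SAMPLING'] === 'true';
}
```

An integration test would set the variable in the spawned CLI's environment, e.g. `GEMINI_AUTO_CONFIRM_MCP_SAMPLING=true`, instead of passing a flag.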
Root cause: When MCP servers disconnect or encounter errors, the transport
(Stdio/SSE/HTTP) was not being closed because:
1. `connectToMcpServer()` only returned the Client, not the transport
2. The transport reference was created locally and lost after connection
3. `McpClient.transport` was declared but never assigned
This caused event listeners and buffers to run indefinitely, eventually
causing "JavaScript heap out of memory" errors in long-running sessions.
Changes:
- `connectToMcpServer()` now returns `{client, transport}` instead of just `Client`
- `McpClient.connect()` stores the transport reference
- `McpClient.disconnect()` now properly closes the transport before the client
- Error handlers in both `McpClient` and `connectAndDiscover` now close
the transport to prevent leaks on connection errors
- Helper functions `connectWithSSETransport()` and `retryWithOAuth()` now
return the transport for proper lifecycle management
Addresses the memory leak fix from PR google-gemini#12801.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
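The transport lifecycle described in the commit message above can be sketched as follows. `Transport` and `Client` here are simplified stand-ins for the MCP SDK types, and `McpClientWrapper` is a hypothetical name, not the PR's `McpClient`.

```typescript
// Minimal sketch of the {client, transport} lifecycle: connect() keeps
// the transport reference, disconnect() closes the transport before the
// client so listeners and buffers are released even on error.
interface Transport {
  close(): Promise<void>;
}
interface Client {
  close(): Promise<void>;
}

class McpClientWrapper {
  private transport?: Transport;
  private client?: Client;

  connect(result: { client: Client; transport: Transport }): void {
    this.client = result.client;
    // Keep the transport reference so disconnect() can release it;
    // previously this reference was created locally and lost.
    this.transport = result.transport;
  }

  async disconnect(): Promise<void> {
    // Close the transport first so its event listeners and buffers are
    // released even if closing the client throws.
    try {
      await this.transport?.close();
    } finally {
      await this.client?.close();
      this.transport = undefined;
      this.client = undefined;
    }
  }
}
```

Error handlers would call the same `disconnect()` path, so a failed connection cannot leave a live transport behind.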
Summary

Addresses:
MCP Spec:

Changes

MCP Sampling Feature

Implements the Model Context Protocol sampling capability, allowing MCP servers to request LLM completions from the client with user approval (human-in-the-loop pattern).

Core Implementation (packages/core/src/tools/mcp-client.ts):
- Advertise the `sampling: {}` capability to MCP servers
- Add a `CreateMessageRequestSchema` handler for sampling requests

UI Integration (packages/cli/src/ui):
- New `McpSamplingDialog` component with RadioButtonSelect for approval
- `CoreEvent.McpSamplingRequest` wired into `AppContainer` with wrapped callbacks and the `DialogManager`

Configuration:
- `--auto-confirm-mcp-sampling` CLI flag for testing/automation
- `autoConfirmMcpSampling` setting to bypass user prompts

Memory Leak Fix

Root Cause: MCP server transports were never closed on disconnection, leaving event listeners and buffers running indefinitely, eventually causing "JavaScript heap out of memory" errors.

Fix (packages/core/src/tools/mcp-client.ts):
- `connectToMcpServer()` returns `{client, transport}` instead of just `client`
- The transport reference is stored on the `McpClient` class
- `transport.close()` is called in `disconnect()` and in error handlers

Test Updates

Integration Test (integration-tests/mcp-sampling.test.ts):
- Exercises the sampling flow using the `--auto-confirm-mcp-sampling` flag

Unit Tests (packages/core/src/tools/mcp-client.test.ts):
- Mock transports implement the `close()` method

Events Tests (packages/core/src/utils/events.ts):

Bug Fixes
- Update `FallbackModeChangedPayload` to use the `isInFallbackMode` property (matching upstream)

Test Plan
- Build passes (`npm run build`)
- Tests pass (`npm run test:ci`)

This PR was generated with the help of AI, and reviewed by a Human.