fix(mcp): roll back user message and restore input on config_assist error#2280
Conversation
Adds an `mcp_clients` domain that lets users browse the Smithery.ai MCP
server registry, install servers locally, spawn them over stdio via the
existing JS/Python runtimes, and expose their tools to the agent.
Rust core (src/openhuman/mcp_clients/):
- Smithery registry client with 10-minute SQLite cache
- stdio MCP JSON-RPC client (initialize / tools/list / tools/call)
behind a McpTransport trait so tests can inject a fake
- Per-server SQLite persistence (env values stored separately and
never returned in list/status responses — only key names)
- 10 RPCs under openhuman.mcp_clients_* covering registry browse,
install/uninstall, connect/disconnect, status, tool_call, and an
LLM-backed config_assist for credential walkthroughs
- DomainEvent::Mcp{ServerInstalled,Connected,Disconnected,
ClientToolExecuted} + bus subscriber
- tool_registry surfaces connected MCP-client tools to the agent
with route { protocol: "mcp-client", rpc_method, server_id, tool_name }
- json_rpc_e2e lifecycle test + ~60 unit tests across the domain
React UI (app/src/components/channels/mcp/):
- New "MCP Servers" tab in the Channels page (virtual tab — not a
backend channel definition)
- Catalog browser with debounced Smithery search + pagination
- Install dialog with per-required-env-key inputs (password type,
values never logged or displayed back)
- Installed-server list + detail with Connect/Disconnect/Uninstall
- Inline config assistant chat that calls config_assist and can
surface suggested env values
- Typed RPC wrapper in services/api/mcpClientsApi.ts
- Vitest suites for the wrapper and each non-trivial component
…rror On API failure, the user message was left orphaned in chat history with no assistant reply, and the input field was already cleared — making retry impossible without re-typing. Now catches roll back messages state to the pre-send snapshot and restore the input text so the user can retry immediately.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR adds frontend UI error recovery for configuration assistance, test coverage for channel component routing, and a complete MCP client RPC handler surface supporting registry discovery, server lifecycle management, tool invocation, and AI-assisted environment configuration. ChangesMCP Client Integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The base branch was changed.
The frontend InstallDialog reads `required_env_keys` from the server detail returned by `registry_get` to build its env-var input form. The Rust handler was only returning the raw Smithery payload, which does not contain this field — env key derivation was only happening internally for `config_assist`. Augment `mcp_clients_registry_get` to call `collect_required_env_keys` and inject the result into the serialized server object before returning, so InstallDialog can render the correct input fields.
The failing tests used vi.useFakeTimers() but waited via waitFor(), which polls with setInterval — itself a fake timer that never fires. Replacing waitFor() with synchronous assertions after vi.runAllTimersAsync() properly flushes the debounce timeout and all pending promise microtasks in a single act(). Fixes 4 timeout failures in McpCatalogBrowser.test.tsx.
Lint caught waitFor as an unused import after replacing waitFor() calls with synchronous assertions in the fake-timer tests. Remove it.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/openhuman/mcp_clients/ops.rs (1)
544-549:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winPotential panic on truncation of non-ASCII error text.
Line 548 truncates the error text using
&text[..text.len().min(200)], which can panic if the truncation point falls in the middle of a multi-byte UTF-8 character.🛡️ Use char_indices for safe truncation
- tracing::warn!( - "[mcp-client] config_assist inference HTTP {}: {}", - status, - &text[..text.len().min(200)] - ); + let truncated = text + .char_indices() + .nth(200) + .map(|(idx, _)| &text[..idx]) + .unwrap_or(&text); + tracing::warn!( + "[mcp-client] config_assist inference HTTP {}: {}", + status, + truncated + );Alternatively, use
text.chars().take(200).collect::<String>()for a simpler (though slightly less efficient) approach.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/mcp_clients/ops.rs` around lines 544 - 549, The code currently slices the UTF-8 string `text` with `&text[..text.len().min(200)]` in the `config_assist` inference error path, which can panic if the byte boundary splits a multi-byte character; change this to perform a safe truncation (e.g. build a truncated String using `text.chars().take(200).collect()` or use `char_indices` to pick a valid byte index) and log that truncated value in the `tracing::warn!` call instead of the raw slice; update the variable used in the `tracing::warn!` format so `status` and the new safe `truncated` string are logged.
🧹 Nitpick comments (2)
src/openhuman/mcp_clients/ops.rs (2)
215-228: ⚡ Quick winRedundant branching in default command resolution.
Both branches of the conditional return identical values for Node/npx. The Python package detection only works when
example_configcontains "uvx" or "python" in the command name (lines 205-209), so the default path always assumes Node/npx.♻️ Simplify the default logic
- // Default: use npx for npm-style packages (starts with @), uvx for python - if qualified_name.starts_with('@') || !qualified_name.contains('/') { - ( - CommandKind::Node, - "npx".to_string(), - vec!["-y".to_string(), qualified_name.to_string()], - ) - } else { - ( - CommandKind::Node, - "npx".to_string(), - vec!["-y".to_string(), qualified_name.to_string()], - ) - } + // Default: use npx for all packages when example_config doesn't specify a command + ( + CommandKind::Node, + "npx".to_string(), + vec!["-y".to_string(), qualified_name.to_string()], + )If Python package auto-detection from qualified_name patterns is intended for a future iteration, consider adding a comment or TODO.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/mcp_clients/ops.rs` around lines 215 - 228, The conditional that checks qualified_name and returns the same Node/npx tuple is redundant; in the function that builds the default command (use of qualified_name, CommandKind::Node and "npx"), replace the if/else block with a single return of (CommandKind::Node, "npx".to_string(), vec!["-y".to_string(), qualified_name.to_string()]) and remove the duplicate branch, and optionally add a TODO comment near that logic mentioning future Python/uvx detection if needed.
132-140: ⚡ Quick winConsider validating stdio connection availability.
If the registry detail has no stdio connections,
stdio_connwill beNoneandresolve_commandwill fall back to npx defaults. This could result in installing a server record that cannot actually connect. Consider adding explicit validation:let stdio_conn = detail .connections .iter() .filter(|c| c.r#type == "stdio") .find(|c| c.published) .or_else(|| detail.connections.iter().find(|c| c.r#type == "stdio")) .ok_or_else(|| "No stdio connection available for this server".to_string())?;This would fail fast at install time rather than later at connect time.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/mcp_clients/ops.rs` around lines 132 - 140, The code currently lets stdio_conn be None and calls resolve_command, which can hide missing stdio connections; change the stdio_conn assignment to validate presence of a stdio connection (from detail.connections) and return an error early if absent (use ok_or_else to produce a String or appropriate error type and propagate with ?), so the install fails fast; update the surrounding function signature to return Result if necessary and keep subsequent use of stdio_conn and resolve_command unchanged once you have a validated connection.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/openhuman/mcp_clients/ops.rs`:
- Around line 544-549: The code currently slices the UTF-8 string `text` with
`&text[..text.len().min(200)]` in the `config_assist` inference error path,
which can panic if the byte boundary splits a multi-byte character; change this
to perform a safe truncation (e.g. build a truncated String using
`text.chars().take(200).collect()` or use `char_indices` to pick a valid byte
index) and log that truncated value in the `tracing::warn!` call instead of the
raw slice; update the variable used in the `tracing::warn!` format so `status`
and the new safe `truncated` string are logged.
---
Nitpick comments:
In `@src/openhuman/mcp_clients/ops.rs`:
- Around line 215-228: The conditional that checks qualified_name and returns
the same Node/npx tuple is redundant; in the function that builds the default
command (use of qualified_name, CommandKind::Node and "npx"), replace the
if/else block with a single return of (CommandKind::Node, "npx".to_string(),
vec!["-y".to_string(), qualified_name.to_string()]) and remove the duplicate
branch, and optionally add a TODO comment near that logic mentioning future
Python/uvx detection if needed.
- Around line 132-140: The code currently lets stdio_conn be None and calls
resolve_command, which can hide missing stdio connections; change the stdio_conn
assignment to validate presence of a stdio connection (from detail.connections)
and return an error early if absent (use ok_or_else to produce a String or
appropriate error type and propagate with ?), so the install fails fast; update
the surrounding function signature to return Result if necessary and keep
subsequent use of stdio_conn and resolve_command unchanged once you have a
validated connection.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 50566bc8-88c5-44a2-abe4-1f3e44896dec
📒 Files selected for processing (6)
app/src/components/channels/mcp/McpCatalogBrowser.test.tsxapp/src/store/channelConnectionsSlice.tssrc/core/all.rssrc/openhuman/about_app/catalog.rssrc/openhuman/mcp_clients/ops.rssrc/openhuman/mod.rs
💤 Files with no reviewable changes (1)
- src/openhuman/mod.rs
…olve_command Addresses @coderabbitai review comments: 1. ops.rs config_assist error path: `&text[..text.len().min(200)]` can panic if the byte offset lands in the middle of a multi-byte UTF-8 character. Use char_indices to find a valid boundary instead. 2. resolve_command default fall-through had two identical if/else arms (both returning Node/npx). Collapse to a single unconditional return. 3. tests/json_rpc_e2e.rs mcp_clients_lifecycle: RpcOutcome wraps payload in {"result": ..., "logs": [...]} when logs are present. The installed_list and status assertions were accessing keys at the wrong nesting level, causing the test to panic. Mirror the pattern already used at steps 6-7.
…cpToolList Adds Vitest tests covering the previously uncovered lines (27–180 in McpServersTab, lines 28–84 in InstalledServerList, lines 19–38 in McpToolList) to push diff coverage above the 80% gate.
…or virtual tabs Covers the mcp branch in ChannelConfigPanel (line 16) and the virtual VIRTUAL_TABS rendering block in ChannelSelector (lines 102, 104, 108), bringing both files above the 80% diff-coverage threshold.
Take main's improvements (stale-response guards, security logging, waiter cleanup, proper error propagation, delete-then-insert env rows) while keeping PR's unique fix (config_assist error rollback) and required_env_keys enrichment in registry_get.
The merge resolution accidentally dropped closing parentheses from
three map_err(|e| anyhow::anyhow!("{e}")) expressions.
|
huge first merge @antfleet-ops 🙌 love the rollback + input restore fix, orphaned user messages with no way to retry is such a rough ux papercut and you nailed it. since this is your first one in, come hang with us in the builder discord: https://discord.tinyhumans.ai/ 🎉 |
Summary
handleSendwas leaving the user's message orphaned in chat history with no assistant replysetInput('')ran before thetry), making retry impossible without re-typingmessagesto the pre-send snapshot and restoresinputto the original textRoot cause
setMessages(updatedHistory)andsetInput('')executed unconditionally before thetryblock. On error, the user message was stuck in history and the input was gone.Fix
Two lines added to the
catchblock inhandleSend:The optimistic update (showing the user message while waiting) is preserved — only the rollback path is changed.
Test plan
Generated with Claude Code · Flagged by AntFleet code review
Summary by CodeRabbit
Bug Fixes
New Features
Tests