Merged
20 commits
d5c022c
Wire plan_list tool into the HTTP/FastMCP server
neoneye Feb 26, 2026
8df5b84
docs: rewrite MCP evaluation to reflect plan_* rename and current state
neoneye Feb 26, 2026
9bd73d3
docs: add issue 4.10 for stale task variable names and remove alias p…
neoneye Feb 26, 2026
9f1a7db
refactor: split mcp_cloud/app.py into 10 focused modules
neoneye Feb 26, 2026
a380e7d
refactor: remove user_api_key from plan_list visible input schema
neoneye Feb 26, 2026
259e282
docs: rewrite MCP evaluation to reflect current interface state
neoneye Feb 26, 2026
b6971ef
docs: remove stale section 5.5 (legacy /tasks REST endpoints)
neoneye Feb 26, 2026
284129e
fix: align plan_list auth with plan_create pattern
neoneye Feb 26, 2026
befa6cb
refactor: rename stale task variable names and remove backward-compat…
neoneye Feb 26, 2026
512b53c
docs: mark 4.9 stale task variable names as FIXED in evaluation
neoneye Feb 26, 2026
d39167e
feat: add rate limiting to /download endpoint
neoneye Feb 26, 2026
2fa2185
Merge branch 'main' into mcp-tweaks
neoneye Feb 26, 2026
831afd1
fix: enforce body size limit on Streamable HTTP /mcp/ endpoint
neoneye Feb 26, 2026
e8ff914
fix: return INVALID_ARGUMENT for unrecognised artifact type
neoneye Feb 26, 2026
ad0d339
test: add dedicated plan_list handler tests
neoneye Feb 26, 2026
a08f5bc
refactor: extract prompt excerpt length to module constant
neoneye Feb 26, 2026
3932e04
feat: add audit logging for all tool calls
neoneye Feb 26, 2026
642a759
fix: fail hard on missing secrets when auth is required
neoneye Feb 26, 2026
73457d4
fix: restrict default CORS origins in production
neoneye Feb 26, 2026
0dbe1af
refactor: rename external-facing task_id → plan_id, tasks → plans, er…
neoneye Feb 26, 2026
267 changes: 169 additions & 98 deletions docs/proposals/70-mcp-interface-evaluation-and-roadmap.md

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions mcp_cloud/AGENTS.md
@@ -15,7 +15,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
- MCP tools must follow the specification in `docs/mcp/planexe_mcp_interface.md`:
- Task management maps to `PlanItem` records (each task = one PlanItem).
- Events are queried from `EventItem` database records.
- - Use the PlanItem UUID as the MCP `task_id`.
+ - Use the PlanItem UUID as the MCP `plan_id`.
- Public task state contract:
- `plan_status.state` must use exactly: `pending`, `processing`, `completed`, `failed`.
- These values correspond 1:1 with `database_api.model_planitem.PlanState`.
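
A minimal sketch of how a client could pin itself to the four public state values listed above. The `PlanState` class here is a hypothetical client-side mirror written for illustration; the real enum is the one referenced in `database_api.model_planitem`, and its member names may differ.

```python
from enum import Enum


class PlanState(str, Enum):
    """Hypothetical client-side mirror of the four public state values."""

    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# States after which polling no longer changes the outcome.
TERMINAL_STATES = frozenset({PlanState.COMPLETED, PlanState.FAILED})


def parse_public_state(value: str) -> PlanState:
    """Convert a raw state string, rejecting anything outside the contract."""
    try:
        return PlanState(value)
    except ValueError:
        raise ValueError(f"unexpected plan_status.state: {value!r}") from None
```

Rejecting unknown values early makes a contract drift visible in the client instead of silently polling forever.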
@@ -35,7 +35,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
- Expose `model_profiles` as the discovery tool for profile selection.
- `model_profiles` must report profile guidance and currently available models after class whitelist filtering.
- Keep workflow wording explicit that prompt drafting + user approval is a non-tool step before `plan_create`.
- - Keep concurrency wording explicit: each `plan_create` call creates a new `task_id`; no global per-client concurrency cap is enforced server-side.
+ - Keep concurrency wording explicit: each `plan_create` call creates a new `plan_id`; no global per-client concurrency cap is enforced server-side.
- Visible input schema is intentionally limited to:
- `prompt`
- `model_profile` (`baseline`, `premium`, `frontier`, `custom`)
@@ -45,7 +45,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
- The server communicates over stdio (standard input/output) following the MCP protocol.
- Tools are registered via `@mcp_cloud.list_tools()` and handled via `@mcp_cloud.call_tool()`.
- All tool responses must be JSON-serializable and follow the error model in the spec.
- - Keep tool error codes/docs aligned with actual runtime payloads (for example `TASK_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `generation_failed`, `content_unavailable`, `INTERNAL_ERROR`).
+ - Keep tool error codes/docs aligned with actual runtime payloads (for example `PLAN_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `generation_failed`, `content_unavailable`, `INTERNAL_ERROR`).
- Event cursors use format `cursor_{event_id}` for incremental polling.
- **Run as task**: We expose MCP **tools** only (plan_create, plan_status, plan_stop, etc.), not the MCP **tasks** protocol (tasks/get, tasks/result, etc.). Do not advertise the tasks capability or add "Run as task" support; the spec and clients (e.g. Cursor) are aligned on tools-only.
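
The `cursor_{event_id}` format mentioned in the list above could be handled with a pair of helpers like the following. This is only a sketch: it assumes `event_id` is a non-negative integer, which the diff does not state, and the function names are made up for illustration.

```python
CURSOR_PREFIX = "cursor_"


def format_cursor(event_id: int) -> str:
    """Build a cursor string for the last event a client has seen."""
    return f"{CURSOR_PREFIX}{event_id}"


def parse_cursor(cursor: str) -> int:
    """Extract the event id from a cursor, rejecting malformed input.

    Assumption: event ids are non-negative integers.
    """
    if not cursor.startswith(CURSOR_PREFIX):
        raise ValueError(f"cursor must start with {CURSOR_PREFIX!r}: {cursor!r}")
    raw = cursor[len(CURSOR_PREFIX):]
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"cursor has a non-numeric event id: {cursor!r}")
    return int(raw)
```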

28 changes: 15 additions & 13 deletions mcp_cloud/README.md
@@ -34,8 +34,8 @@ Build and run mcp_cloud with HTTP endpoints:
docker compose up
```

- Important: `mcp_cloud` enqueues tasks and `worker_plan_database_{n}` executes them.
- If no `worker_plan_database*` service is running, `plan_create` returns a task id but the task will not progress.
+ Important: `mcp_cloud` enqueues plans and `worker_plan_database_{n}` executes them.
+ If no `worker_plan_database*` service is running, `plan_create` returns a plan id but the plan will not progress.

mcp_cloud exposes HTTP endpoints on port `8001` (or `${PLANEXE_MCP_HTTP_PORT}`). Authentication is controlled by `PLANEXE_MCP_REQUIRE_AUTH`:
- `false`: no API key needed (local docker default).
@@ -133,31 +133,33 @@ See `docs/mcp/planexe_mcp_interface.md` for full specification. Available tools:

- `prompt_examples` - Return example prompts. Use these as examples for plan_create.
- `model_profiles` - List profile options and currently available models in each profile.
- - `plan_create` - Create a new task (returns task_id as UUID; may require user_api_key for credits)
- - `plan_status` - Get task status and progress
- - `plan_stop` - Stop an active task
- - `plan_retry` - Retry a failed task with the same task_id (optional model_profile, default baseline)
+ - `plan_create` - Create a new plan (returns plan_id as UUID; may require user_api_key for credits)
+ - `plan_status` - Get plan status and progress
+ - `plan_stop` - Stop an active plan
+ - `plan_retry` - Retry a failed plan with the same plan_id (optional model_profile, default baseline)
- `plan_file_info` - Get file metadata for report or zip

`plan_status` caller contract:
- `pending` / `processing`: keep polling.
- `completed`: terminal success, download is ready.
- `failed`: terminal error.
- - If `failed`, call `plan_retry` to requeue the same task id.
+ - If `failed`, call `plan_retry` to requeue the same plan id.
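
The caller contract above can be sketched as a small polling loop. The `get_status` and `retry` callables are hypothetical stand-ins for however a client invokes `plan_status` and `plan_retry`; the sketch also assumes the status response is a dict with a `state` key, which this diff does not spell out.

```python
import time

POLLING_STATES = {"pending", "processing"}


def wait_for_plan(plan_id, get_status, retry, max_retries=1,
                  poll_seconds=5.0, sleep=time.sleep):
    """Poll until the plan reaches a terminal state, retrying failures.

    get_status(plan_id) and retry(plan_id) are injected callables
    (assumed wrappers around plan_status and plan_retry).
    """
    retries_left = max_retries
    while True:
        state = get_status(plan_id)["state"]
        if state == "completed":
            return state  # terminal success, download is ready
        if state == "failed":
            if retries_left <= 0:
                return state  # terminal error, give up
            retries_left -= 1
            retry(plan_id)  # requeues the same plan id
        elif state not in POLLING_STATES:
            raise ValueError(f"unexpected state: {state!r}")
        sleep(poll_seconds)
```

Injecting `sleep` keeps the loop testable without real delays; a production client would probably also want an overall timeout.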

Concurrency semantics:
- - Each `plan_create` call creates a new `task_id`.
- - `plan_retry` reuses the same failed `task_id`.
- - Server does not enforce a global one-task-at-a-time cap per client.
- - Client should track task ids explicitly when running tasks in parallel.
+ - Each `plan_create` call creates a new `plan_id`.
+ - `plan_retry` reuses the same failed `plan_id`.
+ - Server does not enforce a global one-plan-at-a-time cap per client.
+ - Client should track plan ids explicitly when running plans in parallel.
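
Since the server does not cap concurrency, the bookkeeping falls to the client. One possible shape for that is sketched below; `PlanTracker` is a hypothetical helper, and the assumption that a retried plan goes back to `pending` is inferred from the word "requeue" rather than stated in the diff.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PlanTracker:
    """Hypothetical client-side registry: plan_id -> last known state."""

    plans: dict = field(default_factory=dict)

    def on_created(self, plan_id: str) -> None:
        uuid.UUID(plan_id)  # raises ValueError if the id is not a UUID
        self.plans[plan_id] = "pending"

    def update(self, plan_id: str, state: str) -> None:
        if plan_id not in self.plans:
            raise KeyError(f"untracked plan id: {plan_id}")
        self.plans[plan_id] = state

    def on_retried(self, plan_id: str) -> None:
        # plan_retry reuses the same id, so no new entry is created.
        if self.plans.get(plan_id) != "failed":
            raise ValueError("only failed plans can be retried")
        self.plans[plan_id] = "pending"  # assumed state after requeue

    def active(self) -> list:
        """Ids of plans that still need polling."""
        return sorted(p for p, s in self.plans.items()
                      if s in ("pending", "processing"))
```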

Minimal error contract:
- Tool errors use `{"error":{"code","message","details?"}}`.
- - Common codes: `TASK_NOT_FOUND`, `TASK_NOT_FAILED`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
+ - Common codes: `PLAN_NOT_FOUND`, `PLAN_NOT_FAILED`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
- `plan_file_info` may return `{}` while output is not ready (not an error payload).
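
A client might turn the error envelope above into an exception along these lines. `PlanToolError` and `raise_for_tool_error` are illustrative names, not part of the server or of any MCP library.

```python
class PlanToolError(Exception):
    """Hypothetical client-side wrapper for a tool error payload."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details


def raise_for_tool_error(payload: dict) -> dict:
    """Return the payload unchanged unless it carries an error envelope.

    An empty dict (as plan_file_info may return while output is not
    ready) has no "error" key and is passed through as a normal result.
    """
    error = payload.get("error")
    if error is None:
        return payload
    raise PlanToolError(error["code"], error["message"], error.get("details"))
```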

Note: `plan_download` is a synthetic tool provided by `mcp_local`, not by this server. If your client exposes `plan_download`, use it to save the report or zip locally; otherwise use `plan_file_info` to get `download_url` and fetch the file yourself.

+ > **Breaking change (v2026-02-26):** External-facing field names were renamed from `task_id` → `plan_id`, `tasks` → `plans`, and error codes from `TASK_NOT_FOUND` → `PLAN_NOT_FOUND`, `TASK_NOT_FAILED` → `PLAN_NOT_FAILED`.
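
Clients that cached or logged payloads under the old names could upgrade them with a mapping like the one below. This is a client-side sketch only; the server itself ships no such shim (the commit list says the backward-compat aliases were removed), and `upgrade_payload` is a hypothetical helper name.

```python
# Renames taken from the breaking-change note above.
RENAMED_KEYS = {"task_id": "plan_id", "tasks": "plans"}
RENAMED_CODES = {
    "TASK_NOT_FOUND": "PLAN_NOT_FOUND",
    "TASK_NOT_FAILED": "PLAN_NOT_FAILED",
}


def upgrade_payload(value):
    """Recursively rename old field names and error codes in a payload."""
    if isinstance(value, list):
        return [upgrade_payload(item) for item in value]
    if isinstance(value, dict):
        upgraded = {}
        for key, item in value.items():
            new_key = RENAMED_KEYS.get(key, key)
            if key == "code" and isinstance(item, str):
                # Error codes are values, not keys, so map them separately.
                upgraded[new_key] = RENAMED_CODES.get(item, item)
            else:
                upgraded[new_key] = upgrade_payload(item)
        return upgraded
    return value
```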

**Tip**: Call `prompt_examples` to get example prompts to use with plan_create, then call `model_profiles` to choose `model_profile` based on current runtime availability. The prompt catalog is the same as in the frontends (`worker_plan.worker_plan_api.PromptCatalog`). When running with `PYTHONPATH` set to the repo root (e.g. stdio setup), the catalog is loaded automatically; otherwise built-in examples are returned.

Download flow: call `plan_file_info` to obtain the `download_url`, then fetch the
@@ -407,5 +409,5 @@ See `railway.md` for Railway-specific deployment instructions. The server automa
- Other files are fetched by downloading the run zip and extracting the file (less efficient but works without additional endpoints)
- Artifact writes are not yet supported via HTTP (would require a write endpoint in `worker_plan`).
- Artifact writes are rejected while a run is active (strict policy per spec).
- - Task IDs use the PlanItem UUID (e.g., `5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1`).
+ - Plan IDs use the PlanItem UUID (e.g., `5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1`).
- **Security**: Authentication is configurable. For production, set `PLANEXE_MCP_REQUIRE_AUTH=true` and use UserApiKey validation (optionally with `PLANEXE_MCP_API_KEY` as a shared secret).