diff --git a/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md b/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md
index 3cece72c1..1e9106f92 100644
--- a/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md
+++ b/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md
@@ -1,159 +1,212 @@
 ---
-title: MCP Interface — Evaluation and Roadmap
-date: 2026-02-25
----
+
+## title: MCP Interface — Evaluation and Roadmap
+date: 2026-02-26
 
 # MCP Interface — Evaluation and Roadmap
 
-An honest audit of the current MCP surface (mcp_cloud + mcp_local), followed by concrete improvements and promotion ideas.
+An honest audit of the current MCP surface (`mcp_cloud` + `mcp_local`), followed by concrete improvements and promotion ideas.
+
+**Revision history:**
+- **2026-02-26 (rev 1):** Initial version after `task_*` → `plan_*` rename.
+- **2026-02-26 (rev 2):** Updated after `app.py` refactor into modules, `plan_list` `user_api_key` made optional in schema (auto-injected by HTTP layer), and re-evaluation of all open issues.
+- **2026-02-26 (rev 3):** Updated after completing 4.9 — all stale `task` variable names, request classes, helper functions, and backward-compat aliases renamed/removed across `mcp_cloud` and `mcp_local`. Test files renamed from `test_task_*` to `test_plan_*`.
+- **2026-02-26 (rev 4):** Updated after completing 4.2 — added separate download rate limiter with configurable limits (default 10 req/60s).
+- **2026-02-26 (rev 5):** Renamed external-facing fields: `task_id` → `plan_id`, `tasks` → `plans`, error codes `TASK_NOT_FOUND` → `PLAN_NOT_FOUND`, `TASK_NOT_FAILED` → `PLAN_NOT_FAILED`. Internal function names and download URL paths unchanged.
+
+---
+
+## 1. Current Tool Surface
+
+Nine tools, split across two transports:
+
+
+| Tool | Cloud (`mcp_cloud`) | Local (`mcp_local`) | Auth | Annotations |
+| ----------------- | ------------------- | ------------------- | -------- | ----------------------- |
+| `prompt_examples` | yes | yes | Public | readOnly, idempotent |
+| `model_profiles` | yes | yes | Public | readOnly, idempotent |
+| `plan_create` | yes | yes | Required | openWorld |
+| `plan_status` | yes | yes | Required | readOnly, idempotent |
+| `plan_stop` | yes | yes | Required | destructive, idempotent |
+| `plan_retry` | yes | yes | Required | openWorld |
+| `plan_file_info` | yes | — | Required | readOnly, idempotent |
+| `plan_download` | — | yes | Required | openWorld |
+| `plan_list` | yes | yes | Required | readOnly, idempotent |
+
+
+`plan_download` is a local-only synthetic tool that internally proxies to `plan_file_info` on the cloud, then downloads and saves the artifact to the user's filesystem. This intentional asymmetry is tested in `test_tool_surface_consistency.py`.
+
+**Auth model for `plan_create` and `plan_list`:** Both tools accept an optional `user_api_key` in the visible MCP input schema. When called over HTTP, the middleware authenticates the caller via the `X-API-Key` header and auto-injects `user_api_key` into handler arguments. This means MCP clients never need to pass `user_api_key` explicitly — the key is invisible in the tool's published schema but enforced at runtime. Both handlers return `USER_API_KEY_REQUIRED` if no key arrives by either path.
 
 ---
 
-## 1. What's Working Well
+## 2. What's Working Well
 
 **Dual transport.** `mcp_cloud` (stateless HTTP / Railway) and `mcp_local` (stdio proxy) cover the two major deployment patterns. Most users can pick one without reading source code.
 
-**Layered authentication.** Two distinct auth paths — a server-wide `PLANEXE_MCP_API_KEY` for self-hosters, and per-user `pex_…` keys issued by home.planexe.org — are a good design. The key-normalisation fix (`_normalize_api_key_value`) makes the second path robust against copy-paste artefacts.
+**Clean module structure.** `mcp_cloud/app.py` is now a thin re-export facade (~195 lines). Logic lives in focused modules: `handlers.py` (tool handlers), `schemas.py` (tool definitions), `tool_models.py` (Pydantic models), `db_queries.py` (DB operations), `auth.py` (key hashing/user resolution), `download_tokens.py` (signed tokens), `model_profiles.py`, `worker_fetchers.py`, `zip_utils.py`, `prompt_examples.py`. This makes PRs reviewable and bugs easy to isolate.
+
+**Consistent `plan_*` naming throughout.** The rename from `task_*` to `plan_*` covers the full stack: external tool names, handler functions, request classes (`PlanCreateRequest`, etc.), DB query helpers (`_create_plan_sync`, `get_plan_by_id`, etc.), local variable names, and test file names. No backward-compat aliases remain.
+
+**Layered authentication.** Two distinct auth paths — a server-wide `PLANEXE_MCP_API_KEY` for self-hosters, and per-user `pex_…` keys issued by home.planexe.org — are a good design. The key-normalisation helper (`_normalize_api_key_value` in `http_server.py`) handles common copy-paste artefacts (Bearer prefix, surrounding quotes, full header line pasted as value).
+
+**Auto-injected `user_api_key`.** For `plan_create` and `plan_list`, the HTTP layer reads the authenticated user from the request context and injects `user_api_key` into handler arguments automatically. Callers never see `user_api_key` as a required field in the MCP schema — a clean separation between transport-level auth and tool-level logic.
 
-**Structured output schemas.** Every tool declares an `output_schema`, so MCP clients can validate responses without guessing. The `TestAllToolsHaveOutputSchema` test enforces this at CI time.
+**Structured output schemas.** Every tool declares an `output_schema`, so MCP clients can validate responses without guessing. `TestAllToolsHaveOutputSchema` enforces this at CI time.
 
 **Tool annotations.** `readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint` are set on every tool and tested. This is ahead of most MCP servers.
 
-**task_retry with model_profile selection.** Allowing the caller to re-run a failed task with a stronger model (e.g. upgrade from `baseline` to `thorough`) at retry time is genuinely useful.
+**`plan_retry` with model_profile selection.** Allowing the caller to re-run a failed task with a stronger model (e.g. upgrade from `baseline` to `premium`) at retry time is genuinely useful.
+
+**Signed download tokens.** `plan_file_info` returns download URLs with HMAC-SHA256 signed, time-limited tokens (15-min default TTL) scoped to one artifact (`task_id:filename:expiry`). Tokens work in a browser without an API key header. Defence-in-depth: the download endpoint re-validates even after middleware has passed the token. The secret fallback chain is: `PLANEXE_DOWNLOAD_TOKEN_SECRET` → `PLANEXE_API_KEY_SECRET` → per-process random (with warning).
 
 **Glama + llms.txt.** Being listed in the Glama registry and providing `llms.txt` lowers the discovery barrier for new users.
 
-**Rate limiting on REST endpoints.** `slowapi` limits `/tasks` create/retry endpoints, protecting the backend from burst abuse.
+**Rate limiting on all MCP endpoints.** `_enforce_rate_limit` in `http_server.py` applies to `/mcp`, `/mcp/`, and `/mcp/tools/call`. The default limit (60 req / 60 s per client, keyed by API key or IP) is high enough that normal `plan_status` polling is never affected.
 
 **Prompt guidance in schema.** The `prompt` field description ("300–800 words … objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria") sets user expectations up front.
 
+**`plan_list` for plan recovery.** Authenticated users can list their most recent plans (up to 50, newest-first) to recover a lost `plan_id`. Each entry includes `plan_id`, `state`, `progress_percentage`, `created_at`, and `prompt_excerpt`.
+
+**Comprehensive test suite.** 12 test files covering tool surface consistency, auth key parsing, CORS config, download tokens, HTTP routing, and individual tool behaviour (`test_plan_create_tool.py`, `test_plan_status_tool.py`, `test_plan_retry_tool.py`, `test_plan_file_info_tool.py`, `test_model_profiles_tool.py`).
+
 ---
 
-## 2. What's Broken or Inconsistent
+## 3. What's Been Fixed (Previously Reported)
+
+### 3.1 ~~`skills/planexe-mcp/SKILL.md` says "5 tools"~~ (FIXED)
+
+Updated to nine tools; SKILL.md now lists all tools with example JSON-RPC calls.
 
-### 2.1 ~~`skills/planexe-mcp/SKILL.md` says "5 tools"~~ (FIXED)
+### 3.2 ~~Trailing-slash inconsistency~~ (FIXED)
 
-Updated to "seven core tools"; added Tool 5 (`model_profiles`) and Tool 7 (`task_retry`) sections; updated the typical workflow to reference both. Note: with `task_list` now added the total is eight — SKILL.md updated accordingly.
+The canonical URL (`https://mcp.planexe.org/mcp`, no trailing slash) is used in all JSON config files and registry entries.
 
-### 2.2 ~~Trailing-slash inconsistency~~ (FIXED)
+### 3.3 ~~`speed_vs_detail` documented but hidden from agents~~ (FIXED)
 
-The canonical URL (`https://mcp.planexe.org/mcp`, no trailing slash) is used in all JSON config files and registry entries. The MCP Inspector CLI command in `docs/mcp/inspector.md` intentionally keeps the trailing slash (the inspector appends sub-paths; without `/` it sends requests to the wrong path). A note clarifying this distinction was added to `inspector.md`.
+Removed entirely from the MCP interface.
 
-### 2.3 ~~`speed_vs_detail` is documented but hidden from agents~~ (FIXED)
+### 3.4 ~~`plan_file_info` returns `{}` on success instead of `isError`~~ (FIXED)
 
-The `speed_vs_detail` parameter was a developer-only hidden override that was rarely used and created a docs/schema mismatch. It has been removed from the MCP interface entirely: the dead code was deleted from `mcp_cloud/app.py` and `mcp_cloud/http_server.py`, the legacy backward-compat forwarding block was removed from `mcp_local/planexe_mcp_local.py`, and all references were purged from docs.
+Now returns `{"ready": false, "reason": "processing"}` while running and `{"ready": false, "reason": "failed", "error": {...}}` on failure.
 
-### 2.4 ~~`task_file_info` returns `{}` on success instead of `isError`~~ (FIXED)
+### 3.5 ~~Rate limiting covers REST but not Streamable HTTP `/mcp`~~ (FIXED)
 
-`task_file_info` now returns `{"ready": false, "reason": "processing"}` when the task is still running, and `{"ready": false, "reason": "failed", "error": {...}}` when it has failed. The output schema was updated (replaced the empty-dict variant with `TaskFileInfoNotReadyOutput`), and both `PLANEXE_SERVER_INSTRUCTIONS` and the tool description were updated accordingly.
+`_enforce_rate_limit` now covers `/mcp`, `/mcp/`, and `/mcp/tools/call`.
 
-### 2.5 ~~Rate limiting covers REST but not the Streamable HTTP `/mcp` endpoint~~ (FIXED)
+### 3.6 ~~No `plan_list` tool — lost `task_id` = lost task~~ (FIXED)
 
-`_enforce_rate_limit` in `mcp_cloud/http_server.py` now applies to `/mcp` and `/mcp/` in addition to `/mcp/tools/call`. The default limit (60 req/60 s per client) is high enough that normal polling of `task_status` is never affected.
+Added `plan_list` to both `mcp_cloud` and `mcp_local`. Returns up to 50 tasks newest-first.
 
-### 2.6 ~~No `task_list` tool — lost `task_id` = lost task~~ (FIXED)
+### 3.7 ~~Signed, expiring download tokens~~ (FIXED)
 
-Added `task_list` to both `mcp_cloud` and `mcp_local`. Requires `user_api_key`; returns up to 50 tasks newest-first with `task_id`, `state`, `progress_percentage`, `created_at`, and `prompt_excerpt`. The `task_create` description was updated to say "call task_list to recover a lost task_id" instead of "no task_list, lost task_id = lost task".
+HMAC-SHA256 tokens, 15-minute default TTL, scoped per-artifact.
 
-### 2.7 `app.py` is an 81 KB monolith
+### 3.8 ~~Tools used `task_*` prefix instead of `plan_*`~~ (FIXED)
 
-All tool handlers, auth logic, DB calls, and schema definitions live in one file. This makes onboarding slow, PRs hard to review, and bugs harder to isolate.
+All external tool names renamed to `plan_*`.
 
-**Fix:** Refactor into modules: `auth.py`, `tools/task.py`, `tools/meta.py`, `schemas.py`.
+### 3.9 ~~`app.py` is a 76 KB monolith~~ (FIXED)
+
+Refactored into 10+ focused modules (commit 9f1a7db9). `app.py` is now a thin re-export facade.
+
+### 3.10 ~~`plan_list` requires `user_api_key` in visible MCP schema~~ (FIXED)
+
+`user_api_key` is now optional in the `PlanListInput` schema (not in `required` list), matching `plan_create`. The HTTP layer auto-injects it from the `X-API-Key` header via `_get_authenticated_user_api_key()`. The handler still enforces the key at runtime (returns `USER_API_KEY_REQUIRED` if absent).
 
 ---
 
-## 3. Proposed Improvements
+## 4. What's Broken or Inconsistent
+
+### ~~4.1 Dev-secret fallback in production~~ (FIXED)
+
+`auth.py` now exports `validate_api_key_secret()` which raises `RuntimeError` when `PLANEXE_API_KEY_SECRET` is not set. `download_tokens.py` exports `validate_download_token_secret()` which raises when neither `PLANEXE_DOWNLOAD_TOKEN_SECRET` nor `PLANEXE_API_KEY_SECRET` is set. Both are called at module level in `http_server.py` when `AUTH_REQUIRED` is true, so the server fails hard at startup instead of silently falling back to dev secrets. The existing runtime fallbacks (`"dev-api-key-secret"` and random per-process secret) remain for local development with `PLANEXE_MCP_REQUIRE_AUTH=false`.
 
-### 3.1 `task_list` tool (high value, low effort)
+### ~~4.2 `/download` endpoint not rate-limited~~ (FIXED)
 
-```json
-{
-  "name": "task_list",
-  "description": "List the most recent tasks for the authenticated user.",
-  "inputSchema": {
-    "properties": {
-      "limit": {"type": "integer", "default": 10, "maximum": 50}
-    }
-  }
-}
-```
+A separate download rate limiter (`_enforce_download_rate_limit`) now covers `/download` paths with its own bucket and configurable limits: `PLANEXE_MCP_DOWNLOAD_RATE_LIMIT` (default 10 req) and `PLANEXE_MCP_DOWNLOAD_RATE_WINDOW_SECONDS` (default 60s). This is deliberately tighter than the MCP rate limit (60 req/60s) since download responses are 700KB–6MB. The sweep task cleans up download buckets alongside MCP buckets.
 
-Recovers lost task IDs, enables dashboards, and is the single most-requested missing feature in similar task-runner MCP servers.
+### ~~4.3 Body size validation only on REST endpoint~~ (FIXED)
 
-### 3.2 ~~Signed, expiring download tokens~~ (FIXED)
+`_enforce_body_size` now checks both `/mcp/tools/call` and `/mcp/` POST requests. The `Content-Length` requirement (411) is only enforced on the REST endpoint since Streamable HTTP may use chunked encoding without `Content-Length`; however, when `Content-Length` is present on either endpoint it is validated against `MAX_BODY_BYTES`.
 
-`task_file_info` now returns download URLs that include a signed, short-lived token:
-`/download/{task_id}/{filename}?token={expiry}.{hmac_sha256}`.
+### ~~4.4 `plan_file_info` silently defaults invalid artifact to `"report"`~~ (FIXED)
 
-- Token is HMAC-SHA256 over `task_id:filename:expiry`, scoped to one artifact.
-- Default TTL: 15 minutes (configurable via `PLANEXE_DOWNLOAD_TOKEN_TTL`).
-- Secret priority: `PLANEXE_DOWNLOAD_TOKEN_SECRET` → `PLANEXE_API_KEY_SECRET` → random per-process (with warning).
-- Tokenised URLs work in a browser without an API key header; the middleware validates the token and skips the API-key check.
-- Defence-in-depth: the download endpoint re-validates the token even after the middleware has passed it.
-- Backward compatible: requests without a token still require a valid API key header (existing behaviour).
+Both `handle_plan_file_info` (cloud) and `handle_plan_download` (local) now return `INVALID_ARGUMENT` with a descriptive message when the artifact value is not `"report"` or `"zip"`.
 
-### 3.3 SSE progress streaming (UX)
+### ~~4.5 No dedicated `plan_list` test~~ (FIXED)
 
-Long-running plans (10–20 minutes) give the user no feedback. A `task_progress` SSE endpoint (or a `progress` field in `task_status`) returning incremental log lines would dramatically improve perceived responsiveness.
+Added `mcp_cloud/tests/test_plan_list_tool.py` with 8 tests covering: tool listed, returns tasks, empty result, limit clamping (both directions), invalid API key, `USER_API_KEY_REQUIRED` when env requires key, no-key passthrough when not required (user_id=None), and default limit.
 
-Minimum viable version: a `log_lines` array in the `task_status` response (last 50 lines of agent output).
+### ~~4.6 CORS default is wildcard~~ (FIXED)
 
-### 3.4 Webhook / push notification (power users)
+When `AUTH_REQUIRED` is true and `PLANEXE_MCP_CORS_ORIGINS` is unset, the default is now `["https://mcp.planexe.org", "https://home.planexe.org"]` instead of `["*"]`. Wildcard CORS is only used in dev mode (`PLANEXE_MCP_REQUIRE_AUTH=false`) so browser-based tools like MCP Inspector work without extra configuration. Operators can override via `PLANEXE_MCP_CORS_ORIGINS`.
 
-Add an optional `webhook_url` to `task_create`. When the task transitions to `completed` or `failed`, POST a JSON summary to that URL. This removes the need for polling and enables CI/CD integrations.
+### ~~4.7 No request logging for successful tool calls~~ (FIXED)
 
-### 3.5 API versioning
+`handle_call_tool` now logs every tool call at INFO level with tool name, result (ok/error/exception), and duration in milliseconds. Unknown tools are logged at WARNING. Format: `tool_call tool= result= duration_ms=`.
 
-All tool names and schemas are currently unversioned. A future breaking change (e.g. renaming `task_file_info` to `task_files`) will silently break clients.
-Add a `server_version` field to the `task_status` output and document a stability policy.
+### ~~4.8 Prompt excerpt length hardcoded~~ (FIXED)
 
-### 3.6 Refactor `app.py` into modules
+Extracted to `PROMPT_EXCERPT_MAX_LENGTH = 100` at module level in `db_queries.py`.
 
-```
-mcp_cloud/
-  auth.py  # _resolve_user_from_api_key, _hash_user_api_key
-  schemas.py  # TASK_CREATE_INPUT_SCHEMA, TOOL_DEFINITIONS, …
-  tools/
-    task.py  # task_create, task_status, task_stop, task_retry, task_list
-    meta.py  # prompt_examples, model_profiles
-  http_server.py  # ASGI wiring only
-  app.py  # thin entry-point, imports from above
-```
+### ~~4.9 Stale `task` variable names and backward-compat aliases~~ (FIXED)
 
-### 3.7 Remove or deprecate legacy REST endpoints
+All internal naming now uses `plan` consistently. Request classes renamed (`TaskCreateRequest` → `PlanCreateRequest`, etc.), DB query helpers renamed (`_create_task_sync` → `_create_plan_sync`, `get_task_by_id` → `get_plan_by_id`, etc.), local variables renamed (`task_snapshot` → `plan_snapshot`, etc.), all backward-compat aliases removed from `tool_models.py`, `schemas.py`, `handlers.py`, `app.py`, and `mcp_local/planexe_mcp_local.py` (~86 lines deleted). Test files renamed from `test_task_*.py` to `test_plan_*.py` with patch targets updated.
 
-The `/tasks` REST routes duplicate functionality now available through MCP tools. Keeping both surfaces means bugs can exist in one but not the other (as happened with the auth issue). Deprecate `/tasks` in favour of the MCP tool surface, with a sunset date in the changelog.
+### ~~4.10 `plan_list` auth differs from `plan_create`~~ (FIXED)
+
+`plan_list` now uses the same `PLANEXE_MCP_REQUIRE_USER_KEY` check as `plan_create`. When the key is not required and not provided, `plan_list` returns all tasks (no user scoping). `_list_tasks_sync` accepts `user_id=None` to support this.
 
 ---
 
-## 4. Promotion and Growth Strategies
+## 5. Proposed Improvements
+
+### 5.1 SSE progress streaming (UX)
+
+Long-running plans (10–20 minutes) give the user no feedback. A `log_lines` array in the `plan_status` response (last 50 lines of agent output) would dramatically improve perceived responsiveness.
+
+### 5.2 Webhook / push notification (power users)
+
+Add an optional `webhook_url` to `plan_create`. When the task transitions to `completed` or `failed`, POST a JSON summary to that URL. This removes the need for polling and enables CI/CD integrations.
 
-### 4.1 MCP registries
+### 5.3 API versioning
 
-- **Glama** — already listed ✓
+All tool names and schemas are currently unversioned. A future breaking change will silently break clients. Add a `server_version` field to the `plan_status` output and document a stability policy.
+
+### 5.4 Startup environment validation
+
+Add an explicit check at server startup that required secrets (`PLANEXE_API_KEY_SECRET`, `PLANEXE_DOWNLOAD_TOKEN_SECRET`) are set when auth is enabled. Fail loudly instead of falling back to dev defaults.
+
+---
+
+## 6. Promotion and Growth Strategies
+
+### 6.1 MCP registries
+
+- **Glama** — already listed
 - **mcp.so** — submit `server.json`; high traffic from Claude desktop users
 - **Smithery** — another fast-growing directory; supports one-click install
 - **awesome-mcp-servers** (GitHub) — submit a PR; maintainers merge quickly
 - **OpenTools** — focus on enterprise MCP discovery
 
-### 4.2 Content
+### 6.2 Content
 
-- **Blog post: "From prompt to project plan in 60 seconds"** — a short walkthrough showing MCP Inspector → task_create → task_status → download. Publish on dev.to, Hacker News (Show HN), and the PlanExe GitHub Discussions.
+- **Blog post: "From prompt to project plan in 60 seconds"** — a short walkthrough showing MCP Inspector → `plan_create` → `plan_status` → download. Publish on dev.to, Hacker News (Show HN), and the PlanExe GitHub Discussions.
 - **YouTube demo (2–3 minutes)** — screen recording of Claude Desktop using PlanExe MCP end-to-end. Pin it to the README.
-- **Twitter/X thread** — "I built an MCP server that turns a ~500-word prompt into a full project plan. Here's how it works: 🧵"
+- **Twitter/X thread** — "I built an MCP server that turns a ~500-word prompt into a full project plan. Here's how it works:"
 
-### 4.3 Community integrations
+### 6.3 Community integrations
 
 - **Claude Desktop config snippet** — provide a ready-to-paste `claude_desktop_config.json` block in the README.
 - **Cursor / Windsurf rule** — provide a `.cursorrules` or `.windsurfrules` snippet that wires PlanExe MCP automatically.
-- **GitHub Actions** — a reusable workflow `planexe/create-plan@v1` that runs `task_create` and uploads the result as a release asset. This is a high-visibility integration channel.
+- **GitHub Actions** — a reusable workflow `planexe/create-plan@v1` that runs `plan_create` and uploads the result as a release asset. This is a high-visibility integration channel.
 
-### 4.4 Example prompt gallery
+### 6.4 Example prompt gallery
 
 Add 10–15 high-quality example prompts (startup, research paper, home renovation, hiring plan, …) to `prompt_examples`. Agents and users copy-paste these; each successful use is a social proof data point.
 
-### 4.5 Observability / social proof
+### 6.5 Observability / social proof
 
 - Add a public counter to the homepage: "X plans created this week".
 - Post a monthly changelog to GitHub Discussions so subscribers see activity.
@@ -161,26 +214,44 @@ Add 10–15 high-quality example prompts (startup, research paper, home renovati
 
 ---
 
-## 5. Quick-win Checklist
-
-| Priority | Task | Effort |
-|----------|------|--------|
-| P0 | ~~Fix SKILL.md tool count~~ (DONE) | — |
-| P0 | ~~Standardise URL trailing slash~~ (DONE) | — |
-| P0 | ~~Fix `speed_vs_detail` schema/docs mismatch~~ (DONE) | — |
-| P1 | ~~Add `task_list` tool~~ (DONE) | — |
-| P1 | ~~Fix `task_file_info` empty-dict response~~ (DONE) | — |
-| P1 | ~~Add rate limiting to `/mcp` endpoint~~ (DONE) | — |
-| P1 | Submit to mcp.so + Smithery | 30 min |
-| P1 | Write README demo GIF / YouTube link | 1 h |
-| P2 | Add `log_lines` to task_status | 4 h |
-| P2 | Refactor app.py into modules | 1 day |
-| P3 | ~~Signed download tokens~~ (DONE) | — |
-| P3 | Webhook support | 1 day |
-| P3 | GitHub Actions integration | 1 day |
+## 7. Quick-win Checklist
+
+
+| Priority | Task | Effort | Status |
+| -------- | ---------------------------------------------------------------------- | ------ | ------ |
+| P0 | ~~Fix SKILL.md tool count~~ | — | DONE |
+| P0 | ~~Standardise URL trailing slash~~ | — | DONE |
+| P0 | ~~Fix `speed_vs_detail` schema/docs mismatch~~ | — | DONE |
+| P0 | ~~Rename tools from `task_*` to `plan_*`~~ | — | DONE |
+| P1 | ~~Add `plan_list` tool~~ | — | DONE |
+| P1 | ~~Fix `plan_file_info` empty-dict response~~ | — | DONE |
+| P1 | ~~Add rate limiting to `/mcp` endpoint~~ | — | DONE |
+| P1 | ~~Signed download tokens~~ | — | DONE |
+| P1 | ~~Refactor `app.py` into modules~~ | — | DONE |
+| P1 | ~~Remove `user_api_key` from `plan_list` visible schema~~ | — | DONE |
+| P1 | ~~Fail-hard on missing secrets in production (4.1)~~ | — | DONE |
+| P1 | ~~Rate-limit `/download` endpoint (4.2)~~ | — | DONE |
+| P1 | ~~Add `plan_list` handler tests (4.5)~~ | — | DONE |
+| P1 | Submit to mcp.so + Smithery | 30 min | |
+| P1 | Write README demo GIF / YouTube link | 1 h | |
+| P2 | ~~Body size validation on Streamable HTTP (4.3)~~ | — | DONE |
+| P2 | ~~Return error for invalid artifact value (4.4)~~ | — | DONE |
+| P2 | ~~Add tool-call audit logging (4.7)~~ | — | DONE |
+| P2 | Add `log_lines` to `plan_status` (5.1) | 4 h | |
+| P2 | ~~Rename internal `task` variables/classes/helpers to `plan` (4.9)~~ | — | DONE |
+| P2 | ~~Remove backward-compat `Task*`/`handle_task_*`/`TASK_*` aliases (4.9)~~ | — | DONE |
+| P2 | ~~Rename test files from `test_task_*` to `test_plan_*` (4.9)~~ | — | DONE |
+| P2 | ~~Tighten default CORS origins (4.6)~~ | — | DONE |
+| P2 | ~~Align `plan_list` auth with `plan_create` (4.10)~~ | — | DONE |
+| P3 | Webhook support (5.2) | 1 day | |
+| P3 | API versioning (5.3) | 4 h | |
+| P3 | GitHub Actions integration (6.3) | 1 day | |
+
 
 ---
 
-## 6. Summary
+## 8. Summary
+
+The MCP surface is functionally solid and ahead of most MCP servers in terms of schema rigour, annotation coverage, and security (signed download tokens, layered auth, auto-injected user keys). The codebase has been significantly improved since rev 1: `app.py` was refactored from a 76 KB monolith into 10+ focused modules, `plan_list` now follows the same auth-injection pattern as `plan_create`, and all P0 issues are resolved.
 
-The MCP surface is functionally solid and ahead of most hobby MCP servers in terms of schema rigour and annotation coverage. The main weaknesses are: small but sharp inconsistencies in docs/schemas that erode trust, a missing `task_list` tool that makes the server feel fragile in long agent sessions, and limited discovery beyond Glama. Fixing the P0/P1 items above takes less than a day and would meaningfully improve both reliability and adoption.
+All P1 code-quality issues are now resolved, including fail-hard on missing secrets in production (4.1). The remaining checklist items are promotion/growth tasks (mcp.so submission, README demo) and lower-priority enhancements (CORS tightening, SSE streaming, webhooks, API versioning).
diff --git a/mcp_cloud/AGENTS.md b/mcp_cloud/AGENTS.md
index e015e14a4..46773168b 100644
--- a/mcp_cloud/AGENTS.md
+++ b/mcp_cloud/AGENTS.md
@@ -15,7 +15,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
 - MCP tools must follow the specification in `docs/mcp/planexe_mcp_interface.md`:
 - Task management maps to `PlanItem` records (each task = one PlanItem).
 - Events are queried from `EventItem` database records.
-- Use the PlanItem UUID as the MCP `task_id`.
+- Use the PlanItem UUID as the MCP `plan_id`.
 - Public task state contract:
 - `plan_status.state` must use exactly: `pending`, `processing`, `completed`, `failed`.
 - These values correspond 1:1 with `database_api.model_planitem.PlanState`.
@@ -35,7 +35,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
 - Expose `model_profiles` as the discovery tool for profile selection.
 - `model_profiles` must report profile guidance and currently available models after class whitelist filtering.
 - Keep workflow wording explicit that prompt drafting + user approval is a non-tool step before `plan_create`.
-- Keep concurrency wording explicit: each `plan_create` call creates a new `task_id`; no global per-client concurrency cap is enforced server-side.
+- Keep concurrency wording explicit: each `plan_create` call creates a new `plan_id`; no global per-client concurrency cap is enforced server-side.
 - Visible input schema is intentionally limited to:
 - `prompt`
 - `model_profile` (`baseline`, `premium`, `frontier`, `custom`)
@@ -45,7 +45,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
 - The server communicates over stdio (standard input/output) following the MCP protocol.
 - Tools are registered via `@mcp_cloud.list_tools()` and handled via `@mcp_cloud.call_tool()`.
 - All tool responses must be JSON-serializable and follow the error model in the spec.
-- Keep tool error codes/docs aligned with actual runtime payloads (for example `TASK_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `generation_failed`, `content_unavailable`, `INTERNAL_ERROR`).
+- Keep tool error codes/docs aligned with actual runtime payloads (for example `PLAN_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `generation_failed`, `content_unavailable`, `INTERNAL_ERROR`).
 - Event cursors use format `cursor_{event_id}` for incremental polling.
 - **Run as task**: We expose MCP **tools** only (plan_create, plan_status, plan_stop, etc.), not the MCP **tasks** protocol (tasks/get, tasks/result, etc.). Do not advertise the tasks capability or add "Run as task" support; the spec and clients (e.g. Cursor) are aligned on tools-only.
diff --git a/mcp_cloud/README.md b/mcp_cloud/README.md
index 2cdd3a0b0..91263dd63 100644
--- a/mcp_cloud/README.md
+++ b/mcp_cloud/README.md
@@ -34,8 +34,8 @@ Build and run mcp_cloud with HTTP endpoints:
 docker compose up
 ```
 
-Important: `mcp_cloud` enqueues tasks and `worker_plan_database_{n}` executes them.
-If no `worker_plan_database*` service is running, `plan_create` returns a task id but the task will not progress.
+Important: `mcp_cloud` enqueues plans and `worker_plan_database_{n}` executes them.
+If no `worker_plan_database*` service is running, `plan_create` returns a plan id but the plan will not progress.
 
 mcp_cloud exposes HTTP endpoints on port `8001` (or `${PLANEXE_MCP_HTTP_PORT}`). Authentication is controlled by `PLANEXE_MCP_REQUIRE_AUTH`:
 - `false`: no API key needed (local docker default).
@@ -133,31 +133,33 @@ See `docs/mcp/planexe_mcp_interface.md` for full specification.
 Available tools:
 - `prompt_examples` - Return example prompts. Use these as examples for plan_create.
 - `model_profiles` - List profile options and currently available models in each profile.
-- `plan_create` - Create a new task (returns task_id as UUID; may require user_api_key for credits)
-- `plan_status` - Get task status and progress
-- `plan_stop` - Stop an active task
-- `plan_retry` - Retry a failed task with the same task_id (optional model_profile, default baseline)
+- `plan_create` - Create a new plan (returns plan_id as UUID; may require user_api_key for credits)
+- `plan_status` - Get plan status and progress
+- `plan_stop` - Stop an active plan
+- `plan_retry` - Retry a failed plan with the same plan_id (optional model_profile, default baseline)
 - `plan_file_info` - Get file metadata for report or zip
 
 `plan_status` caller contract:
 - `pending` / `processing`: keep polling.
 - `completed`: terminal success, download is ready.
 - `failed`: terminal error.
-- If `failed`, call `plan_retry` to requeue the same task id.
+- If `failed`, call `plan_retry` to requeue the same plan id.
 
 Concurrency semantics:
-- Each `plan_create` call creates a new `task_id`.
-- `plan_retry` reuses the same failed `task_id`.
-- Server does not enforce a global one-task-at-a-time cap per client.
-- Client should track task ids explicitly when running tasks in parallel.
+- Each `plan_create` call creates a new `plan_id`.
+- `plan_retry` reuses the same failed `plan_id`.
+- Server does not enforce a global one-plan-at-a-time cap per client.
+- Client should track plan ids explicitly when running plans in parallel.
 
 Minimal error contract:
 - Tool errors use `{"error":{"code","message","details?"}}`.
-- Common codes: `TASK_NOT_FOUND`, `TASK_NOT_FAILED`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
+- Common codes: `PLAN_NOT_FOUND`, `PLAN_NOT_FAILED`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
 - `plan_file_info` may return `{}` while output is not ready (not an error payload).
 
 Note: `plan_download` is a synthetic tool provided by `mcp_local`, not by this server. If your client exposes `plan_download`, use it to save the report or zip locally; otherwise use `plan_file_info` to get `download_url` and fetch the file yourself.
 
+> **Breaking change (v2026-02-26):** External-facing field names were renamed from `task_id` → `plan_id`, `tasks` → `plans`, and error codes from `TASK_NOT_FOUND` → `PLAN_NOT_FOUND`, `TASK_NOT_FAILED` → `PLAN_NOT_FAILED`.
+
 **Tip**: Call `prompt_examples` to get example prompts to use with plan_create, then call `model_profiles` to choose `model_profile` based on current runtime availability. The prompt catalog is the same as in the frontends (`worker_plan.worker_plan_api.PromptCatalog`). When running with `PYTHONPATH` set to the repo root (e.g. stdio setup), the catalog is loaded automatically; otherwise built-in examples are returned.
 
 Download flow: call `plan_file_info` to obtain the `download_url`, then fetch the
@@ -407,5 +409,5 @@ See `railway.md` for Railway-specific deployment instructions. The server automa
 - Other files are fetched by downloading the run zip and extracting the file (less efficient but works without additional endpoints)
 - Artifact writes are not yet supported via HTTP (would require a write endpoint in `worker_plan`).
 - Artifact writes are rejected while a run is active (strict policy per spec).
-- Task IDs use the PlanItem UUID (e.g., `5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1`).
+- Plan IDs use the PlanItem UUID (e.g., `5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1`).
 - **Security**: Authentication is configurable. For production, set `PLANEXE_MCP_REQUIRE_AUTH=true` and use UserApiKey validation (optionally with `PLANEXE_MCP_API_KEY` as a shared secret).
diff --git a/mcp_cloud/app.py b/mcp_cloud/app.py
index e41717f43..2ba5d0d2f 100644
--- a/mcp_cloud/app.py
+++ b/mcp_cloud/app.py
@@ -1,1837 +1,170 @@
 """
-PlanExe MCP Cloud
+PlanExe MCP Cloud – thin re-export facade.
 
-Implements the Model Context Protocol interface for PlanExe as specified in
- docs/mcp/planexe_mcp_interface.md. Communicates with worker_plan_database via the shared
-database_api models.
+All symbols previously importable from ``mcp_cloud.app`` are re-exported here
+so that existing callers (http_server.py, tests, etc.) continue to work.
+The actual implementations live in the focused modules under ``mcp_cloud/``.
 """
 import asyncio
-import contextvars
-import hashlib
-import hmac
-import io
-import json
-import logging
-import os
-import secrets
-import tempfile
-import time
-import uuid
-import zipfile
-from dataclasses import dataclass
-from datetime import UTC, datetime
-from pathlib import Path
-from typing import Any, Literal, Optional
-from urllib.parse import quote_plus
-from io import BytesIO
-import httpx
-from sqlalchemy import cast, text
-from sqlalchemy.dialects.postgresql import JSONB
-from mcp.server import Server
-from mcp.server.stdio import stdio_server
-from mcp.types import CallToolResult, Tool, TextContent, ToolAnnotations
-from pydantic import BaseModel
-from worker_plan_api.model_profile import (
-    ModelProfileEnum,
-    default_filename_for_profile,
-    normalize_model_profile,
-    resolve_model_profile_from_env,
-)
-from worker_plan_api.planexe_config import PlanExeConfig
-from worker_plan_api.llm_class_filter import (
-    ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES,
-    is_llm_class_allowed,
-    parse_llm_class_whitelist,
-)
-from mcp_cloud.dotenv_utils import load_planexe_dotenv
-_dotenv_loaded, _dotenv_paths = load_planexe_dotenv(Path(__file__).parent)
+from mcp.server.stdio import stdio_server
 
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+# -- db_setup: Flask app, DB, constants, request classes, MCP Server ----------
+from mcp_cloud.db_setup import (  # noqa: F401
+    app,
+    db,
+    build_postgres_uri_from_env,
+    ensure_planitem_stop_columns,
+    PLANEXE_SERVER_INSTRUCTIONS,
mcp_cloud_server as mcp_cloud, + BASE_DIR_RUN, + WORKER_PLAN_URL, + REPORT_FILENAME, + REPORT_CONTENT_TYPE, + ZIP_FILENAME, + ZIP_CONTENT_TYPE, + ZIP_SNAPSHOT_MAX_BYTES, + ModelProfileInput, + MODEL_PROFILE_TITLES, + MODEL_PROFILE_SUMMARIES, + PlanCreateRequest, + PlanStatusRequest, + PlanStopRequest, + PlanRetryRequest, + PlanFileInfoRequest, + PlanListRequest, + ModelProfilesRequest, + PlanItem, + PlanState, + EventItem, + EventType, + UserAccount, + UserApiKey, + logger, ) -logger = logging.getLogger(__name__) -if not _dotenv_loaded: - logger.warning( - "No .env file found; searched: %s", - ", ".join(str(path) for path in _dotenv_paths), - ) -from database_api.planexe_db_singleton import db -from database_api.model_planitem import PlanItem, PlanState -from database_api.model_event import EventItem, EventType -from database_api.model_user_account import UserAccount -from database_api.model_user_api_key import UserApiKey -from flask import Flask, has_app_context -from mcp_cloud.tool_models import ( - ModelProfilesInput, - ModelProfilesOutput, - PromptExamplesInput, - PromptExamplesOutput, - PlanCreateInput, - PlanCreateOutput, - PlanRetryInput, - PlanRetryOutput, - PlanStopOutput, - PlanStatusInput, - PlanStopInput, - PlanFileInfoInput, - PlanFileInfoNotReadyOutput, - PlanStatusSuccess, - PlanFileInfoReadyOutput, - PlanListInput, - PlanListOutput, - ErrorDetail, - # backward-compat aliases used by internal Request classes - TaskCreateInput, - TaskStatusInput, - TaskStopInput, - TaskRetryInput, - TaskFileInfoInput, - TaskListInput, +# -- auth: API-key hashing and user resolution -------------------------------- +from mcp_cloud.auth import ( # noqa: F401 + _hash_user_api_key, + _resolve_user_from_api_key, ) -app = Flask(__name__) -app.config.from_pyfile('config.py') - -def build_postgres_uri_from_env(env: dict[str, str]) -> tuple[str, dict[str, str]]: - """Construct a SQLAlchemy URI for Postgres using environment variables.""" - host = 
env.get("PLANEXE_POSTGRES_HOST") or "database_postgres" - port = str(env.get("PLANEXE_POSTGRES_PORT") or "5432") - dbname = env.get("PLANEXE_POSTGRES_DB") or "planexe" - user = env.get("PLANEXE_POSTGRES_USER") or "planexe" - password = env.get("PLANEXE_POSTGRES_PASSWORD") or "planexe" - uri = f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{dbname}" - safe_config = {"host": host, "port": port, "dbname": dbname, "user": user} - return uri, safe_config - -sqlalchemy_database_uri = os.environ.get("SQLALCHEMY_DATABASE_URI") -if sqlalchemy_database_uri is None: - sqlalchemy_database_uri, db_settings = build_postgres_uri_from_env(os.environ) - logger.info(f"SQLALCHEMY_DATABASE_URI not set. Using Postgres defaults: {db_settings}") -else: - logger.info("Using SQLALCHEMY_DATABASE_URI from environment.") - -app.config['SQLALCHEMY_DATABASE_URI'] = sqlalchemy_database_uri -app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'pool_recycle': 280, 'pool_pre_ping': True} -db.init_app(app) - -def ensure_planitem_stop_columns() -> None: - statements = ( - "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_jsonl TEXT", - "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_bytes INTEGER", - "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_activity_overview_json JSON", - "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_artifact_layout_version INTEGER", - "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested BOOLEAN", - "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested_timestamp TIMESTAMP", - ) - with db.engine.begin() as conn: - for statement in statements: - try: - conn.execute(text(statement)) - except Exception as exc: - logger.warning("Schema update failed for %s: %s", statement, exc, exc_info=True) - -with app.app_context(): - ensure_planitem_stop_columns() - -# Shown in MCP initialize (e.g. Inspector) so clients know what PlanExe does. 
-PLANEXE_SERVER_INSTRUCTIONS = ( - "PlanExe generates strategic project-plan drafts from a natural-language prompt. " - "Output is a self-contained interactive HTML report (~700KB) with 20+ sections including " - "executive summary, interactive Gantt charts, risk analysis, SWOT, governance, investor pitch, " - "team profiles, work breakdown, scenario comparison, expert criticism, and adversarial sections " - "(premortem, self-audit checklist, premise attacks) that stress-test whether the plan holds up. " - "The output is a draft to refine, not final ground truth — but it surfaces hard questions the prompter may not have considered. " - "Use PlanExe for substantial multi-phase projects with constraints, stakeholders, budgets, and timelines. " - "Do not use PlanExe for tiny one-shot outputs (for example: 'give me a 5-point checklist'); use a normal LLM response for that. " - "The planning pipeline is fixed end-to-end; callers cannot select individual internal pipeline steps to run. " - "Required interaction order: call prompt_examples first. " - "Optional before plan_create: call model_profiles to see profile guidance and available models in each profile. " - "Then perform a non-tool step: draft a strong prompt as flowing prose (not structured markdown with headers or bullets), " - "typically ~300-800 words, and get user approval. " - "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. " - "Write the prompt as flowing prose — weave specs, constraints, and targets naturally into sentences. " - "Only after approval, call plan_create. " - "Each plan_create call creates a new task_id; the server does not enforce a global per-client concurrency limit. " - "Then poll plan_status (about every 5 minutes); use plan_file_info when complete. " - "If a run fails, call plan_retry with the failed task_id to requeue it (optional model_profile, defaults to baseline). 
" - "To stop, call plan_stop with the task_id from plan_create; stopping is asynchronous and the task will eventually transition to failed. " - "If model_profiles returns MODEL_PROFILES_UNAVAILABLE, inform the user that no models are currently configured and the server administrator needs to set up model profiles. " - "Tool errors use {error:{code,message}}. plan_file_info returns {ready:false,reason:...} while the artifact is not yet ready; check readiness by testing whether download_url is present in the response. " - "plan_file_info download_url is the absolute URL where the requested artifact can be downloaded. " - "To list recent tasks for a user call plan_list with user_api_key; returns task_id, state, progress_percentage, created_at, and prompt_excerpt for each task. " - "plan_status state contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. " - "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the task was likely queued but not picked up by a worker (server issue). " - "If plan_status is in processing and output files do not change for longer than 20 minutes, the plan_create likely failed/stalled. " - "In both cases, report the issue to PlanExe developers on GitHub: https://github.com/PlanExeOrg/PlanExe/issues . " - "Main output: a self-contained interactive HTML report (~700KB) with collapsible sections and interactive Gantt charts — open in a browser. " - "The zip contains the intermediary pipeline files (md, json, csv) that fed the report." 
+# -- db_queries: plan lookup and sync DB operations ---------------------------- +from mcp_cloud.db_queries import ( # noqa: F401 + find_plan_by_task_id, + get_plan_by_id, + resolve_plan_for_task_id, + _create_plan_sync, + _get_plan_status_snapshot_sync, + _request_plan_stop_sync, + _retry_failed_plan_sync, + _get_plan_for_report_sync, + _list_plans_sync, + get_plan_state_mapping, + _extract_plan_create_metadata_overrides, + _merge_plan_create_config, ) -mcp_cloud = Server("planexe-mcp-cloud", instructions=PLANEXE_SERVER_INSTRUCTIONS) - -# Base directory for run artifacts (not used directly, fetched via worker_plan HTTP API) -BASE_DIR_RUN = Path(os.environ.get("PLANEXE_RUN_DIR", Path(__file__).parent.parent / "run")).resolve() - -WORKER_PLAN_URL = os.environ.get("PLANEXE_WORKER_PLAN_URL", "http://worker_plan:8000") - -REPORT_FILENAME = "030-report.html" -REPORT_CONTENT_TYPE = "text/html; charset=utf-8" -ZIP_FILENAME = "run.zip" -ZIP_CONTENT_TYPE = "application/zip" -ZIP_SNAPSHOT_MAX_BYTES = 100_000_000 - -ModelProfileInput = Literal[ - "baseline", - "premium", - "frontier", - "custom", -] -MODEL_PROFILE_TITLES = { - ModelProfileEnum.BASELINE.value: "Baseline", - ModelProfileEnum.PREMIUM.value: "Premium", - ModelProfileEnum.FRONTIER.value: "Frontier", - ModelProfileEnum.CUSTOM.value: "Custom", -} -MODEL_PROFILE_SUMMARIES = { - ModelProfileEnum.BASELINE.value: "Cheap and fast; recommended default when creating a plan.", - ModelProfileEnum.PREMIUM.value: "Higher-cost profile tuned for stronger output quality.", - ModelProfileEnum.FRONTIER.value: "Most capable models first; usually slowest/most expensive.", - ModelProfileEnum.CUSTOM.value: "User-managed profile file for custom model ordering.", -} - -class TaskCreateRequest(BaseModel): - prompt: str - model_profile: Optional[ModelProfileInput] = None - user_api_key: Optional[str] = None - -class TaskStatusRequest(BaseModel): - task_id: str - -class TaskStopRequest(BaseModel): - task_id: str - -class 
TaskRetryRequest(BaseModel): - task_id: str - model_profile: ModelProfileInput = "baseline" - -class TaskFileInfoRequest(BaseModel): - task_id: str - artifact: Optional[str] = None - -class TaskListRequest(BaseModel): - user_api_key: str - limit: int = 10 - -class ModelProfilesRequest(BaseModel): - """No input parameters.""" - pass - -# Helper functions -def find_plan_by_task_id(task_id: str) -> Optional[PlanItem]: - """Find PlanItem by MCP task_id (UUID), with legacy fallback.""" - task = get_task_by_id(task_id) - if task is not None: - return task - - def _query_legacy() -> Optional[PlanItem]: - query = db.session.query(PlanItem) - if db.engine.dialect.name == "postgresql": - tasks = query.filter( - cast(PlanItem.parameters, JSONB).contains({"_mcp_task_id": task_id}) - ).all() - else: - tasks = query.filter( - PlanItem.parameters.contains({"_mcp_task_id": task_id}) - ).all() - if tasks: - return tasks[0] - return None - - if has_app_context(): - legacy_task = _query_legacy() - else: - with app.app_context(): - legacy_task = _query_legacy() - if legacy_task is not None: - logger.debug("Resolved legacy MCP task id %s to task %s", task_id, legacy_task.id) - return legacy_task - -def get_task_by_id(task_id: str) -> Optional[PlanItem]: - """Fetch a PlanItem by its UUID string.""" - def _query() -> Optional[PlanItem]: - try: - task_uuid = uuid.UUID(task_id) - except ValueError: - return None - return db.session.get(PlanItem, task_uuid) - - if has_app_context(): - return _query() - with app.app_context(): - return _query() - -def resolve_task_for_task_id(task_id: str) -> Optional[PlanItem]: - """Resolve a PlanItem from a task_id (UUID), with legacy fallback.""" - return find_plan_by_task_id(task_id) - -def _hash_user_api_key(raw_key: str) -> str: - secret = os.environ.get("PLANEXE_API_KEY_SECRET", "dev-api-key-secret") - if secret == "dev-api-key-secret": - logger.warning("PLANEXE_API_KEY_SECRET not set. 
Using dev secret for API key hashing.") - return hashlib.sha256(f"{secret}:{raw_key}".encode("utf-8")).hexdigest() - -def _resolve_user_from_api_key(raw_key: str) -> Optional[dict[str, Any]]: - if not raw_key: - return None - key_hash = _hash_user_api_key(raw_key) - with app.app_context(): - api_key = UserApiKey.query.filter_by(key_hash=key_hash, revoked_at=None).first() - if not api_key: - return None - user = db.session.get(UserAccount, api_key.user_id) - if not user: - return None - - user_context = { - "user_id": str(user.id), - "credits_balance": float(user.credits_balance or 0), - } - api_key.last_used_at = datetime.now(UTC) - db.session.commit() - return user_context - -def _create_task_sync( - prompt: str, - config: Optional[dict[str, Any]], - metadata: Optional[dict[str, Any]], -) -> dict[str, Any]: - with app.app_context(): - parameters = dict(config or {}) - parameters["model_profile"] = normalize_model_profile(parameters.get("model_profile")).value - parameters["trigger_source"] = "mcp plan_create" - - task = PlanItem( - prompt=prompt, - state=PlanState.pending, - user_id=metadata.get("user_id", "admin") if metadata else "admin", - parameters=parameters, - ) - db.session.add(task) - db.session.commit() - - task_id = str(task.id) - event_context = { - "task_id": task_id, - "task_handle": task_id, - "prompt": task.prompt, - "user_id": task.user_id, - "config": config, - "metadata": metadata, - "parameters": task.parameters, - } - event = EventItem( - event_type=EventType.TASK_PENDING, - message="Enqueued task via MCP", - context=event_context, - ) - db.session.add(event) - db.session.commit() - - created_at = task.timestamp_created - if created_at and created_at.tzinfo is None: - created_at = created_at.replace(tzinfo=UTC) - return { - "task_id": task_id, - "created_at": created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z"), - } - -def _get_task_status_snapshot_sync(task_id: str) -> Optional[dict[str, Any]]: - with app.app_context(): - task 
= find_plan_by_task_id(task_id) - if task is None: - return None - return { - "id": str(task.id), - "state": task.state, - "stop_requested": bool(task.stop_requested), - "progress_percentage": task.progress_percentage, - "timestamp_created": task.timestamp_created, - } - -def _request_task_stop_sync(task_id: str) -> Optional[dict[str, Any]]: - with app.app_context(): - task = find_plan_by_task_id(task_id) - if task is None: - return None - stop_requested = False - if task.state in (PlanState.pending, PlanState.processing): - task.stop_requested = True - task.stop_requested_timestamp = datetime.now(UTC) - task.progress_message = "Stop requested by user." - db.session.commit() - logger.info("Stop requested for task %s; stop flag set on task %s.", task_id, task.id) - stop_requested = True - return { - "state": get_task_state_mapping(task.state), - "stop_requested": stop_requested, - } - - -def _retry_failed_task_sync(task_id: str, model_profile: str) -> Optional[dict[str, Any]]: - with app.app_context(): - task = find_plan_by_task_id(task_id) - if task is None: - return None - if task.state != PlanState.failed: - return { - "error": { - "code": "TASK_NOT_FAILED", - "message": f"Task is not in failed state: {task_id}", - } - } - - normalized_profile = normalize_model_profile(model_profile).value - now_utc = datetime.now(UTC) - parameters = dict(task.parameters) if isinstance(task.parameters, dict) else {} - parameters["model_profile"] = normalized_profile - parameters["trigger_source"] = "mcp plan_retry" - - # Reset task state and clear prior run artifacts before requeueing. - task.state = PlanState.pending - task.timestamp_created = now_utc - task.progress_percentage = 0.0 - task.progress_message = "Retry requested via MCP." 
- task.stop_requested = False - task.stop_requested_timestamp = None - task.generated_report_html = None - task.run_zip_snapshot = None - task.run_track_activity_jsonl = None - task.run_track_activity_bytes = None - task.run_activity_overview_json = None - task.run_artifact_layout_version = None - task.parameters = parameters - db.session.commit() - - event_context = { - "task_id": str(task.id), - "task_handle": str(task.id), - "retry_of_task_id": task_id, - "model_profile": normalized_profile, - "parameters": task.parameters, - } - event = EventItem( - event_type=EventType.TASK_PENDING, - message="Retried failed task via MCP", - context=event_context, - ) - db.session.add(event) - db.session.commit() - - return { - "task_id": str(task.id), - "state": get_task_state_mapping(task.state), - "model_profile": normalized_profile, - "retried_at": now_utc.replace(microsecond=0).isoformat().replace("+00:00", "Z"), - } - - -def _get_task_for_report_sync(task_id: str) -> Optional[dict[str, Any]]: - with app.app_context(): - task = resolve_task_for_task_id(task_id) - if task is None: - return None - return { - "id": str(task.id), - "state": task.state, - "progress_message": task.progress_message, - } - -def _list_tasks_sync(user_id: str, limit: int) -> list[dict[str, Any]]: - with app.app_context(): - tasks = ( - db.session.query(PlanItem) - .filter_by(user_id=user_id) - .order_by(PlanItem.timestamp_created.desc()) - .limit(max(1, min(limit, 50))) - .all() - ) - results = [] - for task in tasks: - created_at = task.timestamp_created - if created_at and created_at.tzinfo is None: - created_at = created_at.replace(tzinfo=UTC) - results.append({ - "task_id": str(task.id), - "state": get_task_state_mapping(task.state), - "progress_percentage": float(task.progress_percentage or 0.0), - "created_at": ( - created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z") - if created_at else None - ), - "prompt_excerpt": (task.prompt or "")[:100], - }) - return results - - -def 
list_files_from_zip_bytes(zip_bytes: bytes) -> list[str]: - """List file entries from an in-memory zip archive.""" - try: - with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file: - files = [name for name in zip_file.namelist() if not name.endswith("/")] - return sorted(files) - except Exception as exc: - logger.warning("Unable to list files from zip snapshot: %s", exc) - return [] - -def extract_file_from_zip_bytes(zip_bytes: bytes, file_path: str) -> Optional[bytes]: - """Extract a file from an in-memory zip archive.""" - try: - with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file: - file_path_normalized = file_path.lstrip('/') - try: - return zip_file.read(file_path_normalized) - except KeyError: - return None - except Exception as exc: - logger.warning("Unable to read %s from zip snapshot: %s", file_path, exc) - return None - -def extract_file_from_zip_file(file_handle: io.BufferedIOBase, file_path: str) -> Optional[bytes]: - """Extract a file from a seekable zip file handle.""" - try: - with zipfile.ZipFile(file_handle, 'r') as zip_file: - file_path_normalized = file_path.lstrip('/') - try: - return zip_file.read(file_path_normalized) - except KeyError: - return None - except Exception as exc: - logger.warning("Unable to read %s from zip stream: %s", file_path, exc) - return None - -def fetch_report_from_db(task_id: str) -> Optional[bytes]: - """Fetch the report HTML stored in the PlanItem.""" - task = get_task_by_id(task_id) - if task and task.generated_report_html is not None: - return task.generated_report_html.encode("utf-8") - return None - -def fetch_zip_snapshot(task_id: str) -> Optional[bytes]: - """Fetch the zip snapshot stored in the PlanItem.""" - task = get_task_by_id(task_id) - if task and task.run_zip_snapshot is not None: - return task.run_zip_snapshot - return None - -def fetch_file_from_zip_snapshot(task_id: str, file_path: str) -> Optional[bytes]: - """Fetch a file from the PlanItem zip snapshot.""" - task = get_task_by_id(task_id) - 
if task and task.run_zip_snapshot is not None: - return extract_file_from_zip_bytes(task.run_zip_snapshot, file_path) - return None - -def list_files_from_zip_snapshot(task_id: str) -> Optional[list[str]]: - """List files from the PlanItem zip snapshot.""" - task = get_task_by_id(task_id) - if task and task.run_zip_snapshot is not None: - return list_files_from_zip_bytes(task.run_zip_snapshot) - return None - -async def fetch_artifact_from_worker_plan(run_id: str, file_path: str) -> Optional[bytes]: - """Fetch an artifact file from worker_plan via HTTP.""" - try: - async with httpx.AsyncClient(timeout=60.0) as client: - # For report.html, use the dedicated report endpoint (most efficient) - if ( - file_path == "report.html" - or file_path.endswith("/report.html") - or file_path == REPORT_FILENAME - or file_path.endswith(f"/{REPORT_FILENAME}") - ): - report_response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/report") - if report_response.status_code == 200: - return report_response.content - logger.warning(f"Worker plan returned {report_response.status_code} for report: {run_id}") - report_from_db = await asyncio.to_thread(fetch_report_from_db, run_id) - if report_from_db is not None: - return report_from_db - report_from_zip = await asyncio.to_thread( - fetch_file_from_zip_snapshot, run_id, REPORT_FILENAME - ) - if report_from_zip is not None: - return report_from_zip - return None - - # For other files, fetch the zip and extract the file - # This is less efficient but works without a file serving endpoint - async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as zip_response: - if zip_response.status_code != 200: - logger.warning(f"Worker plan returned {zip_response.status_code} for zip: {run_id}") - else: - zip_too_large = False - content_length = zip_response.headers.get("content-length") - if content_length: - try: - if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES: - logger.warning( - "Zip snapshot too large (%s bytes) for run %s; 
skipping.", - content_length, - run_id, - ) - zip_too_large = True - except ValueError: - logger.warning( - "Invalid Content-Length for zip snapshot: %s", content_length - ) - if not zip_too_large: - with tempfile.TemporaryFile() as tmp_file: - size = 0 - async for chunk in zip_response.aiter_bytes(): - size += len(chunk) - if size > ZIP_SNAPSHOT_MAX_BYTES: - logger.warning( - "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.", - ZIP_SNAPSHOT_MAX_BYTES, - run_id, - ) - zip_too_large = True - break - tmp_file.write(chunk) - if not zip_too_large: - tmp_file.seek(0) - file_data = extract_file_from_zip_file(tmp_file, file_path) - if file_data is not None: - return file_data - - snapshot_file = await asyncio.to_thread(fetch_file_from_zip_snapshot, run_id, file_path) - if snapshot_file is not None: - return snapshot_file - return None - - except Exception as e: - logger.error(f"Error fetching artifact from worker_plan: {e}", exc_info=True) - return None - -async def fetch_file_list_from_worker_plan(run_id: str) -> Optional[list[str]]: - """Fetch the list of files from worker_plan via HTTP.""" - try: - async with httpx.AsyncClient(timeout=30.0) as client: - response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/files") - if response.status_code == 200: - data = response.json() - files = data.get("files", []) - if files: - return files - fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id) - if fallback_files: - return fallback_files - return files - logger.warning(f"Worker plan returned {response.status_code} for files list: {run_id}") - fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id) - if fallback_files is not None: - return fallback_files - return None - except Exception as e: - logger.error(f"Error fetching file list from worker_plan: {e}", exc_info=True) - return None - - -def list_files_from_local_run_dir(run_id: str) -> Optional[list[str]]: - """ - List files from local run directory when 
this service shares PLANEXE_RUN_DIR - with the worker (e.g., Docker compose). - """ - run_dir = (BASE_DIR_RUN / run_id).resolve() - try: - if not run_dir.is_relative_to(BASE_DIR_RUN): - return None - except ValueError: - return None - if not run_dir.exists() or not run_dir.is_dir(): - return None - try: - return sorted([path.name for path in run_dir.iterdir() if path.is_file()]) - except Exception as exc: - logger.warning("Unable to list local run dir files for %s: %s", run_id, exc) - return None - -async def fetch_zip_from_worker_plan(run_id: str) -> Optional[bytes]: - """Fetch the zip snapshot from worker_plan via HTTP.""" - try: - async with httpx.AsyncClient(timeout=60.0) as client: - async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as response: - if response.status_code != 200: - logger.warning("Worker plan returned %s for zip: %s", response.status_code, run_id) - else: - zip_too_large = False - content_length = response.headers.get("content-length") - if content_length: - try: - if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES: - logger.warning( - "Zip snapshot too large (%s bytes) for run %s; skipping.", - content_length, - run_id, - ) - zip_too_large = True - except ValueError: - logger.warning( - "Invalid Content-Length for zip snapshot: %s", content_length - ) - if not zip_too_large: - buffer = BytesIO() - size = 0 - async for chunk in response.aiter_bytes(): - size += len(chunk) - if size > ZIP_SNAPSHOT_MAX_BYTES: - logger.warning( - "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.", - ZIP_SNAPSHOT_MAX_BYTES, - run_id, - ) - zip_too_large = True - break - buffer.write(chunk) - if not zip_too_large: - return buffer.getvalue() - - snapshot_bytes = await asyncio.to_thread(fetch_zip_snapshot, run_id) - if snapshot_bytes is not None: - return snapshot_bytes - return None - except Exception as e: - logger.error(f"Error fetching zip from worker_plan: {e}", exc_info=True) - return None - - -def 
_sanitize_legacy_zip_snapshot(zip_bytes: bytes) -> Optional[bytes]: - """Remove internal track_activity.jsonl files from legacy zip snapshots.""" - try: - with zipfile.ZipFile(BytesIO(zip_bytes), "r") as in_zip: - entries = [name for name in in_zip.namelist() if not name.endswith("/")] - if not any(name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl" for name in entries): - return zip_bytes - out_buffer = BytesIO() - with zipfile.ZipFile(out_buffer, "w", compression=zipfile.ZIP_DEFLATED) as out_zip: - for name in entries: - if name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl": - continue - out_zip.writestr(name, in_zip.read(name)) - return out_buffer.getvalue() - except Exception as exc: - logger.warning("Unable to sanitize legacy run zip snapshot: %s", exc) - return None - - -async def fetch_user_downloadable_zip(task_id: str) -> Optional[bytes]: - """ - Fetch a user-downloadable zip for a task. - New layout snapshots are served directly from PlanItem.run_zip_snapshot. - Legacy/task-dir fallbacks are sanitized to remove track_activity.jsonl. 
- """ - task = await asyncio.to_thread(get_task_by_id, task_id) - if task is None: - return None - - snapshot_bytes = task.run_zip_snapshot if task.run_zip_snapshot is not None else None - layout_version = task.run_artifact_layout_version or 0 - if snapshot_bytes is not None: - if layout_version >= 2: - return snapshot_bytes - return _sanitize_legacy_zip_snapshot(snapshot_bytes) - - worker_plan_zip = await fetch_zip_from_worker_plan(str(task.id)) - if worker_plan_zip is None: - return None - return _sanitize_legacy_zip_snapshot(worker_plan_zip) - -def compute_sha256(content: str | bytes) -> str: - """Compute SHA256 hash of content.""" - if isinstance(content, str): - content = content.encode('utf-8') - return hashlib.sha256(content).hexdigest() - -def get_task_state_mapping(task_state: PlanState) -> str: - """Map PlanState to MCP task state.""" - mapping = { - PlanState.pending: "pending", - PlanState.processing: "processing", - PlanState.completed: "completed", - PlanState.failed: "failed", - } - return mapping.get(task_state, "pending") - -def _extract_task_create_metadata_overrides(arguments: dict[str, Any]) -> dict[str, Any]: - """Extract plan_create runtime overrides from hidden metadata containers. 
- - Supported hidden containers: - - arguments.tool_metadata - - arguments.metadata - - arguments._meta - - If a container includes nested namespaces, these are checked first: - - plan_create - - task_create (legacy alias) - - planexe_task_create (legacy alias) - - planexe - """ - merged: dict[str, Any] = {} - metadata_candidates: list[dict[str, Any]] = [] - - for key in ("tool_metadata", "metadata", "_meta"): - candidate = arguments.get(key) - if isinstance(candidate, dict): - metadata_candidates.append(candidate) - - for candidate in metadata_candidates: - merged.update(candidate) - for nested_key in ("plan_create", "task_create", "planexe_task_create", "planexe"): - nested = candidate.get(nested_key) - if isinstance(nested, dict): - merged.update(nested) - - return merged - -def _merge_task_create_config( - config: Optional[dict[str, Any]], - model_profile: Optional[str], -) -> Optional[dict[str, Any]]: - merged = dict(config or {}) - if isinstance(model_profile, str): - candidate_profile = model_profile.strip() - if candidate_profile and "model_profile" not in merged: - merged["model_profile"] = candidate_profile - return merged or None - - -def _sort_llm_config_entries(items: list[tuple[str, Any]]) -> list[tuple[str, Any]]: - def sort_key(item: tuple[str, Any]) -> tuple[int, str]: - key, model_data = item - priority = None - if isinstance(model_data, dict): - maybe_priority = model_data.get("priority") - if isinstance(maybe_priority, int): - priority = maybe_priority - if priority is None: - priority = 999999 - return priority, key - - return sorted(items, key=sort_key) - - -def _extract_model_profile_entries( - model_map: dict[str, Any], - whitelist: Optional[set[str]], -) -> list[dict[str, Any]]: - models: list[dict[str, Any]] = [] - - for model_key, model_data in _sort_llm_config_entries(list(model_map.items())): - class_name = model_data.get("class") if isinstance(model_data, dict) else None - if not is_llm_class_allowed(class_name, whitelist): - continue 
- - model_name = None - priority = None - if isinstance(model_data, dict): - arguments = model_data.get("arguments") - if isinstance(arguments, dict): - maybe_model = arguments.get("model") - if isinstance(maybe_model, str): - model_name = maybe_model - maybe_priority = model_data.get("priority") - if isinstance(maybe_priority, int): - priority = maybe_priority - elif isinstance(model_data.get("prio"), int): - priority = model_data["prio"] - - models.append( - { - "key": model_key, - "provider_class": class_name if isinstance(class_name, str) else None, - "model": model_name, - "priority": priority, - } - ) - - return models - - -def _profile_models_payload( - profile: ModelProfileEnum, - whitelist: Optional[set[str]], -) -> dict[str, Any]: - config_filename = default_filename_for_profile(profile) - planexe_config_path = PlanExeConfig.resolve_planexe_config_path() - config_path = PlanExeConfig.find_file_in_search_order(config_filename, planexe_config_path) - if config_path is None: - return { - "profile": profile.value, - "title": MODEL_PROFILE_TITLES[profile.value], - "summary": MODEL_PROFILE_SUMMARIES[profile.value], - "model_count": 0, - "models": [], - } - - try: - with config_path.open("r", encoding="utf-8") as fh: - model_map = json.load(fh) - except Exception as exc: - logger.warning( - "Unable to read profile config %s for model profile %s: %s", - config_filename, - profile.value, - exc, - ) - return { - "profile": profile.value, - "title": MODEL_PROFILE_TITLES[profile.value], - "summary": MODEL_PROFILE_SUMMARIES[profile.value], - "model_count": 0, - "models": [], - } - - if not isinstance(model_map, dict): - return { - "profile": profile.value, - "title": MODEL_PROFILE_TITLES[profile.value], - "summary": MODEL_PROFILE_SUMMARIES[profile.value], - "model_count": 0, - "models": [], - } - - models = _extract_model_profile_entries(model_map, whitelist) - return { - "profile": profile.value, - "title": MODEL_PROFILE_TITLES[profile.value], - "summary": 
MODEL_PROFILE_SUMMARIES[profile.value],
-        "model_count": len(models),
-        "models": models,
-    }
-
-
-def _get_model_profiles_sync() -> dict[str, Any]:
-    raw_whitelist = os.environ.get(ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES)
-    whitelist = parse_llm_class_whitelist(raw_whitelist)
-    default_profile = resolve_model_profile_from_env().value
-    profiles_all = [
-        _profile_models_payload(profile, whitelist)
-        for profile in ModelProfileEnum
-    ]
-    profiles = [profile for profile in profiles_all if int(profile.get("model_count") or 0) > 0]
-
-    return {
-        "default_profile": default_profile,
-        "profiles": profiles,
-        "message": (
-            "Use one of these profile values in plan_create.model_profile. "
-            "Model lists show what is currently available in each profile."
-        ),
-    }
-
-# Context var set by HTTP server so download URLs use the request's host when
-# PLANEXE_MCP_PUBLIC_BASE_URL is not set (avoids localhost for remote clients).
-_download_base_url_ctx: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
-    "download_base_url", default=None
+# -- zip_utils: zip extraction, sanitization, hashing -------------------------
+from mcp_cloud.zip_utils import (  # noqa: F401
+    list_files_from_zip_bytes,
+    extract_file_from_zip_bytes,
+    extract_file_from_zip_file,
+    fetch_report_from_db,
+    fetch_zip_snapshot,
+    fetch_file_from_zip_snapshot,
+    list_files_from_zip_snapshot,
+    _sanitize_legacy_zip_snapshot,
+    compute_sha256,
 )
+# -- worker_fetchers: HTTP fetchers for worker_plan artifacts ------------------
+from mcp_cloud.worker_fetchers import (  # noqa: F401
+    fetch_artifact_from_worker_plan,
+    fetch_file_list_from_worker_plan,
+    list_files_from_local_run_dir,
+    fetch_zip_from_worker_plan,
+    fetch_user_downloadable_zip,
+)
-def set_download_base_url(base_url: Optional[str]) -> None:
-    """Set the base URL used for download links for this request (e.g. from HTTP Request).
-    Cleared automatically when the request ends. Used when PLANEXE_MCP_PUBLIC_BASE_URL is unset."""
-    if base_url is not None:
-        _download_base_url_ctx.set(base_url.rstrip("/"))
-    else:
-        try:
-            _download_base_url_ctx.set("")
-        except LookupError:
-            pass
-
-
-def clear_download_base_url() -> None:
-    """Clear the request-scoped base URL (call when request ends)."""
-    try:
-        _download_base_url_ctx.set("")
-    except LookupError:
-        pass
-
-
-def _get_download_base_url() -> Optional[str]:
-    """Return base URL for download links: env var, then request context, then None."""
-    base_url = os.environ.get("PLANEXE_MCP_PUBLIC_BASE_URL")
-    if base_url:
-        return base_url.rstrip("/")
-    try:
-        ctx_url = _download_base_url_ctx.get()
-        return ctx_url if ctx_url else None
-    except LookupError:
-        return None
-
-
-def build_report_download_path(task_id: str) -> str:
-    return f"/download/{task_id}/{REPORT_FILENAME}"
-
-
-def build_zip_download_path(task_id: str) -> str:
-    return f"/download/{task_id}/{ZIP_FILENAME}"
-
-
-# ---------------------------------------------------------------------------
-# Signed, expiring download tokens
-# ---------------------------------------------------------------------------
-
-# Default TTL for signed download tokens (seconds). Configurable via env var.
-DOWNLOAD_TOKEN_TTL_SECONDS = int(os.environ.get("PLANEXE_DOWNLOAD_TOKEN_TTL", "900"))  # 15 min
-
-# Per-process fallback secret when no env var is set. Tokens won't survive a
-# server restart, but that is acceptable for the fallback case.
-_random_token_secret: Optional[bytes] = None
-
-
-def _get_download_token_secret() -> bytes:
-    """Return the HMAC-SHA256 secret used to sign download tokens.
-
-    Priority: PLANEXE_DOWNLOAD_TOKEN_SECRET → PLANEXE_API_KEY_SECRET →
-    per-process random (with a warning logged once).
-    """
-    global _random_token_secret
-    for env_var in ("PLANEXE_DOWNLOAD_TOKEN_SECRET", "PLANEXE_API_KEY_SECRET"):
-        value = os.environ.get(env_var)
-        if value:
-            return value.encode()
-    if _random_token_secret is None:
-        _random_token_secret = secrets.token_bytes(32)
-        logger.warning(
-            "PLANEXE_DOWNLOAD_TOKEN_SECRET is not set; using a random per-process secret. "
-            "Download tokens will be invalidated on server restart. "
-            "Set PLANEXE_DOWNLOAD_TOKEN_SECRET to a stable value."
-        )
-    return _random_token_secret
-
-
-def generate_download_token(task_id: str, filename: str) -> str:
-    """Return a signed, time-limited token for one task artifact download.
-
-    Format: ``{expiry_unix_ts}.{hmac_hex}``
-    The HMAC covers ``task_id:filename:expiry`` so the token is scoped to
-    exactly one file and cannot be reused for a different task.
-    """
-    expiry = int(time.time()) + DOWNLOAD_TOKEN_TTL_SECONDS
-    message = f"{task_id}:{filename}:{expiry}".encode()
-    mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest()
-    return f"{expiry}.{mac}"
-
-
-def validate_download_token(token: str, task_id: str, filename: str) -> bool:
-    """Return True when *token* is a valid, unexpired token for the given artifact."""
-    try:
-        expiry_str, mac = token.split(".", 1)
-        expiry = int(expiry_str)
-    except (ValueError, AttributeError):
-        return False
-    if time.time() > expiry:
-        return False
-    message = f"{task_id}:{filename}:{expiry}".encode()
-    expected_mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest()
-    return hmac.compare_digest(mac, expected_mac)
-
-
-def build_report_download_url(task_id: str) -> Optional[str]:
-    base_url = _get_download_base_url()
-    if not base_url:
-        return None
-    token = generate_download_token(task_id, REPORT_FILENAME)
-    return f"{base_url}{build_report_download_path(task_id)}?token={token}"
-
-
-def build_zip_download_url(task_id: str) -> Optional[str]:
-    base_url = _get_download_base_url()
-    if not base_url:
-
return None - token = generate_download_token(task_id, ZIP_FILENAME) - return f"{base_url}{build_zip_download_path(task_id)}?token={token}" - - -def _load_mcp_example_prompts() -> list[str]: - """Load prompts from the catalog that are marked as MCP examples (mcp_example or mcp-example-prompt true). - - Uses worker_plan_api.PromptCatalog the same way as frontend_single_user and frontend_multi_user - (no env var). Tries repo-root import first, then adds worker_plan to sys.path so worker_plan_api - is top-level (same as frontends). Falls back to built-in examples if the catalog is unavailable. - """ - catalog = None - try: - from worker_plan.worker_plan_api.prompt_catalog import PromptCatalog - - catalog = PromptCatalog() - catalog.load_simple_plan_prompts() - except Exception: - try: - # Same as frontends when worker_plan exists; when not (e.g. Docker), repo_root has worker_plan_api - import sys - - repo_root = Path(__file__).resolve().parent.parent - worker_plan_dir = repo_root / "worker_plan" - path_to_add = str(worker_plan_dir if worker_plan_dir.exists() else repo_root) - if path_to_add not in sys.path: - sys.path.insert(0, path_to_add) - from worker_plan_api.prompt_catalog import PromptCatalog - - catalog = PromptCatalog() - catalog.load_simple_plan_prompts() - except Exception as e: - logger.warning( - "Prompt catalog unavailable (%s); using built-in examples.", - e, - ) - return _builtin_mcp_example_prompts() - - if catalog is None: - return _builtin_mcp_example_prompts() - - samples: list[str] = [] - for item in catalog.all(): - if item.extras.get("mcp_example") is True or item.extras.get("mcp-example-prompt") is True: - samples.append(item.prompt) - if not samples: - return _builtin_mcp_example_prompts() - return samples - - -def _builtin_mcp_example_prompts() -> list[str]: - """Fallback example prompts when the catalog file is missing or has no mcp_example entries.""" - return [ - ( - "Vegan Butcher Shop. That sells artificial meat (Plant-Based). 
Location Kødbyen, Copenhagen. " - "Sell sandwiches and sausages. Provocative marketing. Budget: 10 million DKK. Grand Opening in month 3. " - "Profitability Goal: month 12. Create a signature item that is a social media hit. " - "Pick a realistic scenario. I already have negotiated a 2 year lease inside Kødbyen. " - "Banned words: blockchain, VR, AR, AI, Robots." - ), - ( - "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. " - "Budget 2.5M DKK. Open within 12 months. Include equipment, staffing, permits, and marketing. " - "Pick a realistic scenario; avoid overly ambitious timelines." - ), - ] - - -PLAN_CREATE_INPUT_SCHEMA = PlanCreateInput.model_json_schema() -PLAN_CREATE_OUTPUT_SCHEMA = PlanCreateOutput.model_json_schema() -PLAN_STATUS_SUCCESS_SCHEMA = PlanStatusSuccess.model_json_schema() -PLAN_STATUS_OUTPUT_SCHEMA = { - "oneOf": [ - { - "type": "object", - "properties": {"error": ErrorDetail.model_json_schema()}, - "required": ["error"], - }, - PLAN_STATUS_SUCCESS_SCHEMA, - ] -} -PLAN_STOP_OUTPUT_SCHEMA = PlanStopOutput.model_json_schema() -PLAN_RETRY_OUTPUT_SCHEMA = PlanRetryOutput.model_json_schema() -PLAN_FILE_INFO_READY_OUTPUT_SCHEMA = PlanFileInfoReadyOutput.model_json_schema() -PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA = PlanFileInfoNotReadyOutput.model_json_schema() -PLAN_FILE_INFO_OUTPUT_SCHEMA = { - "oneOf": [ - { - "type": "object", - "properties": {"error": ErrorDetail.model_json_schema()}, - "required": ["error"], - }, - PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA, - PLAN_FILE_INFO_READY_OUTPUT_SCHEMA, - ] -} -PLAN_STATUS_INPUT_SCHEMA = PlanStatusInput.model_json_schema() -PLAN_STOP_INPUT_SCHEMA = PlanStopInput.model_json_schema() -PLAN_RETRY_INPUT_SCHEMA = PlanRetryInput.model_json_schema() -PLAN_FILE_INFO_INPUT_SCHEMA = PlanFileInfoInput.model_json_schema() - -PROMPT_EXAMPLES_INPUT_SCHEMA = PromptExamplesInput.model_json_schema() -PROMPT_EXAMPLES_OUTPUT_SCHEMA = PromptExamplesOutput.model_json_schema() 
-MODEL_PROFILES_INPUT_SCHEMA = ModelProfilesInput.model_json_schema() -MODEL_PROFILES_OUTPUT_SCHEMA = ModelProfilesOutput.model_json_schema() -PLAN_LIST_INPUT_SCHEMA = PlanListInput.model_json_schema() -PLAN_LIST_OUTPUT_SCHEMA = PlanListOutput.model_json_schema() - -# Backward-compatible aliases for tests that reference old TASK_* names -TASK_CREATE_INPUT_SCHEMA = PLAN_CREATE_INPUT_SCHEMA -TASK_CREATE_OUTPUT_SCHEMA = PLAN_CREATE_OUTPUT_SCHEMA -TASK_STATUS_INPUT_SCHEMA = PLAN_STATUS_INPUT_SCHEMA -TASK_STATUS_OUTPUT_SCHEMA = PLAN_STATUS_OUTPUT_SCHEMA -TASK_STOP_INPUT_SCHEMA = PLAN_STOP_INPUT_SCHEMA -TASK_STOP_OUTPUT_SCHEMA = PLAN_STOP_OUTPUT_SCHEMA -TASK_RETRY_INPUT_SCHEMA = PLAN_RETRY_INPUT_SCHEMA -TASK_RETRY_OUTPUT_SCHEMA = PLAN_RETRY_OUTPUT_SCHEMA -TASK_FILE_INFO_INPUT_SCHEMA = PLAN_FILE_INFO_INPUT_SCHEMA -TASK_FILE_INFO_OUTPUT_SCHEMA = PLAN_FILE_INFO_OUTPUT_SCHEMA -TASK_LIST_INPUT_SCHEMA = PLAN_LIST_INPUT_SCHEMA -TASK_LIST_OUTPUT_SCHEMA = PLAN_LIST_OUTPUT_SCHEMA - -@dataclass(frozen=True) -class ToolDefinition: - name: str - description: str - input_schema: dict[str, Any] - output_schema: Optional[dict[str, Any]] = None - annotations: Optional[dict[str, Any]] = None - -TOOL_DEFINITIONS = [ - ToolDefinition( - name="prompt_examples", - description=( - "Call this first. Returns example prompts that define what a good prompt looks like. " - "Do NOT call plan_create yet. Optional before plan_create: call model_profiles to choose model_profile. " - "Next is a non-tool step: formulate a detailed prompt (typically ~300-800 words; use examples as a baseline, similar structure) and get user approval. " - "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. " - "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. " - "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. 
" - "The examples demonstrate this prose style — match their tone and density. " - "Then call plan_create. " - "PlanExe is not for tiny one-shot outputs like a 5-point checklist; and it does not support selecting only some internal pipeline steps." - ), - input_schema=PROMPT_EXAMPLES_INPUT_SCHEMA, - output_schema=PROMPT_EXAMPLES_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": True, - "destructiveHint": False, - "idempotentHint": True, - "openWorldHint": False, - }, - ), - ToolDefinition( - name="model_profiles", - description=( - "Optional helper before plan_create. Returns model_profile options with plain-language guidance " - "and currently available models in each profile. " - "If no models are available, returns error code MODEL_PROFILES_UNAVAILABLE." - ), - input_schema=MODEL_PROFILES_INPUT_SCHEMA, - output_schema=MODEL_PROFILES_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": True, - "destructiveHint": False, - "idempotentHint": True, - "openWorldHint": False, - }, - ), - ToolDefinition( - name="plan_create", - description=( - "Call only after prompt_examples and after you have completed prompt drafting/approval (non-tool step). " - "PlanExe turns the approved prompt into a strategic project-plan draft (20+ sections) in ~10-20 min. " - "Sections include: executive summary, interactive Gantt charts, investor pitch, project plan with SMART criteria, " - "strategic decision analysis, scenario comparison, assumptions with expert review, governance structure, " - "SWOT analysis, team role profiles, simulated expert criticism, work breakdown structure, " - "plan review (critical issues, KPIs, financial strategy, automation opportunities), Q&A, " - "premortem with failure scenarios, self-audit checklist, and adversarial premise attacks that argue against the project. " - "The adversarial sections (premortem, self-audit, premise attacks) surface risks and questions the prompter may not have considered. 
" - "Returns task_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_file_info. " - "If you lose a task_id, call plan_list with your user_api_key to recover it. " - "Each plan_create call creates a new task_id (no server-side dedup). " - "If you are unsure which model_profile to choose, call model_profiles first. " - "If your deployment uses credits, include user_api_key to charge the correct account. " - "Common error codes: INVALID_USER_API_KEY, USER_API_KEY_REQUIRED, INSUFFICIENT_CREDITS." - ), - input_schema=PLAN_CREATE_INPUT_SCHEMA, - output_schema=PLAN_CREATE_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": False, - "destructiveHint": False, - "idempotentHint": False, - "openWorldHint": True, - }, - ), - ToolDefinition( - name="plan_status", - description=( - "Returns status and progress of the plan currently being created. " - "Poll at reasonable intervals only (e.g. every 5 minutes): plan generation typically takes 10-20 minutes " - "(baseline profile) and may take longer on higher-quality profiles. " - "State contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. " - "progress_percentage is 0-100 (integer-like float); 100 when completed. " - "files lists intermediate outputs produced so far; use their updated_at timestamps to detect stalls. " - "Unknown task_id returns error code TASK_NOT_FOUND. " - "Troubleshooting: pending for >5 minutes likely means queued but not picked up by a worker. " - "processing with no file-output changes for >20 minutes likely means failed/stalled. " - "Report these issues to https://github.com/PlanExeOrg/PlanExe/issues ." - ), - input_schema=PLAN_STATUS_INPUT_SCHEMA, - output_schema=PLAN_STATUS_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": True, - "destructiveHint": False, - "idempotentHint": True, - "openWorldHint": False, - }, - ), - ToolDefinition( - name="plan_stop", - description=( - "Request the plan generation to stop. 
Pass the task_id (the UUID returned by plan_create). " - "Stopping is asynchronous: the stop flag is set immediately but the task may continue briefly before halting. " - "A stopped task will eventually transition to the failed state. " - "If the task is already completed or failed, stop_requested returns false (the task already finished). " - "Unknown task_id returns error code TASK_NOT_FOUND." - ), - input_schema=PLAN_STOP_INPUT_SCHEMA, - output_schema=PLAN_STOP_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": False, - "destructiveHint": True, - "idempotentHint": True, - "openWorldHint": False, - }, - ), - ToolDefinition( - name="plan_retry", - description=( - "Retry a task that is currently in failed state. " - "Pass the failed task_id and optionally model_profile (defaults to baseline). " - "The task is reset to pending, prior artifacts are cleared, and the same task_id is requeued for processing. " - "Returns TASK_NOT_FOUND when task_id is unknown and TASK_NOT_FAILED when the task is not in failed state." - ), - input_schema=PLAN_RETRY_INPUT_SCHEMA, - output_schema=PLAN_RETRY_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": False, - "destructiveHint": False, - "idempotentHint": False, - "openWorldHint": True, - }, - ), - ToolDefinition( - name="plan_file_info", - description=( - "Returns file metadata (content_type, download_url, download_size) for the report or zip artifact. " - "Use artifact='report' (default) for the interactive HTML report (~700KB, self-contained with embedded JS " - "for collapsible sections and interactive Gantt charts — open in a browser). " - "Use artifact='zip' for the full pipeline output bundle (md, json, csv intermediary files that fed the report). " - "While the task is still pending or processing, returns {ready:false,reason:\"processing\"}. " - "Check readiness by testing whether download_url is present in the response. " - "Once ready, present download_url to the user or fetch and save the file locally. 
" - "If your client exposes plan_download (e.g. mcp_local), prefer that to save the file locally. " - "Terminal error codes: generation_failed (plan failed), content_unavailable (artifact missing). " - "Unknown task_id returns error code TASK_NOT_FOUND." - ), - input_schema=PLAN_FILE_INFO_INPUT_SCHEMA, - output_schema=PLAN_FILE_INFO_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": True, - "destructiveHint": False, - "idempotentHint": True, - "openWorldHint": False, - }, - ), - ToolDefinition( - name="plan_list", - description=( - "List the most recent tasks for an authenticated user. " - "Requires user_api_key (pex_...). " - "Returns up to `limit` tasks (default 10, max 50) newest-first, each with task_id, state, " - "progress_percentage, created_at (ISO 8601), and a prompt_excerpt (first 100 chars). " - "Use this to recover a lost task_id or to review recent activity." - ), - input_schema=PLAN_LIST_INPUT_SCHEMA, - output_schema=PLAN_LIST_OUTPUT_SCHEMA, - annotations={ - "readOnlyHint": True, - "destructiveHint": False, - "idempotentHint": True, - "openWorldHint": False, - }, - ), -] - -@mcp_cloud.list_tools() -async def handle_list_tools() -> list[Tool]: - """List all available MCP tools.""" - return [ - Tool( - name=definition.name, - description=definition.description, - outputSchema=definition.output_schema, - inputSchema=definition.input_schema, - annotations=ToolAnnotations(**definition.annotations) if definition.annotations else None, - ) - for definition in TOOL_DEFINITIONS - ] - -@mcp_cloud.call_tool() -async def handle_call_tool(name: str, arguments: dict[str, Any]) -> CallToolResult: - """Dispatch MCP tool calls and return structured JSON errors for unknown tools.""" - try: - handler = TOOL_HANDLERS.get(name) - if handler is None: - response = {"error": {"code": "INVALID_TOOL", "message": f"Unknown tool: {name}"}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - 
return await handler(arguments) - except Exception as e: - logger.error(f"Error handling tool {name}: {e}", exc_info=True) - response = {"error": {"code": "INTERNAL_ERROR", "message": str(e)}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - -async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult: - """Create a new PlanExe task and enqueue it for processing. - - Examples: - - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. Open within 12 months."} → returns task_id (UUID) + created_at - - Args: - - prompt: What the plan should cover (goal, context, constraints). - - model_profile: Optional profile ("baseline" | "premium" | "frontier" | "custom"). Call model_profiles to inspect options. - - Returns: - - content: JSON string matching structuredContent. - - structuredContent: {"task_id": "", "created_at": ...} - - isError: False on success. 
- """ - req = TaskCreateRequest(**arguments) - metadata_overrides = _extract_task_create_metadata_overrides(arguments) - metadata_model_profile = metadata_overrides.get("model_profile") - model_profile = req.model_profile - if model_profile is None and isinstance(metadata_model_profile, str): - model_profile = metadata_model_profile - - merged_config = _merge_task_create_config(None, model_profile) - require_user_key = os.environ.get("PLANEXE_MCP_REQUIRE_USER_KEY", "false").lower() in ("1", "true", "yes", "on") - user_context = None - if req.user_api_key: - user_context = _resolve_user_from_api_key(req.user_api_key.strip()) - if not user_context: - response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - elif require_user_key: - response = {"error": {"code": "USER_API_KEY_REQUIRED", "message": "user_api_key is required for plan_create."}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - - if user_context and float(user_context.get("credits_balance", 0.0)) <= 0.0: - response = {"error": {"code": "INSUFFICIENT_CREDITS", "message": "Not enough credits."}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - - response = await asyncio.to_thread( - _create_task_sync, - req.prompt, - merged_config, - {"user_id": str(user_context["user_id"])} if user_context else None, - ) - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - - -async def handle_prompt_examples(arguments: dict[str, Any]) -> CallToolResult: - """Return curated prompts from the catalog (mcp_example true) so LLMs can see example detail.""" - samples = 
_load_mcp_example_prompts() - payload = { - "samples": samples, - "message": ( - "Next: complete the non-tool step by drafting a detailed prompt (typically ~300-800 words) using these as a baseline (similar structure), then get user approval. " - "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. " - "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. " - "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. " - "The examples demonstrate this prose style — match their tone and density. " - "Only after approval, call plan_create. " - "Do not use PlanExe for tiny one-shot requests (e.g., rewrite this email, summarize this document). " - "PlanExe always runs the full fixed planning pipeline; callers cannot run only selected internal steps." - ), - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(payload))], - structuredContent=payload, - isError=False, - ) - - -async def handle_model_profiles(arguments: dict[str, Any]) -> CallToolResult: - """Return model profile options and currently available models in each profile.""" - _ = ModelProfilesRequest(**(arguments or {})) - payload = await asyncio.to_thread(_get_model_profiles_sync) - profiles = payload.get("profiles") - if not isinstance(profiles, list) or len(profiles) == 0: - response = { - "error": { - "code": "MODEL_PROFILES_UNAVAILABLE", - "message": ( - "No models are currently configured. " - "Inform the user that the server administrator needs to set up model profiles before plans can be created." 
- ), - } - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(payload))], - structuredContent=payload, - isError=False, - ) - - -async def handle_plan_status(arguments: dict[str, Any]) -> CallToolResult: - """Fetch the current task status, progress, and recent files for a task. - - Examples: - - {"task_id": "uuid"} → state/progress/timing + recent files - - Args: - - task_id: Task UUID returned by plan_create. - - Returns: - - content: JSON string matching structuredContent. - - structuredContent: status payload or error. - - isError: True only when task_id is unknown. - """ - req = TaskStatusRequest(**arguments) - task_id = req.task_id - - task_snapshot = await asyncio.to_thread(_get_task_status_snapshot_sync, task_id) - if task_snapshot is None: - response = { - "error": { - "code": "TASK_NOT_FOUND", - "message": f"Task not found: {task_id}", - } - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - - progress_percentage = float(task_snapshot.get("progress_percentage") or 0.0) - - task_state = task_snapshot["state"] - state = get_task_state_mapping(task_state) - if task_state == PlanState.completed: - progress_percentage = 100.0 - - # Collect files from worker_plan - task_uuid = task_snapshot["id"] - files = [] - if task_uuid: - files_list = await fetch_file_list_from_worker_plan(task_uuid) - if not files_list: - files_list = await asyncio.to_thread(list_files_from_zip_snapshot, task_uuid) - if not files_list: - files_list = await asyncio.to_thread(list_files_from_local_run_dir, task_uuid) - if files_list: - for file_name in files_list[:10]: # Limit to 10 most recent - if file_name != "log.txt": - updated_at = datetime.now(UTC).replace(microsecond=0) - files.append({ - "path": file_name, - "updated_at": 
updated_at.isoformat().replace("+00:00", "Z"), # Approximate - }) - - created_at = task_snapshot["timestamp_created"] - if created_at and created_at.tzinfo is None: - created_at = created_at.replace(tzinfo=UTC) - - response = { - "task_id": task_uuid, - "state": state, - "progress_percentage": progress_percentage, - "timing": { - "started_at": ( - created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z") - if created_at - else None - ), - "elapsed_sec": (datetime.now(UTC) - created_at).total_seconds() if created_at else 0, - }, - "files": files[:10], # Limit to 10 most recent - } - - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - -async def handle_plan_stop(arguments: dict[str, Any]) -> CallToolResult: - """Request an active task to stop. - - Examples: - - {"task_id": "uuid"} → stop request accepted - - Args: - - task_id: Task UUID returned by plan_create. - - Returns: - - content: JSON string matching structuredContent. - - structuredContent: {"state": "pending|processing|completed|failed", "stop_requested": bool} or error payload. - - isError: True only when task_id is unknown. 
- """ - req = TaskStopRequest(**arguments) - task_id = req.task_id - - stop_result = await asyncio.to_thread(_request_task_stop_sync, task_id) - if stop_result is None: - response = { - "error": { - "code": "TASK_NOT_FOUND", - "message": f"Task not found: {task_id}", - } - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - - response = stop_result - - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - - -async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult: - """Retry a failed task by resetting it back to pending.""" - req = TaskRetryRequest(**arguments) - task_id = req.task_id - retry_result = await asyncio.to_thread(_retry_failed_task_sync, task_id, req.model_profile) - - if retry_result is None: - response = { - "error": { - "code": "TASK_NOT_FOUND", - "message": f"Task not found: {task_id}", - } - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - - if isinstance(retry_result.get("error"), dict): - response = retry_result - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - - response = retry_result - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - - -async def handle_plan_file_info(arguments: dict[str, Any]) -> CallToolResult: - """Return download metadata for a task's report or zip artifact. - - Examples: - - {"task_id": "uuid"} → report metadata (default) - - {"task_id": "uuid", "artifact": "zip"} → zip metadata - - Args: - - task_id: Task UUID returned by plan_create. - - artifact: Optional "report" or "zip". - - Returns: - - content: JSON string matching structuredContent. 
- - structuredContent: metadata (content_type, sha256, download_size, - optional download_url) or {} if not ready, or error payload. - - isError: True only when task_id is unknown. - """ - req = TaskFileInfoRequest(**arguments) - task_id = req.task_id - artifact = req.artifact.strip().lower() if isinstance(req.artifact, str) else "report" - if artifact not in ("report", "zip"): - artifact = "report" - task_snapshot = await asyncio.to_thread(_get_task_for_report_sync, task_id) - if task_snapshot is None: - response = { - "error": { - "code": "TASK_NOT_FOUND", - "message": f"Task not found: {task_id}", - } - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - - run_id = task_snapshot["id"] - if artifact == "zip": - content_bytes = await fetch_user_downloadable_zip(run_id) - if content_bytes is None: - task_state = task_snapshot["state"] - if task_state in (PlanState.pending, PlanState.processing) or task_state is None: - response = {"ready": False, "reason": "processing"} - else: - response = { - "error": { - "code": "content_unavailable", - "message": "zip content_bytes is None", - }, - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - - total_size = len(content_bytes) - content_hash = compute_sha256(content_bytes) - response = { - "content_type": ZIP_CONTENT_TYPE, - "sha256": content_hash, - "download_size": total_size, - } - download_url = build_zip_download_url(run_id) - if download_url: - response["download_url"] = download_url - - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - - task_state = task_snapshot["state"] - if task_state in (PlanState.pending, PlanState.processing) or task_state is None: - response = {"ready": False, "reason": "processing"} - return CallToolResult( - 
content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - if task_state == PlanState.failed: - message = task_snapshot["progress_message"] or "Plan generation failed." - response = {"ready": False, "reason": "failed", "error": {"code": "generation_failed", "message": message}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - - content_bytes = await fetch_artifact_from_worker_plan(run_id, REPORT_FILENAME) - if content_bytes is None: - response = { - "error": { - "code": "content_unavailable", - "message": "content_bytes is None", - }, - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) - - total_size = len(content_bytes) - content_hash = compute_sha256(content_bytes) - response = { - "content_type": REPORT_CONTENT_TYPE, - "sha256": content_hash, - "download_size": total_size, - } - download_url = build_report_download_url(run_id) - if download_url: - response["download_url"] = download_url +# -- model_profiles: model profile introspection ------------------------------ +from mcp_cloud.model_profiles import ( # noqa: F401 + _sort_llm_config_entries, + _extract_model_profile_entries, + _profile_models_payload, + _get_model_profiles_sync, +) - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) +# -- download_tokens: signed download tokens and URL builders ------------------ +from mcp_cloud.download_tokens import ( # noqa: F401 + _download_base_url_ctx, + set_download_base_url, + clear_download_base_url, + _get_download_base_url, + _get_download_token_secret, + generate_download_token, + validate_download_token, + build_report_download_url, + build_zip_download_url, + build_report_download_path, + build_zip_download_path, +) -async def 
handle_plan_list(arguments: dict[str, Any]) -> CallToolResult: - """Return recent tasks for an authenticated user.""" - try: - req = TaskListRequest(**arguments) - except Exception as exc: - response = {"error": {"code": "INVALID_ARGUMENTS", "message": str(exc)}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - user_context = _resolve_user_from_api_key(req.user_api_key.strip()) - if not user_context: - response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}} - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=True, - ) - limit = max(1, min(req.limit, 50)) - tasks = await asyncio.to_thread(_list_tasks_sync, str(user_context["user_id"]), limit) - response = { - "tasks": tasks, - "message": f"Returned {len(tasks)} task(s).", - } - return CallToolResult( - content=[TextContent(type="text", text=json.dumps(response))], - structuredContent=response, - isError=False, - ) +# -- prompt_examples: example prompt loading ----------------------------------- +from mcp_cloud.prompt_examples import ( # noqa: F401 + _load_mcp_example_prompts, + _builtin_mcp_example_prompts, +) +# -- schemas: tool schema constants and ToolDefinition ------------------------- +from mcp_cloud.schemas import ( # noqa: F401 + PLAN_CREATE_INPUT_SCHEMA, + PLAN_CREATE_OUTPUT_SCHEMA, + PLAN_STATUS_SUCCESS_SCHEMA, + PLAN_STATUS_OUTPUT_SCHEMA, + PLAN_STOP_OUTPUT_SCHEMA, + PLAN_RETRY_OUTPUT_SCHEMA, + PLAN_FILE_INFO_READY_OUTPUT_SCHEMA, + PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA, + PLAN_FILE_INFO_OUTPUT_SCHEMA, + PLAN_STATUS_INPUT_SCHEMA, + PLAN_STOP_INPUT_SCHEMA, + PLAN_RETRY_INPUT_SCHEMA, + PLAN_FILE_INFO_INPUT_SCHEMA, + PROMPT_EXAMPLES_INPUT_SCHEMA, + PROMPT_EXAMPLES_OUTPUT_SCHEMA, + MODEL_PROFILES_INPUT_SCHEMA, + MODEL_PROFILES_OUTPUT_SCHEMA, + PLAN_LIST_INPUT_SCHEMA, + PLAN_LIST_OUTPUT_SCHEMA, + 
ToolDefinition, + TOOL_DEFINITIONS, +) -TOOL_HANDLERS = { - "plan_create": handle_plan_create, - "plan_status": handle_plan_status, - "plan_stop": handle_plan_stop, - "plan_retry": handle_plan_retry, - "plan_file_info": handle_plan_file_info, - "plan_list": handle_plan_list, - "prompt_examples": handle_prompt_examples, - "model_profiles": handle_model_profiles, -} +# -- handlers: MCP tool handlers and dispatch ---------------------------------- +from mcp_cloud.handlers import ( # noqa: F401 + handle_list_tools, + handle_call_tool, + handle_plan_create, + handle_prompt_examples, + handle_model_profiles, + handle_plan_status, + handle_plan_stop, + handle_plan_retry, + handle_plan_file_info, + handle_plan_list, + TOOL_HANDLERS, +) -# Backward-compatible aliases so existing imports of handle_task_* still work -handle_task_create = handle_plan_create -handle_task_status = handle_plan_status -handle_task_stop = handle_plan_stop -handle_task_retry = handle_plan_retry -handle_task_file_info = handle_plan_file_info -handle_task_list = handle_plan_list async def main(): """Main entry point for MCP server.""" logger.info("Starting PlanExe MCP Cloud...") - + with app.app_context(): db.create_all() logger.info("Database initialized") - + async with stdio_server() as streams: await mcp_cloud.run( streams[0], diff --git a/mcp_cloud/auth.py b/mcp_cloud/auth.py new file mode 100644 index 000000000..c0140ad2b --- /dev/null +++ b/mcp_cloud/auth.py @@ -0,0 +1,50 @@ +"""PlanExe MCP Cloud – API-key hashing and user resolution.""" +import hashlib +import logging +import os +from datetime import UTC, datetime +from typing import Any, Optional + +from mcp_cloud.db_setup import app, db, UserApiKey, UserAccount + +logger = logging.getLogger(__name__) + + +def validate_api_key_secret() -> None: + """Raise if PLANEXE_API_KEY_SECRET is not set. + + Call at startup when authentication is required so the server + fails hard instead of silently falling back to a dev secret. 
+ """ + if not os.environ.get("PLANEXE_API_KEY_SECRET"): + raise RuntimeError( + "PLANEXE_API_KEY_SECRET is not set. " + "Set this environment variable or disable auth with PLANEXE_MCP_REQUIRE_AUTH=false." + ) + + +def _hash_user_api_key(raw_key: str) -> str: + secret = os.environ.get("PLANEXE_API_KEY_SECRET", "dev-api-key-secret") + if secret == "dev-api-key-secret": + logger.warning("PLANEXE_API_KEY_SECRET not set. Using dev secret for API key hashing.") + return hashlib.sha256(f"{secret}:{raw_key}".encode("utf-8")).hexdigest() + +def _resolve_user_from_api_key(raw_key: str) -> Optional[dict[str, Any]]: + if not raw_key: + return None + key_hash = _hash_user_api_key(raw_key) + with app.app_context(): + api_key = UserApiKey.query.filter_by(key_hash=key_hash, revoked_at=None).first() + if not api_key: + return None + user = db.session.get(UserAccount, api_key.user_id) + if not user: + return None + + user_context = { + "user_id": str(user.id), + "credits_balance": float(user.credits_balance or 0), + } + api_key.last_used_at = datetime.now(UTC) + db.session.commit() + return user_context diff --git a/mcp_cloud/db_queries.py b/mcp_cloud/db_queries.py new file mode 100644 index 000000000..d68680a20 --- /dev/null +++ b/mcp_cloud/db_queries.py @@ -0,0 +1,304 @@ +"""PlanExe MCP Cloud – database query helpers.""" +import logging +import uuid +from datetime import UTC, datetime +from typing import Any, Optional + +from flask import has_app_context +from sqlalchemy import cast +from sqlalchemy.dialects.postgresql import JSONB +from worker_plan_api.model_profile import normalize_model_profile + +from mcp_cloud.db_setup import app, db, PlanItem, PlanState, EventItem, EventType + +logger = logging.getLogger(__name__) + +PROMPT_EXCERPT_MAX_LENGTH = 100 + + +# --------------------------------------------------------------------------- +# Plan lookup +# --------------------------------------------------------------------------- + +def find_plan_by_task_id(task_id: str) -> 
Optional[PlanItem]: + """Find PlanItem by MCP task_id (UUID), with legacy fallback.""" + plan = get_plan_by_id(task_id) + if plan is not None: + return plan + + def _query_legacy() -> Optional[PlanItem]: + query = db.session.query(PlanItem) + if db.engine.dialect.name == "postgresql": + plans = query.filter( + cast(PlanItem.parameters, JSONB).contains({"_mcp_task_id": task_id}) + ).all() + else: + plans = query.filter( + PlanItem.parameters.contains({"_mcp_task_id": task_id}) + ).all() + if plans: + return plans[0] + return None + + if has_app_context(): + legacy_plan = _query_legacy() + else: + with app.app_context(): + legacy_plan = _query_legacy() + if legacy_plan is not None: + logger.debug("Resolved legacy MCP task id %s to plan %s", task_id, legacy_plan.id) + return legacy_plan + +def get_plan_by_id(task_id: str) -> Optional[PlanItem]: + """Fetch a PlanItem by its UUID string.""" + def _query() -> Optional[PlanItem]: + try: + plan_uuid = uuid.UUID(task_id) + except ValueError: + return None + return db.session.get(PlanItem, plan_uuid) + + if has_app_context(): + return _query() + with app.app_context(): + return _query() + +def resolve_plan_for_task_id(task_id: str) -> Optional[PlanItem]: + """Resolve a PlanItem from a task_id (UUID), with legacy fallback.""" + return find_plan_by_task_id(task_id) + + +# --------------------------------------------------------------------------- +# Sync operations called from handlers via asyncio.to_thread +# --------------------------------------------------------------------------- + +def _create_plan_sync( + prompt: str, + config: Optional[dict[str, Any]], + metadata: Optional[dict[str, Any]], +) -> dict[str, Any]: + with app.app_context(): + parameters = dict(config or {}) + parameters["model_profile"] = normalize_model_profile(parameters.get("model_profile")).value + parameters["trigger_source"] = "mcp plan_create" + + plan = PlanItem( + prompt=prompt, + state=PlanState.pending, + user_id=metadata.get("user_id", "admin") 
if metadata else "admin", + parameters=parameters, + ) + db.session.add(plan) + db.session.commit() + + plan_id = str(plan.id) + event_context = { + "plan_id": plan_id, + "task_handle": plan_id, + "prompt": plan.prompt, + "user_id": plan.user_id, + "config": config, + "metadata": metadata, + "parameters": plan.parameters, + } + event = EventItem( + event_type=EventType.TASK_PENDING, + message="Enqueued task via MCP", + context=event_context, + ) + db.session.add(event) + db.session.commit() + + created_at = plan.timestamp_created + if created_at and created_at.tzinfo is None: + created_at = created_at.replace(tzinfo=UTC) + return { + "plan_id": plan_id, + "created_at": created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z"), + } + +def _get_plan_status_snapshot_sync(task_id: str) -> Optional[dict[str, Any]]: + with app.app_context(): + plan = find_plan_by_task_id(task_id) + if plan is None: + return None + return { + "id": str(plan.id), + "state": plan.state, + "stop_requested": bool(plan.stop_requested), + "progress_percentage": plan.progress_percentage, + "timestamp_created": plan.timestamp_created, + } + +def _request_plan_stop_sync(task_id: str) -> Optional[dict[str, Any]]: + with app.app_context(): + plan = find_plan_by_task_id(task_id) + if plan is None: + return None + stop_requested = False + if plan.state in (PlanState.pending, PlanState.processing): + plan.stop_requested = True + plan.stop_requested_timestamp = datetime.now(UTC) + plan.progress_message = "Stop requested by user." 
+ db.session.commit() + logger.info("Stop requested for task %s; stop flag set on plan %s.", task_id, plan.id) + stop_requested = True + return { + "state": get_plan_state_mapping(plan.state), + "stop_requested": stop_requested, + } + + +def _retry_failed_plan_sync(task_id: str, model_profile: str) -> Optional[dict[str, Any]]: + with app.app_context(): + plan = find_plan_by_task_id(task_id) + if plan is None: + return None + if plan.state != PlanState.failed: + return { + "error": { + "code": "PLAN_NOT_FAILED", + "message": f"Plan is not in failed state: {task_id}", + } + } + + normalized_profile = normalize_model_profile(model_profile).value + now_utc = datetime.now(UTC) + parameters = dict(plan.parameters) if isinstance(plan.parameters, dict) else {} + parameters["model_profile"] = normalized_profile + parameters["trigger_source"] = "mcp plan_retry" + + # Reset plan state and clear prior run artifacts before requeueing. + plan.state = PlanState.pending + plan.timestamp_created = now_utc + plan.progress_percentage = 0.0 + plan.progress_message = "Retry requested via MCP." 
+ plan.stop_requested = False + plan.stop_requested_timestamp = None + plan.generated_report_html = None + plan.run_zip_snapshot = None + plan.run_track_activity_jsonl = None + plan.run_track_activity_bytes = None + plan.run_activity_overview_json = None + plan.run_artifact_layout_version = None + plan.parameters = parameters + db.session.commit() + + event_context = { + "plan_id": str(plan.id), + "task_handle": str(plan.id), + "retry_of_plan_id": task_id, + "model_profile": normalized_profile, + "parameters": plan.parameters, + } + event = EventItem( + event_type=EventType.TASK_PENDING, + message="Retried failed task via MCP", + context=event_context, + ) + db.session.add(event) + db.session.commit() + + return { + "plan_id": str(plan.id), + "state": get_plan_state_mapping(plan.state), + "model_profile": normalized_profile, + "retried_at": now_utc.replace(microsecond=0).isoformat().replace("+00:00", "Z"), + } + + +def _get_plan_for_report_sync(task_id: str) -> Optional[dict[str, Any]]: + with app.app_context(): + plan = resolve_plan_for_task_id(task_id) + if plan is None: + return None + return { + "id": str(plan.id), + "state": plan.state, + "progress_message": plan.progress_message, + } + +def _list_plans_sync(user_id: Optional[str], limit: int) -> list[dict[str, Any]]: + with app.app_context(): + query = db.session.query(PlanItem) + if user_id is not None: + query = query.filter_by(user_id=user_id) + plans = ( + query + .order_by(PlanItem.timestamp_created.desc()) + .limit(max(1, min(limit, 50))) + .all() + ) + results = [] + for plan in plans: + created_at = plan.timestamp_created + if created_at and created_at.tzinfo is None: + created_at = created_at.replace(tzinfo=UTC) + results.append({ + "plan_id": str(plan.id), + "state": get_plan_state_mapping(plan.state), + "progress_percentage": float(plan.progress_percentage or 0.0), + "created_at": ( + created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z") + if created_at else None + ), + 
"prompt_excerpt": (plan.prompt or "")[:PROMPT_EXCERPT_MAX_LENGTH], + }) + return results + + +# --------------------------------------------------------------------------- +# Utilities +# --------------------------------------------------------------------------- + +def get_plan_state_mapping(plan_state: PlanState) -> str: + """Map PlanState to MCP task state.""" + mapping = { + PlanState.pending: "pending", + PlanState.processing: "processing", + PlanState.completed: "completed", + PlanState.failed: "failed", + } + return mapping.get(plan_state, "pending") + +def _extract_plan_create_metadata_overrides(arguments: dict[str, Any]) -> dict[str, Any]: + """Extract plan_create runtime overrides from hidden metadata containers. + + Supported hidden containers: + - arguments.tool_metadata + - arguments.metadata + - arguments._meta + + If a container includes nested namespaces, these are checked first: + - plan_create + - task_create (legacy alias) + - planexe_task_create (legacy alias) + - planexe + """ + merged: dict[str, Any] = {} + metadata_candidates: list[dict[str, Any]] = [] + + for key in ("tool_metadata", "metadata", "_meta"): + candidate = arguments.get(key) + if isinstance(candidate, dict): + metadata_candidates.append(candidate) + + for candidate in metadata_candidates: + merged.update(candidate) + for nested_key in ("plan_create", "task_create", "planexe_task_create", "planexe"): + nested = candidate.get(nested_key) + if isinstance(nested, dict): + merged.update(nested) + + return merged + +def _merge_plan_create_config( + config: Optional[dict[str, Any]], + model_profile: Optional[str], +) -> Optional[dict[str, Any]]: + merged = dict(config or {}) + if isinstance(model_profile, str): + candidate_profile = model_profile.strip() + if candidate_profile and "model_profile" not in merged: + merged["model_profile"] = candidate_profile + return merged or None diff --git a/mcp_cloud/db_setup.py b/mcp_cloud/db_setup.py new file mode 100644 index 000000000..81149e97d 
--- /dev/null +++ b/mcp_cloud/db_setup.py @@ -0,0 +1,170 @@ +"""PlanExe MCP Cloud – database setup, Flask app, constants, and request classes.""" +import logging +import os +from pathlib import Path +from typing import Literal, Optional +from urllib.parse import quote_plus + +from flask import Flask +from mcp.server import Server +from pydantic import BaseModel +from sqlalchemy import text +from worker_plan_api.model_profile import ModelProfileEnum + +from mcp_cloud.dotenv_utils import load_planexe_dotenv + +_dotenv_loaded, _dotenv_paths = load_planexe_dotenv(Path(__file__).parent) + +logging.basicConfig( + level=logging.INFO, + format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' +) +logger = logging.getLogger(__name__) +if not _dotenv_loaded: + logger.warning( + "No .env file found; searched: %s", + ", ".join(str(path) for path in _dotenv_paths), + ) + +from database_api.planexe_db_singleton import db +from database_api.model_planitem import PlanItem, PlanState +from database_api.model_event import EventItem, EventType +from database_api.model_user_account import UserAccount +from database_api.model_user_api_key import UserApiKey + +app = Flask(__name__) +app.config.from_pyfile('config.py') + +def build_postgres_uri_from_env(env: dict[str, str]) -> tuple[str, dict[str, str]]: + """Construct a SQLAlchemy URI for Postgres using environment variables.""" + host = env.get("PLANEXE_POSTGRES_HOST") or "database_postgres" + port = str(env.get("PLANEXE_POSTGRES_PORT") or "5432") + dbname = env.get("PLANEXE_POSTGRES_DB") or "planexe" + user = env.get("PLANEXE_POSTGRES_USER") or "planexe" + password = env.get("PLANEXE_POSTGRES_PASSWORD") or "planexe" + uri = f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{dbname}" + safe_config = {"host": host, "port": port, "dbname": dbname, "user": user} + return uri, safe_config + +sqlalchemy_database_uri = os.environ.get("SQLALCHEMY_DATABASE_URI") +if sqlalchemy_database_uri is None: + 
sqlalchemy_database_uri, db_settings = build_postgres_uri_from_env(os.environ) + logger.info(f"SQLALCHEMY_DATABASE_URI not set. Using Postgres defaults: {db_settings}") +else: + logger.info("Using SQLALCHEMY_DATABASE_URI from environment.") + +app.config['SQLALCHEMY_DATABASE_URI'] = sqlalchemy_database_uri +app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'pool_recycle': 280, 'pool_pre_ping': True} +db.init_app(app) + +def ensure_planitem_stop_columns() -> None: + statements = ( + "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_jsonl TEXT", + "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_bytes INTEGER", + "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_activity_overview_json JSON", + "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_artifact_layout_version INTEGER", + "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested BOOLEAN", + "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested_timestamp TIMESTAMP", + ) + with db.engine.begin() as conn: + for statement in statements: + try: + conn.execute(text(statement)) + except Exception as exc: + logger.warning("Schema update failed for %s: %s", statement, exc, exc_info=True) + +with app.app_context(): + ensure_planitem_stop_columns() + +# Shown in MCP initialize (e.g. Inspector) so clients know what PlanExe does. +PLANEXE_SERVER_INSTRUCTIONS = ( + "PlanExe generates strategic project-plan drafts from a natural-language prompt. " + "Output is a self-contained interactive HTML report (~700KB) with 20+ sections including " + "executive summary, interactive Gantt charts, risk analysis, SWOT, governance, investor pitch, " + "team profiles, work breakdown, scenario comparison, expert criticism, and adversarial sections " + "(premortem, self-audit checklist, premise attacks) that stress-test whether the plan holds up. " + "The output is a draft to refine, not final ground truth — but it surfaces hard questions the prompter may not have considered. 
" + "Use PlanExe for substantial multi-phase projects with constraints, stakeholders, budgets, and timelines. " + "Do not use PlanExe for tiny one-shot outputs (for example: 'give me a 5-point checklist'); use a normal LLM response for that. " + "The planning pipeline is fixed end-to-end; callers cannot select individual internal pipeline steps to run. " + "Required interaction order: call prompt_examples first. " + "Optional before plan_create: call model_profiles to see profile guidance and available models in each profile. " + "Then perform a non-tool step: draft a strong prompt as flowing prose (not structured markdown with headers or bullets), " + "typically ~300-800 words, and get user approval. " + "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. " + "Write the prompt as flowing prose — weave specs, constraints, and targets naturally into sentences. " + "Only after approval, call plan_create. " + "Each plan_create call creates a new plan_id; the server does not enforce a global per-client concurrency limit. " + "Then poll plan_status (about every 5 minutes); use plan_file_info when complete. " + "If a run fails, call plan_retry with the failed plan_id to requeue it (optional model_profile, defaults to baseline). " + "To stop, call plan_stop with the plan_id from plan_create; stopping is asynchronous and the plan will eventually transition to failed. " + "If model_profiles returns MODEL_PROFILES_UNAVAILABLE, inform the user that no models are currently configured and the server administrator needs to set up model profiles. " + "Tool errors use {error:{code,message}}. plan_file_info returns {ready:false,reason:...} while the artifact is not yet ready; check readiness by testing whether download_url is present in the response. " + "plan_file_info download_url is the absolute URL where the requested artifact can be downloaded. 
" + "To list recent plans for a user call plan_list; returns plan_id, state, progress_percentage, created_at, and prompt_excerpt for each plan. " + "plan_status state contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. " + "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the plan was likely queued but not picked up by a worker (server issue). " + "If plan_status is in processing and output files do not change for longer than 20 minutes, the plan_create likely failed/stalled. " + "In both cases, report the issue to PlanExe developers on GitHub: https://github.com/PlanExeOrg/PlanExe/issues . " + "Main output: a self-contained interactive HTML report (~700KB) with collapsible sections and interactive Gantt charts — open in a browser. " + "The zip contains the intermediary pipeline files (md, json, csv) that fed the report." +) + +mcp_cloud_server = Server("planexe-mcp-cloud", instructions=PLANEXE_SERVER_INSTRUCTIONS) + +# Base directory for run artifacts (not used directly, fetched via worker_plan HTTP API) +BASE_DIR_RUN = Path(os.environ.get("PLANEXE_RUN_DIR", Path(__file__).parent.parent / "run")).resolve() + +WORKER_PLAN_URL = os.environ.get("PLANEXE_WORKER_PLAN_URL", "http://worker_plan:8000") + +REPORT_FILENAME = "030-report.html" +REPORT_CONTENT_TYPE = "text/html; charset=utf-8" +ZIP_FILENAME = "run.zip" +ZIP_CONTENT_TYPE = "application/zip" +ZIP_SNAPSHOT_MAX_BYTES = 100_000_000 + +ModelProfileInput = Literal[ + "baseline", + "premium", + "frontier", + "custom", +] +MODEL_PROFILE_TITLES = { + ModelProfileEnum.BASELINE.value: "Baseline", + ModelProfileEnum.PREMIUM.value: "Premium", + ModelProfileEnum.FRONTIER.value: "Frontier", + ModelProfileEnum.CUSTOM.value: "Custom", +} +MODEL_PROFILE_SUMMARIES = { + ModelProfileEnum.BASELINE.value: "Cheap and fast; recommended default when creating a plan.", + ModelProfileEnum.PREMIUM.value: "Higher-cost profile tuned for stronger output 
quality.", + ModelProfileEnum.FRONTIER.value: "Most capable models first; usually slowest/most expensive.", + ModelProfileEnum.CUSTOM.value: "User-managed profile file for custom model ordering.", +} + +class PlanCreateRequest(BaseModel): + prompt: str + model_profile: Optional[ModelProfileInput] = None + user_api_key: Optional[str] = None + +class PlanStatusRequest(BaseModel): + plan_id: str + +class PlanStopRequest(BaseModel): + plan_id: str + +class PlanRetryRequest(BaseModel): + plan_id: str + model_profile: ModelProfileInput = "baseline" + +class PlanFileInfoRequest(BaseModel): + plan_id: str + artifact: Optional[str] = None + +class PlanListRequest(BaseModel): + user_api_key: Optional[str] = None + limit: int = 10 + +class ModelProfilesRequest(BaseModel): + """No input parameters.""" + pass diff --git a/mcp_cloud/download_tokens.py b/mcp_cloud/download_tokens.py new file mode 100644 index 000000000..9dccc6f95 --- /dev/null +++ b/mcp_cloud/download_tokens.py @@ -0,0 +1,152 @@ +"""PlanExe MCP Cloud – signed download tokens and URL builders.""" +import contextvars +import hashlib +import hmac +import logging +import os +import secrets +import time +from typing import Optional + +from mcp_cloud.db_setup import REPORT_FILENAME, ZIP_FILENAME + +logger = logging.getLogger(__name__) + + +# Context var set by HTTP server so download URLs use the request's host when +# PLANEXE_MCP_PUBLIC_BASE_URL is not set (avoids localhost for remote clients). +_download_base_url_ctx: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar( + "download_base_url", default=None +) + + +def set_download_base_url(base_url: Optional[str]) -> None: + """Set the base URL used for download links for this request (e.g. from HTTP Request). + Cleared automatically when the request ends. 
Used when PLANEXE_MCP_PUBLIC_BASE_URL is unset.""" + if base_url is not None: + _download_base_url_ctx.set(base_url.rstrip("/")) + else: + try: + _download_base_url_ctx.set("") + except LookupError: + pass + + +def clear_download_base_url() -> None: + """Clear the request-scoped base URL (call when request ends).""" + try: + _download_base_url_ctx.set("") + except LookupError: + pass + + +def _get_download_base_url() -> Optional[str]: + """Return base URL for download links: env var, then request context, then None.""" + base_url = os.environ.get("PLANEXE_MCP_PUBLIC_BASE_URL") + if base_url: + return base_url.rstrip("/") + try: + ctx_url = _download_base_url_ctx.get() + return ctx_url if ctx_url else None + except LookupError: + return None + + +def build_report_download_path(task_id: str) -> str: + return f"/download/{task_id}/{REPORT_FILENAME}" + + +def build_zip_download_path(task_id: str) -> str: + return f"/download/{task_id}/{ZIP_FILENAME}" + + +# --------------------------------------------------------------------------- +# Signed, expiring download tokens +# --------------------------------------------------------------------------- + +# Default TTL for signed download tokens (seconds). Configurable via env var. +DOWNLOAD_TOKEN_TTL_SECONDS = int(os.environ.get("PLANEXE_DOWNLOAD_TOKEN_TTL", "900")) # 15 min + +# Per-process fallback secret when no env var is set. Tokens won't survive a +# server restart, but that is acceptable for the fallback case. +_random_token_secret: Optional[bytes] = None + + +def validate_download_token_secret() -> None: + """Raise if no stable download-token secret is configured. + + Call at startup when authentication is required so the server + fails hard instead of silently using a random per-process secret + that invalidates tokens on restart. 
+ """ + for env_var in ("PLANEXE_DOWNLOAD_TOKEN_SECRET", "PLANEXE_API_KEY_SECRET"): + if os.environ.get(env_var): + return + raise RuntimeError( + "Neither PLANEXE_DOWNLOAD_TOKEN_SECRET nor PLANEXE_API_KEY_SECRET is set. " + "Set at least one or disable auth with PLANEXE_MCP_REQUIRE_AUTH=false." + ) + + +def _get_download_token_secret() -> bytes: + """Return the HMAC-SHA256 secret used to sign download tokens. + + Priority: PLANEXE_DOWNLOAD_TOKEN_SECRET → PLANEXE_API_KEY_SECRET → + per-process random (with a warning logged once). + """ + global _random_token_secret + for env_var in ("PLANEXE_DOWNLOAD_TOKEN_SECRET", "PLANEXE_API_KEY_SECRET"): + value = os.environ.get(env_var) + if value: + return value.encode() + if _random_token_secret is None: + _random_token_secret = secrets.token_bytes(32) + logger.warning( + "PLANEXE_DOWNLOAD_TOKEN_SECRET is not set; using a random per-process secret. " + "Download tokens will be invalidated on server restart. " + "Set PLANEXE_DOWNLOAD_TOKEN_SECRET to a stable value." + ) + return _random_token_secret + + +def generate_download_token(task_id: str, filename: str) -> str: + """Return a signed, time-limited token for one task artifact download. + + Format: ``{expiry_unix_ts}.{hmac_hex}`` + The HMAC covers ``task_id:filename:expiry`` so the token is scoped to + exactly one file and cannot be reused for a different task. 
+ """ + expiry = int(time.time()) + DOWNLOAD_TOKEN_TTL_SECONDS + message = f"{task_id}:{filename}:{expiry}".encode() + mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest() + return f"{expiry}.{mac}" + + +def validate_download_token(token: str, task_id: str, filename: str) -> bool: + """Return True when *token* is a valid, unexpired token for the given artifact.""" + try: + expiry_str, mac = token.split(".", 1) + expiry = int(expiry_str) + except (ValueError, AttributeError): + return False + if time.time() > expiry: + return False + message = f"{task_id}:{filename}:{expiry}".encode() + expected_mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest() + return hmac.compare_digest(mac, expected_mac) + + +def build_report_download_url(task_id: str) -> Optional[str]: + base_url = _get_download_base_url() + if not base_url: + return None + token = generate_download_token(task_id, REPORT_FILENAME) + return f"{base_url}{build_report_download_path(task_id)}?token={token}" + + +def build_zip_download_url(task_id: str) -> Optional[str]: + base_url = _get_download_base_url() + if not base_url: + return None + token = generate_download_token(task_id, ZIP_FILENAME) + return f"{base_url}{build_zip_download_path(task_id)}?token={token}" diff --git a/mcp_cloud/handlers.py b/mcp_cloud/handlers.py new file mode 100644 index 000000000..870c007c1 --- /dev/null +++ b/mcp_cloud/handlers.py @@ -0,0 +1,554 @@ +"""PlanExe MCP Cloud – MCP tool handlers and dispatch.""" +import asyncio +import json +import logging +import os +import time +from datetime import UTC, datetime +from typing import Any + +from mcp.types import CallToolResult, Tool, TextContent, ToolAnnotations + +from mcp_cloud.db_setup import ( + PlanState, + REPORT_CONTENT_TYPE, + REPORT_FILENAME, + ZIP_CONTENT_TYPE, + ModelProfileInput, + PlanCreateRequest, + PlanStatusRequest, + PlanStopRequest, + PlanRetryRequest, + PlanFileInfoRequest, + PlanListRequest, + 
ModelProfilesRequest, + mcp_cloud_server, +) +from mcp_cloud.auth import _resolve_user_from_api_key +from mcp_cloud.db_queries import ( + _create_plan_sync, + _get_plan_status_snapshot_sync, + _request_plan_stop_sync, + _retry_failed_plan_sync, + _get_plan_for_report_sync, + _list_plans_sync, + get_plan_state_mapping, + _extract_plan_create_metadata_overrides, + _merge_plan_create_config, +) +from mcp_cloud.zip_utils import ( + list_files_from_zip_snapshot, + compute_sha256, +) +from mcp_cloud.worker_fetchers import ( + fetch_artifact_from_worker_plan, + fetch_file_list_from_worker_plan, + list_files_from_local_run_dir, + fetch_user_downloadable_zip, +) +from mcp_cloud.model_profiles import _get_model_profiles_sync +from mcp_cloud.download_tokens import build_report_download_url, build_zip_download_url +from mcp_cloud.prompt_examples import _load_mcp_example_prompts +from mcp_cloud.schemas import TOOL_DEFINITIONS + +logger = logging.getLogger(__name__) + + +@mcp_cloud_server.list_tools() +async def handle_list_tools() -> list[Tool]: + """List all available MCP tools.""" + return [ + Tool( + name=definition.name, + description=definition.description, + outputSchema=definition.output_schema, + inputSchema=definition.input_schema, + annotations=ToolAnnotations(**definition.annotations) if definition.annotations else None, + ) + for definition in TOOL_DEFINITIONS + ] + +@mcp_cloud_server.call_tool() +async def handle_call_tool(name: str, arguments: dict[str, Any]) -> CallToolResult: + """Dispatch MCP tool calls and return structured JSON errors for unknown tools.""" + start = time.monotonic() + try: + handler = TOOL_HANDLERS.get(name) + if handler is None: + logger.warning("tool_call tool=%s result=unknown_tool", name) + response = {"error": {"code": "INVALID_TOOL", "message": f"Unknown tool: {name}"}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + result = await 
handler(arguments) + elapsed_ms = (time.monotonic() - start) * 1000 + if result.isError: + logger.info("tool_call tool=%s result=error duration_ms=%.0f", name, elapsed_ms) + else: + logger.info("tool_call tool=%s result=ok duration_ms=%.0f", name, elapsed_ms) + return result + except Exception as e: + elapsed_ms = (time.monotonic() - start) * 1000 + logger.error("tool_call tool=%s result=exception duration_ms=%.0f error=%s", name, elapsed_ms, e, exc_info=True) + response = {"error": {"code": "INTERNAL_ERROR", "message": str(e)}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + +async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult: + """Create a new PlanExe plan and enqueue it for processing. + + Examples: + - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. Open within 12 months."} → returns plan_id (UUID) + created_at + + Args: + - prompt: What the plan should cover (goal, context, constraints). + - model_profile: Optional profile ("baseline" | "premium" | "frontier" | "custom"). Call model_profiles to inspect options. + + Returns: + - content: JSON string matching structuredContent. + - structuredContent: {"plan_id": "", "created_at": ...} + - isError: False on success.
+ """ + req = PlanCreateRequest(**arguments) + metadata_overrides = _extract_plan_create_metadata_overrides(arguments) + metadata_model_profile = metadata_overrides.get("model_profile") + model_profile = req.model_profile + if model_profile is None and isinstance(metadata_model_profile, str): + model_profile = metadata_model_profile + + merged_config = _merge_plan_create_config(None, model_profile) + require_user_key = os.environ.get("PLANEXE_MCP_REQUIRE_USER_KEY", "false").lower() in ("1", "true", "yes", "on") + user_context = None + if req.user_api_key: + user_context = _resolve_user_from_api_key(req.user_api_key.strip()) + if not user_context: + response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + elif require_user_key: + response = {"error": {"code": "USER_API_KEY_REQUIRED", "message": "user_api_key is required for plan_create."}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + + if user_context and float(user_context.get("credits_balance", 0.0)) <= 0.0: + response = {"error": {"code": "INSUFFICIENT_CREDITS", "message": "Not enough credits."}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + + response = await asyncio.to_thread( + _create_plan_sync, + req.prompt, + merged_config, + {"user_id": str(user_context["user_id"])} if user_context else None, + ) + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + +async def handle_prompt_examples(arguments: dict[str, Any]) -> CallToolResult: + """Return curated prompts from the catalog (mcp_example true) so LLMs can see example detail.""" + samples = 
_load_mcp_example_prompts() + payload = { + "samples": samples, + "message": ( + "Next: complete the non-tool step by drafting a detailed prompt (typically ~300-800 words) using these as a baseline (similar structure), then get user approval. " + "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. " + "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. " + "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. " + "The examples demonstrate this prose style — match their tone and density. " + "Only after approval, call plan_create. " + "Do not use PlanExe for tiny one-shot requests (e.g., rewrite this email, summarize this document). " + "PlanExe always runs the full fixed planning pipeline; callers cannot run only selected internal steps." + ), + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(payload))], + structuredContent=payload, + isError=False, + ) + + +async def handle_model_profiles(arguments: dict[str, Any]) -> CallToolResult: + """Return model profile options and currently available models in each profile.""" + _ = ModelProfilesRequest(**(arguments or {})) + payload = await asyncio.to_thread(_get_model_profiles_sync) + profiles = payload.get("profiles") + if not isinstance(profiles, list) or len(profiles) == 0: + response = { + "error": { + "code": "MODEL_PROFILES_UNAVAILABLE", + "message": ( + "No models are currently configured. " + "Inform the user that the server administrator needs to set up model profiles before plans can be created." 
+ ), + } + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(payload))], + structuredContent=payload, + isError=False, + ) + + +async def handle_plan_status(arguments: dict[str, Any]) -> CallToolResult: + """Fetch the current plan status, progress, and recent files for a plan. + + Examples: + - {"plan_id": "uuid"} → state/progress/timing + recent files + + Args: + - plan_id: Plan UUID returned by plan_create. + + Returns: + - content: JSON string matching structuredContent. + - structuredContent: status payload or error. + - isError: True only when plan_id is unknown. + """ + req = PlanStatusRequest(**arguments) + task_id = req.plan_id + + plan_snapshot = await asyncio.to_thread(_get_plan_status_snapshot_sync, task_id) + if plan_snapshot is None: + response = { + "error": { + "code": "PLAN_NOT_FOUND", + "message": f"Plan not found: {task_id}", + } + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + + progress_percentage = float(plan_snapshot.get("progress_percentage") or 0.0) + + plan_state = plan_snapshot["state"] + state = get_plan_state_mapping(plan_state) + if plan_state == PlanState.completed: + progress_percentage = 100.0 + + # Collect files from worker_plan + plan_uuid = plan_snapshot["id"] + files = [] + if plan_uuid: + files_list = await fetch_file_list_from_worker_plan(plan_uuid) + if not files_list: + files_list = await asyncio.to_thread(list_files_from_zip_snapshot, plan_uuid) + if not files_list: + files_list = await asyncio.to_thread(list_files_from_local_run_dir, plan_uuid) + if files_list: + for file_name in files_list[:10]: # Limit to 10 most recent + if file_name != "log.txt": + updated_at = datetime.now(UTC).replace(microsecond=0) + files.append({ + "path": file_name, + "updated_at": 
updated_at.isoformat().replace("+00:00", "Z"), # Approximate + }) + + created_at = plan_snapshot["timestamp_created"] + if created_at and created_at.tzinfo is None: + created_at = created_at.replace(tzinfo=UTC) + + response = { + "plan_id": plan_uuid, + "state": state, + "progress_percentage": progress_percentage, + "timing": { + "started_at": ( + created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z") + if created_at + else None + ), + "elapsed_sec": (datetime.now(UTC) - created_at).total_seconds() if created_at else 0, + }, + "files": files[:10], # Limit to 10 most recent + } + + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + +async def handle_plan_stop(arguments: dict[str, Any]) -> CallToolResult: + """Request an active plan to stop. + + Examples: + - {"plan_id": "uuid"} → stop request accepted + + Args: + - plan_id: Plan UUID returned by plan_create. + + Returns: + - content: JSON string matching structuredContent. + - structuredContent: {"state": "pending|processing|completed|failed", "stop_requested": bool} or error payload. + - isError: True only when plan_id is unknown. 
+ """ + req = PlanStopRequest(**arguments) + task_id = req.plan_id + + stop_result = await asyncio.to_thread(_request_plan_stop_sync, task_id) + if stop_result is None: + response = { + "error": { + "code": "PLAN_NOT_FOUND", + "message": f"Plan not found: {task_id}", + } + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + + response = stop_result + + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + +async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult: + """Retry a failed plan by resetting it back to pending.""" + req = PlanRetryRequest(**arguments) + task_id = req.plan_id + retry_result = await asyncio.to_thread(_retry_failed_plan_sync, task_id, req.model_profile) + + if retry_result is None: + response = { + "error": { + "code": "PLAN_NOT_FOUND", + "message": f"Plan not found: {task_id}", + } + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + + if isinstance(retry_result.get("error"), dict): + response = retry_result + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + + response = retry_result + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + +async def handle_plan_file_info(arguments: dict[str, Any]) -> CallToolResult: + """Return download metadata for a plan's report or zip artifact. + + Examples: + - {"plan_id": "uuid"} → report metadata (default) + - {"plan_id": "uuid", "artifact": "zip"} → zip metadata + + Args: + - plan_id: Plan UUID returned by plan_create. + - artifact: Optional "report" or "zip". + + Returns: + - content: JSON string matching structuredContent. 
+ - structuredContent: metadata (content_type, sha256, download_size, + optional download_url) or {} if not ready, or error payload. + - isError: True only when plan_id is unknown. + """ + req = PlanFileInfoRequest(**arguments) + task_id = req.plan_id + artifact = req.artifact.strip().lower() if isinstance(req.artifact, str) else "report" + if artifact not in ("report", "zip"): + response = { + "error": { + "code": "INVALID_ARGUMENT", + "message": f"Invalid artifact type: {req.artifact!r}. Must be 'report' or 'zip'.", + } + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + plan_snapshot = await asyncio.to_thread(_get_plan_for_report_sync, task_id) + if plan_snapshot is None: + response = { + "error": { + "code": "PLAN_NOT_FOUND", + "message": f"Plan not found: {task_id}", + } + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + + run_id = plan_snapshot["id"] + if artifact == "zip": + content_bytes = await fetch_user_downloadable_zip(run_id) + if content_bytes is None: + plan_state = plan_snapshot["state"] + if plan_state in (PlanState.pending, PlanState.processing) or plan_state is None: + response = {"ready": False, "reason": "processing"} + else: + response = { + "error": { + "code": "content_unavailable", + "message": "zip content_bytes is None", + }, + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + total_size = len(content_bytes) + content_hash = compute_sha256(content_bytes) + response = { + "content_type": ZIP_CONTENT_TYPE, + "sha256": content_hash, + "download_size": total_size, + } + download_url = build_zip_download_url(run_id) + if download_url: + response["download_url"] = download_url + + return CallToolResult( + content=[TextContent(type="text", 
text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + plan_state = plan_snapshot["state"] + if plan_state in (PlanState.pending, PlanState.processing) or plan_state is None: + response = {"ready": False, "reason": "processing"} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + if plan_state == PlanState.failed: + message = plan_snapshot["progress_message"] or "Plan generation failed." + response = {"ready": False, "reason": "failed", "error": {"code": "generation_failed", "message": message}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + content_bytes = await fetch_artifact_from_worker_plan(run_id, REPORT_FILENAME) + if content_bytes is None: + response = { + "error": { + "code": "content_unavailable", + "message": "content_bytes is None", + }, + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + total_size = len(content_bytes) + content_hash = compute_sha256(content_bytes) + response = { + "content_type": REPORT_CONTENT_TYPE, + "sha256": content_hash, + "download_size": total_size, + } + download_url = build_report_download_url(run_id) + if download_url: + response["download_url"] = download_url + + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + +async def handle_plan_list(arguments: dict[str, Any]) -> CallToolResult: + """Return recent plans for an authenticated user.""" + try: + req = PlanListRequest(**arguments) + except Exception as exc: + response = {"error": {"code": "INVALID_ARGUMENTS", "message": str(exc)}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + 
require_user_key = os.environ.get("PLANEXE_MCP_REQUIRE_USER_KEY", "false").lower() in ("1", "true", "yes", "on") + user_context = None + if req.user_api_key: + user_context = _resolve_user_from_api_key(req.user_api_key.strip()) + if not user_context: + response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + elif require_user_key: + response = {"error": {"code": "USER_API_KEY_REQUIRED", "message": "user_api_key is required for plan_list."}} + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=True, + ) + user_id = str(user_context["user_id"]) if user_context else None + limit = max(1, min(req.limit, 50)) + plans = await asyncio.to_thread(_list_plans_sync, user_id, limit) + response = { + "plans": plans, + "message": f"Returned {len(plans)} plan(s).", + } + return CallToolResult( + content=[TextContent(type="text", text=json.dumps(response))], + structuredContent=response, + isError=False, + ) + + +TOOL_HANDLERS = { + "plan_create": handle_plan_create, + "plan_status": handle_plan_status, + "plan_stop": handle_plan_stop, + "plan_retry": handle_plan_retry, + "plan_file_info": handle_plan_file_info, + "plan_list": handle_plan_list, + "prompt_examples": handle_prompt_examples, + "model_profiles": handle_model_profiles, +} diff --git a/mcp_cloud/http_server.py b/mcp_cloud/http_server.py index c1ea7f7d1..55c9ab37d 100644 --- a/mcp_cloud/http_server.py +++ b/mcp_cloud/http_server.py @@ -28,6 +28,7 @@ ModelProfilesOutput, PlanCreateOutput, PlanFileInfoOutput, + PlanListOutput, PlanRetryOutput, PlanStatusOutput, PlanStopOutput, @@ -65,11 +66,13 @@ handle_plan_stop, handle_plan_file_info, handle_prompt_examples, - resolve_task_for_task_id, + resolve_plan_for_task_id, set_download_base_url, validate_download_token, 
_resolve_user_from_api_key, ) +from mcp_cloud.auth import validate_api_key_secret +from mcp_cloud.download_tokens import validate_download_token_secret REQUIRED_API_KEY = os.environ.get("PLANEXE_MCP_API_KEY") @@ -78,6 +81,8 @@ MAX_BODY_BYTES = int(os.environ.get("PLANEXE_MCP_MAX_BODY_BYTES", "1048576")) RATE_LIMIT_REQUESTS = int(os.environ.get("PLANEXE_MCP_RATE_LIMIT", "60")) RATE_LIMIT_WINDOW_SECONDS = float(os.environ.get("PLANEXE_MCP_RATE_WINDOW_SECONDS", "60")) +DOWNLOAD_RATE_LIMIT_REQUESTS = int(os.environ.get("PLANEXE_MCP_DOWNLOAD_RATE_LIMIT", "10")) +DOWNLOAD_RATE_LIMIT_WINDOW_SECONDS = float(os.environ.get("PLANEXE_MCP_DOWNLOAD_RATE_WINDOW_SECONDS", "60")) GLAMA_MAINTAINER_EMAIL = os.environ.get( "PLANEXE_MCP_GLAMA_MAINTAINER_EMAIL", "neoneye@gmail.com", @@ -99,6 +104,10 @@ def _parse_bool_env(name: str, default: bool) -> bool: AUTH_REQUIRED = _parse_bool_env("PLANEXE_MCP_REQUIRE_AUTH", default=True) +if AUTH_REQUIRED: + validate_api_key_secret() + validate_download_token_secret() + def _split_csv_env(value: Optional[str]) -> list[str]: if not value: @@ -136,10 +145,18 @@ def _split_csv_env(value: Optional[str]) -> list[str]: CORS_ORIGINS = _split_csv_env(os.environ.get("PLANEXE_MCP_CORS_ORIGINS")) if not CORS_ORIGINS: - # Use wildcard so that browser-based tools (e.g. MCP Inspector at - # localhost:6274) can connect directly. API-key auth is the primary - # access control; CORS is defence-in-depth only. - CORS_ORIGINS = ["*"] + if AUTH_REQUIRED: + # Production default: only allow known PlanExe origins. + # Override via PLANEXE_MCP_CORS_ORIGINS if additional origins are needed. + CORS_ORIGINS = [ + "https://mcp.planexe.org", + "https://home.planexe.org", + ] + else: + # Dev mode: allow any origin so browser-based tools (e.g. MCP Inspector + # at localhost:6274) can connect without extra configuration. 
+ CORS_ORIGINS = ["*"] + logger.info("CORS wildcard enabled (PLANEXE_MCP_REQUIRE_AUTH=false)") PUBLIC_JSONRPC_METHODS_NO_AUTH = { "initialize", @@ -317,6 +334,7 @@ async def _log_auth_rejection(request: Request, reason: str) -> None: _rate_lock = asyncio.Lock() _rate_buckets: dict[str, deque[float]] = defaultdict(deque) +_download_rate_buckets: dict[str, deque[float]] = defaultdict(deque) _authenticated_user_api_key_ctx: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar( "authenticated_user_api_key", default=None ) @@ -448,6 +466,27 @@ async def _enforce_rate_limit(request: Request) -> Optional[JSONResponse]: return None +async def _enforce_download_rate_limit(request: Request) -> Optional[JSONResponse]: + if DOWNLOAD_RATE_LIMIT_REQUESTS <= 0: + return None + if not request.url.path.startswith("/download"): + return None + + identifier = _client_identifier(request) + now = monotonic() + async with _rate_lock: + bucket = _download_rate_buckets[identifier] + while bucket and now - bucket[0] > DOWNLOAD_RATE_LIMIT_WINDOW_SECONDS: + bucket.popleft() + if len(bucket) >= DOWNLOAD_RATE_LIMIT_REQUESTS: + return JSONResponse( + status_code=429, + content={"detail": "Download rate limit exceeded"}, + ) + bucket.append(now) + return None + + async def _sweep_rate_buckets(stop_event: asyncio.Event) -> None: while not stop_event.is_set(): try: @@ -462,18 +501,30 @@ async def _sweep_rate_buckets(stop_event: asyncio.Event) -> None: bucket.popleft() if not bucket: del _rate_buckets[key] + for key in list(_download_rate_buckets): + bucket = _download_rate_buckets[key] + while bucket and now - bucket[0] > DOWNLOAD_RATE_LIMIT_WINDOW_SECONDS: + bucket.popleft() + if not bucket: + del _download_rate_buckets[key] async def _enforce_body_size(request: Request) -> Optional[JSONResponse]: - if request.method != "POST" or request.url.path != "/mcp/tools/call": + if request.method != "POST": + return None + if request.url.path not in ("/mcp/tools/call", "/mcp/"): return None 
content_length = request.headers.get("content-length") if not content_length: - return JSONResponse( - status_code=411, - content={"detail": "Length Required"}, - ) + # Streamable HTTP (/mcp/) may use chunked encoding without Content-Length. + # Only require it on the REST endpoint. + if request.url.path == "/mcp/tools/call": + return JSONResponse( + status_code=411, + content={"detail": "Length Required"}, + ) + return None try: if int(content_length) > MAX_BODY_BYTES: @@ -597,35 +648,35 @@ async def plan_create( async def plan_status( - task_id: str = Field(..., description="Task UUID returned by plan_create."), + plan_id: str = Field(..., description="Plan UUID returned by plan_create."), ) -> Annotated[CallToolResult, PlanStatusOutput]: - return await handle_plan_status({"task_id": task_id}) + return await handle_plan_status({"plan_id": plan_id}) async def plan_stop( - task_id: str = Field(..., description="Task UUID returned by plan_create. Use it to stop the plan creation."), + plan_id: str = Field(..., description="Plan UUID returned by plan_create. Use it to stop the plan creation."), ) -> Annotated[CallToolResult, PlanStopOutput]: - return await handle_plan_stop({"task_id": task_id}) + return await handle_plan_stop({"plan_id": plan_id}) async def plan_retry( - task_id: str = Field(..., description="UUID of the failed task to retry."), + plan_id: str = Field(..., description="UUID of the failed plan to retry."), model_profile: Annotated[ ModelProfileInput, Field(description="Model profile used for retry. Defaults to baseline."), ] = "baseline", ) -> Annotated[CallToolResult, PlanRetryOutput]: - return await handle_plan_retry({"task_id": task_id, "model_profile": model_profile}) + return await handle_plan_retry({"plan_id": plan_id, "model_profile": model_profile}) async def plan_file_info( - task_id: str = Field(..., description="Task UUID returned by plan_create. 
Use it to download the created plan."), + plan_id: str = Field(..., description="Plan UUID returned by plan_create. Use it to download the created plan."), artifact: Annotated[ ResultArtifactInput, Field(description="Download artifact type: report or zip."), ] = "report", ) -> Annotated[CallToolResult, PlanFileInfoOutput]: - return await handle_plan_file_info({"task_id": task_id, "artifact": artifact}) + return await handle_plan_file_info({"plan_id": plan_id, "artifact": artifact}) async def prompt_examples() -> CallToolResult: @@ -638,6 +689,17 @@ async def model_profiles() -> Annotated[CallToolResult, ModelProfilesOutput]: return await handle_model_profiles({}) +async def plan_list( + limit: int = Field(default=10, ge=1, le=50, description="Maximum number of plans to return (1–50). Newest plans are returned first."), +) -> Annotated[CallToolResult, PlanListOutput]: + """List the most recent plans for an authenticated user.""" + authenticated_user_api_key = _get_authenticated_user_api_key() + arguments: dict[str, Any] = {"limit": limit} + if authenticated_user_api_key: + arguments["user_api_key"] = authenticated_user_api_key + return await handle_plan_list(arguments) + + def _register_tools(server: FastMCP) -> None: handler_map = { "plan_create": plan_create, @@ -645,6 +707,7 @@ def _register_tools(server: FastMCP) -> None: "plan_stop": plan_stop, "plan_retry": plan_retry, "plan_file_info": plan_file_info, + "plan_list": plan_list, "prompt_examples": prompt_examples, "model_profiles": model_profiles, } @@ -769,6 +832,10 @@ async def enforce_api_key( if error_response: return _append_cors_headers(request, error_response) + error_response = await _enforce_download_rate_limit(request) + if error_response: + return _append_cors_headers(request, error_response) + if request.url.path.startswith("/mcp"): set_download_base_url(_request_origin(request)) try: @@ -845,10 +912,12 @@ async def call_tool( This endpoint wraps the stdio-based MCP tool handlers for HTTP access. 
""" arguments = dict(payload.arguments or {}) - if payload.tool == "plan_create": + if payload.tool in ("plan_create", "plan_list"): authenticated_user_api_key = _get_authenticated_user_api_key() if authenticated_user_api_key and not arguments.get("user_api_key"): arguments["user_api_key"] = authenticated_user_api_key + + if payload.tool == "plan_create": if isinstance(payload.metadata, dict): arguments["metadata"] = dict(payload.metadata) @@ -905,17 +974,17 @@ async def download_report( # Defence-in-depth: if a token was supplied, it must be valid for this artifact. if token is not None and not validate_download_token(token, task_id, filename): raise HTTPException(status_code=401, detail="Invalid or expired download token") - task = await asyncio.to_thread(resolve_task_for_task_id, task_id) - if task is None: + plan = await asyncio.to_thread(resolve_plan_for_task_id, task_id) + if plan is None: raise HTTPException(status_code=404, detail="Task not found") if filename == ZIP_FILENAME: - content_bytes = await fetch_user_downloadable_zip(str(task.id)) + content_bytes = await fetch_user_downloadable_zip(str(plan.id)) if content_bytes is None: raise HTTPException(status_code=404, detail="Report not found") headers = {"Content-Disposition": f'attachment; filename="{task_id}.zip"'} return Response(content=content_bytes, media_type=ZIP_CONTENT_TYPE, headers=headers) - content_bytes = await fetch_artifact_from_worker_plan(str(task.id), REPORT_FILENAME) + content_bytes = await fetch_artifact_from_worker_plan(str(plan.id), REPORT_FILENAME) if content_bytes is None: raise HTTPException(status_code=404, detail="Report not found") headers = {"Content-Disposition": f'inline; filename="{REPORT_FILENAME}"'} @@ -945,7 +1014,7 @@ def root() -> dict[str, Any]: "call": "/mcp/tools/call", "health": "/healthcheck", "glama_connector": "/.well-known/glama.json", - "download": f"/download/{{task_id}}/{REPORT_FILENAME}", + "download": f"/download/{{plan_id}}/{REPORT_FILENAME}", "llms_txt": 
"/llms.txt", }, "documentation": "See /docs for OpenAPI documentation", diff --git a/mcp_cloud/model_profiles.py b/mcp_cloud/model_profiles.py new file mode 100644 index 000000000..83d8ea1df --- /dev/null +++ b/mcp_cloud/model_profiles.py @@ -0,0 +1,146 @@ +"""PlanExe MCP Cloud – model profile introspection.""" +import json +import logging +import os +from typing import Any, Optional + +from worker_plan_api.model_profile import ( + ModelProfileEnum, + default_filename_for_profile, + resolve_model_profile_from_env, +) +from worker_plan_api.planexe_config import PlanExeConfig +from worker_plan_api.llm_class_filter import ( + ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES, + is_llm_class_allowed, + parse_llm_class_whitelist, +) + +from mcp_cloud.db_setup import MODEL_PROFILE_TITLES, MODEL_PROFILE_SUMMARIES + +logger = logging.getLogger(__name__) + + +def _sort_llm_config_entries(items: list[tuple[str, Any]]) -> list[tuple[str, Any]]: + def sort_key(item: tuple[str, Any]) -> tuple[int, str]: + key, model_data = item + priority = None + if isinstance(model_data, dict): + maybe_priority = model_data.get("priority") + if isinstance(maybe_priority, int): + priority = maybe_priority + if priority is None: + priority = 999999 + return priority, key + + return sorted(items, key=sort_key) + + +def _extract_model_profile_entries( + model_map: dict[str, Any], + whitelist: Optional[set[str]], +) -> list[dict[str, Any]]: + models: list[dict[str, Any]] = [] + + for model_key, model_data in _sort_llm_config_entries(list(model_map.items())): + class_name = model_data.get("class") if isinstance(model_data, dict) else None + if not is_llm_class_allowed(class_name, whitelist): + continue + + model_name = None + priority = None + if isinstance(model_data, dict): + arguments = model_data.get("arguments") + if isinstance(arguments, dict): + maybe_model = arguments.get("model") + if isinstance(maybe_model, str): + model_name = maybe_model + maybe_priority = model_data.get("priority") + if 
isinstance(maybe_priority, int): + priority = maybe_priority + elif isinstance(model_data.get("prio"), int): + priority = model_data["prio"] + + models.append( + { + "key": model_key, + "provider_class": class_name if isinstance(class_name, str) else None, + "model": model_name, + "priority": priority, + } + ) + + return models + + +def _profile_models_payload( + profile: ModelProfileEnum, + whitelist: Optional[set[str]], +) -> dict[str, Any]: + config_filename = default_filename_for_profile(profile) + planexe_config_path = PlanExeConfig.resolve_planexe_config_path() + config_path = PlanExeConfig.find_file_in_search_order(config_filename, planexe_config_path) + if config_path is None: + return { + "profile": profile.value, + "title": MODEL_PROFILE_TITLES[profile.value], + "summary": MODEL_PROFILE_SUMMARIES[profile.value], + "model_count": 0, + "models": [], + } + + try: + with config_path.open("r", encoding="utf-8") as fh: + model_map = json.load(fh) + except Exception as exc: + logger.warning( + "Unable to read profile config %s for model profile %s: %s", + config_filename, + profile.value, + exc, + ) + return { + "profile": profile.value, + "title": MODEL_PROFILE_TITLES[profile.value], + "summary": MODEL_PROFILE_SUMMARIES[profile.value], + "model_count": 0, + "models": [], + } + + if not isinstance(model_map, dict): + return { + "profile": profile.value, + "title": MODEL_PROFILE_TITLES[profile.value], + "summary": MODEL_PROFILE_SUMMARIES[profile.value], + "model_count": 0, + "models": [], + } + + models = _extract_model_profile_entries(model_map, whitelist) + return { + "profile": profile.value, + "title": MODEL_PROFILE_TITLES[profile.value], + "summary": MODEL_PROFILE_SUMMARIES[profile.value], + "model_count": len(models), + "models": models, + } + + +def _get_model_profiles_sync() -> dict[str, Any]: + raw_whitelist = os.environ.get(ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES) + whitelist = parse_llm_class_whitelist(raw_whitelist) + default_profile = 
resolve_model_profile_from_env().value + profiles_all = [ + _profile_models_payload(profile, whitelist) + for profile in ModelProfileEnum + ] + profiles = [profile for profile in profiles_all if int(profile.get("model_count") or 0) > 0] + + return { + "default_profile": default_profile, + "profiles": profiles, + "message": ( + "Use one of these profile values in plan_create.model_profile. " + "Model lists show what is currently available in each profile." + ), + } diff --git a/mcp_cloud/prompt_examples.py b/mcp_cloud/prompt_examples.py new file mode 100644 index 000000000..e55bf5ae1 --- /dev/null +++ b/mcp_cloud/prompt_examples.py @@ -0,0 +1,69 @@ +"""PlanExe MCP Cloud – example prompt loading.""" +import logging +from pathlib import Path + +logger = logging.getLogger(__name__) + + +def _load_mcp_example_prompts() -> list[str]: + """Load prompts from the catalog that are marked as MCP examples (mcp_example or mcp-example-prompt true). + + Uses worker_plan_api.PromptCatalog the same way as frontend_single_user and frontend_multi_user + (no env var). Tries repo-root import first, then adds worker_plan to sys.path so worker_plan_api + is top-level (same as frontends). Falls back to built-in examples if the catalog is unavailable. + """ + catalog = None + try: + from worker_plan.worker_plan_api.prompt_catalog import PromptCatalog + + catalog = PromptCatalog() + catalog.load_simple_plan_prompts() + except Exception: + try: + # Same as frontends when worker_plan exists; when not (e.g. 
Docker), repo_root has worker_plan_api + import sys + + repo_root = Path(__file__).resolve().parent.parent + worker_plan_dir = repo_root / "worker_plan" + path_to_add = str(worker_plan_dir if worker_plan_dir.exists() else repo_root) + if path_to_add not in sys.path: + sys.path.insert(0, path_to_add) + from worker_plan_api.prompt_catalog import PromptCatalog + + catalog = PromptCatalog() + catalog.load_simple_plan_prompts() + except Exception as e: + logger.warning( + "Prompt catalog unavailable (%s); using built-in examples.", + e, + ) + return _builtin_mcp_example_prompts() + + if catalog is None: + return _builtin_mcp_example_prompts() + + samples: list[str] = [] + for item in catalog.all(): + if item.extras.get("mcp_example") is True or item.extras.get("mcp-example-prompt") is True: + samples.append(item.prompt) + if not samples: + return _builtin_mcp_example_prompts() + return samples + + +def _builtin_mcp_example_prompts() -> list[str]: + """Fallback example prompts when the catalog file is missing or has no mcp_example entries.""" + return [ + ( + "Vegan Butcher Shop. That sells artificial meat (Plant-Based). Location Kødbyen, Copenhagen. " + "Sell sandwiches and sausages. Provocative marketing. Budget: 10 million DKK. Grand Opening in month 3. " + "Profitability Goal: month 12. Create a signature item that is a social media hit. " + "Pick a realistic scenario. I already have negotiated a 2 year lease inside Kødbyen. " + "Banned words: blockchain, VR, AR, AI, Robots." + ), + ( + "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. " + "Budget 2.5M DKK. Open within 12 months. Include equipment, staffing, permits, and marketing. " + "Pick a realistic scenario; avoid overly ambitious timelines." 
+ ), + ] diff --git a/mcp_cloud/schemas.py b/mcp_cloud/schemas.py new file mode 100644 index 000000000..2c2f98077 --- /dev/null +++ b/mcp_cloud/schemas.py @@ -0,0 +1,239 @@ +"""PlanExe MCP Cloud – tool schema constants and ToolDefinition.""" +from dataclasses import dataclass +from typing import Any, Optional + +from mcp_cloud.tool_models import ( + ModelProfilesInput, + ModelProfilesOutput, + PromptExamplesInput, + PromptExamplesOutput, + PlanCreateInput, + PlanCreateOutput, + PlanRetryInput, + PlanRetryOutput, + PlanStopOutput, + PlanStatusInput, + PlanStopInput, + PlanFileInfoInput, + PlanFileInfoNotReadyOutput, + PlanStatusSuccess, + PlanFileInfoReadyOutput, + PlanListInput, + PlanListOutput, + ErrorDetail, +) + +PLAN_CREATE_INPUT_SCHEMA = PlanCreateInput.model_json_schema() +PLAN_CREATE_OUTPUT_SCHEMA = PlanCreateOutput.model_json_schema() +PLAN_STATUS_SUCCESS_SCHEMA = PlanStatusSuccess.model_json_schema() +PLAN_STATUS_OUTPUT_SCHEMA = { + "oneOf": [ + { + "type": "object", + "properties": {"error": ErrorDetail.model_json_schema()}, + "required": ["error"], + }, + PLAN_STATUS_SUCCESS_SCHEMA, + ] +} +PLAN_STOP_OUTPUT_SCHEMA = PlanStopOutput.model_json_schema() +PLAN_RETRY_OUTPUT_SCHEMA = PlanRetryOutput.model_json_schema() +PLAN_FILE_INFO_READY_OUTPUT_SCHEMA = PlanFileInfoReadyOutput.model_json_schema() +PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA = PlanFileInfoNotReadyOutput.model_json_schema() +PLAN_FILE_INFO_OUTPUT_SCHEMA = { + "oneOf": [ + { + "type": "object", + "properties": {"error": ErrorDetail.model_json_schema()}, + "required": ["error"], + }, + PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA, + PLAN_FILE_INFO_READY_OUTPUT_SCHEMA, + ] +} +PLAN_STATUS_INPUT_SCHEMA = PlanStatusInput.model_json_schema() +PLAN_STOP_INPUT_SCHEMA = PlanStopInput.model_json_schema() +PLAN_RETRY_INPUT_SCHEMA = PlanRetryInput.model_json_schema() +PLAN_FILE_INFO_INPUT_SCHEMA = PlanFileInfoInput.model_json_schema() + +PROMPT_EXAMPLES_INPUT_SCHEMA = PromptExamplesInput.model_json_schema() 
+PROMPT_EXAMPLES_OUTPUT_SCHEMA = PromptExamplesOutput.model_json_schema() +MODEL_PROFILES_INPUT_SCHEMA = ModelProfilesInput.model_json_schema() +MODEL_PROFILES_OUTPUT_SCHEMA = ModelProfilesOutput.model_json_schema() +PLAN_LIST_INPUT_SCHEMA = PlanListInput.model_json_schema() +PLAN_LIST_OUTPUT_SCHEMA = PlanListOutput.model_json_schema() + + +@dataclass(frozen=True) +class ToolDefinition: + name: str + description: str + input_schema: dict[str, Any] + output_schema: Optional[dict[str, Any]] = None + annotations: Optional[dict[str, Any]] = None + +TOOL_DEFINITIONS = [ + ToolDefinition( + name="prompt_examples", + description=( + "Call this first. Returns example prompts that define what a good prompt looks like. " + "Do NOT call plan_create yet. Optional before plan_create: call model_profiles to choose model_profile. " + "Next is a non-tool step: formulate a detailed prompt (typically ~300-800 words; use examples as a baseline, similar structure) and get user approval. " + "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. " + "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. " + "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. " + "The examples demonstrate this prose style — match their tone and density. " + "Then call plan_create. " + "PlanExe is not for tiny one-shot outputs like a 5-point checklist; and it does not support selecting only some internal pipeline steps." + ), + input_schema=PROMPT_EXAMPLES_INPUT_SCHEMA, + output_schema=PROMPT_EXAMPLES_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": True, + "destructiveHint": False, + "idempotentHint": True, + "openWorldHint": False, + }, + ), + ToolDefinition( + name="model_profiles", + description=( + "Optional helper before plan_create. 
Returns model_profile options with plain-language guidance " + "and currently available models in each profile. " + "If no models are available, returns error code MODEL_PROFILES_UNAVAILABLE." + ), + input_schema=MODEL_PROFILES_INPUT_SCHEMA, + output_schema=MODEL_PROFILES_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": True, + "destructiveHint": False, + "idempotentHint": True, + "openWorldHint": False, + }, + ), + ToolDefinition( + name="plan_create", + description=( + "Call only after prompt_examples and after you have completed prompt drafting/approval (non-tool step). " + "PlanExe turns the approved prompt into a strategic project-plan draft (20+ sections) in ~10-20 min. " + "Sections include: executive summary, interactive Gantt charts, investor pitch, project plan with SMART criteria, " + "strategic decision analysis, scenario comparison, assumptions with expert review, governance structure, " + "SWOT analysis, team role profiles, simulated expert criticism, work breakdown structure, " + "plan review (critical issues, KPIs, financial strategy, automation opportunities), Q&A, " + "premortem with failure scenarios, self-audit checklist, and adversarial premise attacks that argue against the project. " + "The adversarial sections (premortem, self-audit, premise attacks) surface risks and questions the prompter may not have considered. " + "Returns plan_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_file_info. " + "If you lose a plan_id, call plan_list to recover it. " + "Each plan_create call creates a new plan_id (no server-side dedup). " + "If you are unsure which model_profile to choose, call model_profiles first. " + "If your deployment uses credits, include user_api_key to charge the correct account. " + "Common error codes: INVALID_USER_API_KEY, USER_API_KEY_REQUIRED, INSUFFICIENT_CREDITS." 
+ ), + input_schema=PLAN_CREATE_INPUT_SCHEMA, + output_schema=PLAN_CREATE_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": False, + "destructiveHint": False, + "idempotentHint": False, + "openWorldHint": True, + }, + ), + ToolDefinition( + name="plan_status", + description=( + "Returns status and progress of the plan currently being created. " + "Poll at reasonable intervals only (e.g. every 5 minutes): plan generation typically takes 10-20 minutes " + "(baseline profile) and may take longer on higher-quality profiles. " + "State contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. " + "progress_percentage is 0-100 (integer-like float); 100 when completed. " + "files lists intermediate outputs produced so far; use their updated_at timestamps to detect stalls. " + "Unknown plan_id returns error code PLAN_NOT_FOUND. " + "Troubleshooting: pending for >5 minutes likely means queued but not picked up by a worker. " + "processing with no file-output changes for >20 minutes likely means failed/stalled. " + "Report these issues to https://github.com/PlanExeOrg/PlanExe/issues ." + ), + input_schema=PLAN_STATUS_INPUT_SCHEMA, + output_schema=PLAN_STATUS_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": True, + "destructiveHint": False, + "idempotentHint": True, + "openWorldHint": False, + }, + ), + ToolDefinition( + name="plan_stop", + description=( + "Request the plan generation to stop. Pass the plan_id (the UUID returned by plan_create). " + "Stopping is asynchronous: the stop flag is set immediately but the plan may continue briefly before halting. " + "A stopped plan will eventually transition to the failed state. " + "If the plan is already completed or failed, stop_requested returns false (the plan already finished). " + "Unknown plan_id returns error code PLAN_NOT_FOUND." 
+ ), + input_schema=PLAN_STOP_INPUT_SCHEMA, + output_schema=PLAN_STOP_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": False, + "destructiveHint": True, + "idempotentHint": True, + "openWorldHint": False, + }, + ), + ToolDefinition( + name="plan_retry", + description=( + "Retry a plan that is currently in failed state. " + "Pass the failed plan_id and optionally model_profile (defaults to baseline). " + "The plan is reset to pending, prior artifacts are cleared, and the same plan_id is requeued for processing. " + "Returns PLAN_NOT_FOUND when plan_id is unknown and PLAN_NOT_FAILED when the plan is not in failed state." + ), + input_schema=PLAN_RETRY_INPUT_SCHEMA, + output_schema=PLAN_RETRY_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": False, + "destructiveHint": False, + "idempotentHint": False, + "openWorldHint": True, + }, + ), + ToolDefinition( + name="plan_file_info", + description=( + "Returns file metadata (content_type, download_url, download_size) for the report or zip artifact. " + "Use artifact='report' (default) for the interactive HTML report (~700KB, self-contained with embedded JS " + "for collapsible sections and interactive Gantt charts; open in a browser). " + "Use artifact='zip' for the full pipeline output bundle (md, json, csv intermediary files that fed the report). " + "While the plan is still pending or processing, returns {ready:false,reason:\"processing\"}. " + "Check readiness by testing whether download_url is present in the response. " + "Once ready, present download_url to the user or fetch and save the file locally. " + "If your client exposes plan_download (e.g. mcp_local), prefer that to save the file locally. " + "Terminal error codes: generation_failed (plan failed), content_unavailable (artifact missing). " + "Unknown plan_id returns error code PLAN_NOT_FOUND."
+ ), + input_schema=PLAN_FILE_INFO_INPUT_SCHEMA, + output_schema=PLAN_FILE_INFO_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": True, + "destructiveHint": False, + "idempotentHint": True, + "openWorldHint": False, + }, + ), + ToolDefinition( + name="plan_list", + description=( + "List the most recent plans for an authenticated user. " + "Returns up to `limit` plans (default 10, max 50) newest-first, each with plan_id, state, " + "progress_percentage, created_at (ISO 8601), and a prompt_excerpt (first 100 chars). " + "Use this to recover a lost plan_id or to review recent activity." + ), + input_schema=PLAN_LIST_INPUT_SCHEMA, + output_schema=PLAN_LIST_OUTPUT_SCHEMA, + annotations={ + "readOnlyHint": True, + "destructiveHint": False, + "idempotentHint": True, + "openWorldHint": False, + }, + ), +] diff --git a/mcp_cloud/tests/test_download_rate_limit.py b/mcp_cloud/tests/test_download_rate_limit.py new file mode 100644 index 000000000..03eb8417a --- /dev/null +++ b/mcp_cloud/tests/test_download_rate_limit.py @@ -0,0 +1,60 @@ +import asyncio +import unittest +from unittest.mock import MagicMock, patch + +import mcp_cloud.http_server as http_server + + +def _fake_request(path: str, client_host: str = "10.0.0.1") -> MagicMock: + request = MagicMock() + request.url.path = path + request.headers = {} + request.client.host = client_host + return request + + +class TestDownloadRateLimit(unittest.TestCase): + def setUp(self): + """Clear download rate buckets between tests.""" + http_server._download_rate_buckets.clear() + + def test_non_download_path_is_not_rate_limited(self): + request = _fake_request("/mcp/tools/call") + result = asyncio.run(http_server._enforce_download_rate_limit(request)) + self.assertIsNone(result) + + def test_download_path_is_rate_limited(self): + request = _fake_request("/download/abc-123/030-report.html") + for _ in range(http_server.DOWNLOAD_RATE_LIMIT_REQUESTS): + result = asyncio.run(http_server._enforce_download_rate_limit(request)) + 
self.assertIsNone(result) + # Next request should be rejected + result = asyncio.run(http_server._enforce_download_rate_limit(request)) + self.assertIsNotNone(result) + self.assertEqual(result.status_code, 429) + + def test_different_clients_have_separate_buckets(self): + req_a = _fake_request("/download/abc/030-report.html", client_host="10.0.0.1") + req_b = _fake_request("/download/abc/030-report.html", client_host="10.0.0.2") + for _ in range(http_server.DOWNLOAD_RATE_LIMIT_REQUESTS): + asyncio.run(http_server._enforce_download_rate_limit(req_a)) + # Client A is exhausted + result_a = asyncio.run(http_server._enforce_download_rate_limit(req_a)) + self.assertIsNotNone(result_a) + # Client B still has quota + result_b = asyncio.run(http_server._enforce_download_rate_limit(req_b)) + self.assertIsNone(result_b) + + def test_disabled_when_limit_is_zero(self): + request = _fake_request("/download/abc/030-report.html") + original = http_server.DOWNLOAD_RATE_LIMIT_REQUESTS + try: + http_server.DOWNLOAD_RATE_LIMIT_REQUESTS = 0 + result = asyncio.run(http_server._enforce_download_rate_limit(request)) + self.assertIsNone(result) + finally: + http_server.DOWNLOAD_RATE_LIMIT_REQUESTS = original + + +if __name__ == "__main__": + unittest.main() diff --git a/mcp_cloud/tests/test_download_token.py b/mcp_cloud/tests/test_download_token.py index c3b62471c..cd23983e8 100644 --- a/mcp_cloud/tests/test_download_token.py +++ b/mcp_cloud/tests/test_download_token.py @@ -3,13 +3,14 @@ from unittest.mock import patch import mcp_cloud.app as cloud_app +import mcp_cloud.download_tokens as _dt_mod class TestGenerateAndValidateDownloadToken(unittest.TestCase): def setUp(self): # Pin the secret so tests are deterministic regardless of env vars. 
self._secret_patch = patch.object( - cloud_app, + _dt_mod, "_get_download_token_secret", return_value=b"test-secret-for-unit-tests", ) @@ -69,27 +70,27 @@ def test_different_tasks_get_different_tokens(self): self.assertNotEqual(t1, t2) def test_report_url_contains_token(self): - with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"): + with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"): url = cloud_app.build_report_download_url("task-abc") self.assertIsNotNone(url) self.assertIn("?token=", url) self.assertIn("/download/task-abc/030-report.html", url) def test_zip_url_contains_token(self): - with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"): + with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"): url = cloud_app.build_zip_download_url("task-abc") self.assertIsNotNone(url) self.assertIn("?token=", url) self.assertIn("/download/task-abc/run.zip", url) def test_token_embedded_in_report_url_is_valid(self): - with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"): + with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"): url = cloud_app.build_report_download_url("task-abc") token = url.split("?token=")[1] self.assertTrue(cloud_app.validate_download_token(token, "task-abc", "030-report.html")) def test_token_embedded_in_zip_url_is_valid(self): - with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"): + with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"): url = cloud_app.build_zip_download_url("task-abc") token = url.split("?token=")[1] self.assertTrue(cloud_app.validate_download_token(token, "task-abc", "run.zip")) diff --git a/mcp_cloud/tests/test_model_profiles_tool.py b/mcp_cloud/tests/test_model_profiles_tool.py index c8650a352..4838c7b0e 100644 --- 
a/mcp_cloud/tests/test_model_profiles_tool.py +++ b/mcp_cloud/tests/test_model_profiles_tool.py @@ -33,7 +33,7 @@ def test_model_profiles_returns_structured_content(self): "message": "Use one of these profile values in plan_create.model_profile.", } - with patch("mcp_cloud.app._get_model_profiles_sync", return_value=payload): + with patch("mcp_cloud.handlers._get_model_profiles_sync", return_value=payload): result = asyncio.run(handle_model_profiles({})) self.assertFalse(result.isError) @@ -48,7 +48,7 @@ def test_model_profiles_returns_error_when_none_available(self): "message": "Use one of these profile values in plan_create.model_profile.", } - with patch("mcp_cloud.app._get_model_profiles_sync", return_value=payload): + with patch("mcp_cloud.handlers._get_model_profiles_sync", return_value=payload): result = asyncio.run(handle_model_profiles({})) self.assertTrue(result.isError) diff --git a/mcp_cloud/tests/test_task_create_tool.py b/mcp_cloud/tests/test_plan_create_tool.py similarity index 84% rename from mcp_cloud/tests/test_task_create_tool.py rename to mcp_cloud/tests/test_plan_create_tool.py index f4a278182..8f429e4f2 100644 --- a/mcp_cloud/tests/test_task_create_tool.py +++ b/mcp_cloud/tests/test_plan_create_tool.py @@ -29,18 +29,18 @@ def __init__(self, prompt: str, state, user_id: str, parameters): self.parameters = parameters self.timestamp_created = datetime.now(UTC) - with patch("mcp_cloud.app.app.app_context", return_value=nullcontext()), patch( - "mcp_cloud.app.db.session", fake_session + with patch("mcp_cloud.db_queries.app.app_context", return_value=nullcontext()), patch( + "mcp_cloud.db_queries.db.session", fake_session ), patch( - "mcp_cloud.app.PlanItem", StubPlanItem + "mcp_cloud.db_queries.PlanItem", StubPlanItem ): result = asyncio.run(handle_plan_create(arguments)) self.assertIsInstance(result, CallToolResult) self.assertIsInstance(result.structuredContent, dict) - self.assertIn("task_id", result.structuredContent) + self.assertIn("plan_id", 
result.structuredContent) self.assertIn("created_at", result.structuredContent) - self.assertIsInstance(uuid.UUID(result.structuredContent["task_id"]), uuid.UUID) + self.assertIsInstance(uuid.UUID(result.structuredContent["plan_id"]), uuid.UUID) if __name__ == "__main__": diff --git a/mcp_cloud/tests/test_task_file_info_tool.py b/mcp_cloud/tests/test_plan_file_info_tool.py similarity index 76% rename from mcp_cloud/tests/test_task_file_info_tool.py rename to mcp_cloud/tests/test_plan_file_info_tool.py index 7f7d2b835..3af33f2b6 100644 --- a/mcp_cloud/tests/test_task_file_info_tool.py +++ b/mcp_cloud/tests/test_plan_file_info_tool.py @@ -40,17 +40,17 @@ def test_zip_helpers(self): def test_report_read_defaults_to_metadata(self): task_id = str(uuid.uuid4()) content_bytes = b"a" * 10 - task_snapshot = { + plan_snapshot = { "id": "task-id", "state": PlanState.completed, "progress_message": None, } - with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot): + with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot): with patch( - "mcp_cloud.app.fetch_artifact_from_worker_plan", + "mcp_cloud.handlers.fetch_artifact_from_worker_plan", new=AsyncMock(return_value=content_bytes), ): - result = asyncio.run(handle_plan_file_info({"task_id": task_id})) + result = asyncio.run(handle_plan_file_info({"plan_id": task_id})) payload = result.structuredContent self.assertEqual(payload["download_size"], len(content_bytes)) @@ -62,17 +62,17 @@ def test_report_read_defaults_to_metadata(self): def test_report_read_zip(self): task_id = str(uuid.uuid4()) content_bytes = b"zipdata" - task_snapshot = { + plan_snapshot = { "id": "task-id", "state": PlanState.completed, "progress_message": None, } - with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot): + with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot): with patch( - "mcp_cloud.app.fetch_user_downloadable_zip", + 
"mcp_cloud.handlers.fetch_user_downloadable_zip", new=AsyncMock(return_value=content_bytes), ): - result = asyncio.run(handle_plan_file_info({"task_id": task_id, "artifact": "zip"})) + result = asyncio.run(handle_plan_file_info({"plan_id": task_id, "artifact": "zip"})) payload = result.structuredContent self.assertEqual(payload["download_size"], len(content_bytes)) @@ -81,17 +81,17 @@ def test_report_read_zip(self): def test_report_read_zip_for_failed_task(self): task_id = str(uuid.uuid4()) content_bytes = b"zipdata" - task_snapshot = { + plan_snapshot = { "id": "task-id", "state": PlanState.failed, "progress_message": "Stopped", } - with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot): + with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot): with patch( - "mcp_cloud.app.fetch_user_downloadable_zip", + "mcp_cloud.handlers.fetch_user_downloadable_zip", new=AsyncMock(return_value=content_bytes), ): - result = asyncio.run(handle_plan_file_info({"task_id": task_id, "artifact": "zip"})) + result = asyncio.run(handle_plan_file_info({"plan_id": task_id, "artifact": "zip"})) payload = result.structuredContent self.assertEqual(payload["download_size"], len(content_bytes)) @@ -99,26 +99,26 @@ def test_report_read_zip_for_failed_task(self): def test_plan_file_info_returns_empty_object_when_pending(self): task_id = str(uuid.uuid4()) - task_snapshot = { + plan_snapshot = { "id": "task-id", "state": PlanState.pending, "progress_message": None, } - with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot): - result = asyncio.run(handle_plan_file_info({"task_id": task_id})) + with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot): + result = asyncio.run(handle_plan_file_info({"plan_id": task_id})) self.assertFalse(result.isError) - self.assertEqual(result.structuredContent, {}) + self.assertEqual(result.structuredContent, {"ready": False, "reason": "processing"}) 
def test_plan_file_info_returns_generation_failed_payload(self): task_id = str(uuid.uuid4()) - task_snapshot = { + plan_snapshot = { "id": "task-id", "state": PlanState.failed, "progress_message": "Pipeline failed", } - with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot): - result = asyncio.run(handle_plan_file_info({"task_id": task_id, "artifact": "report"})) + with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot): + result = asyncio.run(handle_plan_file_info({"plan_id": task_id, "artifact": "report"})) self.assertFalse(result.isError) self.assertEqual(result.structuredContent["error"]["code"], "generation_failed") diff --git a/mcp_cloud/tests/test_plan_list_tool.py b/mcp_cloud/tests/test_plan_list_tool.py new file mode 100644 index 000000000..4d7b46b1c --- /dev/null +++ b/mcp_cloud/tests/test_plan_list_tool.py @@ -0,0 +1,99 @@ +import asyncio +import unittest +from unittest.mock import patch + +from mcp.types import CallToolResult +from mcp_cloud.app import handle_list_tools, handle_plan_list + + +class TestPlanListTool(unittest.TestCase): + def test_plan_list_tool_listed(self): + tools = asyncio.run(handle_list_tools()) + tool_names = {tool.name for tool in tools} + self.assertIn("plan_list", tool_names) + + def test_plan_list_returns_plans(self): + fake_plans = [ + { + "plan_id": "aaa-111", + "state": "completed", + "progress_percentage": 100.0, + "created_at": "2026-01-01T00:00:00Z", + "prompt_excerpt": "Build a rocket", + }, + { + "plan_id": "bbb-222", + "state": "processing", + "progress_percentage": 42.0, + "created_at": "2026-01-02T00:00:00Z", + "prompt_excerpt": "Open a bakery", + }, + ] + user_context = {"user_id": "user-1", "credits_balance": 10.0} + with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \ + patch("mcp_cloud.handlers._list_plans_sync", return_value=fake_plans): + result = asyncio.run(handle_plan_list({"user_api_key": "pex_test", "limit": 
10})) + + self.assertIsInstance(result, CallToolResult) + self.assertFalse(result.isError) + self.assertEqual(len(result.structuredContent["plans"]), 2) + self.assertIn("Returned 2 plan(s)", result.structuredContent["message"]) + + def test_plan_list_empty_result(self): + user_context = {"user_id": "user-1", "credits_balance": 10.0} + with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \ + patch("mcp_cloud.handlers._list_plans_sync", return_value=[]): + result = asyncio.run(handle_plan_list({"user_api_key": "pex_test"})) + + self.assertFalse(result.isError) + self.assertEqual(result.structuredContent["plans"], []) + self.assertIn("Returned 0 plan(s)", result.structuredContent["message"]) + + def test_plan_list_clamps_limit(self): + """Limit is clamped to [1, 50].""" + user_context = {"user_id": "user-1", "credits_balance": 10.0} + with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \ + patch("mcp_cloud.handlers._list_plans_sync", return_value=[]) as mock_list: + asyncio.run(handle_plan_list({"user_api_key": "pex_test", "limit": 999})) + _, call_args = mock_list.call_args[0][0], mock_list.call_args[0][1] + self.assertEqual(call_args, 50) + + asyncio.run(handle_plan_list({"user_api_key": "pex_test", "limit": -5})) + _, call_args = mock_list.call_args[0][0], mock_list.call_args[0][1] + self.assertEqual(call_args, 1) + + def test_plan_list_invalid_user_api_key(self): + with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=None): + result = asyncio.run(handle_plan_list({"user_api_key": "pex_bad"})) + + self.assertTrue(result.isError) + self.assertEqual(result.structuredContent["error"]["code"], "INVALID_USER_API_KEY") + + def test_plan_list_requires_key_when_env_set(self): + with patch.dict("os.environ", {"PLANEXE_MCP_REQUIRE_USER_KEY": "true"}): + result = asyncio.run(handle_plan_list({"limit": 5})) + + self.assertTrue(result.isError) + 
self.assertEqual(result.structuredContent["error"]["code"], "USER_API_KEY_REQUIRED") + + def test_plan_list_no_key_when_not_required(self): + """When key is not required and not provided, returns all plans (user_id=None).""" + with patch.dict("os.environ", {"PLANEXE_MCP_REQUIRE_USER_KEY": "false"}), \ + patch("mcp_cloud.handlers._list_plans_sync", return_value=[]) as mock_list: + result = asyncio.run(handle_plan_list({"limit": 5})) + + self.assertFalse(result.isError) + # user_id should be None + self.assertIsNone(mock_list.call_args[0][0]) + + def test_plan_list_uses_default_limit(self): + user_context = {"user_id": "user-1", "credits_balance": 10.0} + with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \ + patch("mcp_cloud.handlers._list_plans_sync", return_value=[]) as mock_list: + asyncio.run(handle_plan_list({"user_api_key": "pex_test"})) + _, call_args = mock_list.call_args[0][0], mock_list.call_args[0][1] + self.assertEqual(call_args, 10) + + +if __name__ == "__main__": + unittest.main() diff --git a/mcp_cloud/tests/test_task_retry_tool.py b/mcp_cloud/tests/test_plan_retry_tool.py similarity index 61% rename from mcp_cloud/tests/test_task_retry_tool.py rename to mcp_cloud/tests/test_plan_retry_tool.py index 1c0ef4c9d..3b1c1bf7a 100644 --- a/mcp_cloud/tests/test_task_retry_tool.py +++ b/mcp_cloud/tests/test_plan_retry_tool.py @@ -16,36 +16,36 @@ def test_plan_retry_tool_listed(self): def test_plan_retry_returns_structured_content(self): task_id = str(uuid.uuid4()) payload = { - "task_id": task_id, + "plan_id": task_id, "state": "pending", "model_profile": "baseline", "retried_at": "2026-01-01T00:00:00Z", } - with patch("mcp_cloud.app._retry_failed_task_sync", return_value=payload): - result = asyncio.run(handle_plan_retry({"task_id": task_id})) + with patch("mcp_cloud.handlers._retry_failed_plan_sync", return_value=payload): + result = asyncio.run(handle_plan_retry({"plan_id": task_id})) self.assertIsInstance(result,
CallToolResult) self.assertFalse(result.isError) - self.assertEqual(result.structuredContent["task_id"], task_id) + self.assertEqual(result.structuredContent["plan_id"], task_id) self.assertEqual(result.structuredContent["state"], "pending") self.assertEqual(result.structuredContent["model_profile"], "baseline") - def test_plan_retry_returns_task_not_found(self): + def test_plan_retry_returns_plan_not_found(self): task_id = str(uuid.uuid4()) - with patch("mcp_cloud.app._retry_failed_task_sync", return_value=None): - result = asyncio.run(handle_plan_retry({"task_id": task_id})) + with patch("mcp_cloud.handlers._retry_failed_plan_sync", return_value=None): + result = asyncio.run(handle_plan_retry({"plan_id": task_id})) self.assertTrue(result.isError) - self.assertEqual(result.structuredContent["error"]["code"], "TASK_NOT_FOUND") + self.assertEqual(result.structuredContent["error"]["code"], "PLAN_NOT_FOUND") - def test_plan_retry_returns_task_not_failed(self): + def test_plan_retry_returns_plan_not_failed(self): task_id = str(uuid.uuid4()) - payload = {"error": {"code": "TASK_NOT_FAILED", "message": "Task is not failed."}} - with patch("mcp_cloud.app._retry_failed_task_sync", return_value=payload): - result = asyncio.run(handle_plan_retry({"task_id": task_id})) + payload = {"error": {"code": "PLAN_NOT_FAILED", "message": "Plan is not failed."}} + with patch("mcp_cloud.handlers._retry_failed_plan_sync", return_value=payload): + result = asyncio.run(handle_plan_retry({"plan_id": task_id})) self.assertTrue(result.isError) - self.assertEqual(result.structuredContent["error"]["code"], "TASK_NOT_FAILED") + self.assertEqual(result.structuredContent["error"]["code"], "PLAN_NOT_FAILED") if __name__ == "__main__": diff --git a/mcp_cloud/tests/test_task_status_tool.py b/mcp_cloud/tests/test_plan_status_tool.py similarity index 66% rename from mcp_cloud/tests/test_task_status_tool.py rename to mcp_cloud/tests/test_plan_status_tool.py index bff6aa3f5..6d44f0f27 100644 --- 
a/mcp_cloud/tests/test_task_status_tool.py +++ b/mcp_cloud/tests/test_plan_status_tool.py @@ -12,7 +12,7 @@ class TestPlanStatusTool(unittest.TestCase): def test_plan_status_returns_structured_content(self): task_id = str(uuid.uuid4()) - task_snapshot = { + plan_snapshot = { "id": task_id, "state": PlanState.completed, "stop_requested": False, @@ -20,16 +20,16 @@ def test_plan_status_returns_structured_content(self): "timestamp_created": datetime.now(UTC), } with patch( - "mcp_cloud.app._get_task_status_snapshot_sync", - return_value=task_snapshot, + "mcp_cloud.handlers._get_plan_status_snapshot_sync", + return_value=plan_snapshot, ), patch( - "mcp_cloud.app.fetch_file_list_from_worker_plan", new=AsyncMock(return_value=[]) + "mcp_cloud.handlers.fetch_file_list_from_worker_plan", new=AsyncMock(return_value=[]) ): - result = asyncio.run(handle_plan_status({"task_id": task_id})) + result = asyncio.run(handle_plan_status({"plan_id": task_id})) self.assertIsInstance(result, CallToolResult) self.assertIsInstance(result.structuredContent, dict) - self.assertEqual(result.structuredContent["task_id"], task_id) + self.assertEqual(result.structuredContent["plan_id"], task_id) self.assertIn("state", result.structuredContent) self.assertIn("progress_percentage", result.structuredContent) self.assertIsInstance(result.structuredContent["progress_percentage"], float) @@ -37,7 +37,7 @@ def test_plan_status_returns_structured_content(self): def test_plan_status_falls_back_to_zip_snapshot_files_when_primary_source_empty(self): task_id = str(uuid.uuid4()) - task_snapshot = { + plan_snapshot = { "id": task_id, "state": PlanState.processing, "stop_requested": False, @@ -45,19 +45,19 @@ def test_plan_status_falls_back_to_zip_snapshot_files_when_primary_source_empty( "timestamp_created": datetime.now(UTC), } with patch( - "mcp_cloud.app._get_task_status_snapshot_sync", - return_value=task_snapshot, + "mcp_cloud.handlers._get_plan_status_snapshot_sync", + return_value=plan_snapshot, ), 
patch( - "mcp_cloud.app.fetch_file_list_from_worker_plan", + "mcp_cloud.handlers.fetch_file_list_from_worker_plan", new=AsyncMock(return_value=[]), ), patch( - "mcp_cloud.app.list_files_from_zip_snapshot", + "mcp_cloud.handlers.list_files_from_zip_snapshot", return_value=["001-2-plan.txt", "log.txt"], ), patch( - "mcp_cloud.app.list_files_from_local_run_dir", + "mcp_cloud.handlers.list_files_from_local_run_dir", return_value=None, ): - result = asyncio.run(handle_plan_status({"task_id": task_id})) + result = asyncio.run(handle_plan_status({"plan_id": task_id})) files = result.structuredContent["files"] self.assertEqual(len(files), 1) @@ -65,7 +65,7 @@ def test_plan_status_falls_back_to_zip_snapshot_files_when_primary_source_empty( def test_plan_status_uses_processing_state_name(self): task_id = str(uuid.uuid4()) - task_snapshot = { + plan_snapshot = { "id": task_id, "state": PlanState.processing, "stop_requested": True, @@ -73,23 +73,23 @@ def test_plan_status_uses_processing_state_name(self): "timestamp_created": datetime.now(UTC), } with patch( - "mcp_cloud.app._get_task_status_snapshot_sync", - return_value=task_snapshot, + "mcp_cloud.handlers._get_plan_status_snapshot_sync", + return_value=plan_snapshot, ), patch( - "mcp_cloud.app.fetch_file_list_from_worker_plan", + "mcp_cloud.handlers.fetch_file_list_from_worker_plan", new=AsyncMock(return_value=[]), ): - result = asyncio.run(handle_plan_status({"task_id": task_id})) + result = asyncio.run(handle_plan_status({"plan_id": task_id})) self.assertEqual(result.structuredContent["state"], "processing") - def test_plan_status_returns_task_not_found_error(self): + def test_plan_status_returns_plan_not_found_error(self): task_id = str(uuid.uuid4()) - with patch("mcp_cloud.app._get_task_status_snapshot_sync", return_value=None): - result = asyncio.run(handle_plan_status({"task_id": task_id})) + with patch("mcp_cloud.handlers._get_plan_status_snapshot_sync", return_value=None): + result = 
asyncio.run(handle_plan_status({"plan_id": task_id})) self.assertTrue(result.isError) - self.assertEqual(result.structuredContent["error"]["code"], "TASK_NOT_FOUND") + self.assertEqual(result.structuredContent["error"]["code"], "PLAN_NOT_FOUND") if __name__ == "__main__": diff --git a/mcp_cloud/tests/test_secret_validation.py b/mcp_cloud/tests/test_secret_validation.py new file mode 100644 index 000000000..3c1222842 --- /dev/null +++ b/mcp_cloud/tests/test_secret_validation.py @@ -0,0 +1,38 @@ +"""Tests for startup secret validation (4.1 fail-hard on missing secrets).""" +import unittest +from unittest.mock import patch + +from mcp_cloud.auth import validate_api_key_secret +from mcp_cloud.download_tokens import validate_download_token_secret + + +class TestValidateApiKeySecret(unittest.TestCase): + def test_raises_when_not_set(self): + with patch.dict("os.environ", {}, clear=True): + with self.assertRaises(RuntimeError) as ctx: + validate_api_key_secret() + self.assertIn("PLANEXE_API_KEY_SECRET", str(ctx.exception)) + + def test_passes_when_set(self): + with patch.dict("os.environ", {"PLANEXE_API_KEY_SECRET": "my-secret"}): + validate_api_key_secret() # should not raise + + +class TestValidateDownloadTokenSecret(unittest.TestCase): + def test_raises_when_neither_set(self): + with patch.dict("os.environ", {}, clear=True): + with self.assertRaises(RuntimeError) as ctx: + validate_download_token_secret() + self.assertIn("PLANEXE_DOWNLOAD_TOKEN_SECRET", str(ctx.exception)) + + def test_passes_with_download_token_secret(self): + with patch.dict("os.environ", {"PLANEXE_DOWNLOAD_TOKEN_SECRET": "tok-secret"}, clear=True): + validate_download_token_secret() + + def test_passes_with_api_key_secret(self): + with patch.dict("os.environ", {"PLANEXE_API_KEY_SECRET": "api-secret"}, clear=True): + validate_download_token_secret() + + +if __name__ == "__main__": + unittest.main() diff --git a/mcp_cloud/tests/test_tool_surface_consistency.py 
b/mcp_cloud/tests/test_tool_surface_consistency.py index 9adbf8660..245056dca 100644 --- a/mcp_cloud/tests/test_tool_surface_consistency.py +++ b/mcp_cloud/tests/test_tool_surface_consistency.py @@ -51,19 +51,19 @@ def test_local_plan_create_schema_has_user_api_key(self): class TestPlanListInputSchemaHasUserApiKey(unittest.TestCase): - """user_api_key must be required in the plan_list input schema.""" + """user_api_key must be in plan_list input schema but NOT required.""" - def test_cloud_plan_list_schema_requires_user_api_key(self): + def test_cloud_plan_list_schema_has_optional_user_api_key(self): props = cloud_app.PLAN_LIST_INPUT_SCHEMA.get("properties", {}) self.assertIn("user_api_key", props) required = cloud_app.PLAN_LIST_INPUT_SCHEMA.get("required", []) - self.assertIn("user_api_key", required) + self.assertNotIn("user_api_key", required) - def test_local_plan_list_schema_requires_user_api_key(self): + def test_local_plan_list_schema_has_optional_user_api_key(self): props = local_app.PLAN_LIST_INPUT_SCHEMA.get("properties", {}) self.assertIn("user_api_key", props) required = local_app.PLAN_LIST_INPUT_SCHEMA.get("required", []) - self.assertIn("user_api_key", required) + self.assertNotIn("user_api_key", required) class TestPlanRetryInputSchemaDefaults(unittest.TestCase): diff --git a/mcp_cloud/tool_models.py b/mcp_cloud/tool_models.py index 4925f0360..13355a0df 100644 --- a/mcp_cloud/tool_models.py +++ b/mcp_cloud/tool_models.py @@ -72,23 +72,23 @@ class ModelProfilesOutput(BaseModel): class PlanStatusInput(BaseModel): - task_id: str = Field( + plan_id: str = Field( ..., - description="Task UUID returned by plan_create. Use it to reference the plan being created.", + description="Plan UUID returned by plan_create. Use it to reference the plan being created.", ) class PlanStopInput(BaseModel): - task_id: str = Field( + plan_id: str = Field( ..., - description="The UUID returned by plan_create. 
Call plan_stop with this task_id to request the plan generation to stop.", + description="The UUID returned by plan_create. Call plan_stop with this plan_id to request the plan generation to stop.", ) class PlanRetryInput(BaseModel): - task_id: str = Field( + plan_id: str = Field( ..., - description="UUID of the failed task to retry.", + description="UUID of the failed plan to retry.", ) model_profile: Literal["baseline", "premium", "frontier", "custom"] = Field( default="baseline", @@ -99,9 +99,9 @@ class PlanRetryInput(BaseModel): class PlanFileInfoInput(BaseModel): - task_id: str = Field( + plan_id: str = Field( ..., - description="Task UUID returned by plan_create. Use it to download the created plan.", + description="Plan UUID returned by plan_create. Use it to download the created plan.", ) artifact: str = Field( default="report", @@ -110,9 +110,9 @@ class PlanFileInfoInput(BaseModel): class PlanCreateOutput(BaseModel): - task_id: str = Field( + plan_id: str = Field( ..., - description="Task UUID returned by plan_create. Stable across plan_status/plan_stop/plan_file_info." + description="Plan UUID returned by plan_create. Stable across plan_status/plan_stop/plan_file_info." ) created_at: str @@ -128,9 +128,9 @@ class PlanStatusFile(BaseModel): class PlanStatusSuccess(BaseModel): - task_id: str = Field( + plan_id: str = Field( ..., - description="Task UUID returned by plan_create." + description="Plan UUID returned by plan_create." ) state: Literal["pending", "processing", "completed", "failed"] = Field( ..., @@ -149,15 +149,15 @@ class PlanStatusSuccess(BaseModel): description=( "Intermediate output files produced so far. " "Use updated_at timestamps to detect stalls. " - "These files are included in the zip artifact when the task completes." + "These files are included in the zip artifact when the plan completes." 
), ) class PlanStatusOutput(BaseModel): - task_id: str | None = Field( + plan_id: str | None = Field( default=None, - description="Task UUID returned by plan_create." + description="Plan UUID returned by plan_create." ) state: Literal["pending", "processing", "completed", "failed"] | None = Field( default=None, @@ -176,7 +176,7 @@ class PlanStatusOutput(BaseModel): description=( "Intermediate output files produced so far. " "Use updated_at timestamps to detect stalls. " - "These files are included in the zip artifact when the task completes." + "These files are included in the zip artifact when the plan completes." ), ) error: ErrorDetail | None = None @@ -185,7 +185,7 @@ class PlanStatusOutput(BaseModel): class PlanStopOutput(BaseModel): state: Literal["pending", "processing", "completed", "failed"] | None = Field( default=None, - description="Current task state after stop request.", + description="Current plan state after stop request.", ) stop_requested: bool | None = Field( default=None, @@ -195,13 +195,13 @@ class PlanStopOutput(BaseModel): class PlanRetryOutput(BaseModel): - task_id: str | None = Field( + plan_id: str | None = Field( default=None, - description="Task UUID that was retried (same ID as the failed task).", + description="Plan UUID that was retried (same ID as the failed plan).", ) state: Literal["pending", "processing", "completed", "failed"] | None = Field( default=None, - description="Current task state after retry request.", + description="Current plan state after retry request.", ) model_profile: Literal["baseline", "premium", "frontier", "custom"] | None = Field( default=None, @@ -241,23 +241,23 @@ class PlanFileInfoOutput(BaseModel): class PlanListInput(BaseModel): - user_api_key: str = Field( - ..., - description="User API key (pex_...) 
to scope the task list to the authenticated user.", + user_api_key: str | None = Field( + default=None, + description="Optional user API key for credits and attribution.", ) limit: int = Field( default=10, ge=1, le=50, - description="Maximum number of tasks to return (1–50). Newest tasks are returned first.", + description="Maximum number of plans to return (1–50). Newest plans are returned first.", ) class PlanListItem(BaseModel): - task_id: str = Field(..., description="Task UUID.") + plan_id: str = Field(..., description="Plan UUID.") state: Literal["pending", "processing", "completed", "failed"] = Field( ..., - description="Current task state.", + description="Current plan state.", ) progress_percentage: float = Field(..., description="Progress from 0 to 100.") created_at: str = Field(..., description="UTC creation timestamp (ISO 8601).") @@ -265,8 +265,8 @@ class PlanListItem(BaseModel): class PlanListOutput(BaseModel): - tasks: list[PlanListItem] = Field(..., description="Tasks for the authenticated user, newest first.") - message: str = Field(..., description="Human-readable summary (e.g. how many tasks were returned).") + plans: list[PlanListItem] = Field(..., description="Plans for the authenticated user, newest first.") + message: str = Field(..., description="Human-readable summary (e.g. 
how many plans were returned).") class PlanCreateInput(BaseModel): @@ -296,25 +296,3 @@ class PlanCreateInput(BaseModel): description="Optional user API key for credits and attribution.", ) - -# --------------------------------------------------------------------------- -# Backward-compatible aliases for old Task* names (used internally in app.py) -# --------------------------------------------------------------------------- -TaskCreateInput = PlanCreateInput -TaskCreateOutput = PlanCreateOutput -TaskStatusInput = PlanStatusInput -TaskStatusOutput = PlanStatusOutput -TaskStatusTiming = PlanStatusTiming -TaskStatusFile = PlanStatusFile -TaskStatusSuccess = PlanStatusSuccess -TaskStopInput = PlanStopInput -TaskStopOutput = PlanStopOutput -TaskRetryInput = PlanRetryInput -TaskRetryOutput = PlanRetryOutput -TaskFileInfoInput = PlanFileInfoInput -TaskFileInfoOutput = PlanFileInfoOutput -TaskFileInfoNotReadyOutput = PlanFileInfoNotReadyOutput -TaskFileInfoReadyOutput = PlanFileInfoReadyOutput -TaskListInput = PlanListInput -TaskListItem = PlanListItem -TaskListOutput = PlanListOutput diff --git a/mcp_cloud/worker_fetchers.py b/mcp_cloud/worker_fetchers.py new file mode 100644 index 000000000..0819b61db --- /dev/null +++ b/mcp_cloud/worker_fetchers.py @@ -0,0 +1,216 @@ +"""PlanExe MCP Cloud – HTTP fetchers for worker_plan artifacts.""" +import asyncio +import logging +import tempfile +from io import BytesIO +from typing import Optional + +import httpx + +from mcp_cloud.db_setup import ( + BASE_DIR_RUN, + REPORT_FILENAME, + WORKER_PLAN_URL, + ZIP_SNAPSHOT_MAX_BYTES, +) +from mcp_cloud.db_queries import get_plan_by_id +from mcp_cloud.zip_utils import ( + _sanitize_legacy_zip_snapshot, + extract_file_from_zip_file, + fetch_file_from_zip_snapshot, + fetch_report_from_db, + fetch_zip_snapshot, + list_files_from_zip_snapshot, +) + +logger = logging.getLogger(__name__) + + +async def fetch_artifact_from_worker_plan(run_id: str, file_path: str) -> Optional[bytes]: + """Fetch an 
artifact file from worker_plan via HTTP.""" + try: + async with httpx.AsyncClient(timeout=60.0) as client: + # For report.html, use the dedicated report endpoint (most efficient) + if ( + file_path == "report.html" + or file_path.endswith("/report.html") + or file_path == REPORT_FILENAME + or file_path.endswith(f"/{REPORT_FILENAME}") + ): + report_response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/report") + if report_response.status_code == 200: + return report_response.content + logger.warning(f"Worker plan returned {report_response.status_code} for report: {run_id}") + report_from_db = await asyncio.to_thread(fetch_report_from_db, run_id) + if report_from_db is not None: + return report_from_db + report_from_zip = await asyncio.to_thread( + fetch_file_from_zip_snapshot, run_id, REPORT_FILENAME + ) + if report_from_zip is not None: + return report_from_zip + return None + + # For other files, fetch the zip and extract the file + # This is less efficient but works without a file serving endpoint + async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as zip_response: + if zip_response.status_code != 200: + logger.warning(f"Worker plan returned {zip_response.status_code} for zip: {run_id}") + else: + zip_too_large = False + content_length = zip_response.headers.get("content-length") + if content_length: + try: + if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES: + logger.warning( + "Zip snapshot too large (%s bytes) for run %s; skipping.", + content_length, + run_id, + ) + zip_too_large = True + except ValueError: + logger.warning( + "Invalid Content-Length for zip snapshot: %s", content_length + ) + if not zip_too_large: + with tempfile.TemporaryFile() as tmp_file: + size = 0 + async for chunk in zip_response.aiter_bytes(): + size += len(chunk) + if size > ZIP_SNAPSHOT_MAX_BYTES: + logger.warning( + "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.", + ZIP_SNAPSHOT_MAX_BYTES, + run_id, + ) + zip_too_large = True + break + 
tmp_file.write(chunk) + if not zip_too_large: + tmp_file.seek(0) + file_data = extract_file_from_zip_file(tmp_file, file_path) + if file_data is not None: + return file_data + + snapshot_file = await asyncio.to_thread(fetch_file_from_zip_snapshot, run_id, file_path) + if snapshot_file is not None: + return snapshot_file + return None + + except Exception as e: + logger.error(f"Error fetching artifact from worker_plan: {e}", exc_info=True) + return None + +async def fetch_file_list_from_worker_plan(run_id: str) -> Optional[list[str]]: + """Fetch the list of files from worker_plan via HTTP.""" + try: + async with httpx.AsyncClient(timeout=30.0) as client: + response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/files") + if response.status_code == 200: + data = response.json() + files = data.get("files", []) + if files: + return files + fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id) + if fallback_files: + return fallback_files + return files + logger.warning(f"Worker plan returned {response.status_code} for files list: {run_id}") + fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id) + if fallback_files is not None: + return fallback_files + return None + except Exception as e: + logger.error(f"Error fetching file list from worker_plan: {e}", exc_info=True) + return None + + +def list_files_from_local_run_dir(run_id: str) -> Optional[list[str]]: + """ + List files from local run directory when this service shares PLANEXE_RUN_DIR + with the worker (e.g., Docker compose). 
+ """ + run_dir = (BASE_DIR_RUN / run_id).resolve() + try: + if not run_dir.is_relative_to(BASE_DIR_RUN): + return None + except ValueError: + return None + if not run_dir.exists() or not run_dir.is_dir(): + return None + try: + return sorted([path.name for path in run_dir.iterdir() if path.is_file()]) + except Exception as exc: + logger.warning("Unable to list local run dir files for %s: %s", run_id, exc) + return None + +async def fetch_zip_from_worker_plan(run_id: str) -> Optional[bytes]: + """Fetch the zip snapshot from worker_plan via HTTP.""" + try: + async with httpx.AsyncClient(timeout=60.0) as client: + async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as response: + if response.status_code != 200: + logger.warning("Worker plan returned %s for zip: %s", response.status_code, run_id) + else: + zip_too_large = False + content_length = response.headers.get("content-length") + if content_length: + try: + if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES: + logger.warning( + "Zip snapshot too large (%s bytes) for run %s; skipping.", + content_length, + run_id, + ) + zip_too_large = True + except ValueError: + logger.warning( + "Invalid Content-Length for zip snapshot: %s", content_length + ) + if not zip_too_large: + buffer = BytesIO() + size = 0 + async for chunk in response.aiter_bytes(): + size += len(chunk) + if size > ZIP_SNAPSHOT_MAX_BYTES: + logger.warning( + "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.", + ZIP_SNAPSHOT_MAX_BYTES, + run_id, + ) + zip_too_large = True + break + buffer.write(chunk) + if not zip_too_large: + return buffer.getvalue() + + snapshot_bytes = await asyncio.to_thread(fetch_zip_snapshot, run_id) + if snapshot_bytes is not None: + return snapshot_bytes + return None + except Exception as e: + logger.error(f"Error fetching zip from worker_plan: {e}", exc_info=True) + return None + + +async def fetch_user_downloadable_zip(task_id: str) -> Optional[bytes]: + """ + Fetch a user-downloadable zip 
for a task. + New layout snapshots are served directly from PlanItem.run_zip_snapshot. + Legacy/task-dir fallbacks are sanitized to remove track_activity.jsonl. + """ + plan = await asyncio.to_thread(get_plan_by_id, task_id) + if plan is None: + return None + + snapshot_bytes = plan.run_zip_snapshot if plan.run_zip_snapshot is not None else None + layout_version = plan.run_artifact_layout_version or 0 + if snapshot_bytes is not None: + if layout_version >= 2: + return snapshot_bytes + return _sanitize_legacy_zip_snapshot(snapshot_bytes) + + worker_plan_zip = await fetch_zip_from_worker_plan(str(plan.id)) + if worker_plan_zip is None: + return None + return _sanitize_legacy_zip_snapshot(worker_plan_zip) diff --git a/mcp_cloud/zip_utils.py b/mcp_cloud/zip_utils.py new file mode 100644 index 000000000..0a10ef546 --- /dev/null +++ b/mcp_cloud/zip_utils.py @@ -0,0 +1,100 @@ +"""PlanExe MCP Cloud – zip extraction, sanitization, and hashing utilities.""" +import hashlib +import io +import logging +import zipfile +from io import BytesIO +from typing import Optional + +from mcp_cloud.db_queries import get_plan_by_id + +logger = logging.getLogger(__name__) + + +def list_files_from_zip_bytes(zip_bytes: bytes) -> list[str]: + """List file entries from an in-memory zip archive.""" + try: + with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file: + files = [name for name in zip_file.namelist() if not name.endswith("/")] + return sorted(files) + except Exception as exc: + logger.warning("Unable to list files from zip snapshot: %s", exc) + return [] + +def extract_file_from_zip_bytes(zip_bytes: bytes, file_path: str) -> Optional[bytes]: + """Extract a file from an in-memory zip archive.""" + try: + with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file: + file_path_normalized = file_path.lstrip('/') + try: + return zip_file.read(file_path_normalized) + except KeyError: + return None + except Exception as exc: + logger.warning("Unable to read %s from zip snapshot: %s", 
file_path, exc) + return None + +def extract_file_from_zip_file(file_handle: io.BufferedIOBase, file_path: str) -> Optional[bytes]: + """Extract a file from a seekable zip file handle.""" + try: + with zipfile.ZipFile(file_handle, 'r') as zip_file: + file_path_normalized = file_path.lstrip('/') + try: + return zip_file.read(file_path_normalized) + except KeyError: + return None + except Exception as exc: + logger.warning("Unable to read %s from zip stream: %s", file_path, exc) + return None + +def fetch_report_from_db(task_id: str) -> Optional[bytes]: + """Fetch the report HTML stored in the PlanItem.""" + plan = get_plan_by_id(task_id) + if plan and plan.generated_report_html is not None: + return plan.generated_report_html.encode("utf-8") + return None + +def fetch_zip_snapshot(task_id: str) -> Optional[bytes]: + """Fetch the zip snapshot stored in the PlanItem.""" + plan = get_plan_by_id(task_id) + if plan and plan.run_zip_snapshot is not None: + return plan.run_zip_snapshot + return None + +def fetch_file_from_zip_snapshot(task_id: str, file_path: str) -> Optional[bytes]: + """Fetch a file from the PlanItem zip snapshot.""" + plan = get_plan_by_id(task_id) + if plan and plan.run_zip_snapshot is not None: + return extract_file_from_zip_bytes(plan.run_zip_snapshot, file_path) + return None + +def list_files_from_zip_snapshot(task_id: str) -> Optional[list[str]]: + """List files from the PlanItem zip snapshot.""" + plan = get_plan_by_id(task_id) + if plan and plan.run_zip_snapshot is not None: + return list_files_from_zip_bytes(plan.run_zip_snapshot) + return None + +def _sanitize_legacy_zip_snapshot(zip_bytes: bytes) -> Optional[bytes]: + """Remove internal track_activity.jsonl files from legacy zip snapshots.""" + try: + with zipfile.ZipFile(BytesIO(zip_bytes), "r") as in_zip: + entries = [name for name in in_zip.namelist() if not name.endswith("/")] + if not any(name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl" for name in entries): + 
return zip_bytes + out_buffer = BytesIO() + with zipfile.ZipFile(out_buffer, "w", compression=zipfile.ZIP_DEFLATED) as out_zip: + for name in entries: + if name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl": + continue + out_zip.writestr(name, in_zip.read(name)) + return out_buffer.getvalue() + except Exception as exc: + logger.warning("Unable to sanitize legacy run zip snapshot: %s", exc) + return None + +def compute_sha256(content: str | bytes) -> str: + """Compute SHA256 hash of content.""" + if isinstance(content, str): + content = content.encode('utf-8') + return hashlib.sha256(content).hexdigest() diff --git a/mcp_local/README.md b/mcp_local/README.md index c082b983e..a21c5d06e 100644 --- a/mcp_local/README.md +++ b/mcp_local/README.md @@ -13,7 +13,7 @@ proxy forwards tool calls over HTTP and downloads artifacts from `/download/{tas `plan_create` - Initiate creation of a plan. `plan_status` - Get status and progress about the creation of a plan. `plan_stop` - Abort creation of a plan. -`plan_retry` - Retry a failed task using the same task id (optional model_profile, defaults to baseline). +`plan_retry` - Retry a failed plan using the same plan id (optional model_profile, defaults to baseline). `plan_download` - Download the plan, either html report or a zip with everything, and save it to disk. `plan_status` caller contract: @@ -22,15 +22,15 @@ proxy forwards tool calls over HTTP and downloads artifacts from `/download/{tas - `failed`: terminal error. Concurrency semantics: -- Each `plan_create` call creates a new `task_id`. -- `plan_retry` reuses the same failed `task_id`. -- Server does not enforce a global one-task-at-a-time cap per client. -- Local clients should track task ids explicitly when running tasks in parallel. +- Each `plan_create` call creates a new `plan_id`. +- `plan_retry` reuses the same failed `plan_id`. +- Server does not enforce a global one-plan-at-a-time cap per client. 
+- Local clients should track plan ids explicitly when running plans in parallel. Minimal error contract: - Tool errors use `{"error":{"code","message","details?"}}`. -- Common proxied cloud codes include: `TASK_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`. -- `plan_retry` may return `TASK_NOT_FAILED` if the task is not currently failed. +- Common proxied cloud codes include: `PLAN_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`. +- `plan_retry` may return `PLAN_NOT_FAILED` if the plan is not currently failed. - Local proxy specific codes: `REMOTE_ERROR`, `DOWNLOAD_FAILED`. - `plan_file_info` (called under the hood by plan_download) may return `{}` while output is not ready. @@ -44,14 +44,14 @@ file locally into `PLANEXE_PATH`. - If unset, downloads are saved to the current working directory. - If the path does not exist, it is created. - If the path points to a file (not a directory), download fails. -- Filenames are `-030-report.html` or `-run.zip` (with `-1`, `-2`, ... suffixes on collisions). +- Filenames are `-030-report.html` or `-run.zip` (with `-1`, `-2`, ... suffixes on collisions). - `plan_download` returns `saved_path` with the final file location. ## Run as task (MCP tasks protocol) Some MCP clients (e.g. the MCP Inspector) show a **"Run as task"** option for tools. That refers to the MCP **tasks** protocol: a separate mechanism where the client runs a tool in the background using RPC methods like `tasks/run`, `tasks/get`, `tasks/result`, and `tasks/cancel`, instead of a single blocking tool call. 
-**PlanExe does not use or advertise the MCP tasks protocol.** Our interface is **tool-based** only: the agent calls `prompt_examples` and `model_profiles` for setup, completes a non-tool prompt drafting/approval step, then `plan_create` → gets a `task_id` → polls `plan_status` → optionally calls `plan_retry` if failed → uses `plan_download`. That flow is defined in `docs/mcp/planexe_mcp_interface.md` and is the intended design. +**PlanExe does not use or advertise the MCP tasks protocol.** Our interface is **tool-based** only: the agent calls `prompt_examples` and `model_profiles` for setup, completes a non-tool prompt drafting/approval step, then `plan_create` → gets a `plan_id` → polls `plan_status` → optionally calls `plan_retry` if failed → uses `plan_download`. That flow is defined in `docs/mcp/planexe_mcp_interface.md` and is the intended design. You should **not** enable "Run as task" for PlanExe. The Python MCP SDK and clients like Cursor do not properly support the tasks protocol (method registration and initialization fail). Use the tools directly: create a task, poll status, then download when done. 
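The tool-based flow described in the README hunk above (`plan_create`, then polling `plan_status`, an optional `plan_retry`, and finally `plan_download`) can be sketched as a small client-side loop. This is a minimal illustration, not code from the repository: `run_plan` and the `call_tool` callable are hypothetical names, and `call_tool(name, arguments)` is assumed to return the tool's structured content as a dict. Only the tool names, the `plan_id` / `state` / `artifact` / `model_profile` fields, and the roughly five-minute polling interval come from the diff.

```python
import time
from typing import Any, Callable

# Hypothetical signature: call_tool(tool_name, arguments) -> structured content dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def run_plan(
    call_tool: ToolCaller,
    prompt: str,
    poll_seconds: float = 300.0,  # the tool descriptions suggest polling about every 5 minutes
    max_retries: int = 1,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Create a plan, poll until it reaches a terminal state, then download the report."""
    created = call_tool("plan_create", {"prompt": prompt})
    plan_id = created["plan_id"]
    retries = 0
    while True:
        status = call_tool("plan_status", {"plan_id": plan_id})
        if status.get("error"):
            # e.g. PLAN_NOT_FOUND, or REMOTE_ERROR from the local proxy
            raise RuntimeError(f"plan_status error: {status['error']}")
        state = status.get("state")
        if state == "completed":
            return call_tool("plan_download", {"plan_id": plan_id, "artifact": "report"})
        if state == "failed":
            if retries >= max_retries:
                raise RuntimeError(f"plan {plan_id} failed after {retries} retries")
            # plan_retry requeues the same plan_id and resets it to pending
            call_tool("plan_retry", {"plan_id": plan_id, "model_profile": "baseline"})
            retries += 1
        # pending / processing (or just retried): wait, then poll again
        sleep(poll_seconds)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; a real client would likely also cap total wall-clock time and inspect the `files` timestamps to detect stalls.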
diff --git a/mcp_local/planexe_mcp_local.py b/mcp_local/planexe_mcp_local.py index 81e306291..1033ccb53 100644 --- a/mcp_local/planexe_mcp_local.py +++ b/mcp_local/planexe_mcp_local.py @@ -37,32 +37,32 @@ ] -class TaskCreateRequest(BaseModel): +class PlanCreateRequest(BaseModel): prompt: str model_profile: Optional[ModelProfileInput] = None user_api_key: Optional[str] = None -class TaskStatusRequest(BaseModel): - task_id: str +class PlanStatusRequest(BaseModel): + plan_id: str -class TaskStopRequest(BaseModel): - task_id: str +class PlanStopRequest(BaseModel): + plan_id: str -class TaskRetryRequest(BaseModel): - task_id: str +class PlanRetryRequest(BaseModel): + plan_id: str model_profile: ModelProfileInput = "baseline" -class TaskDownloadRequest(BaseModel): - task_id: str +class PlanDownloadRequest(BaseModel): + plan_id: str artifact: str = "report" -class TaskListRequest(BaseModel): - user_api_key: str +class PlanListRequest(BaseModel): + user_api_key: Optional[str] = None limit: int = 10 @@ -365,31 +365,31 @@ class ToolDefinition: PLAN_STATUS_INPUT_SCHEMA = { "type": "object", "properties": { - "task_id": { + "plan_id": { "type": "string", - "description": "UUID of the task (returned by plan_create).", + "description": "UUID of the plan (returned by plan_create).", }, }, - "required": ["task_id"], + "required": ["plan_id"], } PLAN_STOP_INPUT_SCHEMA = { "type": "object", "properties": { - "task_id": { + "plan_id": { "type": "string", - "description": "UUID of the task to stop (returned by plan_create).", + "description": "UUID of the plan to stop (returned by plan_create).", }, }, - "required": ["task_id"], + "required": ["plan_id"], } PLAN_RETRY_INPUT_SCHEMA = { "type": "object", "properties": { - "task_id": { + "plan_id": { "type": "string", - "description": "UUID of the failed task to retry.", + "description": "UUID of the failed plan to retry.", }, "model_profile": { "type": "string", @@ -398,15 +398,15 @@ class ToolDefinition: "description": "Model profile 
used for retry. Defaults to baseline.", }, }, - "required": ["task_id"], + "required": ["plan_id"], } PLAN_DOWNLOAD_INPUT_SCHEMA = { "type": "object", "properties": { - "task_id": { + "plan_id": { "type": "string", - "description": "UUID of the task (returned by plan_create).", + "description": "UUID of the plan (returned by plan_create).", }, "artifact": { "type": "string", @@ -415,16 +415,9 @@ class ToolDefinition: "description": "What to download: 'report' = HTML report, 'zip' = full output bundle.", }, }, - "required": ["task_id"], + "required": ["plan_id"], } -# Backward-compatible aliases -TASK_CREATE_INPUT_SCHEMA = PLAN_CREATE_INPUT_SCHEMA -TASK_STATUS_INPUT_SCHEMA = PLAN_STATUS_INPUT_SCHEMA -TASK_STOP_INPUT_SCHEMA = PLAN_STOP_INPUT_SCHEMA -TASK_RETRY_INPUT_SCHEMA = PLAN_RETRY_INPUT_SCHEMA -TASK_DOWNLOAD_INPUT_SCHEMA = PLAN_DOWNLOAD_INPUT_SCHEMA - PROMPT_EXAMPLES_INPUT_SCHEMA = { "type": "object", "properties": {}, @@ -501,16 +494,16 @@ class ToolDefinition: PLAN_CREATE_OUTPUT_SCHEMA = { "type": "object", "properties": { - "task_id": {"type": "string"}, + "plan_id": {"type": "string"}, "created_at": {"type": "string"}, }, - "required": ["task_id", "created_at"], + "required": ["plan_id", "created_at"], } PLAN_STATUS_OUTPUT_SCHEMA = { "type": "object", "properties": { - "task_id": {"type": ["string", "null"]}, + "plan_id": {"type": ["string", "null"]}, "state": {"type": ["string", "null"]}, "progress_percentage": {"type": ["number", "null"]}, "timing": { @@ -545,7 +538,7 @@ class ToolDefinition: PLAN_RETRY_OUTPUT_SCHEMA = { "type": "object", "properties": { - "task_id": {"type": "string"}, + "plan_id": {"type": "string"}, "state": {"type": "string"}, "model_profile": { "type": "string", @@ -576,35 +569,36 @@ class ToolDefinition: "type": "object", "properties": { "user_api_key": { - "type": "string", - "description": "User API key (pex_...) 
to scope the task list to the authenticated user.", + "type": ["string", "null"], + "default": None, + "description": "Optional user API key for credits and attribution.", }, "limit": { "type": "integer", "default": 10, "minimum": 1, "maximum": 50, - "description": "Maximum number of tasks to return (1-50). Newest tasks are returned first.", + "description": "Maximum number of plans to return (1-50). Newest plans are returned first.", }, }, - "required": ["user_api_key"], + "required": [], } PLAN_LIST_OUTPUT_SCHEMA = { "type": "object", "properties": { - "tasks": { + "plans": { "type": "array", "items": { "type": "object", "properties": { - "task_id": {"type": "string"}, + "plan_id": {"type": "string"}, "state": {"type": "string"}, "progress_percentage": {"type": "number"}, "created_at": {"type": "string"}, "prompt_excerpt": {"type": "string"}, }, }, - "description": "Tasks for the authenticated user, newest first.", + "description": "Plans for the authenticated user, newest first.", }, "message": {"type": "string"}, "error": ERROR_SCHEMA, @@ -612,15 +606,6 @@ class ToolDefinition: "additionalProperties": False, } -# Backward-compatible aliases -TASK_CREATE_OUTPUT_SCHEMA = PLAN_CREATE_OUTPUT_SCHEMA -TASK_STATUS_OUTPUT_SCHEMA = PLAN_STATUS_OUTPUT_SCHEMA -TASK_STOP_OUTPUT_SCHEMA = PLAN_STOP_OUTPUT_SCHEMA -TASK_RETRY_OUTPUT_SCHEMA = PLAN_RETRY_OUTPUT_SCHEMA -TASK_DOWNLOAD_OUTPUT_SCHEMA = PLAN_DOWNLOAD_OUTPUT_SCHEMA -TASK_LIST_INPUT_SCHEMA = PLAN_LIST_INPUT_SCHEMA -TASK_LIST_OUTPUT_SCHEMA = PLAN_LIST_OUTPUT_SCHEMA - TOOL_DEFINITIONS = [ ToolDefinition( name="prompt_examples", @@ -671,9 +656,9 @@ class ToolDefinition: "plan review (critical issues, KPIs, financial strategy, automation opportunities), Q&A, " "premortem with failure scenarios, self-audit checklist, and adversarial premise attacks that argue against the project. " "The adversarial sections (premortem, self-audit, premise attacks) surface risks and questions the prompter may not have considered. 
" - "Returns task_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_download. " - "If you lose a task_id, call plan_list with your user_api_key to recover it. " - "Each plan_create call creates a new task_id (proxied to cloud; no server-side dedup). " + "Returns plan_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_download. " + "If you lose a plan_id, call plan_list to recover it. " + "Each plan_create call creates a new plan_id (proxied to cloud; no server-side dedup). " "If you are unsure which model_profile to choose, call model_profiles first. " "If your deployment uses credits, include user_api_key to charge the correct account. " "Common proxied error codes: INVALID_USER_API_KEY, USER_API_KEY_REQUIRED, INSUFFICIENT_CREDITS, REMOTE_ERROR." @@ -696,7 +681,7 @@ class ToolDefinition: "State contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. " "progress_percentage is 0-100 (integer-like float); 100 when completed. " "files lists intermediate outputs produced so far; use their updated_at timestamps to detect stalls. " - "Unknown task_id returns TASK_NOT_FOUND (or REMOTE_ERROR when transport fails). " + "Unknown plan_id returns PLAN_NOT_FOUND (or REMOTE_ERROR when transport fails). " "Troubleshooting: pending for >5 minutes likely means queued but not picked up by a worker. " "processing with no file-output changes for >20 minutes likely means failed/stalled. " "Report these issues to https://github.com/PlanExeOrg/PlanExe/issues ." @@ -713,11 +698,11 @@ class ToolDefinition: ToolDefinition( name="plan_stop", description=( - "Request the plan generation to stop. Pass the task_id (the UUID returned by plan_create). " - "Stopping is asynchronous: the stop flag is set immediately but the task may continue briefly before halting. " - "A stopped task will eventually transition to the failed state. 
" - "If the task is already completed or failed, stop_requested returns false (the task already finished). " - "Unknown task_id returns TASK_NOT_FOUND (or REMOTE_ERROR when transport fails)." + "Request the plan generation to stop. Pass the plan_id (the UUID returned by plan_create). " + "Stopping is asynchronous: the stop flag is set immediately but the plan may continue briefly before halting. " + "A stopped plan will eventually transition to the failed state. " + "If the plan is already completed or failed, stop_requested returns false (the plan already finished). " + "Unknown plan_id returns PLAN_NOT_FOUND (or REMOTE_ERROR when transport fails)." ), input_schema=PLAN_STOP_INPUT_SCHEMA, output_schema=PLAN_STOP_OUTPUT_SCHEMA, @@ -731,10 +716,10 @@ class ToolDefinition: ToolDefinition( name="plan_retry", description=( - "Retry a task that is currently in failed state. " - "Pass the failed task_id and optionally model_profile (defaults to baseline). " - "The same task_id is requeued and reset to pending on the cloud service. " - "Unknown task_id returns TASK_NOT_FOUND; non-failed tasks return TASK_NOT_FAILED." + "Retry a plan that is currently in failed state. " + "Pass the failed plan_id and optionally model_profile (defaults to baseline). " + "The same plan_id is requeued and reset to pending on the cloud service. " + "Unknown plan_id returns PLAN_NOT_FOUND; non-failed plans return PLAN_NOT_FAILED." ), input_schema=PLAN_RETRY_INPUT_SCHEMA, output_schema=PLAN_RETRY_OUTPUT_SCHEMA, @@ -753,7 +738,7 @@ class ToolDefinition: "for collapsible sections and interactive Gantt charts — open in a browser). " "Use artifact='zip' for the full pipeline output bundle (md, json, csv intermediary files that fed the report). " "If PLANEXE_PATH is unset, files are saved to the current working directory. " - "Filename format is - with numeric suffixes when collisions occur. " + "Filename format is - with numeric suffixes when collisions occur. 
" "Common local error codes: DOWNLOAD_FAILED, REMOTE_ERROR." ), input_schema=PLAN_DOWNLOAD_INPUT_SCHEMA, @@ -768,11 +753,10 @@ class ToolDefinition: ToolDefinition( name="plan_list", description=( - "List the most recent tasks for an authenticated user. " - "Requires user_api_key (pex_...). " - "Returns up to `limit` tasks (default 10, max 50) newest-first, each with task_id, state, " + "List the most recent plans for an authenticated user. " + "Returns up to `limit` plans (default 10, max 50) newest-first, each with plan_id, state, " "progress_percentage, created_at (ISO 8601), and a prompt_excerpt (first 100 chars). " - "Use this to recover a lost task_id or to review recent activity." + "Use this to recover a lost plan_id or to review recent activity." ), input_schema=PLAN_LIST_INPUT_SCHEMA, output_schema=PLAN_LIST_OUTPUT_SCHEMA, @@ -803,16 +787,16 @@ class ToolDefinition: "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. " "Write the prompt as flowing prose — weave specs, constraints, and targets naturally into sentences. " "Only after approval, call plan_create. " - "Each plan_create call creates a new task_id; the server does not enforce a global per-client concurrency limit. " + "Each plan_create call creates a new plan_id; the server does not enforce a global per-client concurrency limit. " "Then poll plan_status (about every 5 minutes); use plan_download when complete. " - "If a run fails, call plan_retry with the failed task_id to requeue it (optional model_profile, defaults to baseline). " - "To stop, call plan_stop with the task_id from plan_create; stopping is asynchronous and the task will eventually transition to failed. " + "If a run fails, call plan_retry with the failed plan_id to requeue it (optional model_profile, defaults to baseline). " + "To stop, call plan_stop with the plan_id from plan_create; stopping is asynchronous and the plan will eventually transition to failed. 
" "If model_profiles returns MODEL_PROFILES_UNAVAILABLE, inform the user that no models are currently configured and the server administrator needs to set up model profiles. " "Tool errors use {error:{code,message}}. plan_download may return REMOTE_ERROR or DOWNLOAD_FAILED. " "plan_download saves to PLANEXE_PATH (default: current working directory) and returns saved_path. " - "To list recent tasks for a user call plan_list with user_api_key; returns task_id, state, progress_percentage, created_at, and prompt_excerpt. " + "To list recent plans for a user call plan_list; returns plan_id, state, progress_percentage, created_at, and prompt_excerpt. " "plan_status state contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. " - "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the task was likely queued but not picked up by a worker (server issue). " + "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the plan was likely queued but not picked up by a worker (server issue). " "If plan_status is in processing and output files do not change for longer than 20 minutes, the run likely failed/stalled. " "In both cases, report the issue to PlanExe developers on GitHub: https://github.com/PlanExeOrg/PlanExe/issues . " "Main output: a self-contained interactive HTML report (~700KB) with collapsible sections and interactive Gantt charts — open in a browser. " @@ -860,10 +844,10 @@ async def handle_call_tool(name: str, arguments: dict[str, Any]) -> CallToolResu async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult: - """Create a task in mcp_cloud via the local HTTP proxy. + """Create a plan in mcp_cloud via the local HTTP proxy. Examples: - - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. 
Open within 12 months."} → task_id + created_at + - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. Open within 12 months."} → plan_id + created_at Args: - prompt: What the plan should cover (goal, context, constraints). @@ -871,10 +855,10 @@ async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult: Returns: - content: JSON string matching structuredContent. - - structuredContent: task_id/created_at payload or error. + - structuredContent: plan_id/created_at payload or error. - isError: True when the remote tool call fails. """ - req = TaskCreateRequest(**arguments) + req = PlanCreateRequest(**arguments) payload: dict[str, Any] = {"prompt": req.prompt} if req.model_profile: payload["model_profile"] = req.model_profile @@ -911,53 +895,53 @@ async def handle_model_profiles(arguments: dict[str, Any]) -> CallToolResult: async def handle_plan_status(arguments: dict[str, Any]) -> CallToolResult: - """Fetch status/progress for a task from mcp_cloud. + """Fetch status/progress for a plan from mcp_cloud. Examples: - - {"task_id": "uuid"} → state/progress/timing + - {"plan_id": "uuid"} → state/progress/timing Args: - - task_id: Task UUID returned by plan_create. + - plan_id: Plan UUID returned by plan_create. Returns: - content: JSON string matching structuredContent. - structuredContent: status payload or error. - isError: True when the remote tool call fails. """ - req = TaskStatusRequest(**arguments) - payload, error = _call_remote_tool("plan_status", {"task_id": req.task_id}) + req = PlanStatusRequest(**arguments) + payload, error = _call_remote_tool("plan_status", {"plan_id": req.plan_id}) if error: return _wrap_response({"error": error}, is_error=True) return _wrap_response(payload) async def handle_plan_stop(arguments: dict[str, Any]) -> CallToolResult: - """Request mcp_cloud to stop an active task. + """Request mcp_cloud to stop an active plan. 
Examples: - - {"task_id": "uuid"} → stop request acknowledged + - {"plan_id": "uuid"} → stop request acknowledged Args: - - task_id: Task UUID returned by plan_create. + - plan_id: Plan UUID returned by plan_create. Returns: - content: JSON string matching structuredContent. - structuredContent: {"state": "pending|processing|completed|failed", "stop_requested": bool} or error. - isError: True when the remote tool call fails. """ - req = TaskStopRequest(**arguments) - payload, error = _call_remote_tool("plan_stop", {"task_id": req.task_id}) + req = PlanStopRequest(**arguments) + payload, error = _call_remote_tool("plan_stop", {"plan_id": req.plan_id}) if error: return _wrap_response({"error": error}, is_error=True) return _wrap_response(payload) async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult: - """Request mcp_cloud to retry a failed task.""" - req = TaskRetryRequest(**arguments) + """Request mcp_cloud to retry a failed plan.""" + req = PlanRetryRequest(**arguments) payload, error = _call_remote_tool( "plan_retry", - {"task_id": req.task_id, "model_profile": req.model_profile}, + {"plan_id": req.plan_id, "model_profile": req.model_profile}, ) if error: return _wrap_response({"error": error}, is_error=True) @@ -965,14 +949,14 @@ async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult: async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult: - """Download report/zip for a task from mcp_cloud and save it locally. + """Download report/zip for a plan from mcp_cloud and save it locally. Examples: - - {"task_id": "uuid"} → download report (default) - - {"task_id": "uuid", "artifact": "zip"} → download zip + - {"plan_id": "uuid"} → download report (default) + - {"plan_id": "uuid", "artifact": "zip"} → download zip Args: - - task_id: Task UUID returned by plan_create. + - plan_id: Plan UUID returned by plan_create. - artifact: Optional "report" or "zip". 
Returns: @@ -980,14 +964,17 @@ async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult: - structuredContent: metadata + saved_path or error. - isError: True when download fails or remote tool errors. """ - req = TaskDownloadRequest(**arguments) + req = PlanDownloadRequest(**arguments) artifact = (req.artifact or "report").strip().lower() if artifact not in ("report", "zip"): - artifact = "report" + return _wrap_response( + {"error": {"code": "INVALID_ARGUMENT", "message": f"Invalid artifact type: {req.artifact!r}. Must be 'report' or 'zip'."}}, + is_error=True, + ) payload, error = _call_remote_tool( "plan_file_info", - {"task_id": req.task_id, "artifact": artifact}, + {"plan_id": req.plan_id, "artifact": artifact}, ) if error: return _wrap_response({"error": error}, is_error=True) @@ -998,10 +985,10 @@ async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult: if isinstance(download_url, str) and download_url.startswith("/"): download_url = urljoin(_get_download_base_url().rstrip("/") + "/", download_url.lstrip("/")) if not download_url: - download_url = _derive_download_url(req.task_id, artifact) + download_url = _derive_download_url(req.plan_id, artifact) try: - destination = _choose_output_path(req.task_id, download_url, artifact) + destination = _choose_output_path(req.plan_id, download_url, artifact) downloaded_size = _download_to_path(download_url, destination) except Exception as exc: return _wrap_response( @@ -1031,9 +1018,11 @@ async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult: async def handle_plan_list(arguments: dict[str, Any]) -> CallToolResult: - """List recent tasks for an authenticated user via mcp_cloud.""" - req = TaskListRequest(**arguments) - payload_args: dict[str, Any] = {"user_api_key": req.user_api_key, "limit": req.limit} + """List recent plans for an authenticated user via mcp_cloud.""" + req = PlanListRequest(**arguments) + payload_args: dict[str, Any] = {"limit": req.limit} 
+    if req.user_api_key:
+        payload_args["user_api_key"] = req.user_api_key
     payload, error = _call_remote_tool("plan_list", payload_args)
     if error:
         return _wrap_response({"error": error}, is_error=True)
@@ -1051,14 +1040,6 @@ async def handle_plan_list(arguments: dict[str, Any]) -> CallToolResult:
     "model_profiles": handle_model_profiles,
 }

-# Backward-compatible aliases
-handle_task_create = handle_plan_create
-handle_task_status = handle_plan_status
-handle_task_stop = handle_plan_stop
-handle_task_retry = handle_plan_retry
-handle_task_download = handle_plan_download
-handle_task_list = handle_plan_list
-

 async def main() -> None:
     logger.info("Starting PlanExe MCP local proxy using %s", _get_mcp_base_url())
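
Besides the rename, the diff above makes two behavioural changes: `handle_plan_download` now rejects an unknown `artifact` with `INVALID_ARGUMENT` instead of silently falling back to `report`, and `handle_plan_list` only forwards `user_api_key` when one was supplied. The sketch below isolates both rules so they can be checked without the proxy. It assumes Python 3.9+, and the helper names `normalize_artifact` and `build_plan_list_payload` are hypothetical, introduced here for illustration; they do not appear in the patch.

```python
from typing import Any, Optional


def normalize_artifact(raw: Optional[str]) -> tuple[Optional[str], Optional[dict[str, Any]]]:
    """Hypothetical helper: return (artifact, None) if valid, else (None, error payload)."""
    # Same normalisation as the patched handler: default to "report", trim, lowercase.
    artifact = (raw or "report").strip().lower()
    if artifact not in ("report", "zip"):
        # Mirrors the error shape used in the diff: {"error": {"code", "message"}}.
        return None, {
            "error": {
                "code": "INVALID_ARGUMENT",
                "message": f"Invalid artifact type: {raw!r}. Must be 'report' or 'zip'.",
            }
        }
    return artifact, None


def build_plan_list_payload(limit: int, user_api_key: Optional[str] = None) -> dict[str, Any]:
    """Hypothetical helper: build plan_list arguments, omitting an absent key."""
    payload: dict[str, Any] = {"limit": limit}
    if user_api_key:
        # Only forwarded when the caller supplied one; otherwise the key is left out.
        payload["user_api_key"] = user_api_key
    return payload


if __name__ == "__main__":
    print(normalize_artifact(" ZIP "))
    print(normalize_artifact("pdf"))
    print(build_plan_list_payload(10))
```

Returning an error payload rather than raising keeps the sketch close to how the handlers wrap failures with `_wrap_response(..., is_error=True)`; a real refactor could just as reasonably raise and convert at the call site.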