diff --git a/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md b/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md
index 3cece72c1..1e9106f92 100644
--- a/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md
+++ b/docs/proposals/70-mcp-interface-evaluation-and-roadmap.md
@@ -1,159 +1,212 @@
 ---
-title: MCP Interface — Evaluation and Roadmap
-date: 2026-02-25
----
+title: MCP Interface — Evaluation and Roadmap
+date: 2026-02-26
+---
 
 # MCP Interface — Evaluation and Roadmap
 
-An honest audit of the current MCP surface (mcp_cloud + mcp_local), followed by concrete improvements and promotion ideas.
+An honest audit of the current MCP surface (`mcp_cloud` + `mcp_local`), followed by concrete improvements and promotion ideas.
+
+**Revision history:**
+- **2026-02-26 (rev 1):** Initial version after `task_*` → `plan_*` rename.
+- **2026-02-26 (rev 2):** Updated after the `app.py` refactor into modules, `user_api_key` being made optional in the `plan_list` schema (auto-injected by the HTTP layer), and a re-evaluation of all open issues.
+- **2026-02-26 (rev 3):** Updated after completing 4.9 — all stale `task` variable names, request classes, helper functions, and backward-compat aliases renamed/removed across `mcp_cloud` and `mcp_local`. Test files renamed from `test_task_*` to `test_plan_*`.
+- **2026-02-26 (rev 4):** Updated after completing 4.2 — added separate download rate limiter with configurable limits (default 10 req/60s).
+- **2026-02-26 (rev 5):** Renamed external-facing fields: `task_id` → `plan_id`, `tasks` → `plans`, error codes `TASK_NOT_FOUND` → `PLAN_NOT_FOUND`, `TASK_NOT_FAILED` → `PLAN_NOT_FAILED`. Internal function names and download URL paths unchanged.
+
+---
+
+## 1. Current Tool Surface
+
+Nine tools, split across two transports:
+
+
+| Tool              | Cloud (`mcp_cloud`) | Local (`mcp_local`) | Auth     | Annotations             |
+| ----------------- | ------------------- | ------------------- | -------- | ----------------------- |
+| `prompt_examples` | yes                 | yes                 | Public   | readOnly, idempotent    |
+| `model_profiles`  | yes                 | yes                 | Public   | readOnly, idempotent    |
+| `plan_create`     | yes                 | yes                 | Required | openWorld               |
+| `plan_status`     | yes                 | yes                 | Required | readOnly, idempotent    |
+| `plan_stop`       | yes                 | yes                 | Required | destructive, idempotent |
+| `plan_retry`      | yes                 | yes                 | Required | openWorld               |
+| `plan_file_info`  | yes                 | —                   | Required | readOnly, idempotent    |
+| `plan_download`   | —                   | yes                 | Required | openWorld               |
+| `plan_list`       | yes                 | yes                 | Required | readOnly, idempotent    |
+
+
+`plan_download` is a local-only synthetic tool that internally proxies to `plan_file_info` on the cloud, then downloads and saves the artifact to the user's filesystem. This intentional asymmetry is tested in `test_tool_surface_consistency.py`.
+
+**Auth model for `plan_create` and `plan_list`:** Both tools accept an optional `user_api_key` in the visible MCP input schema. When called over HTTP, the middleware authenticates the caller via the `X-API-Key` header and auto-injects `user_api_key` into handler arguments, so MCP clients never need to pass `user_api_key` explicitly: the key is optional in the published schema but enforced at runtime. Both handlers return `USER_API_KEY_REQUIRED` when `PLANEXE_MCP_REQUIRE_USER_KEY` requires a key and none arrives by either path.
 
 ---
 
-## 1. What's Working Well
+## 2. What's Working Well
 
 **Dual transport.** `mcp_cloud` (stateless HTTP / Railway) and `mcp_local` (stdio proxy) cover the two major deployment patterns. Most users can pick one without reading source code.
 
-**Layered authentication.** Two distinct auth paths — a server-wide `PLANEXE_MCP_API_KEY` for self-hosters, and per-user `pex_…` keys issued by home.planexe.org — are a good design. The key-normalisation fix (`_normalize_api_key_value`) makes the second path robust against copy-paste artefacts.
+**Clean module structure.** `mcp_cloud/app.py` is now a thin re-export facade (~195 lines). Logic lives in focused modules: `handlers.py` (tool handlers), `schemas.py` (tool definitions), `tool_models.py` (Pydantic models), `db_queries.py` (DB operations), `auth.py` (key hashing/user resolution), `download_tokens.py` (signed tokens), `model_profiles.py`, `worker_fetchers.py`, `zip_utils.py`, `prompt_examples.py`. This makes PRs reviewable and bugs easy to isolate.
+
+**Consistent `plan_*` naming throughout.** The rename from `task_*` to `plan_*` covers the full stack: external tool names, handler functions, request classes (`PlanCreateRequest`, etc.), DB query helpers (`_create_plan_sync`, `get_plan_by_id`, etc.), local variable names, and test file names. No backward-compat aliases remain.
+
+**Layered authentication.** Two distinct auth paths — a server-wide `PLANEXE_MCP_API_KEY` for self-hosters, and per-user `pex_…` keys issued by home.planexe.org — are a good design. The key-normalisation helper (`_normalize_api_key_value` in `http_server.py`) handles common copy-paste artefacts (Bearer prefix, surrounding quotes, full header line pasted as value).
+
+**Auto-injected `user_api_key`.** For `plan_create` and `plan_list`, the HTTP layer reads the authenticated user from the request context and injects `user_api_key` into handler arguments automatically. Callers never see `user_api_key` as a required field in the MCP schema — a clean separation between transport-level auth and tool-level logic.
 
-**Structured output schemas.** Every tool declares an `output_schema`, so MCP clients can validate responses without guessing. The `TestAllToolsHaveOutputSchema` test enforces this at CI time.
+**Structured output schemas.** Every tool declares an `output_schema`, so MCP clients can validate responses without guessing. `TestAllToolsHaveOutputSchema` enforces this at CI time.
 
 **Tool annotations.** `readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint` are set on every tool and tested. This is ahead of most MCP servers.
 
-**task_retry with model_profile selection.** Allowing the caller to re-run a failed task with a stronger model (e.g. upgrade from `baseline` to `thorough`) at retry time is genuinely useful.
+**`plan_retry` with model_profile selection.** Allowing the caller to re-run a failed plan with a stronger model (e.g. upgrade from `baseline` to `premium`) at retry time is genuinely useful.
+
+**Signed download tokens.** `plan_file_info` returns download URLs with HMAC-SHA256 signed, time-limited tokens (15-min default TTL) scoped to one artifact (`task_id:filename:expiry`). Tokens work in a browser without an API key header. Defence-in-depth: the download endpoint re-validates even after middleware has passed the token. The secret fallback chain is: `PLANEXE_DOWNLOAD_TOKEN_SECRET` → `PLANEXE_API_KEY_SECRET` → per-process random (with warning).
 
 **Glama + llms.txt.** Being listed in the Glama registry and providing `llms.txt` lowers the discovery barrier for new users.
 
-**Rate limiting on REST endpoints.** `slowapi` limits `/tasks` create/retry endpoints, protecting the backend from burst abuse.
+**Rate limiting on all MCP endpoints.** `_enforce_rate_limit` in `http_server.py` applies to `/mcp`, `/mcp/`, and `/mcp/tools/call`. The default limit (60 req / 60 s per client, keyed by API key or IP) is high enough that normal `plan_status` polling is never affected.
 
 **Prompt guidance in schema.** The `prompt` field description ("300–800 words … objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria") sets user expectations up front.
 
+**`plan_list` for plan recovery.** Authenticated users can list their most recent plans (up to 50, newest-first) to recover a lost `plan_id`. Each entry includes `plan_id`, `state`, `progress_percentage`, `created_at`, and `prompt_excerpt`.
+
+**Comprehensive test suite.** 12 test files covering tool surface consistency, auth key parsing, CORS config, download tokens, HTTP routing, and individual tool behaviour (`test_plan_create_tool.py`, `test_plan_status_tool.py`, `test_plan_retry_tool.py`, `test_plan_file_info_tool.py`, `test_model_profiles_tool.py`).
+
 ---
 
-## 2. What's Broken or Inconsistent
+## 3. What's Been Fixed (Previously Reported)
+
+### 3.1 ~~`skills/planexe-mcp/SKILL.md` says "5 tools"~~ (FIXED)
+
+Updated to nine tools; SKILL.md now lists all tools with example JSON-RPC calls.
 
-### 2.1 ~~`skills/planexe-mcp/SKILL.md` says "5 tools"~~ (FIXED)
+### 3.2 ~~Trailing-slash inconsistency~~ (FIXED)
 
-Updated to "seven core tools"; added Tool 5 (`model_profiles`) and Tool 7 (`task_retry`) sections; updated the typical workflow to reference both. Note: with `task_list` now added the total is eight — SKILL.md updated accordingly.
+The canonical URL (`https://mcp.planexe.org/mcp`, no trailing slash) is used in all JSON config files and registry entries.
 
-### 2.2 ~~Trailing-slash inconsistency~~ (FIXED)
+### 3.3 ~~`speed_vs_detail` documented but hidden from agents~~ (FIXED)
 
-The canonical URL (`https://mcp.planexe.org/mcp`, no trailing slash) is used in all JSON config files and registry entries. The MCP Inspector CLI command in `docs/mcp/inspector.md` intentionally keeps the trailing slash (the inspector appends sub-paths; without `/` it sends requests to the wrong path). A note clarifying this distinction was added to `inspector.md`.
+Removed entirely from the MCP interface.
 
-### 2.3 ~~`speed_vs_detail` is documented but hidden from agents~~ (FIXED)
+### 3.4 ~~`plan_file_info` returns `{}` on success instead of `isError`~~ (FIXED)
 
-The `speed_vs_detail` parameter was a developer-only hidden override that was rarely used and created a docs/schema mismatch. It has been removed from the MCP interface entirely: the dead code was deleted from `mcp_cloud/app.py` and `mcp_cloud/http_server.py`, the legacy backward-compat forwarding block was removed from `mcp_local/planexe_mcp_local.py`, and all references were purged from docs.
+Now returns `{"ready": false, "reason": "processing"}` while running and `{"ready": false, "reason": "failed", "error": {...}}` on failure.
 
-### 2.4 ~~`task_file_info` returns `{}` on success instead of `isError`~~ (FIXED)
+### 3.5 ~~Rate limiting covers REST but not Streamable HTTP `/mcp`~~ (FIXED)
 
-`task_file_info` now returns `{"ready": false, "reason": "processing"}` when the task is still running, and `{"ready": false, "reason": "failed", "error": {...}}` when it has failed. The output schema was updated (replaced the empty-dict variant with `TaskFileInfoNotReadyOutput`), and both `PLANEXE_SERVER_INSTRUCTIONS` and the tool description were updated accordingly.
+`_enforce_rate_limit` now covers `/mcp`, `/mcp/`, and `/mcp/tools/call`.
 
-### 2.5 ~~Rate limiting covers REST but not the Streamable HTTP `/mcp` endpoint~~ (FIXED)
+### 3.6 ~~No `plan_list` tool — lost `task_id` = lost task~~ (FIXED)
 
-`_enforce_rate_limit` in `mcp_cloud/http_server.py` now applies to `/mcp` and `/mcp/` in addition to `/mcp/tools/call`. The default limit (60 req/60 s per client) is high enough that normal polling of `task_status` is never affected.
+Added `plan_list` to both `mcp_cloud` and `mcp_local`. Returns up to 50 plans newest-first.
 
-### 2.6 ~~No `task_list` tool — lost `task_id` = lost task~~ (FIXED)
+### 3.7 ~~Signed, expiring download tokens~~ (FIXED)
 
-Added `task_list` to both `mcp_cloud` and `mcp_local`. Requires `user_api_key`; returns up to 50 tasks newest-first with `task_id`, `state`, `progress_percentage`, `created_at`, and `prompt_excerpt`. The `task_create` description was updated to say "call task_list to recover a lost task_id" instead of "no task_list, lost task_id = lost task".
+HMAC-SHA256 tokens, 15-minute default TTL, scoped per-artifact.
 
-### 2.7 `app.py` is an 81 KB monolith
+### 3.8 ~~Tools used `task_*` prefix instead of `plan_*`~~ (FIXED)
 
-All tool handlers, auth logic, DB calls, and schema definitions live in one file. This makes onboarding slow, PRs hard to review, and bugs harder to isolate.
+All external tool names renamed to `plan_*`.
 
-**Fix:** Refactor into modules: `auth.py`, `tools/task.py`, `tools/meta.py`, `schemas.py`.
+### 3.9 ~~`app.py` is a 76 KB monolith~~ (FIXED)
+
+Refactored into 10+ focused modules (commit 9f1a7db9). `app.py` is now a thin re-export facade.
+
+### 3.10 ~~`plan_list` requires `user_api_key` in visible MCP schema~~ (FIXED)
+
+`user_api_key` is now optional in the `PlanListInput` schema (not in `required` list), matching `plan_create`. The HTTP layer auto-injects it from the `X-API-Key` header via `_get_authenticated_user_api_key()`. The handler still enforces the key at runtime (returns `USER_API_KEY_REQUIRED` if absent).
 
 ---
 
-## 3. Proposed Improvements
+## 4. What's Broken or Inconsistent
+
+### ~~4.1 Dev-secret fallback in production~~ (FIXED)
+
+`auth.py` now exports `validate_api_key_secret()` which raises `RuntimeError` when `PLANEXE_API_KEY_SECRET` is not set. `download_tokens.py` exports `validate_download_token_secret()` which raises when neither `PLANEXE_DOWNLOAD_TOKEN_SECRET` nor `PLANEXE_API_KEY_SECRET` is set. Both are called at module level in `http_server.py` when `AUTH_REQUIRED` is true, so the server fails hard at startup instead of silently falling back to dev secrets. The existing runtime fallbacks (`"dev-api-key-secret"` and random per-process secret) remain for local development with `PLANEXE_MCP_REQUIRE_AUTH=false`.
 
-### 3.1 `task_list` tool (high value, low effort)
+### ~~4.2 `/download` endpoint not rate-limited~~ (FIXED)
 
-```json
-{
-  "name": "task_list",
-  "description": "List the most recent tasks for the authenticated user.",
-  "inputSchema": {
-    "properties": {
-      "limit": {"type": "integer", "default": 10, "maximum": 50}
-    }
-  }
-}
-```
+A separate download rate limiter (`_enforce_download_rate_limit`) now covers `/download` paths with its own bucket and configurable limits: `PLANEXE_MCP_DOWNLOAD_RATE_LIMIT` (default 10 req) and `PLANEXE_MCP_DOWNLOAD_RATE_WINDOW_SECONDS` (default 60 s). This is deliberately tighter than the MCP rate limit (60 req / 60 s) since download responses are 700 KB–6 MB. The sweep task cleans up download buckets alongside MCP buckets.
 
-Recovers lost task IDs, enables dashboards, and is the single most-requested missing feature in similar task-runner MCP servers.
+### ~~4.3 Body size validation only on REST endpoint~~ (FIXED)
 
-### 3.2 ~~Signed, expiring download tokens~~ (FIXED)
+`_enforce_body_size` now checks both `/mcp/tools/call` and `/mcp/` POST requests. The `Content-Length` requirement (411) is only enforced on the REST endpoint since Streamable HTTP may use chunked encoding without `Content-Length`; however, when `Content-Length` is present on either endpoint it is validated against `MAX_BODY_BYTES`.
 
-`task_file_info` now returns download URLs that include a signed, short-lived token:
-`/download/{task_id}/{filename}?token={expiry}.{hmac_sha256}`.
+### ~~4.4 `plan_file_info` silently defaults invalid artifact to `"report"`~~ (FIXED)
 
-- Token is HMAC-SHA256 over `task_id:filename:expiry`, scoped to one artifact.
-- Default TTL: 15 minutes (configurable via `PLANEXE_DOWNLOAD_TOKEN_TTL`).
-- Secret priority: `PLANEXE_DOWNLOAD_TOKEN_SECRET` → `PLANEXE_API_KEY_SECRET` → random per-process (with warning).
-- Tokenised URLs work in a browser without an API key header; the middleware validates the token and skips the API-key check.
-- Defence-in-depth: the download endpoint re-validates the token even after the middleware has passed it.
-- Backward compatible: requests without a token still require a valid API key header (existing behaviour).
+Both `handle_plan_file_info` (cloud) and `handle_plan_download` (local) now return `INVALID_ARGUMENT` with a descriptive message when the artifact value is not `"report"` or `"zip"`.
 
-### 3.3 SSE progress streaming (UX)
+### ~~4.5 No dedicated `plan_list` test~~ (FIXED)
 
-Long-running plans (10–20 minutes) give the user no feedback. A `task_progress` SSE endpoint (or a `progress` field in `task_status`) returning incremental log lines would dramatically improve perceived responsiveness.
+Added `mcp_cloud/tests/test_plan_list_tool.py` with 8 tests covering: tool listed, returns tasks, empty result, limit clamping (both directions), invalid API key, `USER_API_KEY_REQUIRED` when env requires key, no-key passthrough when not required (user_id=None), and default limit.
 
-Minimum viable version: a `log_lines` array in the `task_status` response (last 50 lines of agent output).
+### ~~4.6 CORS default is wildcard~~ (FIXED)
 
-### 3.4 Webhook / push notification (power users)
+When `AUTH_REQUIRED` is true and `PLANEXE_MCP_CORS_ORIGINS` is unset, the default is now `["https://mcp.planexe.org", "https://home.planexe.org"]` instead of `["*"]`. Wildcard CORS is only used in dev mode (`PLANEXE_MCP_REQUIRE_AUTH=false`) so browser-based tools like MCP Inspector work without extra configuration. Operators can override via `PLANEXE_MCP_CORS_ORIGINS`.
 
-Add an optional `webhook_url` to `task_create`. When the task transitions to `completed` or `failed`, POST a JSON summary to that URL. This removes the need for polling and enables CI/CD integrations.
+### ~~4.7 No request logging for successful tool calls~~ (FIXED)
 
-### 3.5 API versioning
+`handle_call_tool` now logs every tool call at INFO level with tool name, result (ok/error/exception), and duration in milliseconds. Unknown tools are logged at WARNING. Format: `tool_call tool=<name> result=<ok|error|exception> duration_ms=<N>`.
 
-All tool names and schemas are currently unversioned. A future breaking change (e.g. renaming `task_file_info` to `task_files`) will silently break clients.
-Add a `server_version` field to the `task_status` output and document a stability policy.
+### ~~4.8 Prompt excerpt length hardcoded~~ (FIXED)
 
-### 3.6 Refactor `app.py` into modules
+Extracted to `PROMPT_EXCERPT_MAX_LENGTH = 100` at module level in `db_queries.py`.
 
-```
-mcp_cloud/
-  auth.py          # _resolve_user_from_api_key, _hash_user_api_key
-  schemas.py       # TASK_CREATE_INPUT_SCHEMA, TOOL_DEFINITIONS, …
-  tools/
-    task.py        # task_create, task_status, task_stop, task_retry, task_list
-    meta.py        # prompt_examples, model_profiles
-  http_server.py   # ASGI wiring only
-  app.py           # thin entry-point, imports from above
-```
+### ~~4.9 Stale `task` variable names and backward-compat aliases~~ (FIXED)
 
-### 3.7 Remove or deprecate legacy REST endpoints
+All internal naming now uses `plan` consistently. Request classes renamed (`TaskCreateRequest` → `PlanCreateRequest`, etc.), DB query helpers renamed (`_create_task_sync` → `_create_plan_sync`, `get_task_by_id` → `get_plan_by_id`, etc.), local variables renamed (`task_snapshot` → `plan_snapshot`, etc.), all backward-compat aliases removed from `tool_models.py`, `schemas.py`, `handlers.py`, `app.py`, and `mcp_local/planexe_mcp_local.py` (~86 lines deleted). Test files renamed from `test_task_*.py` to `test_plan_*.py` with patch targets updated.
 
-The `/tasks` REST routes duplicate functionality now available through MCP tools. Keeping both surfaces means bugs can exist in one but not the other (as happened with the auth issue). Deprecate `/tasks` in favour of the MCP tool surface, with a sunset date in the changelog.
+### ~~4.10 `plan_list` auth differs from `plan_create`~~ (FIXED)
+
+`plan_list` now uses the same `PLANEXE_MCP_REQUIRE_USER_KEY` check as `plan_create`. When the key is not required and not provided, `plan_list` returns all plans (no user scoping). `_list_tasks_sync` accepts `user_id=None` to support this.
 
 ---
 
-## 4. Promotion and Growth Strategies
+## 5. Proposed Improvements
+
+### 5.1 SSE progress streaming (UX)
+
+Long-running plans (10–20 minutes) give the user no feedback. A full SSE progress endpoint would address this; the minimum viable version is a `log_lines` array in the `plan_status` response (last 50 lines of agent output), which would already improve perceived responsiveness.
+
+### 5.2 Webhook / push notification (power users)
+
+Add an optional `webhook_url` to `plan_create`. When the plan transitions to `completed` or `failed`, POST a JSON summary to that URL. This removes the need for polling and enables CI/CD integrations.
 
-### 4.1 MCP registries
+### 5.3 API versioning
 
-- **Glama** — already listed ✓
+All tool names and schemas are currently unversioned. A future breaking change will silently break clients. Add a `server_version` field to the `plan_status` output and document a stability policy.
+
+### 5.4 ~~Startup environment validation~~ (FIXED)
+
+Implemented as part of 4.1: when auth is enabled, the server checks at startup that `PLANEXE_API_KEY_SECRET` is set and that a download-token secret (`PLANEXE_DOWNLOAD_TOKEN_SECRET` or `PLANEXE_API_KEY_SECRET`) is available, and fails hard instead of falling back to dev defaults.
+
+---
+
+## 6. Promotion and Growth Strategies
+
+### 6.1 MCP registries
+
+- **Glama** — already listed
 - **mcp.so** — submit `server.json`; high traffic from Claude desktop users
 - **Smithery** — another fast-growing directory; supports one-click install
 - **awesome-mcp-servers** (GitHub) — submit a PR; maintainers merge quickly
 - **OpenTools** — focus on enterprise MCP discovery
 
-### 4.2 Content
+### 6.2 Content
 
-- **Blog post: "From prompt to project plan in 60 seconds"** — a short walkthrough showing MCP Inspector → task_create → task_status → download. Publish on dev.to, Hacker News (Show HN), and the PlanExe GitHub Discussions.
+- **Blog post: "From prompt to project plan in 60 seconds"** — a short walkthrough showing MCP Inspector → `plan_create` → `plan_status` → download. Publish on dev.to, Hacker News (Show HN), and the PlanExe GitHub Discussions.
 - **YouTube demo (2–3 minutes)** — screen recording of Claude Desktop using PlanExe MCP end-to-end. Pin it to the README.
-- **Twitter/X thread** — "I built an MCP server that turns a ~500-word prompt into a full project plan. Here's how it works: 🧵"
+- **Twitter/X thread** — "I built an MCP server that turns a ~500-word prompt into a full project plan. Here's how it works:"
 
-### 4.3 Community integrations
+### 6.3 Community integrations
 
 - **Claude Desktop config snippet** — provide a ready-to-paste `claude_desktop_config.json` block in the README.
 - **Cursor / Windsurf rule** — provide a `.cursorrules` or `.windsurfrules` snippet that wires PlanExe MCP automatically.
-- **GitHub Actions** — a reusable workflow `planexe/create-plan@v1` that runs `task_create` and uploads the result as a release asset. This is a high-visibility integration channel.
+- **GitHub Actions** — a reusable workflow `planexe/create-plan@v1` that runs `plan_create` and uploads the result as a release asset. This is a high-visibility integration channel.
 
-### 4.4 Example prompt gallery
+### 6.4 Example prompt gallery
 
 Add 10–15 high-quality example prompts (startup, research paper, home renovation, hiring plan, …) to `prompt_examples`. Agents and users copy-paste these; each successful use is a social proof data point.
 
-### 4.5 Observability / social proof
+### 6.5 Observability / social proof
 
 - Add a public counter to the homepage: "X plans created this week".
 - Post a monthly changelog to GitHub Discussions so subscribers see activity.
@@ -161,26 +214,44 @@ Add 10–15 high-quality example prompts (startup, research paper, home renovati
 
 ---
 
-## 5. Quick-win Checklist
-
-| Priority | Task | Effort |
-|----------|------|--------|
-| P0 | ~~Fix SKILL.md tool count~~ (DONE) | — |
-| P0 | ~~Standardise URL trailing slash~~ (DONE) | — |
-| P0 | ~~Fix `speed_vs_detail` schema/docs mismatch~~ (DONE) | — |
-| P1 | ~~Add `task_list` tool~~ (DONE) | — |
-| P1 | ~~Fix `task_file_info` empty-dict response~~ (DONE) | — |
-| P1 | ~~Add rate limiting to `/mcp` endpoint~~ (DONE) | — |
-| P1 | Submit to mcp.so + Smithery | 30 min |
-| P1 | Write README demo GIF / YouTube link | 1 h |
-| P2 | Add `log_lines` to task_status | 4 h |
-| P2 | Refactor app.py into modules | 1 day |
-| P3 | ~~Signed download tokens~~ (DONE) | — |
-| P3 | Webhook support | 1 day |
-| P3 | GitHub Actions integration | 1 day |
+## 7. Quick-win Checklist
+
+
+| Priority | Task                                                                   | Effort | Status |
+| -------- | ---------------------------------------------------------------------- | ------ | ------ |
+| P0       | ~~Fix SKILL.md tool count~~                                            | —      | DONE   |
+| P0       | ~~Standardise URL trailing slash~~                                     | —      | DONE   |
+| P0       | ~~Fix `speed_vs_detail` schema/docs mismatch~~                         | —      | DONE   |
+| P0       | ~~Rename tools from `task_*` to `plan_*`~~                             | —      | DONE   |
+| P1       | ~~Add `plan_list` tool~~                                               | —      | DONE   |
+| P1       | ~~Fix `plan_file_info` empty-dict response~~                           | —      | DONE   |
+| P1       | ~~Add rate limiting to `/mcp` endpoint~~                               | —      | DONE   |
+| P1       | ~~Signed download tokens~~                                             | —      | DONE   |
+| P1       | ~~Refactor `app.py` into modules~~                                     | —      | DONE   |
+| P1       | ~~Remove `user_api_key` from `plan_list` visible schema~~              | —      | DONE   |
+| P1       | ~~Fail-hard on missing secrets in production (4.1)~~                   | —      | DONE   |
+| P1       | ~~Rate-limit `/download` endpoint (4.2)~~                              | —      | DONE   |
+| P1       | ~~Add `plan_list` handler tests (4.5)~~                                | —      | DONE   |
+| P1       | Submit to mcp.so + Smithery                                            | 30 min |        |
+| P1       | Write README demo GIF / YouTube link                                   | 1 h    |        |
+| P2       | ~~Body size validation on Streamable HTTP (4.3)~~                      | —      | DONE   |
+| P2       | ~~Return error for invalid artifact value (4.4)~~                      | —      | DONE   |
+| P2       | ~~Add tool-call audit logging (4.7)~~                                  | —      | DONE   |
+| P2       | Add `log_lines` to `plan_status` (5.1)                                 | 4 h    |        |
+| P2       | ~~Rename internal `task` variables/classes/helpers to `plan` (4.9)~~   | —      | DONE   |
+| P2       | ~~Remove backward-compat `Task*`/`handle_task_*`/`TASK_*` aliases (4.9)~~ | —  | DONE   |
+| P2       | ~~Rename test files from `test_task_*` to `test_plan_*` (4.9)~~       | —      | DONE   |
+| P2       | ~~Tighten default CORS origins (4.6)~~                                 | —      | DONE   |
+| P2       | ~~Align `plan_list` auth with `plan_create` (4.10)~~                   | —      | DONE   |
+| P3       | Webhook support (5.2)                                                  | 1 day  |        |
+| P3       | API versioning (5.3)                                                   | 4 h    |        |
+| P3       | GitHub Actions integration (6.3)                                       | 1 day  |        |
+
 
 ---
 
-## 6. Summary
+## 8. Summary
+
+The MCP surface is functionally solid and ahead of most MCP servers in terms of schema rigour, annotation coverage, and security (signed download tokens, layered auth, auto-injected user keys). The codebase has been significantly improved since rev 1: `app.py` was refactored from a 76 KB monolith into 10+ focused modules, `plan_list` now follows the same auth-injection pattern as `plan_create`, and all P0 issues are resolved.
 
-The MCP surface is functionally solid and ahead of most hobby MCP servers in terms of schema rigour and annotation coverage. The main weaknesses are: small but sharp inconsistencies in docs/schemas that erode trust, a missing `task_list` tool that makes the server feel fragile in long agent sessions, and limited discovery beyond Glama. Fixing the P0/P1 items above takes less than a day and would meaningfully improve both reliability and adoption.
+All P1 code-quality issues are now resolved, including fail-hard on missing secrets in production (4.1). The remaining checklist items are promotion/growth tasks (mcp.so and Smithery submission, README demo) and lower-priority enhancements (`log_lines` in `plan_status`, webhooks, API versioning, GitHub Actions integration).
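Note on the signed download tokens described in the proposal above: the following is a minimal Python sketch of the scheme as the document states it, that is, an HMAC-SHA256 over `id:filename:expiry` carried in a `{expiry}.{hmac_sha256}` token with a 15-minute default TTL. The function names, the `now` parameter, and the hex encoding of the signature are assumptions made for illustration; the actual helpers in `download_tokens.py` may look different.

```python
import hashlib
import hmac
import time
from typing import Optional

DEFAULT_TTL_SECONDS = 15 * 60  # 15-minute default TTL, per the proposal


def _sign(secret: bytes, plan_id: str, filename: str, expiry: int) -> str:
    """Hex HMAC-SHA256 over 'plan_id:filename:expiry' (hex encoding is assumed)."""
    message = f"{plan_id}:{filename}:{expiry}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def make_token(secret: bytes, plan_id: str, filename: str,
               now: Optional[float] = None,
               ttl: int = DEFAULT_TTL_SECONDS) -> str:
    """Build a token of the form '{expiry}.{signature}' (hypothetical helper)."""
    current = int(time.time() if now is None else now)
    expiry = current + ttl
    return f"{expiry}.{_sign(secret, plan_id, filename, expiry)}"


def verify_token(secret: bytes, plan_id: str, filename: str, token: str,
                 now: Optional[float] = None) -> bool:
    """Return True only if the token is unexpired and matches this artifact."""
    current = int(time.time() if now is None else now)
    expiry_text, sep, signature = token.partition(".")
    if not sep:
        return False
    try:
        expiry = int(expiry_text)
    except ValueError:
        return False
    if expiry < current:
        return False
    expected = _sign(secret, plan_id, filename, expiry)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Because the expiry is part of the signed message, a client cannot extend a token's lifetime by editing the prefix, and because the filename is signed too, a token for the report cannot be reused for the zip.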
diff --git a/mcp_cloud/AGENTS.md b/mcp_cloud/AGENTS.md
index e015e14a4..46773168b 100644
--- a/mcp_cloud/AGENTS.md
+++ b/mcp_cloud/AGENTS.md
@@ -15,7 +15,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
 - MCP tools must follow the specification in `docs/mcp/planexe_mcp_interface.md`:
   - Task management maps to `PlanItem` records (each task = one PlanItem).
   - Events are queried from `EventItem` database records.
-- Use the PlanItem UUID as the MCP `task_id`.
+- Use the PlanItem UUID as the MCP `plan_id`.
 - Public task state contract:
   - `plan_status.state` must use exactly: `pending`, `processing`, `completed`, `failed`.
   - These values correspond 1:1 with `database_api.model_planitem.PlanState`.
@@ -35,7 +35,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
 - Expose `model_profiles` as the discovery tool for profile selection.
 - `model_profiles` must report profile guidance and currently available models after class whitelist filtering.
 - Keep workflow wording explicit that prompt drafting + user approval is a non-tool step before `plan_create`.
-- Keep concurrency wording explicit: each `plan_create` call creates a new `task_id`; no global per-client concurrency cap is enforced server-side.
+- Keep concurrency wording explicit: each `plan_create` call creates a new `plan_id`; no global per-client concurrency cap is enforced server-side.
 - Visible input schema is intentionally limited to:
   - `prompt`
   - `model_profile` (`baseline`, `premium`, `frontier`, `custom`)
@@ -45,7 +45,7 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
 - The server communicates over stdio (standard input/output) following the MCP protocol.
 - Tools are registered via `@mcp_cloud.list_tools()` and handled via `@mcp_cloud.call_tool()`.
 - All tool responses must be JSON-serializable and follow the error model in the spec.
-- Keep tool error codes/docs aligned with actual runtime payloads (for example `TASK_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `generation_failed`, `content_unavailable`, `INTERNAL_ERROR`).
+- Keep tool error codes/docs aligned with actual runtime payloads (for example `PLAN_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `generation_failed`, `content_unavailable`, `INTERNAL_ERROR`).
 - Event cursors use format `cursor_{event_id}` for incremental polling.
 - **Run as task**: We expose MCP **tools** only (plan_create, plan_status, plan_stop, etc.), not the MCP **tasks** protocol (tasks/get, tasks/result, etc.). Do not advertise the tasks capability or add "Run as task" support; the spec and clients (e.g. Cursor) are aligned on tools-only.
 
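Note on the error model referenced in the `AGENTS.md` hunk above: `mcp_cloud/README.md` documents tool errors as `{"error":{"code","message","details?"}}`. A minimal Python sketch of a helper that produces that shape and checks JSON-serializability could look like the following. The helper name `make_tool_error` is hypothetical and not taken from the codebase.

```python
import json
from typing import Any, Dict, Optional


def make_tool_error(code: str, message: str,
                    details: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build an error payload of the shape {"error": {"code", "message", "details?"}}.

    Hypothetical helper for illustration; 'details' is omitted when not given.
    """
    error: Dict[str, Any] = {"code": code, "message": message}
    if details is not None:
        error["details"] = details
    payload = {"error": error}
    # Fail early (TypeError) if the payload cannot be serialized to JSON,
    # since all tool responses must be JSON-serializable.
    json.dumps(payload)
    return payload
```

For example, `make_tool_error("PLAN_NOT_FOUND", "Plan not found")` yields a payload with only `code` and `message`, which matches the optional status of `details` in the documented contract.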
diff --git a/mcp_cloud/README.md b/mcp_cloud/README.md
index 2cdd3a0b0..91263dd63 100644
--- a/mcp_cloud/README.md
+++ b/mcp_cloud/README.md
@@ -34,8 +34,8 @@ Build and run mcp_cloud with HTTP endpoints:
 docker compose up
 ```
 
-Important: `mcp_cloud` enqueues tasks and `worker_plan_database_{n}` executes them.  
-If no `worker_plan_database*` service is running, `plan_create` returns a task id but the task will not progress.
+Important: `mcp_cloud` enqueues plans and `worker_plan_database_{n}` executes them.
+If no `worker_plan_database*` service is running, `plan_create` returns a plan id but the plan will not progress.
 
 mcp_cloud exposes HTTP endpoints on port `8001` (or `${PLANEXE_MCP_HTTP_PORT}`). Authentication is controlled by `PLANEXE_MCP_REQUIRE_AUTH`:
 - `false`: no API key needed (local docker default).
@@ -133,31 +133,33 @@ See `docs/mcp/planexe_mcp_interface.md` for full specification. Available tools:
 
 - `prompt_examples` - Return example prompts. Use these as examples for plan_create.
 - `model_profiles` - List profile options and currently available models in each profile.
-- `plan_create` - Create a new task (returns task_id as UUID; may require user_api_key for credits)
-- `plan_status` - Get task status and progress
-- `plan_stop` - Stop an active task
-- `plan_retry` - Retry a failed task with the same task_id (optional model_profile, default baseline)
+- `plan_create` - Create a new plan (returns plan_id as UUID; may require user_api_key for credits)
+- `plan_status` - Get plan status and progress
+- `plan_stop` - Stop an active plan
+- `plan_retry` - Retry a failed plan with the same plan_id (optional model_profile, default baseline)
 - `plan_file_info` - Get file metadata for report or zip
 
 `plan_status` caller contract:
 - `pending` / `processing`: keep polling.
 - `completed`: terminal success, download is ready.
 - `failed`: terminal error.
-- If `failed`, call `plan_retry` to requeue the same task id.
+- If `failed`, call `plan_retry` to requeue the same plan id.
 
 Concurrency semantics:
-- Each `plan_create` call creates a new `task_id`.
-- `plan_retry` reuses the same failed `task_id`.
-- Server does not enforce a global one-task-at-a-time cap per client.
-- Client should track task ids explicitly when running tasks in parallel.
+- Each `plan_create` call creates a new `plan_id`.
+- `plan_retry` reuses the same failed `plan_id`.
+- Server does not enforce a global one-plan-at-a-time cap per client.
+- Client should track plan ids explicitly when running plans in parallel.
 
 Minimal error contract:
 - Tool errors use `{"error":{"code","message","details?"}}`.
-- Common codes: `TASK_NOT_FOUND`, `TASK_NOT_FAILED`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
+- Common codes: `PLAN_NOT_FOUND`, `PLAN_NOT_FAILED`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
 - `plan_file_info` may return `{}` while output is not ready (not an error payload).
 
 Note: `plan_download` is a synthetic tool provided by `mcp_local`, not by this server. If your client exposes `plan_download`, use it to save the report or zip locally; otherwise use `plan_file_info` to get `download_url` and fetch the file yourself.
 
+> **Breaking change (v2026-02-26):** External-facing field names were renamed from `task_id` → `plan_id`, `tasks` → `plans`, and error codes from `TASK_NOT_FOUND` → `PLAN_NOT_FOUND`, `TASK_NOT_FAILED` → `PLAN_NOT_FAILED`.
+
 **Tip**: Call `prompt_examples` to get example prompts to use with plan_create, then call `model_profiles` to choose `model_profile` based on current runtime availability. The prompt catalog is the same as in the frontends (`worker_plan.worker_plan_api.PromptCatalog`). When running with `PYTHONPATH` set to the repo root (e.g. stdio setup), the catalog is loaded automatically; otherwise built-in examples are returned.
 
 Download flow: call `plan_file_info` to obtain the `download_url`, then fetch the
@@ -407,5 +409,5 @@ See `railway.md` for Railway-specific deployment instructions. The server automa
   - Other files are fetched by downloading the run zip and extracting the file (less efficient but works without additional endpoints)
 - Artifact writes are not yet supported via HTTP (would require a write endpoint in `worker_plan`).
 - Artifact writes are rejected while a run is active (strict policy per spec).
-- Task IDs use the PlanItem UUID (e.g., `5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1`).
+- Plan IDs use the PlanItem UUID (e.g., `5e2b2a7c-8b49-4d2f-9b8f-6a3c1f05b9a1`).
 - **Security**: Authentication is configurable. For production, set `PLANEXE_MCP_REQUIRE_AUTH=true` and use UserApiKey validation (optionally with `PLANEXE_MCP_API_KEY` as a shared secret).
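
Reviewer sketch (not part of the patch): the `plan_status` caller contract in the README hunk above (`pending` / `processing` keep polling, `completed` and `failed` are terminal) can be expressed as a small polling loop. This assumes the status response exposes a `state` key. The function name `wait_for_plan`, the `get_status` callable, and the `max_polls` limit are hypothetical, and the 300-second default follows the "about every 5 minutes" guidance in the server instructions.

```python
import time
from typing import Any, Callable

TERMINAL_STATES = {"completed", "failed"}
POLLING_STATES = {"pending", "processing"}


def wait_for_plan(
    plan_id: str,
    get_status: Callable[[str], dict[str, Any]],
    poll_seconds: float = 300.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the plan reaches a terminal state and return that state."""
    for _ in range(max_polls):
        state = get_status(plan_id).get("state")
        if state in TERMINAL_STATES:
            return state
        if state not in POLLING_STATES:
            # Anything outside the documented contract is surfaced, not ignored.
            raise ValueError(f"unexpected plan state: {state!r}")
        sleep(poll_seconds)
    raise TimeoutError(f"plan {plan_id} did not finish after {max_polls} polls")
```

If the returned state is `failed`, the contract says to call `plan_retry` with the same `plan_id`. Because the server enforces no per-client concurrency cap, a caller running several plans in parallel would run one such loop per `plan_id`.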
diff --git a/mcp_cloud/app.py b/mcp_cloud/app.py
index e41717f43..2ba5d0d2f 100644
--- a/mcp_cloud/app.py
+++ b/mcp_cloud/app.py
@@ -1,1837 +1,170 @@
 """
-PlanExe MCP Cloud
+PlanExe MCP Cloud – thin re-export facade.
 
-Implements the Model Context Protocol interface for PlanExe as specified in
- docs/mcp/planexe_mcp_interface.md. Communicates with worker_plan_database via the shared
-database_api models.
+All symbols previously importable from ``mcp_cloud.app`` are re-exported here
+so that existing callers (http_server.py, tests, etc.) continue to work.
+The actual implementations live in the focused modules under ``mcp_cloud/``.
 """
 import asyncio
-import contextvars
-import hashlib
-import hmac
-import io
-import json
-import logging
-import os
-import secrets
-import tempfile
-import time
-import uuid
-import zipfile
-from dataclasses import dataclass
-from datetime import UTC, datetime
-from pathlib import Path
-from typing import Any, Literal, Optional
-from urllib.parse import quote_plus
-from io import BytesIO
-import httpx
-from sqlalchemy import cast, text
-from sqlalchemy.dialects.postgresql import JSONB
-from mcp.server import Server
-from mcp.server.stdio import stdio_server
-from mcp.types import CallToolResult, Tool, TextContent, ToolAnnotations
-from pydantic import BaseModel
-from worker_plan_api.model_profile import (
-    ModelProfileEnum,
-    default_filename_for_profile,
-    normalize_model_profile,
-    resolve_model_profile_from_env,
-)
-from worker_plan_api.planexe_config import PlanExeConfig
-from worker_plan_api.llm_class_filter import (
-    ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES,
-    is_llm_class_allowed,
-    parse_llm_class_whitelist,
-)
 
-from mcp_cloud.dotenv_utils import load_planexe_dotenv
-_dotenv_loaded, _dotenv_paths = load_planexe_dotenv(Path(__file__).parent)
+from mcp.server.stdio import stdio_server
 
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+# -- db_setup: Flask app, DB, constants, request classes, MCP Server ----------
+from mcp_cloud.db_setup import (  # noqa: F401
+    app,
+    db,
+    build_postgres_uri_from_env,
+    ensure_planitem_stop_columns,
+    PLANEXE_SERVER_INSTRUCTIONS,
+    mcp_cloud_server as mcp_cloud,
+    BASE_DIR_RUN,
+    WORKER_PLAN_URL,
+    REPORT_FILENAME,
+    REPORT_CONTENT_TYPE,
+    ZIP_FILENAME,
+    ZIP_CONTENT_TYPE,
+    ZIP_SNAPSHOT_MAX_BYTES,
+    ModelProfileInput,
+    MODEL_PROFILE_TITLES,
+    MODEL_PROFILE_SUMMARIES,
+    PlanCreateRequest,
+    PlanStatusRequest,
+    PlanStopRequest,
+    PlanRetryRequest,
+    PlanFileInfoRequest,
+    PlanListRequest,
+    ModelProfilesRequest,
+    PlanItem,
+    PlanState,
+    EventItem,
+    EventType,
+    UserAccount,
+    UserApiKey,
+    logger,
 )
-logger = logging.getLogger(__name__)
-if not _dotenv_loaded:
-    logger.warning(
-        "No .env file found; searched: %s",
-        ", ".join(str(path) for path in _dotenv_paths),
-    )
 
-from database_api.planexe_db_singleton import db
-from database_api.model_planitem import PlanItem, PlanState
-from database_api.model_event import EventItem, EventType
-from database_api.model_user_account import UserAccount
-from database_api.model_user_api_key import UserApiKey
-from flask import Flask, has_app_context
-from mcp_cloud.tool_models import (
-    ModelProfilesInput,
-    ModelProfilesOutput,
-    PromptExamplesInput,
-    PromptExamplesOutput,
-    PlanCreateInput,
-    PlanCreateOutput,
-    PlanRetryInput,
-    PlanRetryOutput,
-    PlanStopOutput,
-    PlanStatusInput,
-    PlanStopInput,
-    PlanFileInfoInput,
-    PlanFileInfoNotReadyOutput,
-    PlanStatusSuccess,
-    PlanFileInfoReadyOutput,
-    PlanListInput,
-    PlanListOutput,
-    ErrorDetail,
-    # backward-compat aliases used by internal Request classes
-    TaskCreateInput,
-    TaskStatusInput,
-    TaskStopInput,
-    TaskRetryInput,
-    TaskFileInfoInput,
-    TaskListInput,
+# -- auth: API-key hashing and user resolution --------------------------------
+from mcp_cloud.auth import (  # noqa: F401
+    _hash_user_api_key,
+    _resolve_user_from_api_key,
 )
 
-app = Flask(__name__)
-app.config.from_pyfile('config.py')
-
-def build_postgres_uri_from_env(env: dict[str, str]) -> tuple[str, dict[str, str]]:
-    """Construct a SQLAlchemy URI for Postgres using environment variables."""
-    host = env.get("PLANEXE_POSTGRES_HOST") or "database_postgres"
-    port = str(env.get("PLANEXE_POSTGRES_PORT") or "5432")
-    dbname = env.get("PLANEXE_POSTGRES_DB") or "planexe"
-    user = env.get("PLANEXE_POSTGRES_USER") or "planexe"
-    password = env.get("PLANEXE_POSTGRES_PASSWORD") or "planexe"
-    uri = f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{dbname}"
-    safe_config = {"host": host, "port": port, "dbname": dbname, "user": user}
-    return uri, safe_config
-
-sqlalchemy_database_uri = os.environ.get("SQLALCHEMY_DATABASE_URI")
-if sqlalchemy_database_uri is None:
-    sqlalchemy_database_uri, db_settings = build_postgres_uri_from_env(os.environ)
-    logger.info(f"SQLALCHEMY_DATABASE_URI not set. Using Postgres defaults: {db_settings}")
-else:
-    logger.info("Using SQLALCHEMY_DATABASE_URI from environment.")
-
-app.config['SQLALCHEMY_DATABASE_URI'] = sqlalchemy_database_uri
-app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'pool_recycle': 280, 'pool_pre_ping': True}
-db.init_app(app)
-
-def ensure_planitem_stop_columns() -> None:
-    statements = (
-        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_jsonl TEXT",
-        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_bytes INTEGER",
-        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_activity_overview_json JSON",
-        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_artifact_layout_version INTEGER",
-        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested BOOLEAN",
-        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested_timestamp TIMESTAMP",
-    )
-    with db.engine.begin() as conn:
-        for statement in statements:
-            try:
-                conn.execute(text(statement))
-            except Exception as exc:
-                logger.warning("Schema update failed for %s: %s", statement, exc, exc_info=True)
-
-with app.app_context():
-    ensure_planitem_stop_columns()
-
-# Shown in MCP initialize (e.g. Inspector) so clients know what PlanExe does.
-PLANEXE_SERVER_INSTRUCTIONS = (
-    "PlanExe generates strategic project-plan drafts from a natural-language prompt. "
-    "Output is a self-contained interactive HTML report (~700KB) with 20+ sections including "
-    "executive summary, interactive Gantt charts, risk analysis, SWOT, governance, investor pitch, "
-    "team profiles, work breakdown, scenario comparison, expert criticism, and adversarial sections "
-    "(premortem, self-audit checklist, premise attacks) that stress-test whether the plan holds up. "
-    "The output is a draft to refine, not final ground truth — but it surfaces hard questions the prompter may not have considered. "
-    "Use PlanExe for substantial multi-phase projects with constraints, stakeholders, budgets, and timelines. "
-    "Do not use PlanExe for tiny one-shot outputs (for example: 'give me a 5-point checklist'); use a normal LLM response for that. "
-    "The planning pipeline is fixed end-to-end; callers cannot select individual internal pipeline steps to run. "
-    "Required interaction order: call prompt_examples first. "
-    "Optional before plan_create: call model_profiles to see profile guidance and available models in each profile. "
-    "Then perform a non-tool step: draft a strong prompt as flowing prose (not structured markdown with headers or bullets), "
-    "typically ~300-800 words, and get user approval. "
-    "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. "
-    "Write the prompt as flowing prose — weave specs, constraints, and targets naturally into sentences. "
-    "Only after approval, call plan_create. "
-    "Each plan_create call creates a new task_id; the server does not enforce a global per-client concurrency limit. "
-    "Then poll plan_status (about every 5 minutes); use plan_file_info when complete. "
-    "If a run fails, call plan_retry with the failed task_id to requeue it (optional model_profile, defaults to baseline). "
-    "To stop, call plan_stop with the task_id from plan_create; stopping is asynchronous and the task will eventually transition to failed. "
-    "If model_profiles returns MODEL_PROFILES_UNAVAILABLE, inform the user that no models are currently configured and the server administrator needs to set up model profiles. "
-    "Tool errors use {error:{code,message}}. plan_file_info returns {ready:false,reason:...} while the artifact is not yet ready; check readiness by testing whether download_url is present in the response. "
-    "plan_file_info download_url is the absolute URL where the requested artifact can be downloaded. "
-    "To list recent tasks for a user call plan_list with user_api_key; returns task_id, state, progress_percentage, created_at, and prompt_excerpt for each task. "
-    "plan_status state contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. "
-    "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the task was likely queued but not picked up by a worker (server issue). "
-    "If plan_status is in processing and output files do not change for longer than 20 minutes, the plan_create likely failed/stalled. "
-    "In both cases, report the issue to PlanExe developers on GitHub: https://github.com/PlanExeOrg/PlanExe/issues . "
-    "Main output: a self-contained interactive HTML report (~700KB) with collapsible sections and interactive Gantt charts — open in a browser. "
-    "The zip contains the intermediary pipeline files (md, json, csv) that fed the report."
+# -- db_queries: plan lookup and sync DB operations ----------------------------
+from mcp_cloud.db_queries import (  # noqa: F401
+    find_plan_by_task_id,
+    get_plan_by_id,
+    resolve_plan_for_task_id,
+    _create_plan_sync,
+    _get_plan_status_snapshot_sync,
+    _request_plan_stop_sync,
+    _retry_failed_plan_sync,
+    _get_plan_for_report_sync,
+    _list_plans_sync,
+    get_plan_state_mapping,
+    _extract_plan_create_metadata_overrides,
+    _merge_plan_create_config,
 )
 
-mcp_cloud = Server("planexe-mcp-cloud", instructions=PLANEXE_SERVER_INSTRUCTIONS)
-
-# Base directory for run artifacts (not used directly, fetched via worker_plan HTTP API)
-BASE_DIR_RUN = Path(os.environ.get("PLANEXE_RUN_DIR", Path(__file__).parent.parent / "run")).resolve()
-
-WORKER_PLAN_URL = os.environ.get("PLANEXE_WORKER_PLAN_URL", "http://worker_plan:8000")
-
-REPORT_FILENAME = "030-report.html"
-REPORT_CONTENT_TYPE = "text/html; charset=utf-8"
-ZIP_FILENAME = "run.zip"
-ZIP_CONTENT_TYPE = "application/zip"
-ZIP_SNAPSHOT_MAX_BYTES = 100_000_000
-
-ModelProfileInput = Literal[
-    "baseline",
-    "premium",
-    "frontier",
-    "custom",
-]
-MODEL_PROFILE_TITLES = {
-    ModelProfileEnum.BASELINE.value: "Baseline",
-    ModelProfileEnum.PREMIUM.value: "Premium",
-    ModelProfileEnum.FRONTIER.value: "Frontier",
-    ModelProfileEnum.CUSTOM.value: "Custom",
-}
-MODEL_PROFILE_SUMMARIES = {
-    ModelProfileEnum.BASELINE.value: "Cheap and fast; recommended default when creating a plan.",
-    ModelProfileEnum.PREMIUM.value: "Higher-cost profile tuned for stronger output quality.",
-    ModelProfileEnum.FRONTIER.value: "Most capable models first; usually slowest/most expensive.",
-    ModelProfileEnum.CUSTOM.value: "User-managed profile file for custom model ordering.",
-}
-
-class TaskCreateRequest(BaseModel):
-    prompt: str
-    model_profile: Optional[ModelProfileInput] = None
-    user_api_key: Optional[str] = None
-
-class TaskStatusRequest(BaseModel):
-    task_id: str
-
-class TaskStopRequest(BaseModel):
-    task_id: str
-
-class TaskRetryRequest(BaseModel):
-    task_id: str
-    model_profile: ModelProfileInput = "baseline"
-
-class TaskFileInfoRequest(BaseModel):
-    task_id: str
-    artifact: Optional[str] = None
-
-class TaskListRequest(BaseModel):
-    user_api_key: str
-    limit: int = 10
-
-class ModelProfilesRequest(BaseModel):
-    """No input parameters."""
-    pass
-
-# Helper functions
-def find_plan_by_task_id(task_id: str) -> Optional[PlanItem]:
-    """Find PlanItem by MCP task_id (UUID), with legacy fallback."""
-    task = get_task_by_id(task_id)
-    if task is not None:
-        return task
-
-    def _query_legacy() -> Optional[PlanItem]:
-        query = db.session.query(PlanItem)
-        if db.engine.dialect.name == "postgresql":
-            tasks = query.filter(
-                cast(PlanItem.parameters, JSONB).contains({"_mcp_task_id": task_id})
-            ).all()
-        else:
-            tasks = query.filter(
-                PlanItem.parameters.contains({"_mcp_task_id": task_id})
-            ).all()
-        if tasks:
-            return tasks[0]
-        return None
-
-    if has_app_context():
-        legacy_task = _query_legacy()
-    else:
-        with app.app_context():
-            legacy_task = _query_legacy()
-    if legacy_task is not None:
-        logger.debug("Resolved legacy MCP task id %s to task %s", task_id, legacy_task.id)
-    return legacy_task
-
-def get_task_by_id(task_id: str) -> Optional[PlanItem]:
-    """Fetch a PlanItem by its UUID string."""
-    def _query() -> Optional[PlanItem]:
-        try:
-            task_uuid = uuid.UUID(task_id)
-        except ValueError:
-            return None
-        return db.session.get(PlanItem, task_uuid)
-
-    if has_app_context():
-        return _query()
-    with app.app_context():
-        return _query()
-
-def resolve_task_for_task_id(task_id: str) -> Optional[PlanItem]:
-    """Resolve a PlanItem from a task_id (UUID), with legacy fallback."""
-    return find_plan_by_task_id(task_id)
-
-def _hash_user_api_key(raw_key: str) -> str:
-    secret = os.environ.get("PLANEXE_API_KEY_SECRET", "dev-api-key-secret")
-    if secret == "dev-api-key-secret":
-        logger.warning("PLANEXE_API_KEY_SECRET not set. Using dev secret for API key hashing.")
-    return hashlib.sha256(f"{secret}:{raw_key}".encode("utf-8")).hexdigest()
-
-def _resolve_user_from_api_key(raw_key: str) -> Optional[dict[str, Any]]:
-    if not raw_key:
-        return None
-    key_hash = _hash_user_api_key(raw_key)
-    with app.app_context():
-        api_key = UserApiKey.query.filter_by(key_hash=key_hash, revoked_at=None).first()
-        if not api_key:
-            return None
-        user = db.session.get(UserAccount, api_key.user_id)
-        if not user:
-            return None
-
-        user_context = {
-            "user_id": str(user.id),
-            "credits_balance": float(user.credits_balance or 0),
-        }
-        api_key.last_used_at = datetime.now(UTC)
-        db.session.commit()
-        return user_context
-
-def _create_task_sync(
-    prompt: str,
-    config: Optional[dict[str, Any]],
-    metadata: Optional[dict[str, Any]],
-) -> dict[str, Any]:
-    with app.app_context():
-        parameters = dict(config or {})
-        parameters["model_profile"] = normalize_model_profile(parameters.get("model_profile")).value
-        parameters["trigger_source"] = "mcp plan_create"
-
-        task = PlanItem(
-            prompt=prompt,
-            state=PlanState.pending,
-            user_id=metadata.get("user_id", "admin") if metadata else "admin",
-            parameters=parameters,
-        )
-        db.session.add(task)
-        db.session.commit()
-
-        task_id = str(task.id)
-        event_context = {
-            "task_id": task_id,
-            "task_handle": task_id,
-            "prompt": task.prompt,
-            "user_id": task.user_id,
-            "config": config,
-            "metadata": metadata,
-            "parameters": task.parameters,
-        }
-        event = EventItem(
-            event_type=EventType.TASK_PENDING,
-            message="Enqueued task via MCP",
-            context=event_context,
-        )
-        db.session.add(event)
-        db.session.commit()
-
-        created_at = task.timestamp_created
-        if created_at and created_at.tzinfo is None:
-            created_at = created_at.replace(tzinfo=UTC)
-        return {
-            "task_id": task_id,
-            "created_at": created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z"),
-        }
-
-def _get_task_status_snapshot_sync(task_id: str) -> Optional[dict[str, Any]]:
-    with app.app_context():
-        task = find_plan_by_task_id(task_id)
-        if task is None:
-            return None
-        return {
-            "id": str(task.id),
-            "state": task.state,
-            "stop_requested": bool(task.stop_requested),
-            "progress_percentage": task.progress_percentage,
-            "timestamp_created": task.timestamp_created,
-        }
-
-def _request_task_stop_sync(task_id: str) -> Optional[dict[str, Any]]:
-    with app.app_context():
-        task = find_plan_by_task_id(task_id)
-        if task is None:
-            return None
-        stop_requested = False
-        if task.state in (PlanState.pending, PlanState.processing):
-            task.stop_requested = True
-            task.stop_requested_timestamp = datetime.now(UTC)
-            task.progress_message = "Stop requested by user."
-            db.session.commit()
-            logger.info("Stop requested for task %s; stop flag set on task %s.", task_id, task.id)
-            stop_requested = True
-        return {
-            "state": get_task_state_mapping(task.state),
-            "stop_requested": stop_requested,
-        }
-
-
-def _retry_failed_task_sync(task_id: str, model_profile: str) -> Optional[dict[str, Any]]:
-    with app.app_context():
-        task = find_plan_by_task_id(task_id)
-        if task is None:
-            return None
-        if task.state != PlanState.failed:
-            return {
-                "error": {
-                    "code": "TASK_NOT_FAILED",
-                    "message": f"Task is not in failed state: {task_id}",
-                }
-            }
-
-        normalized_profile = normalize_model_profile(model_profile).value
-        now_utc = datetime.now(UTC)
-        parameters = dict(task.parameters) if isinstance(task.parameters, dict) else {}
-        parameters["model_profile"] = normalized_profile
-        parameters["trigger_source"] = "mcp plan_retry"
-
-        # Reset task state and clear prior run artifacts before requeueing.
-        task.state = PlanState.pending
-        task.timestamp_created = now_utc
-        task.progress_percentage = 0.0
-        task.progress_message = "Retry requested via MCP."
-        task.stop_requested = False
-        task.stop_requested_timestamp = None
-        task.generated_report_html = None
-        task.run_zip_snapshot = None
-        task.run_track_activity_jsonl = None
-        task.run_track_activity_bytes = None
-        task.run_activity_overview_json = None
-        task.run_artifact_layout_version = None
-        task.parameters = parameters
-        db.session.commit()
-
-        event_context = {
-            "task_id": str(task.id),
-            "task_handle": str(task.id),
-            "retry_of_task_id": task_id,
-            "model_profile": normalized_profile,
-            "parameters": task.parameters,
-        }
-        event = EventItem(
-            event_type=EventType.TASK_PENDING,
-            message="Retried failed task via MCP",
-            context=event_context,
-        )
-        db.session.add(event)
-        db.session.commit()
-
-        return {
-            "task_id": str(task.id),
-            "state": get_task_state_mapping(task.state),
-            "model_profile": normalized_profile,
-            "retried_at": now_utc.replace(microsecond=0).isoformat().replace("+00:00", "Z"),
-        }
-
-
-def _get_task_for_report_sync(task_id: str) -> Optional[dict[str, Any]]:
-    with app.app_context():
-        task = resolve_task_for_task_id(task_id)
-        if task is None:
-            return None
-        return {
-            "id": str(task.id),
-            "state": task.state,
-            "progress_message": task.progress_message,
-        }
-
-def _list_tasks_sync(user_id: str, limit: int) -> list[dict[str, Any]]:
-    with app.app_context():
-        tasks = (
-            db.session.query(PlanItem)
-            .filter_by(user_id=user_id)
-            .order_by(PlanItem.timestamp_created.desc())
-            .limit(max(1, min(limit, 50)))
-            .all()
-        )
-        results = []
-        for task in tasks:
-            created_at = task.timestamp_created
-            if created_at and created_at.tzinfo is None:
-                created_at = created_at.replace(tzinfo=UTC)
-            results.append({
-                "task_id": str(task.id),
-                "state": get_task_state_mapping(task.state),
-                "progress_percentage": float(task.progress_percentage or 0.0),
-                "created_at": (
-                    created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z")
-                    if created_at else None
-                ),
-                "prompt_excerpt": (task.prompt or "")[:100],
-            })
-        return results
-
-
-def list_files_from_zip_bytes(zip_bytes: bytes) -> list[str]:
-    """List file entries from an in-memory zip archive."""
-    try:
-        with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file:
-            files = [name for name in zip_file.namelist() if not name.endswith("/")]
-            return sorted(files)
-    except Exception as exc:
-        logger.warning("Unable to list files from zip snapshot: %s", exc)
-        return []
-
-def extract_file_from_zip_bytes(zip_bytes: bytes, file_path: str) -> Optional[bytes]:
-    """Extract a file from an in-memory zip archive."""
-    try:
-        with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file:
-            file_path_normalized = file_path.lstrip('/')
-            try:
-                return zip_file.read(file_path_normalized)
-            except KeyError:
-                return None
-    except Exception as exc:
-        logger.warning("Unable to read %s from zip snapshot: %s", file_path, exc)
-        return None
-
-def extract_file_from_zip_file(file_handle: io.BufferedIOBase, file_path: str) -> Optional[bytes]:
-    """Extract a file from a seekable zip file handle."""
-    try:
-        with zipfile.ZipFile(file_handle, 'r') as zip_file:
-            file_path_normalized = file_path.lstrip('/')
-            try:
-                return zip_file.read(file_path_normalized)
-            except KeyError:
-                return None
-    except Exception as exc:
-        logger.warning("Unable to read %s from zip stream: %s", file_path, exc)
-        return None
-
-def fetch_report_from_db(task_id: str) -> Optional[bytes]:
-    """Fetch the report HTML stored in the PlanItem."""
-    task = get_task_by_id(task_id)
-    if task and task.generated_report_html is not None:
-        return task.generated_report_html.encode("utf-8")
-    return None
-
-def fetch_zip_snapshot(task_id: str) -> Optional[bytes]:
-    """Fetch the zip snapshot stored in the PlanItem."""
-    task = get_task_by_id(task_id)
-    if task and task.run_zip_snapshot is not None:
-        return task.run_zip_snapshot
-    return None
-
-def fetch_file_from_zip_snapshot(task_id: str, file_path: str) -> Optional[bytes]:
-    """Fetch a file from the PlanItem zip snapshot."""
-    task = get_task_by_id(task_id)
-    if task and task.run_zip_snapshot is not None:
-        return extract_file_from_zip_bytes(task.run_zip_snapshot, file_path)
-    return None
-
-def list_files_from_zip_snapshot(task_id: str) -> Optional[list[str]]:
-    """List files from the PlanItem zip snapshot."""
-    task = get_task_by_id(task_id)
-    if task and task.run_zip_snapshot is not None:
-        return list_files_from_zip_bytes(task.run_zip_snapshot)
-    return None
-
-async def fetch_artifact_from_worker_plan(run_id: str, file_path: str) -> Optional[bytes]:
-    """Fetch an artifact file from worker_plan via HTTP."""
-    try:
-        async with httpx.AsyncClient(timeout=60.0) as client:
-            # For report.html, use the dedicated report endpoint (most efficient)
-            if (
-                file_path == "report.html"
-                or file_path.endswith("/report.html")
-                or file_path == REPORT_FILENAME
-                or file_path.endswith(f"/{REPORT_FILENAME}")
-            ):
-                report_response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/report")
-                if report_response.status_code == 200:
-                    return report_response.content
-                logger.warning(f"Worker plan returned {report_response.status_code} for report: {run_id}")
-                report_from_db = await asyncio.to_thread(fetch_report_from_db, run_id)
-                if report_from_db is not None:
-                    return report_from_db
-                report_from_zip = await asyncio.to_thread(
-                    fetch_file_from_zip_snapshot, run_id, REPORT_FILENAME
-                )
-                if report_from_zip is not None:
-                    return report_from_zip
-                return None
-            
-            # For other files, fetch the zip and extract the file
-            # This is less efficient but works without a file serving endpoint
-            async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as zip_response:
-                if zip_response.status_code != 200:
-                    logger.warning(f"Worker plan returned {zip_response.status_code} for zip: {run_id}")
-                else:
-                    zip_too_large = False
-                    content_length = zip_response.headers.get("content-length")
-                    if content_length:
-                        try:
-                            if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES:
-                                logger.warning(
-                                    "Zip snapshot too large (%s bytes) for run %s; skipping.",
-                                    content_length,
-                                    run_id,
-                                )
-                                zip_too_large = True
-                        except ValueError:
-                            logger.warning(
-                                "Invalid Content-Length for zip snapshot: %s", content_length
-                            )
-                    if not zip_too_large:
-                        with tempfile.TemporaryFile() as tmp_file:
-                            size = 0
-                            async for chunk in zip_response.aiter_bytes():
-                                size += len(chunk)
-                                if size > ZIP_SNAPSHOT_MAX_BYTES:
-                                    logger.warning(
-                                        "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.",
-                                        ZIP_SNAPSHOT_MAX_BYTES,
-                                        run_id,
-                                    )
-                                    zip_too_large = True
-                                    break
-                                tmp_file.write(chunk)
-                            if not zip_too_large:
-                                tmp_file.seek(0)
-                                file_data = extract_file_from_zip_file(tmp_file, file_path)
-                                if file_data is not None:
-                                    return file_data
-
-            snapshot_file = await asyncio.to_thread(fetch_file_from_zip_snapshot, run_id, file_path)
-            if snapshot_file is not None:
-                return snapshot_file
-            return None
-            
-    except Exception as e:
-        logger.error(f"Error fetching artifact from worker_plan: {e}", exc_info=True)
-        return None
-
-async def fetch_file_list_from_worker_plan(run_id: str) -> Optional[list[str]]:
-    """Fetch the list of files from worker_plan via HTTP."""
-    try:
-        async with httpx.AsyncClient(timeout=30.0) as client:
-            response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/files")
-            if response.status_code == 200:
-                data = response.json()
-                files = data.get("files", [])
-                if files:
-                    return files
-                fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id)
-                if fallback_files:
-                    return fallback_files
-                return files
-            logger.warning(f"Worker plan returned {response.status_code} for files list: {run_id}")
-            fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id)
-            if fallback_files is not None:
-                return fallback_files
-            return None
-    except Exception as e:
-        logger.error(f"Error fetching file list from worker_plan: {e}", exc_info=True)
-        return None
-
-
-def list_files_from_local_run_dir(run_id: str) -> Optional[list[str]]:
-    """
-    List files from local run directory when this service shares PLANEXE_RUN_DIR
-    with the worker (e.g., Docker compose).
-    """
-    run_dir = (BASE_DIR_RUN / run_id).resolve()
-    try:
-        if not run_dir.is_relative_to(BASE_DIR_RUN):
-            return None
-    except ValueError:
-        return None
-    if not run_dir.exists() or not run_dir.is_dir():
-        return None
-    try:
-        return sorted([path.name for path in run_dir.iterdir() if path.is_file()])
-    except Exception as exc:
-        logger.warning("Unable to list local run dir files for %s: %s", run_id, exc)
-        return None
-
-async def fetch_zip_from_worker_plan(run_id: str) -> Optional[bytes]:
-    """Fetch the zip snapshot from worker_plan via HTTP."""
-    try:
-        async with httpx.AsyncClient(timeout=60.0) as client:
-            async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as response:
-                if response.status_code != 200:
-                    logger.warning("Worker plan returned %s for zip: %s", response.status_code, run_id)
-                else:
-                    zip_too_large = False
-                    content_length = response.headers.get("content-length")
-                    if content_length:
-                        try:
-                            if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES:
-                                logger.warning(
-                                    "Zip snapshot too large (%s bytes) for run %s; skipping.",
-                                    content_length,
-                                    run_id,
-                                )
-                                zip_too_large = True
-                        except ValueError:
-                            logger.warning(
-                                "Invalid Content-Length for zip snapshot: %s", content_length
-                            )
-                    if not zip_too_large:
-                        buffer = BytesIO()
-                        size = 0
-                        async for chunk in response.aiter_bytes():
-                            size += len(chunk)
-                            if size > ZIP_SNAPSHOT_MAX_BYTES:
-                                logger.warning(
-                                    "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.",
-                                    ZIP_SNAPSHOT_MAX_BYTES,
-                                    run_id,
-                                )
-                                zip_too_large = True
-                                break
-                            buffer.write(chunk)
-                        if not zip_too_large:
-                            return buffer.getvalue()
-
-            snapshot_bytes = await asyncio.to_thread(fetch_zip_snapshot, run_id)
-            if snapshot_bytes is not None:
-                return snapshot_bytes
-            return None
-    except Exception as e:
-        logger.error(f"Error fetching zip from worker_plan: {e}", exc_info=True)
-        return None
-
-
-def _sanitize_legacy_zip_snapshot(zip_bytes: bytes) -> Optional[bytes]:
-    """Remove internal track_activity.jsonl files from legacy zip snapshots."""
-    try:
-        with zipfile.ZipFile(BytesIO(zip_bytes), "r") as in_zip:
-            entries = [name for name in in_zip.namelist() if not name.endswith("/")]
-            if not any(name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl" for name in entries):
-                return zip_bytes
-            out_buffer = BytesIO()
-            with zipfile.ZipFile(out_buffer, "w", compression=zipfile.ZIP_DEFLATED) as out_zip:
-                for name in entries:
-                    if name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl":
-                        continue
-                    out_zip.writestr(name, in_zip.read(name))
-            return out_buffer.getvalue()
-    except Exception as exc:
-        logger.warning("Unable to sanitize legacy run zip snapshot: %s", exc)
-        return None
-
-
-async def fetch_user_downloadable_zip(task_id: str) -> Optional[bytes]:
-    """
-    Fetch a user-downloadable zip for a task.
-    New layout snapshots are served directly from PlanItem.run_zip_snapshot.
-    Legacy/task-dir fallbacks are sanitized to remove track_activity.jsonl.
-    """
-    task = await asyncio.to_thread(get_task_by_id, task_id)
-    if task is None:
-        return None
-
-    snapshot_bytes = task.run_zip_snapshot if task.run_zip_snapshot is not None else None
-    layout_version = task.run_artifact_layout_version or 0
-    if snapshot_bytes is not None:
-        if layout_version >= 2:
-            return snapshot_bytes
-        return _sanitize_legacy_zip_snapshot(snapshot_bytes)
-
-    worker_plan_zip = await fetch_zip_from_worker_plan(str(task.id))
-    if worker_plan_zip is None:
-        return None
-    return _sanitize_legacy_zip_snapshot(worker_plan_zip)
-
-def compute_sha256(content: str | bytes) -> str:
-    """Compute SHA256 hash of content."""
-    if isinstance(content, str):
-        content = content.encode('utf-8')
-    return hashlib.sha256(content).hexdigest()
-
-def get_task_state_mapping(task_state: PlanState) -> str:
-    """Map PlanState to MCP task state."""
-    mapping = {
-        PlanState.pending: "pending",
-        PlanState.processing: "processing",
-        PlanState.completed: "completed",
-        PlanState.failed: "failed",
-    }
-    return mapping.get(task_state, "pending")
-
-def _extract_task_create_metadata_overrides(arguments: dict[str, Any]) -> dict[str, Any]:
-    """Extract plan_create runtime overrides from hidden metadata containers.
-
-    Supported hidden containers:
-    - arguments.tool_metadata
-    - arguments.metadata
-    - arguments._meta
-
-    If a container includes nested namespaces, these are checked first:
-    - plan_create
-    - task_create (legacy alias)
-    - planexe_task_create (legacy alias)
-    - planexe
-    """
-    merged: dict[str, Any] = {}
-    metadata_candidates: list[dict[str, Any]] = []
-
-    for key in ("tool_metadata", "metadata", "_meta"):
-        candidate = arguments.get(key)
-        if isinstance(candidate, dict):
-            metadata_candidates.append(candidate)
-
-    for candidate in metadata_candidates:
-        merged.update(candidate)
-        for nested_key in ("plan_create", "task_create", "planexe_task_create", "planexe"):
-            nested = candidate.get(nested_key)
-            if isinstance(nested, dict):
-                merged.update(nested)
-
-    return merged
-
-def _merge_task_create_config(
-    config: Optional[dict[str, Any]],
-    model_profile: Optional[str],
-) -> Optional[dict[str, Any]]:
-    merged = dict(config or {})
-    if isinstance(model_profile, str):
-        candidate_profile = model_profile.strip()
-        if candidate_profile and "model_profile" not in merged:
-            merged["model_profile"] = candidate_profile
-    return merged or None
-
-
-def _sort_llm_config_entries(items: list[tuple[str, Any]]) -> list[tuple[str, Any]]:
-    def sort_key(item: tuple[str, Any]) -> tuple[int, str]:
-        key, model_data = item
-        priority = None
-        if isinstance(model_data, dict):
-            maybe_priority = model_data.get("priority")
-            if isinstance(maybe_priority, int):
-                priority = maybe_priority
-        if priority is None:
-            priority = 999999
-        return priority, key
-
-    return sorted(items, key=sort_key)
-
-
-def _extract_model_profile_entries(
-    model_map: dict[str, Any],
-    whitelist: Optional[set[str]],
-) -> list[dict[str, Any]]:
-    models: list[dict[str, Any]] = []
-
-    for model_key, model_data in _sort_llm_config_entries(list(model_map.items())):
-        class_name = model_data.get("class") if isinstance(model_data, dict) else None
-        if not is_llm_class_allowed(class_name, whitelist):
-            continue
-
-        model_name = None
-        priority = None
-        if isinstance(model_data, dict):
-            arguments = model_data.get("arguments")
-            if isinstance(arguments, dict):
-                maybe_model = arguments.get("model")
-                if isinstance(maybe_model, str):
-                    model_name = maybe_model
-            maybe_priority = model_data.get("priority")
-            if isinstance(maybe_priority, int):
-                priority = maybe_priority
-            elif isinstance(model_data.get("prio"), int):
-                priority = model_data["prio"]
-
-        models.append(
-            {
-                "key": model_key,
-                "provider_class": class_name if isinstance(class_name, str) else None,
-                "model": model_name,
-                "priority": priority,
-            }
-        )
-
-    return models
-
-
-def _profile_models_payload(
-    profile: ModelProfileEnum,
-    whitelist: Optional[set[str]],
-) -> dict[str, Any]:
-    config_filename = default_filename_for_profile(profile)
-    planexe_config_path = PlanExeConfig.resolve_planexe_config_path()
-    config_path = PlanExeConfig.find_file_in_search_order(config_filename, planexe_config_path)
-    if config_path is None:
-        return {
-            "profile": profile.value,
-            "title": MODEL_PROFILE_TITLES[profile.value],
-            "summary": MODEL_PROFILE_SUMMARIES[profile.value],
-            "model_count": 0,
-            "models": [],
-        }
-
-    try:
-        with config_path.open("r", encoding="utf-8") as fh:
-            model_map = json.load(fh)
-    except Exception as exc:
-        logger.warning(
-            "Unable to read profile config %s for model profile %s: %s",
-            config_filename,
-            profile.value,
-            exc,
-        )
-        return {
-            "profile": profile.value,
-            "title": MODEL_PROFILE_TITLES[profile.value],
-            "summary": MODEL_PROFILE_SUMMARIES[profile.value],
-            "model_count": 0,
-            "models": [],
-        }
-
-    if not isinstance(model_map, dict):
-        return {
-            "profile": profile.value,
-            "title": MODEL_PROFILE_TITLES[profile.value],
-            "summary": MODEL_PROFILE_SUMMARIES[profile.value],
-            "model_count": 0,
-            "models": [],
-        }
-
-    models = _extract_model_profile_entries(model_map, whitelist)
-    return {
-        "profile": profile.value,
-        "title": MODEL_PROFILE_TITLES[profile.value],
-        "summary": MODEL_PROFILE_SUMMARIES[profile.value],
-        "model_count": len(models),
-        "models": models,
-    }
-
-
-def _get_model_profiles_sync() -> dict[str, Any]:
-    raw_whitelist = os.environ.get(ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES)
-    whitelist = parse_llm_class_whitelist(raw_whitelist)
-    default_profile = resolve_model_profile_from_env().value
-    profiles_all = [
-        _profile_models_payload(profile, whitelist)
-        for profile in ModelProfileEnum
-    ]
-    profiles = [profile for profile in profiles_all if int(profile.get("model_count") or 0) > 0]
-
-    return {
-        "default_profile": default_profile,
-        "profiles": profiles,
-        "message": (
-            "Use one of these profile values in plan_create.model_profile. "
-            "Model lists show what is currently available in each profile."
-        ),
-    }
-
-# Context var set by HTTP server so download URLs use the request's host when
-# PLANEXE_MCP_PUBLIC_BASE_URL is not set (avoids localhost for remote clients).
-_download_base_url_ctx: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
-    "download_base_url", default=None
+# -- zip_utils: zip extraction, sanitization, hashing -------------------------
+from mcp_cloud.zip_utils import (  # noqa: F401
+    list_files_from_zip_bytes,
+    extract_file_from_zip_bytes,
+    extract_file_from_zip_file,
+    fetch_report_from_db,
+    fetch_zip_snapshot,
+    fetch_file_from_zip_snapshot,
+    list_files_from_zip_snapshot,
+    _sanitize_legacy_zip_snapshot,
+    compute_sha256,
 )
 
+# -- worker_fetchers: HTTP fetchers for worker_plan artifacts ------------------
+from mcp_cloud.worker_fetchers import (  # noqa: F401
+    fetch_artifact_from_worker_plan,
+    fetch_file_list_from_worker_plan,
+    list_files_from_local_run_dir,
+    fetch_zip_from_worker_plan,
+    fetch_user_downloadable_zip,
+)
 
-def set_download_base_url(base_url: Optional[str]) -> None:
-    """Set the base URL used for download links for this request (e.g. from HTTP Request).
-    Cleared automatically when the request ends. Used when PLANEXE_MCP_PUBLIC_BASE_URL is unset."""
-    if base_url is not None:
-        _download_base_url_ctx.set(base_url.rstrip("/"))
-    else:
-        try:
-            _download_base_url_ctx.set("")
-        except LookupError:
-            pass
-
-
-def clear_download_base_url() -> None:
-    """Clear the request-scoped base URL (call when request ends)."""
-    try:
-        _download_base_url_ctx.set("")
-    except LookupError:
-        pass
-
-
-def _get_download_base_url() -> Optional[str]:
-    """Return base URL for download links: env var, then request context, then None."""
-    base_url = os.environ.get("PLANEXE_MCP_PUBLIC_BASE_URL")
-    if base_url:
-        return base_url.rstrip("/")
-    try:
-        ctx_url = _download_base_url_ctx.get()
-        return ctx_url if ctx_url else None
-    except LookupError:
-        return None
-
-
-def build_report_download_path(task_id: str) -> str:
-    return f"/download/{task_id}/{REPORT_FILENAME}"
-
-
-def build_zip_download_path(task_id: str) -> str:
-    return f"/download/{task_id}/{ZIP_FILENAME}"
-
-
-# ---------------------------------------------------------------------------
-# Signed, expiring download tokens
-# ---------------------------------------------------------------------------
-
-# Default TTL for signed download tokens (seconds). Configurable via env var.
-DOWNLOAD_TOKEN_TTL_SECONDS = int(os.environ.get("PLANEXE_DOWNLOAD_TOKEN_TTL", "900"))  # 15 min
-
-# Per-process fallback secret when no env var is set.  Tokens won't survive a
-# server restart, but that is acceptable for the fallback case.
-_random_token_secret: Optional[bytes] = None
-
-
-def _get_download_token_secret() -> bytes:
-    """Return the HMAC-SHA256 secret used to sign download tokens.
-
-    Priority: PLANEXE_DOWNLOAD_TOKEN_SECRET → PLANEXE_API_KEY_SECRET →
-    per-process random (with a warning logged once).
-    """
-    global _random_token_secret
-    for env_var in ("PLANEXE_DOWNLOAD_TOKEN_SECRET", "PLANEXE_API_KEY_SECRET"):
-        value = os.environ.get(env_var)
-        if value:
-            return value.encode()
-    if _random_token_secret is None:
-        _random_token_secret = secrets.token_bytes(32)
-        logger.warning(
-            "PLANEXE_DOWNLOAD_TOKEN_SECRET is not set; using a random per-process secret. "
-            "Download tokens will be invalidated on server restart. "
-            "Set PLANEXE_DOWNLOAD_TOKEN_SECRET to a stable value."
-        )
-    return _random_token_secret
-
-
-def generate_download_token(task_id: str, filename: str) -> str:
-    """Return a signed, time-limited token for one task artifact download.
-
-    Format: ``{expiry_unix_ts}.{hmac_hex}``
-    The HMAC covers ``task_id:filename:expiry`` so the token is scoped to
-    exactly one file and cannot be reused for a different task.
-    """
-    expiry = int(time.time()) + DOWNLOAD_TOKEN_TTL_SECONDS
-    message = f"{task_id}:{filename}:{expiry}".encode()
-    mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest()
-    return f"{expiry}.{mac}"
-
-
-def validate_download_token(token: str, task_id: str, filename: str) -> bool:
-    """Return True when *token* is a valid, unexpired token for the given artifact."""
-    try:
-        expiry_str, mac = token.split(".", 1)
-        expiry = int(expiry_str)
-    except (ValueError, AttributeError):
-        return False
-    if time.time() > expiry:
-        return False
-    message = f"{task_id}:{filename}:{expiry}".encode()
-    expected_mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest()
-    return hmac.compare_digest(mac, expected_mac)
-
-
-def build_report_download_url(task_id: str) -> Optional[str]:
-    base_url = _get_download_base_url()
-    if not base_url:
-        return None
-    token = generate_download_token(task_id, REPORT_FILENAME)
-    return f"{base_url}{build_report_download_path(task_id)}?token={token}"
-
-
-def build_zip_download_url(task_id: str) -> Optional[str]:
-    base_url = _get_download_base_url()
-    if not base_url:
-        return None
-    token = generate_download_token(task_id, ZIP_FILENAME)
-    return f"{base_url}{build_zip_download_path(task_id)}?token={token}"
-
-
-def _load_mcp_example_prompts() -> list[str]:
-    """Load prompts from the catalog that are marked as MCP examples (mcp_example or mcp-example-prompt true).
-
-    Uses worker_plan_api.PromptCatalog the same way as frontend_single_user and frontend_multi_user
-    (no env var). Tries repo-root import first, then adds worker_plan to sys.path so worker_plan_api
-    is top-level (same as frontends). Falls back to built-in examples if the catalog is unavailable.
-    """
-    catalog = None
-    try:
-        from worker_plan.worker_plan_api.prompt_catalog import PromptCatalog
-
-        catalog = PromptCatalog()
-        catalog.load_simple_plan_prompts()
-    except Exception:
-        try:
-            # Same as frontends when worker_plan exists; when not (e.g. Docker), repo_root has worker_plan_api
-            import sys
-
-            repo_root = Path(__file__).resolve().parent.parent
-            worker_plan_dir = repo_root / "worker_plan"
-            path_to_add = str(worker_plan_dir if worker_plan_dir.exists() else repo_root)
-            if path_to_add not in sys.path:
-                sys.path.insert(0, path_to_add)
-            from worker_plan_api.prompt_catalog import PromptCatalog
-
-            catalog = PromptCatalog()
-            catalog.load_simple_plan_prompts()
-        except Exception as e:
-            logger.warning(
-                "Prompt catalog unavailable (%s); using built-in examples.",
-                e,
-            )
-            return _builtin_mcp_example_prompts()
-
-    if catalog is None:
-        return _builtin_mcp_example_prompts()
-
-    samples: list[str] = []
-    for item in catalog.all():
-        if item.extras.get("mcp_example") is True or item.extras.get("mcp-example-prompt") is True:
-            samples.append(item.prompt)
-    if not samples:
-        return _builtin_mcp_example_prompts()
-    return samples
-
-
-def _builtin_mcp_example_prompts() -> list[str]:
-    """Fallback example prompts when the catalog file is missing or has no mcp_example entries."""
-    return [
-        (
-            "Vegan Butcher Shop. That sells artificial meat (Plant-Based). Location Kødbyen, Copenhagen. "
-            "Sell sandwiches and sausages. Provocative marketing. Budget: 10 million DKK. Grand Opening in month 3. "
-            "Profitability Goal: month 12. Create a signature item that is a social media hit. "
-            "Pick a realistic scenario. I already have negotiated a 2 year lease inside Kødbyen. "
-            "Banned words: blockchain, VR, AR, AI, Robots."
-        ),
-        (
-            "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. "
-            "Budget 2.5M DKK. Open within 12 months. Include equipment, staffing, permits, and marketing. "
-            "Pick a realistic scenario; avoid overly ambitious timelines."
-        ),
-    ]
-
-
-PLAN_CREATE_INPUT_SCHEMA = PlanCreateInput.model_json_schema()
-PLAN_CREATE_OUTPUT_SCHEMA = PlanCreateOutput.model_json_schema()
-PLAN_STATUS_SUCCESS_SCHEMA = PlanStatusSuccess.model_json_schema()
-PLAN_STATUS_OUTPUT_SCHEMA = {
-    "oneOf": [
-        {
-            "type": "object",
-            "properties": {"error": ErrorDetail.model_json_schema()},
-            "required": ["error"],
-        },
-        PLAN_STATUS_SUCCESS_SCHEMA,
-    ]
-}
-PLAN_STOP_OUTPUT_SCHEMA = PlanStopOutput.model_json_schema()
-PLAN_RETRY_OUTPUT_SCHEMA = PlanRetryOutput.model_json_schema()
-PLAN_FILE_INFO_READY_OUTPUT_SCHEMA = PlanFileInfoReadyOutput.model_json_schema()
-PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA = PlanFileInfoNotReadyOutput.model_json_schema()
-PLAN_FILE_INFO_OUTPUT_SCHEMA = {
-    "oneOf": [
-        {
-            "type": "object",
-            "properties": {"error": ErrorDetail.model_json_schema()},
-            "required": ["error"],
-        },
-        PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA,
-        PLAN_FILE_INFO_READY_OUTPUT_SCHEMA,
-    ]
-}
-PLAN_STATUS_INPUT_SCHEMA = PlanStatusInput.model_json_schema()
-PLAN_STOP_INPUT_SCHEMA = PlanStopInput.model_json_schema()
-PLAN_RETRY_INPUT_SCHEMA = PlanRetryInput.model_json_schema()
-PLAN_FILE_INFO_INPUT_SCHEMA = PlanFileInfoInput.model_json_schema()
-
-PROMPT_EXAMPLES_INPUT_SCHEMA = PromptExamplesInput.model_json_schema()
-PROMPT_EXAMPLES_OUTPUT_SCHEMA = PromptExamplesOutput.model_json_schema()
-MODEL_PROFILES_INPUT_SCHEMA = ModelProfilesInput.model_json_schema()
-MODEL_PROFILES_OUTPUT_SCHEMA = ModelProfilesOutput.model_json_schema()
-PLAN_LIST_INPUT_SCHEMA = PlanListInput.model_json_schema()
-PLAN_LIST_OUTPUT_SCHEMA = PlanListOutput.model_json_schema()
-
-# Backward-compatible aliases for tests that reference old TASK_* names
-TASK_CREATE_INPUT_SCHEMA = PLAN_CREATE_INPUT_SCHEMA
-TASK_CREATE_OUTPUT_SCHEMA = PLAN_CREATE_OUTPUT_SCHEMA
-TASK_STATUS_INPUT_SCHEMA = PLAN_STATUS_INPUT_SCHEMA
-TASK_STATUS_OUTPUT_SCHEMA = PLAN_STATUS_OUTPUT_SCHEMA
-TASK_STOP_INPUT_SCHEMA = PLAN_STOP_INPUT_SCHEMA
-TASK_STOP_OUTPUT_SCHEMA = PLAN_STOP_OUTPUT_SCHEMA
-TASK_RETRY_INPUT_SCHEMA = PLAN_RETRY_INPUT_SCHEMA
-TASK_RETRY_OUTPUT_SCHEMA = PLAN_RETRY_OUTPUT_SCHEMA
-TASK_FILE_INFO_INPUT_SCHEMA = PLAN_FILE_INFO_INPUT_SCHEMA
-TASK_FILE_INFO_OUTPUT_SCHEMA = PLAN_FILE_INFO_OUTPUT_SCHEMA
-TASK_LIST_INPUT_SCHEMA = PLAN_LIST_INPUT_SCHEMA
-TASK_LIST_OUTPUT_SCHEMA = PLAN_LIST_OUTPUT_SCHEMA
-
-@dataclass(frozen=True)
-class ToolDefinition:
-    name: str
-    description: str
-    input_schema: dict[str, Any]
-    output_schema: Optional[dict[str, Any]] = None
-    annotations: Optional[dict[str, Any]] = None
-
-TOOL_DEFINITIONS = [
-    ToolDefinition(
-        name="prompt_examples",
-        description=(
-            "Call this first. Returns example prompts that define what a good prompt looks like. "
-            "Do NOT call plan_create yet. Optional before plan_create: call model_profiles to choose model_profile. "
-            "Next is a non-tool step: formulate a detailed prompt (typically ~300-800 words; use examples as a baseline, similar structure) and get user approval. "
-            "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. "
-            "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. "
-            "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. "
-            "The examples demonstrate this prose style — match their tone and density. "
-            "Then call plan_create. "
-            "PlanExe is not for tiny one-shot outputs like a 5-point checklist; and it does not support selecting only some internal pipeline steps."
-        ),
-        input_schema=PROMPT_EXAMPLES_INPUT_SCHEMA,
-        output_schema=PROMPT_EXAMPLES_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": True,
-            "destructiveHint": False,
-            "idempotentHint": True,
-            "openWorldHint": False,
-        },
-    ),
-    ToolDefinition(
-        name="model_profiles",
-        description=(
-            "Optional helper before plan_create. Returns model_profile options with plain-language guidance "
-            "and currently available models in each profile. "
-            "If no models are available, returns error code MODEL_PROFILES_UNAVAILABLE."
-        ),
-        input_schema=MODEL_PROFILES_INPUT_SCHEMA,
-        output_schema=MODEL_PROFILES_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": True,
-            "destructiveHint": False,
-            "idempotentHint": True,
-            "openWorldHint": False,
-        },
-    ),
-    ToolDefinition(
-        name="plan_create",
-        description=(
-            "Call only after prompt_examples and after you have completed prompt drafting/approval (non-tool step). "
-            "PlanExe turns the approved prompt into a strategic project-plan draft (20+ sections) in ~10-20 min. "
-            "Sections include: executive summary, interactive Gantt charts, investor pitch, project plan with SMART criteria, "
-            "strategic decision analysis, scenario comparison, assumptions with expert review, governance structure, "
-            "SWOT analysis, team role profiles, simulated expert criticism, work breakdown structure, "
-            "plan review (critical issues, KPIs, financial strategy, automation opportunities), Q&A, "
-            "premortem with failure scenarios, self-audit checklist, and adversarial premise attacks that argue against the project. "
-            "The adversarial sections (premortem, self-audit, premise attacks) surface risks and questions the prompter may not have considered. "
-            "Returns task_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_file_info. "
-            "If you lose a task_id, call plan_list with your user_api_key to recover it. "
-            "Each plan_create call creates a new task_id (no server-side dedup). "
-            "If you are unsure which model_profile to choose, call model_profiles first. "
-            "If your deployment uses credits, include user_api_key to charge the correct account. "
-            "Common error codes: INVALID_USER_API_KEY, USER_API_KEY_REQUIRED, INSUFFICIENT_CREDITS."
-        ),
-        input_schema=PLAN_CREATE_INPUT_SCHEMA,
-        output_schema=PLAN_CREATE_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": False,
-            "destructiveHint": False,
-            "idempotentHint": False,
-            "openWorldHint": True,
-        },
-    ),
-    ToolDefinition(
-        name="plan_status",
-        description=(
-            "Returns status and progress of the plan currently being created. "
-            "Poll at reasonable intervals only (e.g. every 5 minutes): plan generation typically takes 10-20 minutes "
-            "(baseline profile) and may take longer on higher-quality profiles. "
-            "State contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. "
-            "progress_percentage is 0-100 (integer-like float); 100 when completed. "
-            "files lists intermediate outputs produced so far; use their updated_at timestamps to detect stalls. "
-            "Unknown task_id returns error code TASK_NOT_FOUND. "
-            "Troubleshooting: pending for >5 minutes likely means queued but not picked up by a worker. "
-            "processing with no file-output changes for >20 minutes likely means failed/stalled. "
-            "Report these issues to https://github.com/PlanExeOrg/PlanExe/issues ."
-        ),
-        input_schema=PLAN_STATUS_INPUT_SCHEMA,
-        output_schema=PLAN_STATUS_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": True,
-            "destructiveHint": False,
-            "idempotentHint": True,
-            "openWorldHint": False,
-        },
-    ),
-    ToolDefinition(
-        name="plan_stop",
-        description=(
-            "Request the plan generation to stop. Pass the task_id (the UUID returned by plan_create). "
-            "Stopping is asynchronous: the stop flag is set immediately but the task may continue briefly before halting. "
-            "A stopped task will eventually transition to the failed state. "
-            "If the task is already completed or failed, stop_requested returns false (the task already finished). "
-            "Unknown task_id returns error code TASK_NOT_FOUND."
-        ),
-        input_schema=PLAN_STOP_INPUT_SCHEMA,
-        output_schema=PLAN_STOP_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": False,
-            "destructiveHint": True,
-            "idempotentHint": True,
-            "openWorldHint": False,
-        },
-    ),
-    ToolDefinition(
-        name="plan_retry",
-        description=(
-            "Retry a task that is currently in failed state. "
-            "Pass the failed task_id and optionally model_profile (defaults to baseline). "
-            "The task is reset to pending, prior artifacts are cleared, and the same task_id is requeued for processing. "
-            "Returns TASK_NOT_FOUND when task_id is unknown and TASK_NOT_FAILED when the task is not in failed state."
-        ),
-        input_schema=PLAN_RETRY_INPUT_SCHEMA,
-        output_schema=PLAN_RETRY_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": False,
-            "destructiveHint": False,
-            "idempotentHint": False,
-            "openWorldHint": True,
-        },
-    ),
-    ToolDefinition(
-        name="plan_file_info",
-        description=(
-            "Returns file metadata (content_type, download_url, download_size) for the report or zip artifact. "
-            "Use artifact='report' (default) for the interactive HTML report (~700KB, self-contained with embedded JS "
-            "for collapsible sections and interactive Gantt charts — open in a browser). "
-            "Use artifact='zip' for the full pipeline output bundle (md, json, csv intermediary files that fed the report). "
-            "While the task is still pending or processing, returns {ready:false,reason:\"processing\"}. "
-            "Check readiness by testing whether download_url is present in the response. "
-            "Once ready, present download_url to the user or fetch and save the file locally. "
-            "If your client exposes plan_download (e.g. mcp_local), prefer that to save the file locally. "
-            "Terminal error codes: generation_failed (plan failed), content_unavailable (artifact missing). "
-            "Unknown task_id returns error code TASK_NOT_FOUND."
-        ),
-        input_schema=PLAN_FILE_INFO_INPUT_SCHEMA,
-        output_schema=PLAN_FILE_INFO_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": True,
-            "destructiveHint": False,
-            "idempotentHint": True,
-            "openWorldHint": False,
-        },
-    ),
-    ToolDefinition(
-        name="plan_list",
-        description=(
-            "List the most recent tasks for an authenticated user. "
-            "Requires user_api_key (pex_...). "
-            "Returns up to `limit` tasks (default 10, max 50) newest-first, each with task_id, state, "
-            "progress_percentage, created_at (ISO 8601), and a prompt_excerpt (first 100 chars). "
-            "Use this to recover a lost task_id or to review recent activity."
-        ),
-        input_schema=PLAN_LIST_INPUT_SCHEMA,
-        output_schema=PLAN_LIST_OUTPUT_SCHEMA,
-        annotations={
-            "readOnlyHint": True,
-            "destructiveHint": False,
-            "idempotentHint": True,
-            "openWorldHint": False,
-        },
-    ),
-]
-
-@mcp_cloud.list_tools()
-async def handle_list_tools() -> list[Tool]:
-    """List all available MCP tools."""
-    return [
-        Tool(
-            name=definition.name,
-            description=definition.description,
-            outputSchema=definition.output_schema,
-            inputSchema=definition.input_schema,
-            annotations=ToolAnnotations(**definition.annotations) if definition.annotations else None,
-        )
-        for definition in TOOL_DEFINITIONS
-    ]
-
-@mcp_cloud.call_tool()
-async def handle_call_tool(name: str, arguments: dict[str, Any]) -> CallToolResult:
-    """Dispatch MCP tool calls and return structured JSON errors for unknown tools."""
-    try:
-        handler = TOOL_HANDLERS.get(name)
-        if handler is None:
-            response = {"error": {"code": "INVALID_TOOL", "message": f"Unknown tool: {name}"}}
-            return CallToolResult(
-                content=[TextContent(type="text", text=json.dumps(response))],
-                structuredContent=response,
-                isError=True,
-            )
-        return await handler(arguments)
-    except Exception as e:
-        logger.error(f"Error handling tool {name}: {e}", exc_info=True)
-        response = {"error": {"code": "INTERNAL_ERROR", "message": str(e)}}
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult:
-    """Create a new PlanExe task and enqueue it for processing.
-
-    Examples:
-        - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. Open within 12 months."} → returns task_id (UUID) + created_at
-
-    Args:
-        - prompt: What the plan should cover (goal, context, constraints).
-        - model_profile: Optional profile ("baseline" | "premium" | "frontier" | "custom"). Call model_profiles to inspect options.
-
-    Returns:
-        - content: JSON string matching structuredContent.
-        - structuredContent: {"task_id": "<uuid>", "created_at": ...}
-        - isError: False on success.
-    """
-    req = TaskCreateRequest(**arguments)
-    metadata_overrides = _extract_task_create_metadata_overrides(arguments)
-    metadata_model_profile = metadata_overrides.get("model_profile")
-    model_profile = req.model_profile
-    if model_profile is None and isinstance(metadata_model_profile, str):
-        model_profile = metadata_model_profile
-
-    merged_config = _merge_task_create_config(None, model_profile)
-    require_user_key = os.environ.get("PLANEXE_MCP_REQUIRE_USER_KEY", "false").lower() in ("1", "true", "yes", "on")
-    user_context = None
-    if req.user_api_key:
-        user_context = _resolve_user_from_api_key(req.user_api_key.strip())
-        if not user_context:
-            response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}}
-            return CallToolResult(
-                content=[TextContent(type="text", text=json.dumps(response))],
-                structuredContent=response,
-                isError=True,
-            )
-    elif require_user_key:
-        response = {"error": {"code": "USER_API_KEY_REQUIRED", "message": "user_api_key is required for plan_create."}}
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-    if user_context and float(user_context.get("credits_balance", 0.0)) <= 0.0:
-        response = {"error": {"code": "INSUFFICIENT_CREDITS", "message": "Not enough credits."}}
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-    response = await asyncio.to_thread(
-        _create_task_sync,
-        req.prompt,
-        merged_config,
-        {"user_id": str(user_context["user_id"])} if user_context else None,
-    )
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(response))],
-        structuredContent=response,
-        isError=False,
-    )
-
-
-async def handle_prompt_examples(arguments: dict[str, Any]) -> CallToolResult:
-    """Return curated prompts from the catalog (mcp_example true) so LLMs can see example detail."""
-    samples = _load_mcp_example_prompts()
-    payload = {
-        "samples": samples,
-        "message": (
-            "Next: complete the non-tool step by drafting a detailed prompt (typically ~300-800 words) using these as a baseline (similar structure), then get user approval. "
-            "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. "
-            "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. "
-            "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. "
-            "The examples demonstrate this prose style — match their tone and density. "
-            "Only after approval, call plan_create. "
-            "Do not use PlanExe for tiny one-shot requests (e.g., rewrite this email, summarize this document). "
-            "PlanExe always runs the full fixed planning pipeline; callers cannot run only selected internal steps."
-        ),
-    }
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(payload))],
-        structuredContent=payload,
-        isError=False,
-    )
-
-
-async def handle_model_profiles(arguments: dict[str, Any]) -> CallToolResult:
-    """Return model profile options and currently available models in each profile."""
-    _ = ModelProfilesRequest(**(arguments or {}))
-    payload = await asyncio.to_thread(_get_model_profiles_sync)
-    profiles = payload.get("profiles")
-    if not isinstance(profiles, list) or len(profiles) == 0:
-        response = {
-            "error": {
-                "code": "MODEL_PROFILES_UNAVAILABLE",
-                "message": (
-                    "No models are currently configured. "
-                    "Inform the user that the server administrator needs to set up model profiles before plans can be created."
-                ),
-            }
-        }
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(payload))],
-        structuredContent=payload,
-        isError=False,
-    )
-
-
-async def handle_plan_status(arguments: dict[str, Any]) -> CallToolResult:
-    """Fetch the current task status, progress, and recent files for a task.
-
-    Examples:
-        - {"task_id": "uuid"} → state/progress/timing + recent files
-
-    Args:
-        - task_id: Task UUID returned by plan_create.
-
-    Returns:
-        - content: JSON string matching structuredContent.
-        - structuredContent: status payload or error.
-        - isError: True only when task_id is unknown.
-    """
-    req = TaskStatusRequest(**arguments)
-    task_id = req.task_id
-
-    task_snapshot = await asyncio.to_thread(_get_task_status_snapshot_sync, task_id)
-    if task_snapshot is None:
-        response = {
-            "error": {
-                "code": "TASK_NOT_FOUND",
-                "message": f"Task not found: {task_id}",
-            }
-        }
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-    progress_percentage = float(task_snapshot.get("progress_percentage") or 0.0)
-
-    task_state = task_snapshot["state"]
-    state = get_task_state_mapping(task_state)
-    if task_state == PlanState.completed:
-        progress_percentage = 100.0
-
-    # Collect files from worker_plan
-    task_uuid = task_snapshot["id"]
-    files = []
-    if task_uuid:
-        files_list = await fetch_file_list_from_worker_plan(task_uuid)
-        if not files_list:
-            files_list = await asyncio.to_thread(list_files_from_zip_snapshot, task_uuid)
-        if not files_list:
-            files_list = await asyncio.to_thread(list_files_from_local_run_dir, task_uuid)
-        if files_list:
-            for file_name in files_list[:10]:  # Limit to 10 most recent
-                if file_name != "log.txt":
-                    updated_at = datetime.now(UTC).replace(microsecond=0)
-                    files.append({
-                        "path": file_name,
-                        "updated_at": updated_at.isoformat().replace("+00:00", "Z"),  # Approximate
-                    })
-
-    created_at = task_snapshot["timestamp_created"]
-    if created_at and created_at.tzinfo is None:
-        created_at = created_at.replace(tzinfo=UTC)
-
-    response = {
-        "task_id": task_uuid,
-        "state": state,
-        "progress_percentage": progress_percentage,
-        "timing": {
-            "started_at": (
-                created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z")
-                if created_at
-                else None
-            ),
-            "elapsed_sec": (datetime.now(UTC) - created_at).total_seconds() if created_at else 0,
-        },
-        "files": files[:10],  # Limit to 10 most recent
-    }
-
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(response))],
-        structuredContent=response,
-        isError=False,
-    )
-
-async def handle_plan_stop(arguments: dict[str, Any]) -> CallToolResult:
-    """Request an active task to stop.
-
-    Examples:
-        - {"task_id": "uuid"} → stop request accepted
-
-    Args:
-        - task_id: Task UUID returned by plan_create.
-
-    Returns:
-        - content: JSON string matching structuredContent.
-        - structuredContent: {"state": "pending|processing|completed|failed", "stop_requested": bool} or error payload.
-        - isError: True only when task_id is unknown.
-    """
-    req = TaskStopRequest(**arguments)
-    task_id = req.task_id
-
-    stop_result = await asyncio.to_thread(_request_task_stop_sync, task_id)
-    if stop_result is None:
-        response = {
-            "error": {
-                "code": "TASK_NOT_FOUND",
-                "message": f"Task not found: {task_id}",
-            }
-        }
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-    response = stop_result
-
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(response))],
-        structuredContent=response,
-        isError=False,
-    )
-
-
-async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult:
-    """Retry a failed task by resetting it back to pending."""
-    req = TaskRetryRequest(**arguments)
-    task_id = req.task_id
-    retry_result = await asyncio.to_thread(_retry_failed_task_sync, task_id, req.model_profile)
-
-    if retry_result is None:
-        response = {
-            "error": {
-                "code": "TASK_NOT_FOUND",
-                "message": f"Task not found: {task_id}",
-            }
-        }
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-    if isinstance(retry_result.get("error"), dict):
-        response = retry_result
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-    response = retry_result
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(response))],
-        structuredContent=response,
-        isError=False,
-    )
-
-
-async def handle_plan_file_info(arguments: dict[str, Any]) -> CallToolResult:
-    """Return download metadata for a task's report or zip artifact.
-
-    Examples:
-        - {"task_id": "uuid"} → report metadata (default)
-        - {"task_id": "uuid", "artifact": "zip"} → zip metadata
-
-    Args:
-        - task_id: Task UUID returned by plan_create.
-        - artifact: Optional "report" or "zip".
-
-    Returns:
-        - content: JSON string matching structuredContent.
-        - structuredContent: metadata (content_type, sha256, download_size,
-          optional download_url) or {} if not ready, or error payload.
-        - isError: True only when task_id is unknown.
-    """
-    req = TaskFileInfoRequest(**arguments)
-    task_id = req.task_id
-    artifact = req.artifact.strip().lower() if isinstance(req.artifact, str) else "report"
-    if artifact not in ("report", "zip"):
-        artifact = "report"
-    task_snapshot = await asyncio.to_thread(_get_task_for_report_sync, task_id)
-    if task_snapshot is None:
-        response = {
-            "error": {
-                "code": "TASK_NOT_FOUND",
-                "message": f"Task not found: {task_id}",
-            }
-        }
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-
-    run_id = task_snapshot["id"]
-    if artifact == "zip":
-        content_bytes = await fetch_user_downloadable_zip(run_id)
-        if content_bytes is None:
-            task_state = task_snapshot["state"]
-            if task_state in (PlanState.pending, PlanState.processing) or task_state is None:
-                response = {"ready": False, "reason": "processing"}
-            else:
-                response = {
-                    "error": {
-                        "code": "content_unavailable",
-                        "message": "zip content_bytes is None",
-                    },
-                }
-            return CallToolResult(
-                content=[TextContent(type="text", text=json.dumps(response))],
-                structuredContent=response,
-                isError=False,
-            )
-
-        total_size = len(content_bytes)
-        content_hash = compute_sha256(content_bytes)
-        response = {
-            "content_type": ZIP_CONTENT_TYPE,
-            "sha256": content_hash,
-            "download_size": total_size,
-        }
-        download_url = build_zip_download_url(run_id)
-        if download_url:
-            response["download_url"] = download_url
-
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=False,
-        )
-
-    task_state = task_snapshot["state"]
-    if task_state in (PlanState.pending, PlanState.processing) or task_state is None:
-        response = {"ready": False, "reason": "processing"}
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=False,
-        )
-    if task_state == PlanState.failed:
-        message = task_snapshot["progress_message"] or "Plan generation failed."
-        response = {"ready": False, "reason": "failed", "error": {"code": "generation_failed", "message": message}}
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=False,
-        )
-
-    content_bytes = await fetch_artifact_from_worker_plan(run_id, REPORT_FILENAME)
-    if content_bytes is None:
-        response = {
-            "error": {
-                "code": "content_unavailable",
-                "message": "content_bytes is None",
-            },
-        }
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=False,
-        )
-
-    total_size = len(content_bytes)
-    content_hash = compute_sha256(content_bytes)
-    response = {
-        "content_type": REPORT_CONTENT_TYPE,
-        "sha256": content_hash,
-        "download_size": total_size,
-    }
-    download_url = build_report_download_url(run_id)
-    if download_url:
-        response["download_url"] = download_url
+# -- model_profiles: model profile introspection ------------------------------
+from mcp_cloud.model_profiles import (  # noqa: F401
+    _sort_llm_config_entries,
+    _extract_model_profile_entries,
+    _profile_models_payload,
+    _get_model_profiles_sync,
+)
 
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(response))],
-        structuredContent=response,
-        isError=False,
-    )
+# -- download_tokens: signed download tokens and URL builders ------------------
+from mcp_cloud.download_tokens import (  # noqa: F401
+    _download_base_url_ctx,
+    set_download_base_url,
+    clear_download_base_url,
+    _get_download_base_url,
+    _get_download_token_secret,
+    generate_download_token,
+    validate_download_token,
+    build_report_download_url,
+    build_zip_download_url,
+    build_report_download_path,
+    build_zip_download_path,
+)
 
-async def handle_plan_list(arguments: dict[str, Any]) -> CallToolResult:
-    """Return recent tasks for an authenticated user."""
-    try:
-        req = TaskListRequest(**arguments)
-    except Exception as exc:
-        response = {"error": {"code": "INVALID_ARGUMENTS", "message": str(exc)}}
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-    user_context = _resolve_user_from_api_key(req.user_api_key.strip())
-    if not user_context:
-        response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}}
-        return CallToolResult(
-            content=[TextContent(type="text", text=json.dumps(response))],
-            structuredContent=response,
-            isError=True,
-        )
-    limit = max(1, min(req.limit, 50))
-    tasks = await asyncio.to_thread(_list_tasks_sync, str(user_context["user_id"]), limit)
-    response = {
-        "tasks": tasks,
-        "message": f"Returned {len(tasks)} task(s).",
-    }
-    return CallToolResult(
-        content=[TextContent(type="text", text=json.dumps(response))],
-        structuredContent=response,
-        isError=False,
-    )
+# -- prompt_examples: example prompt loading -----------------------------------
+from mcp_cloud.prompt_examples import (  # noqa: F401
+    _load_mcp_example_prompts,
+    _builtin_mcp_example_prompts,
+)
 
+# -- schemas: tool schema constants and ToolDefinition -------------------------
+from mcp_cloud.schemas import (  # noqa: F401
+    PLAN_CREATE_INPUT_SCHEMA,
+    PLAN_CREATE_OUTPUT_SCHEMA,
+    PLAN_STATUS_SUCCESS_SCHEMA,
+    PLAN_STATUS_OUTPUT_SCHEMA,
+    PLAN_STOP_OUTPUT_SCHEMA,
+    PLAN_RETRY_OUTPUT_SCHEMA,
+    PLAN_FILE_INFO_READY_OUTPUT_SCHEMA,
+    PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA,
+    PLAN_FILE_INFO_OUTPUT_SCHEMA,
+    PLAN_STATUS_INPUT_SCHEMA,
+    PLAN_STOP_INPUT_SCHEMA,
+    PLAN_RETRY_INPUT_SCHEMA,
+    PLAN_FILE_INFO_INPUT_SCHEMA,
+    PROMPT_EXAMPLES_INPUT_SCHEMA,
+    PROMPT_EXAMPLES_OUTPUT_SCHEMA,
+    MODEL_PROFILES_INPUT_SCHEMA,
+    MODEL_PROFILES_OUTPUT_SCHEMA,
+    PLAN_LIST_INPUT_SCHEMA,
+    PLAN_LIST_OUTPUT_SCHEMA,
+    ToolDefinition,
+    TOOL_DEFINITIONS,
+)
 
-TOOL_HANDLERS = {
-    "plan_create": handle_plan_create,
-    "plan_status": handle_plan_status,
-    "plan_stop": handle_plan_stop,
-    "plan_retry": handle_plan_retry,
-    "plan_file_info": handle_plan_file_info,
-    "plan_list": handle_plan_list,
-    "prompt_examples": handle_prompt_examples,
-    "model_profiles": handle_model_profiles,
-}
+# -- handlers: MCP tool handlers and dispatch ----------------------------------
+from mcp_cloud.handlers import (  # noqa: F401
+    handle_list_tools,
+    handle_call_tool,
+    handle_plan_create,
+    handle_prompt_examples,
+    handle_model_profiles,
+    handle_plan_status,
+    handle_plan_stop,
+    handle_plan_retry,
+    handle_plan_file_info,
+    handle_plan_list,
+    TOOL_HANDLERS,
+)
 
-# Backward-compatible aliases so existing imports of handle_task_* still work
-handle_task_create = handle_plan_create
-handle_task_status = handle_plan_status
-handle_task_stop = handle_plan_stop
-handle_task_retry = handle_plan_retry
-handle_task_file_info = handle_plan_file_info
-handle_task_list = handle_plan_list
 
 async def main():
     """Main entry point for MCP server."""
     logger.info("Starting PlanExe MCP Cloud...")
-    
+
     with app.app_context():
         db.create_all()
         logger.info("Database initialized")
-    
+
     async with stdio_server() as streams:
         await mcp_cloud.run(
             streams[0],
diff --git a/mcp_cloud/auth.py b/mcp_cloud/auth.py
new file mode 100644
index 000000000..c0140ad2b
--- /dev/null
+++ b/mcp_cloud/auth.py
@@ -0,0 +1,50 @@
+"""PlanExe MCP Cloud – API-key hashing and user resolution."""
+import hashlib
+import logging
+import os
+from datetime import UTC, datetime
+from typing import Any, Optional
+
+from mcp_cloud.db_setup import app, db, UserApiKey, UserAccount
+
+logger = logging.getLogger(__name__)
+
+
+def validate_api_key_secret() -> None:
+    """Raise if PLANEXE_API_KEY_SECRET is not set.
+
+    Call at startup when authentication is required so the server
+    fails hard instead of silently falling back to a dev secret.
+    """
+    if not os.environ.get("PLANEXE_API_KEY_SECRET"):
+        raise RuntimeError(
+            "PLANEXE_API_KEY_SECRET is not set. "
+            "Set this environment variable or disable auth with PLANEXE_MCP_REQUIRE_AUTH=false."
+        )
+
+
+def _hash_user_api_key(raw_key: str) -> str:
+    secret = os.environ.get("PLANEXE_API_KEY_SECRET", "dev-api-key-secret")
+    if secret == "dev-api-key-secret":
+        logger.warning("PLANEXE_API_KEY_SECRET not set. Using dev secret for API key hashing.")
+    return hashlib.sha256(f"{secret}:{raw_key}".encode("utf-8")).hexdigest()
+
+def _resolve_user_from_api_key(raw_key: str) -> Optional[dict[str, Any]]:
+    if not raw_key:
+        return None
+    key_hash = _hash_user_api_key(raw_key)
+    with app.app_context():
+        api_key = UserApiKey.query.filter_by(key_hash=key_hash, revoked_at=None).first()
+        if not api_key:
+            return None
+        user = db.session.get(UserAccount, api_key.user_id)
+        if not user:
+            return None
+
+        user_context = {
+            "user_id": str(user.id),
+            "credits_balance": float(user.credits_balance or 0),
+        }
+        api_key.last_used_at = datetime.now(UTC)
+        db.session.commit()
+        return user_context
diff --git a/mcp_cloud/db_queries.py b/mcp_cloud/db_queries.py
new file mode 100644
index 000000000..d68680a20
--- /dev/null
+++ b/mcp_cloud/db_queries.py
@@ -0,0 +1,304 @@
+"""PlanExe MCP Cloud – database query helpers."""
+import logging
+import uuid
+from datetime import UTC, datetime
+from typing import Any, Optional
+
+from flask import has_app_context
+from sqlalchemy import cast
+from sqlalchemy.dialects.postgresql import JSONB
+from worker_plan_api.model_profile import normalize_model_profile
+
+from mcp_cloud.db_setup import app, db, PlanItem, PlanState, EventItem, EventType
+
+logger = logging.getLogger(__name__)
+
+PROMPT_EXCERPT_MAX_LENGTH = 100
+
+
+# ---------------------------------------------------------------------------
+# Plan lookup
+# ---------------------------------------------------------------------------
+
+def find_plan_by_task_id(task_id: str) -> Optional[PlanItem]:
+    """Find PlanItem by MCP task_id (UUID), with legacy fallback."""
+    plan = get_plan_by_id(task_id)
+    if plan is not None:
+        return plan
+
+    def _query_legacy() -> Optional[PlanItem]:
+        query = db.session.query(PlanItem)
+        if db.engine.dialect.name == "postgresql":
+            plans = query.filter(
+                cast(PlanItem.parameters, JSONB).contains({"_mcp_task_id": task_id})
+            ).all()
+        else:
+            plans = query.filter(
+                PlanItem.parameters.contains({"_mcp_task_id": task_id})
+            ).all()
+        if plans:
+            return plans[0]
+        return None
+
+    if has_app_context():
+        legacy_plan = _query_legacy()
+    else:
+        with app.app_context():
+            legacy_plan = _query_legacy()
+    if legacy_plan is not None:
+        logger.debug("Resolved legacy MCP task id %s to plan %s", task_id, legacy_plan.id)
+    return legacy_plan
+
+def get_plan_by_id(task_id: str) -> Optional[PlanItem]:
+    """Fetch a PlanItem by its UUID string."""
+    def _query() -> Optional[PlanItem]:
+        try:
+            plan_uuid = uuid.UUID(task_id)
+        except ValueError:
+            return None
+        return db.session.get(PlanItem, plan_uuid)
+
+    if has_app_context():
+        return _query()
+    with app.app_context():
+        return _query()
+
+def resolve_plan_for_task_id(task_id: str) -> Optional[PlanItem]:
+    """Resolve a PlanItem from a task_id (UUID), with legacy fallback."""
+    return find_plan_by_task_id(task_id)
+
+
+# ---------------------------------------------------------------------------
+# Sync operations called from handlers via asyncio.to_thread
+# ---------------------------------------------------------------------------
+
+def _create_plan_sync(
+    prompt: str,
+    config: Optional[dict[str, Any]],
+    metadata: Optional[dict[str, Any]],
+) -> dict[str, Any]:
+    with app.app_context():
+        parameters = dict(config or {})
+        parameters["model_profile"] = normalize_model_profile(parameters.get("model_profile")).value
+        parameters["trigger_source"] = "mcp plan_create"
+
+        plan = PlanItem(
+            prompt=prompt,
+            state=PlanState.pending,
+            user_id=metadata.get("user_id", "admin") if metadata else "admin",
+            parameters=parameters,
+        )
+        db.session.add(plan)
+        db.session.commit()
+
+        plan_id = str(plan.id)
+        event_context = {
+            "plan_id": plan_id,
+            "task_handle": plan_id,
+            "prompt": plan.prompt,
+            "user_id": plan.user_id,
+            "config": config,
+            "metadata": metadata,
+            "parameters": plan.parameters,
+        }
+        event = EventItem(
+            event_type=EventType.TASK_PENDING,
+            message="Enqueued task via MCP",
+            context=event_context,
+        )
+        db.session.add(event)
+        db.session.commit()
+
+        created_at = plan.timestamp_created
+        if created_at and created_at.tzinfo is None:
+            created_at = created_at.replace(tzinfo=UTC)
+        return {
+            "plan_id": plan_id,
+            "created_at": created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z"),
+        }
+
+def _get_plan_status_snapshot_sync(task_id: str) -> Optional[dict[str, Any]]:
+    with app.app_context():
+        plan = find_plan_by_task_id(task_id)
+        if plan is None:
+            return None
+        return {
+            "id": str(plan.id),
+            "state": plan.state,
+            "stop_requested": bool(plan.stop_requested),
+            "progress_percentage": plan.progress_percentage,
+            "timestamp_created": plan.timestamp_created,
+        }
+
+def _request_plan_stop_sync(task_id: str) -> Optional[dict[str, Any]]:
+    with app.app_context():
+        plan = find_plan_by_task_id(task_id)
+        if plan is None:
+            return None
+        stop_requested = False
+        if plan.state in (PlanState.pending, PlanState.processing):
+            plan.stop_requested = True
+            plan.stop_requested_timestamp = datetime.now(UTC)
+            plan.progress_message = "Stop requested by user."
+            db.session.commit()
+            logger.info("Stop requested for task %s; stop flag set on plan %s.", task_id, plan.id)
+            stop_requested = True
+        return {
+            "state": get_plan_state_mapping(plan.state),
+            "stop_requested": stop_requested,
+        }
+
+
+def _retry_failed_plan_sync(task_id: str, model_profile: Optional[str]) -> Optional[dict[str, Any]]:
+    with app.app_context():
+        plan = find_plan_by_task_id(task_id)
+        if plan is None:
+            return None
+        if plan.state != PlanState.failed:
+            return {
+                "error": {
+                    "code": "PLAN_NOT_FAILED",
+                    "message": f"Plan is not in failed state: {task_id}",
+                }
+            }
+
+        normalized_profile = normalize_model_profile(model_profile).value
+        now_utc = datetime.now(UTC)
+        parameters = dict(plan.parameters) if isinstance(plan.parameters, dict) else {}
+        parameters["model_profile"] = normalized_profile
+        parameters["trigger_source"] = "mcp plan_retry"
+
+        # Reset plan state and clear prior run artifacts before requeueing.
+        plan.state = PlanState.pending
+        plan.timestamp_created = now_utc
+        plan.progress_percentage = 0.0
+        plan.progress_message = "Retry requested via MCP."
+        plan.stop_requested = False
+        plan.stop_requested_timestamp = None
+        plan.generated_report_html = None
+        plan.run_zip_snapshot = None
+        plan.run_track_activity_jsonl = None
+        plan.run_track_activity_bytes = None
+        plan.run_activity_overview_json = None
+        plan.run_artifact_layout_version = None
+        plan.parameters = parameters
+        db.session.commit()
+
+        event_context = {
+            "plan_id": str(plan.id),
+            "task_handle": str(plan.id),
+            "retry_of_plan_id": task_id,
+            "model_profile": normalized_profile,
+            "parameters": plan.parameters,
+        }
+        event = EventItem(
+            event_type=EventType.TASK_PENDING,
+            message="Retried failed plan via MCP",
+            context=event_context,
+        )
+        db.session.add(event)
+        db.session.commit()
+
+        return {
+            "plan_id": str(plan.id),
+            "state": get_plan_state_mapping(plan.state),
+            "model_profile": normalized_profile,
+            "retried_at": now_utc.replace(microsecond=0).isoformat().replace("+00:00", "Z"),
+        }
+
+
+def _get_plan_for_report_sync(task_id: str) -> Optional[dict[str, Any]]:
+    with app.app_context():
+        plan = resolve_plan_for_task_id(task_id)
+        if plan is None:
+            return None
+        return {
+            "id": str(plan.id),
+            "state": plan.state,
+            "progress_message": plan.progress_message,
+        }
+
+def _list_plans_sync(user_id: Optional[str], limit: int) -> list[dict[str, Any]]:
+    with app.app_context():
+        query = db.session.query(PlanItem)
+        if user_id is not None:
+            query = query.filter_by(user_id=user_id)
+        plans = (
+            query
+            .order_by(PlanItem.timestamp_created.desc())
+            .limit(max(1, min(limit, 50)))
+            .all()
+        )
+        results = []
+        for plan in plans:
+            created_at = plan.timestamp_created
+            if created_at and created_at.tzinfo is None:
+                created_at = created_at.replace(tzinfo=UTC)
+            results.append({
+                "plan_id": str(plan.id),
+                "state": get_plan_state_mapping(plan.state),
+                "progress_percentage": float(plan.progress_percentage or 0.0),
+                "created_at": (
+                    created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z")
+                    if created_at else None
+                ),
+                "prompt_excerpt": (plan.prompt or "")[:PROMPT_EXCERPT_MAX_LENGTH],
+            })
+        return results
+
+
+# ---------------------------------------------------------------------------
+# Utilities
+# ---------------------------------------------------------------------------
+
+def get_plan_state_mapping(plan_state: PlanState) -> str:
+    """Map a PlanState to the MCP plan state string; unknown states fall back to "pending"."""
+    mapping = {
+        PlanState.pending: "pending",
+        PlanState.processing: "processing",
+        PlanState.completed: "completed",
+        PlanState.failed: "failed",
+    }
+    return mapping.get(plan_state, "pending")
+
+def _extract_plan_create_metadata_overrides(arguments: dict[str, Any]) -> dict[str, Any]:
+    """Extract plan_create runtime overrides from hidden metadata containers.
+
+    Supported hidden containers:
+    - arguments.tool_metadata
+    - arguments.metadata
+    - arguments._meta
+
+    If a container includes any of these nested namespaces, their keys override the container's top-level keys (later entries win):
+    - plan_create
+    - task_create (legacy alias)
+    - planexe_task_create (legacy alias)
+    - planexe
+    """
+    merged: dict[str, Any] = {}
+    metadata_candidates: list[dict[str, Any]] = []
+
+    for key in ("tool_metadata", "metadata", "_meta"):
+        candidate = arguments.get(key)
+        if isinstance(candidate, dict):
+            metadata_candidates.append(candidate)
+
+    for candidate in metadata_candidates:
+        merged.update(candidate)
+        for nested_key in ("plan_create", "task_create", "planexe_task_create", "planexe"):
+            nested = candidate.get(nested_key)
+            if isinstance(nested, dict):
+                merged.update(nested)
+
+    return merged
+
+def _merge_plan_create_config(
+    config: Optional[dict[str, Any]],
+    model_profile: Optional[str],
+) -> Optional[dict[str, Any]]:
+    merged = dict(config or {})
+    if isinstance(model_profile, str):
+        candidate_profile = model_profile.strip()
+        if candidate_profile and "model_profile" not in merged:
+            merged["model_profile"] = candidate_profile
+    return merged or None
diff --git a/mcp_cloud/db_setup.py b/mcp_cloud/db_setup.py
new file mode 100644
index 000000000..81149e97d
--- /dev/null
+++ b/mcp_cloud/db_setup.py
@@ -0,0 +1,170 @@
+"""PlanExe MCP Cloud – database setup, Flask app, constants, and request classes."""
+import logging
+import os
+from pathlib import Path
+from typing import Literal, Optional
+from urllib.parse import quote_plus
+
+from flask import Flask
+from mcp.server import Server
+from pydantic import BaseModel
+from sqlalchemy import text
+from worker_plan_api.model_profile import ModelProfileEnum
+
+from mcp_cloud.dotenv_utils import load_planexe_dotenv
+
+_dotenv_loaded, _dotenv_paths = load_planexe_dotenv(Path(__file__).parent)
+
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+if not _dotenv_loaded:
+    logger.warning(
+        "No .env file found; searched: %s",
+        ", ".join(str(path) for path in _dotenv_paths),
+    )
+
+from database_api.planexe_db_singleton import db
+from database_api.model_planitem import PlanItem, PlanState
+from database_api.model_event import EventItem, EventType
+from database_api.model_user_account import UserAccount
+from database_api.model_user_api_key import UserApiKey
+
+app = Flask(__name__)
+app.config.from_pyfile('config.py')
+
+def build_postgres_uri_from_env(env: dict[str, str]) -> tuple[str, dict[str, str]]:
+    """Construct a SQLAlchemy URI for Postgres using environment variables."""
+    host = env.get("PLANEXE_POSTGRES_HOST") or "database_postgres"
+    port = str(env.get("PLANEXE_POSTGRES_PORT") or "5432")
+    dbname = env.get("PLANEXE_POSTGRES_DB") or "planexe"
+    user = env.get("PLANEXE_POSTGRES_USER") or "planexe"
+    password = env.get("PLANEXE_POSTGRES_PASSWORD") or "planexe"
+    uri = f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{dbname}"
+    safe_config = {"host": host, "port": port, "dbname": dbname, "user": user}
+    return uri, safe_config
+
+sqlalchemy_database_uri = os.environ.get("SQLALCHEMY_DATABASE_URI")
+if sqlalchemy_database_uri is None:
+    sqlalchemy_database_uri, db_settings = build_postgres_uri_from_env(os.environ)
+    logger.info("SQLALCHEMY_DATABASE_URI not set. Using Postgres defaults: %s", db_settings)
+else:
+    logger.info("Using SQLALCHEMY_DATABASE_URI from environment.")
+
+app.config['SQLALCHEMY_DATABASE_URI'] = sqlalchemy_database_uri
+app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'pool_recycle': 280, 'pool_pre_ping': True}
+db.init_app(app)
+
+def ensure_planitem_stop_columns() -> None:
+    statements = (
+        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_jsonl TEXT",
+        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_track_activity_bytes INTEGER",
+        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_activity_overview_json JSON",
+        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS run_artifact_layout_version INTEGER",
+        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested BOOLEAN",
+        "ALTER TABLE task_item ADD COLUMN IF NOT EXISTS stop_requested_timestamp TIMESTAMP",
+    )
+    for statement in statements:  # One transaction per statement, so one failed ALTER cannot abort the rest.
+        try:
+            with db.engine.begin() as conn:
+                conn.execute(text(statement))
+        except Exception as exc:
+            logger.warning("Schema update failed for %s: %s", statement, exc, exc_info=True)
+
+with app.app_context():
+    ensure_planitem_stop_columns()
+
+# Shown in MCP initialize (e.g. Inspector) so clients know what PlanExe does.
+PLANEXE_SERVER_INSTRUCTIONS = (
+    "PlanExe generates strategic project-plan drafts from a natural-language prompt. "
+    "Output is a self-contained interactive HTML report (~700KB) with 20+ sections including "
+    "executive summary, interactive Gantt charts, risk analysis, SWOT, governance, investor pitch, "
+    "team profiles, work breakdown, scenario comparison, expert criticism, and adversarial sections "
+    "(premortem, self-audit checklist, premise attacks) that stress-test whether the plan holds up. "
+    "The output is a draft to refine, not final ground truth — but it surfaces hard questions the prompter may not have considered. "
+    "Use PlanExe for substantial multi-phase projects with constraints, stakeholders, budgets, and timelines. "
+    "Do not use PlanExe for tiny one-shot outputs (for example: 'give me a 5-point checklist'); use a normal LLM response for that. "
+    "The planning pipeline is fixed end-to-end; callers cannot select individual internal pipeline steps to run. "
+    "Required interaction order: call prompt_examples first. "
+    "Optional before plan_create: call model_profiles to see profile guidance and available models in each profile. "
+    "Then perform a non-tool step: draft a strong prompt as flowing prose (not structured markdown with headers or bullets), "
+    "typically ~300-800 words, and get user approval. "
+    "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. "
+    "Write the prompt as flowing prose — weave specs, constraints, and targets naturally into sentences. "
+    "Only after approval, call plan_create. "
+    "Each plan_create call creates a new plan_id; the server does not enforce a global per-client concurrency limit. "
+    "Then poll plan_status (about every 5 minutes); use plan_file_info when complete. "
+    "If a run fails, call plan_retry with the failed plan_id to requeue it (optional model_profile, defaults to baseline). "
+    "To stop, call plan_stop with the plan_id from plan_create; stopping is asynchronous and the plan will eventually transition to failed. "
+    "If model_profiles returns MODEL_PROFILES_UNAVAILABLE, inform the user that no models are currently configured and the server administrator needs to set up model profiles. "
+    "Tool errors use {error:{code,message}}. plan_file_info returns {ready:false,reason:...} while the artifact is not yet ready; check readiness by testing whether download_url is present in the response. "
+    "plan_file_info download_url is the absolute URL where the requested artifact can be downloaded. "
+    "To list recent plans for a user call plan_list; returns plan_id, state, progress_percentage, created_at, and prompt_excerpt for each plan. "
+    "plan_status state contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. "
+    "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the plan was likely queued but not picked up by a worker (server issue). "
+    "If plan_status is in processing and output files do not change for longer than 20 minutes, the plan_create likely failed/stalled. "
+    "In both cases, report the issue to PlanExe developers on GitHub: https://github.com/PlanExeOrg/PlanExe/issues . "
+    "Main output: a self-contained interactive HTML report (~700KB) with collapsible sections and interactive Gantt charts — open in a browser. "
+    "The zip contains the intermediary pipeline files (md, json, csv) that fed the report."
+)
+
+mcp_cloud_server = Server("planexe-mcp-cloud", instructions=PLANEXE_SERVER_INSTRUCTIONS)
+
+# Base directory for run artifacts (not used directly, fetched via worker_plan HTTP API)
+BASE_DIR_RUN = Path(os.environ.get("PLANEXE_RUN_DIR", Path(__file__).parent.parent / "run")).resolve()
+
+WORKER_PLAN_URL = os.environ.get("PLANEXE_WORKER_PLAN_URL", "http://worker_plan:8000")
+
+REPORT_FILENAME = "030-report.html"
+REPORT_CONTENT_TYPE = "text/html; charset=utf-8"
+ZIP_FILENAME = "run.zip"
+ZIP_CONTENT_TYPE = "application/zip"
+ZIP_SNAPSHOT_MAX_BYTES = 100_000_000
+
+ModelProfileInput = Literal[
+    "baseline",
+    "premium",
+    "frontier",
+    "custom",
+]
+MODEL_PROFILE_TITLES = {
+    ModelProfileEnum.BASELINE.value: "Baseline",
+    ModelProfileEnum.PREMIUM.value: "Premium",
+    ModelProfileEnum.FRONTIER.value: "Frontier",
+    ModelProfileEnum.CUSTOM.value: "Custom",
+}
+MODEL_PROFILE_SUMMARIES = {
+    ModelProfileEnum.BASELINE.value: "Cheap and fast; recommended default when creating a plan.",
+    ModelProfileEnum.PREMIUM.value: "Higher-cost profile tuned for stronger output quality.",
+    ModelProfileEnum.FRONTIER.value: "Most capable models first; usually slowest/most expensive.",
+    ModelProfileEnum.CUSTOM.value: "User-managed profile file for custom model ordering.",
+}
+
+class PlanCreateRequest(BaseModel):
+    prompt: str
+    model_profile: Optional[ModelProfileInput] = None
+    user_api_key: Optional[str] = None
+
+class PlanStatusRequest(BaseModel):
+    plan_id: str
+
+class PlanStopRequest(BaseModel):
+    plan_id: str
+
+class PlanRetryRequest(BaseModel):
+    plan_id: str
+    model_profile: ModelProfileInput = "baseline"
+
+class PlanFileInfoRequest(BaseModel):
+    plan_id: str
+    artifact: Optional[str] = None
+
+class PlanListRequest(BaseModel):
+    user_api_key: Optional[str] = None
+    limit: int = 10
+
+class ModelProfilesRequest(BaseModel):
+    """No input parameters."""
+    pass
diff --git a/mcp_cloud/download_tokens.py b/mcp_cloud/download_tokens.py
new file mode 100644
index 000000000..9dccc6f95
--- /dev/null
+++ b/mcp_cloud/download_tokens.py
@@ -0,0 +1,152 @@
+"""PlanExe MCP Cloud – signed download tokens and URL builders."""
+import contextvars
+import hashlib
+import hmac
+import logging
+import os
+import secrets
+import time
+from typing import Optional
+
+from mcp_cloud.db_setup import REPORT_FILENAME, ZIP_FILENAME
+
+logger = logging.getLogger(__name__)
+
+
+# Context var set by HTTP server so download URLs use the request's host when
+# PLANEXE_MCP_PUBLIC_BASE_URL is not set (avoids localhost for remote clients).
+_download_base_url_ctx: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
+    "download_base_url", default=None
+)
+
+
+def set_download_base_url(base_url: Optional[str]) -> None:
+    """Set the base URL used for download links for this request (e.g. from HTTP Request).
+    Cleared automatically when the request ends. Used when PLANEXE_MCP_PUBLIC_BASE_URL is unset."""
+    if base_url is not None:
+        _download_base_url_ctx.set(base_url.rstrip("/"))
+    else:
+        try:
+            _download_base_url_ctx.set("")
+        except LookupError:
+            pass
+
+
+def clear_download_base_url() -> None:
+    """Clear the request-scoped base URL (call when request ends)."""
+    try:
+        _download_base_url_ctx.set("")
+    except LookupError:
+        pass
+
+
+def _get_download_base_url() -> Optional[str]:
+    """Return base URL for download links: env var, then request context, then None."""
+    base_url = os.environ.get("PLANEXE_MCP_PUBLIC_BASE_URL")
+    if base_url:
+        return base_url.rstrip("/")
+    try:
+        ctx_url = _download_base_url_ctx.get()
+        return ctx_url if ctx_url else None
+    except LookupError:
+        return None
+
+
+def build_report_download_path(task_id: str) -> str:
+    return f"/download/{task_id}/{REPORT_FILENAME}"
+
+
+def build_zip_download_path(task_id: str) -> str:
+    return f"/download/{task_id}/{ZIP_FILENAME}"
+
+
+# ---------------------------------------------------------------------------
+# Signed, expiring download tokens
+# ---------------------------------------------------------------------------
+
+# Default TTL for signed download tokens (seconds). Configurable via env var.
+DOWNLOAD_TOKEN_TTL_SECONDS = int(os.environ.get("PLANEXE_DOWNLOAD_TOKEN_TTL", "900"))  # 15 min
+
+# Per-process fallback secret when no env var is set.  Tokens won't survive a
+# server restart, but that is acceptable for the fallback case.
+_random_token_secret: Optional[bytes] = None
+
+
+def validate_download_token_secret() -> None:
+    """Raise if no stable download-token secret is configured.
+
+    Call at startup when authentication is required so the server
+    fails hard instead of silently using a random per-process secret
+    that invalidates tokens on restart.
+    """
+    for env_var in ("PLANEXE_DOWNLOAD_TOKEN_SECRET", "PLANEXE_API_KEY_SECRET"):
+        if os.environ.get(env_var):
+            return
+    raise RuntimeError(
+        "Neither PLANEXE_DOWNLOAD_TOKEN_SECRET nor PLANEXE_API_KEY_SECRET is set. "
+        "Set at least one or disable auth with PLANEXE_MCP_REQUIRE_AUTH=false."
+    )
+
+
+def _get_download_token_secret() -> bytes:
+    """Return the HMAC-SHA256 secret used to sign download tokens.
+
+    Priority: PLANEXE_DOWNLOAD_TOKEN_SECRET → PLANEXE_API_KEY_SECRET →
+    per-process random (with a warning logged once).
+    """
+    global _random_token_secret
+    for env_var in ("PLANEXE_DOWNLOAD_TOKEN_SECRET", "PLANEXE_API_KEY_SECRET"):
+        value = os.environ.get(env_var)
+        if value:
+            return value.encode()
+    if _random_token_secret is None:
+        _random_token_secret = secrets.token_bytes(32)
+        logger.warning(
+            "PLANEXE_DOWNLOAD_TOKEN_SECRET is not set; using a random per-process secret. "
+            "Download tokens will be invalidated on server restart. "
+            "Set PLANEXE_DOWNLOAD_TOKEN_SECRET to a stable value."
+        )
+    return _random_token_secret
+
+
+def generate_download_token(task_id: str, filename: str) -> str:
+    """Return a signed, time-limited token for one task artifact download.
+
+    Format: ``{expiry_unix_ts}.{hmac_hex}``
+    The HMAC covers ``task_id:filename:expiry`` so the token is scoped to
+    exactly one file and cannot be reused for a different task.
+    """
+    expiry = int(time.time()) + DOWNLOAD_TOKEN_TTL_SECONDS
+    message = f"{task_id}:{filename}:{expiry}".encode()
+    mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest()
+    return f"{expiry}.{mac}"
+
+
+def validate_download_token(token: str, task_id: str, filename: str) -> bool:
+    """Return True when *token* is a valid, unexpired token for the given artifact."""
+    try:
+        expiry_str, mac = token.split(".", 1)
+        expiry = int(expiry_str)
+    except (ValueError, AttributeError):
+        return False
+    if time.time() > expiry:
+        return False
+    message = f"{task_id}:{filename}:{expiry}".encode()
+    expected_mac = hmac.new(_get_download_token_secret(), message, hashlib.sha256).hexdigest()
+    return hmac.compare_digest(mac.encode(), expected_mac.encode())  # bytes: a non-ASCII str would raise TypeError
+
+
+def build_report_download_url(task_id: str) -> Optional[str]:
+    base_url = _get_download_base_url()
+    if not base_url:
+        return None
+    token = generate_download_token(task_id, REPORT_FILENAME)
+    return f"{base_url}{build_report_download_path(task_id)}?token={token}"
+
+
+def build_zip_download_url(task_id: str) -> Optional[str]:
+    base_url = _get_download_base_url()
+    if not base_url:
+        return None
+    token = generate_download_token(task_id, ZIP_FILENAME)
+    return f"{base_url}{build_zip_download_path(task_id)}?token={token}"
diff --git a/mcp_cloud/handlers.py b/mcp_cloud/handlers.py
new file mode 100644
index 000000000..870c007c1
--- /dev/null
+++ b/mcp_cloud/handlers.py
@@ -0,0 +1,554 @@
+"""PlanExe MCP Cloud – MCP tool handlers and dispatch."""
+import asyncio
+import json
+import logging
+import os
+import time
+from datetime import UTC, datetime
+from typing import Any
+
+from mcp.types import CallToolResult, Tool, TextContent, ToolAnnotations
+
+from mcp_cloud.db_setup import (
+    PlanState,
+    REPORT_CONTENT_TYPE,
+    REPORT_FILENAME,
+    ZIP_CONTENT_TYPE,
+    ModelProfileInput,
+    PlanCreateRequest,
+    PlanStatusRequest,
+    PlanStopRequest,
+    PlanRetryRequest,
+    PlanFileInfoRequest,
+    PlanListRequest,
+    ModelProfilesRequest,
+    mcp_cloud_server,
+)
+from mcp_cloud.auth import _resolve_user_from_api_key
+from mcp_cloud.db_queries import (
+    _create_plan_sync,
+    _get_plan_status_snapshot_sync,
+    _request_plan_stop_sync,
+    _retry_failed_plan_sync,
+    _get_plan_for_report_sync,
+    _list_plans_sync,
+    get_plan_state_mapping,
+    _extract_plan_create_metadata_overrides,
+    _merge_plan_create_config,
+)
+from mcp_cloud.zip_utils import (
+    list_files_from_zip_snapshot,
+    compute_sha256,
+)
+from mcp_cloud.worker_fetchers import (
+    fetch_artifact_from_worker_plan,
+    fetch_file_list_from_worker_plan,
+    list_files_from_local_run_dir,
+    fetch_user_downloadable_zip,
+)
+from mcp_cloud.model_profiles import _get_model_profiles_sync
+from mcp_cloud.download_tokens import build_report_download_url, build_zip_download_url
+from mcp_cloud.prompt_examples import _load_mcp_example_prompts
+from mcp_cloud.schemas import TOOL_DEFINITIONS
+
+logger = logging.getLogger(__name__)
+
+
+@mcp_cloud_server.list_tools()
+async def handle_list_tools() -> list[Tool]:
+    """List all available MCP tools."""
+    return [
+        Tool(
+            name=definition.name,
+            description=definition.description,
+            outputSchema=definition.output_schema,
+            inputSchema=definition.input_schema,
+            annotations=ToolAnnotations(**definition.annotations) if definition.annotations else None,
+        )
+        for definition in TOOL_DEFINITIONS
+    ]
+
+@mcp_cloud_server.call_tool()
+async def handle_call_tool(name: str, arguments: dict[str, Any]) -> CallToolResult:
+    """Dispatch MCP tool calls; unknown tools and unhandled exceptions return structured JSON errors."""
+    start = time.monotonic()
+    try:
+        handler = TOOL_HANDLERS.get(name)
+        if handler is None:
+            logger.warning("tool_call tool=%s result=unknown_tool", name)
+            response = {"error": {"code": "INVALID_TOOL", "message": f"Unknown tool: {name}"}}
+            return CallToolResult(
+                content=[TextContent(type="text", text=json.dumps(response))],
+                structuredContent=response,
+                isError=True,
+            )
+        result = await handler(arguments)
+        elapsed_ms = (time.monotonic() - start) * 1000
+        if result.isError:
+            logger.info("tool_call tool=%s result=error duration_ms=%.0f", name, elapsed_ms)
+        else:
+            logger.info("tool_call tool=%s result=ok duration_ms=%.0f", name, elapsed_ms)
+        return result
+    except Exception as e:
+        elapsed_ms = (time.monotonic() - start) * 1000
+        logger.error("tool_call tool=%s result=exception duration_ms=%.0f error=%s", name, elapsed_ms, e, exc_info=True)
+        response = {"error": {"code": "INTERNAL_ERROR", "message": str(e)}}
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult:
+    """Create a new PlanExe plan and enqueue it for processing.
+
+    Examples:
+        - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. Open within 12 months."} → returns plan_id (UUID) + created_at
+
+    Args:
+        - prompt: What the plan should cover (goal, context, constraints).
+        - model_profile: Optional profile ("baseline" | "premium" | "frontier" | "custom"). Call model_profiles to inspect options.
+
+    Returns:
+        - content: JSON string matching structuredContent.
+        - structuredContent: {"plan_id": "<uuid>", "created_at": ...}
+        - isError: False on success.
+    """
+    req = PlanCreateRequest(**arguments)
+    metadata_overrides = _extract_plan_create_metadata_overrides(arguments)
+    metadata_model_profile = metadata_overrides.get("model_profile")
+    model_profile = req.model_profile
+    if model_profile is None and isinstance(metadata_model_profile, str):
+        model_profile = metadata_model_profile
+
+    merged_config = _merge_plan_create_config(None, model_profile)
+    require_user_key = os.environ.get("PLANEXE_MCP_REQUIRE_USER_KEY", "false").lower() in ("1", "true", "yes", "on")
+    user_context = None
+    if req.user_api_key:
+        user_context = _resolve_user_from_api_key(req.user_api_key.strip())
+        if not user_context:
+            response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}}
+            return CallToolResult(
+                content=[TextContent(type="text", text=json.dumps(response))],
+                structuredContent=response,
+                isError=True,
+            )
+    elif require_user_key:
+        response = {"error": {"code": "USER_API_KEY_REQUIRED", "message": "user_api_key is required for plan_create."}}
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+    if user_context and float(user_context.get("credits_balance") or 0.0) <= 0.0:
+        response = {"error": {"code": "INSUFFICIENT_CREDITS", "message": "Not enough credits."}}
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+    response = await asyncio.to_thread(
+        _create_plan_sync,
+        req.prompt,
+        merged_config,
+        {"user_id": str(user_context["user_id"])} if user_context else None,
+    )
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(response))],
+        structuredContent=response,
+        isError=False,
+    )
+
+
+async def handle_prompt_examples(arguments: dict[str, Any]) -> CallToolResult:
+    """Return curated prompts from the catalog (mcp_example true) so LLMs can see example detail."""
+    samples = _load_mcp_example_prompts()
+    payload = {
+        "samples": samples,
+        "message": (
+            "Next: complete the non-tool step by drafting a detailed prompt (typically ~300-800 words) using these as a baseline (similar structure), then get user approval. "
+            "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. "
+            "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. "
+            "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. "
+            "The examples demonstrate this prose style — match their tone and density. "
+            "Only after approval, call plan_create. "
+            "Do not use PlanExe for tiny one-shot requests (e.g., rewrite this email, summarize this document). "
+            "PlanExe always runs the full fixed planning pipeline; callers cannot run only selected internal steps."
+        ),
+    }
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(payload))],
+        structuredContent=payload,
+        isError=False,
+    )
+
+
+async def handle_model_profiles(arguments: dict[str, Any]) -> CallToolResult:
+    """Return model profile options and currently available models in each profile."""
+    _ = ModelProfilesRequest(**(arguments or {}))
+    payload = await asyncio.to_thread(_get_model_profiles_sync)
+    profiles = payload.get("profiles")
+    if not isinstance(profiles, list) or len(profiles) == 0:
+        response = {
+            "error": {
+                "code": "MODEL_PROFILES_UNAVAILABLE",
+                "message": (
+                    "No models are currently configured. "
+                    "Inform the user that the server administrator needs to set up model profiles before plans can be created."
+                ),
+            }
+        }
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(payload))],
+        structuredContent=payload,
+        isError=False,
+    )
+
+
+async def handle_plan_status(arguments: dict[str, Any]) -> CallToolResult:
+    """Fetch the current plan status, progress, and recent files for a plan.
+
+    Examples:
+        - {"plan_id": "uuid"} → state/progress/timing + recent files
+
+    Args:
+        - plan_id: Plan UUID returned by plan_create.
+
+    Returns:
+        - content: JSON string matching structuredContent.
+        - structuredContent: status payload or error.
+        - isError: True only when plan_id is unknown.
+    """
+    req = PlanStatusRequest(**arguments)
+    task_id = req.plan_id
+
+    plan_snapshot = await asyncio.to_thread(_get_plan_status_snapshot_sync, task_id)
+    if plan_snapshot is None:
+        response = {
+            "error": {
+                "code": "PLAN_NOT_FOUND",
+                "message": f"Plan not found: {task_id}",
+            }
+        }
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+    progress_percentage = float(plan_snapshot.get("progress_percentage") or 0.0)
+
+    plan_state = plan_snapshot["state"]
+    state = get_plan_state_mapping(plan_state)
+    if plan_state == PlanState.completed:
+        progress_percentage = 100.0
+
+    # Collect files from worker_plan
+    plan_uuid = plan_snapshot["id"]
+    files = []
+    if plan_uuid:
+        files_list = await fetch_file_list_from_worker_plan(plan_uuid)
+        if not files_list:
+            files_list = await asyncio.to_thread(list_files_from_zip_snapshot, plan_uuid)
+        if not files_list:
+            files_list = await asyncio.to_thread(list_files_from_local_run_dir, plan_uuid)
+        if files_list:
+            # Drop log.txt before capping at 10 entries; updated_at is approximate (current time).
+            for file_name in [name for name in files_list if name != "log.txt"][:10]:
+                updated_at = datetime.now(UTC).replace(microsecond=0)
+                files.append({
+                    "path": file_name,
+                    "updated_at": updated_at.isoformat().replace("+00:00", "Z"),
+                })
+
+    created_at = plan_snapshot["timestamp_created"]
+    if created_at and created_at.tzinfo is None:
+        created_at = created_at.replace(tzinfo=UTC)
+
+    response = {
+        "plan_id": plan_uuid,
+        "state": state,
+        "progress_percentage": progress_percentage,
+        "timing": {
+            "started_at": (
+                created_at.replace(microsecond=0).isoformat().replace("+00:00", "Z")
+                if created_at
+                else None
+            ),
+            "elapsed_sec": (datetime.now(UTC) - created_at).total_seconds() if created_at else 0,
+        },
+        "files": files[:10],  # Limit to 10 most recent
+    }
+
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(response))],
+        structuredContent=response,
+        isError=False,
+    )
+
+async def handle_plan_stop(arguments: dict[str, Any]) -> CallToolResult:
+    """Request an active plan to stop.
+
+    Examples:
+        - {"plan_id": "uuid"} → stop request accepted
+
+    Args:
+        - plan_id: Plan UUID returned by plan_create.
+
+    Returns:
+        - content: JSON string matching structuredContent.
+        - structuredContent: {"state": "pending|processing|completed|failed", "stop_requested": bool} or error payload.
+        - isError: True only when plan_id is unknown.
+    """
+    req = PlanStopRequest(**arguments)
+    plan_id = req.plan_id
+
+    stop_result = await asyncio.to_thread(_request_plan_stop_sync, plan_id)
+    if stop_result is None:
+        response = {
+            "error": {
+                "code": "PLAN_NOT_FOUND",
+                "message": f"Plan not found: {plan_id}",
+            }
+        }
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+    response = stop_result
+
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(response))],
+        structuredContent=response,
+        isError=False,
+    )
+
+
+async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult:
+    """Retry a failed plan by resetting it back to pending."""
+    req = PlanRetryRequest(**arguments)
+    plan_id = req.plan_id
+    retry_result = await asyncio.to_thread(_retry_failed_plan_sync, plan_id, req.model_profile)
+
+    if retry_result is None:
+        response = {
+            "error": {
+                "code": "PLAN_NOT_FOUND",
+                "message": f"Plan not found: {plan_id}",
+            }
+        }
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+    if isinstance(retry_result.get("error"), dict):
+        response = retry_result
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+    response = retry_result
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(response))],
+        structuredContent=response,
+        isError=False,
+    )
+
+
+async def handle_plan_file_info(arguments: dict[str, Any]) -> CallToolResult:
+    """Return download metadata for a plan's report or zip artifact.
+
+    Examples:
+        - {"plan_id": "uuid"} → report metadata (default)
+        - {"plan_id": "uuid", "artifact": "zip"} → zip metadata
+
+    Args:
+        - plan_id: Plan UUID returned by plan_create.
+        - artifact: Optional "report" or "zip".
+
+    Returns:
+        - content: JSON string matching structuredContent.
+        - structuredContent: metadata (content_type, sha256, download_size,
+          optional download_url), {"ready": false, ...} if not ready, or error payload.
+        - isError: True when plan_id is unknown or artifact is invalid.
+    """
+    req = PlanFileInfoRequest(**arguments)
+    plan_id = req.plan_id
+    artifact = req.artifact.strip().lower() if isinstance(req.artifact, str) else "report"
+    if artifact not in ("report", "zip"):
+        response = {
+            "error": {
+                "code": "INVALID_ARGUMENT",
+                "message": f"Invalid artifact type: {req.artifact!r}. Must be 'report' or 'zip'.",
+            }
+        }
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+    plan_snapshot = await asyncio.to_thread(_get_plan_for_report_sync, plan_id)
+    if plan_snapshot is None:
+        response = {
+            "error": {
+                "code": "PLAN_NOT_FOUND",
+                "message": f"Plan not found: {plan_id}",
+            }
+        }
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+
+    run_id = plan_snapshot["id"]
+    if artifact == "zip":
+        content_bytes = await fetch_user_downloadable_zip(run_id)
+        if content_bytes is None:
+            plan_state = plan_snapshot["state"]
+            if plan_state in (PlanState.pending, PlanState.processing) or plan_state is None:
+                response = {"ready": False, "reason": "processing"}
+            else:
+                response = {
+                    "error": {
+                        "code": "content_unavailable",
+                        "message": "zip content_bytes is None",
+                    },
+                }
+            return CallToolResult(
+                content=[TextContent(type="text", text=json.dumps(response))],
+                structuredContent=response,
+                isError=False,
+            )
+
+        total_size = len(content_bytes)
+        content_hash = compute_sha256(content_bytes)
+        response = {
+            "content_type": ZIP_CONTENT_TYPE,
+            "sha256": content_hash,
+            "download_size": total_size,
+        }
+        download_url = build_zip_download_url(run_id)
+        if download_url:
+            response["download_url"] = download_url
+
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=False,
+        )
+
+    plan_state = plan_snapshot["state"]
+    if plan_state in (PlanState.pending, PlanState.processing) or plan_state is None:
+        response = {"ready": False, "reason": "processing"}
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=False,
+        )
+    if plan_state == PlanState.failed:
+        message = plan_snapshot["progress_message"] or "Plan generation failed."
+        response = {"ready": False, "reason": "failed", "error": {"code": "generation_failed", "message": message}}
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=False,
+        )
+
+    content_bytes = await fetch_artifact_from_worker_plan(run_id, REPORT_FILENAME)
+    if content_bytes is None:
+        response = {
+            "error": {
+                "code": "content_unavailable",
+                "message": "content_bytes is None",
+            },
+        }
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=False,
+        )
+
+    total_size = len(content_bytes)
+    content_hash = compute_sha256(content_bytes)
+    response = {
+        "content_type": REPORT_CONTENT_TYPE,
+        "sha256": content_hash,
+        "download_size": total_size,
+    }
+    download_url = build_report_download_url(run_id)
+    if download_url:
+        response["download_url"] = download_url
+
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(response))],
+        structuredContent=response,
+        isError=False,
+    )
+
+async def handle_plan_list(arguments: dict[str, Any]) -> CallToolResult:
+    """Return recent plans for an authenticated user."""
+    try:
+        req = PlanListRequest(**arguments)
+    except Exception as exc:
+        response = {"error": {"code": "INVALID_ARGUMENTS", "message": str(exc)}}
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+    require_user_key = os.environ.get("PLANEXE_MCP_REQUIRE_USER_KEY", "false").lower() in ("1", "true", "yes", "on")
+    user_context = None
+    if req.user_api_key:
+        user_context = _resolve_user_from_api_key(req.user_api_key.strip())
+        if not user_context:
+            response = {"error": {"code": "INVALID_USER_API_KEY", "message": "Invalid user_api_key."}}
+            return CallToolResult(
+                content=[TextContent(type="text", text=json.dumps(response))],
+                structuredContent=response,
+                isError=True,
+            )
+    elif require_user_key:
+        response = {"error": {"code": "USER_API_KEY_REQUIRED", "message": "user_api_key is required for plan_list."}}
+        return CallToolResult(
+            content=[TextContent(type="text", text=json.dumps(response))],
+            structuredContent=response,
+            isError=True,
+        )
+    user_id = str(user_context["user_id"]) if user_context else None
+    limit = max(1, min(req.limit, 50))
+    plans = await asyncio.to_thread(_list_plans_sync, user_id, limit)
+    response = {
+        "plans": plans,
+        "message": f"Returned {len(plans)} plan(s).",
+    }
+    return CallToolResult(
+        content=[TextContent(type="text", text=json.dumps(response))],
+        structuredContent=response,
+        isError=False,
+    )
+
+
+TOOL_HANDLERS = {
+    "plan_create": handle_plan_create,
+    "plan_status": handle_plan_status,
+    "plan_stop": handle_plan_stop,
+    "plan_retry": handle_plan_retry,
+    "plan_file_info": handle_plan_file_info,
+    "plan_list": handle_plan_list,
+    "prompt_examples": handle_prompt_examples,
+    "model_profiles": handle_model_profiles,
+}
diff --git a/mcp_cloud/http_server.py b/mcp_cloud/http_server.py
index c1ea7f7d1..55c9ab37d 100644
--- a/mcp_cloud/http_server.py
+++ b/mcp_cloud/http_server.py
@@ -28,6 +28,7 @@
     ModelProfilesOutput,
     PlanCreateOutput,
     PlanFileInfoOutput,
+    PlanListOutput,
     PlanRetryOutput,
     PlanStatusOutput,
     PlanStopOutput,
@@ -65,11 +66,13 @@
     handle_plan_stop,
     handle_plan_file_info,
     handle_prompt_examples,
-    resolve_task_for_task_id,
+    resolve_plan_for_task_id,
     set_download_base_url,
     validate_download_token,
     _resolve_user_from_api_key,
 )
+from mcp_cloud.auth import validate_api_key_secret
+from mcp_cloud.download_tokens import validate_download_token_secret
 
 REQUIRED_API_KEY = os.environ.get("PLANEXE_MCP_API_KEY")
 
@@ -78,6 +81,8 @@
 MAX_BODY_BYTES = int(os.environ.get("PLANEXE_MCP_MAX_BODY_BYTES", "1048576"))
 RATE_LIMIT_REQUESTS = int(os.environ.get("PLANEXE_MCP_RATE_LIMIT", "60"))
 RATE_LIMIT_WINDOW_SECONDS = float(os.environ.get("PLANEXE_MCP_RATE_WINDOW_SECONDS", "60"))
+DOWNLOAD_RATE_LIMIT_REQUESTS = int(os.environ.get("PLANEXE_MCP_DOWNLOAD_RATE_LIMIT", "10"))
+DOWNLOAD_RATE_LIMIT_WINDOW_SECONDS = float(os.environ.get("PLANEXE_MCP_DOWNLOAD_RATE_WINDOW_SECONDS", "60"))
 GLAMA_MAINTAINER_EMAIL = os.environ.get(
     "PLANEXE_MCP_GLAMA_MAINTAINER_EMAIL",
     "neoneye@gmail.com",
@@ -99,6 +104,10 @@ def _parse_bool_env(name: str, default: bool) -> bool:
 
 AUTH_REQUIRED = _parse_bool_env("PLANEXE_MCP_REQUIRE_AUTH", default=True)
 
+if AUTH_REQUIRED:
+    validate_api_key_secret()
+    validate_download_token_secret()
+
 
 def _split_csv_env(value: Optional[str]) -> list[str]:
     if not value:
@@ -136,10 +145,18 @@ def _split_csv_env(value: Optional[str]) -> list[str]:
 
 CORS_ORIGINS = _split_csv_env(os.environ.get("PLANEXE_MCP_CORS_ORIGINS"))
 if not CORS_ORIGINS:
-    # Use wildcard so that browser-based tools (e.g. MCP Inspector at
-    # localhost:6274) can connect directly.  API-key auth is the primary
-    # access control; CORS is defence-in-depth only.
-    CORS_ORIGINS = ["*"]
+    if AUTH_REQUIRED:
+        # Production default: only allow known PlanExe origins.
+        # Override via PLANEXE_MCP_CORS_ORIGINS if additional origins are needed.
+        CORS_ORIGINS = [
+            "https://mcp.planexe.org",
+            "https://home.planexe.org",
+        ]
+    else:
+        # Dev mode: allow any origin so browser-based tools (e.g. MCP Inspector
+        # at localhost:6274) can connect without extra configuration.
+        CORS_ORIGINS = ["*"]
+        logger.info("CORS wildcard enabled (PLANEXE_MCP_REQUIRE_AUTH=false)")
 
 PUBLIC_JSONRPC_METHODS_NO_AUTH = {
     "initialize",
@@ -317,6 +334,7 @@ async def _log_auth_rejection(request: Request, reason: str) -> None:
 
 _rate_lock = asyncio.Lock()
 _rate_buckets: dict[str, deque[float]] = defaultdict(deque)
+_download_rate_buckets: dict[str, deque[float]] = defaultdict(deque)
 _authenticated_user_api_key_ctx: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
     "authenticated_user_api_key", default=None
 )
@@ -448,6 +466,27 @@ async def _enforce_rate_limit(request: Request) -> Optional[JSONResponse]:
     return None
 
 
+async def _enforce_download_rate_limit(request: Request) -> Optional[JSONResponse]:
+    if DOWNLOAD_RATE_LIMIT_REQUESTS <= 0:
+        return None
+    if not request.url.path.startswith("/download"):
+        return None
+
+    identifier = _client_identifier(request)
+    now = monotonic()
+    async with _rate_lock:
+        bucket = _download_rate_buckets[identifier]
+        while bucket and now - bucket[0] > DOWNLOAD_RATE_LIMIT_WINDOW_SECONDS:
+            bucket.popleft()
+        if len(bucket) >= DOWNLOAD_RATE_LIMIT_REQUESTS:
+            return JSONResponse(
+                status_code=429,
+                content={"detail": "Download rate limit exceeded"},
+            )
+        bucket.append(now)
+    return None
+
+
 async def _sweep_rate_buckets(stop_event: asyncio.Event) -> None:
     while not stop_event.is_set():
         try:
@@ -462,18 +501,30 @@ async def _sweep_rate_buckets(stop_event: asyncio.Event) -> None:
                     bucket.popleft()
                 if not bucket:
                     del _rate_buckets[key]
+            for key in list(_download_rate_buckets):
+                bucket = _download_rate_buckets[key]
+                while bucket and now - bucket[0] > DOWNLOAD_RATE_LIMIT_WINDOW_SECONDS:
+                    bucket.popleft()
+                if not bucket:
+                    del _download_rate_buckets[key]
 
 
 async def _enforce_body_size(request: Request) -> Optional[JSONResponse]:
-    if request.method != "POST" or request.url.path != "/mcp/tools/call":
+    if request.method != "POST":
+        return None
+    if request.url.path not in ("/mcp/tools/call", "/mcp/"):
         return None
 
     content_length = request.headers.get("content-length")
     if not content_length:
-        return JSONResponse(
-            status_code=411,
-            content={"detail": "Length Required"},
-        )
+        # Streamable HTTP (/mcp/) may use chunked encoding without Content-Length.
+        # Only require it on the REST endpoint.
+        if request.url.path == "/mcp/tools/call":
+            return JSONResponse(
+                status_code=411,
+                content={"detail": "Length Required"},
+            )
+        return None
 
     try:
         if int(content_length) > MAX_BODY_BYTES:
@@ -597,35 +648,35 @@ async def plan_create(
 
 
 async def plan_status(
-    task_id: str = Field(..., description="Task UUID returned by plan_create."),
+    plan_id: str = Field(..., description="Plan UUID returned by plan_create."),
 ) -> Annotated[CallToolResult, PlanStatusOutput]:
-    return await handle_plan_status({"task_id": task_id})
+    return await handle_plan_status({"plan_id": plan_id})
 
 
 async def plan_stop(
-    task_id: str = Field(..., description="Task UUID returned by plan_create. Use it to stop the plan creation."),
+    plan_id: str = Field(..., description="Plan UUID returned by plan_create. Use it to stop the plan creation."),
 ) -> Annotated[CallToolResult, PlanStopOutput]:
-    return await handle_plan_stop({"task_id": task_id})
+    return await handle_plan_stop({"plan_id": plan_id})
 
 
 async def plan_retry(
-    task_id: str = Field(..., description="UUID of the failed task to retry."),
+    plan_id: str = Field(..., description="UUID of the failed plan to retry."),
     model_profile: Annotated[
         ModelProfileInput,
         Field(description="Model profile used for retry. Defaults to baseline."),
     ] = "baseline",
 ) -> Annotated[CallToolResult, PlanRetryOutput]:
-    return await handle_plan_retry({"task_id": task_id, "model_profile": model_profile})
+    return await handle_plan_retry({"plan_id": plan_id, "model_profile": model_profile})
 
 
 async def plan_file_info(
-    task_id: str = Field(..., description="Task UUID returned by plan_create. Use it to download the created plan."),
+    plan_id: str = Field(..., description="Plan UUID returned by plan_create. Use it to download the created plan."),
     artifact: Annotated[
         ResultArtifactInput,
         Field(description="Download artifact type: report or zip."),
     ] = "report",
 ) -> Annotated[CallToolResult, PlanFileInfoOutput]:
-    return await handle_plan_file_info({"task_id": task_id, "artifact": artifact})
+    return await handle_plan_file_info({"plan_id": plan_id, "artifact": artifact})
 
 
 async def prompt_examples() -> CallToolResult:
@@ -638,6 +689,17 @@ async def model_profiles() -> Annotated[CallToolResult, ModelProfilesOutput]:
     return await handle_model_profiles({})
 
 
+async def plan_list(
+    limit: int = Field(default=10, ge=1, le=50, description="Maximum number of plans to return (1–50). Newest plans are returned first."),
+) -> Annotated[CallToolResult, PlanListOutput]:
+    """List the most recent plans for an authenticated user."""
+    authenticated_user_api_key = _get_authenticated_user_api_key()
+    arguments: dict[str, Any] = {"limit": limit}
+    if authenticated_user_api_key:
+        arguments["user_api_key"] = authenticated_user_api_key
+    return await handle_plan_list(arguments)
+
+
 def _register_tools(server: FastMCP) -> None:
     handler_map = {
         "plan_create": plan_create,
@@ -645,6 +707,7 @@ def _register_tools(server: FastMCP) -> None:
         "plan_stop": plan_stop,
         "plan_retry": plan_retry,
         "plan_file_info": plan_file_info,
+        "plan_list": plan_list,
         "prompt_examples": prompt_examples,
         "model_profiles": model_profiles,
     }
@@ -769,6 +832,10 @@ async def enforce_api_key(
     if error_response:
         return _append_cors_headers(request, error_response)
 
+    error_response = await _enforce_download_rate_limit(request)
+    if error_response:
+        return _append_cors_headers(request, error_response)
+
     if request.url.path.startswith("/mcp"):
         set_download_base_url(_request_origin(request))
     try:
@@ -845,10 +912,12 @@ async def call_tool(
     This endpoint wraps the stdio-based MCP tool handlers for HTTP access.
     """
     arguments = dict(payload.arguments or {})
-    if payload.tool == "plan_create":
+    if payload.tool in ("plan_create", "plan_list"):
         authenticated_user_api_key = _get_authenticated_user_api_key()
         if authenticated_user_api_key and not arguments.get("user_api_key"):
             arguments["user_api_key"] = authenticated_user_api_key
+
+    if payload.tool == "plan_create":
         if isinstance(payload.metadata, dict):
             arguments["metadata"] = dict(payload.metadata)
 
@@ -905,17 +974,17 @@ async def download_report(
     # Defence-in-depth: if a token was supplied, it must be valid for this artifact.
     if token is not None and not validate_download_token(token, task_id, filename):
         raise HTTPException(status_code=401, detail="Invalid or expired download token")
-    task = await asyncio.to_thread(resolve_task_for_task_id, task_id)
-    if task is None:
+    plan = await asyncio.to_thread(resolve_plan_for_task_id, task_id)
+    if plan is None:
         raise HTTPException(status_code=404, detail="Task not found")
     if filename == ZIP_FILENAME:
-        content_bytes = await fetch_user_downloadable_zip(str(task.id))
+        content_bytes = await fetch_user_downloadable_zip(str(plan.id))
         if content_bytes is None:
             raise HTTPException(status_code=404, detail="Report not found")
         headers = {"Content-Disposition": f'attachment; filename="{task_id}.zip"'}
         return Response(content=content_bytes, media_type=ZIP_CONTENT_TYPE, headers=headers)
 
-    content_bytes = await fetch_artifact_from_worker_plan(str(task.id), REPORT_FILENAME)
+    content_bytes = await fetch_artifact_from_worker_plan(str(plan.id), REPORT_FILENAME)
     if content_bytes is None:
         raise HTTPException(status_code=404, detail="Report not found")
     headers = {"Content-Disposition": f'inline; filename="{REPORT_FILENAME}"'}
@@ -945,7 +1014,7 @@ def root() -> dict[str, Any]:
             "call": "/mcp/tools/call",
             "health": "/healthcheck",
             "glama_connector": "/.well-known/glama.json",
-            "download": f"/download/{{task_id}}/{REPORT_FILENAME}",
+            "download": f"/download/{{plan_id}}/{REPORT_FILENAME}",
             "llms_txt": "/llms.txt",
         },
         "documentation": "See /docs for OpenAPI documentation",
diff --git a/mcp_cloud/model_profiles.py b/mcp_cloud/model_profiles.py
new file mode 100644
index 000000000..83d8ea1df
--- /dev/null
+++ b/mcp_cloud/model_profiles.py
@@ -0,0 +1,146 @@
+"""PlanExe MCP Cloud – model profile introspection."""
+import json
+import logging
+import os
+from typing import Any, Optional
+
+from worker_plan_api.model_profile import (
+    ModelProfileEnum,
+    default_filename_for_profile,
+    resolve_model_profile_from_env,
+)
+from worker_plan_api.planexe_config import PlanExeConfig
+from worker_plan_api.llm_class_filter import (
+    ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES,
+    is_llm_class_allowed,
+    parse_llm_class_whitelist,
+)
+
+from mcp_cloud.db_setup import MODEL_PROFILE_TITLES, MODEL_PROFILE_SUMMARIES
+
+logger = logging.getLogger(__name__)
+
+
+def _sort_llm_config_entries(items: list[tuple[str, Any]]) -> list[tuple[str, Any]]:
+    def sort_key(item: tuple[str, Any]) -> tuple[int, str]:
+        key, model_data = item
+        priority = None
+        if isinstance(model_data, dict):
+            maybe_priority = model_data.get("priority")
+            if isinstance(maybe_priority, int):
+                priority = maybe_priority
+        if priority is None:
+            priority = 999999
+        return priority, key
+
+    return sorted(items, key=sort_key)
+
+
+def _extract_model_profile_entries(
+    model_map: dict[str, Any],
+    whitelist: Optional[set[str]],
+) -> list[dict[str, Any]]:
+    models: list[dict[str, Any]] = []
+
+    for model_key, model_data in _sort_llm_config_entries(list(model_map.items())):
+        class_name = model_data.get("class") if isinstance(model_data, dict) else None
+        if not is_llm_class_allowed(class_name, whitelist):
+            continue
+
+        model_name = None
+        priority = None
+        if isinstance(model_data, dict):
+            arguments = model_data.get("arguments")
+            if isinstance(arguments, dict):
+                maybe_model = arguments.get("model")
+                if isinstance(maybe_model, str):
+                    model_name = maybe_model
+            maybe_priority = model_data.get("priority")
+            if isinstance(maybe_priority, int):
+                priority = maybe_priority
+            elif isinstance(model_data.get("prio"), int):
+                priority = model_data["prio"]
+
+        models.append(
+            {
+                "key": model_key,
+                "provider_class": class_name if isinstance(class_name, str) else None,
+                "model": model_name,
+                "priority": priority,
+            }
+        )
+
+    return models
+
+
+def _profile_models_payload(
+    profile: ModelProfileEnum,
+    whitelist: Optional[set[str]],
+) -> dict[str, Any]:
+    config_filename = default_filename_for_profile(profile)
+    planexe_config_path = PlanExeConfig.resolve_planexe_config_path()
+    config_path = PlanExeConfig.find_file_in_search_order(config_filename, planexe_config_path)
+    if config_path is None:
+        return {
+            "profile": profile.value,
+            "title": MODEL_PROFILE_TITLES[profile.value],
+            "summary": MODEL_PROFILE_SUMMARIES[profile.value],
+            "model_count": 0,
+            "models": [],
+        }
+
+    try:
+        with config_path.open("r", encoding="utf-8") as fh:
+            model_map = json.load(fh)
+    except Exception as exc:
+        logger.warning(
+            "Unable to read profile config %s for model profile %s: %s",
+            config_filename,
+            profile.value,
+            exc,
+        )
+        return {
+            "profile": profile.value,
+            "title": MODEL_PROFILE_TITLES[profile.value],
+            "summary": MODEL_PROFILE_SUMMARIES[profile.value],
+            "model_count": 0,
+            "models": [],
+        }
+
+    if not isinstance(model_map, dict):
+        return {
+            "profile": profile.value,
+            "title": MODEL_PROFILE_TITLES[profile.value],
+            "summary": MODEL_PROFILE_SUMMARIES[profile.value],
+            "model_count": 0,
+            "models": [],
+        }
+
+    models = _extract_model_profile_entries(model_map, whitelist)
+    return {
+        "profile": profile.value,
+        "title": MODEL_PROFILE_TITLES[profile.value],
+        "summary": MODEL_PROFILE_SUMMARIES[profile.value],
+        "model_count": len(models),
+        "models": models,
+    }
+
+
+def _get_model_profiles_sync() -> dict[str, Any]:
+    raw_whitelist = os.environ.get(ENV_PLANEXE_LLM_CONFIG_WHITELISTED_CLASSES)
+    whitelist = parse_llm_class_whitelist(raw_whitelist)
+    default_profile = resolve_model_profile_from_env().value
+    profiles_all = [
+        _profile_models_payload(profile, whitelist)
+        for profile in ModelProfileEnum
+    ]
+    profiles = [profile for profile in profiles_all if int(profile.get("model_count") or 0) > 0]
+
+    return {
+        "default_profile": default_profile,
+        "profiles": profiles,
+        "message": (
+            "Use one of these profile values in plan_create.model_profile. "
+            "Model lists show what is currently available in each profile."
+        ),
+    }
diff --git a/mcp_cloud/prompt_examples.py b/mcp_cloud/prompt_examples.py
new file mode 100644
index 000000000..e55bf5ae1
--- /dev/null
+++ b/mcp_cloud/prompt_examples.py
@@ -0,0 +1,69 @@
+"""PlanExe MCP Cloud – example prompt loading."""
+import logging
+from pathlib import Path
+
+logger = logging.getLogger(__name__)
+
+
+def _load_mcp_example_prompts() -> list[str]:
+    """Load prompts from the catalog that are marked as MCP examples (mcp_example or mcp-example-prompt true).
+
+    Uses worker_plan_api.PromptCatalog the same way as frontend_single_user and frontend_multi_user
+    (no env var). Tries repo-root import first, then adds worker_plan to sys.path so worker_plan_api
+    is top-level (same as frontends). Falls back to built-in examples if the catalog is unavailable.
+    """
+    catalog = None
+    try:
+        from worker_plan.worker_plan_api.prompt_catalog import PromptCatalog
+
+        catalog = PromptCatalog()
+        catalog.load_simple_plan_prompts()
+    except Exception:
+        try:
+            # Mirror the frontends: prefer worker_plan/ when it exists; otherwise (e.g. Docker) worker_plan_api sits directly under repo_root.
+            import sys
+
+            repo_root = Path(__file__).resolve().parent.parent
+            worker_plan_dir = repo_root / "worker_plan"
+            path_to_add = str(worker_plan_dir if worker_plan_dir.exists() else repo_root)
+            if path_to_add not in sys.path:
+                sys.path.insert(0, path_to_add)
+            from worker_plan_api.prompt_catalog import PromptCatalog
+
+            catalog = PromptCatalog()
+            catalog.load_simple_plan_prompts()
+        except Exception as e:
+            logger.warning(
+                "Prompt catalog unavailable (%s); using built-in examples.",
+                e,
+            )
+            return _builtin_mcp_example_prompts()
+
+    if catalog is None:
+        return _builtin_mcp_example_prompts()
+
+    samples: list[str] = []
+    for item in catalog.all():
+        if item.extras.get("mcp_example") is True or item.extras.get("mcp-example-prompt") is True:
+            samples.append(item.prompt)
+    if not samples:
+        return _builtin_mcp_example_prompts()
+    return samples
+
+
+def _builtin_mcp_example_prompts() -> list[str]:
+    """Fallback example prompts when the catalog file is missing or has no mcp_example entries."""
+    return [
+        (
+            "Vegan Butcher Shop. That sells artificial meat (Plant-Based). Location Kødbyen, Copenhagen. "
+            "Sell sandwiches and sausages. Provocative marketing. Budget: 10 million DKK. Grand Opening in month 3. "
+            "Profitability Goal: month 12. Create a signature item that is a social media hit. "
+            "Pick a realistic scenario. I already have negotiated a 2 year lease inside Kødbyen. "
+            "Banned words: blockchain, VR, AR, AI, Robots."
+        ),
+        (
+            "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. "
+            "Budget 2.5M DKK. Open within 12 months. Include equipment, staffing, permits, and marketing. "
+            "Pick a realistic scenario; avoid overly ambitious timelines."
+        ),
+    ]
diff --git a/mcp_cloud/schemas.py b/mcp_cloud/schemas.py
new file mode 100644
index 000000000..2c2f98077
--- /dev/null
+++ b/mcp_cloud/schemas.py
@@ -0,0 +1,239 @@
+"""PlanExe MCP Cloud – tool schema constants and ToolDefinition."""
+from dataclasses import dataclass
+from typing import Any, Optional
+
+from mcp_cloud.tool_models import (
+    ModelProfilesInput,
+    ModelProfilesOutput,
+    PromptExamplesInput,
+    PromptExamplesOutput,
+    PlanCreateInput,
+    PlanCreateOutput,
+    PlanRetryInput,
+    PlanRetryOutput,
+    PlanStopOutput,
+    PlanStatusInput,
+    PlanStopInput,
+    PlanFileInfoInput,
+    PlanFileInfoNotReadyOutput,
+    PlanStatusSuccess,
+    PlanFileInfoReadyOutput,
+    PlanListInput,
+    PlanListOutput,
+    ErrorDetail,
+)
+
+PLAN_CREATE_INPUT_SCHEMA = PlanCreateInput.model_json_schema()
+PLAN_CREATE_OUTPUT_SCHEMA = PlanCreateOutput.model_json_schema()
+PLAN_STATUS_SUCCESS_SCHEMA = PlanStatusSuccess.model_json_schema()
+PLAN_STATUS_OUTPUT_SCHEMA = {
+    "oneOf": [
+        {
+            "type": "object",
+            "properties": {"error": ErrorDetail.model_json_schema()},
+            "required": ["error"],
+        },
+        PLAN_STATUS_SUCCESS_SCHEMA,
+    ]
+}
+PLAN_STOP_OUTPUT_SCHEMA = PlanStopOutput.model_json_schema()
+PLAN_RETRY_OUTPUT_SCHEMA = PlanRetryOutput.model_json_schema()
+PLAN_FILE_INFO_READY_OUTPUT_SCHEMA = PlanFileInfoReadyOutput.model_json_schema()
+PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA = PlanFileInfoNotReadyOutput.model_json_schema()
+PLAN_FILE_INFO_OUTPUT_SCHEMA = {
+    "oneOf": [
+        {
+            "type": "object",
+            "properties": {"error": ErrorDetail.model_json_schema()},
+            "required": ["error"],
+        },
+        PLAN_FILE_INFO_NOT_READY_OUTPUT_SCHEMA,
+        PLAN_FILE_INFO_READY_OUTPUT_SCHEMA,
+    ]
+}
+PLAN_STATUS_INPUT_SCHEMA = PlanStatusInput.model_json_schema()
+PLAN_STOP_INPUT_SCHEMA = PlanStopInput.model_json_schema()
+PLAN_RETRY_INPUT_SCHEMA = PlanRetryInput.model_json_schema()
+PLAN_FILE_INFO_INPUT_SCHEMA = PlanFileInfoInput.model_json_schema()
+
+PROMPT_EXAMPLES_INPUT_SCHEMA = PromptExamplesInput.model_json_schema()
+PROMPT_EXAMPLES_OUTPUT_SCHEMA = PromptExamplesOutput.model_json_schema()
+MODEL_PROFILES_INPUT_SCHEMA = ModelProfilesInput.model_json_schema()
+MODEL_PROFILES_OUTPUT_SCHEMA = ModelProfilesOutput.model_json_schema()
+PLAN_LIST_INPUT_SCHEMA = PlanListInput.model_json_schema()
+PLAN_LIST_OUTPUT_SCHEMA = PlanListOutput.model_json_schema()
+
+
+@dataclass(frozen=True)
+class ToolDefinition:
+    name: str
+    description: str
+    input_schema: dict[str, Any]
+    output_schema: Optional[dict[str, Any]] = None
+    annotations: Optional[dict[str, Any]] = None
+
+TOOL_DEFINITIONS = [
+    ToolDefinition(
+        name="prompt_examples",
+        description=(
+            "Call this first. Returns example prompts that define what a good prompt looks like. "
+            "Do NOT call plan_create yet. Optional before plan_create: call model_profiles to choose model_profile. "
+            "Next is a non-tool step: formulate a detailed prompt (typically ~300-800 words; use examples as a baseline, similar structure) and get user approval. "
+            "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. "
+            "Write the prompt as flowing prose, not structured markdown with headers or bullet lists. "
+            "Weave technical specs, constraints, and targets naturally into sentences. Include banned words/approaches and governance preferences inline. "
+            "The examples demonstrate this prose style — match their tone and density. "
+            "Then call plan_create. "
+            "PlanExe is not for tiny one-shot outputs like a 5-point checklist, and it does not support selecting only some internal pipeline steps."
+        ),
+        input_schema=PROMPT_EXAMPLES_INPUT_SCHEMA,
+        output_schema=PROMPT_EXAMPLES_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": True,
+            "destructiveHint": False,
+            "idempotentHint": True,
+            "openWorldHint": False,
+        },
+    ),
+    ToolDefinition(
+        name="model_profiles",
+        description=(
+            "Optional helper before plan_create. Returns model_profile options with plain-language guidance "
+            "and currently available models in each profile. "
+            "If no models are available, returns error code MODEL_PROFILES_UNAVAILABLE."
+        ),
+        input_schema=MODEL_PROFILES_INPUT_SCHEMA,
+        output_schema=MODEL_PROFILES_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": True,
+            "destructiveHint": False,
+            "idempotentHint": True,
+            "openWorldHint": False,
+        },
+    ),
+    ToolDefinition(
+        name="plan_create",
+        description=(
+            "Call only after prompt_examples and after you have completed prompt drafting/approval (non-tool step). "
+            "PlanExe turns the approved prompt into a strategic project-plan draft (20+ sections) in ~10-20 min. "
+            "Sections include: executive summary, interactive Gantt charts, investor pitch, project plan with SMART criteria, "
+            "strategic decision analysis, scenario comparison, assumptions with expert review, governance structure, "
+            "SWOT analysis, team role profiles, simulated expert criticism, work breakdown structure, "
+            "plan review (critical issues, KPIs, financial strategy, automation opportunities), Q&A, "
+            "premortem with failure scenarios, self-audit checklist, and adversarial premise attacks that argue against the project. "
+            "The adversarial sections (premortem, self-audit, premise attacks) surface risks and questions the prompter may not have considered. "
+            "Returns plan_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_file_info. "
+            "If you lose a plan_id, call plan_list to recover it. "
+            "Each plan_create call creates a new plan_id (no server-side dedup). "
+            "If you are unsure which model_profile to choose, call model_profiles first. "
+            "If your deployment uses credits, include user_api_key to charge the correct account. "
+            "Common error codes: INVALID_USER_API_KEY, USER_API_KEY_REQUIRED, INSUFFICIENT_CREDITS."
+        ),
+        input_schema=PLAN_CREATE_INPUT_SCHEMA,
+        output_schema=PLAN_CREATE_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": False,
+            "destructiveHint": False,
+            "idempotentHint": False,
+            "openWorldHint": True,
+        },
+    ),
+    ToolDefinition(
+        name="plan_status",
+        description=(
+            "Returns status and progress of the plan currently being created. "
+            "Poll at reasonable intervals only (e.g. every 5 minutes): plan generation typically takes 10-20 minutes "
+            "(baseline profile) and may take longer on higher-quality profiles. "
+            "State contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. "
+            "progress_percentage is 0-100 (integer-like float); 100 when completed. "
+            "files lists intermediate outputs produced so far; use their updated_at timestamps to detect stalls. "
+            "Unknown plan_id returns error code PLAN_NOT_FOUND. "
+            "Troubleshooting: pending for >5 minutes likely means queued but not picked up by a worker. "
+            "processing with no file-output changes for >20 minutes likely means failed/stalled. "
+            "Report these issues to https://github.com/PlanExeOrg/PlanExe/issues ."
+        ),
+        input_schema=PLAN_STATUS_INPUT_SCHEMA,
+        output_schema=PLAN_STATUS_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": True,
+            "destructiveHint": False,
+            "idempotentHint": True,
+            "openWorldHint": False,
+        },
+    ),
+    ToolDefinition(
+        name="plan_stop",
+        description=(
+            "Request the plan generation to stop. Pass the plan_id (the UUID returned by plan_create). "
+            "Stopping is asynchronous: the stop flag is set immediately but the plan may continue briefly before halting. "
+            "A stopped plan will eventually transition to the failed state. "
+            "If the plan is already completed or failed, stop_requested returns false (the plan already finished). "
+            "Unknown plan_id returns error code PLAN_NOT_FOUND."
+        ),
+        input_schema=PLAN_STOP_INPUT_SCHEMA,
+        output_schema=PLAN_STOP_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": False,
+            "destructiveHint": True,
+            "idempotentHint": True,
+            "openWorldHint": False,
+        },
+    ),
+    ToolDefinition(
+        name="plan_retry",
+        description=(
+            "Retry a plan that is currently in failed state. "
+            "Pass the failed plan_id and optionally model_profile (defaults to baseline). "
+            "The plan is reset to pending, prior artifacts are cleared, and the same plan_id is requeued for processing. "
+            "Returns PLAN_NOT_FOUND when plan_id is unknown and PLAN_NOT_FAILED when the plan is not in failed state."
+        ),
+        input_schema=PLAN_RETRY_INPUT_SCHEMA,
+        output_schema=PLAN_RETRY_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": False,
+            "destructiveHint": False,
+            "idempotentHint": False,
+            "openWorldHint": True,
+        },
+    ),
+    ToolDefinition(
+        name="plan_file_info",
+        description=(
+            "Returns file metadata (content_type, download_url, download_size) for the report or zip artifact. "
+            "Use artifact='report' (default) for the interactive HTML report (~700KB, self-contained with embedded JS "
+            "for collapsible sections and interactive Gantt charts — open in a browser). "
+            "Use artifact='zip' for the full pipeline output bundle (md, json, csv intermediary files that fed the report). "
+            "While the plan is still pending or processing, returns {ready:false,reason:\"processing\"}. "
+            "Check readiness by testing whether download_url is present in the response. "
+            "Once ready, present download_url to the user or fetch and save the file locally. "
+            "If your client exposes plan_download (e.g. mcp_local), prefer that to save the file locally. "
+            "Terminal error codes: generation_failed (plan failed), content_unavailable (artifact missing). "
+            "Unknown plan_id returns error code PLAN_NOT_FOUND."
+        ),
+        input_schema=PLAN_FILE_INFO_INPUT_SCHEMA,
+        output_schema=PLAN_FILE_INFO_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": True,
+            "destructiveHint": False,
+            "idempotentHint": True,
+            "openWorldHint": False,
+        },
+    ),
+    ToolDefinition(
+        name="plan_list",
+        description=(
+            "List the most recent plans for an authenticated user. "
+            "Returns up to `limit` plans (default 10, max 50) newest-first, each with plan_id, state, "
+            "progress_percentage, created_at (ISO 8601), and a prompt_excerpt (first 100 chars). "
+            "Use this to recover a lost plan_id or to review recent activity."
+        ),
+        input_schema=PLAN_LIST_INPUT_SCHEMA,
+        output_schema=PLAN_LIST_OUTPUT_SCHEMA,
+        annotations={
+            "readOnlyHint": True,
+            "destructiveHint": False,
+            "idempotentHint": True,
+            "openWorldHint": False,
+        },
+    ),
+]
diff --git a/mcp_cloud/tests/test_download_rate_limit.py b/mcp_cloud/tests/test_download_rate_limit.py
new file mode 100644
index 000000000..03eb8417a
--- /dev/null
+++ b/mcp_cloud/tests/test_download_rate_limit.py
@@ -0,0 +1,60 @@
+import asyncio
+import unittest
+from unittest.mock import MagicMock
+
+import mcp_cloud.http_server as http_server
+
+
+def _fake_request(path: str, client_host: str = "10.0.0.1") -> MagicMock:
+    request = MagicMock()
+    request.url.path = path
+    request.headers = {}
+    request.client.host = client_host
+    return request
+
+
+class TestDownloadRateLimit(unittest.TestCase):
+    def setUp(self):
+        """Clear download rate buckets between tests."""
+        http_server._download_rate_buckets.clear()
+
+    def test_non_download_path_is_not_rate_limited(self):
+        request = _fake_request("/mcp/tools/call")
+        result = asyncio.run(http_server._enforce_download_rate_limit(request))
+        self.assertIsNone(result)
+
+    def test_download_path_is_rate_limited(self):
+        request = _fake_request("/download/abc-123/030-report.html")
+        for _ in range(http_server.DOWNLOAD_RATE_LIMIT_REQUESTS):
+            result = asyncio.run(http_server._enforce_download_rate_limit(request))
+            self.assertIsNone(result)
+        # Next request should be rejected
+        result = asyncio.run(http_server._enforce_download_rate_limit(request))
+        self.assertIsNotNone(result)
+        self.assertEqual(result.status_code, 429)
+
+    def test_different_clients_have_separate_buckets(self):
+        req_a = _fake_request("/download/abc/030-report.html", client_host="10.0.0.1")
+        req_b = _fake_request("/download/abc/030-report.html", client_host="10.0.0.2")
+        for _ in range(http_server.DOWNLOAD_RATE_LIMIT_REQUESTS):
+            asyncio.run(http_server._enforce_download_rate_limit(req_a))
+        # Client A is exhausted
+        result_a = asyncio.run(http_server._enforce_download_rate_limit(req_a))
+        self.assertIsNotNone(result_a)
+        # Client B still has quota
+        result_b = asyncio.run(http_server._enforce_download_rate_limit(req_b))
+        self.assertIsNone(result_b)
+
+    def test_disabled_when_limit_is_zero(self):
+        request = _fake_request("/download/abc/030-report.html")
+        original = http_server.DOWNLOAD_RATE_LIMIT_REQUESTS
+        try:
+            http_server.DOWNLOAD_RATE_LIMIT_REQUESTS = 0
+            result = asyncio.run(http_server._enforce_download_rate_limit(request))
+            self.assertIsNone(result)
+        finally:
+            http_server.DOWNLOAD_RATE_LIMIT_REQUESTS = original
+
+
+if __name__ == "__main__":
+    unittest.main()
diff --git a/mcp_cloud/tests/test_download_token.py b/mcp_cloud/tests/test_download_token.py
index c3b62471c..cd23983e8 100644
--- a/mcp_cloud/tests/test_download_token.py
+++ b/mcp_cloud/tests/test_download_token.py
@@ -3,13 +3,14 @@
 from unittest.mock import patch
 
 import mcp_cloud.app as cloud_app
+import mcp_cloud.download_tokens as _dt_mod
 
 
 class TestGenerateAndValidateDownloadToken(unittest.TestCase):
     def setUp(self):
         # Pin the secret so tests are deterministic regardless of env vars.
         self._secret_patch = patch.object(
-            cloud_app,
+            _dt_mod,
             "_get_download_token_secret",
             return_value=b"test-secret-for-unit-tests",
         )
@@ -69,27 +70,27 @@ def test_different_tasks_get_different_tokens(self):
         self.assertNotEqual(t1, t2)
 
     def test_report_url_contains_token(self):
-        with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"):
+        with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"):
             url = cloud_app.build_report_download_url("task-abc")
         self.assertIsNotNone(url)
         self.assertIn("?token=", url)
         self.assertIn("/download/task-abc/030-report.html", url)
 
     def test_zip_url_contains_token(self):
-        with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"):
+        with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"):
             url = cloud_app.build_zip_download_url("task-abc")
         self.assertIsNotNone(url)
         self.assertIn("?token=", url)
         self.assertIn("/download/task-abc/run.zip", url)
 
     def test_token_embedded_in_report_url_is_valid(self):
-        with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"):
+        with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"):
             url = cloud_app.build_report_download_url("task-abc")
         token = url.split("?token=")[1]
         self.assertTrue(cloud_app.validate_download_token(token, "task-abc", "030-report.html"))
 
     def test_token_embedded_in_zip_url_is_valid(self):
-        with patch.object(cloud_app, "_get_download_base_url", return_value="https://example.com"):
+        with patch.object(_dt_mod, "_get_download_base_url", return_value="https://example.com"):
             url = cloud_app.build_zip_download_url("task-abc")
         token = url.split("?token=")[1]
         self.assertTrue(cloud_app.validate_download_token(token, "task-abc", "run.zip"))
diff --git a/mcp_cloud/tests/test_model_profiles_tool.py b/mcp_cloud/tests/test_model_profiles_tool.py
index c8650a352..4838c7b0e 100644
--- a/mcp_cloud/tests/test_model_profiles_tool.py
+++ b/mcp_cloud/tests/test_model_profiles_tool.py
@@ -33,7 +33,7 @@ def test_model_profiles_returns_structured_content(self):
             "message": "Use one of these profile values in plan_create.model_profile.",
         }
 
-        with patch("mcp_cloud.app._get_model_profiles_sync", return_value=payload):
+        with patch("mcp_cloud.handlers._get_model_profiles_sync", return_value=payload):
             result = asyncio.run(handle_model_profiles({}))
 
         self.assertFalse(result.isError)
@@ -48,7 +48,7 @@ def test_model_profiles_returns_error_when_none_available(self):
             "message": "Use one of these profile values in plan_create.model_profile.",
         }
 
-        with patch("mcp_cloud.app._get_model_profiles_sync", return_value=payload):
+        with patch("mcp_cloud.handlers._get_model_profiles_sync", return_value=payload):
             result = asyncio.run(handle_model_profiles({}))
 
         self.assertTrue(result.isError)
diff --git a/mcp_cloud/tests/test_task_create_tool.py b/mcp_cloud/tests/test_plan_create_tool.py
similarity index 84%
rename from mcp_cloud/tests/test_task_create_tool.py
rename to mcp_cloud/tests/test_plan_create_tool.py
index f4a278182..8f429e4f2 100644
--- a/mcp_cloud/tests/test_task_create_tool.py
+++ b/mcp_cloud/tests/test_plan_create_tool.py
@@ -29,18 +29,18 @@ def __init__(self, prompt: str, state, user_id: str, parameters):
                 self.parameters = parameters
                 self.timestamp_created = datetime.now(UTC)
 
-        with patch("mcp_cloud.app.app.app_context", return_value=nullcontext()), patch(
-            "mcp_cloud.app.db.session", fake_session
+        with patch("mcp_cloud.db_queries.app.app_context", return_value=nullcontext()), patch(
+            "mcp_cloud.db_queries.db.session", fake_session
         ), patch(
-            "mcp_cloud.app.PlanItem", StubPlanItem
+            "mcp_cloud.db_queries.PlanItem", StubPlanItem
         ):
             result = asyncio.run(handle_plan_create(arguments))
 
         self.assertIsInstance(result, CallToolResult)
         self.assertIsInstance(result.structuredContent, dict)
-        self.assertIn("task_id", result.structuredContent)
+        self.assertIn("plan_id", result.structuredContent)
         self.assertIn("created_at", result.structuredContent)
-        self.assertIsInstance(uuid.UUID(result.structuredContent["task_id"]), uuid.UUID)
+        self.assertIsInstance(uuid.UUID(result.structuredContent["plan_id"]), uuid.UUID)
 
 
 if __name__ == "__main__":
diff --git a/mcp_cloud/tests/test_task_file_info_tool.py b/mcp_cloud/tests/test_plan_file_info_tool.py
similarity index 76%
rename from mcp_cloud/tests/test_task_file_info_tool.py
rename to mcp_cloud/tests/test_plan_file_info_tool.py
index 7f7d2b835..3af33f2b6 100644
--- a/mcp_cloud/tests/test_task_file_info_tool.py
+++ b/mcp_cloud/tests/test_plan_file_info_tool.py
@@ -40,17 +40,17 @@ def test_zip_helpers(self):
     def test_report_read_defaults_to_metadata(self):
         task_id = str(uuid.uuid4())
         content_bytes = b"a" * 10
-        task_snapshot = {
+        plan_snapshot = {
             "id": "task-id",
             "state": PlanState.completed,
             "progress_message": None,
         }
-        with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot):
+        with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot):
             with patch(
-                "mcp_cloud.app.fetch_artifact_from_worker_plan",
+                "mcp_cloud.handlers.fetch_artifact_from_worker_plan",
                 new=AsyncMock(return_value=content_bytes),
             ):
-                result = asyncio.run(handle_plan_file_info({"task_id": task_id}))
+                result = asyncio.run(handle_plan_file_info({"plan_id": task_id}))
 
         payload = result.structuredContent
         self.assertEqual(payload["download_size"], len(content_bytes))
@@ -62,17 +62,17 @@ def test_report_read_defaults_to_metadata(self):
     def test_report_read_zip(self):
         task_id = str(uuid.uuid4())
         content_bytes = b"zipdata"
-        task_snapshot = {
+        plan_snapshot = {
             "id": "task-id",
             "state": PlanState.completed,
             "progress_message": None,
         }
-        with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot):
+        with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot):
             with patch(
-                "mcp_cloud.app.fetch_user_downloadable_zip",
+                "mcp_cloud.handlers.fetch_user_downloadable_zip",
                 new=AsyncMock(return_value=content_bytes),
             ):
-                result = asyncio.run(handle_plan_file_info({"task_id": task_id, "artifact": "zip"}))
+                result = asyncio.run(handle_plan_file_info({"plan_id": task_id, "artifact": "zip"}))
 
         payload = result.structuredContent
         self.assertEqual(payload["download_size"], len(content_bytes))
@@ -81,17 +81,17 @@ def test_report_read_zip(self):
     def test_report_read_zip_for_failed_task(self):
         task_id = str(uuid.uuid4())
         content_bytes = b"zipdata"
-        task_snapshot = {
+        plan_snapshot = {
             "id": "task-id",
             "state": PlanState.failed,
             "progress_message": "Stopped",
         }
-        with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot):
+        with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot):
             with patch(
-                "mcp_cloud.app.fetch_user_downloadable_zip",
+                "mcp_cloud.handlers.fetch_user_downloadable_zip",
                 new=AsyncMock(return_value=content_bytes),
             ):
-                result = asyncio.run(handle_plan_file_info({"task_id": task_id, "artifact": "zip"}))
+                result = asyncio.run(handle_plan_file_info({"plan_id": task_id, "artifact": "zip"}))
 
         payload = result.structuredContent
         self.assertEqual(payload["download_size"], len(content_bytes))
@@ -99,26 +99,26 @@ def test_report_read_zip_for_failed_task(self):
 
     def test_plan_file_info_returns_empty_object_when_pending(self):
         task_id = str(uuid.uuid4())
-        task_snapshot = {
+        plan_snapshot = {
             "id": "task-id",
             "state": PlanState.pending,
             "progress_message": None,
         }
-        with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot):
-            result = asyncio.run(handle_plan_file_info({"task_id": task_id}))
+        with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot):
+            result = asyncio.run(handle_plan_file_info({"plan_id": task_id}))
 
         self.assertFalse(result.isError)
-        self.assertEqual(result.structuredContent, {})
+        self.assertEqual(result.structuredContent, {"ready": False, "reason": "processing"})
 
     def test_plan_file_info_returns_generation_failed_payload(self):
         task_id = str(uuid.uuid4())
-        task_snapshot = {
+        plan_snapshot = {
             "id": "task-id",
             "state": PlanState.failed,
             "progress_message": "Pipeline failed",
         }
-        with patch("mcp_cloud.app._get_task_for_report_sync", return_value=task_snapshot):
-            result = asyncio.run(handle_plan_file_info({"task_id": task_id, "artifact": "report"}))
+        with patch("mcp_cloud.handlers._get_plan_for_report_sync", return_value=plan_snapshot):
+            result = asyncio.run(handle_plan_file_info({"plan_id": task_id, "artifact": "report"}))
 
         self.assertFalse(result.isError)
         self.assertEqual(result.structuredContent["error"]["code"], "generation_failed")
diff --git a/mcp_cloud/tests/test_plan_list_tool.py b/mcp_cloud/tests/test_plan_list_tool.py
new file mode 100644
index 000000000..4d7b46b1c
--- /dev/null
+++ b/mcp_cloud/tests/test_plan_list_tool.py
@@ -0,0 +1,99 @@
+import asyncio
+import unittest
+from unittest.mock import patch
+
+from mcp.types import CallToolResult
+from mcp_cloud.app import handle_list_tools, handle_plan_list
+
+
+class TestPlanListTool(unittest.TestCase):
+    def test_plan_list_tool_listed(self):
+        tools = asyncio.run(handle_list_tools())
+        tool_names = {tool.name for tool in tools}
+        self.assertIn("plan_list", tool_names)
+
+    def test_plan_list_returns_plans(self):
+        fake_plans = [
+            {
+                "plan_id": "aaa-111",
+                "state": "completed",
+                "progress_percentage": 100.0,
+                "created_at": "2026-01-01T00:00:00Z",
+                "prompt_excerpt": "Build a rocket",
+            },
+            {
+                "plan_id": "bbb-222",
+                "state": "processing",
+                "progress_percentage": 42.0,
+                "created_at": "2026-01-02T00:00:00Z",
+                "prompt_excerpt": "Open a bakery",
+            },
+        ]
+        user_context = {"user_id": "user-1", "credits_balance": 10.0}
+        with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \
+             patch("mcp_cloud.handlers._list_plans_sync", return_value=fake_plans):
+            result = asyncio.run(handle_plan_list({"user_api_key": "pex_test", "limit": 10}))
+
+        self.assertIsInstance(result, CallToolResult)
+        self.assertFalse(result.isError)
+        self.assertEqual(len(result.structuredContent["plans"]), 2)
+        self.assertIn("Returned 2 plan(s)", result.structuredContent["message"])
+
+    def test_plan_list_empty_result(self):
+        user_context = {"user_id": "user-1", "credits_balance": 10.0}
+        with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \
+             patch("mcp_cloud.handlers._list_plans_sync", return_value=[]):
+            result = asyncio.run(handle_plan_list({"user_api_key": "pex_test"}))
+
+        self.assertFalse(result.isError)
+        self.assertEqual(result.structuredContent["plans"], [])
+        self.assertIn("Returned 0 plan(s)", result.structuredContent["message"])
+
+    def test_plan_list_clamps_limit(self):
+        """Limit is clamped to [1, 50]."""
+        user_context = {"user_id": "user-1", "credits_balance": 10.0}
+        with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \
+             patch("mcp_cloud.handlers._list_plans_sync", return_value=[]) as mock_list:
+            asyncio.run(handle_plan_list({"user_api_key": "pex_test", "limit": 999}))
+            limit_arg = mock_list.call_args[0][1]  # second positional arg is the limit
+            self.assertEqual(limit_arg, 50)
+
+            asyncio.run(handle_plan_list({"user_api_key": "pex_test", "limit": -5}))
+            limit_arg = mock_list.call_args[0][1]  # second positional arg is the limit
+            self.assertEqual(limit_arg, 1)
+
+    def test_plan_list_invalid_user_api_key(self):
+        with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=None):
+            result = asyncio.run(handle_plan_list({"user_api_key": "pex_bad"}))
+
+        self.assertTrue(result.isError)
+        self.assertEqual(result.structuredContent["error"]["code"], "INVALID_USER_API_KEY")
+
+    def test_plan_list_requires_key_when_env_set(self):
+        with patch.dict("os.environ", {"PLANEXE_MCP_REQUIRE_USER_KEY": "true"}):
+            result = asyncio.run(handle_plan_list({"limit": 5}))
+
+        self.assertTrue(result.isError)
+        self.assertEqual(result.structuredContent["error"]["code"], "USER_API_KEY_REQUIRED")
+
+    def test_plan_list_no_key_when_not_required(self):
+        """When key is not required and not provided, returns all plans (user_id=None)."""
+        with patch.dict("os.environ", {"PLANEXE_MCP_REQUIRE_USER_KEY": "false"}), \
+             patch("mcp_cloud.handlers._list_plans_sync", return_value=[]) as mock_list:
+            result = asyncio.run(handle_plan_list({"limit": 5}))
+
+        self.assertFalse(result.isError)
+        # user_id should be None
+        self.assertIsNone(mock_list.call_args[0][0])
+
+    def test_plan_list_uses_default_limit(self):
+        user_context = {"user_id": "user-1", "credits_balance": 10.0}
+        with patch("mcp_cloud.handlers._resolve_user_from_api_key", return_value=user_context), \
+             patch("mcp_cloud.handlers._list_plans_sync", return_value=[]) as mock_list:
+            asyncio.run(handle_plan_list({"user_api_key": "pex_test"}))
+            limit_arg = mock_list.call_args[0][1]  # second positional arg is the limit
+            self.assertEqual(limit_arg, 10)
+
+
+if __name__ == "__main__":
+    unittest.main()
diff --git a/mcp_cloud/tests/test_task_retry_tool.py b/mcp_cloud/tests/test_plan_retry_tool.py
similarity index 61%
rename from mcp_cloud/tests/test_task_retry_tool.py
rename to mcp_cloud/tests/test_plan_retry_tool.py
index 1c0ef4c9d..3b1c1bf7a 100644
--- a/mcp_cloud/tests/test_task_retry_tool.py
+++ b/mcp_cloud/tests/test_plan_retry_tool.py
@@ -16,36 +16,36 @@ def test_plan_retry_tool_listed(self):
     def test_plan_retry_returns_structured_content(self):
         task_id = str(uuid.uuid4())
         payload = {
-            "task_id": task_id,
+            "plan_id": task_id,
             "state": "pending",
             "model_profile": "baseline",
             "retried_at": "2026-01-01T00:00:00Z",
         }
-        with patch("mcp_cloud.app._retry_failed_task_sync", return_value=payload):
-            result = asyncio.run(handle_plan_retry({"task_id": task_id}))
+        with patch("mcp_cloud.handlers._retry_failed_plan_sync", return_value=payload):
+            result = asyncio.run(handle_plan_retry({"plan_id": task_id}))
 
         self.assertIsInstance(result, CallToolResult)
         self.assertFalse(result.isError)
-        self.assertEqual(result.structuredContent["task_id"], task_id)
+        self.assertEqual(result.structuredContent["plan_id"], task_id)
         self.assertEqual(result.structuredContent["state"], "pending")
         self.assertEqual(result.structuredContent["model_profile"], "baseline")
 
-    def test_plan_retry_returns_task_not_found(self):
+    def test_plan_retry_returns_plan_not_found(self):
         task_id = str(uuid.uuid4())
-        with patch("mcp_cloud.app._retry_failed_task_sync", return_value=None):
-            result = asyncio.run(handle_plan_retry({"task_id": task_id}))
+        with patch("mcp_cloud.handlers._retry_failed_plan_sync", return_value=None):
+            result = asyncio.run(handle_plan_retry({"plan_id": task_id}))
 
         self.assertTrue(result.isError)
-        self.assertEqual(result.structuredContent["error"]["code"], "TASK_NOT_FOUND")
+        self.assertEqual(result.structuredContent["error"]["code"], "PLAN_NOT_FOUND")
 
-    def test_plan_retry_returns_task_not_failed(self):
+    def test_plan_retry_returns_plan_not_failed(self):
         task_id = str(uuid.uuid4())
-        payload = {"error": {"code": "TASK_NOT_FAILED", "message": "Task is not failed."}}
-        with patch("mcp_cloud.app._retry_failed_task_sync", return_value=payload):
-            result = asyncio.run(handle_plan_retry({"task_id": task_id}))
+        payload = {"error": {"code": "PLAN_NOT_FAILED", "message": "Plan is not failed."}}
+        with patch("mcp_cloud.handlers._retry_failed_plan_sync", return_value=payload):
+            result = asyncio.run(handle_plan_retry({"plan_id": task_id}))
 
         self.assertTrue(result.isError)
-        self.assertEqual(result.structuredContent["error"]["code"], "TASK_NOT_FAILED")
+        self.assertEqual(result.structuredContent["error"]["code"], "PLAN_NOT_FAILED")
 
 
 if __name__ == "__main__":
diff --git a/mcp_cloud/tests/test_task_status_tool.py b/mcp_cloud/tests/test_plan_status_tool.py
similarity index 66%
rename from mcp_cloud/tests/test_task_status_tool.py
rename to mcp_cloud/tests/test_plan_status_tool.py
index bff6aa3f5..6d44f0f27 100644
--- a/mcp_cloud/tests/test_task_status_tool.py
+++ b/mcp_cloud/tests/test_plan_status_tool.py
@@ -12,7 +12,7 @@
 class TestPlanStatusTool(unittest.TestCase):
     def test_plan_status_returns_structured_content(self):
         task_id = str(uuid.uuid4())
-        task_snapshot = {
+        plan_snapshot = {
             "id": task_id,
             "state": PlanState.completed,
             "stop_requested": False,
@@ -20,16 +20,16 @@ def test_plan_status_returns_structured_content(self):
             "timestamp_created": datetime.now(UTC),
         }
         with patch(
-            "mcp_cloud.app._get_task_status_snapshot_sync",
-            return_value=task_snapshot,
+            "mcp_cloud.handlers._get_plan_status_snapshot_sync",
+            return_value=plan_snapshot,
         ), patch(
-            "mcp_cloud.app.fetch_file_list_from_worker_plan", new=AsyncMock(return_value=[])
+            "mcp_cloud.handlers.fetch_file_list_from_worker_plan", new=AsyncMock(return_value=[])
         ):
-            result = asyncio.run(handle_plan_status({"task_id": task_id}))
+            result = asyncio.run(handle_plan_status({"plan_id": task_id}))
 
         self.assertIsInstance(result, CallToolResult)
         self.assertIsInstance(result.structuredContent, dict)
-        self.assertEqual(result.structuredContent["task_id"], task_id)
+        self.assertEqual(result.structuredContent["plan_id"], task_id)
         self.assertIn("state", result.structuredContent)
         self.assertIn("progress_percentage", result.structuredContent)
         self.assertIsInstance(result.structuredContent["progress_percentage"], float)
@@ -37,7 +37,7 @@ def test_plan_status_returns_structured_content(self):
 
     def test_plan_status_falls_back_to_zip_snapshot_files_when_primary_source_empty(self):
         task_id = str(uuid.uuid4())
-        task_snapshot = {
+        plan_snapshot = {
             "id": task_id,
             "state": PlanState.processing,
             "stop_requested": False,
@@ -45,19 +45,19 @@ def test_plan_status_falls_back_to_zip_snapshot_files_when_primary_source_empty(
             "timestamp_created": datetime.now(UTC),
         }
         with patch(
-            "mcp_cloud.app._get_task_status_snapshot_sync",
-            return_value=task_snapshot,
+            "mcp_cloud.handlers._get_plan_status_snapshot_sync",
+            return_value=plan_snapshot,
         ), patch(
-            "mcp_cloud.app.fetch_file_list_from_worker_plan",
+            "mcp_cloud.handlers.fetch_file_list_from_worker_plan",
             new=AsyncMock(return_value=[]),
         ), patch(
-            "mcp_cloud.app.list_files_from_zip_snapshot",
+            "mcp_cloud.handlers.list_files_from_zip_snapshot",
             return_value=["001-2-plan.txt", "log.txt"],
         ), patch(
-            "mcp_cloud.app.list_files_from_local_run_dir",
+            "mcp_cloud.handlers.list_files_from_local_run_dir",
             return_value=None,
         ):
-            result = asyncio.run(handle_plan_status({"task_id": task_id}))
+            result = asyncio.run(handle_plan_status({"plan_id": task_id}))
 
         files = result.structuredContent["files"]
         self.assertEqual(len(files), 1)
@@ -65,7 +65,7 @@ def test_plan_status_falls_back_to_zip_snapshot_files_when_primary_source_empty(
 
     def test_plan_status_uses_processing_state_name(self):
         task_id = str(uuid.uuid4())
-        task_snapshot = {
+        plan_snapshot = {
             "id": task_id,
             "state": PlanState.processing,
             "stop_requested": True,
@@ -73,23 +73,23 @@ def test_plan_status_uses_processing_state_name(self):
             "timestamp_created": datetime.now(UTC),
         }
         with patch(
-            "mcp_cloud.app._get_task_status_snapshot_sync",
-            return_value=task_snapshot,
+            "mcp_cloud.handlers._get_plan_status_snapshot_sync",
+            return_value=plan_snapshot,
         ), patch(
-            "mcp_cloud.app.fetch_file_list_from_worker_plan",
+            "mcp_cloud.handlers.fetch_file_list_from_worker_plan",
             new=AsyncMock(return_value=[]),
         ):
-            result = asyncio.run(handle_plan_status({"task_id": task_id}))
+            result = asyncio.run(handle_plan_status({"plan_id": task_id}))
 
         self.assertEqual(result.structuredContent["state"], "processing")
 
-    def test_plan_status_returns_task_not_found_error(self):
+    def test_plan_status_returns_plan_not_found_error(self):
         task_id = str(uuid.uuid4())
-        with patch("mcp_cloud.app._get_task_status_snapshot_sync", return_value=None):
-            result = asyncio.run(handle_plan_status({"task_id": task_id}))
+        with patch("mcp_cloud.handlers._get_plan_status_snapshot_sync", return_value=None):
+            result = asyncio.run(handle_plan_status({"plan_id": task_id}))
 
         self.assertTrue(result.isError)
-        self.assertEqual(result.structuredContent["error"]["code"], "TASK_NOT_FOUND")
+        self.assertEqual(result.structuredContent["error"]["code"], "PLAN_NOT_FOUND")
 
 
 if __name__ == "__main__":
diff --git a/mcp_cloud/tests/test_secret_validation.py b/mcp_cloud/tests/test_secret_validation.py
new file mode 100644
index 000000000..3c1222842
--- /dev/null
+++ b/mcp_cloud/tests/test_secret_validation.py
@@ -0,0 +1,38 @@
+"""Tests for startup secret validation (4.1 fail-hard on missing secrets)."""
+import unittest
+from unittest.mock import patch
+
+from mcp_cloud.auth import validate_api_key_secret
+from mcp_cloud.download_tokens import validate_download_token_secret
+
+
+class TestValidateApiKeySecret(unittest.TestCase):
+    def test_raises_when_not_set(self):
+        with patch.dict("os.environ", {}, clear=True):
+            with self.assertRaises(RuntimeError) as ctx:
+                validate_api_key_secret()
+            self.assertIn("PLANEXE_API_KEY_SECRET", str(ctx.exception))
+
+    def test_passes_when_set(self):
+        with patch.dict("os.environ", {"PLANEXE_API_KEY_SECRET": "my-secret"}):
+            validate_api_key_secret()  # should not raise
+
+
+class TestValidateDownloadTokenSecret(unittest.TestCase):
+    def test_raises_when_neither_set(self):
+        with patch.dict("os.environ", {}, clear=True):
+            with self.assertRaises(RuntimeError) as ctx:
+                validate_download_token_secret()
+            self.assertIn("PLANEXE_DOWNLOAD_TOKEN_SECRET", str(ctx.exception))
+
+    def test_passes_with_download_token_secret(self):
+        with patch.dict("os.environ", {"PLANEXE_DOWNLOAD_TOKEN_SECRET": "tok-secret"}, clear=True):
+            validate_download_token_secret()
+
+    def test_passes_with_api_key_secret(self):
+        with patch.dict("os.environ", {"PLANEXE_API_KEY_SECRET": "api-secret"}, clear=True):
+            validate_download_token_secret()
+
+
+if __name__ == "__main__":
+    unittest.main()
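
Reviewer note: the tests above pin down the contract of the two validators without showing their bodies. A minimal sketch of what would satisfy them, assuming a simple environment lookup, follows. The function bodies and error messages are hypothetical; only the function names, the environment variable names, and the `RuntimeError` behavior come from the tests.

```python
import os


def validate_api_key_secret() -> None:
    """Fail hard at startup if the API key secret is missing (assumed behavior)."""
    if not os.environ.get("PLANEXE_API_KEY_SECRET"):
        raise RuntimeError("PLANEXE_API_KEY_SECRET must be set")


def validate_download_token_secret() -> None:
    """Accept either secret; the fallback to the API key secret is inferred from the tests."""
    if os.environ.get("PLANEXE_DOWNLOAD_TOKEN_SECRET"):
        return
    if os.environ.get("PLANEXE_API_KEY_SECRET"):
        return
    raise RuntimeError(
        "Set PLANEXE_DOWNLOAD_TOKEN_SECRET (or PLANEXE_API_KEY_SECRET as a fallback)"
    )
```

Treating an empty string as "not set" is a design choice of this sketch; the real modules may distinguish the two cases.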
diff --git a/mcp_cloud/tests/test_tool_surface_consistency.py b/mcp_cloud/tests/test_tool_surface_consistency.py
index 9adbf8660..245056dca 100644
--- a/mcp_cloud/tests/test_tool_surface_consistency.py
+++ b/mcp_cloud/tests/test_tool_surface_consistency.py
@@ -51,19 +51,19 @@ def test_local_plan_create_schema_has_user_api_key(self):
 
 
 class TestPlanListInputSchemaHasUserApiKey(unittest.TestCase):
-    """user_api_key must be required in the plan_list input schema."""
+    """user_api_key must be in plan_list input schema but NOT required."""
 
-    def test_cloud_plan_list_schema_requires_user_api_key(self):
+    def test_cloud_plan_list_schema_has_optional_user_api_key(self):
         props = cloud_app.PLAN_LIST_INPUT_SCHEMA.get("properties", {})
         self.assertIn("user_api_key", props)
         required = cloud_app.PLAN_LIST_INPUT_SCHEMA.get("required", [])
-        self.assertIn("user_api_key", required)
+        self.assertNotIn("user_api_key", required)
 
-    def test_local_plan_list_schema_requires_user_api_key(self):
+    def test_local_plan_list_schema_has_optional_user_api_key(self):
         props = local_app.PLAN_LIST_INPUT_SCHEMA.get("properties", {})
         self.assertIn("user_api_key", props)
         required = local_app.PLAN_LIST_INPUT_SCHEMA.get("required", [])
-        self.assertIn("user_api_key", required)
+        self.assertNotIn("user_api_key", required)
 
 
 class TestPlanRetryInputSchemaDefaults(unittest.TestCase):
diff --git a/mcp_cloud/tool_models.py b/mcp_cloud/tool_models.py
index 4925f0360..13355a0df 100644
--- a/mcp_cloud/tool_models.py
+++ b/mcp_cloud/tool_models.py
@@ -72,23 +72,23 @@ class ModelProfilesOutput(BaseModel):
 
 
 class PlanStatusInput(BaseModel):
-    task_id: str = Field(
+    plan_id: str = Field(
         ...,
-        description="Task UUID returned by plan_create. Use it to reference the plan being created.",
+        description="Plan UUID returned by plan_create. Use it to reference the plan being created.",
     )
 
 
 class PlanStopInput(BaseModel):
-    task_id: str = Field(
+    plan_id: str = Field(
         ...,
-        description="The UUID returned by plan_create. Call plan_stop with this task_id to request the plan generation to stop.",
+        description="The UUID returned by plan_create. Call plan_stop with this plan_id to request the plan generation to stop.",
     )
 
 
 class PlanRetryInput(BaseModel):
-    task_id: str = Field(
+    plan_id: str = Field(
         ...,
-        description="UUID of the failed task to retry.",
+        description="UUID of the failed plan to retry.",
     )
     model_profile: Literal["baseline", "premium", "frontier", "custom"] = Field(
         default="baseline",
@@ -99,9 +99,9 @@ class PlanRetryInput(BaseModel):
 
 
 class PlanFileInfoInput(BaseModel):
-    task_id: str = Field(
+    plan_id: str = Field(
         ...,
-        description="Task UUID returned by plan_create. Use it to download the created plan.",
+        description="Plan UUID returned by plan_create. Use it to download the created plan.",
     )
     artifact: str = Field(
         default="report",
@@ -110,9 +110,9 @@ class PlanFileInfoInput(BaseModel):
 
 
 class PlanCreateOutput(BaseModel):
-    task_id: str = Field(
+    plan_id: str = Field(
         ...,
-        description="Task UUID returned by plan_create. Stable across plan_status/plan_stop/plan_file_info."
+        description="Plan UUID returned by plan_create. Stable across plan_status/plan_stop/plan_file_info."
     )
     created_at: str
 
@@ -128,9 +128,9 @@ class PlanStatusFile(BaseModel):
 
 
 class PlanStatusSuccess(BaseModel):
-    task_id: str = Field(
+    plan_id: str = Field(
         ...,
-        description="Task UUID returned by plan_create."
+        description="Plan UUID returned by plan_create."
     )
     state: Literal["pending", "processing", "completed", "failed"] = Field(
         ...,
@@ -149,15 +149,15 @@ class PlanStatusSuccess(BaseModel):
         description=(
             "Intermediate output files produced so far. "
             "Use updated_at timestamps to detect stalls. "
-            "These files are included in the zip artifact when the task completes."
+            "These files are included in the zip artifact when the plan completes."
         ),
     )
 
 
 class PlanStatusOutput(BaseModel):
-    task_id: str | None = Field(
+    plan_id: str | None = Field(
         default=None,
-        description="Task UUID returned by plan_create."
+        description="Plan UUID returned by plan_create."
     )
     state: Literal["pending", "processing", "completed", "failed"] | None = Field(
         default=None,
@@ -176,7 +176,7 @@ class PlanStatusOutput(BaseModel):
         description=(
             "Intermediate output files produced so far. "
             "Use updated_at timestamps to detect stalls. "
-            "These files are included in the zip artifact when the task completes."
+            "These files are included in the zip artifact when the plan completes."
         ),
     )
     error: ErrorDetail | None = None
@@ -185,7 +185,7 @@ class PlanStatusOutput(BaseModel):
 class PlanStopOutput(BaseModel):
     state: Literal["pending", "processing", "completed", "failed"] | None = Field(
         default=None,
-        description="Current task state after stop request.",
+        description="Current plan state after stop request.",
     )
     stop_requested: bool | None = Field(
         default=None,
@@ -195,13 +195,13 @@ class PlanStopOutput(BaseModel):
 
 
 class PlanRetryOutput(BaseModel):
-    task_id: str | None = Field(
+    plan_id: str | None = Field(
         default=None,
-        description="Task UUID that was retried (same ID as the failed task).",
+        description="Plan UUID that was retried (same ID as the failed plan).",
     )
     state: Literal["pending", "processing", "completed", "failed"] | None = Field(
         default=None,
-        description="Current task state after retry request.",
+        description="Current plan state after retry request.",
     )
     model_profile: Literal["baseline", "premium", "frontier", "custom"] | None = Field(
         default=None,
@@ -241,23 +241,23 @@ class PlanFileInfoOutput(BaseModel):
 
 
 class PlanListInput(BaseModel):
-    user_api_key: str = Field(
-        ...,
-        description="User API key (pex_...) to scope the task list to the authenticated user.",
+    user_api_key: str | None = Field(
+        default=None,
+        description="Optional user API key (pex_...) used to scope the plan list to the authenticated user; the HTTP layer injects it automatically.",
     )
     limit: int = Field(
         default=10,
         ge=1,
         le=50,
-        description="Maximum number of tasks to return (1–50). Newest tasks are returned first.",
+        description="Maximum number of plans to return (1–50). Newest plans are returned first.",
     )
 
 
 class PlanListItem(BaseModel):
-    task_id: str = Field(..., description="Task UUID.")
+    plan_id: str = Field(..., description="Plan UUID.")
     state: Literal["pending", "processing", "completed", "failed"] = Field(
         ...,
-        description="Current task state.",
+        description="Current plan state.",
     )
     progress_percentage: float = Field(..., description="Progress from 0 to 100.")
     created_at: str = Field(..., description="UTC creation timestamp (ISO 8601).")
@@ -265,8 +265,8 @@ class PlanListItem(BaseModel):
 
 
 class PlanListOutput(BaseModel):
-    tasks: list[PlanListItem] = Field(..., description="Tasks for the authenticated user, newest first.")
-    message: str = Field(..., description="Human-readable summary (e.g. how many tasks were returned).")
+    plans: list[PlanListItem] = Field(..., description="Plans for the authenticated user, newest first.")
+    message: str = Field(..., description="Human-readable summary (e.g. how many plans were returned).")
 
 
 class PlanCreateInput(BaseModel):
@@ -296,25 +296,3 @@ class PlanCreateInput(BaseModel):
         description="Optional user API key for credits and attribution.",
     )
 
-
-# ---------------------------------------------------------------------------
-# Backward-compatible aliases for old Task* names (used internally in app.py)
-# ---------------------------------------------------------------------------
-TaskCreateInput = PlanCreateInput
-TaskCreateOutput = PlanCreateOutput
-TaskStatusInput = PlanStatusInput
-TaskStatusOutput = PlanStatusOutput
-TaskStatusTiming = PlanStatusTiming
-TaskStatusFile = PlanStatusFile
-TaskStatusSuccess = PlanStatusSuccess
-TaskStopInput = PlanStopInput
-TaskStopOutput = PlanStopOutput
-TaskRetryInput = PlanRetryInput
-TaskRetryOutput = PlanRetryOutput
-TaskFileInfoInput = PlanFileInfoInput
-TaskFileInfoOutput = PlanFileInfoOutput
-TaskFileInfoNotReadyOutput = PlanFileInfoNotReadyOutput
-TaskFileInfoReadyOutput = PlanFileInfoReadyOutput
-TaskListInput = PlanListInput
-TaskListItem = PlanListItem
-TaskListOutput = PlanListOutput
diff --git a/mcp_cloud/worker_fetchers.py b/mcp_cloud/worker_fetchers.py
new file mode 100644
index 000000000..0819b61db
--- /dev/null
+++ b/mcp_cloud/worker_fetchers.py
@@ -0,0 +1,216 @@
+"""PlanExe MCP Cloud – HTTP fetchers for worker_plan artifacts."""
+import asyncio
+import logging
+import tempfile
+from io import BytesIO
+from typing import Optional
+
+import httpx
+
+from mcp_cloud.db_setup import (
+    BASE_DIR_RUN,
+    REPORT_FILENAME,
+    WORKER_PLAN_URL,
+    ZIP_SNAPSHOT_MAX_BYTES,
+)
+from mcp_cloud.db_queries import get_plan_by_id
+from mcp_cloud.zip_utils import (
+    _sanitize_legacy_zip_snapshot,
+    extract_file_from_zip_file,
+    fetch_file_from_zip_snapshot,
+    fetch_report_from_db,
+    fetch_zip_snapshot,
+    list_files_from_zip_snapshot,
+)
+
+logger = logging.getLogger(__name__)
+
+
+async def fetch_artifact_from_worker_plan(run_id: str, file_path: str) -> Optional[bytes]:
+    """Fetch an artifact file from worker_plan via HTTP."""
+    try:
+        async with httpx.AsyncClient(timeout=60.0) as client:
+            # For report.html, use the dedicated report endpoint (most efficient)
+            if (
+                file_path == "report.html"
+                or file_path.endswith("/report.html")
+                or file_path == REPORT_FILENAME
+                or file_path.endswith(f"/{REPORT_FILENAME}")
+            ):
+                report_response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/report")
+                if report_response.status_code == 200:
+                    return report_response.content
+                logger.warning("Worker plan returned %s for report: %s", report_response.status_code, run_id)
+                report_from_db = await asyncio.to_thread(fetch_report_from_db, run_id)
+                if report_from_db is not None:
+                    return report_from_db
+                report_from_zip = await asyncio.to_thread(
+                    fetch_file_from_zip_snapshot, run_id, REPORT_FILENAME
+                )
+                if report_from_zip is not None:
+                    return report_from_zip
+                return None
+
+            # For other files, fetch the zip and extract the file
+            # This is less efficient but works without a file serving endpoint
+            async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as zip_response:
+                if zip_response.status_code != 200:
+                    logger.warning("Worker plan returned %s for zip: %s", zip_response.status_code, run_id)
+                else:
+                    zip_too_large = False
+                    content_length = zip_response.headers.get("content-length")
+                    if content_length:
+                        try:
+                            if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES:
+                                logger.warning(
+                                    "Zip snapshot too large (%s bytes) for run %s; skipping.",
+                                    content_length,
+                                    run_id,
+                                )
+                                zip_too_large = True
+                        except ValueError:
+                            logger.warning(
+                                "Invalid Content-Length for zip snapshot: %s", content_length
+                            )
+                    if not zip_too_large:
+                        with tempfile.TemporaryFile() as tmp_file:
+                            size = 0
+                            async for chunk in zip_response.aiter_bytes():
+                                size += len(chunk)
+                                if size > ZIP_SNAPSHOT_MAX_BYTES:
+                                    logger.warning(
+                                        "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.",
+                                        ZIP_SNAPSHOT_MAX_BYTES,
+                                        run_id,
+                                    )
+                                    zip_too_large = True
+                                    break
+                                tmp_file.write(chunk)
+                            if not zip_too_large:
+                                tmp_file.seek(0)
+                                file_data = extract_file_from_zip_file(tmp_file, file_path)
+                                if file_data is not None:
+                                    return file_data
+
+            snapshot_file = await asyncio.to_thread(fetch_file_from_zip_snapshot, run_id, file_path)
+            if snapshot_file is not None:
+                return snapshot_file
+            return None
+
+    except Exception as e:
+        logger.error("Error fetching artifact from worker_plan: %s", e, exc_info=True)
+        return None
+
+async def fetch_file_list_from_worker_plan(run_id: str) -> Optional[list[str]]:
+    """Fetch the list of files from worker_plan via HTTP."""
+    try:
+        async with httpx.AsyncClient(timeout=30.0) as client:
+            response = await client.get(f"{WORKER_PLAN_URL}/runs/{run_id}/files")
+            if response.status_code == 200:
+                data = response.json()
+                files = data.get("files", [])
+                if files:
+                    return files
+                fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id)
+                if fallback_files:
+                    return fallback_files
+                return files
+            logger.warning("Worker plan returned %s for files list: %s", response.status_code, run_id)
+            fallback_files = await asyncio.to_thread(list_files_from_zip_snapshot, run_id)
+            if fallback_files is not None:
+                return fallback_files
+            return None
+    except Exception as e:
+        logger.error("Error fetching file list from worker_plan: %s", e, exc_info=True)
+        return None
+
+
+def list_files_from_local_run_dir(run_id: str) -> Optional[list[str]]:
+    """
+    List files from local run directory when this service shares PLANEXE_RUN_DIR
+    with the worker (e.g., Docker compose).
+    """
+    run_dir = (BASE_DIR_RUN / run_id).resolve()
+    try:
+        if not run_dir.is_relative_to(BASE_DIR_RUN):
+            return None
+    except ValueError:
+        return None
+    if not run_dir.exists() or not run_dir.is_dir():
+        return None
+    try:
+        return sorted([path.name for path in run_dir.iterdir() if path.is_file()])
+    except Exception as exc:
+        logger.warning("Unable to list local run dir files for %s: %s", run_id, exc)
+        return None
+
+async def fetch_zip_from_worker_plan(run_id: str) -> Optional[bytes]:
+    """Fetch the zip snapshot from worker_plan via HTTP."""
+    try:
+        async with httpx.AsyncClient(timeout=60.0) as client:
+            async with client.stream("GET", f"{WORKER_PLAN_URL}/runs/{run_id}/zip") as response:
+                if response.status_code != 200:
+                    logger.warning("Worker plan returned %s for zip: %s", response.status_code, run_id)
+                else:
+                    zip_too_large = False
+                    content_length = response.headers.get("content-length")
+                    if content_length:
+                        try:
+                            if int(content_length) > ZIP_SNAPSHOT_MAX_BYTES:
+                                logger.warning(
+                                    "Zip snapshot too large (%s bytes) for run %s; skipping.",
+                                    content_length,
+                                    run_id,
+                                )
+                                zip_too_large = True
+                        except ValueError:
+                            logger.warning(
+                                "Invalid Content-Length for zip snapshot: %s", content_length
+                            )
+                    if not zip_too_large:
+                        buffer = BytesIO()
+                        size = 0
+                        async for chunk in response.aiter_bytes():
+                            size += len(chunk)
+                            if size > ZIP_SNAPSHOT_MAX_BYTES:
+                                logger.warning(
+                                    "Zip snapshot exceeded max size (%s bytes) for run %s; skipping.",
+                                    ZIP_SNAPSHOT_MAX_BYTES,
+                                    run_id,
+                                )
+                                zip_too_large = True
+                                break
+                            buffer.write(chunk)
+                        if not zip_too_large:
+                            return buffer.getvalue()
+
+            snapshot_bytes = await asyncio.to_thread(fetch_zip_snapshot, run_id)
+            if snapshot_bytes is not None:
+                return snapshot_bytes
+            return None
+    except Exception as e:
+        logger.error("Error fetching zip from worker_plan: %s", e, exc_info=True)
+        return None
+
+
+async def fetch_user_downloadable_zip(task_id: str) -> Optional[bytes]:
+    """
+    Fetch a user-downloadable zip for a plan.
+    New layout snapshots are served directly from PlanItem.run_zip_snapshot.
+    Legacy/task-dir fallbacks are sanitized to remove track_activity.jsonl.
+    """
+    plan = await asyncio.to_thread(get_plan_by_id, task_id)
+    if plan is None:
+        return None
+
+    snapshot_bytes = plan.run_zip_snapshot
+    layout_version = plan.run_artifact_layout_version or 0
+    if snapshot_bytes is not None:
+        if layout_version >= 2:
+            return snapshot_bytes
+        return _sanitize_legacy_zip_snapshot(snapshot_bytes)
+
+    worker_plan_zip = await fetch_zip_from_worker_plan(str(plan.id))
+    if worker_plan_zip is None:
+        return None
+    return _sanitize_legacy_zip_snapshot(worker_plan_zip)
diff --git a/mcp_cloud/zip_utils.py b/mcp_cloud/zip_utils.py
new file mode 100644
index 000000000..0a10ef546
--- /dev/null
+++ b/mcp_cloud/zip_utils.py
@@ -0,0 +1,100 @@
+"""PlanExe MCP Cloud – zip extraction, sanitization, and hashing utilities."""
+import hashlib
+import io
+import logging
+import zipfile
+from io import BytesIO
+from typing import Optional
+
+from mcp_cloud.db_queries import get_plan_by_id
+
+logger = logging.getLogger(__name__)
+
+
+def list_files_from_zip_bytes(zip_bytes: bytes) -> list[str]:
+    """List file entries from an in-memory zip archive."""
+    try:
+        with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file:
+            files = [name for name in zip_file.namelist() if not name.endswith("/")]
+            return sorted(files)
+    except Exception as exc:
+        logger.warning("Unable to list files from zip snapshot: %s", exc)
+        return []
+
+def extract_file_from_zip_bytes(zip_bytes: bytes, file_path: str) -> Optional[bytes]:
+    """Extract a file from an in-memory zip archive."""
+    try:
+        with zipfile.ZipFile(BytesIO(zip_bytes), 'r') as zip_file:
+            file_path_normalized = file_path.lstrip('/')
+            try:
+                return zip_file.read(file_path_normalized)
+            except KeyError:
+                return None
+    except Exception as exc:
+        logger.warning("Unable to read %s from zip snapshot: %s", file_path, exc)
+        return None
+
+def extract_file_from_zip_file(file_handle: io.BufferedIOBase, file_path: str) -> Optional[bytes]:
+    """Extract a file from a seekable zip file handle."""
+    try:
+        with zipfile.ZipFile(file_handle, 'r') as zip_file:
+            file_path_normalized = file_path.lstrip('/')
+            try:
+                return zip_file.read(file_path_normalized)
+            except KeyError:
+                return None
+    except Exception as exc:
+        logger.warning("Unable to read %s from zip stream: %s", file_path, exc)
+        return None
+
+def fetch_report_from_db(task_id: str) -> Optional[bytes]:
+    """Fetch the report HTML stored in the PlanItem."""
+    plan = get_plan_by_id(task_id)
+    if plan and plan.generated_report_html is not None:
+        return plan.generated_report_html.encode("utf-8")
+    return None
+
+def fetch_zip_snapshot(task_id: str) -> Optional[bytes]:
+    """Fetch the zip snapshot stored in the PlanItem."""
+    plan = get_plan_by_id(task_id)
+    if plan and plan.run_zip_snapshot is not None:
+        return plan.run_zip_snapshot
+    return None
+
+def fetch_file_from_zip_snapshot(task_id: str, file_path: str) -> Optional[bytes]:
+    """Fetch a file from the PlanItem zip snapshot."""
+    plan = get_plan_by_id(task_id)
+    if plan and plan.run_zip_snapshot is not None:
+        return extract_file_from_zip_bytes(plan.run_zip_snapshot, file_path)
+    return None
+
+def list_files_from_zip_snapshot(task_id: str) -> Optional[list[str]]:
+    """List files from the PlanItem zip snapshot."""
+    plan = get_plan_by_id(task_id)
+    if plan and plan.run_zip_snapshot is not None:
+        return list_files_from_zip_bytes(plan.run_zip_snapshot)
+    return None
+
+def _sanitize_legacy_zip_snapshot(zip_bytes: bytes) -> Optional[bytes]:
+    """Remove internal track_activity.jsonl files from legacy zip snapshots."""
+    try:
+        with zipfile.ZipFile(BytesIO(zip_bytes), "r") as in_zip:
+            entries = [name for name in in_zip.namelist() if not name.endswith("/")]
+            if not any(name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl" for name in entries):
+                return zip_bytes
+            out_buffer = BytesIO()
+            with zipfile.ZipFile(out_buffer, "w", compression=zipfile.ZIP_DEFLATED) as out_zip:
+                for name in entries:
+                    if name.endswith("/track_activity.jsonl") or name == "track_activity.jsonl":
+                        continue
+                    out_zip.writestr(name, in_zip.read(name))
+            return out_buffer.getvalue()
+    except Exception as exc:
+        logger.warning("Unable to sanitize legacy run zip snapshot: %s", exc)
+        return None
+
+def compute_sha256(content: str | bytes) -> str:
+    """Compute SHA256 hash of content."""
+    if isinstance(content, str):
+        content = content.encode('utf-8')
+    return hashlib.sha256(content).hexdigest()
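
Reviewer note: the effect of `_sanitize_legacy_zip_snapshot` is easiest to see on a tiny in-memory archive. The sketch below uses only the standard library and reimplements the filtering rule for illustration. `build_zip` and `strip_track_activity` are hypothetical helpers, not functions from this module.

```python
import zipfile
from io import BytesIO


def build_zip(entries: dict[str, bytes]) -> bytes:
    """Create an in-memory zip archive from a name -> content mapping."""
    buffer = BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buffer.getvalue()


def strip_track_activity(zip_bytes: bytes) -> bytes:
    """Copy every file entry except track_activity.jsonl, at any depth."""
    out = BytesIO()
    with zipfile.ZipFile(BytesIO(zip_bytes), "r") as src, \
         zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            # Skip directory entries and the internal activity log.
            if name.endswith("/") or name.rsplit("/", 1)[-1] == "track_activity.jsonl":
                continue
            dst.writestr(name, src.read(name))
    return out.getvalue()
```

Unlike the function in the diff, this sketch always rewrites the archive and raises on a corrupt zip instead of returning `None`.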
diff --git a/mcp_local/README.md b/mcp_local/README.md
index c082b983e..a21c5d06e 100644
--- a/mcp_local/README.md
+++ b/mcp_local/README.md
@@ -13,7 +13,7 @@ proxy forwards tool calls over HTTP and downloads artifacts from `/download/{tas
 `plan_create` - Initiate creation of a plan.
 `plan_status` - Get status and progress about the creation of a plan.
 `plan_stop` - Abort creation of a plan.
-`plan_retry` - Retry a failed task using the same task id (optional model_profile, defaults to baseline).
+`plan_retry` - Retry a failed plan using the same plan id (optional model_profile, defaults to baseline).
 `plan_download` - Download the plan, either html report or a zip with everything, and save it to disk.
 
 `plan_status` caller contract:
@@ -22,15 +22,15 @@ proxy forwards tool calls over HTTP and downloads artifacts from `/download/{tas
 - `failed`: terminal error.
 
 Concurrency semantics:
-- Each `plan_create` call creates a new `task_id`.
-- `plan_retry` reuses the same failed `task_id`.
-- Server does not enforce a global one-task-at-a-time cap per client.
-- Local clients should track task ids explicitly when running tasks in parallel.
+- Each `plan_create` call creates a new `plan_id`.
+- `plan_retry` reuses the same failed `plan_id`.
+- Server does not enforce a global one-plan-at-a-time cap per client.
+- Local clients should track plan ids explicitly when running plans in parallel.
 
 Minimal error contract:
 - Tool errors use `{"error":{"code","message","details?"}}`.
-- Common proxied cloud codes include: `TASK_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
-- `plan_retry` may return `TASK_NOT_FAILED` if the task is not currently failed.
+- Common proxied cloud codes include: `PLAN_NOT_FOUND`, `INVALID_USER_API_KEY`, `USER_API_KEY_REQUIRED`, `INSUFFICIENT_CREDITS`, `INTERNAL_ERROR`, `generation_failed`, `content_unavailable`.
+- `plan_retry` may return `PLAN_NOT_FAILED` if the plan is not currently failed.
 - Local proxy specific codes: `REMOTE_ERROR`, `DOWNLOAD_FAILED`.
 - `plan_file_info` (called under the hood by plan_download) may return `{}` while output is not ready.
 
@@ -44,14 +44,14 @@ file locally into `PLANEXE_PATH`.
 - If unset, downloads are saved to the current working directory.
 - If the path does not exist, it is created.
 - If the path points to a file (not a directory), download fails.
-- Filenames are `<task_id>-030-report.html` or `<task_id>-run.zip` (with `-1`, `-2`, ... suffixes on collisions).
+- Filenames are `<plan_id>-030-report.html` or `<plan_id>-run.zip` (with `-1`, `-2`, ... suffixes on collisions).
 - `plan_download` returns `saved_path` with the final file location.
 
 ## Run as task (MCP tasks protocol)
 
 Some MCP clients (e.g. the MCP Inspector) show a **"Run as task"** option for tools. That refers to the MCP **tasks** protocol: a separate mechanism where the client runs a tool in the background using RPC methods like `tasks/run`, `tasks/get`, `tasks/result`, and `tasks/cancel`, instead of a single blocking tool call.
 
-**PlanExe does not use or advertise the MCP tasks protocol.** Our interface is **tool-based** only: the agent calls `prompt_examples` and `model_profiles` for setup, completes a non-tool prompt drafting/approval step, then `plan_create` → gets a `task_id` → polls `plan_status` → optionally calls `plan_retry` if failed → uses `plan_download`. That flow is defined in `docs/mcp/planexe_mcp_interface.md` and is the intended design.
+**PlanExe does not use or advertise the MCP tasks protocol.** Our interface is **tool-based** only: the agent calls `prompt_examples` and `model_profiles` for setup, completes a non-tool prompt drafting/approval step, then `plan_create` → gets a `plan_id` → polls `plan_status` → optionally calls `plan_retry` if failed → uses `plan_download`. That flow is defined in `docs/mcp/planexe_mcp_interface.md` and is the intended design.
 
 You should **not** enable "Run as task" for PlanExe. The Python MCP SDK and clients like Cursor do not properly support the tasks protocol (method registration and initialization fail). Use the tools directly: create a task, poll status, then download when done.
 
diff --git a/mcp_local/planexe_mcp_local.py b/mcp_local/planexe_mcp_local.py
index 81e306291..1033ccb53 100644
--- a/mcp_local/planexe_mcp_local.py
+++ b/mcp_local/planexe_mcp_local.py
@@ -37,32 +37,32 @@
 ]
 
 
-class TaskCreateRequest(BaseModel):
+class PlanCreateRequest(BaseModel):
     prompt: str
     model_profile: Optional[ModelProfileInput] = None
     user_api_key: Optional[str] = None
 
 
-class TaskStatusRequest(BaseModel):
-    task_id: str
+class PlanStatusRequest(BaseModel):
+    plan_id: str
 
 
-class TaskStopRequest(BaseModel):
-    task_id: str
+class PlanStopRequest(BaseModel):
+    plan_id: str
 
 
-class TaskRetryRequest(BaseModel):
-    task_id: str
+class PlanRetryRequest(BaseModel):
+    plan_id: str
     model_profile: ModelProfileInput = "baseline"
 
 
-class TaskDownloadRequest(BaseModel):
-    task_id: str
+class PlanDownloadRequest(BaseModel):
+    plan_id: str
     artifact: str = "report"
 
 
-class TaskListRequest(BaseModel):
-    user_api_key: str
+class PlanListRequest(BaseModel):
+    user_api_key: Optional[str] = None
     limit: int = 10
 
 
@@ -365,31 +365,31 @@ class ToolDefinition:
 PLAN_STATUS_INPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "task_id": {
+        "plan_id": {
             "type": "string",
-            "description": "UUID of the task (returned by plan_create).",
+            "description": "UUID of the plan (returned by plan_create).",
         },
     },
-    "required": ["task_id"],
+    "required": ["plan_id"],
 }
 
 PLAN_STOP_INPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "task_id": {
+        "plan_id": {
             "type": "string",
-            "description": "UUID of the task to stop (returned by plan_create).",
+            "description": "UUID of the plan to stop (returned by plan_create).",
         },
     },
-    "required": ["task_id"],
+    "required": ["plan_id"],
 }
 
 PLAN_RETRY_INPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "task_id": {
+        "plan_id": {
             "type": "string",
-            "description": "UUID of the failed task to retry.",
+            "description": "UUID of the failed plan to retry.",
         },
         "model_profile": {
             "type": "string",
@@ -398,15 +398,15 @@ class ToolDefinition:
             "description": "Model profile used for retry. Defaults to baseline.",
         },
     },
-    "required": ["task_id"],
+    "required": ["plan_id"],
 }
 
 PLAN_DOWNLOAD_INPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "task_id": {
+        "plan_id": {
             "type": "string",
-            "description": "UUID of the task (returned by plan_create).",
+            "description": "UUID of the plan (returned by plan_create).",
         },
         "artifact": {
             "type": "string",
@@ -415,16 +415,9 @@ class ToolDefinition:
             "description": "What to download: 'report' = HTML report, 'zip' = full output bundle.",
         },
     },
-    "required": ["task_id"],
+    "required": ["plan_id"],
 }
 
-# Backward-compatible aliases
-TASK_CREATE_INPUT_SCHEMA = PLAN_CREATE_INPUT_SCHEMA
-TASK_STATUS_INPUT_SCHEMA = PLAN_STATUS_INPUT_SCHEMA
-TASK_STOP_INPUT_SCHEMA = PLAN_STOP_INPUT_SCHEMA
-TASK_RETRY_INPUT_SCHEMA = PLAN_RETRY_INPUT_SCHEMA
-TASK_DOWNLOAD_INPUT_SCHEMA = PLAN_DOWNLOAD_INPUT_SCHEMA
-
 PROMPT_EXAMPLES_INPUT_SCHEMA = {
     "type": "object",
     "properties": {},
@@ -501,16 +494,16 @@ class ToolDefinition:
 PLAN_CREATE_OUTPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "task_id": {"type": "string"},
+        "plan_id": {"type": "string"},
         "created_at": {"type": "string"},
     },
-    "required": ["task_id", "created_at"],
+    "required": ["plan_id", "created_at"],
 }
 
 PLAN_STATUS_OUTPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "task_id": {"type": ["string", "null"]},
+        "plan_id": {"type": ["string", "null"]},
         "state": {"type": ["string", "null"]},
         "progress_percentage": {"type": ["number", "null"]},
         "timing": {
@@ -545,7 +538,7 @@ class ToolDefinition:
 PLAN_RETRY_OUTPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "task_id": {"type": "string"},
+        "plan_id": {"type": "string"},
         "state": {"type": "string"},
         "model_profile": {
             "type": "string",
@@ -576,35 +569,36 @@ class ToolDefinition:
     "type": "object",
     "properties": {
         "user_api_key": {
-            "type": "string",
-            "description": "User API key (pex_...) to scope the task list to the authenticated user.",
+            "type": ["string", "null"],
+            "default": None,
+            "description": "Optional user API key (pex_...) identifying the user whose plans are listed.",
         },
         "limit": {
             "type": "integer",
             "default": 10,
             "minimum": 1,
             "maximum": 50,
-            "description": "Maximum number of tasks to return (1-50). Newest tasks are returned first.",
+            "description": "Maximum number of plans to return (1-50). Newest plans are returned first.",
         },
     },
-    "required": ["user_api_key"],
+    "required": [],
 }
 PLAN_LIST_OUTPUT_SCHEMA = {
     "type": "object",
     "properties": {
-        "tasks": {
+        "plans": {
             "type": "array",
             "items": {
                 "type": "object",
                 "properties": {
-                    "task_id": {"type": "string"},
+                    "plan_id": {"type": "string"},
                     "state": {"type": "string"},
                     "progress_percentage": {"type": "number"},
                     "created_at": {"type": "string"},
                     "prompt_excerpt": {"type": "string"},
                 },
             },
-            "description": "Tasks for the authenticated user, newest first.",
+            "description": "Plans for the authenticated user, newest first.",
         },
         "message": {"type": "string"},
         "error": ERROR_SCHEMA,
@@ -612,15 +606,6 @@ class ToolDefinition:
     "additionalProperties": False,
 }
 
-# Backward-compatible aliases
-TASK_CREATE_OUTPUT_SCHEMA = PLAN_CREATE_OUTPUT_SCHEMA
-TASK_STATUS_OUTPUT_SCHEMA = PLAN_STATUS_OUTPUT_SCHEMA
-TASK_STOP_OUTPUT_SCHEMA = PLAN_STOP_OUTPUT_SCHEMA
-TASK_RETRY_OUTPUT_SCHEMA = PLAN_RETRY_OUTPUT_SCHEMA
-TASK_DOWNLOAD_OUTPUT_SCHEMA = PLAN_DOWNLOAD_OUTPUT_SCHEMA
-TASK_LIST_INPUT_SCHEMA = PLAN_LIST_INPUT_SCHEMA
-TASK_LIST_OUTPUT_SCHEMA = PLAN_LIST_OUTPUT_SCHEMA
-
 TOOL_DEFINITIONS = [
     ToolDefinition(
         name="prompt_examples",
@@ -671,9 +656,9 @@ class ToolDefinition:
             "plan review (critical issues, KPIs, financial strategy, automation opportunities), Q&A, "
             "premortem with failure scenarios, self-audit checklist, and adversarial premise attacks that argue against the project. "
             "The adversarial sections (premortem, self-audit, premise attacks) surface risks and questions the prompter may not have considered. "
-            "Returns task_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_download. "
-            "If you lose a task_id, call plan_list with your user_api_key to recover it. "
-            "Each plan_create call creates a new task_id (proxied to cloud; no server-side dedup). "
+            "Returns plan_id (UUID); use it for plan_status, plan_stop, plan_retry, and plan_download. "
+            "If you lose a plan_id, call plan_list to recover it. "
+            "Each plan_create call creates a new plan_id (proxied to cloud; no server-side dedup). "
             "If you are unsure which model_profile to choose, call model_profiles first. "
             "If your deployment uses credits, include user_api_key to charge the correct account. "
             "Common proxied error codes: INVALID_USER_API_KEY, USER_API_KEY_REQUIRED, INSUFFICIENT_CREDITS, REMOTE_ERROR."
@@ -696,7 +681,7 @@ class ToolDefinition:
             "State contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. "
             "progress_percentage is 0-100 (integer-like float); 100 when completed. "
             "files lists intermediate outputs produced so far; use their updated_at timestamps to detect stalls. "
-            "Unknown task_id returns TASK_NOT_FOUND (or REMOTE_ERROR when transport fails). "
+            "Unknown plan_id returns PLAN_NOT_FOUND (or REMOTE_ERROR when transport fails). "
             "Troubleshooting: pending for >5 minutes likely means queued but not picked up by a worker. "
             "processing with no file-output changes for >20 minutes likely means failed/stalled. "
             "Report these issues to https://github.com/PlanExeOrg/PlanExe/issues ."
@@ -713,11 +698,11 @@ class ToolDefinition:
     ToolDefinition(
         name="plan_stop",
         description=(
-            "Request the plan generation to stop. Pass the task_id (the UUID returned by plan_create). "
-            "Stopping is asynchronous: the stop flag is set immediately but the task may continue briefly before halting. "
-            "A stopped task will eventually transition to the failed state. "
-            "If the task is already completed or failed, stop_requested returns false (the task already finished). "
-            "Unknown task_id returns TASK_NOT_FOUND (or REMOTE_ERROR when transport fails)."
+            "Request the plan generation to stop. Pass the plan_id (the UUID returned by plan_create). "
+            "Stopping is asynchronous: the stop flag is set immediately but the plan may continue briefly before halting. "
+            "A stopped plan will eventually transition to the failed state. "
+            "If the plan is already completed or failed, stop_requested returns false (the plan already finished). "
+            "Unknown plan_id returns PLAN_NOT_FOUND (or REMOTE_ERROR when transport fails)."
         ),
         input_schema=PLAN_STOP_INPUT_SCHEMA,
         output_schema=PLAN_STOP_OUTPUT_SCHEMA,
@@ -731,10 +716,10 @@ class ToolDefinition:
     ToolDefinition(
         name="plan_retry",
         description=(
-            "Retry a task that is currently in failed state. "
-            "Pass the failed task_id and optionally model_profile (defaults to baseline). "
-            "The same task_id is requeued and reset to pending on the cloud service. "
-            "Unknown task_id returns TASK_NOT_FOUND; non-failed tasks return TASK_NOT_FAILED."
+            "Retry a plan that is currently in failed state. "
+            "Pass the failed plan_id and optionally model_profile (defaults to baseline). "
+            "The same plan_id is requeued and reset to pending on the cloud service. "
+            "Unknown plan_id returns PLAN_NOT_FOUND; non-failed plans return PLAN_NOT_FAILED."
         ),
         input_schema=PLAN_RETRY_INPUT_SCHEMA,
         output_schema=PLAN_RETRY_OUTPUT_SCHEMA,
@@ -753,7 +738,7 @@ class ToolDefinition:
             "for collapsible sections and interactive Gantt charts — open in a browser). "
             "Use artifact='zip' for the full pipeline output bundle (md, json, csv intermediary files that fed the report). "
             "If PLANEXE_PATH is unset, files are saved to the current working directory. "
-            "Filename format is <task_id>-<artifact_name> with numeric suffixes when collisions occur. "
+            "Filename format is <plan_id>-<artifact_name> with numeric suffixes when collisions occur. "
             "Common local error codes: DOWNLOAD_FAILED, REMOTE_ERROR."
         ),
         input_schema=PLAN_DOWNLOAD_INPUT_SCHEMA,
@@ -768,11 +753,10 @@ class ToolDefinition:
     ToolDefinition(
         name="plan_list",
         description=(
-            "List the most recent tasks for an authenticated user. "
-            "Requires user_api_key (pex_...). "
-            "Returns up to `limit` tasks (default 10, max 50) newest-first, each with task_id, state, "
+            "List the most recent plans for an authenticated user. "
+            "Returns up to `limit` plans (default 10, max 50) newest-first, each with plan_id, state, "
             "progress_percentage, created_at (ISO 8601), and a prompt_excerpt (first 100 chars). "
-            "Use this to recover a lost task_id or to review recent activity."
+            "Use this to recover a lost plan_id or to review recent activity."
         ),
         input_schema=PLAN_LIST_INPUT_SCHEMA,
         output_schema=PLAN_LIST_OUTPUT_SCHEMA,
@@ -803,16 +787,16 @@ class ToolDefinition:
     "Good prompt shape: objective, scope, constraints, timeline, stakeholders, budget/resources, and success criteria. "
     "Write the prompt as flowing prose — weave specs, constraints, and targets naturally into sentences. "
     "Only after approval, call plan_create. "
-    "Each plan_create call creates a new task_id; the server does not enforce a global per-client concurrency limit. "
+    "Each plan_create call creates a new plan_id; the server does not enforce a global per-client concurrency limit. "
     "Then poll plan_status (about every 5 minutes); use plan_download when complete. "
-    "If a run fails, call plan_retry with the failed task_id to requeue it (optional model_profile, defaults to baseline). "
-    "To stop, call plan_stop with the task_id from plan_create; stopping is asynchronous and the task will eventually transition to failed. "
+    "If a run fails, call plan_retry with the failed plan_id to requeue it (optional model_profile, defaults to baseline). "
+    "To stop, call plan_stop with the plan_id from plan_create; stopping is asynchronous and the plan will eventually transition to failed. "
     "If model_profiles returns MODEL_PROFILES_UNAVAILABLE, inform the user that no models are currently configured and the server administrator needs to set up model profiles. "
     "Tool errors use {error:{code,message}}. plan_download may return REMOTE_ERROR or DOWNLOAD_FAILED. "
     "plan_download saves to PLANEXE_PATH (default: current working directory) and returns saved_path. "
-    "To list recent tasks for a user call plan_list with user_api_key; returns task_id, state, progress_percentage, created_at, and prompt_excerpt. "
+    "To list recent plans for a user, call plan_list; it returns plan_id, state, progress_percentage, created_at, and prompt_excerpt. "
     "plan_status state contract: pending/processing => keep polling; completed => download is ready; failed => terminal error. "
-    "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the task was likely queued but not picked up by a worker (server issue). "
+    "Troubleshooting: if plan_status stays in pending for longer than 5 minutes, the plan was likely queued but not picked up by a worker (server issue). "
     "If plan_status is in processing and output files do not change for longer than 20 minutes, the run likely failed/stalled. "
     "In both cases, report the issue to PlanExe developers on GitHub: https://github.com/PlanExeOrg/PlanExe/issues . "
     "Main output: a self-contained interactive HTML report (~700KB) with collapsible sections and interactive Gantt charts — open in a browser. "
@@ -860,10 +844,10 @@ async def handle_call_tool(name: str, arguments: dict[str, Any]) -> CallToolResu
 
 
 async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult:
-    """Create a task in mcp_cloud via the local HTTP proxy.
+    """Create a plan in mcp_cloud via the local HTTP proxy.
 
     Examples:
-        - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. Open within 12 months."} → task_id + created_at
+        - {"prompt": "Start a dental clinic in Copenhagen with 3 treatment rooms, targeting families and children. Budget 2.5M DKK. Open within 12 months."} → plan_id + created_at
 
     Args:
         - prompt: What the plan should cover (goal, context, constraints).
@@ -871,10 +855,10 @@ async def handle_plan_create(arguments: dict[str, Any]) -> CallToolResult:
 
     Returns:
         - content: JSON string matching structuredContent.
-        - structuredContent: task_id/created_at payload or error.
+        - structuredContent: plan_id/created_at payload or error.
         - isError: True when the remote tool call fails.
     """
-    req = TaskCreateRequest(**arguments)
+    req = PlanCreateRequest(**arguments)
     payload: dict[str, Any] = {"prompt": req.prompt}
     if req.model_profile:
         payload["model_profile"] = req.model_profile
@@ -911,53 +895,53 @@ async def handle_model_profiles(arguments: dict[str, Any]) -> CallToolResult:
 
 
 async def handle_plan_status(arguments: dict[str, Any]) -> CallToolResult:
-    """Fetch status/progress for a task from mcp_cloud.
+    """Fetch status/progress for a plan from mcp_cloud.
 
     Examples:
-        - {"task_id": "uuid"} → state/progress/timing
+        - {"plan_id": "uuid"} → state/progress/timing
 
     Args:
-        - task_id: Task UUID returned by plan_create.
+        - plan_id: Plan UUID returned by plan_create.
 
     Returns:
         - content: JSON string matching structuredContent.
         - structuredContent: status payload or error.
         - isError: True when the remote tool call fails.
     """
-    req = TaskStatusRequest(**arguments)
-    payload, error = _call_remote_tool("plan_status", {"task_id": req.task_id})
+    req = PlanStatusRequest(**arguments)
+    payload, error = _call_remote_tool("plan_status", {"plan_id": req.plan_id})
     if error:
         return _wrap_response({"error": error}, is_error=True)
     return _wrap_response(payload)
 
 
 async def handle_plan_stop(arguments: dict[str, Any]) -> CallToolResult:
-    """Request mcp_cloud to stop an active task.
+    """Request mcp_cloud to stop an active plan.
 
     Examples:
-        - {"task_id": "uuid"} → stop request acknowledged
+        - {"plan_id": "uuid"} → stop request acknowledged
 
     Args:
-        - task_id: Task UUID returned by plan_create.
+        - plan_id: Plan UUID returned by plan_create.
 
     Returns:
         - content: JSON string matching structuredContent.
         - structuredContent: {"state": "pending|processing|completed|failed", "stop_requested": bool} or error.
         - isError: True when the remote tool call fails.
     """
-    req = TaskStopRequest(**arguments)
-    payload, error = _call_remote_tool("plan_stop", {"task_id": req.task_id})
+    req = PlanStopRequest(**arguments)
+    payload, error = _call_remote_tool("plan_stop", {"plan_id": req.plan_id})
     if error:
         return _wrap_response({"error": error}, is_error=True)
     return _wrap_response(payload)
 
 
 async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult:
-    """Request mcp_cloud to retry a failed task."""
-    req = TaskRetryRequest(**arguments)
+    """Request mcp_cloud to retry a failed plan."""
+    req = PlanRetryRequest(**arguments)
     payload, error = _call_remote_tool(
         "plan_retry",
-        {"task_id": req.task_id, "model_profile": req.model_profile},
+        {"plan_id": req.plan_id, "model_profile": req.model_profile},
     )
     if error:
         return _wrap_response({"error": error}, is_error=True)
@@ -965,14 +949,14 @@ async def handle_plan_retry(arguments: dict[str, Any]) -> CallToolResult:
 
 
 async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult:
-    """Download report/zip for a task from mcp_cloud and save it locally.
+    """Download report/zip for a plan from mcp_cloud and save it locally.
 
     Examples:
-        - {"task_id": "uuid"} → download report (default)
-        - {"task_id": "uuid", "artifact": "zip"} → download zip
+        - {"plan_id": "uuid"} → download report (default)
+        - {"plan_id": "uuid", "artifact": "zip"} → download zip
 
     Args:
-        - task_id: Task UUID returned by plan_create.
+        - plan_id: Plan UUID returned by plan_create.
         - artifact: Optional "report" or "zip".
 
     Returns:
@@ -980,14 +964,17 @@ async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult:
         - structuredContent: metadata + saved_path or error.
         - isError: True when download fails or remote tool errors.
     """
-    req = TaskDownloadRequest(**arguments)
+    req = PlanDownloadRequest(**arguments)
     artifact = (req.artifact or "report").strip().lower()
     if artifact not in ("report", "zip"):
-        artifact = "report"
+        return _wrap_response(
+            {"error": {"code": "INVALID_ARGUMENT", "message": f"Invalid artifact type: {req.artifact!r}. Must be 'report' or 'zip'."}},
+            is_error=True,
+        )
 
     payload, error = _call_remote_tool(
         "plan_file_info",
-        {"task_id": req.task_id, "artifact": artifact},
+        {"plan_id": req.plan_id, "artifact": artifact},
     )
     if error:
         return _wrap_response({"error": error}, is_error=True)
@@ -998,10 +985,10 @@ async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult:
     if isinstance(download_url, str) and download_url.startswith("/"):
         download_url = urljoin(_get_download_base_url().rstrip("/") + "/", download_url.lstrip("/"))
     if not download_url:
-        download_url = _derive_download_url(req.task_id, artifact)
+        download_url = _derive_download_url(req.plan_id, artifact)
 
     try:
-        destination = _choose_output_path(req.task_id, download_url, artifact)
+        destination = _choose_output_path(req.plan_id, download_url, artifact)
         downloaded_size = _download_to_path(download_url, destination)
     except Exception as exc:
         return _wrap_response(
@@ -1031,9 +1018,11 @@ async def handle_plan_download(arguments: dict[str, Any]) -> CallToolResult:
 
 
 async def handle_plan_list(arguments: dict[str, Any]) -> CallToolResult:
-    """List recent tasks for an authenticated user via mcp_cloud."""
-    req = TaskListRequest(**arguments)
-    payload_args: dict[str, Any] = {"user_api_key": req.user_api_key, "limit": req.limit}
+    """List recent plans for an authenticated user via mcp_cloud."""
+    req = PlanListRequest(**arguments)
+    payload_args: dict[str, Any] = {"limit": req.limit}
+    if req.user_api_key:
+        payload_args["user_api_key"] = req.user_api_key
     payload, error = _call_remote_tool("plan_list", payload_args)
     if error:
         return _wrap_response({"error": error}, is_error=True)
@@ -1051,14 +1040,6 @@ async def handle_plan_list(arguments: dict[str, Any]) -> CallToolResult:
     "model_profiles": handle_model_profiles,
 }
 
-# Backward-compatible aliases
-handle_task_create = handle_plan_create
-handle_task_status = handle_plan_status
-handle_task_stop = handle_plan_stop
-handle_task_retry = handle_plan_retry
-handle_task_download = handle_plan_download
-handle_task_list = handle_plan_list
-
 
 async def main() -> None:
     logger.info("Starting PlanExe MCP local proxy using %s", _get_mcp_base_url())