SpecLeft MCP Server — Design Specification (v0.3)
## Goal
Add an MCP server to SpecLeft so that AI agents can discover and use it efficiently.
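## Acceptance criteria
- `features/` with end-to-end scenarios to cover the resources and tool calls.
- `python -m specleft.mcp`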
## Architecture
SpecLeft's MCP surface is intentionally minimal. The server exposes 3 resources for context and 1 tool for project setup. All other CLI capabilities are delegated to the agent's native shell execution via a skill file generated during init.
Include ./src/mcp in the current project.
MCP Server (always in context)
├── specleft://contract → safety guarantees
├── specleft://guide → workflow + spec format
├── specleft://status → project state + coverage
└── specleft_init → project setup + skill file generation
Skill file (read once from disk)
└── .specleft/SKILL.md → full CLI command reference
Rationale: Target agents (Claude Code, Cursor, Aider) already have native shell execution. Wrapping every CLI command as an MCP tool adds ~180 tokens per turn in declarations with no functional benefit over specleft <command> --format json. The skill file teaches workflow and sequencing more effectively than tool descriptions can, at zero per-turn cost.
## Testing
Test the SpecLeft MCP server (3 resources + 1 tool) using FastMCP’s in-memory Client for automated tests and MCP Inspector for manual debugging.
Test location: tests/mcp/
Test strategy: Details in comments
## Dependencies
[project.optional-dependencies]
mcp = ["fastmcp<3"]
dev = [
    # ... existing dev deps ...
    "pytest-asyncio",
    "tiktoken",
]
## Server Metadata
{
  "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
  "name": "io.github.SpecLeft/specleft",
  "title": "SpecLeft",
  "description": "Track and enforce Python feature coverage with verifiable safety guarantees. Generates pytest test scaffolding from markdown specifications, monitors implementation progress, and blocks PRs that violate coverage policies. Agent-optimised: 3 resources, 1 tool, offline-only, no telemetry, no API keys required.",
  "version": "0.3.0",
  "websiteUrl": "https://specleft.dev",
  "repository": {
    "url": "https://github.com/SpecLeft/specleft",
    "source": "github"
  },
  "packages": [
    {
      "registryType": "pypi",
      "registryBaseUrl": "https://pypi.org",
      "identifier": "specleft",
      "version": "0.3.0",
      "runtimeHint": "uvx",
      "transport": {
        "type": "stdio"
      },
      "packageArguments": [
        {
          "type": "positional",
          "value": "mcp"
        }
      ],
      "environmentVariables": []
    }
  ]
}
Registry tags: testing, python, pytest, specifications, tdd, bdd, code-quality
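A release check can catch drift between the top-level `version` and the PyPI package entry before the metadata is published. The sketch below shows one way to do that. `check_server_metadata` is a hypothetical helper, not part of SpecLeft, and the trimmed `metadata` dict only mirrors fields from the JSON above.

```python
def check_server_metadata(metadata: dict) -> list[str]:
    """Return human-readable problems; an empty list means the metadata is consistent."""
    problems = []
    top_version = metadata.get("version")
    for package in metadata.get("packages", []):
        name = package.get("identifier", "<unnamed>")
        # Each package entry should ship the same version as the server itself.
        if package.get("version") != top_version:
            problems.append(f"{name}: version {package.get('version')} != {top_version}")
        # This specification only describes a stdio transport.
        if package.get("transport", {}).get("type") != "stdio":
            problems.append(f"{name}: transport is not stdio")
    return problems


# Trimmed copy of the metadata above (assumed to be loaded from a JSON file).
metadata = {
    "version": "0.3.0",
    "packages": [
        {"identifier": "specleft", "version": "0.3.0", "transport": {"type": "stdio"}}
    ],
}
problems = check_server_metadata(metadata)

# Simulate a release where only the top-level version was bumped.
stale = {**metadata, "version": "0.3.1"}
stale_problems = check_server_metadata(stale)
```

An empty `problems` list means the two version fields agree; `stale_problems` carries one entry describing the mismatch.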
## Resources
Important: The payloads below illustrate shape only, not actual output. No functional change to existing output is expected unless stated otherwise.
### specleft://contract
| Field | Value |
|---|---|
| URI | `specleft://contract` |
| Name | SpecLeft Agent Contract |
| Parameter | `pretty: bool` (default: false) |
| MIME type | `application/json` |
| Description | Safety and determinism guarantees for this SpecLeft installation. Read before invoking any tools or CLI commands. Defines what is safe to retry, what never writes without confirmation, and exit code semantics. |
When agents should read this: Before first interaction. Establishes trust boundary.
Payload shape:
{
  "contract_version": "1.0",
  "specleft_version": "0.3.0",
  "guarantees": {
    "safety": {
      "no_implicit_writes": true,
      "dry_run_never_writes": true,
      "existing_tests_not_modified_by_default": true
    },
    "execution": {
      "skeletons_skipped_by_default": true,
      "skipped_never_fail": true,
      "validation_non_destructive": true
    },
    "determinism": {
      "deterministic_for_same_inputs": true,
      "safe_for_retries": true
    },
    "cli_api": {
      "json_supported_globally": true,
      "json_additive_within_minor": true,
      "exit_codes": {
        "success": 0,
        "error": 1,
        "cancelled": 2
      }
    },
    "skill_security": {
      "skill_file_integrity_check": true,
      "skill_file_commands_are_simple": true
    },
    "exit_codes": { "success": 0, "error": 1, "cancelled": 2 }
  }
}
Target size: ~80 tokens
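To illustrate how an agent-side wrapper might consume the contract, the sketch below maps a CLI exit code back to its name and checks the retry guarantee. The helper names `exit_code_name` and `retries_are_safe` are hypothetical, and the embedded payload is a trimmed sample of the shape above, not real output.

```python
import json

# Trimmed sample of the contract payload; only the fields used below.
CONTRACT = json.loads("""
{
  "contract_version": "1.0",
  "guarantees": {
    "determinism": {"deterministic_for_same_inputs": true, "safe_for_retries": true},
    "cli_api": {"exit_codes": {"success": 0, "error": 1, "cancelled": 2}}
  }
}
""")


def exit_code_name(contract: dict, code: int) -> str:
    """Translate a process exit code into the name declared by the contract."""
    codes = contract["guarantees"]["cli_api"]["exit_codes"]
    for name, value in codes.items():
        if value == code:
            return name
    return "unknown"  # the contract does not define this code


def retries_are_safe(contract: dict) -> bool:
    """True only when the contract declares both determinism guarantees."""
    determinism = contract["guarantees"]["determinism"]
    return bool(
        determinism.get("deterministic_for_same_inputs")
        and determinism.get("safe_for_retries")
    )


cancelled = exit_code_name(CONTRACT, 2)
undefined = exit_code_name(CONTRACT, 7)
can_retry = retries_are_safe(CONTRACT)
```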
### specleft://guide
| Field | Value |
|---|---|
| URI | `specleft://guide` |
| Name | SpecLeft Workflow Guide |
| MIME type | `application/json` |
| Parameter | `pretty: bool` (default: false) |
| Description | How to use SpecLeft: workflow sequence, canonical feature template, spec format rules, and scenario structure. Read once before creating or modifying feature specs. For full CLI command reference, read `.specleft/SKILL.md` after running init. |
When agents should read this: Before creating any feature specs or generating tests.
Expected Payload:
{
  "workflow": {
    "bulk_setup": [
      {"step": 1, "command": "plan --analyze", "description": "Understand how the current plan aligns with the prd-template.yml"},
      {"step": 2, "action": "modify", "description": "If required - modify the .specleft/prd-template.yml to fully align PRD"},
      {"step": 3, "command": "plan", "description": "Generate features from calibrated PRD"},
      {"step": 4, "command": "features add-scenario --add-test skeleton", "description": "Add scenarios with test scaffolding"}
    ],
    "incremental_setup": [
      {"step": 1, "command": "features add", "description": "Add individual feature"},
      {"step": 2, "command": "features add-scenario --add-test skeleton", "description": "Add scenarios with test scaffolding"}
    ],
    "implementation": [
      {"step": 1, "command": "next --limit 1", "description": "Pick next scenario to implement"},
      {"step": 2, "action": "implement", "description": "Write test logic first, followed by application code"},
      {"step": 3, "action": "pytest", "description": "Run test, fix if failing, repeat from step 1"},
      {"step": 4, "command": "features validate --strict", "description": "Validate all specs before commit"},
      {"step": 5, "command": "coverage --threshold 100", "description": "Verify full feature coverage"}
    ]
  },
  "skill_file": "Run specleft_init to generate .specleft/SKILL.md with full CLI reference"
}
Target size: ~220 tokens
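Each workflow entry carries either a `command` (run through the CLI) or an `action` (done by the agent itself). As a rough sketch of how an agent could turn one workflow into a checklist, the hypothetical helper `to_checklist` below prefixes commands with `specleft` and appends `--format json`, following the `specleft <command> --format json` convention from the rationale. The sample list is the `implementation` workflow, with descriptions kept only where they are used.

```python
def to_checklist(steps: list[dict]) -> list[str]:
    """Render workflow entries as shell commands or manual reminders, in step order."""
    lines = []
    for entry in sorted(steps, key=lambda item: item["step"]):
        if "command" in entry:
            # CLI-backed step: run it in the shell with JSON output.
            lines.append(f"specleft {entry['command']} --format json")
        else:
            # Agent-side step: nothing to execute through SpecLeft.
            lines.append(f"# manual: {entry['description']}")
    return lines


implementation = [
    {"step": 1, "command": "next --limit 1"},
    {"step": 2, "action": "implement", "description": "Write test logic first, followed by application code"},
    {"step": 3, "action": "pytest", "description": "Run test, fix if failing, repeat from step 1"},
    {"step": 4, "command": "features validate --strict"},
    {"step": 5, "command": "coverage --threshold 100"},
]
checklist = to_checklist(implementation)
```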
### specleft://status
| Field | Value |
|---|---|
| URI | `specleft://status` |
| Name | SpecLeft Project Status |
| MIME type | `application/json` |
| Arguments | `verbose` (default: false), `pretty` (default: false) |
| Description | Current project state: feature list, scenario counts, implementation coverage, and priority breakdown. Summary-first: use CLI commands for scenario-level detail. Returns empty features array if project is not initialised. |
When agents should read this: Before planning work. Periodically during implementation to track progress.
Default Payload shape (Verbose = False):
{"initialised": true, "features":3,"scenarios":12,"implemented":7,"skipped":5,"coverage_percent":58.3}
**Note:** `"initialised": true` must be added to the current status payload. Reuse the same function for the resource output, so that it matches the CLI command.
Payload shape (Verbose = True):
{
  "initialised": true,
  "summary": {
    "features": 12,
    "scenarios": 48,
    "implemented": 31,
    "skipped": 17,
    "coverage_percent": 64.6
  },
  "by_priority": {
    "critical": { "total": 5, "implemented": 5, "percent": 100.0 },
    "high": { "total": 12, "implemented": 8, "percent": 66.7 },
    "medium": { "total": 25, "implemented": 15, "percent": 60.0 },
    "low": { "total": 6, "implemented": 3, "percent": 50.0 }
  },
  "features": [
    {
      "feature_id": "auth",
      "scenarios": 6,
      "implemented": 6,
      "coverage_percent": 100.0
    },
    {
      "feature_id": "payments",
      "scenarios": 8,
      "implemented": 3,
      "coverage_percent": 37.5
    }
  ]
}
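The summary numbers in both payloads are consistent with a simple roll-up of per-feature counts. The sketch below is one possible derivation, not the existing SpecLeft function: `summarise` is a hypothetical name, and treating `skipped` as `scenarios - implemented` is an assumption that happens to match the sample figures.

```python
def summarise(features: list[dict], initialised: bool = True) -> dict:
    """Roll per-feature counts up into the flat (non-verbose) status shape."""
    scenarios = sum(feature["scenarios"] for feature in features)
    implemented = sum(feature["implemented"] for feature in features)
    # Guard against division by zero for an empty project.
    percent = round(100 * implemented / scenarios, 1) if scenarios else 0
    return {
        "initialised": initialised,
        "features": len(features),
        "scenarios": scenarios,
        "implemented": implemented,
        "skipped": scenarios - implemented,  # assumption: not implemented == skipped
        "coverage_percent": percent,
    }


# The two feature entries from the verbose sample above.
sample = summarise([
    {"feature_id": "auth", "scenarios": 6, "implemented": 6},
    {"feature_id": "payments", "scenarios": 8, "implemented": 3},
])
empty = summarise([], initialised=False)
```

For the two sample features this gives 14 scenarios, 9 implemented, 5 skipped and a coverage of 64.3 percent.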
Scaling behaviour: Summary is fixed ~30 tokens. Each feature entry is ~15 tokens. A 20-feature project costs ~330 tokens. Scenario-level detail is never inlined — agents drill down via specleft next --feature <id> --format json in the shell.
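The scaling estimate is linear in the number of features, which makes it easy to check. A minimal sketch using the approximate per-item figures quoted above (the function name is hypothetical):

```python
SUMMARY_TOKENS = 30      # fixed cost of the summary block (approximate)
TOKENS_PER_FEATURE = 15  # cost of each feature entry (approximate)


def estimate_status_tokens(feature_count: int) -> int:
    """Approximate size of the verbose status payload in tokens."""
    return SUMMARY_TOKENS + TOKENS_PER_FEATURE * feature_count


twenty_features = estimate_status_tokens(20)  # 30 + 15 * 20 = 330
twelve_features = estimate_status_tokens(12)  # 30 + 15 * 12 = 210
```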
Uninitialised project response:
{
  "initialised": false,
  "summary": { "features": 0, "scenarios": 0, "implemented": 0, "skipped": 0, "coverage_percent": 0 },
  "by_priority": {},
  "features": []
}
This signals the agent to call specleft_init before proceeding.
## Tools
### specleft_init
| Field | Value |
|---|---|
| Name | `specleft_init` |
| Description | Initialise SpecLeft project: creates directory structure (features/, tests/, .specleft/), runs environment health checks, and generates .specleft/SKILL.md with full CLI command reference. Call this first if specleft://status shows initialised=false. |
Input schema:
{
  "type": "object",
  "properties": {
    "example": {
      "type": "boolean",
      "default": false,
      "description": "Create example feature specs to demonstrate format"
    },
    "blank": {
      "type": "boolean",
      "default": false,
      "description": "Create empty directory structure only"
    },
    "dry_run": {
      "type": "boolean",
      "default": false,
      "description": "Show what would be created without writing"
    }
  }
}
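Because every property is an optional boolean with a default, a handler can normalise its arguments before doing any work. The sketch below does this with plain Python and no schema library. `normalise_init_args` is a hypothetical helper, and rejecting unknown keys is an assumption, since the schema above does not set `additionalProperties`.

```python
INIT_SCHEMA = {
    "type": "object",
    "properties": {
        "example": {"type": "boolean", "default": False},
        "blank": {"type": "boolean", "default": False},
        "dry_run": {"type": "boolean", "default": False},
    },
}


def normalise_init_args(args: dict, schema: dict = INIT_SCHEMA) -> dict:
    """Fill in schema defaults and reject unknown or non-boolean arguments."""
    properties = schema["properties"]
    unknown = sorted(set(args) - set(properties))
    if unknown:
        raise ValueError(f"unknown arguments: {', '.join(unknown)}")
    normalised = {}
    for name, spec in properties.items():
        value = args.get(name, spec["default"])
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean")
        normalised[name] = value
    return normalised


dry_run_args = normalise_init_args({"dry_run": True})
default_args = normalise_init_args({})
```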
Output shape:
{
  "success": true,
  "health": {
    "python_version": "3.11.0",
    "dependencies_ok": true,
    "plugin_registered": true
  },
  "created": [
    ".specleft/specs",
    ".specleft/policies",
    ".specleft/",
    ".specleft/SKILL.md"
  ],
  "skill_file": ".specleft/SKILL.md",
  "next_steps": "Read .specleft/SKILL.md for full CLI reference"
}
Health check integration: Init runs specleft doctor checks internally. If the environment is unhealthy, init fails with diagnostic output rather than creating a broken project. No separate doctor resource needed.
Idempotent: Safe to re-run. Existing directories and files are preserved. Skill file is regenerated to reflect current CLI version.
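The idempotency and dry-run rules can be sketched with the standard library alone. This is an illustration of the behaviour described above, not the SpecLeft implementation: `init_project` is a hypothetical function, the directory list is taken from the tool description, and the health checks are left out.

```python
import tempfile
from pathlib import Path

DIRECTORIES = ("features", "tests", ".specleft")
SKILL_FILE = ".specleft/SKILL.md"


def init_project(root: Path, skill_text: str, dry_run: bool = False) -> list[str]:
    """Create missing directories and (re)write the skill file; return affected paths."""
    affected = []
    for name in DIRECTORIES:
        directory = root / name
        if directory.exists():
            continue  # existing directories and their files are left alone
        affected.append(f"{name}/")
        if not dry_run:
            directory.mkdir(parents=True)
    # The skill file is always regenerated so it matches the current CLI version.
    affected.append(SKILL_FILE)
    if not dry_run:
        (root / SKILL_FILE).write_text(skill_text, encoding="utf-8")
    return affected


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    preview = init_project(root, "v1", dry_run=True)
    wrote_on_dry_run = any(root.iterdir())  # stays False: a dry run never writes
    first = init_project(root, "v1")
    (root / "features" / "auth.md").write_text("user spec", encoding="utf-8")
    second = init_project(root, "v2")  # re-run after the user added a spec
    user_file_kept = (root / "features" / "auth.md").read_text(encoding="utf-8") == "user spec"
    final_skill = (root / SKILL_FILE).read_text(encoding="utf-8")
```

On the second run only the skill file is reported, the user's spec file is untouched, and the skill file holds the newer text.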
Add the following snippet to the README.md
## MCP Server Setup
SpecLeft includes an MCP server that connects directly to AI coding agents like Claude Code, Cursor, Codex, OpenCode etc. Once connected, your agent can read specs, track coverage, and generate test scaffolding without leaving the conversation.
See [GET_STARTED.md](https://github.com/SpecLeft/specleft/blob/main/GET_STARTED.md) for details.
Add the following snippet to the GET_STARTED.md
## MCP Server Setup
SpecLeft includes an MCP server that connects directly to AI coding agents like Claude Code, Cursor, and Windsurf. Once connected, your agent can read specs, track coverage, and generate test scaffolding without leaving the conversation.
### Prerequisites
- Python 3.10+
- SpecLeft installed (`pip install specleft[mcp]`)
### Option 1: uvx (recommended)
If you have [uv](https://docs.astral.sh/uv/getting-started/installation/) installed, this is the fastest path. No separate install step — the MCP client handles everything.
**Claude Code** (`.claude/settings.json`):
```json
{
  "mcpServers": {
    "specleft": {
      "command": "uvx",
      "args": ["specleft", "mcp"]
    }
  }
}
```
**Cursor** (`.cursor/mcp.json`):
```json
{
  "mcpServers": {
    "specleft": {
      "command": "uvx",
      "args": ["specleft", "mcp"]
    }
  }
}
```
### Option 2: pip install
If you don't have `uv`, install SpecLeft first, then point your MCP client at the CLI directly.
```bash
pip install specleft[mcp]
```
**Claude Code** (`.claude/settings.json`):
```json
{
  "mcpServers": {
    "specleft": {
      "command": "specleft",
      "args": ["mcp"]
    }
  }
}
```
**Cursor** (`.cursor/mcp.json`):
```json
{
  "mcpServers": {
    "specleft": {
      "command": "specleft",
      "args": ["mcp"]
    }
  }
}
```
### Verify the connection
Once configured, restart your editor. You should see SpecLeft listed as a connected MCP server with 3 resources and 1 tool. If the connection fails, run the health check with `specleft doctor`.
This checks Python version, dependencies, and plugin registration. Fix any issues it reports, then restart your editor to reconnect.
---
## CLI → MCP Classification Reference
Complete mapping of all CLI commands to their MCP role.
### Exposed via MCP (always in context)
| CLI Command | MCP Type | MCP Identifier |
|---|---|---|
| `specleft contract` | Resource | `specleft://contract` |
| N/A | Resource | `specleft://guide` |
| `specleft status` + `coverage` + `features list` + `features stats` | Resource (merged) | `specleft://status` |
| `specleft init` + `doctor` | Tool (merged) | `specleft_init` |
### Delegated to skill file (zero per-turn cost)
| CLI Command | Skill file section | Mutation |
|---|---|---|
| `specleft features validate` | Features | Read-only |
| `specleft features add` | Features | Write (creates file) |
| `specleft features add-scenario` | Features | Write (appends to file) |
| `specleft next` | Status & Planning | Read-only |
| `specleft test skeleton` | Test Generation | Write (creates files) |
| `specleft test stub` | Test Generation | Write (creates files) |
| `specleft test report` | Test Generation | Write (creates HTML) |
| `specleft skill status` | Skill Integrity | Read-only |
| `specleft plan` | Planning | Write (creates feature files) |
| `specleft contract test` | Contract | Read-only (isolated temp dir) |
| `specleft enforce` | Enforcement | Read-only |
| `specleft license status` | License | Read-only |
| `specleft guide` | Status & Planning | Read-only |
---
## Agent Discovery Flow
1. Agent connects to MCP server
2. Sees 3 resources + 1 tool (minimal surface)
3. Reads specleft://contract → understands safety model
4. Reads specleft://guide → understands workflow
5. Reads specleft://status → understands project state
6. If initialised=false → calls specleft_init
7. Reads .specleft/SKILL.md → has full CLI reference (19 commands)
8. Works using shell commands with --format json
Steps 3–5 happen naturally based on resource descriptions. Step 6 is triggered by the status response. Step 7 is prompted by both the guide payload and init output.
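The only branch in this flow is the check on the status payload. A small sketch of that decision follows; `remaining_steps` is a hypothetical helper and the step labels are illustrative strings, not protocol messages.

```python
def remaining_steps(status: dict) -> list[str]:
    """List what an agent still has to do once it has read specleft://status."""
    steps = []
    # A missing flag is treated like an uninitialised project (assumption).
    if not status.get("initialised", False):
        steps.append("call tool specleft_init")
    steps.append("read .specleft/SKILL.md")
    steps.append("run specleft <command> --format json")
    return steps


fresh_project = remaining_steps({"initialised": False, "features": []})
existing_project = remaining_steps({"initialised": True, "features": 3})
```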
---
## Token Budget
### Per-turn overhead (every turn of conversation)
| Component | Tokens | Notes |
|---|---|---|
| Resource declarations (×3) | ~60 | URI + name + description each |
| Tool declaration (×1) | ~25 | Name + description + input schema |
| **Total per turn** | **~85** | |
| **30-turn session** | **~2,550** | |
### One-time costs (read once per session)
| Component | Tokens | Notes |
|---|---|---|
| Contract payload | ~80 | Read at session start |
| Guide payload | ~500 | Read at session start |
| Status payload | ~200 | Read at start + periodically (12-feature project) |
| Skill file | ~800 | Read from disk after init |
| **Total one-time** | **~1,580** | |
### Session total
**30-turn session: ~4,130 tokens**
Compared to wrapping all 19 commands as MCP tools (~15,000+ tokens), this is a **72% reduction**.
### Net-positive threshold
One avoided wrong-path-then-correction cycle costs 2,000–5,000 tokens. SpecLeft's MCP presence pays for itself if it prevents a **single** misstep per session. At ~4,130 tokens total overhead, the margin is wide.
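The budget figures can be reproduced with a few lines of arithmetic, which is also a convenient place to re-run the numbers if any estimate changes. All values are the approximate token counts from the tables above; the 15,000-token baseline is the document's own estimate for wrapping all 19 commands.

```python
PER_TURN = {"resource_declarations": 60, "tool_declaration": 25}
ONE_TIME = {"contract": 80, "guide": 500, "status": 200, "skill_file": 800}
ALL_TOOLS_BASELINE = 15_000  # estimated cost of exposing every command as a tool


def session_tokens(turns: int) -> int:
    """Per-turn declaration overhead plus payloads that are read once per session."""
    return turns * sum(PER_TURN.values()) + sum(ONE_TIME.values())


per_turn = sum(PER_TURN.values())            # 85
one_time = sum(ONE_TIME.values())            # 1580
total = session_tokens(30)                   # 30 * 85 + 1580 = 4130
reduction = 1 - total / ALL_TOOLS_BASELINE   # about 0.72
```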