250 changes: 128 additions & 122 deletions docs/guides/spec-driven-development.md
@@ -47,68 +47,72 @@ Run the planning process:

It will perform the following refinement process to update the task file with a more detailed specification:

```mermaid
flowchart TB
subgraph Input
A[📄 Draft Task File<br/>.specs/tasks/draft/*.md]
end

subgraph Phase2["Phase 2: Parallel Analysis"]
direction LR
B1[🔬 Research<br/>researcher · sonnet]
B2[📂 Codebase Analysis<br/>code-explorer · sonnet]
B3[💼 Business Analysis<br/>business-analyst · opus]

J1[⚖️ Judge 2a]
J2[⚖️ Judge 2b]
J3[⚖️ Judge 2c]

B1 --> J1
B2 --> J2
B3 --> J3
end

subgraph Phase3["Phase 3: Architecture Synthesis"]
C[🏗️ Architecture Synthesis<br/>software-architect · opus]
JC[⚖️ Judge 3]
C --> JC
end

subgraph Phase4["Phase 4: Decomposition"]
D[📋 Decomposition<br/>tech-lead · opus]
JD[⚖️ Judge 4]
D --> JD
end

subgraph Phase5["Phase 5: Parallelize"]
E[🔀 Parallelize Steps<br/>team-lead · opus]
JE[⚖️ Judge 5]
E --> JE
end

subgraph Phase6["Phase 6: Verifications"]
F[✅ Define Verifications<br/>qa-engineer · opus]
JF[⚖️ Judge 6]
F --> JF
end

subgraph Output
G[📄 Refined Task File<br/>.specs/tasks/todo/*.md]
H[📚 Skill File<br/>.claude/skills/*/SKILL.md]
I[📊 Analysis File<br/>.specs/analysis/*.md]
end

A --> Phase2
J1 & J2 & J3 --> Phase3
JC --> Phase4
JD --> Phase5
JE --> Phase6
JF --> G & H & I

style A fill:#e1f5fe
style G fill:#c8e6c9
style H fill:#c8e6c9
style I fill:#c8e6c9
```

```
                 +----------------------------+
                 |      Draft Task File       |
                 |  .specs/tasks/draft/*.md   |
                 +-------------+--------------+
                               |
                               v
+----------------------------------------------------------+
|                Phase 2: Parallel Analysis                |
|                                                          |
| +----------------+  +------------------+  +-------------+|
| |   Research     |  |     Codebase     |  |  Business   ||
| |  researcher    |  |     Analysis     |  |  Analysis   ||
| |   (sonnet)     |  |  code-explorer   |  |  business-  ||
| |      |         |  |     (sonnet)     |  |   analyst   ||
| |      v         |  |        |         |  |   (opus)    ||
| |   Judge 2a     |  |     Judge 2b     |  |      |      ||
| +------+---------+  +--------+---------+  +------+------+|
|        |                     |                   |       |
+----------------------------------------------------------+
         |                     |                   |
         +---------------------+-------------------+
                               |
                               v
                +-----------------------------+
                |    Phase 3: Architecture    |
                |  software-architect (opus)  |
                |              |              |
                |              v              |
                |           Judge 3           |
                +--------------+--------------+
                               |
                               v
                +-----------------------------+
                |   Phase 4: Decomposition    |
                |      tech-lead (opus)       |
                |              |              |
                |              v              |
                |           Judge 4           |
                +--------------+--------------+
                               |
                               v
                +-----------------------------+
                |    Phase 5: Parallelize     |
                |      team-lead (opus)       |
                |              |              |
                |              v              |
                |           Judge 5           |
                +--------------+--------------+
                               |
                               v
                +-----------------------------+
                |   Phase 6: Verifications    |
                |     qa-engineer (opus)      |
                |              |              |
                |              v              |
                |           Judge 6           |
                +--------------+--------------+
                               |
             +-----------------+-----------------+
             |                 |                 |
             v                 v                 v
     +--------------+  +--------------+  +---------------+
     | Refined Task |  |  Skill File  |  | Analysis File |
     |  todo/*.md   |  |   SKILL.md   |  | analysis-*.md |
     +--------------+  +--------------+  +---------------+
```

It will output the updated task file to `.specs/tasks/todo/design-implement-authentication-middleware-with-jwt-support.feature.md` and create new skills if needed. It also produces scratchpads and verification reports along the way to properly evaluate each step of the process. You can safely ignore all of them.
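
The ordering in the planning diagram can be sketched in a few lines of Python. This is only an illustration of the control flow, not the plugin's code: the `Stage` type, `run_stage`, and `run_planning` are hypothetical names, and the agent call plus its judge are collapsed into one stand-in function. The stage names, agents, and models are the ones shown in the diagram.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    phase: int
    name: str
    agent: str
    model: str


# Phase 2 stages are independent of each other, so they can overlap.
PHASE_2 = [
    Stage(2, "Research", "researcher", "sonnet"),
    Stage(2, "Codebase Analysis", "code-explorer", "sonnet"),
    Stage(2, "Business Analysis", "business-analyst", "opus"),
]

# Phases 3 to 6 each consume the judged output of the previous phase.
SEQUENTIAL = [
    Stage(3, "Architecture Synthesis", "software-architect", "opus"),
    Stage(4, "Decomposition", "tech-lead", "opus"),
    Stage(5, "Parallelize Steps", "team-lead", "opus"),
    Stage(6, "Define Verifications", "qa-engineer", "opus"),
]


def run_stage(stage: Stage) -> str:
    """Hypothetical stand-in for 'agent produces output, judge accepts it'."""
    return f"phase {stage.phase}: {stage.name} [{stage.agent}/{stage.model}] judged"


def run_planning() -> list[str]:
    """Run Phase 2 in parallel, then the remaining phases one by one."""
    log: list[str] = []
    with ThreadPoolExecutor(max_workers=len(PHASE_2)) as pool:
        # map() returns results in input order even when calls overlap.
        log.extend(pool.map(run_stage, PHASE_2))
    for stage in SEQUENTIAL:
        log.append(run_stage(stage))
    return log
```

Calling `run_planning()` returns seven log entries: the three Phase 2 stages first, then Phases 3 through 6 in order.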
@@ -125,66 +129,68 @@ Once you are happy with the specification, you can run the implementation proces

It will perform the following actions:

```mermaid
flowchart TB
subgraph Phase0["Phase 0: Select Task"]
A[📄 Task from todo/<br/>or in-progress/]
A --> B[📁 Move to in-progress/]
end

subgraph Phase1["Phase 1: Load Task"]
C[📖 Parse Implementation Steps<br/>& Verification Requirements]
end

subgraph Phase2["Phase 2: Execute Steps"]
D[🔄 For Each Step]

subgraph StepExec["Step Execution Loop"]
E[👨‍💻 Developer Agent<br/>Implement Step]
F{Verification<br/>Level?}

G1[⏭️ None<br/>Skip Judge]
G2[⚖️ Single Judge<br/>threshold: 4.0]
G3[⚖️⚖️ Panel of 2<br/>threshold: 4.5]
G4[⚖️ Per-Item<br/>Parallel Judges]

H{PASS?}
I[🔧 Fix & Retry<br/>with feedback]
J[✅ Mark Step DONE]
end

D --> E
E --> F
F -->|None| G1 --> J
F -->|Single| G2 --> H
F -->|Panel| G3 --> H
F -->|Per-Item| G4 --> H
H -->|Yes| J
H -->|No| I --> E
J --> D
end

subgraph Phase3["Phase 3: Final Verification"]
K[📋 Verify Definition of Done]
L{All DoD<br/>PASS?}
M[🔧 Fix Failing Items]
end

subgraph Phase4["Phase 4: Complete"]
N[📁 Move to done/]
O[📊 Final Report]
end

Phase0 --> Phase1
Phase1 --> Phase2
Phase2 --> Phase3
K --> L
L -->|No| M --> K
L -->|Yes| Phase4

style A fill:#e1f5fe
style N fill:#c8e6c9
style O fill:#c8e6c9
```

```
    +--------------------------------------+
    |         Phase 0: Select Task         |
    |   Task from todo/ or in-progress/    |
    |                  |                   |
    |                  v                   |
    |         Move to in-progress/         |
    +------------------+-------------------+
                       |
                       v
    +--------------------------------------+
    |          Phase 1: Load Task          |
    |      Parse Implementation Steps      |
    |     & Verification Requirements      |
    +------------------+-------------------+
                       |
                       v
+------------------------------------------------------+
|                Phase 2: Execute Steps                |
|                                                      |
|  For Each Step:                                      |
|                                                      |
|           Developer Agent: Implement Step <---+      |
|                          |                    |      |
|                          v                    |      |
|                 Verification Level?           |      |
|              |       |       |         |      |      |
|             None  Single   Panel   Per-Item   |      |
|              |     (4.0)   (4.5)  (Parallel)  |      |
|              |       |       |         |      |      |
|              |       +-------+---------+      |      |
|              |               |                |      |
|              |               v                |      |
|              |             PASS? --No--> Fix & Retry |
|              |               |                       |
|              |              Yes                      |
|              +-------+-------+                       |
|                      |                               |
|                      v                               |
|               Mark Step DONE                         |
+----------------------+-------------------------------+
                       |
                       v
    +--------------------------------------+
    |     Phase 3: Final Verification      |
    |                                      |
    |  Verify Definition of Done <--+      |
    |              |                |      |
    |              v                |      |
    |        All DoD PASS?          |      |
    |           /     \             |      |
    |       Yes         No          |      |
    |        |             \        |      |
    |        |   Fix Failing Items--+      |
    +--------+-----------------------------+
             |
             v
    +--------------------------------------+
    |          Phase 4: Complete           |
    |            Move to done/             |
    |             Final Report             |
    +--------------------------------------+
```

It will automatically write tests, verify them, build the solution, and confirm it works as expected.
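
The step execution loop from the diagram (implement, judge, retry on failure) can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: `passes`, `execute_step`, the `max_attempts` limit, and averaging a panel's scores are all assumptions made for the example. Only the thresholds (4.0 for a single judge, 4.5 for a panel of two) come from the diagram, and the "Per-Item" level is left out.

```python
from statistics import mean
from typing import Callable

# Thresholds and judge counts as shown in the diagram for the
# "Single" and "Panel" levels. "Per-Item" is not modelled here.
THRESHOLDS = {"single": 4.0, "panel": 4.5}
JUDGE_COUNT = {"single": 1, "panel": 2}

Judge = Callable[[str], float]


def passes(level: str, artifact: str, judge: Judge) -> bool:
    """Return True when the artifact clears the level's threshold."""
    if level == "none":
        return True  # no judge runs for this level
    scores = [judge(artifact) for _ in range(JUDGE_COUNT[level])]
    # Assumption: a panel's scores are averaged before the comparison.
    return mean(scores) >= THRESHOLDS[level]


def execute_step(
    level: str,
    implement: Callable[[int], str],
    judge: Judge,
    max_attempts: int = 3,  # assumption: the source does not state a limit
) -> int:
    """Implement, verify, and retry; return the attempt number that passed."""
    for attempt in range(1, max_attempts + 1):
        artifact = implement(attempt)
        if passes(level, artifact, judge):
            return attempt
    raise RuntimeError(f"step still failing after {max_attempts} attempts")
```

In this sketch the retry receives only the attempt number. In the flow above the developer agent also receives the judge's feedback, which is what makes the second attempt more likely to pass.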
72 changes: 24 additions & 48 deletions docs/plugins/sdd/README.md
@@ -8,7 +8,7 @@ This plugin is designed to consistently and reproducibly produce working code. I

- **Development as compilation** — The plugin works like a "compilation" or "nightly build" for your development process: `task specs → run /sdd:implement → working code`. After writing your prompt, you can launch the plugin and expect a working result when you come back. The time it takes depends on task complexity — simple tasks may finish in 30 minutes, while complex ones can take a few days.
- **Benchmark-level quality in real life** — Model benchmarks improve with each release, yet real-world results usually stay the same. That's because benchmarks reflect the best possible output a model can achieve, whereas in practice LLMs tend to drift toward sub-optimal solutions that can be wrong or non-functional. This plugin uses a variety of patterns to keep the model working at its peak performance.
- **Customizable** — Balance between result quality and process speed by adjusting command parameters. Learn more in the [Customization](./customization) section.
- **Customizable** — Balance between result quality and process speed by adjusting command parameters. Learn more in the [Customization](customization.md) section.
- **Developer time-efficient** — The overall process is designed to minimize developer time and reduce the number of interactions, while still producing results better than what a model can generate from scratch. However, overall quality depends heavily on the time you invest in iterating on and refining the specification.
- **Industry-standard** — The plugin's specification template is based on the arc42 standard, adjusted for LLM capabilities. Arc42 is a widely adopted, high-quality standard for software development documentation used by many companies and organizations.
- **Works best in complex or large codebases** — While most other frameworks work best for new projects and greenfield development, this plugin is designed to perform better the more existing code and well-structured architecture you have. At each planning phase it includes a **codebase impact analysis** step that evaluates which files may be affected and which patterns to follow to achieve the desired result.
@@ -46,71 +46,47 @@ Restart the Claude Code session to clear context and start fresh. Then run the f
```
# produces working implementation and moves the task to .specs/tasks/done/ folder
```

- [Detailed guide](../../guides/spec-driven-development)
- [Usage Examples](./usage-examples)
- [Detailed guide](../../guides/spec-driven-development.md)
- [Usage Examples](usage-examples.md)

## Overall Flow

End-to-end task implementation process from initial prompt to pull request, including commands from the [git](../git) plugin:
End-to-end task implementation process from initial prompt to pull request, including commands from the [git](../git/README.md) plugin:

- `/sdd:add-task` → creates a `.specs/tasks/draft/<task-name>.<type>.md` file with the initial task description.
- `/sdd:plan` → generates a `.claude/skills/<skill-name>/SKILL.md` file with skills needed to implement the task (by analyzing library and framework documentation used in the codebase), then updates the task file with a refined specification and moves it to `.specs/tasks/todo/`.
- `/sdd:implement` → produces a working implementation, verifies it, then moves the task to `.specs/tasks/done/`.
- `/git:commit` → commits changes.
- `/git:create-pr` → creates a pull request.
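
The folder moves described above form a simple four-stage lifecycle: `draft/`, `todo/`, `in-progress/`, `done/`. The sketch below shows that lifecycle as plain file moves. It is illustrative only: the plugin performs these moves itself, and `advance`, `demo`, and the `example.feature.md` file name are hypothetical.

```python
import tempfile
from pathlib import Path

# Lifecycle folders under .specs/tasks/, in the order a task visits them.
LIFECYCLE = ["draft", "todo", "in-progress", "done"]


def advance(task: Path) -> Path:
    """Move a task file to the next lifecycle folder and return its new path."""
    index = LIFECYCLE.index(task.parent.name)
    if index == len(LIFECYCLE) - 1:
        raise ValueError("task is already in done/")
    target_dir = task.parent.parent / LIFECYCLE[index + 1]
    target_dir.mkdir(parents=True, exist_ok=True)
    return task.rename(target_dir / task.name)


def demo() -> list[str]:
    """Walk one task through the whole lifecycle inside a temporary directory."""
    visited: list[str] = []
    with tempfile.TemporaryDirectory() as root:
        draft_dir = Path(root) / ".specs" / "tasks" / "draft"
        draft_dir.mkdir(parents=True)
        task = draft_dir / "example.feature.md"  # hypothetical task file
        task.write_text("# Example task\n")
        while task.parent.name != "done":
            task = advance(task)
            visited.append(task.parent.name)
    return visited
```

Here `demo()` visits `todo`, `in-progress`, and `done` in that order, which mirrors what `/sdd:plan` and `/sdd:implement` do to the task file.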

```mermaid
flowchart LR
subgraph Create["1. Create"]
A["/sdd:add-task"]
end

subgraph Plan["2. Plan"]
B["/sdd:plan"]
end

subgraph Implement["3. Implement"]
C["/sdd:implement"]
end

subgraph Ship["4. Ship"]
D["/git:commit"]
E["/git:create-pr"]
end

subgraph Files["Task Lifecycle"]
F1[📄 draft/*.md]
F2[📄 todo/*.md]
F3[📄 in-progress/*.md]
F4[📄 done/*.md]
end

A --> F1
F1 --> B
B --> F2
F2 --> C
C --> F3
F3 --> F4
F4 --> D --> E

style F1 fill:#ffecb3
style F2 fill:#e1f5fe
style F3 fill:#fff3e0
style F4 fill:#c8e6c9
```

```
   1. Create        2. Plan       3. Implement           4. Ship
+-------------+  +-----------+  +---------------+  +-----------------+
|/sdd:add-task|  | /sdd:plan |  |/sdd:implement |  |   /git:commit   |
+------+------+  +-----+-----+  +------+--------+  |        |        |
       |               |               |           |        v        |
       v               v               v           | /git:create-pr  |
                                                   +--------+--------+
                                                            |
  Task Lifecycle                                            |
  +----------+   +----------+   +--------------+   +---------+
  |  draft/  +-->|  todo/   +-->| in-progress/ +-->|  done/  |
  |   *.md   |   |   *.md   |   |     *.md     |   |  *.md   |
  +----------+   +----------+   +--------------+   +---------+
```

## Commands

Core workflow commands:

- [/sdd:add-task](./add-task) - Create task template file with initial prompt
- [/sdd:plan](./plan) - Analyze prompt, generate required skills and refine task specification
- [/sdd:implement](./implement) - Produce working implementation of the task and verify it
- [/sdd:add-task](add-task.md) - Create task template file with initial prompt
- [/sdd:plan](plan.md) - Analyze prompt, generate required skills and refine task specification
- [/sdd:implement](implement.md) - Produce working implementation of the task and verify it

Additional commands useful before creating a task:

- [/sdd:create-ideas](./create-ideas) - Generate diverse ideas on a given topic using creative sampling techniques
- [/sdd:brainstorm](./brainstorm) - Refine vague ideas into fully-formed designs through collaborative dialogue
- [/sdd:create-ideas](create-ideas.md) - Generate diverse ideas on a given topic using creative sampling techniques
- [/sdd:brainstorm](brainstorm.md) - Refine vague ideas into fully-formed designs through collaborative dialogue

## Available Agents

@@ -149,7 +125,7 @@ Our tests showed that even when the initially generated specification was incorr

Even if you don't want to spend much time on this process, you can still use the plugin for complex tasks without decomposition or human verification — but you will likely need tools like ralph-loop to keep the agent running for a longer time.

Learn more about available customization options in [Customization](./customization).
Learn more about available customization options in [Customization](customization.md).

## Theoretical Foundation
