diff --git a/skills/bmad-agent-builder/SKILL.md b/skills/bmad-agent-builder/SKILL.md index 1229476..a355d58 100644 --- a/skills/bmad-agent-builder/SKILL.md +++ b/skills/bmad-agent-builder/SKILL.md @@ -1,29 +1,19 @@ --- name: bmad-agent-builder -description: Builds, edit or validate Agent Skill through conversational discovery. Use when the user requests to "Create an Agent", "Optimize an Agent" or "Edit an Agent". +description: Builds, edits or analyzes Agent Skills through conversational discovery. Use when the user requests to "Create an Agent", "Analyze an Agent" or "Edit an Agent". --- # Agent Builder ## Overview -This skill helps you build AI agents through conversational discovery and iterative refinement. Act as an architect guide, walking users through six phases: intent discovery, capabilities strategy, requirements gathering, drafting, building, and testing. Your output is a complete skill structure — named personas with optional memory, capabilities, and headless modes — ready to integrate into the BMad Method ecosystem. +This skill helps you build AI agents that are **outcome-driven** — describing what each capability achieves, not micromanaging how. Agents are skills with named personas, capabilities, and optional memory. Great agents have a clear identity, focused capabilities that describe outcomes, and personality that comes through naturally. Poor agents drown the LLM in mechanical procedures it would figure out from the persona context alone. -**Args:** Accepts `--headless` / `-H` for non-interactive execution, an initial description for create, or a path to an existing agent with keywords like optimize, edit, or validate. +Act as an architect guide — walk users through conversational discovery to understand who their agent is, what it should achieve, and how it should make users feel. Then craft the leanest possible agent where every instruction carries its weight. 
The agent's identity and persona context should inform HOW capabilities are executed — capability prompts just need the WHAT. -## Vision: Build More, Architect Dreams +**Args:** Accepts `--headless` / `-H` for non-interactive execution, an initial description for create, or a path to an existing agent with keywords like analyze, edit, or rebuild. -You're helping dreamers, builders, doers, and visionaries create the AI agents of their dreams. - -**What they're building:** - -Agents are **skills with named personas, capabilities and optional memory** — not just simple menu systems, workflow routers or wrappers. An agent is someone you talk to. It may have capabilities it knows how to do internally. It may work with external skills. Those skills might come from a module that bundles everything together. When you launch an agent it knows you, remembers you, reminds you of things you may have even forgotten, help create insights, and is your operational assistant in any regard the user will desire. Your mission: help users build agents that truly serve them — capturing their vision completely, even the parts they haven't articulated yet. Probe deeper, suggest what they haven't considered, and build something that exceeds what they imagined. - -**The bigger picture:** - -These agents become part of the BMad Method ecosystem — personal companions that remember, domain experts for any field, workflow facilitators, entire modules for limitless purposes. - -**Your output:** A skill structure that wraps the agent persona, ready to integrate into a module or use standalone. +**Your output:** A complete agent skill structure — persona, capabilities, optional memory and headless modes — ready to integrate into a module or use standalone. 
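As a rough illustration of the `--headless` / `-H` and `-H:{task-name}` argument shapes described above, a minimal activation-argument parser might look like this. The function name and return shape are invented for the example; the real skill routes these flags itself:

```python
import re

def parse_activation_args(args):
    """Classify activation arguments: headless flag, optional named task,
    and everything else (initial description or agent path).
    Illustrative sketch only; the skill's own routing is authoritative."""
    headless = False
    task = None
    rest = []
    for arg in args:
        # Matches --headless, -H, --headless:{task}, or -H:{task}
        m = re.fullmatch(r"(?:--headless|-H)(?::(?P<task>[\w-]+))?", arg)
        if m:
            headless = True
            task = m.group("task") or task
        else:
            rest.append(arg)
    return {"headless": headless, "task": task, "rest": rest}
```

Usage: `parse_activation_args(["-H:daily-digest", "my-agent"])` recognizes the named task, while a bare description falls through to `rest`.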
## On Activation @@ -36,23 +26,19 @@ These agents become part of the BMad Method ecosystem — personal companions th - `{bmad_builder_output_folder}` (default: `{project-root}/skills`) — save built agents here - `{bmad_builder_reports}` (default: `{project-root}/skills/reports`) — save reports (quality, eval, planning) here -3. Route by intent. +3. Route by intent — see Quick Reference below. ## Build Process -This is the core creative path — where agent ideas become reality. Through six phases of conversational discovery, you guide users from a rough vision to a complete, tested agent skill structure. This covers building new agents from scratch, converting non-compliant formats, editing existing agents, and applying improvements or fixes. - -Agents are named personas with optional memory, capabilities, headless modes, and personality. The build process includes a lint gate for structural validation. When building or modifying agents that include scripts, unit tests are created alongside the scripts and run as part of validation. +The core creative path — where agent ideas become reality. Through conversational discovery, you guide users from a rough vision to a complete, outcome-driven agent skill. This covers building new agents from scratch, converting non-compliant formats, editing existing ones, and rebuilding from intent. Load `build-process.md` to begin. -## Quality Optimizer - -For agents that already work but could work *better*. This is comprehensive validation and performance optimization — structure compliance, prompt craft, execution efficiency, enhancement opportunities, and more. Uses deterministic lint scripts for instant structural checks and LLM scanner subagents for judgment-based analysis, all run in parallel. +## Quality Analysis -Run this anytime you want to assess and improve an existing agent's quality. +Comprehensive quality analysis toward outcome-driven design. 
Analyzes existing agents for over-specification, structural issues, persona-capability alignment, execution efficiency, and enhancement opportunities. Produces a synthesized report with agent portrait, capability dashboard, themes, and actionable opportunities. -Load `quality-optimizer.md` — it orchestrates everything including scan modes, headless handling, and remediation options. +Load `quality-analysis.md` to begin. --- @@ -60,9 +46,17 @@ Load `quality-optimizer.md` — it orchestrates everything including scan modes, | Intent | Trigger Phrases | Route | |--------|----------------|-------| -| **Builder** | "build/create/design/convert/edit/fix an agent", "new agent" | Load `build-process.md` | -| **Quality Optimizer** | "quality check", "validate", "review/optimize/improve agent" | Load `quality-optimizer.md` | +| **Build new** | "build/create/design a new agent" | Load `build-process.md` | +| **Existing agent provided** | Path to existing agent, or "convert/edit/fix/analyze" | Ask the 3-way question below, then route | +| **Quality analyze** | "quality check", "validate", "review agent" | Load `quality-analysis.md` | +| **Unclear** | — | Present options and ask | + +### When given an existing agent, ask: + +- **Analyze** — Run quality analysis: identify opportunities, prune over-specification, get an actionable report with agent portrait and capability dashboard +- **Edit** — Modify specific behavior while keeping the current approach +- **Rebuild** — Rethink from core outcomes and persona, using this as reference material, full discovery process -Regardless of what path is taken, respect and follow headless mode guidance if user requested headless_mode - if a specific instruction does not indicate how to handle headless mode, you will try to find a way. +Analyze routes to `quality-analysis.md`. Edit and Rebuild both route to `build-process.md` with the chosen intent. -Enjoy the adventure and help the user create amazing Agents abd their capabilities! 
+Regardless of path, respect headless mode if requested. diff --git a/skills/bmad-agent-builder/assets/quality-report-template.md b/skills/bmad-agent-builder/assets/quality-report-template.md deleted file mode 100644 index a315eb4..0000000 --- a/skills/bmad-agent-builder/assets/quality-report-template.md +++ /dev/null @@ -1,281 +0,0 @@ -# Quality Report: {agent-name} - -**Scanned:** {timestamp} -**Skill Path:** {skill-path} -**Report:** {report-file-path} -**Performed By** QualityReportBot-9001 and {user_name} - -## Executive Summary - -- **Total Issues:** {total-issues} -- **Critical:** {critical} | **High:** {high} | **Medium:** {medium} | **Low:** {low} -- **Overall Quality:** {Excellent|Good|Fair|Poor} -- **Overall Cohesion:** {cohesion-score} -- **Craft Assessment:** {craft-assessment} - - -{executive-narrative} - -### Issues by Category - -| Category | Critical | High | Medium | Low | -|----------|----------|------|--------|-----| -| Structure & Capabilities | {n} | {n} | {n} | {n} | -| Prompt Craft | {n} | {n} | {n} | {n} | -| Execution Efficiency | {n} | {n} | {n} | {n} | -| Path & Script Standards | {n} | {n} | {n} | {n} | -| Agent Cohesion | {n} | {n} | {n} | {n} | -| Creative | — | — | {n} | {n} | - ---- - -## Agent Identity - - - -- **Persona:** {persona-summary} -- **Primary Purpose:** {primary-purpose} -- **Capabilities:** {capability-count} - ---- - -## Strengths - -*What this agent does well — preserve these during optimization:* - - - -{strengths-list} - ---- - -{if-truly-broken} -## Truly Broken or Missing - -*Issues that prevent the agent from working correctly:* - - - -{truly-broken-findings} - ---- -{/if-truly-broken} - -## Detailed Findings by Category - -### 1. 
Structure & Capabilities - - - -{if-structure-metadata} -**Agent Metadata:** -- Sections found: {sections-list} -- Capabilities: {capabilities-count} -- Memory sidecar: {has-memory} -- Headless mode: {has-headless} -- Structure assessment: {structure-assessment} -{/if-structure-metadata} - - - -{structure-findings} - -### 2. Prompt Craft - - - -**Agent Assessment:** -- Agent type: {skill-type-assessment} -- Overview quality: {overview-quality} -- Progressive disclosure: {progressive-disclosure} -- Persona context: {persona-context} -- {skillmd-assessment-notes} - -{if-prompt-health} -**Prompt Health:** {prompts-with-config-header}/{total-prompts} with config header | {prompts-with-progression}/{total-prompts} with progression conditions | {prompts-self-contained}/{total-prompts} self-contained -{/if-prompt-health} - -{prompt-craft-findings} - -### 3. Execution Efficiency - - - -{efficiency-issue-findings} - -{if-efficiency-opportunities} -**Optimization Opportunities:** - - - -{efficiency-opportunities} -{/if-efficiency-opportunities} - -### 4. Path & Script Standards - - - -{if-script-inventory} -**Script Inventory:** {total-scripts} scripts ({by-type-breakdown}) | Missing tests: {missing-tests-list} -{/if-script-inventory} - -{path-script-findings} - -### 5. Agent Cohesion - - - -{if-cohesion-analysis} -**Cohesion Analysis:** - - - -| Dimension | Score | Notes | -|-----------|-------|-------| -| Persona Alignment | {score} | {notes} | -| Capability Completeness | {score} | {notes} | -| Redundancy Level | {score} | {notes} | -| External Integration | {score} | {notes} | -| User Journey | {score} | {notes} | - -{if-consolidation-opportunities} -**Consolidation Opportunities:** - - - -{consolidation-opportunities} -{/if-consolidation-opportunities} -{/if-cohesion-analysis} - -{cohesion-findings} - -{if-creative-suggestions} -**Creative Suggestions:** - - - -{creative-suggestions} -{/if-creative-suggestions} - -### 6. 
Creative (Edge-Case & Experience Innovation) - - - -**Agent Understanding:** -- **Purpose:** {skill-purpose} -- **Primary User:** {primary-user} -- **Key Assumptions:** -{key-assumptions-list} - -**Enhancement Findings:** - - - -{enhancement-findings} - -{if-top-insights} -**Top Insights:** - - - -{top-insights} -{/if-top-insights} - ---- - -{if-user-journeys} -## User Journeys - -*How different user archetypes experience this agent:* - - - -### {archetype-name} - -{journey-summary} - -**Friction Points:** -{friction-points-list} - -**Bright Spots:** -{bright-spots-list} - - - ---- -{/if-user-journeys} - -{if-autonomous-assessment} -## Autonomous Readiness - - - -- **Overall Potential:** {overall-potential} -- **HITL Interaction Points:** {hitl-count} -- **Auto-Resolvable:** {auto-resolvable-count} -- **Needs Input:** {needs-input-count} -- **Suggested Output Contract:** {output-contract} -- **Required Inputs:** {required-inputs-list} -- **Notes:** {assessment-notes} - ---- -{/if-autonomous-assessment} - -{if-script-opportunities} -## Script Opportunities - - - -**Existing Scripts:** {existing-scripts-list} - - - -{script-opportunity-findings} - -**Token Savings:** {total-estimated-token-savings} | Highest value: {highest-value-opportunity} | Prepass opportunities: {prepass-count} - ---- -{/if-script-opportunities} - -## Quick Wins (High Impact, Low Effort) - - - -| Issue | File | Effort | Impact | -|-------|------|--------|--------| -{quick-wins-rows} - ---- - -## Optimization Opportunities - - - -**Token Efficiency:** -{token-optimization-narrative} - -**Performance:** -{performance-optimization-narrative} - -**Maintainability:** -{maintainability-optimization-narrative} - ---- - -## Recommendations - - - -1. {recommendation-1} -2. {recommendation-2} -3. {recommendation-3} -4. {recommendation-4} -5. 
{recommendation-5} diff --git a/skills/bmad-agent-builder/build-process.md b/skills/bmad-agent-builder/build-process.md index 4264a15..4b1ff25 100644 --- a/skills/bmad-agent-builder/build-process.md +++ b/skills/bmad-agent-builder/build-process.md @@ -7,120 +7,140 @@ description: Six-phase conversational discovery process for building BMad agents # Build Process -Build AI agents through six phases of conversational discovery. Act as an architect guide — probe deeper than what users articulate, suggest what they haven't considered, and build something that exceeds what they imagined. +Build AI agents through conversational discovery. Your north star: **outcome-driven design**. Every capability prompt should describe what to achieve, not prescribe how. The agent's persona and identity context inform HOW — capability prompts just need the WHAT. Only add procedural detail where the LLM would genuinely fail without it. ## Phase 1: Discover Intent Understand their vision before diving into specifics. Ask what they want to build and encourage detail. -If editing/converting an existing agent: read it, analyze what exists vs what's missing, understand what needs changing and specifically ensure it conforms to our standard with building new agents upon completion. +### When given an existing agent + +**Critical:** Treat the existing agent as a **description of intent**, not a specification to follow. Extract *who* this agent is and *what* it achieves. Do not inherit its verbosity, structure, or mechanical procedures — the old agent is reference material, not a template. + +If the SKILL.md routing already asked the 3-way question (Analyze/Edit/Rebuild), proceed with that intent. 
Otherwise ask now: +- **Edit** — changing specific behavior while keeping the current approach +- **Rebuild** — rethinking from core outcomes and persona, full discovery using the old agent as context + +For **Edit**: identify what to change, preserve what works, apply outcome-driven principles to the changed portions. + +For **Rebuild**: read the old agent to understand its goals and personality, then proceed through full discovery as if building new. + +### Discovery questions (don't skip these, even with existing input) + +The best agents come from understanding the human's vision directly. Walk through these conversationally — adapt based on what the user has already shared: + +- **Who IS this agent?** What personality should come through? What's their voice? +- **How should they make the user feel?** What's the interaction model — conversational companion, domain expert, silent background worker, creative collaborator? +- **What's the core outcome?** What does this agent help the user accomplish? What does success look like? +- **What capabilities serve that core outcome?** Not "what features sound cool" — what does the user actually need? +- **What's the one thing this agent must get right?** The non-negotiable. +- **If memory/sidecar:** What's worth remembering across sessions? What should the agent track over time? + +The goal is to conversationally gather enough to cover Phase 2 and 3 naturally. Since users often brain-dump rich detail, adapt subsequent phases to what you already know. ## Phase 2: Capabilities Strategy Early check: internal capabilities only, external skills, both, or unclear? -**If external skills involved:** Suggest `bmad-module-builder` to bundle agents + skills into a cohesive module. Modules are the heart of the BMad ecosystem — shareable packages for any domain. +**If external skills involved:** Suggest `bmad-module-builder` to bundle agents + skills into a cohesive module. 
**Script Opportunity Discovery** (active probing — do not skip): -Identify deterministic operations that should be scripts. Load `./references/script-opportunities-reference.md` for the full catalog. Confirm the script-vs-prompt plan with the user before proceeding. - -If scripts are planned, the `./scripts/` folder will be created. Scripts are invoked from prompts when needed, not run automatically. +Identify deterministic operations that should be scripts. Load `./references/script-opportunities-reference.md` for guidance. Confirm the script-vs-prompt plan with the user before proceeding. ## Phase 3: Gather Requirements -Gather requirements through conversation: identity, capabilities (internal prompts + external skills), activation modes, memory needs, and access boundaries. Refer to `./references/standard-fields.md` for conventions. +Gather through conversation: identity, capabilities, activation modes, memory needs, access boundaries. Refer to `./references/standard-fields.md` for conventions. Key structural context: -- **Naming:** Standalone agents use `bmad-agent-{name}`, module agents use `bmad-{modulecode}-agent-{name}` -- **Activation modes:** Interactive only, or Interactive + Headless (also runs on schedule/cron for background tasks) -- **Memory architecture:** See `./references/memory-system.md` template. Sidecar at `{project-root}/_bmad/memory/{skillName}-sidecar/` -- **Access boundaries:** Read/write/deny zones stored in memory as the standard `access-boundaries` section +- **Naming:** Standalone: `bmad-agent-{name}`. 
Module: `bmad-{modulecode}-agent-{name}` +- **Activation modes:** Interactive only, or Interactive + Headless (schedule/cron for background tasks) +- **Memory architecture:** Sidecar at `{project-root}/_bmad/memory/{skillName}-sidecar/` +- **Access boundaries:** Read/write/deny zones stored in memory -**If headless mode is enabled, also gather:** +**If headless mode enabled, also gather:** - Default wake behavior (`--headless` | `-H` with no specific task) - Named tasks (`--headless:{task-name}` or `-H:{task-name}`) -- **Path Conventions** (CRITICAL for reliable agent behavior): - - **Memory location:** `{project-root}/_bmad/memory/{skillName}-sidecar/` - - **Project artifacts:** `{project-root}/_bmad/...` when referencing project-level files - - **Skill-internal files:** Always use `./` prefix (`./references/`, `./scripts/`) — this distinguishes them from `{project-root}` paths - - **Config variables:** Use directly — they already contain full paths (NO `{project-root}` prefix) - - Correct: `{output_folder}/file.md` - - Wrong: `{project-root}/{output_folder}/file.md` (double-prefix breaks resolution) - - **No absolute paths** (`/Users/...`) +**Path conventions (CRITICAL):** +- Memory: `{project-root}/_bmad/memory/{skillName}-sidecar/` +- Project artifacts: `{project-root}/_bmad/...` +- Skill-internal: `./references/`, `./scripts/` +- Config variables used directly — they already contain full paths (no `{project-root}` prefix) ## Phase 4: Draft & Refine -Once you have a cohesive idea, think one level deeper. Once you have done this, present a draft outline. Point out vague areas. Ask what else is needed. Iterate until they say they're ready. +Think one level deeper. Present a draft outline. Point out vague areas. Iterate until ready. 
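The path conventions above are deterministic rules, which makes them natural script material rather than prompt material. A simplified sketch of the kind of check a lint pass like `scan-path-standards.py` might perform (the variable names and rule set here are illustrative, not the actual script's logic):

```python
import re

# Illustrative config variables; real values come from the config-loading step.
CONFIG_VARS = {"output_folder", "bmad_builder_output_folder", "bmad_builder_reports"}

def check_path(path):
    """Return a list of convention violations for one path reference."""
    problems = []
    # Absolute paths like /Users/... are never allowed
    if path.startswith("/"):
        problems.append("absolute path")
    # Config variables already contain full paths, so {project-root}/{var} double-prefixes
    for var in CONFIG_VARS:
        if "{project-root}/{" + var + "}" in path:
            problems.append("double-prefix on config variable")
    return problems
```

So `{output_folder}/file.md` passes, while `{project-root}/{output_folder}/file.md` is flagged.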
-## Phase 5: Build +**Pruning check (apply before building):** -**Always load these before building:** -- Load `./references/standard-fields.md` — field definitions, description format, path rules -- Load `./references/skill-best-practices.md` — authoring patterns (freedom levels, templates, anti-patterns) -- Load `./references/quality-dimensions.md` — quick mental checklist for build quality +For every planned instruction — especially in capability prompts — ask: **would the LLM do this correctly given just the agent's persona and the desired outcome?** If yes, cut it. -**Load based on context:** -- **Always load** `./references/script-opportunities-reference.md` — script opportunity spotting guide, catalog, and output standards. Use this to identify additional script opportunities not caught in Phase 2, even if no scripts were initially planned. +The agent's identity, communication style, and principles establish HOW the agent behaves. Capability prompts should describe WHAT to achieve. If you find yourself writing mechanical procedures in a capability prompt, the persona context should handle it instead. -Build the agent skill structure using templates from `./assets/` and rules from `./references/template-substitution-rules.md`. Output to `{bmad_builder_output_folder}`. +Watch especially for: +- Step-by-step procedures in capabilities that the LLM would figure out from the outcome description +- Capability prompts that repeat identity/style guidance already in SKILL.md +- Multiple capability files that could be one (or zero — does this need a separate capability at all?) +- Templates or reference files that explain things the LLM already knows -**Lint gate** — after building, run validation and auto-fix failures: +## Phase 5: Build -If subagents are available, delegate the lint-fix loop to a subagent. Otherwise run inline. 
+**Load these before building:** +- `./references/standard-fields.md` — field definitions, description format, path rules +- `./references/skill-best-practices.md` — outcome-driven authoring, patterns, anti-patterns +- `./references/quality-dimensions.md` — build quality checklist -1. Run both lint scripts in parallel: - ```bash - python3 ./scripts/scan-path-standards.py {skill-path} - python3 ./scripts/scan-scripts.py {skill-path} - ``` -2. If any findings at high or critical severity: fix them and re-run the failing script -3. Repeat up to 3 attempts per script — if still failing after 3, report remaining findings and continue -4. If scripts exist in the built skill, also run unit tests (`./scripts/run-tests.sh` or equivalent) +Build the agent using templates from `./assets/` and rules from `./references/template-substitution-rules.md`. Output to `{bmad_builder_output_folder}`. + +**Capability prompts are outcome-driven:** Each `./references/{capability}.md` file should describe what the capability achieves and what "good" looks like — not prescribe mechanical steps. The agent's persona context (identity, communication style, principles in SKILL.md) informs how each capability is executed. Don't repeat that context in every capability prompt. 
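One way to support the pruning mindset during review is a rough heuristic that flags capability prompts drifting into step-by-step procedure. The cues and threshold below are invented for illustration and are not part of the skill's actual scanners:

```python
import re

# Heuristic cues of mechanical step-by-step writing; list is deliberately small.
PROCEDURAL_CUES = [
    r"(?m)^\s*(?:Step\s+\d+|\d+\.)\s",  # numbered steps at line start
    r"\bfirst,?\s+.*\bthen\b",          # "first ... then" chains
    r"\bexactly\s+as\s+follows\b",
]

def looks_procedural(prompt_text, threshold=3):
    """Count procedural cues; flag the prompt for pruning review
    when the count crosses the (arbitrary) threshold."""
    hits = sum(len(re.findall(p, prompt_text, re.IGNORECASE)) for p in PROCEDURAL_CUES)
    return hits >= threshold, hits
```

A flagged prompt is not automatically wrong; it is a candidate for the "would the LLM do this correctly from persona plus outcome?" question.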
-**Folder structure:** +**Agent structure** (only create subfolders that are needed): ``` {skill-name}/ -├── SKILL.md # Frontmatter (name + description only), persona, activation, capability routing -├── references/ # ALL progressive disclosure content lives here +├── SKILL.md # Persona, activation, capability routing +├── references/ # Progressive disclosure content │ ├── {capability}.md # Each internal capability prompt -│ ├── memory-system.md # Memory discipline and structure (if sidecar) +│ ├── memory-system.md # Memory discipline (if sidecar) │ ├── init.md # First-run onboarding (if sidecar) -│ ├── autonomous-wake.md # Headless activation (if headless mode) +│ ├── autonomous-wake.md # Headless activation (if headless) │ └── save-memory.md # Explicit memory save (if sidecar) -├── assets/ # Templates, starter files (copied/transformed into output) -└── scripts/ # Deterministic code — validation, transformation, testing - └── run-tests.sh # uvx-powered test runner (if python tests exist) +├── assets/ # Templates, starter files +└── scripts/ # Deterministic code with tests ``` -**What goes where:** | Location | Contains | LLM relationship | |----------|----------|-----------------| -| **SKILL.md** | Persona, activation, capability routing table | LLM **identity and router** — the only root `.md` file | -| **`./references/`** | Capability prompts, reference data, schemas, guides | LLM **loads on demand** — progressive disclosure via routing table | -| **`./assets/`** | Templates, starter files, boilerplate | LLM **copies/transforms** these into output — not for reasoning | -| **`./scripts/`** | Python, shell scripts with tests | LLM **invokes** these — deterministic operations that don't need judgment | - -Only create subfolders that are needed — most agents won't need all three. 
+| **SKILL.md** | Persona, activation, routing | LLM identity and router | +| **`./references/`** | Capability prompts, reference data | Loaded on demand | +| **`./assets/`** | Templates, starter files | Copied/transformed into output | +| **`./scripts/`** | Python, shell scripts with tests | Invoked for deterministic operations | **Activation guidance for built agents:** -Activation is a single flow regardless of mode (interactive or headless). It should: -- Load config and resolve values (with defaults — see SKILL.md step 2) -- Load sidecar `index.md` if the agent has memory — this is the single entry point that tells the agent what else to load -- If headless, route to `./references/autonomous-wake.md` and complete without interaction -- If interactive, greet the user and either continue naturally from memory context or offer to show available capabilities +Activation is a single flow regardless of mode. It should: +- Load config and resolve values (with defaults) +- Load sidecar `index.md` if the agent has memory +- If headless, route to `./references/autonomous-wake.md` +- If interactive, greet the user and continue from memory context or offer capabilities -## Phase 6: Summary +**Lint gate** — after building, validate and auto-fix: -Present what was built: location, structure, first-run behavior, capabilities. Ask if adjustments needed. +If subagents available, delegate lint-fix to a subagent. Otherwise run inline. -**After the build completes, offer quality optimization:** +1. Run both lint scripts in parallel: + ```bash + python3 ./scripts/scan-path-standards.py {skill-path} + python3 ./scripts/scan-scripts.py {skill-path} + ``` +2. Fix high/critical findings and re-run (up to 3 attempts per script) +3. Run unit tests if scripts exist in the built skill + +## Phase 6: Summary -Ask: *"Build is done. Would you like to run a Quality Scan to optimize the agent further?"* +Present what was built: location, structure, first-run behavior, capabilities. 
-If yes, load `quality-optimizer.md` with `{scan_mode}=full` and the agent path. +Run unit tests if scripts exist. Remind user to commit before quality analysis. -Remind them: BMad module system compliant. Use the module-init skill to install and configure into a project. +**Offer quality analysis:** Ask if they'd like a Quality Analysis to identify opportunities. If yes, load `quality-analysis.md` with the agent path. diff --git a/skills/bmad-agent-builder/quality-analysis.md b/skills/bmad-agent-builder/quality-analysis.md new file mode 100644 index 0000000..bbf1dec --- /dev/null +++ b/skills/bmad-agent-builder/quality-analysis.md @@ -0,0 +1,126 @@ +--- +name: quality-analysis +description: Comprehensive quality analysis for BMad agents. Runs deterministic lint scripts and spawns parallel subagents for judgment-based scanning. Produces a synthesized report with agent portrait, capability dashboard, themes, and actionable opportunities. +menu-code: QA +--- + +**Language:** Use `{communication_language}` for all output. + +# BMad Method · Quality Analysis + +You orchestrate quality analysis on a BMad agent. Deterministic checks run as scripts (fast, zero tokens). Judgment-based analysis runs as LLM subagents. A report creator synthesizes everything into a unified, theme-based report with agent portrait and capability dashboard. + +## Your Role + +**DO NOT read the target agent's files yourself.** Scripts and subagents do all analysis. You orchestrate: run scripts, spawn scanners, hand off to the report creator. + +## Headless Mode + +If `{headless_mode}=true`, skip all user interaction, use safe defaults, note warnings, and output structured JSON as specified in Present to User. + +## Pre-Scan Checks + +Check for uncommitted changes. In headless mode, note warnings and proceed. In interactive mode, inform the user and confirm. Also confirm the agent is currently functioning. 
+ +## Analysis Principles + +**Effectiveness over efficiency.** Agent personality is investment, not waste. The report presents opportunities — the user applies judgment. Never suggest flattening an agent's voice unless explicitly asked. + +## Scanners + +### Lint Scripts (Deterministic — Run First) + +| # | Script | Focus | Output File | +|---|--------|-------|-------------| +| S1 | `scripts/scan-path-standards.py` | Path conventions | `path-standards-temp.json` | +| S2 | `scripts/scan-scripts.py` | Script portability, PEP 723, unit tests | `scripts-temp.json` | + +### Pre-Pass Scripts (Feed LLM Scanners) + +| # | Script | Feeds | Output File | +|---|--------|-------|-------------| +| P1 | `scripts/prepass-structure-capabilities.py` | structure scanner | `structure-capabilities-prepass.json` | +| P2 | `scripts/prepass-prompt-metrics.py` | prompt-craft scanner | `prompt-metrics-prepass.json` | +| P3 | `scripts/prepass-execution-deps.py` | execution-efficiency scanner | `execution-deps-prepass.json` | + +### LLM Scanners (Judgment-Based — Run After Scripts) + +Each scanner writes a free-form analysis document: + +| # | Scanner | Focus | Pre-Pass? 
| Output File | +|---|---------|-------|-----------|-------------| +| L1 | `quality-scan-structure.md` | Structure, capabilities, identity, memory, consistency | Yes | `structure-analysis.md` | +| L2 | `quality-scan-prompt-craft.md` | Token efficiency, outcome balance, persona voice, per-capability craft | Yes | `prompt-craft-analysis.md` | +| L3 | `quality-scan-execution-efficiency.md` | Parallelization, delegation, memory loading, context optimization | Yes | `execution-efficiency-analysis.md` | +| L4 | `quality-scan-agent-cohesion.md` | Persona-capability alignment, identity coherence, per-capability cohesion | No | `agent-cohesion-analysis.md` | +| L5 | `quality-scan-enhancement-opportunities.md` | Edge cases, experience gaps, user journeys, headless potential | No | `enhancement-opportunities-analysis.md` | +| L6 | `quality-scan-script-opportunities.md` | Deterministic operations that should be scripts | No | `script-opportunities-analysis.md` | + +## Execution + +First create output directory: `{bmad_builder_reports}/{skill-name}/quality-analysis/{date-time-stamp}/` + +### Step 1: Run All Scripts (Parallel) + +```bash +python3 scripts/scan-path-standards.py {skill-path} -o {report-dir}/path-standards-temp.json +python3 scripts/scan-scripts.py {skill-path} -o {report-dir}/scripts-temp.json +python3 scripts/prepass-structure-capabilities.py {skill-path} -o {report-dir}/structure-capabilities-prepass.json +python3 scripts/prepass-prompt-metrics.py {skill-path} -o {report-dir}/prompt-metrics-prepass.json +uv run scripts/prepass-execution-deps.py {skill-path} -o {report-dir}/execution-deps-prepass.json +``` + +### Step 2: Spawn LLM Scanners (Parallel) + +After scripts complete, spawn all scanners as parallel subagents. + +**With pre-pass (L1, L2, L3):** provide pre-pass JSON path. +**Without pre-pass (L4, L5, L6):** provide skill path and output directory. 
+ +Each subagent loads the scanner file, analyzes the agent, writes analysis to the output directory, returns the filename. + +### Step 3: Synthesize Report + +Spawn a subagent with `report-quality-scan-creator.md`. + +Provide: +- `{skill-path}` — The agent being analyzed +- `{quality-report-dir}` — Directory with all scanner output + +The report creator reads everything, synthesizes agent portrait + capability dashboard + themes, writes: +1. `quality-report.md` — Narrative markdown with BMad Method branding +2. `report-data.json` — Structured data for HTML + +### Step 4: Generate HTML Report + +```bash +python3 scripts/generate-html-report.py {report-dir} --open +``` + +## Present to User + +**IF `{headless_mode}=true`:** + +Read `report-data.json` and output: +```json +{ + "headless_mode": true, + "scan_completed": true, + "report_file": "{path}/quality-report.md", + "html_report": "{path}/quality-report.html", + "data_file": "{path}/report-data.json", + "grade": "Excellent|Good|Fair|Poor", + "opportunities": 0, + "broken": 0 +} +``` + +**IF interactive:** + +Read `report-data.json` and present: +1. Agent portrait — icon, name, title +2. Grade and narrative +3. Capability dashboard summary +4. Top opportunities +5. Reports — paths and "HTML opened in browser" +6. Offer: apply fixes, use HTML to select items, discuss findings diff --git a/skills/bmad-agent-builder/quality-optimizer.md b/skills/bmad-agent-builder/quality-optimizer.md deleted file mode 100644 index 99d3da4..0000000 --- a/skills/bmad-agent-builder/quality-optimizer.md +++ /dev/null @@ -1,174 +0,0 @@ ---- -name: quality-optimizer -description: Comprehensive quality validation for BMad agents. Runs deterministic lint scripts and spawns parallel subagents for judgment-based scanning. Returns consolidated findings as structured JSON. -menu-code: QO ---- - -**Language:** Use `{communication_language}` for all output. - -# Quality Optimizer - -You orchestrate quality scans on a BMad agent. 
Deterministic checks run as scripts (fast, zero tokens). Judgment-based analysis runs as LLM subagents. You synthesize all results into a unified report. - -## Your Role - -You orchestrate quality scans: run deterministic scripts and pre-pass extractors, spawn LLM scanner subagents in parallel, then synthesize all results into a unified report. - -**DO NOT read the target agent's files yourself.** Scripts and subagents do all analysis. - -## Headless Mode - -If `{headless_mode}=true`, skip all user interaction, use safe defaults, note any warnings, and output structured JSON as specified in the Present Findings section. - -## Pre-Scan Checks - -Check for uncommitted changes. In headless mode, note warnings and proceed. In interactive mode, inform the user and confirm. In interactive mode, also confirm the agent is currently functioning. - -## Optimization Principles - -**Agent skills are both art and science.** The report will contain many suggestions. Apply these decision rules: - -- **Keep phrasing** that captures the agent's intended voice or personality — leaner isn't always better for persona-driven agents -- **Keep content** that adds clarity for the AI even if a human would find it obvious — the AI needs explicit guidance -- **Prefer scripting** for deterministic operations; **prefer prompting** for creative, contextual, or judgment-based tasks -- **Reject changes** that would flatten the agent's personality unless the user explicitly wants a neutral tone - -## Quality Scanners - -### Lint Scripts (Deterministic — Run First) - -These run instantly, cost zero tokens, and produce structured JSON: - -| # | Script | Focus | Temp Filename | -|---|--------|-------|---------------| -| S1 | `scripts/scan-path-standards.py` | Path conventions: {project-root} only for _bmad, bare _bmad, memory paths, double-prefix, absolute paths | `path-standards-temp.json` | -| S2 | `scripts/scan-scripts.py` | Script portability, PEP 723, agentic design, unit tests | 
`scripts-temp.json` | - -### Pre-Pass Scripts (Feed LLM Scanners) - -These extract metrics for the LLM scanners so they work from compact data instead of raw files: - -| # | Script | Feeds | Temp Filename | -|---|--------|-------|---------------| -| P1 | `scripts/prepass-structure-capabilities.py` | structure LLM scanner | `structure-capabilities-prepass.json` | -| P2 | `scripts/prepass-prompt-metrics.py` | prompt-craft LLM scanner | `prompt-metrics-prepass.json` | -| P3 | `scripts/prepass-execution-deps.py` | execution-efficiency LLM scanner | `execution-deps-prepass.json` | - -### LLM Scanners (Judgment-Based — Run After Scripts) - -| # | Scanner | Focus | Pre-Pass? | Temp Filename | -|---|---------|-------|-----------|---------------| -| L1 | `quality-scan-structure.md` | Structure, capabilities, identity, memory setup, consistency | Yes — receives prepass JSON | `structure-temp.json` | -| L2 | `quality-scan-prompt-craft.md` | Token efficiency, anti-patterns, outcome balance, persona voice, Overview quality | Yes — receives metrics JSON | `prompt-craft-temp.json` | -| L3 | `quality-scan-execution-efficiency.md` | Parallelization, subagent delegation, memory loading, context optimization | Yes — receives dep graph JSON | `execution-efficiency-temp.json` | -| L4 | `quality-scan-agent-cohesion.md` | Persona-capability alignment, gaps, redundancies, coherence | No | `agent-cohesion-temp.json` | -| L5 | `quality-scan-enhancement-opportunities.md` | Script automation, autonomous potential, edge cases, experience gaps, delight | No | `enhancement-opportunities-temp.json` | -| L6 | `quality-scan-script-opportunities.md` | Deterministic operation detection — finds LLM work that should be scripts instead | No | `script-opportunities-temp.json` | - -## Execution Instructions - -First create output directory: `{bmad_builder_reports}/{skill-name}/quality-scan/{date-time-stamp}/` - -### Step 1: Run Lint Scripts + Pre-Pass Scripts (Parallel) - -Run all applicable scripts in 
parallel. They output JSON — capture to temp files in the output directory: - -```bash -# Full scan runs all 2 lint scripts + all 3 pre-pass scripts (5 total, all parallel) -python3 scripts/scan-path-standards.py {skill-path} -o {quality-report-dir}/path-standards-temp.json -python3 scripts/scan-scripts.py {skill-path} -o {quality-report-dir}/scripts-temp.json -python3 scripts/prepass-structure-capabilities.py {skill-path} -o {quality-report-dir}/structure-capabilities-prepass.json -python3 scripts/prepass-prompt-metrics.py {skill-path} -o {quality-report-dir}/prompt-metrics-prepass.json -uv run scripts/prepass-execution-deps.py {skill-path} -o {quality-report-dir}/execution-deps-prepass.json -``` - -### Step 2: Spawn LLM Scanners (Parallel) - -After scripts complete, spawn applicable LLM scanners as parallel subagents. - -**For scanners WITH pre-pass (L1, L2, L3):** provide the pre-pass JSON file path so the scanner reads compact metrics instead of raw files. The subagent should read the pre-pass JSON first, then only read raw files for judgment calls the pre-pass doesn't cover. - -**For scanners WITHOUT pre-pass (L4, L5, L6):** provide just the skill path and output directory. - -Each subagent receives: -- Scanner file to load (e.g., `quality-scan-agent-cohesion.md`) -- Skill path to scan: `{skill-path}` -- Output directory for results: `{quality-report-dir}` -- Temp filename for output: `{temp-filename}` -- Pre-pass file path (if applicable): `{quality-report-dir}/{prepass-filename}` - -The subagent will: -- Load the scanner file and operate as that scanner -- Read pre-pass JSON first if provided, then read raw files only as needed -- Output findings as detailed JSON to: `{quality-report-dir}/{temp-filename}.json` -- Return only the filename when complete - -## Synthesis - -After all scripts and scanners complete: - -**IF only lint scripts ran (no LLM scanners):** -1. Read the script output JSON files -2. 
Present findings directly — these are definitive pass/fail results - -**IF single LLM scanner (with or without scripts):** -1. Read all temp JSON files (script + scanner) -2. Present findings directly in simplified format -3. Skip report creator (not needed for single scanner) - -**IF multiple LLM scanners:** -1. Initiate a subagent with `report-quality-scan-creator.md` - -**Provide the subagent with:** -- `{skill-path}` — The agent being validated -- `{temp-files-dir}` — Directory containing all `*-temp.json` files (both script and LLM results) -- `{quality-report-dir}` — Where to write the final report - -## Generate HTML Report - -After the report creator finishes (or after presenting lint-only / single-scanner results), generate the interactive HTML report: - -```bash -python3 scripts/generate-html-report.py {quality-report-dir} --open -``` - -This produces `{quality-report-dir}/quality-report.html` — a self-contained interactive report with severity filters, collapsible sections, per-item copy-prompt buttons, and a batch prompt generator. The `--open` flag opens it in the default browser. - -## Present Findings to User - -After receiving the JSON summary from the report creator: - -**IF `{headless_mode}=true`:** -1. **Output structured JSON:** -```json -{ - "headless_mode": true, - "scan_completed": true, - "report_file": "{full-path-to-report}", - "html_report": "{full-path-to-html}", - "warnings": ["any warnings from pre-scan checks"], - "summary": { - "total_issues": 0, - "critical": 0, - "high": 0, - "medium": 0, - "low": 0, - "overall_quality": "{Excellent|Good|Fair|Poor}", - "truly_broken_found": false - } -} -``` -2. **Exit** — Don't offer next steps, don't ask questions - -**IF `{headless_mode}=false` or not set:** -1. **High-level summary** with total issues by severity -2. **Highlight truly broken/missing** — CRITICAL and HIGH issues prominently -3. 
**Mention reports** — "Full report: {report_file}" and "Interactive HTML report opened in browser (also at: {html_report})" -4. **Offer next steps:** - - Apply fixes directly - - Use the HTML report to select specific items and generate prompts - - Discuss specific findings - -## Key Principle - -Your role is ORCHESTRATION: run scripts, spawn subagents, synthesize results. Scripts handle deterministic checks (paths, schema, script standards). LLM scanners handle judgment calls (cohesion, craft, efficiency). You coordinate both and present unified findings. diff --git a/skills/bmad-agent-builder/quality-scan-agent-cohesion.md b/skills/bmad-agent-builder/quality-scan-agent-cohesion.md index d80d77f..6d2aafe 100644 --- a/skills/bmad-agent-builder/quality-scan-agent-cohesion.md +++ b/skills/bmad-agent-builder/quality-scan-agent-cohesion.md @@ -113,102 +113,19 @@ Find and read: | Entry points are clear | User knows where to start | | Exit points provide value | User gets something useful, not just internal state | -## Output Format - -Output your findings using the universal schema defined in `references/universal-scan-schema.md`. - -Use EXACTLY these field names: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`. Do not rename, restructure, or add fields to findings. - -Before writing output, verify: Is your array called `findings`? Does every item have `title`, `detail`, `action`? Is `assessments` an object, not items in the findings array? - -You will receive `{skill-path}` and `{quality-report-dir}` as inputs. 
- -Write JSON findings to: `{quality-report-dir}/agent-cohesion-temp.json` - -```json -{ - "scanner": "agent-cohesion", - "agent_path": "{path}", - "findings": [ - { - "file": "SKILL.md|{name}.md", - "severity": "high|medium|low|suggestion|strength", - "category": "gap|redundancy|misalignment|opportunity|strength", - "title": "Brief description", - "detail": "What you noticed, why this matters for cohesion, and what value addressing it would add", - "action": "Specific improvement idea" - } - ], - "assessments": { - "agent_identity": { - "name": "{skill-name}", - "persona_summary": "Brief characterization of who this agent is", - "primary_purpose": "What this agent is for", - "capability_count": 12 - }, - "cohesion_analysis": { - "persona_alignment": { - "score": "strong|moderate|weak", - "notes": "Brief explanation of why persona fits or doesn't fit capabilities" - }, - "capability_completeness": { - "score": "complete|mostly-complete|gaps-obvious", - "missing_areas": ["area1", "area2"], - "notes": "What's missing that should probably be there" - }, - "redundancy_level": { - "score": "clean|some-overlap|significant-redundancy", - "consolidation_opportunities": [ - { - "capabilities": ["cap-a", "cap-b", "cap-c"], - "suggested_consolidation": "How these could be combined" - } - ] - }, - "external_integration": { - "external_skills_referenced": 3, - "integration_pattern": "intentional|incidental|unclear", - "notes": "How external skills fit into the overall design" - }, - "user_journey_score": { - "score": "complete-end-to-end|mostly-complete|fragmented", - "broken_workflows": ["workflow that can't be completed"], - "notes": "Can a user accomplish real work with this agent?" 
- } - } - }, - "summary": { - "total_findings": 0, - "by_severity": {"high": 0, "medium": 0, "low": 0, "suggestion": 0, "strength": 0}, - "by_category": {"gap": 0, "redundancy": 0, "misalignment": 0, "opportunity": 0, "strength": 0}, - "overall_cohesion": "cohesive|mostly-cohesive|fragmented|confused", - "single_most_important_fix": "The ONE thing that would most improve this agent" - } -} -``` - -Merge all findings into the single `findings[]` array: -- Former `findings[]` items: map `issue` to `title`, merge `observation`+`rationale`+`impact` into `detail`, map `suggestion` to `action` -- Former `strengths[]` items: use `severity: "strength"`, `category: "strength"` -- Former `creative_suggestions[]` items: use `severity: "suggestion"`, map `idea` to `title`, `rationale` to `detail`, merge `type` and `estimated_impact` context into `detail`, map actionable recommendation to `action` - -## Severity Guidelines - -| Severity | When to Use | -|----------|-------------| -| **high** | Glaring omission that would obviously confuse users OR capability that completely contradicts persona | -| **medium** | Clear gap in core workflow OR significant redundancy OR moderate misalignment | -| **low** | Minor enhancement opportunity OR edge case not covered | -| **suggestion** | Creative idea, nice-to-have, speculative improvement | - -## Process - -Read all agent files. Evaluate cohesion across all 6 dimensions above. Write findings to `{quality-report-dir}/agent-cohesion-temp.json`. Return only the filename. - -## Critical After Draft Output - -Before finalizing, verify completeness across all dimensions and that findings tell a coherent story. - -## Key Principle - -You are NOT checking for syntax errors or missing fields. You are evaluating whether this agent makes sense as a coherent tool. Think like a product designer reviewing a feature set: Is this useful? Is it complete? Does it fit together? 
Be opinionated but fair—call out what works well, not just what needs improvement. +## Output + +Write your analysis as a natural document. This is an opinionated, advisory assessment. Include: + +- **Assessment** — overall cohesion verdict in 2-3 sentences. Does this agent feel authentic and purposeful? +- **Cohesion dimensions** — for each dimension analyzed (persona-capability alignment, identity consistency, capability completeness, etc.), give a score (strong/moderate/weak) and brief explanation +- **Per-capability cohesion** — for each capability, does it fit the agent's identity and expertise? Would this agent naturally have this capability? Flag misalignments. +- **Key findings** — gaps, redundancies, misalignments. Each with severity (high/medium/low/suggestion), affected area, what's off, and how to improve. High = glaring persona contradiction or missing core capability. Medium = clear gap. Low = minor. Suggestion = creative idea. +- **Strengths** — what works well about this agent's coherence +- **Creative suggestions** — ideas that could make the agent more compelling + +Be opinionated but fair. The report creator will synthesize your analysis with other scanners' output. + +Write your analysis to: `{quality-report-dir}/agent-cohesion-analysis.md` + +Return only the filename when complete. 
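Since each scanner returns only a filename, the orchestrator can cheaply verify that every expected analysis file actually landed before spawning the report creator. A minimal sketch, assuming the six analysis filenames from the scanner table; the helper name is illustrative, not part of the skill:

```python
from pathlib import Path

# The analysis files the six scanners write (see each scanner's Output section).
EXPECTED_ANALYSES = [
    "structure-analysis.md",
    "prompt-craft-analysis.md",
    "execution-efficiency-analysis.md",
    "agent-cohesion-analysis.md",
    "enhancement-opportunities-analysis.md",
    "script-opportunities-analysis.md",
]

def missing_analyses(report_dir: str) -> list[str]:
    """Return the expected scanner outputs not yet present in report_dir."""
    d = Path(report_dir)
    return [name for name in EXPECTED_ANALYSES if not (d / name).is_file()]
```

If the returned list is non-empty, the orchestrator can re-spawn just the missing scanners instead of failing the whole run.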
diff --git a/skills/bmad-agent-builder/quality-scan-enhancement-opportunities.md b/skills/bmad-agent-builder/quality-scan-enhancement-opportunities.md index d7ba768..935b7be 100644 --- a/skills/bmad-agent-builder/quality-scan-enhancement-opportunities.md +++ b/skills/bmad-agent-builder/quality-scan-enhancement-opportunities.md @@ -29,7 +29,6 @@ Find and read: - `SKILL.md` — Understand the agent's purpose, persona, audience, and flow - `*.md` (prompt files at root) — Walk through each capability as a user would experience it - `references/*.md` — Understand what supporting material exists -- `references/*.json` — See what supporting schemas exist ## Creative Analysis Lenses @@ -157,84 +156,19 @@ For each journey, note: Explore creatively, then distill each idea into a concrete, actionable suggestion. Prioritize by user impact. Stay in your lane. -## Output Format - -Output your findings using the universal schema defined in `references/universal-scan-schema.md`. - -Use EXACTLY these field names: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`. Do not rename, restructure, or add fields to findings. - -Before writing output, verify: Is your array called `findings`? Does every item have `title`, `detail`, `action`? Is `assessments` an object, not items in the findings array? - -You will receive `{skill-path}` and `{quality-report-dir}` as inputs. 
- -Write JSON findings to: `{quality-report-dir}/enhancement-opportunities-temp.json` - -```json -{ - "scanner": "enhancement-opportunities", - "skill_path": "{path}", - "findings": [ - { - "file": "SKILL.md|{name}.md", - "severity": "high-opportunity|medium-opportunity|low-opportunity", - "category": "edge-case|experience-gap|delight-opportunity|assumption-risk|journey-friction|autonomous-potential|facilitative-pattern", - "title": "The specific situation or user story that reveals this opportunity", - "detail": "What you noticed, why it matters, and how this would change the user's experience", - "action": "Concrete, actionable improvement — the tempered version of the wild idea" - } - ], - "assessments": { - "skill_understanding": { - "purpose": "What this agent is trying to do", - "primary_user": "Who this agent is for", - "key_assumptions": ["assumption 1", "assumption 2"] - }, - "user_journeys": [ - { - "archetype": "first-timer|expert|confused|edge-case|hostile-environment|automator", - "summary": "Brief narrative of this user's experience with the agent", - "friction_points": ["moment 1", "moment 2"], - "bright_spots": ["what works well for this user"] - } - ], - "autonomous_assessment": { - "potential": "headless-ready|easily-adaptable|partially-adaptable|fundamentally-interactive", - "hitl_points": 0, - "auto_resolvable": 0, - "needs_input": 0, - "suggested_output_contract": "What a headless invocation would return", - "required_inputs": ["parameters needed upfront for headless mode"], - "notes": "Brief assessment of autonomous viability" - }, - "top_insights": [ - { - "title": "The single most impactful creative observation", - "detail": "The user experience impact", - "action": "What to do about it" - } - ] - }, - "summary": { - "total_findings": 0, - "by_severity": {"high-opportunity": 0, "medium-opportunity": 0, "low-opportunity": 0}, - "by_category": { - "edge_case": 0, - "experience_gap": 0, - "delight_opportunity": 0, - "assumption_risk": 0, - 
"journey_friction": 0, - "autonomous_potential": 0, - "facilitative_pattern": 0 - }, - "assessment": "Brief creative assessment of the agent's user experience, including the boldest practical idea" - } -} -``` - -## Process - -Read all agent files. Analyze through each creative lens above. Write findings to `{quality-report-dir}/enhancement-opportunities-temp.json`. Return only the filename. - -## Critical After Draft Output - -Before finalizing, verify findings are realistic, actionable, and honest about what the agent already does well. +## Output + +Write your analysis as a natural document. Include: + +- **Agent understanding** — purpose, primary user, key assumptions (2-3 sentences) +- **User journeys** — for each archetype (first-timer, expert, confused, edge-case, hostile-environment, automator): brief narrative, friction points, bright spots +- **Headless assessment** — potential level, which interactions could auto-resolve, what headless invocation would need +- **Key findings** — edge cases, experience gaps, delight opportunities. Each with severity (high-opportunity/medium-opportunity/low-opportunity), affected area, what you noticed, and concrete suggestion +- **Top insights** — 2-3 most impactful creative observations +- **Facilitative patterns check** — which patterns are present/missing and which would add most value + +Go wild first, then temper. Prioritize by user impact. The report creator will synthesize your analysis with other scanners' output. + +Write your analysis to: `{quality-report-dir}/enhancement-opportunities-analysis.md` + +Return only the filename when complete. 
diff --git a/skills/bmad-agent-builder/quality-scan-execution-efficiency.md b/skills/bmad-agent-builder/quality-scan-execution-efficiency.md index 1a7e152..7f3d266 100644 --- a/skills/bmad-agent-builder/quality-scan-execution-efficiency.md +++ b/skills/bmad-agent-builder/quality-scan-execution-efficiency.md @@ -118,49 +118,17 @@ GOOD: Selective loading --- -## Output Format - -Output your findings using the universal schema defined in `references/universal-scan-schema.md`. - -Use EXACTLY these field names: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`. Do not rename, restructure, or add fields to findings. - -Before writing output, verify: Is your array called `findings`? Does every item have `title`, `detail`, `action`? Is `assessments` an object, not items in the findings array? - -You will receive `{skill-path}` and `{quality-report-dir}` as inputs. - -Write JSON findings to: `{quality-report-dir}/execution-efficiency-temp.json` - -```json -{ - "scanner": "execution-efficiency", - "skill_path": "{path}", - "findings": [ - { - "file": "SKILL.md|{name}.md", - "line": 42, - "severity": "critical|high|medium|low|medium-opportunity", - "category": "sequential-independent|parent-reads-first|missing-batch|no-output-spec|subagent-chain-violation|memory-loading|resource-loading|missing-delegation|parallelization|batching|delegation|memory-optimization|resource-optimization", - "title": "Brief description", - "detail": "What it does now, and estimated time/token savings", - "action": "What it should do instead" - } - ], - "summary": { - "total_findings": 0, - "by_severity": {"critical": 0, "high": 0, "medium": 0, "low": 0}, - "by_category": {} - } -} -``` +## Output -Merge all items into the single `findings[]` array: -- Former `issues[]` items: map `issue` to `title`, merge `current_pattern`+`estimated_savings` into `detail`, map `efficient_alternative` to `action` -- Former `opportunities[]` items: map `description` to `title`, merge details into 
`detail`, map `recommendation` to `action`, use severity like `medium-opportunity` +Write your analysis as a natural document. Include: -## Process +- **Assessment** — overall efficiency verdict in 2-3 sentences +- **Key findings** — each with severity (critical/high/medium/low), affected file:line, current pattern, efficient alternative, and estimated savings. Critical = circular deps or subagent-from-subagent. High = parent-reads-before-delegating, sequential independent ops. Medium = missed batching, ordering issues. Low = minor opportunities. +- **Optimization opportunities** — larger structural changes with estimated impact +- **What's already efficient** — patterns worth preserving -Read pre-pass JSON and raw files as needed. Evaluate efficiency across all dimensions above. Write JSON to `{quality-report-dir}/execution-efficiency-temp.json`. Return only the filename. +Be specific about file paths, line numbers, and savings estimates. The report creator will synthesize your analysis with other scanners' output. -## Critical After Draft Output +Write your analysis to: `{quality-report-dir}/execution-efficiency-analysis.md` -Before finalizing, verify findings target genuine inefficiencies with measurable impact. +Return only the filename when complete. 
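The sequential-independent pattern this scanner flags has a direct analogue in ordinary code: independent operations awaited one at a time instead of together. A minimal before/after sketch with placeholder task names, none of which belong to the skill itself:

```python
from concurrent.futures import ThreadPoolExecutor

def scan(name: str) -> str:
    # Stand-in for one independent scanner invocation.
    return f"{name} done"

names = ["structure", "prompt-craft", "cohesion"]

# Sequential (flagged): each call waits for the previous one.
# results = [scan(n) for n in names]

# Parallel: submit all, then collect; wall time tracks the slowest task.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(scan, names))

print(results)  # ['structure done', 'prompt-craft done', 'cohesion done']
```

`ThreadPoolExecutor.map` preserves input order, so downstream steps can still match results to tasks by position.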
diff --git a/skills/bmad-agent-builder/quality-scan-prompt-craft.md b/skills/bmad-agent-builder/quality-scan-prompt-craft.md index e55319b..cd33bb4 100644 --- a/skills/bmad-agent-builder/quality-scan-prompt-craft.md +++ b/skills/bmad-agent-builder/quality-scan-prompt-craft.md @@ -135,6 +135,26 @@ Do NOT flag these: | Companion/interactive agent | Outcome + persona + communication guidance | Needs to read user and adapt | | Workflow facilitator agent | Outcome + rationale + selective HOW | Needs to understand WHY for routing | +### Pruning: Instructions the Agent Doesn't Need + +Beyond micro-step over-specification, check for entire blocks that teach the LLM something it already knows — or that repeat what the agent's persona context already establishes. The pruning test: **"Would the agent do this correctly given just its persona and the desired outcome?"** If yes, the block is noise. + +**Flag as HIGH when a capability prompt contains any of these:** + +| Anti-Pattern | Why It's Noise | Example | +|-------------|----------------|---------| +| Scoring formulas for subjective judgment | LLMs naturally assess relevance without numeric weights | "Score each option: relevance(×4) + novelty(×3)" | +| Capability prompt repeating identity/style from SKILL.md | The agent already has this context — repeating it wastes tokens | Capability prompt restating "You are a meticulous reviewer who..." | +| Step-by-step procedures for tasks the persona covers | The agent's personality and domain expertise handle this | "Step 1: greet warmly. Step 2: ask about their day. 
Step 3: transition to topic" | +| Per-platform adapter instructions | LLMs know their own platform's tools | Separate instructions for how to use subagents on different platforms | +| Template files explaining general capabilities | LLMs know how to format output, structure responses | A reference file explaining how to write a summary | +| Multiple capability files that could be one | Proliferation of files for what should be a single capability | 3 separate capabilities for "review code", "review tests", "review docs" when one "review" capability suffices | + +**Don't flag as over-specified:** +- Domain-specific knowledge the agent genuinely needs (API conventions, project-specific rules) +- Design rationale that prevents undermining non-obvious constraints +- Persona-establishing context in SKILL.md (identity, style, principles — this is load-bearing, not waste) + ### Structural Anti-Patterns | Pattern | Threshold | Fix | |---------|-----------|-----| @@ -156,69 +176,27 @@ Do NOT flag these: | Severity | When to Apply | |----------|---------------| | **Critical** | Missing progression conditions, self-containment failures, intelligence leaks into scripts | -| **High** | Pervasive defensive padding, SKILL.md over size guidelines with no progressive disclosure, over-optimized complex agent (empty Overview, no persona context), persona voice stripped to bare skeleton | -| **Medium** | Moderate token waste, over-specified procedures, minor voice inconsistency | +| **High** | Pervasive over-specification (scoring algorithms, capability prompts repeating persona context, adapter proliferation — see Pruning section), SKILL.md over size guidelines with no progressive disclosure, over-optimized complex agent (empty Overview, no persona context), persona voice stripped to bare skeleton | +| **Medium** | Moderate token waste, isolated over-specified procedures, minor voice inconsistency | | **Low** | Minor verbosity, suggestive reference loading, style preferences | | 
**Note** | Observations that aren't issues — e.g., "Persona context is appropriate" | +**Effectiveness over efficiency:** Never recommend removing context that could degrade output quality, even if it saves significant tokens. Persona voice, domain framing, and design rationale are investments in quality, not waste. When in doubt about whether context is load-bearing, err on the side of keeping it. + --- -## Output Format - -Output your findings using the universal schema defined in `references/universal-scan-schema.md`. - -Use EXACTLY these field names: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`. Do not rename, restructure, or add fields to findings. - -Before writing output, verify: Is your array called `findings`? Does every item have `title`, `detail`, `action`? Is `assessments` an object, not items in the findings array? - -You will receive `{skill-path}` and `{quality-report-dir}` as inputs. - -Write JSON findings to: `{quality-report-dir}/prompt-craft-temp.json` - -```json -{ - "scanner": "prompt-craft", - "skill_path": "{path}", - "findings": [ - { - "file": "SKILL.md|{name}.md", - "line": 42, - "severity": "critical|high|medium|low|note", - "category": "token-waste|anti-pattern|outcome-balance|progression|self-containment|intelligence-placement|overview-quality|progressive-disclosure|under-contextualized|persona-voice|communication-consistency|inline-data", - "title": "Brief description", - "detail": "Why this matters for prompt craft. 
Include any nuance about why this might be intentional.", - "action": "Specific action to resolve" - } - ], - "assessments": { - "skill_type_assessment": "simple-utility|domain-expert|companion-interactive|workflow-facilitator", - "skillmd_assessment": { - "overview_quality": "appropriate|excessive|missing|disconnected", - "progressive_disclosure": "good|needs-extraction|monolithic", - "persona_context": "appropriate|excessive|missing", - "notes": "Brief assessment of SKILL.md craft" - }, - "prompts_scanned": 0, - "prompt_health": { - "prompts_with_config_header": 0, - "prompts_with_progression_conditions": 0, - "prompts_self_contained": 0, - "total_prompts": 0 - } - }, - "summary": { - "total_findings": 0, - "by_severity": {"critical": 0, "high": 0, "medium": 0, "low": 0, "note": 0}, - "assessment": "Brief 1-2 sentence assessment", - "top_improvement": "Highest-impact improvement" - } -} -``` - -## Process - -Read pre-pass JSON and all prompt files. Evaluate using the criteria in Parts 1-3 above. Write JSON to `{quality-report-dir}/prompt-craft-temp.json`. Return only the filename. - -## Critical After Draft Output - -Before finalizing, verify all files were read, token-waste findings are genuine (not persona context), and suggestions would improve the agent holistically. +## Output + +Write your analysis as a natural document. Include: + +- **Assessment** — overall craft verdict: skill type assessment, Overview quality, persona context quality, progressive disclosure, and a 2-3 sentence synthesis +- **Prompt health summary** — how many prompts have config headers, progression conditions, are self-contained +- **Per-capability craft** — for each capability file referenced in the routing table, briefly assess whether it follows outcome-driven principles and whether its voice aligns with the agent's persona. Flag capabilities that are over-specified or under-contextualized. 
+- **Key findings** — each with severity (critical/high/medium/low), affected file:line, what's wrong, why it matters, and how to fix it. Distinguish genuine waste from persona-serving context. +- **Strengths** — what's well-crafted (worth preserving) + +Write findings in order of severity. Be specific about file paths and line numbers. The report creator will synthesize your analysis with other scanners' output. + +Write your analysis to: `{quality-report-dir}/prompt-craft-analysis.md` + +Return only the filename when complete. diff --git a/skills/bmad-agent-builder/quality-scan-script-opportunities.md b/skills/bmad-agent-builder/quality-scan-script-opportunities.md index 27626b7..903bb09 100644 --- a/skills/bmad-agent-builder/quality-scan-script-opportunities.md +++ b/skills/bmad-agent-builder/quality-scan-script-opportunities.md @@ -168,7 +168,7 @@ For each script opportunity found, also assess: | Dimension | Question | |-----------|----------| | **Pre-pass potential** | Could this script feed structured data to an existing LLM scanner? | -| **Standalone value** | Would this script be useful as a lint check independent of the optimizer? | +| **Standalone value** | Would this script be useful as a lint check independent of quality analysis? | | **Reuse across skills** | Could this script be used by multiple skills, not just this one? | | **--help self-documentation** | Prompts that invoke this script can use `--help` instead of inlining the interface — note the token savings | @@ -184,49 +184,17 @@ For each script opportunity found, also assess: --- -## Output Format +## Output -Output your findings using the universal schema defined in `references/universal-scan-schema.md`. +Write your analysis as a natural document. Include: -Use EXACTLY these field names: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`. Do not rename, restructure, or add fields to findings. 
+- **Existing scripts inventory** — what scripts already exist in the agent +- **Assessment** — overall verdict on intelligence placement in 2-3 sentences +- **Key findings** — deterministic operations found in prompts. Each with severity (high/medium/low based on LLM Tax: high = 500+ tokens, medium = 100-500, low = <100), affected file:line, what the LLM is currently doing, what a script would do instead, estimated token savings, and whether it could serve as a pre-pass +- **Aggregate savings** — total estimated token savings across all opportunities -Before writing output, verify: Is your array called `findings`? Does every item have `title`, `detail`, `action`? Is `assessments` an object, not items in the findings array? +Be specific about file paths and line numbers. Think broadly about what scripts can accomplish. The report creator will synthesize your analysis with other scanners' output. -You will receive `{skill-path}` and `{quality-report-dir}` as inputs. +Write your analysis to: `{quality-report-dir}/script-opportunities-analysis.md` -Write JSON findings to: `{quality-report-dir}/script-opportunities-temp.json` - -```json -{ - "scanner": "script-opportunities", - "skill_path": "{path}", - "findings": [ - { - "file": "SKILL.md|{name}.md", - "line": 42, - "severity": "high|medium|low", - "category": "validation|extraction|transformation|counting|comparison|structure|graph|preprocessing|postprocessing", - "title": "What the LLM is currently doing", - "detail": "Determinism confidence: certain|high|moderate. Estimated token savings: N per invocation. Implementation complexity: trivial|moderate|complex. Language: python|bash|either. Could be prepass: yes/no. Feeds scanner: name if applicable. Reusable across skills: yes/no. 
Help pattern savings: additional prompt tokens saved by using --help instead of inlining interface.", - "action": "What a script would do instead" - } - ], - "assessments": { - "existing_scripts": ["list of scripts that already exist in the agent's scripts/ folder"] - }, - "summary": { - "total_findings": 0, - "by_severity": {"high": 0, "medium": 0, "low": 0}, - "by_category": {}, - "assessment": "Brief assessment including total estimated token savings, the single highest-value opportunity, and how many findings could become pre-pass scripts for LLM scanners" - } -} -``` - -## Process - -Read all agent files and the scripts/ directory. Apply the determinism test and category analysis described above. Write findings to `{quality-report-dir}/script-opportunities-temp.json`. Return only the filename. - -## Critical After Draft Output - -Before finalizing, verify flagged operations are truly deterministic, existing scripts aren't duplicated, and you stayed in your lane. +Return only the filename when complete. diff --git a/skills/bmad-agent-builder/quality-scan-structure.md b/skills/bmad-agent-builder/quality-scan-structure.md index c705553..5132b78 100644 --- a/skills/bmad-agent-builder/quality-scan-structure.md +++ b/skills/bmad-agent-builder/quality-scan-structure.md @@ -52,7 +52,7 @@ Include all pre-pass findings in your output, preserved as-is. These are determi | Description mentions key action verbs matching capabilities | Users invoke agents with action-oriented language | | Description distinguishes this agent from similar agents | Ambiguous descriptions cause wrong-agent activation | | Description follows two-part format: [5-8 word summary]. 
[trigger clause] | Standard format ensures consistent triggering behavior | -| Trigger clause uses quoted specific phrases ('create agent', 'optimize agent') | Specific phrases prevent false activations | +| Trigger clause uses quoted specific phrases ('create agent', 'analyze agent') | Specific phrases prevent false activations | | Trigger clause is conservative (explicit invocation) unless organic activation is intentional | Most skills should only fire on direct requests, not casual mentions | ### Identity Effectiveness @@ -76,6 +76,23 @@ Include all pre-pass findings in your output, preserved as-is. These are determi | Principles relate to the agent's specific domain | Generic principles waste tokens | | Principles create clear decision frameworks | Good principles help the agent resolve ambiguity | +### Over-Specification of LLM Capabilities + +Agents should describe outcomes, not prescribe procedures for things the LLM does naturally. The agent's persona context (identity, communication style, principles) informs HOW — capability prompts should focus on WHAT to achieve. 
Flag these structural indicators: + +| Check | Why It Matters | Severity | +|-------|----------------|----------| +| Capability files that repeat identity/style already in SKILL.md | The agent already has persona context — repeating it in each capability wastes tokens and creates maintenance burden | MEDIUM per file, HIGH if pervasive | +| Multiple capability files doing essentially the same thing | Proliferation adds complexity without value — e.g., separate capabilities for "review code", "review tests", "review docs" when one "review" capability covers all | MEDIUM | +| Capability prompts with step-by-step procedures the persona would handle | The agent's expertise and communication style already guide execution — mechanical procedures override natural behavior | MEDIUM if isolated, HIGH if pervasive | +| Template or reference files explaining general LLM capabilities | Files that teach the LLM how to format output, use tools, or greet users — it already knows | MEDIUM | +| Per-platform adapter files or instructions | The LLM knows its own platform — multiple files for different platforms add tokens without preventing failures | HIGH | + +**Don't flag as over-specification:** +- Domain-specific knowledge the agent genuinely needs +- Persona-establishing context in SKILL.md (identity, style, principles are load-bearing) +- Design rationale for non-obvious choices + ### Logical Consistency | Check | Why It Matters | |-------|----------------| @@ -110,52 +127,19 @@ Include all pre-pass findings in your output, preserved as-is. These are determi --- -## Output Format - -Output your findings using the universal schema defined in `references/universal-scan-schema.md`. - -Use EXACTLY these field names: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`. Do not rename, restructure, or add fields to findings. - -Before writing output, verify: Is your array called `findings`? Does every item have `title`, `detail`, `action`? 
Is `assessments` an object, not items in the findings array? - -You will receive `{skill-path}` and `{quality-report-dir}` as inputs. - -Write JSON findings to: `{quality-report-dir}/structure-temp.json` - -```json -{ - "scanner": "structure", - "skill_path": "{path}", - "findings": [ - { - "file": "SKILL.md|{name}.md", - "line": 42, - "severity": "critical|high|medium|low", - "category": "frontmatter|sections|artifacts|capabilities|identity|communication-style|principles|consistency|memory-setup|headless-mode|activation-sequence", - "title": "Brief description", - "detail": "", - "action": "Specific action to resolve" - } - ], - "assessments": { - "sections_found": ["Overview", "Identity"], - "capabilities_count": 0, - "has_memory": false, - "has_headless": false, - }, - "summary": { - "total_findings": 0, - "by_severity": {"critical": 0, "high": 0, "medium": 0, "low": 0}, - "by_category": {}, - "assessment": "Brief 1-2 sentence assessment" - } -} -``` - -## Process - -Read pre-pass JSON (include all findings verbatim). Read raw files for judgment-based assessment as described above. Write findings to `{quality-report-dir}/structure-temp.json`. Return only the filename. - -## Critical After Draft Output - -Before finalizing, verify findings cover all structural dimensions and severity ratings are honest. +## Output + +Write your analysis as a natural document. 
Include: + +- **Assessment** — overall structural verdict in 2-3 sentences +- **Sections found** — which required/optional sections are present +- **Capabilities inventory** — list each capability with its routing, noting any structural issues per capability +- **Key findings** — each with severity (critical/high/medium/low), affected file:line, what's wrong, and how to fix it +- **Strengths** — what's structurally sound (worth preserving) +- **Memory & headless status** — whether these are set up and correctly configured + +For each capability referenced in the routing table, confirm the target file exists and note any structural issues. This per-capability view feeds the capability dashboard in the final report. + +Write your analysis to: `{quality-report-dir}/structure-analysis.md` + +Return only the filename when complete. diff --git a/skills/bmad-agent-builder/references/quality-dimensions.md b/skills/bmad-agent-builder/references/quality-dimensions.md index a1c95bf..29626cc 100644 --- a/skills/bmad-agent-builder/references/quality-dimensions.md +++ b/skills/bmad-agent-builder/references/quality-dimensions.md @@ -1,8 +1,16 @@ # Quality Dimensions — Quick Reference -Six dimensions to keep in mind when building agent skills. The quality scanners check these automatically during optimization — this is a mental checklist for the build phase. +Seven dimensions to keep in mind when building agent skills. The quality scanners check these automatically during quality analysis — this is a mental checklist for the build phase. -## 1. Informed Autonomy +## 1. Outcome-Driven Design + +Describe what each capability achieves, not how to do it step by step. The agent's persona context (identity, communication style, principles) informs HOW — capability prompts just need the WHAT. + +- **The test:** Would removing this instruction cause the agent to produce a worse outcome? If the agent would do it anyway given its persona and the desired outcome, the instruction is noise. 
+- **Pruning:** If a capability prompt teaches the LLM something it already knows — or repeats guidance already in the agent's identity/style — cut it. +- **When procedure IS value:** Exact script invocations, specific file paths, API calls, security-critical operations. These need low freedom. + +## 2. Informed Autonomy The executing agent needs enough context to make judgment calls when situations don't match the script. The Overview section establishes this: domain framing, theory of mind, design rationale. @@ -10,15 +18,15 @@ The executing agent needs enough context to make judgment calls when situations - Agents with memory, autonomous mode, or complex capabilities need domain understanding, user perspective, and rationale for non-obvious choices - When in doubt, explain *why* — an agent that understands the mission improvises better than one following blind steps -## 2. Intelligence Placement +## 3. Intelligence Placement Scripts handle plumbing (fetch, transform, validate). Prompts handle judgment (interpret, classify, decide). **Test:** If a script contains an `if` that decides what content *means*, intelligence has leaked. -**Reverse test:** If a prompt validates structure, counts items, parses known formats, compares against schemas, or checks file existence — determinism has leaked into the LLM. That work belongs in a script. Scripts have access to full bash, Python with standard library plus PEP 723 dependencies, and system tools — think broadly about what can be offloaded. +**Reverse test:** If a prompt validates structure, counts items, parses known formats, compares against schemas, or checks file existence — determinism has leaked into the LLM. That work belongs in a script. -## 3. Progressive Disclosure +## 4. Progressive Disclosure SKILL.md stays focused. Detail goes where it belongs. @@ -29,18 +37,18 @@ SKILL.md stays focused. Detail goes where it belongs. 
- Multi-capability SKILL.md under ~250 lines: fine as-is - Single-purpose up to ~500 lines: acceptable if focused -## 4. Description Format +## 5. Description Format Two parts: `[5-8 word summary]. [Use when user says 'X' or 'Y'.]` -Default to conservative triggering. See `./references/standard-fields.md` for full format and examples. +Default to conservative triggering. See `./references/standard-fields.md` for full format. -## 5. Path Construction +## 6. Path Construction Only use `{project-root}` for `_bmad` paths. Config variables used directly — they already contain `{project-root}`. See `./references/standard-fields.md` for correct/incorrect patterns. -## 6. Token Efficiency +## 7. Token Efficiency -Remove genuine waste (repetition, defensive padding, meta-explanation). Preserve context that enables judgment (domain framing, theory of mind, design rationale). These are different things — the prompt-craft scanner distinguishes between them. +Remove genuine waste (repetition, defensive padding, meta-explanation). Preserve context that enables judgment (persona voice, domain framing, theory of mind, design rationale). These are different things — never trade effectiveness for efficiency. A capability that works correctly but uses extra tokens is always better than one that's lean but fails edge cases. diff --git a/skills/bmad-agent-builder/references/script-opportunities-reference.md b/skills/bmad-agent-builder/references/script-opportunities-reference.md index b1c7602..1f24ee7 100644 --- a/skills/bmad-agent-builder/references/script-opportunities-reference.md +++ b/skills/bmad-agent-builder/references/script-opportunities-reference.md @@ -302,9 +302,9 @@ When creating validation scripts: --- -## Integration with Quality Optimizer +## Integration with Quality Analysis -The Quality Optimizer should: +The Quality Analysis skill should: 1. **First**: Run available scripts for fast, deterministic checks 2. 
**Then**: Use sub-agents for semantic analysis (requires judgment) diff --git a/skills/bmad-agent-builder/references/universal-scan-schema.md b/skills/bmad-agent-builder/references/universal-scan-schema.md deleted file mode 100644 index d49bfcc..0000000 --- a/skills/bmad-agent-builder/references/universal-scan-schema.md +++ /dev/null @@ -1,267 +0,0 @@ -# Universal Scanner Output Schema - -All quality scanners — both LLM-based and deterministic lint scripts — MUST produce output conforming to this schema. No exceptions. - -## Top-Level Structure - -```json -{ - "scanner": "scanner-name", - "skill_path": "{path}", - "findings": [], - "assessments": {}, - "summary": { - "total_findings": 0, - "by_severity": {}, - "assessment": "1-2 sentence overall assessment" - } -} -``` - -| Key | Type | Required | Description | -|-----|------|----------|-------------| -| `scanner` | string | yes | Scanner identifier (e.g., `"workflow-integrity"`, `"prompt-craft"`) | -| `skill_path` | string | yes | Absolute path to the skill being scanned | -| `findings` | array | yes | ALL items — issues, strengths, suggestions, opportunities. Always an array, never an object | -| `assessments` | object | yes | Scanner-specific structured analysis (cohesion tables, health metrics, user journeys, etc.). Free-form per scanner | -| `summary` | object | yes | Aggregate counts and brief overall assessment | - -## Finding Schema (7 fields) - -Every item in `findings[]` has exactly these 7 fields: - -```json -{ - "file": "SKILL.md", - "line": 42, - "severity": "high", - "category": "frontmatter", - "title": "Brief headline of the finding", - "detail": "Full context — rationale, what was observed, why it matters", - "action": "What to do about it — fix, suggestion, or script to create" -} -``` - -| Field | Type | Required | Description | -|-------|------|----------|-------------| -| `file` | string | yes | Relative path to the affected file (e.g., `"SKILL.md"`, `"scripts/build.py"`). 
Empty string if not file-specific | -| `line` | int\|null | no | Line number (1-based). `null` or `0` if not line-specific | -| `severity` | string | yes | One of the severity values below | -| `category` | string | yes | Scanner-specific category (e.g., `"frontmatter"`, `"token-waste"`, `"lint"`) | -| `title` | string | yes | Brief headline (1 sentence). This is the primary display text | -| `detail` | string | yes | Full context — fold rationale, observation, impact, nuance into one narrative. Empty string if title is self-explanatory | -| `action` | string | yes | What to do — fix instruction, suggestion, or script to create. Empty string for strengths/notes | - -## Severity Values (complete enum) - -``` -critical | high | medium | low | high-opportunity | medium-opportunity | low-opportunity | suggestion | strength | note -``` - -**Routing rules:** -- `critical`, `high` → "Truly Broken" section in report -- `medium`, `low` → category-specific findings sections -- `high-opportunity`, `medium-opportunity`, `low-opportunity` → enhancement/creative sections -- `suggestion` → creative suggestions section -- `strength` → strengths section (positive observations worth preserving) -- `note` → informational observations, also routed to strengths - -## Assessment Sub-Structure Contracts - -The `assessments` object is free-form per scanner, but the HTML report renderer expects specific shapes for specific keys. These are the canonical formats. - -### user_journeys (enhancement-opportunities scanner) - -**Always an array of objects. 
Never an object keyed by persona.** - -```json -"user_journeys": [ - { - "archetype": "first-timer", - "summary": "Brief narrative of this user's experience", - "friction_points": ["moment 1", "moment 2"], - "bright_spots": ["what works well"] - } -] -``` - -### autonomous_assessment (enhancement-opportunities scanner) - -```json -"autonomous_assessment": { - "potential": "headless-ready|easily-adaptable|partially-adaptable|fundamentally-interactive", - "hitl_points": 3, - "auto_resolvable": 2, - "needs_input": 1, - "notes": "Brief assessment" -} -``` - -### top_insights (enhancement-opportunities scanner) - -**Always an array of objects with title/detail/action (same shape as findings but without file/line/severity/category).** - -```json -"top_insights": [ - { - "title": "The key observation", - "detail": "Why it matters", - "action": "What to do about it" - } -] -``` - -### cohesion_analysis (skill-cohesion / agent-cohesion scanner) - -```json -"cohesion_analysis": { - "dimension_name": { "score": "strong|moderate|weak", "notes": "explanation" } -} -``` - -Dimension names are scanner-specific (e.g., `stage_flow_coherence`, `persona_alignment`). The report renderer iterates all keys and renders a table row per dimension. - -### skill_identity / agent_identity (cohesion scanners) - -```json -"skill_identity": { - "name": "skill-name", - "purpose_summary": "Brief characterization", - "primary_outcome": "What this skill produces" -} -``` - -### skillmd_assessment (prompt-craft scanner) - -```json -"skillmd_assessment": { - "overview_quality": "appropriate|excessive|missing", - "progressive_disclosure": "good|needs-extraction|monolithic", - "notes": "brief assessment" -} -``` - -Agent variant adds `"persona_context": "appropriate|excessive|missing"`. 
- -### prompt_health (prompt-craft scanner) - -```json -"prompt_health": { - "total_prompts": 3, - "with_config_header": 2, - "with_progression": 1, - "self_contained": 3 -} -``` - -### skill_understanding (enhancement-opportunities scanner) - -```json -"skill_understanding": { - "purpose": "what this skill does", - "primary_user": "who it's for", - "assumptions": ["assumption 1", "assumption 2"] -} -``` - -### stage_summary (workflow-integrity scanner) - -```json -"stage_summary": { - "total_stages": 0, - "missing_stages": [], - "orphaned_stages": [], - "stages_without_progression": [], - "stages_without_config_header": [] -} -``` - -### metadata (structure scanner) - -Free-form key-value pairs. Rendered as a metadata block. - -### script_summary (scripts lint) - -```json -"script_summary": { - "total_scripts": 5, - "by_type": {"python": 3, "shell": 2}, - "missing_tests": ["script1.py"] -} -``` - -### existing_scripts (script-opportunities scanner) - -Array of strings (script paths that already exist). - -## Complete Example - -```json -{ - "scanner": "workflow-integrity", - "skill_path": "/path/to/skill", - "findings": [ - { - "file": "SKILL.md", - "line": 12, - "severity": "high", - "category": "frontmatter", - "title": "Missing 'description' field in frontmatter", - "detail": "The SKILL.md frontmatter is missing the description field. Without a description, the skill cannot be triggered reliably by the help system.", - "action": "Add a description with trigger phrases to the YAML frontmatter block" - }, - { - "file": "build-process.md", - "line": null, - "severity": "strength", - "category": "design", - "title": "Excellent progressive disclosure pattern in build stages", - "detail": "Each stage provides exactly the context needed without front-loading information. 
This reduces token waste and improves LLM comprehension.", - "action": "" - }, - { - "file": "SKILL.md", - "line": 45, - "severity": "medium-opportunity", - "category": "experience-gap", - "title": "No guidance for first-time users unfamiliar with build workflows", - "detail": "A user encountering this skill for the first time has no onboarding path. The skill assumes familiarity with stage-based workflows, which creates friction for newcomers.", - "action": "Add a 'Getting Started' section or link to onboarding documentation" - } - ], - "assessments": { - "stage_summary": { - "total_stages": 7, - "missing_stages": [], - "orphaned_stages": ["cleanup"] - } - }, - "summary": { - "total_findings": 3, - "by_severity": {"high": 1, "medium-opportunity": 1, "strength": 1}, - "assessment": "Well-structured skill with one critical frontmatter gap. Progressive disclosure is a notable strength." - } -} -``` - -## DO NOT - -- **DO NOT** rename fields. Use exactly: `file`, `line`, `severity`, `category`, `title`, `detail`, `action` -- **DO NOT** use `issues` instead of `findings` — the array is always called `findings` -- **DO NOT** add fields to findings beyond the 7 defined above. Put scanner-specific structured data in `assessments` -- **DO NOT** use separate arrays for strengths, suggestions, or opportunities — they go in `findings` with appropriate severity values -- **DO NOT** change `user_journeys` from an array to an object keyed by persona name -- **DO NOT** restructure assessment sub-objects — use the shapes defined above -- **DO NOT** put free-form narrative data into `assessments` — that belongs in `detail` fields of findings or in `summary.assessment` - -## Self-Check Before Output - -Before writing your JSON output, verify: - -1. Is your array called `findings` (not `issues`, not `opportunities`)? -2. Does every item in `findings` have all 7 fields: `file`, `line`, `severity`, `category`, `title`, `detail`, `action`? -3. 
Are strengths in `findings` with `severity: "strength"` (not in a separate `strengths` array)? -4. Are suggestions in `findings` with `severity: "suggestion"` (not in a separate `creative_suggestions` array)? -5. Is `assessments` an object containing structured analysis data (not items that belong in findings)? -6. Is `user_journeys` an array of objects (not an object keyed by persona)? -7. Do `top_insights` items use `title`/`detail`/`action` (not `insight`/`suggestion`/`why_it_matters`)? diff --git a/skills/bmad-agent-builder/report-quality-scan-creator.md b/skills/bmad-agent-builder/report-quality-scan-creator.md index ffa7161..3c0aee3 100644 --- a/skills/bmad-agent-builder/report-quality-scan-creator.md +++ b/skills/bmad-agent-builder/report-quality-scan-creator.md @@ -1,119 +1,276 @@ -# Quality Scan Report Creator +# BMad Method · Quality Analysis Report Creator -You are a master quality engineer tech writer agent QualityReportBot-9001. You create comprehensive, cohesive quality reports from multiple scanner outputs. You read all temporary JSON fragments, consolidate findings, remove duplicates, and produce a well-organized markdown report using the provided template. You are quality obsessed — nothing gets dropped. You will never attempt to fix anything — you are a writer, not a fixer. +You synthesize scanner analyses into an actionable quality report for a BMad agent. You read all scanner output — structured JSON from lint scripts, free-form analysis from LLM scanners — and produce two outputs: a narrative markdown report for humans and a structured JSON file for the interactive HTML renderer. + +Your job is **synthesis, not transcription.** Don't list findings by scanner. Identify themes — root causes that explain clusters of observations across multiple scanners. Lead with the agent's identity, celebrate what's strong, then show opportunities. 
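The ingest half of this job is mechanical and can be sketched. This is a minimal sketch, not part of the skill itself: the `load_scanner_output` helper and its return shape are hypothetical, though the three filename patterns match the conventions this report creator reads.

```python
import json
from pathlib import Path


def load_scanner_output(report_dir: str) -> dict:
    """Partition scanner output by kind: lint JSON, pre-pass JSON, free-form markdown."""
    out = {"lint": {}, "prepass": {}, "analysis": {}}
    for path in sorted(Path(report_dir).iterdir()):
        name = path.name
        if name.endswith("-temp.json"):
            # Structured lint-script findings
            out["lint"][name] = json.loads(path.read_text())
        elif name.endswith("-prepass.json"):
            # Deterministic pre-pass metrics
            out["prepass"][name] = json.loads(path.read_text())
        elif name.endswith("-analysis.md"):
            # Free-form LLM scanner analysis, kept as raw markdown
            out["analysis"][name] = path.read_text()
    return out
```

The synthesis itself stays with the LLM; a loader like this only guarantees nothing in the report directory is silently skipped.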
## Inputs

-- `{skill-path}` — Path to the agent being validated
-- `{quality-report-dir}` — Directory containing scanner temp files AND where to write the final report
+- `{skill-path}` — Path to the agent being analyzed
+- `{quality-report-dir}` — Directory containing all scanner output AND where to write your reports

-## Template

+## Process

-Read `assets/quality-report-template.md` for the report structure. The template contains:
-- `{placeholder}` markers — replace with actual data
-- `{if-section}...{/if-section}` blocks — include only when data exists, omit entirely when empty
-- `` — inline guidance for what data to pull and from where; strip from final output

+### Step 1: Read Everything

-## Process

+Read all files in `{quality-report-dir}`:
+- `*-temp.json` — Lint script output (structured JSON with findings arrays)
+- `*-prepass.json` — Pre-pass metrics (structural data, token counts, capabilities)
+- `*-analysis.md` — LLM scanner analyses (free-form markdown)
+
+Also read the agent's `SKILL.md` to extract: name, icon, title, identity, communication style, principles, and the capability routing table.
+
+### Step 2: Build the Agent Portrait
+
+From the agent's SKILL.md, synthesize a 2-3 sentence portrait that captures who this agent is — their personality, expertise, and voice. This opens the report and lets the user see their agent reflected back before any critique. Include the agent's icon, display name, and title.
+
+### Step 3: Build the Capability Dashboard
+
+From the routing table in SKILL.md, list every capability. Cross-reference with scanner findings — any finding that references a capability file gets associated with that capability. Rate each:
+- **Good** — no findings or only low/note severity
+- **Needs attention** — medium+ findings referencing this capability
+
+This dashboard shows the user the breadth of what they built and directs attention where it's needed.
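Step 3's rating rule is deterministic enough to sketch. A hedged sketch, assuming findings carry the `file` and `severity` fields used throughout this skill's report schema; the `rate_capability` helper and the numeric severity ranking are illustrative, not part of the spec.

```python
# Ranks mirror the severity ladder used in findings; anything medium or above
# flips a capability from "good" to "needs-attention".
SEVERITY_RANK = {"note": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def rate_capability(cap_file: str, findings: list) -> dict:
    """Associate findings with one capability file and rate it for the dashboard."""
    hits = [f for f in findings if f.get("file") == cap_file]
    needs_attention = any(
        SEVERITY_RANK.get(f.get("severity"), 0) >= SEVERITY_RANK["medium"]
        for f in hits
    )
    return {
        "file": cap_file,
        "status": "needs-attention" if needs_attention else "good",
        "finding_count": len(hits),
    }
```

Running this per routing-table entry yields exactly the `capabilities` array shape the report-data.json contract expects.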
+ +### Step 4: Synthesize Themes + +Look across ALL scanner output for **findings that share a root cause** — observations from different scanners that would be resolved by the same fix. + +Ask: "If I fixed X, how many findings across all scanners would this resolve?" + +Group related findings into 3-5 themes. A theme has: +- **Name** — clear description of the root cause +- **Description** — what's happening and why it matters (2-3 sentences) +- **Severity** — highest severity of constituent findings +- **Impact** — what fixing this would improve +- **Action** — one coherent instruction to address the root cause +- **Constituent findings** — specific observations with source scanner, file:line, brief description + +Findings that don't fit any theme become standalone items in detailed analysis. + +### Step 5: Assess Overall Quality + +- **Grade:** Excellent / Good / Fair / Poor (based on severity distribution) +- **Narrative:** 2-3 sentences capturing the agent's primary strength and primary opportunity + +### Step 6: Collect Strengths + +Gather strengths from all scanners. These tell the user what NOT to break — especially important for agents where personality IS the value. + +### Step 7: Organize Detailed Analysis -### Step 1: Ingest Everything +For each analysis dimension, summarize the scanner's assessment and list findings not covered by themes: +- **Structure & Capabilities** — from structure scanner +- **Persona & Voice** — from prompt-craft scanner (agent-specific framing) +- **Identity Cohesion** — from agent-cohesion scanner +- **Execution Efficiency** — from execution-efficiency scanner +- **Conversation Experience** — from enhancement-opportunities scanner (journeys, headless, edge cases) +- **Script Opportunities** — from script-opportunities scanner -1. Read `assets/quality-report-template.md` -2. List ALL files in `{quality-report-dir}` — both `*-temp.json` (scanner findings) and `*-prepass.json` (structural metrics) -3. 
Read EVERY JSON file +### Step 8: Rank Recommendations -### Step 2: Extract All Data Types +Order by impact — "how many findings does fixing this resolve?" The fix that clears 9 findings ranks above the fix that clears 1. -All scanners now use the universal schema defined in `references/universal-scan-schema.md`. Scanner-specific data lives in `assessments{}`, not as top-level keys. +## Write Two Files -For each scanner file, extract not just `findings` arrays but ALL of these data types: +### 1. quality-report.md -| Data Type | Where It Lives | Report Destination | -|-----------|---------------|-------------------| -| Issues/findings (severity: critical-low) | All scanner `findings[]` | Detailed Findings by Category | -| Strengths (severity: "strength"/"note", category: "strength") | All scanners: findings where severity="strength" | Strengths section | -| Agent identity | agent-cohesion `assessments.agent_identity` | Agent Identity section + Executive Summary | -| Cohesion dimensional analysis | agent-cohesion `assessments.cohesion_analysis` | Cohesion Analysis table | -| Consolidation opportunities | agent-cohesion `assessments.cohesion_analysis.redundancy_level.consolidation_opportunities` | Consolidation Opportunities in Cohesion | -| Creative suggestions | `findings[]` with severity="suggestion" (no separate creative_suggestions array) | Creative Suggestions in Cohesion section | -| Craft & agent assessment | prompt-craft `assessments.skillmd_assessment` (incl. `persona_context`), `assessments.prompt_health`, `summary.assessment` | Prompt Craft section header + Executive Summary | -| Structure metadata | structure `assessments.metadata` (has_memory, has_headless, etc.) 
| Structure & Capabilities section header | -| User journeys | enhancement-opportunities `assessments.user_journeys[]` | User Journeys section | -| Autonomous assessment | enhancement-opportunities `assessments.autonomous_assessment` | Autonomous Readiness section | -| Skill understanding | enhancement-opportunities `assessments.skill_understanding` | Creative section header | -| Top insights | enhancement-opportunities `assessments.top_insights[]` | Top Insights in Creative section | -| Optimization opportunities | `findings[]` with severity ending in "-opportunity" (no separate opportunities array) | Optimization Opportunities in Efficiency section | -| Script inventory & token savings | scripts `assessments.script_summary`, script-opportunities `summary` | Scripts sections | -| Prepass metrics | `*-prepass.json` files | Context data points where useful | +```markdown +# BMad Method · Quality Analysis: {agent-name} -### Step 3: Populate Template +**{icon} {display-name}** — {title} +**Analyzed:** {timestamp} | **Path:** {skill-path} +**Interactive report:** quality-report.html -Fill the template section by section, following the `` guidance in each. Key rules: +## Agent Portrait -- **Conditional sections:** Only include `{if-...}` blocks when the data exists. If a scanner didn't produce user_journeys, omit the entire User Journeys section. -- **Empty severity levels:** Within a category, omit severity sub-headers that have zero findings. -- **Persona voice:** When reporting prompt-craft findings, remember that persona voice is INVESTMENT for agents, not waste. Reflect the scanner's nuance field if present. -- **Strip comments:** Remove all `` blocks from final output. +{synthesized 2-3 sentence portrait} -### Step 4: Deduplicate +## Capabilities -- **Same issue, two scanners:** Keep ONE entry, cite both sources. Use the more detailed description. -- **Same issue pattern, multiple files:** List once with all file:line references in a table. 
-- **Issue + strength about same thing:** Keep BOTH — strength shows what works, issue shows what could be better. -- **Overlapping creative suggestions:** Merge into the richer description. -- **Routing:** "note"/"strength" severity → Strengths section. "suggestion" severity → Creative subsection. Do not mix these into issue lists. +| Capability | Status | Observations | +|-----------|--------|-------------| +| {name} | Good / Needs attention | {count or —} | -### Step 5: Verification Pass +## Assessment -Re-read all temp files and verify every finding appears in the report. If any item was dropped, add it to the appropriate section before writing. +**{Grade}** — {narrative} -### Step 6: Write and Return +## What's Broken -Write report to: `{quality-report-dir}/quality-report.md` +{Only if critical/high issues exist} -Return JSON: +## Opportunities + +### 1. {Theme Name} ({severity} — {N} observations) + +{Description + Fix + constituent findings} + +## Strengths + +{What this agent does well} + +## Detailed Analysis + +### Structure & Capabilities +### Persona & Voice +### Identity Cohesion +### Execution Efficiency +### Conversation Experience +### Script Opportunities + +## Recommendations + +1. {Highest impact} +2. ... +``` + +### 2. report-data.json + +**CRITICAL: This file is consumed by a deterministic Python script. Use EXACTLY the field names shown below. Do not rename, restructure, or omit any required fields. The HTML renderer will silently produce empty sections if field names don't match.** + +Every `"..."` below is a placeholder for your content. Replace with actual values. Arrays may be empty `[]` but must exist. 
```json { - "report_file": "{full-path-to-report}", - "summary": { - "total_issues": 0, - "critical": 0, - "high": 0, - "medium": 0, - "low": 0, - "strengths_count": 0, - "enhancements_count": 0, - "user_journeys_count": 0, - "overall_quality": "Excellent|Good|Fair|Poor", - "overall_cohesion": "cohesive|mostly-cohesive|fragmented|confused", - "craft_assessment": "brief summary from prompt-craft", - "truly_broken_found": true, - "truly_broken_count": 0 + "meta": { + "skill_name": "the-agent-name", + "skill_path": "/full/path/to/agent", + "timestamp": "2026-03-26T23:03:03Z", + "scanner_count": 8, + "type": "agent" }, - "by_category": { - "structure_capabilities": {"critical": 0, "high": 0, "medium": 0, "low": 0}, - "prompt_craft": {"critical": 0, "high": 0, "medium": 0, "low": 0}, - "execution_efficiency": {"critical": 0, "high": 0, "medium": 0, "low": 0}, - "path_script_standards": {"critical": 0, "high": 0, "medium": 0, "low": 0}, - "agent_cohesion": {"critical": 0, "high": 0, "medium": 0, "low": 0}, - "creative": {"high_opportunity": 0, "medium_opportunity": 0, "low_opportunity": 0} + "agent_profile": { + "icon": "emoji icon from agent's SKILL.md", + "display_name": "Agent's display name", + "title": "Agent's title/role", + "portrait": "Synthesized 2-3 sentence personality portrait" }, - "high_impact_quick_wins": [ - {"issue": "description", "file": "location", "effort": "low"} + "capabilities": [ + { + "name": "Capability display name", + "file": "references/capability-file.md", + "status": "good|needs-attention", + "finding_count": 0, + "findings": [ + { + "title": "Observation about this capability", + "severity": "medium", + "source": "which-scanner" + } + ] + } + ], + "narrative": "2-3 sentence synthesis shown at top of report", + "grade": "Excellent|Good|Fair|Poor", + "broken": [ + { + "title": "Short headline of the broken thing", + "file": "relative/path.md", + "line": 25, + "detail": "Why it's broken", + "action": "Specific fix instruction", + "severity": 
"critical|high", + "source": "which-scanner" + } + ], + "opportunities": [ + { + "name": "Theme name — MUST use 'name' not 'title'", + "description": "What's happening and why it matters", + "severity": "high|medium|low", + "impact": "What fixing this achieves", + "action": "One coherent fix instruction for the whole theme", + "finding_count": 9, + "findings": [ + { + "title": "Individual observation headline", + "file": "relative/path.md", + "line": 42, + "detail": "What was observed", + "source": "which-scanner" + } + ] + } + ], + "strengths": [ + { + "title": "What's strong — MUST be an object with 'title', not a plain string", + "detail": "Why it matters and should be preserved" + } + ], + "detailed_analysis": { + "structure": { + "assessment": "1-3 sentence summary", + "findings": [] + }, + "persona": { + "assessment": "1-3 sentence summary", + "overview_quality": "appropriate|excessive|missing", + "findings": [] + }, + "cohesion": { + "assessment": "1-3 sentence summary", + "dimensions": { + "persona_capability_alignment": { "score": "strong|moderate|weak", "notes": "explanation" } + }, + "findings": [] + }, + "efficiency": { + "assessment": "1-3 sentence summary", + "findings": [] + }, + "experience": { + "assessment": "1-3 sentence summary", + "journeys": [ + { + "archetype": "first-timer|expert|confused|edge-case|hostile-environment|automator", + "summary": "Brief narrative of this user's experience", + "friction_points": ["moment where user struggles"], + "bright_spots": ["moment where agent shines"] + } + ], + "autonomous": { + "potential": "headless-ready|easily-adaptable|partially-adaptable|fundamentally-interactive", + "notes": "Brief assessment" + }, + "findings": [] + }, + "scripts": { + "assessment": "1-3 sentence summary", + "token_savings": "estimated total", + "findings": [] + } + }, + "recommendations": [ + { + "rank": 1, + "action": "What to do — MUST use 'action' not 'description'", + "resolves": 9, + "effort": "low|medium|high" + } ] } ``` 
-## Scanner Reference - -| Scanner | Temp File | Primary Category | -|---------|-----------|-----------------| -| structure | structure-temp.json | Structure & Capabilities | -| prompt-craft | prompt-craft-temp.json | Prompt Craft | -| execution-efficiency | execution-efficiency-temp.json | Execution Efficiency | -| path-standards | path-standards-temp.json | Path & Script Standards | -| scripts | scripts-temp.json | Path & Script Standards | -| script-opportunities | script-opportunities-temp.json | Script Opportunities | -| agent-cohesion | agent-cohesion-temp.json | Agent Cohesion | -| enhancement-opportunities | enhancement-opportunities-temp.json | Creative | +**Self-check before writing report-data.json:** +1. Is `meta.skill_name` present (not `meta.skill` or `meta.name`)? +2. Is `meta.scanner_count` a number (not an array)? +3. Does `agent_profile` have all 4 fields: `icon`, `display_name`, `title`, `portrait`? +4. Is every strength an object `{"title": "...", "detail": "..."}` (not a plain string)? +5. Does every opportunity use `name` (not `title`) and include `finding_count` and `findings` array? +6. Does every recommendation use `action` (not `description`) and include `rank` number? +7. Does every capability include `name`, `file`, `status`, `finding_count`, `findings`? +8. Are detailed_analysis keys exactly: `structure`, `persona`, `cohesion`, `efficiency`, `experience`, `scripts`? +9. Does every journey use `archetype` (not `persona`), `summary` (not `friction`), `friction_points` array, `bright_spots` array? +10. Does `autonomous` use `potential` and `notes`? + +Write both files to `{quality-report-dir}/`. + +## Return + +Return only the path to `report-data.json` when complete. + +## Key Principle + +You are the synthesis layer. Scanners analyze through individual lenses. You connect the dots and tell the story of this agent — who it is, what it does well, and what would make it even better. 
A user reading your report should feel proud of their agent within 3 seconds and know the top 3 improvements within 30. diff --git a/skills/bmad-agent-builder/scripts/generate-html-report.py b/skills/bmad-agent-builder/scripts/generate-html-report.py index a8614db..1d2cefe 100644 --- a/skills/bmad-agent-builder/scripts/generate-html-report.py +++ b/skills/bmad-agent-builder/scripts/generate-html-report.py @@ -4,22 +4,18 @@ #!/usr/bin/env python3 """ -Generate an interactive HTML quality report from scanner temp JSON files. +Generate an interactive HTML quality analysis report for a BMad agent. -Reads all *-temp.json and *-prepass.json files from a quality scan output -directory, normalizes findings into a unified data model, and produces a +Reads report-data.json produced by the report creator and renders a self-contained HTML report with: - - Collapsible sections with severity filter badges - - Per-item copy-prompt buttons - - Multi-select batch prompt generator - - Executive summary with severity counts + - BMad Method branding + - Agent portrait (icon, name, title, personality description) + - Capability dashboard with expandable per-capability findings + - Opportunity themes with "Fix This Theme" prompt generation + - Expandable strengths and detailed analysis Usage: - python3 generate-html-report.py {quality-report-dir} [--open] [--skill-path /path/to/skill] - -The --skill-path is embedded in the prompt context so generated prompts -reference the correct location. If omitted, it is read from the first -temp JSON that contains a skill_path field. 
+ python3 generate-html-report.py {quality-report-dir} [--open] """ from __future__ import annotations @@ -29,501 +25,32 @@ import platform import subprocess import sys -from datetime import datetime, timezone from pathlib import Path -# ============================================================================= -# Normalization — diverse scanner JSONs → unified item model -# ============================================================================= - -SEVERITY_RANK = { - 'critical': 0, 'high': 1, 'medium': 2, 'low': 3, - 'high-opportunity': 1, 'medium-opportunity': 2, 'low-opportunity': 3, - 'note': 4, 'strength': 5, 'suggestion': 4, 'info': 5, -} - -# Map scanner names to report sections -SCANNER_SECTIONS = { - 'workflow-integrity': 'structural', - 'structure': 'structure-capabilities', - 'prompt-craft': 'prompt-craft', - 'execution-efficiency': 'efficiency', - 'skill-cohesion': 'cohesion', - 'agent-cohesion': 'cohesion', - 'path-standards': 'quality', - 'scripts': 'scripts', - 'script-opportunities': 'script-opportunities', - 'enhancement-opportunities': 'creative', -} - -SECTION_LABELS = { - 'structural': 'Structural', - 'structure-capabilities': 'Structure & Capabilities', - 'prompt-craft': 'Prompt Craft', - 'efficiency': 'Efficiency', - 'cohesion': 'Cohesion', - 'quality': 'Path & Script Standards', - 'scripts': 'Scripts', - 'script-opportunities': 'Script Opportunities', - 'creative': 'Creative & Enhancements', -} +def load_report_data(report_dir: Path) -> dict: + """Load report-data.json from the report directory.""" + data_file = report_dir / 'report-data.json' + if not data_file.exists(): + print(f'Error: {data_file} not found', file=sys.stderr) + sys.exit(2) + return json.loads(data_file.read_text(encoding='utf-8')) -def _coalesce(*values) -> str: - """Return the first truthy string value, or empty string.""" - for v in values: - if v and isinstance(v, str) and v.strip() and v.strip() not in ('N/A', 'n/a', 'None'): - return v.strip() - return '' - - 
-def _norm_severity(sev: str) -> str: - """Normalize severity to lowercase, handle variants.""" - if not sev: - return 'low' - s = sev.strip().lower() - # Map common variants - return { - 'high-opportunity': 'high-opportunity', - 'medium-opportunity': 'medium-opportunity', - 'low-opportunity': 'low-opportunity', - }.get(s, s) - - -def normalize_finding(f: dict, scanner: str, idx: int) -> dict: - """ - Normalize a single finding/issue dict into the unified item model. - - Handles all known field name variants across scanners: - Title: issue | title | description (fallback) - Desc: description | rationale | observation | insight | scenario | - current_behavior | current_pattern | context | nuance - Action: fix | recommendation | suggestion | suggested_approach | - efficient_alternative | script_alternative - File: file | location | current_location - Line: line | lines - Cat: category | dimension - Impact: user_impact | impact | estimated_savings | estimated_token_savings - """ - sev = _norm_severity(f.get('severity', 'low')) - section = SCANNER_SECTIONS.get(scanner, 'other') - - # Determine item type from severity - if sev in ('strength', 'note') or f.get('category') == 'strength': - item_type = 'strength' - action_type = 'none' - selectable = False - elif sev.endswith('-opportunity'): - item_type = 'enhancement' - action_type = 'enhance' - selectable = True - elif f.get('category') == 'suggestion' or sev == 'suggestion': - item_type = 'suggestion' - action_type = 'refactor' - selectable = True - else: - item_type = 'issue' - action_type = 'fix' - selectable = True - - # --- Title: prefer 'title', fall back to old field names --- - title = _coalesce( - f.get('title'), - f.get('issue'), - _truncate(f.get('scenario', ''), 150), - _truncate(f.get('current_behavior', ''), 150), - _truncate(f.get('description', ''), 150), - f.get('observation', ''), - ) - if not title: - title = f.get('id', 'Finding') - - # --- Detail/description: prefer 'detail', fall back to old field 
names --- - description = _coalesce(f.get('detail')) - if not description: - # Backward compat: coalesce old field names - desc_candidates = [] - for key in ('description', 'rationale', 'observation', 'insight', 'scenario', - 'current_behavior', 'current_pattern', 'context', 'nuance', - 'assessment'): - v = f.get(key) - if v and isinstance(v, str) and v.strip() and v != title: - desc_candidates.append(v.strip()) - description = ' '.join(desc_candidates) if desc_candidates else '' - - # --- Action: prefer 'action', fall back to old field names --- - action = _coalesce( - f.get('action'), - f.get('fix'), - f.get('recommendation'), - f.get('suggestion'), - f.get('suggested_approach'), - f.get('efficient_alternative'), - f.get('script_alternative'), - ) - - # --- File reference --- - file_ref = _coalesce( - f.get('file'), - f.get('location'), - f.get('current_location'), - ) - - # --- Line reference --- - line = f.get('line') - if line is None: - lines_str = f.get('lines') - if lines_str: - line = str(lines_str) - - # --- Category --- - category = _coalesce( - f.get('category'), - f.get('dimension'), - ) - - # --- Impact (backward compat only - new schema folds into detail) --- - impact = _coalesce( - f.get('user_impact'), - f.get('impact'), - f.get('estimated_savings'), - str(f.get('estimated_token_savings', '')) if f.get('estimated_token_savings') else '', - ) - - # --- Extra fields for specific scanners --- - extra = {} - if scanner == 'script-opportunities': - action_type = 'create-script' - for k in ('determinism_confidence', 'implementation_complexity', - 'language', 'could_be_prepass', 'reusable_across_skills'): - if k in f: - extra[k] = f[k] - - # Use scanner-provided id if available - item_id = f.get('id', f'{scanner}-{idx:03d}') - - return { - 'id': item_id, - 'scanner': scanner, - 'section': section, - 'type': item_type, - 'severity': sev, - 'rank': SEVERITY_RANK.get(sev, 3), - 'category': category, - 'file': file_ref, - 'line': line, - 'title': title, - 
'description': description, - 'action': action, - 'impact': impact, - 'extra': extra, - 'selectable': selectable, - 'action_type': action_type, - } - - -def _truncate(text: str, max_len: int) -> str: - """Truncate text to max_len, breaking at sentence boundary if possible.""" - if not text: - return '' - text = text.strip() - if len(text) <= max_len: - return text - # Try to break at sentence boundary - for end in ('. ', '.\n', ' — ', '; '): - pos = text.find(end) - if 0 < pos < max_len: - return text[:pos + 1].strip() - return text[:max_len].strip() + '...' - - -def normalize_scanner(data: dict) -> tuple[list[dict], dict]: - """ - Normalize a full scanner JSON into (items, meta). - Returns list of normalized items + dict of meta/assessment data. - Handles all known scanner output variants. - """ - scanner = data.get('scanner', 'unknown') - items = [] - meta = {} - - # New schema: findings[]. Backward compat: issues[] or findings[] - findings = data.get('findings') or data.get('issues') or [] - for idx, f in enumerate(findings): - items.append(normalize_finding(f, scanner, idx)) - - # Backward compat: opportunities[] (execution-efficiency had separate array) - for idx, opp in enumerate(data.get('opportunities', []), start=len(findings)): - opp_item = normalize_finding(opp, scanner, idx) - opp_item['type'] = 'enhancement' - opp_item['action_type'] = 'enhance' - opp_item['selectable'] = True - items.append(opp_item) - - # Backward compat: strengths[] (old cohesion scanners — plain strings) - for idx, s in enumerate(data.get('strengths', [])): - text = s if isinstance(s, str) else (s.get('title', '') if isinstance(s, dict) else str(s)) - desc = '' if isinstance(s, str) else (s.get('description', s.get('detail', '')) if isinstance(s, dict) else '') - items.append({ - 'id': f'{scanner}-str-{idx:03d}', - 'scanner': scanner, - 'section': SCANNER_SECTIONS.get(scanner, 'cohesion'), - 'type': 'strength', - 'severity': 'strength', - 'rank': 5, - 'category': 'strength', - 
'file': '', - 'line': None, - 'title': text, - 'description': desc, - 'action': '', - 'impact': '', - 'extra': {}, - 'selectable': False, - 'action_type': 'none', - }) - - # Backward compat: creative_suggestions[] (old cohesion scanners) - for idx, cs in enumerate(data.get('creative_suggestions', [])): - if isinstance(cs, str): - cs_title, cs_desc = cs, '' - else: - cs_title = _coalesce(cs.get('title'), cs.get('idea'), '') - cs_desc = _coalesce(cs.get('description'), cs.get('detail'), cs.get('rationale'), '') - items.append({ - 'id': cs.get('id', f'{scanner}-cs-{idx:03d}') if isinstance(cs, dict) else f'{scanner}-cs-{idx:03d}', - 'scanner': scanner, - 'section': SCANNER_SECTIONS.get(scanner, 'cohesion'), - 'type': 'suggestion', - 'severity': 'suggestion', - 'rank': 4, - 'category': cs.get('type', 'suggestion') if isinstance(cs, dict) else 'suggestion', - 'file': '', - 'line': None, - 'title': cs_title, - 'description': cs_desc, - 'action': cs_title, - 'impact': cs.get('estimated_impact', '') if isinstance(cs, dict) else '', - 'extra': {}, - 'selectable': True, - 'action_type': 'refactor', - }) - - # New schema: assessments{} contains all structured analysis - # Backward compat: also collect from top-level keys - if 'assessments' in data: - meta.update(data['assessments']) - - # Backward compat: collect meta from top-level keys - skip_keys = {'scanner', 'script', 'version', 'skill_path', 'agent_path', - 'timestamp', 'scan_date', 'status', 'issues', 'findings', - 'strengths', 'creative_suggestions', 'opportunities', 'assessments'} - for key, val in data.items(): - if key not in skip_keys and key not in meta: - meta[key] = val - - return items, meta - - -def build_journeys(data: dict) -> list[dict]: - """ - Extract user journey data from enhancement-opportunities scanner. 
- Handles two formats: - - Array of objects: [{archetype, journey_summary, friction_points, bright_spots}] - - Object keyed by persona: {first_timer: {entry_friction, mid_flow_resilience, exit_satisfaction}} - """ - journeys_raw = data.get('user_journeys') - if not journeys_raw: - return [] - - # Format 1: already a list — normalize field names - if isinstance(journeys_raw, list): - normalized = [] - for j in journeys_raw: - if isinstance(j, dict): - normalized.append({ - 'archetype': j.get('archetype', 'unknown'), - 'journey_summary': j.get('summary', j.get('journey_summary', '')), - 'friction_points': j.get('friction_points', []), - 'bright_spots': j.get('bright_spots', []), - }) - else: - normalized.append(j) - return normalized - - # Format 2: object keyed by persona name - if isinstance(journeys_raw, dict): - result = [] - for persona, details in journeys_raw.items(): - if isinstance(details, dict): - # Convert the dict-based format to the expected format - journey = { - 'archetype': persona.replace('_', ' ').title(), - 'journey_summary': '', - 'friction_points': [], - 'bright_spots': [], - } - # Map known sub-keys to friction/bright spots - for key, val in details.items(): - if isinstance(val, str): - # Heuristic: negative-sounding keys → friction, positive → bright - if any(neg in key.lower() for neg in ('friction', 'issue', 'problem', 'gap', 'pain')): - journey['friction_points'].append(val) - elif any(pos in key.lower() for pos in ('bright', 'strength', 'satisfaction', 'delight')): - journey['bright_spots'].append(val) - else: - # Neutral keys — include as summary parts - if journey['journey_summary']: - journey['journey_summary'] += f' | {key}: {val}' - else: - journey['journey_summary'] = f'{key}: {val}' - elif isinstance(val, list): - for item in val: - if isinstance(item, str): - journey['friction_points'].append(item) - # Build summary from all fields if not yet set - if not journey['journey_summary']: - parts = [] - for k, v in details.items(): - if 
isinstance(v, str): - parts.append(f'**{k.replace("_", " ").title()}:** {v}') - journey['journey_summary'] = ' | '.join(parts) if parts else str(details) - result.append(journey) - elif isinstance(details, str): - result.append({ - 'archetype': persona.replace('_', ' ').title(), - 'journey_summary': details, - 'friction_points': [], - 'bright_spots': [], - }) - return result - - return [] - - -# ============================================================================= -# Report Data Assembly -# ============================================================================= - -def load_report_data(report_dir: Path, skill_path: str | None) -> dict: - """Load all temp/prepass JSONs and assemble normalized report data.""" - all_items = [] - all_meta = {} - journeys = [] - detected_skill_path = skill_path - - # Read all JSON files - json_files = sorted(report_dir.glob('*.json')) - for jf in json_files: - try: - data = json.loads(jf.read_text(encoding='utf-8')) - except (json.JSONDecodeError, OSError): - continue - - if not isinstance(data, dict): - continue - - scanner = data.get('scanner', jf.stem.replace('-temp', '').replace('-prepass', '')) - - # Detect skill path from scanner data - if not detected_skill_path: - detected_skill_path = data.get('skill_path') or data.get('agent_path') - - # Only normalize temp files (not prepass) - if '-temp' in jf.name or jf.name in ('path-standards-temp.json', 'scripts-temp.json'): - items, meta = normalize_scanner(data) - all_items.extend(items) - all_meta[scanner] = meta - - if scanner == 'enhancement-opportunities': - journeys = build_journeys(data) - elif '-prepass' in jf.name: - all_meta[f'prepass-{scanner}'] = data - - # Sort items: severity rank first, then section - all_items.sort(key=lambda x: (x['rank'], x['section'])) - - # Build severity counts - counts = {'critical': 0, 'high': 0, 'medium': 0, 'low': 0} - for item in all_items: - if item['type'] == 'issue' and item['severity'] in counts: - counts[item['severity']] += 1 
- - enhancement_count = sum(1 for i in all_items if i['type'] == 'enhancement') - strength_count = sum(1 for i in all_items if i['type'] == 'strength') - total_issues = sum(counts.values()) - - # Quality grade - if counts['critical'] > 0: - grade = 'Poor' - elif counts['high'] > 2: - grade = 'Fair' - elif counts['high'] > 0 or counts['medium'] > 5: - grade = 'Good' - else: - grade = 'Excellent' - - # Extract assessments for display - assessments = {} - for scanner_key, meta in all_meta.items(): - for akey in ('cohesion_analysis', 'autonomous_assessment', 'skill_understanding', - 'agent_identity', 'skill_identity', 'prompt_health', - 'skillmd_assessment', 'top_insights'): - if akey in meta: - assessments[akey] = meta[akey] - if 'summary' in meta: - s = meta['summary'] - if 'craft_assessment' in s: - assessments['craft_assessment'] = s['craft_assessment'] - if 'overall_cohesion' in s: - assessments['overall_cohesion'] = s['overall_cohesion'] - - # Skill name from path - sp = detected_skill_path or str(report_dir) - skill_name = Path(sp).name - - return { - 'meta': { - 'skill_name': skill_name, - 'skill_path': detected_skill_path or '', - 'timestamp': datetime.now(timezone.utc).isoformat(), - 'scanner_count': len([f for f in json_files if '-temp' in f.name]), - 'report_dir': str(report_dir), - }, - 'executive_summary': { - 'total_issues': total_issues, - 'counts': counts, - 'enhancement_count': enhancement_count, - 'strength_count': strength_count, - 'grade': grade, - 'craft_assessment': assessments.get('craft_assessment', ''), - 'overall_cohesion': assessments.get('overall_cohesion', ''), - }, - 'items': all_items, - 'journeys': journeys, - 'assessments': assessments, - 'section_labels': SECTION_LABELS, - } - - -# ============================================================================= -# HTML Generation -# ============================================================================= - HTML_TEMPLATE = r""" -Quality Report: SKILL_NAME_PLACEHOLDER +BMad Method · 
Quality Analysis: SKILL_NAME -

Quality Report:

+
BMad Method
+

Quality Analysis:

-
+
+
+
-
- -
- - +
+
+
+
+
+
@@ -929,30 +476,20 @@ def load_report_data(report_dir: Path, skill_path: str | None) -> dict: def generate_html(report_data: dict) -> str: """Inject report data into the HTML template.""" data_json = json.dumps(report_data, indent=None, ensure_ascii=False) - # Embed the JSON as a script tag before the main script data_tag = f'' - # Insert before the main
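The script-tag contents of `data_tag` are elided in this diff view. For orientation, here is a representative sketch of the injection step — the template markup, element id, and placeholder names are assumptions for illustration, not the actual template:

```python
import json

# Hypothetical stand-in for the real HTML_TEMPLATE; only the placeholders matter here.
HTML_TEMPLATE = """<html><head><title>Quality Analysis: SKILL_NAME</title></head>
<body><!-- REPORT_DATA --><script>/* renderer reads the JSON data block */</script></body></html>"""

def generate_html(report_data: dict) -> str:
    """Inject report data into the HTML template as an inert JSON data block."""
    data_json = json.dumps(report_data, ensure_ascii=False)
    # type="application/json" keeps the browser from executing the payload;
    # escaping "</" prevents an embedded "</script>" string from closing the tag early.
    safe = data_json.replace("</", "<\\/")
    data_tag = f'<script id="report-data" type="application/json">{safe}</script>'
    html = HTML_TEMPLATE.replace("SKILL_NAME", report_data["meta"]["skill_name"])
    return html.replace("<!-- REPORT_DATA -->", data_tag)
```

The in-page renderer would then read the block back with `JSON.parse(document.getElementById('report-data').textContent)`, keeping the report fully self-contained.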