refactor: outcome-driven overhaul of builder and quality analysis pipeline #43
Builder Core (both workflow-builder and agent-builder):
- Establish outcome-driven design as the core philosophy in SKILL.md Overview,
framing the builder's identity around helping users articulate outcomes rather
than prescribe procedures
- Rewrite build-process.md Phase 1: existing skill input treated as reference
material (not spec to replicate), discovery loop enforced with outcome-focused
questions even when given existing input, 3-way routing for existing skills
(Analyze/Edit/Rebuild)
- Add pruning check in Phase 4: "would the LLM do this correctly without being
told? If yes, cut it"
- Remove redundant Phase 5 loading of script-opportunities-reference.md (was
loaded twice — Phase 3 and Phase 5)
Reference Optimization (workflow-builder):
- script-opportunities-reference.md: 3,727 → 956 tokens (74% reduction) by
compressing the 12-entry catalog to a compact table while preserving the
determinism test principles
- complex-workflow-patterns.md: 3,760 → 1,110 tokens (70% reduction) by
removing duplicated {project-root} section, cutting worked skeleton example,
trimming compaction examples to one brief illustration
- quality-dimensions.md: add outcome-driven design as dimension 1 with pruning
test and effectiveness > efficiency principle
- Net Phase 5 build load: 11,900 → 6,660 tokens (44% reduction)
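The reduction percentages quoted above follow directly from the before and after token counts. A minimal sketch that recomputes them; the helper name `reduction_pct` is hypothetical and is not part of the builder:

```python
def reduction_pct(before: int, after: int) -> float:
    """Return the percentage reduction going from `before` to `after` tokens."""
    return (before - after) / before * 100


# Token counts copied from the commit message above.
token_counts = {
    "script-opportunities-reference.md": (3727, 956),
    "complex-workflow-patterns.md": (3760, 1110),
    "Net Phase 5 build load": (11900, 6660),
}

for name, (before, after) in token_counts.items():
    pct = reduction_pct(before, after)
    print(f"{name}: {before} -> {after} tokens ({pct:.0f}% reduction)")
```

Rounded to whole percentages, the three rows come out at 74, 70, and 44, matching the figures in the list.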
Scanner Improvements (both builders):
- Add pruning concept to prompt-craft scanner with anti-pattern table (scoring
algorithms, calibration tables, adapter files, per-platform instructions)
and elevate over-specification severity from MEDIUM to HIGH
- Expand workflow-integrity scanner's single "no over-specification" line into
a full structural check with specific patterns and severity guidance
- Add per-capability assessment guidance to agent builder's prompt-craft and
cohesion scanners
Quality Analysis Pipeline Redesign (both builders):
- Rename "Quality Optimizer" → "Quality Analysis" throughout (the tool analyzes,
the user optimizes)
- LLM scanners: remove rigid JSON output format sections and universal schema
references (~2,500 tokens saved per scanner, ~15k total across 6 scanners).
Scanners now write free-form analysis documents — natural output focused on
insight, not form-filling
- Report creator: complete rewrite from template-filling to theme synthesis.
Reads all scanner output (mixed formats), identifies root causes that group
findings across scanners, produces both quality-report.md (narrative) and
report-data.json (structured data for HTML). JSON schema is explicit and
non-negotiable with self-check checklist
- HTML generator: complete rewrite to read report-data.json instead of raw
scanner JSONs. Renders opportunities/themes with "Fix This Theme" prompt
generation, BMad Method branding, expandable detailed analysis. Removes
~300 lines of backward-compat field name normalization. Adds normalize()
safety net for common LLM field name drift
- Agent builder HTML: adds agent portrait (icon, name, title, synthesized
personality description) and capability dashboard with expandable
per-capability findings
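The `normalize()` safety net mentioned above is described only by its purpose: tolerating common LLM field name drift in report-data.json. The sketch below shows one way such a step could work, assuming the generator is written in Python. The alias table and every field name in it (`opportunities`, `grade`, and their aliases) are hypothetical; the real schema is not shown in this PR text.

```python
# Hypothetical alias table: canonical key -> drifted names an LLM might emit.
# None of these field names are confirmed by the PR; they are illustrative only.
FIELD_ALIASES = {
    "opportunities": ("themes", "opportunity_list"),
    "grade": ("overall_grade", "score"),
}


def normalize(report: dict) -> dict:
    """Return a copy of `report` with known alias keys renamed to canonical keys."""
    normalized = dict(report)  # shallow copy, so the caller's dict is not mutated
    for canonical, aliases in FIELD_ALIASES.items():
        if canonical in normalized:
            continue  # canonical key already present; leave it untouched
        for alias in aliases:
            if alias in normalized:
                normalized[canonical] = normalized.pop(alias)
                break  # first matching alias wins
    return normalized


drifted = {"themes": ["over-specification"], "overall_grade": "B"}
print(normalize(drifted))
```

Keeping the alias table small and explicit fits the stated intent: the JSON schema stays strict, and the renderer only absorbs a short list of known deviations.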
Deleted Files:
- quality-optimizer.md (replaced by quality-analysis.md in both builders)
- assets/quality-report-template.md (report creator writes narrative, not
fill-in-the-blanks)
- references/universal-scan-schema.md (scanners write free-form analysis,
only report-data.json needs a schema)
This pull request is too large for Augment to review. The PR exceeds the maximum size limit of 100000 tokens (approximately 400000 characters) for automated code review. Please consider breaking this PR into smaller, more focused changes.
Caution: Review failed. The pull request is closed.
Configuration used: Organization UI. Review profile: CHILL. Plan: Pro. Files selected for processing: 33.
Walkthrough
This PR redesigns the BMad quality assessment workflow from "Quality Optimizer" (iterative refinement) to "Quality Analysis" (structural assessment), converting quality scanners from structured JSON output to natural-language markdown reports, reframing design philosophy toward outcome-driven capability definitions, and updating the report generation pipeline to consume a single
Changes
Sequence Diagram(s)
sequenceDiagram
participant User
participant QA as Quality Analysis<br/>(quality-analysis.md)
participant Lint as Lint/<br/>Pre-Pass Scripts
participant Scanners as LLM<br/>Scanners (6)
participant Creator as Report<br/>Creator
participant HTML as HTML<br/>Generator
User->>QA: Analyze agent/skill
QA->>Lint: Run deterministic<br/>scripts in parallel
Lint-->>QA: JSON metrics<br/>(pre-pass.json)
QA->>Scanners: Spawn 6 scanners<br/>with path ± JSON
Scanners->>Scanners: Read agent,<br/>write *-analysis.md
Scanners-->>QA: Filename
QA->>Creator: Invoke with all<br/>analysis outputs
Creator->>Creator: Read SKILL.md,<br/>synthesize themes,<br/>group findings
Creator-->>QA: report-data.json<br/>+ quality-report.md
QA->>HTML: Generate with<br/>report-data.json
HTML-->>QA: report.html
QA-->>User: Grade + narrative<br/>+ opportunities<br/>+ HTML link
sequenceDiagram
participant Builder as Build Process
participant User
participant AgentQA as Analyze/Edit/<br/>Rebuild Decision
Builder->>Builder: Phase 1: Handle<br/>existing agent
Builder->>AgentQA: Existing agent<br/>provided?
AgentQA->>User: Analyze / Edit /<br/>Rebuild?
User-->>AgentQA: Choice
alt Analyze
AgentQA-->>Builder: Route to<br/>quality-analysis.md
else Edit or Rebuild
AgentQA-->>Builder: Route to<br/>build-process.md<br/>with intent
Builder->>Builder: Phase 2–6:<br/>Outcome-driven<br/>design flow
end
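The 3-way routing in the second diagram reduces to a small decision function. In the builder itself this routing is expressed as prompt instructions in build-process.md, not as code, so the sketch below is illustrative only. `Choice` and `route_existing_skill` are hypothetical names; the two target file names are the ones shown in the diagram.

```python
from enum import Enum
from typing import Optional, Tuple


class Choice(Enum):
    """The three options offered when an existing skill or agent is provided."""
    ANALYZE = "analyze"
    EDIT = "edit"
    REBUILD = "rebuild"


def route_existing_skill(choice: Choice) -> Tuple[str, Optional[str]]:
    """Return (target document, intent) for the user's choice.

    Analyze goes to quality-analysis.md with no build intent.
    Edit and Rebuild both continue in build-process.md, carrying the intent.
    """
    if choice is Choice.ANALYZE:
        return ("quality-analysis.md", None)
    return ("build-process.md", choice.value)


for choice in Choice:
    print(choice.name, "->", route_existing_skill(choice))
```

Edit and Rebuild share a target because the diagram sends both into Phases 2–6; only the intent passed along differs.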
Estimated code review effort: 4 (Complex), ~75 minutes
Summary
Comprehensive overhaul of both the workflow builder and agent builder, driven by a real-world failure: the builder produced a massively over-specified party-mode skill (scoring algorithms, calibration tables, 3 adapter files, prompt templates) and the quality optimizer didn't catch any of it.
Root causes identified and fixed:
Changes across both builders:
Builder Core:
Quality Analysis Pipeline (renamed from "Quality Optimizer"):
Reference Optimization (workflow-builder):
Stats
Test plan
🤖 Generated with Claude Code