
feat: Add CLI tool development support to MAP framework#7

Merged
azalio merged 1 commit into main from feature/map-cli-improvements on Oct 24, 2025
Conversation

Owner

azalio commented Oct 24, 2025

Overview

Enhance the MAP framework with comprehensive CLI tool development support, based on lessons learned from the mapify CLI subcommands implementation (PR #6).

Problem

During development of CLI subcommands, we encountered systematic issues that MAP agents didn't catch:

  1. Stdout Pollution: SemanticSearchEngine printed diagnostic messages to stdout, breaking JSON output and `| jq` pipes
  2. Version Incompatibility: Used CliRunner(mix_stderr=False), which is unavailable in the CI's older Click version
  3. CliRunner ≠ Real CLI: Tests passed with the in-process CliRunner, but the actually installed command behaved differently
  4. No Manual Testing: Relied solely on pytest and missed real-world usage issues

Result: 3 iterations to fix issues that could have been caught with proper CLI validation.

Solution

Add CLI-specific validation, risk prediction, and pattern recognition to MAP agents:

Enhanced Agents (v2.2.0 → v2.3.0)

Monitor Agent

New Section: CLI Tool Validation (### 6)

  • ✅ Manual execution test checklist
  • ✅ Output stream validation (stdout = output, stderr = diagnostics)
  • ✅ Library version compatibility checks
  • ✅ Integration testing requirements
  • ✅ Common CLI issues with solutions

Example Check:

- [ ] Command runs outside test environment (via `python -m` or installed tool)?
- [ ] Stdout contains ONLY intended output (JSON, formatted text)?
- [ ] Diagnostic messages use stderr (`print(..., file=sys.stderr)`)?
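The first two checks can be backed by a small subprocess-based helper. This is a sketch only: it runs a stand-in command built with `python -c`, since the real invocation (e.g. `python -m mapify_cli ...` or the installed console script) depends on the project's entry points.

```python
import json
import subprocess
import sys

def run_and_parse_json(argv):
    """Run a CLI as a real subprocess (unlike CliRunner) and fail loudly
    if stdout is not pure JSON — a single diagnostic line leaking onto
    stdout breaks the parse, which is exactly what we want to catch."""
    result = subprocess.run(argv, capture_output=True, text=True)
    assert result.returncode == 0, result.stderr
    return json.loads(result.stdout)

# Stand-in CLI: diagnostics go to stderr, the JSON payload to stdout.
demo_cmd = [
    sys.executable, "-c",
    "import sys, json;"
    "print('Loading model...', file=sys.stderr);"
    "print(json.dumps({'ok': True}))",
]
```

In a real test suite, `demo_cmd` would be replaced with the actual `python -m` invocation or installed tool name.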

Predictor Agent

New Section: CLI Tool Specific Risks

HIGH Risks:

  • Using new library parameter not in minimum supported version
  • Diagnostic messages printing to stdout instead of stderr
  • CLI output format change without version bump
  • Tests pass with CliRunner but real CLI fails

Example:

IF using CliRunner(mix_stderr=False):
  → Check Click version >= 8.0
  → CI may use older version
  → Risk: Tests pass locally, CI fails
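One way to defuse this class of risk is to feature-detect the keyword instead of comparing version strings. A minimal sketch using only the standard library; the `CliRunner` application in the comment is the intended but hypothetical use:

```python
import inspect

def supports_kwarg(callable_obj, name):
    """Return True if `callable_obj` accepts a keyword argument `name`.
    Feature detection like this survives both older library versions
    that lack the parameter and newer ones that removed it."""
    try:
        return name in inspect.signature(callable_obj).parameters
    except (TypeError, ValueError):  # some builtins have no inspectable signature
        return False

# Hypothetical use with Click's test runner:
#   kwargs = {"mix_stderr": False} if supports_kwarg(CliRunner, "mix_stderr") else {}
#   runner = CliRunner(**kwargs)
```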

Reflector Agent

New Section: CLI Tool Pattern Recognition

Pattern Types:

  • Output Pollution
  • Version Incompatibility
  • CliRunner ≠ Real CLI
  • Stream Handling

Reflection Template:

  1. What did the test miss?
  2. What manual CLI test would have caught it?
  3. What library version assumption was wrong?
  4. How do we verify stdout is clean?

Playbook Schema

New Section: CLI_TOOL_PATTERNS (10th section)

  • Captures CLI-specific lessons
  • Enables pattern reuse across implementations
  • Institutional memory for CLI development

Documentation

New File: docs/CLI_TESTING_GUIDE.md (400+ lines)

Comprehensive guide covering:

  • Output stream management principles
  • Version compatibility patterns
  • Integration testing workflows (CliRunner + subprocess)
  • Common pitfalls with real-world examples
  • Best practices checklist
  • Manual testing workflow

Example Pattern:

# ❌ BAD: Pollutes stdout
print("Loading model...")

# ✅ GOOD: Uses stderr (note: requires `import sys`)
print("Loading model...", file=sys.stderr)
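An alternative to hand-routing every print is the standard `logging` module, whose default StreamHandler already writes to stderr. A sketch (the logger name is illustrative):

```python
import logging

logging.basicConfig(level=logging.INFO)  # StreamHandler defaults to sys.stderr
log = logging.getLogger("mapify")        # logger name is illustrative

log.info("Loading model...")             # diagnostic -> stderr
print('{"results": []}')                 # payload -> stdout stays machine-readable
```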

Benefits

Proactive Issue Detection

  • Monitor: Validates output cleanliness, version compatibility, manual testing
  • Predictor: Flags CLI-specific risks before implementation
  • Reflector: Captures patterns for institutional memory

Improved Quality

  • Catches stdout pollution before merge
  • Prevents version incompatibility in CI
  • Requires integration testing alongside unit tests
  • Documents CLI best practices

Knowledge Capture

  • CLI_TOOL_PATTERNS section in playbook
  • Systematic lesson extraction via Reflector
  • Comprehensive testing guide for reference

Real-World Impact

Before (PR #6 without these improvements):

  1. ❌ Tests passed locally, CI failed (version incompatibility)
  2. ❌ JSON output polluted with diagnostics
  3. ❌ Required 3 commits to fix issues
  4. ❌ Manual testing done only after CI failure

After (with these improvements):

  1. ✅ Monitor catches stdout pollution in review
  2. ✅ Predictor flags version compatibility risk
  3. ✅ Monitor requires manual CLI execution before merge
  4. ✅ Reflector captures patterns for future CLIs

Testing

✅ All agent files validated (template variable check passed)
✅ Playbook schema updated correctly (sections_count: 9 → 10)
✅ CHANGELOG documented comprehensively
✅ CLI Testing Guide provides actionable patterns

Files Changed

  • .claude/agents/monitor.md: +68 lines (CLI validation)
  • .claude/agents/predictor.md: +69 lines (CLI risks)
  • .claude/agents/reflector.md: +46 lines (CLI patterns)
  • src/mapify_cli/playbook_manager.py: +5 lines (CLI section)
  • docs/CLI_TESTING_GUIDE.md: +544 lines (new file)
  • CHANGELOG.md: +54 lines (documentation)

Total: +786 lines of CLI-specific improvements

Related

Checklist

  • Agent versions updated (v2.2.0 → v2.3.0)
  • CHANGELOG documented
  • CLI Testing Guide created
  • Playbook schema updated
  • Agent validation passed
  • All changes committed

Next Steps

Future CLI implementations will benefit from:

  1. Monitor's CLI validation checklist
  2. Predictor's risk warnings
  3. Reflector's pattern capture
  4. CLI Testing Guide reference

Summary: Systematically addresses CLI development challenges discovered in PR #6, preventing similar issues in future MAP-powered CLI implementations.

Copilot AI review requested due to automatic review settings October 24, 2025 18:55
Contributor

Copilot AI left a comment

Pull Request Overview

This PR enhances the MAP framework with comprehensive CLI tool development support based on lessons learned from the mapify CLI subcommands implementation. The changes address systematic issues with stdout pollution, version incompatibility, and testing gaps that weren't caught during initial development.

Key Changes:

  • Enhanced three MAP agents (Monitor, Predictor, Reflector) to v2.3.0 with CLI-specific validation, risk prediction, and pattern recognition capabilities
  • Added new CLI_TOOL_PATTERNS section to playbook schema for capturing CLI development lessons
  • Created comprehensive CLI Testing Guide (400+ lines) documenting best practices for output stream management, version compatibility, and integration testing

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/mapify_cli/playbook_manager.py Updated playbook schema to support new CLI_TOOL_PATTERNS section (sections_count: 9 → 10)
docs/CLI_TESTING_GUIDE.md New comprehensive guide covering CLI testing best practices, common pitfalls, and workflows
CHANGELOG.md Documented all CLI development improvements with context from PR #6
.claude/agents/reflector.md Added CLI pattern recognition capabilities for extracting CLI-specific lessons
.claude/agents/predictor.md Added CLI-specific risk prediction (stdout pollution, version incompatibility, etc.)
.claude/agents/monitor.md Added CLI validation checklist for manual testing, output streams, and integration tests


Enhance MAP agents and documentation with CLI-specific validation,
risk prediction, and pattern recognition based on lessons learned from
mapify CLI subcommands implementation (PR #6).

## Changes

### Enhanced Agents (all v2.2.0 → v2.3.0)

**Monitor Agent**:
- Add CLI Tool Validation section (### 6) with comprehensive checklist
- Manual execution test requirements
- Output stream validation (stdout/stderr separation)
- Library version compatibility checks
- Integration testing patterns
- Common CLI issues with solutions

**Predictor Agent**:
- Add CLI Tool Specific Risks section
- HIGH risk: version incompatibility, stdout pollution, format changes
- MEDIUM risk: env vars, error stream changes, command renames
- Real-world example from mapify implementation
- CLI testing validation checklist

**Reflector Agent**:
- Add CLI_TOOL_PATTERNS pattern type
- CLI pattern recognition signals (pollution, version, streams)
- CLI Reflection Template for lesson extraction
- Pattern extraction for reusable CLI lessons

### Playbook Schema
- Add CLI_TOOL_PATTERNS section (10 sections total, was 9)
- Update sections_count in playbook_manager.py

### Documentation
- New: docs/CLI_TESTING_GUIDE.md (400+ lines)
  - Output stream management principles
  - Version compatibility patterns
  - Integration testing workflows
  - Common pitfalls with examples
  - Best practices checklist

### CHANGELOG
- Document all CLI improvements in [Unreleased]
- Include context from PR #6 lessons learned

## Rationale

These improvements address systematic issues discovered during CLI
development:

1. **Stdout Pollution**: Libraries printing diagnostics to stdout
   - Solution: Always use stderr for diagnostics
   - Detection: Monitor validates output streams

2. **Version Incompatibility**: Using features not in minimum version
   - Solution: Check compatibility or use fallback
   - Detection: Predictor flags version-specific risks

3. **CliRunner ≠ Real CLI**: Tests pass but command fails
   - Solution: Add integration tests with subprocess
   - Detection: Monitor requires manual CLI execution

4. **Pattern Capture**: Prevent repeating these mistakes
   - Solution: Reflector extracts CLI patterns to playbook
   - Benefit: Institutional memory for CLI development

## Benefits

- Proactive risk detection for CLI implementations
- Comprehensive validation checklist for reviewers
- Systematic lesson capture for future CLI projects
- Single source of truth for CLI testing practices
- Prevents common CLI development pitfalls

Closes #<will-be-created>
azalio force-pushed the feature/map-cli-improvements branch from 2545378 to 43f8693 on October 24, 2025 19:00
azalio merged commit b77f295 into main on Oct 24, 2025
5 checks passed
azalio pushed a commit that referenced this pull request Feb 8, 2026
…stillation

Five optimizations to the Architect phase:
1. Architecture Graph (Step 4): REQUIRED pseudocode graph of affected
   classes/modules before decomposition — decomposer gets a skeleton
2. AAG Contracts (Step 5 & 6): mandatory aag_contract per subtask in
   task_plan.md and aag_contracts map in workflow_state.json — turns
   the plan from a "todo list" into an executable protocol
3. Semantic Brackets (Step 6 & 7): <MAP_Plan_v1_0> wraps task plan,
   _semantic_tag in workflow_state.json — zero-ambiguity parsing
4. Contract Clarity (Step 2): dimension #7 in interview checklist —
   reject process-goals ("improve auth"), require outcome-goals
   ("returns 401 for expired tokens")
5. Context Distillation (Step 8): distillation checklist before STOP —
   ensures plan files are self-contained for fresh executor session,
   target ≤4000 tokens per subtask context

https://claude.ai/code/session_01AR3EbNKosxBD5PocKkMSMd
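For illustration only, the `aag_contracts` and `_semantic_tag` fields described above might take roughly this shape in workflow_state.json — the field contents below are guesses reconstructed from the commit message, not the actual schema:

```python
# Hypothetical shape of the new workflow_state.json fields.
# Field *names* come from the commit message; field *contents* are guesses.
workflow_state = {
    "_semantic_tag": "MAP_Plan_v1_0",
    "aag_contracts": {
        "subtask-1": {
            "inputs": ["src/auth/session.py"],           # illustrative
            "outcome": "returns 401 for expired tokens"  # outcome-goal, not process-goal
        },
    },
}
```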
azalio added a commit that referenced this pull request Feb 13, 2026
-7)

- #1-3: Remove Actor "proposals only" section — Actor applies code directly
  with Edit/Write tools, consistent with map-efficient.md prompts
- #4: Rename orchestrator phase 2.7 from APPLY_CHANGES to UPDATE_STATE
  (Actor applies code, 2.7 only updates state tracking)
- #5: Implement check_circuit_breaker command in orchestrator
  (was referenced in map-efficient.md but missing from argparse)
- #6: Replace non-existent map-efficient-step reference with /map-resume
- #7: Fix STEP_ORDER index bug — used [3:] (starts at CHOOSE_MODE)
  instead of index("2.0") (starts at XML_PACKET) for subtask loop reset
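Bug #7 above can be reproduced with a toy list; the step ids here are reconstructed from the commit message, not copied from the orchestrator:

```python
# Toy STEP_ORDER; "2.0" = XML_PACKET, "2.1" = CHOOSE_MODE (reconstructed).
STEP_ORDER = ["1.0", "1.5", "2.0", "2.1", "2.2"]

buggy = STEP_ORDER[3:]                        # hard-coded offset skips XML_PACKET
fixed = STEP_ORDER[STEP_ORDER.index("2.0"):]  # anchors on the step id itself

assert buggy == ["2.1", "2.2"]                # loop reset starts one step too late
assert fixed == ["2.0", "2.1", "2.2"]
```

Anchoring on the step id also keeps the slice correct if steps are later inserted before the loop start.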
