22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [1.2.1] - 2025-10-30

### Removed

**Agent Framework Cleanup:**
- **Removed test-generator agent** from MAP Framework (reduced from 9 to 8 core agents)
- Deleted `src/mapify_cli/templates/agents/test-generator.md` (1,175 lines)
- Removed test-generator from `mcp_config.json` agent_mcp_mappings
- Removed test-generator creation function from `src/mapify_cli/__init__.py`
- Updated all documentation references from 9 agents to 8 agents
- **Rationale**: Test generation responsibility shifted to Actor agent (which has codex-bridge access)
- **Impact**: Zero breaking changes for existing users; orphaned files are harmless

### Changed

**Documentation Updates:**
- Updated `docs/IMPROVEMENT-STATUS.md` to reflect 8-agent architecture
- Removed test-generator statistics from agent metrics
- Recalculated totals: 2,354 → 7,841 lines (+233% growth)
- Updated presentation files (English and Russian) to show correct agent count
- Updated `tests/test_mapify_cli.py` to expect 8 agents

## [1.2.0] - 2025-10-30

48 changes: 14 additions & 34 deletions docs/IMPROVEMENT-STATUS.md
@@ -5,11 +5,11 @@

---

## ✅ PROJECT COMPLETE: All 9 Agents Improved
## ✅ PROJECT COMPLETE: All 8 Agents Improved

**Status**: 9/9 agents improved with Claude Code patterns (100%)
**Status**: 8/8 agents improved with Claude Code patterns (100%)

**Total Impact**: 2,588 → 9,269 lines (+258% growth, +6,681 lines added)
**Total Impact**: 2,354 → 7,841 lines (+233% growth, +5,487 lines added)

---

@@ -135,22 +135,7 @@
- Weighted scoring for operation selection
- Final curator checklist

### ✅ 8. Test-Generator Agent (Test Automation)
**File**: `.claude/agents/test-generator.md`
**Lines**: 234 → 1,428 (+510%, +1,194 lines)

**Improvements**:
- 4 decision frameworks (test type selection, coverage strategy, mock strategy, naming)
- 3 complete examples (simple unit test, complex integration test, edge case suite)
- 5 rationale blocks (why AAA pattern, why 80% coverage, why edge cases)
- Coverage strategy (critical 100%, high 90%, medium 80%, low 60%)
- Mock/fixture strategy (when to mock vs real implementation)
- Test naming strategy (`test_function_scenario_outcome`)
- Good/bad test patterns (structure, mocking, edge cases, assertions)
- Quality gates (5 gates: coverage, independence, performance, assertion quality, error paths)
- Final test validation checklist

### ✅ 9. Documentation-Reviewer Agent (Doc Quality)
### ✅ 8. Documentation-Reviewer Agent (Doc Quality)
**File**: `.claude/agents/documentation-reviewer.md`
**Lines**: 344 → 1,250 (+263%, +906 lines)

@@ -193,31 +178,29 @@

### Quantitative Results

**Before** (9 agents):
**Before** (8 agents):
- actor: 213
- monitor: 212
- evaluator: 81
- predictor: 81
- task-decomposer: 91
- reflector: 980
- curator: 352
- test-generator: 234
- documentation-reviewer: 344
**Total**: 2,588 lines
**Total**: 2,354 lines

**After** (9 agents):
**After** (8 agents):
- actor: 611 (+187%)
- monitor: 980 (+362%)
- evaluator: 901 (+1012%)
- predictor: 865 (+968%)
- task-decomposer: 1,133 (+1145%)
- reflector: 980 (no change)
- curator: 1,121 (+218%)
- test-generator: 1,428 (+510%)
- documentation-reviewer: 1,250 (+263%)
**Total**: 9,269 lines
**Total**: 7,841 lines

**Growth**: +6,681 lines (+258%)
**Growth**: +5,487 lines (+233%)
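
The recalculated totals can be checked by hand. The sketch below is only an illustration (the `growth` helper is a hypothetical name, not part of the repository); it sums the per-agent line counts listed above for the 8-agent set and recomputes the growth figures.

```python
# Per-agent line counts copied from the "Before" and "After" lists above
# (8-agent set, test-generator excluded).
before = {
    "actor": 213, "monitor": 212, "evaluator": 81, "predictor": 81,
    "task-decomposer": 91, "reflector": 980, "curator": 352,
    "documentation-reviewer": 344,
}
after = {
    "actor": 611, "monitor": 980, "evaluator": 901, "predictor": 865,
    "task-decomposer": 1133, "reflector": 980, "curator": 1121,
    "documentation-reviewer": 1250,
}


def growth(before, after):
    """Return (total_before, total_after, lines_added, percent_growth)."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    added = total_after - total_before
    percent = round(100 * added / total_before)
    return total_before, total_after, added, percent


print(growth(before, after))  # (2354, 7841, 5487, 233)
```

Keeping the counts in one mapping per snapshot means that dropping an agent only requires deleting its key from both dictionaries.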

### Content Improvements

@@ -301,8 +284,7 @@
5. **.claude/agents/task-decomposer.md**: 91 → 1,133 (+1145%)
6. **.claude/agents/reflector.md**: 980 → 980 (no change, already excellent)
7. **.claude/agents/curator.md**: 352 → 1,121 (+218%)
8. **.claude/agents/test-generator.md**: 234 → 1,428 (+510%)
9. **.claude/agents/documentation-reviewer.md**: 344 → 1,250 (+263%)
8. **.claude/agents/documentation-reviewer.md**: 344 → 1,250 (+263%)

### Architecture Files Modified

@@ -404,7 +386,6 @@
| **task-decomposer** | 91 | 1,133 | +1145% | 3 | 3 | 5 |
| **reflector** | 980 | 980 | 0% | ✓ | ✓ | ✓ |
| **curator** | 352 | 1,121 | +218% | 4 | 3 | 5 |
| **test-generator** | 234 | 1,428 | +510% | 4 | 3 | 5 |
| **documentation-reviewer** | 344 | 1,250 | +263% | 4 | 3 | 5 |
| **TOTAL** | **2,588** | **9,269** | **+258%** | **28** | **23** | **39** |

@@ -427,14 +408,13 @@

### Quality Assurance Agents

8. **test-generator** - Creates comprehensive test suites (unit, integration, edge cases)
9. **documentation-reviewer** - Reviews docs for completeness, consistency, dependencies
8. **documentation-reviewer** - Reviews docs for completeness, consistency, dependencies

---

## SUCCESS CRITERIA MET

✅ **All 9 agents improved** (100% completion)
✅ **All 8 agents improved** (100% completion)
✅ **Architecture integrity maintained** (MAP + ACE preserved)
✅ **Claude Code patterns applied** (16/16 patterns)
✅ **Quality gates achieved** (all agents 2-3x larger with substance)
@@ -448,7 +428,7 @@

### Project Outcome: SUCCESS

The MAP Framework prompt improvement project is **complete and production-ready**. All 9 agents now demonstrate Claude Code-level prompt engineering quality with:
The MAP Framework prompt improvement project is **complete and production-ready**. All 8 agents now demonstrate Claude Code-level prompt engineering quality with:

**Structural Excellence**:
- XML semantic tagging throughout
@@ -457,7 +437,7 @@ The MAP Framework prompt improvement project is **complete and production-ready*
- 39 rationale blocks explaining WHY

**Content Quality**:
- 258% size increase (2,588 → 9,269 lines)
- 233% size increase (2,354 → 7,841 lines)
- Good/bad patterns for key concepts
- Critical emphasis at safety points
- Self-validation checklists
5 changes: 2 additions & 3 deletions docs/reddit-analysis-improvements-CORRECTED.md
@@ -169,7 +169,7 @@
- strategic-plan-architect for planning

**MAP Implementation:**
**9 specialized agents:**
**8 specialized agents:**
1. `task-decomposer` (80K lines!) - Breaks tasks into subtasks
2. `actor` - Generates implementation proposals
3. `monitor` - Validates correctness, security, standards
@@ -178,12 +178,11 @@
6. `reflector` - Extracts lessons from successes/failures
7. `curator` - Updates playbook incrementally
8. `documentation-reviewer` - Reviews technical docs
9. `test-generator` - Generates comprehensive test suites

**Comparison:**
| Feature | Reddit | MAP |
|---------|--------|-----|
| Agent count | ~10 specialized | ✅ **9 with MAP protocol** |
| Agent count | ~10 specialized | ✅ **8 with MAP protocol** |
| Planning agent | ✅ strategic-plan-architect | ✅ task-decomposer |
| Code review | ✅ architecture-reviewer | ✅ monitor + evaluator |
| Error resolution | ✅ error-resolver | ✅ monitor feedback loops |
2 changes: 1 addition & 1 deletion docs/research/prompt-improvement-analysis.md
@@ -133,7 +133,7 @@ try {
**Application to MAP**:
- Actor code examples must include error handling
- Monitor validates error handling exists
- Test-Generator creates error case tests

---

5 changes: 2 additions & 3 deletions mcp_config.json
@@ -66,8 +66,7 @@
"evaluator": ["sequential-thinking", "claude-reviewer", "cipher", "context7", "deepwiki"],
"reflector": ["cipher", "sequential-thinking", "context7", "deepwiki"],
"curator": ["cipher", "context7", "deepwiki"],
"documentation-reviewer": ["cipher", "context7", "deepwiki"],
"test-generator": ["cipher", "codex-bridge", "context7", "deepwiki"]
"documentation-reviewer": ["cipher", "context7", "deepwiki"]
},
"workflow_settings": {
"always_retrieve_knowledge": true,
@@ -76,4 +75,4 @@
"enable_sequential_thinking": true,
"knowledge_cache_ttl": 3600
}
}
}
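
A quick consistency check for this kind of removal is to load the config and confirm the removed agent no longer appears in the mapping. The following is a minimal sketch, assuming the mapping lives under the `agent_mcp_mappings` key named in the changelog; the `check_agent_mappings` helper and the trimmed `sample` config are hypothetical and include only entries visible in the hunk above.

```python
import json


def check_agent_mappings(config, removed="test-generator", expected_count=None):
    """Return a list of problems found in the agent-to-MCP-server mapping."""
    mappings = config.get("agent_mcp_mappings", {})
    problems = []
    if removed in mappings:
        problems.append(f"{removed} is still mapped")
    if expected_count is not None and len(mappings) != expected_count:
        problems.append(
            f"expected {expected_count} agents, found {len(mappings)}"
        )
    return problems


# Trimmed sample: only two of the agents shown in the hunk, for illustration.
sample = json.loads("""
{
  "agent_mcp_mappings": {
    "curator": ["cipher", "context7", "deepwiki"],
    "documentation-reviewer": ["cipher", "context7", "deepwiki"]
  }
}
""")

print(check_agent_mappings(sample))  # []
```

Against the real file one would pass `expected_count=8`, matching the agent count stated in the changelog.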
2 changes: 1 addition & 1 deletion presentation/en/01-introduction.md
@@ -45,7 +45,7 @@ MAP uses **6 MCP servers** to extend capabilities:
**Structure:**

- Stored at [.claude/playbook.json](https://github.com/azalio/map-framework/blob/main/.claude/playbook.json)
- **9 categories of patterns**: architecture, implementation, security, performance, errors, testing, code quality, tool usage, debugging
- **10 categories of patterns**: architecture, implementation, security, performance, errors, testing, code quality, tool usage, debugging, CLI tool patterns
- **top_k = 5**: returns only the 5 most relevant patterns to reduce cognitive load
- **Automatic learning**: Reflector extracts patterns from every task, Curator incrementally updates the playbook

17 changes: 4 additions & 13 deletions presentation/en/02-architecture.md
@@ -2,18 +2,7 @@

## Overview

MAP Framework is built around **8 specialized agents**, coordinated by the Orchestrator. Total agent template code size: **7,890 lines**.

**Agents by size:**

1. DocumentationReviewer — 1,282 lines (largest)
2. TaskDecomposer — 1,169 lines
3. Curator — 1,145 lines
4. Reflector — 1,004 lines
5. Monitor — 908 lines
6. Predictor — 898 lines
7. Evaluator — 843 lines
8. Actor — 641 lines
MAP Framework is built around **8 specialized agents**, coordinated by the Orchestrator.

The **Orchestrator** is NOT an agent template. Workflow coordination logic lives in the slash commands `.claude/commands/map-*.md` (map-feature, map-debug, map-refactor, map-review).

@@ -141,7 +130,7 @@ The **Orchestrator** is NOT an agent template. Workflow coordination logic lives

**Output:** operations (ADD/UPDATE/DEPRECATE), deduplication_check, sync_to_cipher

### 8. DocumentationReviewer (1,282 lines)
### 8. DocumentationReviewer

**Model:** sonnet
**Purpose:** Technical documentation expert; catches missing requirements and integration gaps
@@ -202,6 +191,8 @@ The **Orchestrator** is NOT an agent template. Workflow coordination logic lives
- Handlebars variables: {{project_name}}, {{language}}, {{framework}}, {{subtask_description}}, {{playbook_bullets}}, {{feedback}}
- Standard sections: IDENTITY, context, mcp_integration, rationale, critical/constraints, examples, output_format

<!-- TestGenerator removed from presentation scope per request -->

### Model Strategy

- **haiku** (cost-optimized): Predictor, Evaluator
37 changes: 19 additions & 18 deletions presentation/en/03-workflow.md
@@ -2,24 +2,27 @@

## Workflow Overview

MAP Framework uses a **strictly sequential 6-agent orchestration** for each subtask.
MAP Framework uses a **strictly sequential orchestration** that begins with TaskDecomposer and then runs an implementation loop for each subtask.

**Mandatory sequence:**

```mermaid
flowchart TD
Start([Subtask Start]) --> Actor[1. Actor<br/>Implement solution]
Start([Task Start]) --> Decompose[0. TaskDecomposer<br/>Create subtasks]
Decompose --> Plan[2.5 Recitation Plan<br/>Create current_plan.md]
Plan --> Actor[1. Actor<br/>Implement subtask]
Actor --> Monitor[2. Monitor<br/>Quality validation]

Monitor -->|Valid| Predictor[3. Predictor<br/>Impact analysis]
Monitor -->|Invalid<br/>max 3-5 iterations| Actor

Predictor --> Evaluator[4. Evaluator<br/>Quality assessment]

Evaluator -->|Approved| Reflector[5. Reflector<br/>Extract lessons<br/><b>MANDATORY</b>]
Evaluator -->|Approved| Accept[5. ACCEPT changes<br/>Apply to files]
Evaluator -->|Not Approved| Actor

Reflector --> Curator[6. Curator<br/>Update playbook<br/><b>MANDATORY</b>]
Accept --> Reflector[6. Reflector<br/>Extract lessons<br/><b>MANDATORY</b>]
Reflector --> Curator[7. Curator<br/>Update playbook<br/><b>MANDATORY</b>]

Curator --> End([Subtask Complete])
```
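
The diagram above can be read as a retry loop around Actor. The following is a minimal Python sketch of that control flow, not the framework's actual orchestrator (which, per the presentation, lives in slash commands). The `run_subtask` function, the `agents` dictionary of callables, the `valid`/`approved`/`feedback` keys, and the single iteration budget shared by both rejection paths are all assumptions made for illustration.

```python
def run_subtask(subtask, agents, max_iterations=5):
    """Run one subtask through the loop and return the order of agent calls.

    `agents` maps hypothetical role names to plain callables. Both the
    Monitor "Invalid" edge and the Evaluator "Not Approved" edge send the
    subtask back to Actor with feedback.
    """
    trace = []
    feedback = None
    for _ in range(max_iterations):
        proposal = agents["actor"](subtask, feedback)
        trace.append("actor")

        verdict = agents["monitor"](proposal)
        trace.append("monitor")
        if not verdict["valid"]:
            feedback = verdict.get("feedback")
            continue  # back to Actor

        impact = agents["predictor"](proposal)
        trace.append("predictor")

        review = agents["evaluator"](proposal, impact)
        trace.append("evaluator")
        if not review["approved"]:
            feedback = review.get("feedback")
            continue  # back to Actor

        agents["accept"](proposal)  # apply changes to files
        trace.append("accept")
        lessons = agents["reflector"](proposal, review)  # mandatory
        trace.append("reflector")
        agents["curator"](lessons)  # mandatory
        trace.append("curator")
        return trace
    raise RuntimeError(f"subtask not accepted after {max_iterations} iterations")


# Stub agents: Monitor rejects the first proposal, then accepts the second.
monitor_calls = {"n": 0}


def stub_monitor(proposal):
    monitor_calls["n"] += 1
    return {"valid": monitor_calls["n"] > 1, "feedback": "handle the error path"}


stubs = {
    "actor": lambda subtask, feedback: f"{subtask} (feedback: {feedback})",
    "monitor": stub_monitor,
    "predictor": lambda proposal: "low impact",
    "evaluator": lambda proposal, impact: {"approved": True},
    "accept": lambda proposal: None,
    "reflector": lambda proposal, review: ["lesson"],
    "curator": lambda lessons: None,
}

trace = run_subtask("add endpoint", stubs)
print(trace)
```

With these stubs the trace shows one rejected Actor/Monitor round followed by a full pass through Predictor, Evaluator, acceptance, Reflector, and Curator.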
@@ -107,7 +110,7 @@ MAP uses **TWO knowledge storage systems**:

**Mechanism:**

1. **Step 3.1.3:** **Orchestrator** generates a fresh recitation plan before Actor runs
1. **Step 2.5:** **Orchestrator** creates a recitation plan after TaskDecomposer
Copilot AI Oct 30, 2025

The workflow step renumbering from 3.1.3 to 2.5 is unrelated to the test-generator removal. This change, along with the mermaid diagram updates and logging event name changes, should be in a separate PR focused on workflow clarification rather than mixed with the agent removal.

```bash
mapify recitation create "$TASK_ID" "$ARGUMENTS" "$SUBTASKS_JSON"
@@ -192,29 +195,27 @@ Before completing any MAP workflow subtask the orchestrator **MUST** check 4 que

**MapWorkflowLogger** — detailed logging of MAP workflows.

**Activation:** Logging is **optional**, enabled only when:
**Activation:** Logging is optional and enabled via:

- CLI flag: `--debug` (e.g., `mapify init --debug`)
- CLI flag: `--debug` (e.g., `mapify init --debug`, `mapify check --debug`)
- Environment variable: `MAP_DEBUG=true`

**Captured events (7 types):**
**Actual event names:**

1. `workflow_start` — init
2. `workflow_end` — finish/failure
3. `agent_call` — each agent invocation
4. `tool_use` — MCP tool calls
5. `recitation_created` — plan creation
6. `recitation_updated` — plan status updates
7. `error` — workflow failures
- `session_start`, `session_end`
- `agent_invocation`
- `error`, `timing`
- `recitation_plan_created`, `recitation_subtask_updated`, `recitation_context_retrieved`
- Custom events via `log_event` (e.g., `command_start`)

**Format:** JSON Lines (`/.map/logs/workflow_TIMESTAMP.log`)
**Format:** JSON Lines (`.map/logs/workflow_TIMESTAMP.log`)

**Each line includes:**

- `timestamp` (ISO 8601)
- `event_type` (from the list above)
- `event` (event name)
- `task_id` (correlates with RecitationManager)
- `data` (event-specific payload)
- Event-specific fields (e.g., `prompt_preview`, `response_preview` for agent_invocation)
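
Because the log is JSON Lines, each line can be parsed on its own. The sketch below is a hedged illustration of reading such a file and counting events by name; it assumes only the `timestamp`, `event`, and `task_id` fields listed above. The `count_events` helper and the sample records (including their timestamps and task ID) are invented for the example and are not real logger output.

```python
import json
from collections import Counter


def count_events(lines):
    """Count log records by their `event` field, skipping blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts[record["event"]] += 1
    return counts


# Invented sample records using event names from the list above.
sample_log = [
    '{"timestamp": "2025-10-30T12:00:00Z", "event": "session_start", "task_id": "demo"}',
    '{"timestamp": "2025-10-30T12:00:01Z", "event": "agent_invocation", "task_id": "demo"}',
    '{"timestamp": "2025-10-30T12:00:05Z", "event": "agent_invocation", "task_id": "demo"}',
    '{"timestamp": "2025-10-30T12:00:09Z", "event": "session_end", "task_id": "demo"}',
]

print(count_events(sample_log))
```

To run it against a real log, pass an open file object instead of the list, since iterating a text file yields one line at a time.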

**Usage:**

5 changes: 3 additions & 2 deletions presentation/en/04-getting-started.md
@@ -163,13 +163,13 @@ Installation creates `.claude/playbook.json` with a starter structure:
{
"metadata": {
"total_bullets": 21,
"sections_count": 9,
"sections_count": 10,
"top_k": 5
}
}
```

**9 Pattern Categories:**
**10 Pattern Categories:**
Copilot AI Oct 30, 2025

The change from 9 to 10 pattern categories appears unrelated to removing test-generator. The diff shows the count changing from 9 to 10 and a new category 'CLI_TOOL_PATTERNS' being added (line 183). This mixing of unrelated changes makes it difficult to understand what this PR is actually doing - is it removing test-generator or adding a new playbook category? Consider separating these into distinct PRs.

1. ARCHITECTURE_PATTERNS
2. IMPLEMENTATION_PATTERNS
@@ -180,6 +180,7 @@ Installation creates `.claude/playbook.json` with a starter structure:
7. CODE_QUALITY_RULES
8. TOOL_USAGE
9. DEBUGGING_TECHNIQUES
10. CLI_TOOL_PATTERNS

**top_k = 5:** Actor receives only the 5 most relevant patterns per task (reduces cognitive load)
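
The effect of `top_k` can be illustrated with a simple top-k selection. This is a sketch only: the source does not say how relevance is computed, so the `score` field, the bullet `id` values, and the `select_top_k` helper are assumptions made for the example.

```python
import heapq


def select_top_k(bullets, top_k=5):
    """Return the `top_k` bullets with the highest (hypothetical) score."""
    return heapq.nlargest(top_k, bullets, key=lambda bullet: bullet["score"])


# Seven invented bullets with made-up relevance scores.
bullets = [{"id": f"bullet-{i}", "score": i / 10} for i in range(7)]

selected = select_top_k(bullets)
print([bullet["id"] for bullet in selected])
# ['bullet-6', 'bullet-5', 'bullet-4', 'bullet-3', 'bullet-2']
```

`heapq.nlargest` returns results ordered from highest to lowest score, so the first element is the most relevant pattern under this assumed scoring.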

2 changes: 1 addition & 1 deletion presentation/ru/01-введение.md
@@ -45,7 +45,7 @@ MAP uses **6 MCP servers** to extend
**Structure:**

- Stored in [.claude/playbook.json](https://github.com/azalio/map-framework/blob/main/.claude/playbook.json)
- **9 categories of patterns**: architecture, implementation, security, performance, errors, testing, code quality, tool usage, debugging
- **10 categories of patterns**: architecture, implementation, security, performance, errors, testing, code quality, tool usage, debugging, CLI tools
- **top_k = 5**: returns only the 5 most relevant patterns to reduce cognitive load
- **Automatic learning**: Reflector extracts patterns from every task, Curator updates the playbook incrementally
