feat: add MAP Framework enhancements for workflow artifacts and early termination (#71)
Conversation
📒 Files selected for processing (9)
📝 Walkthrough

Adds branch-aware, non-blocking verification recording from end-of-turn hooks, per-check duration instrumentation, repo insight and finish-intent detection, a WONT_DO early-termination state with persistence, JSON schemas and a recorder implementation, docs updates, and extensive tests across these features.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Hook as end-of-turn.sh
    participant Check as ProjectCheck
    participant Recorder as verification_recorder.py
    participant FS as .map/verification_results_<branch>.json
    Hook->>Hook: get_branch_name()
    loop for each check
        Hook->>Check: run_check(name, cmd, recipe_id)
        Check-->>Hook: exit status (pass/fail/skip)
        Hook->>Hook: compute duration_ms
        Hook->>Recorder: call_verification_recorder(recipe_id, status, summary, duration_ms)
        activate Recorder
        Recorder->>Recorder: validate input & build recipe entry
        Recorder->>FS: load existing results or init structure
        Recorder->>Recorder: append entry & compute overall status
        Recorder->>FS: atomic write updated JSON
        Recorder-->>Hook: return (non-blocking)
        deactivate Recorder
    end
    Hook->>Hook: continue execution (recorder failures ignored)
```
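The "compute overall status" step in the diagram implies a simple aggregation over per-recipe results. A hypothetical sketch of that rule (the actual `_compute_overall_status` implementation is not shown in this thread; the dominance ordering below is an assumption):

```python
def compute_overall(recipes: list[dict]) -> str:
    # Hypothetical aggregation: any failure dominates, then any pass,
    # then all-skipped; an empty recipe list stays "unknown".
    statuses = {r["status"] for r in recipes}
    if "fail" in statuses:
        return "fail"
    if "pass" in statuses:
        return "pass"
    if "skipped" in statuses:
        return "skipped"
    return "unknown"
```

This matches the file's initial `{"overall": "unknown", "recipes": []}` state quoted later in the review: with no recipes recorded yet, the overall status stays `unknown`.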
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
… termination

Add comprehensive improvements to MAP Framework:
- JSON schemas for .map/ artifacts (state, verification_results, repo_insight)
- Early-finish detection with Russian phrase support and won't_do terminal status
- Verification results recording with atomic writes for CI/CD integration
- Repo insight generation for language detection and suggested checks
- YAML frontmatter validator for agent templates
- Documentation updates for hooks contract and artifact specifications

New modules:
- intent_detector.py: Detects early termination phrases
- workflow_finalizer.py: Transitions subtasks to won't_do status
- verification_recorder.py: Records hook verification results
- repo_insight.py: Generates project metadata artifacts

Updated end-of-turn.sh hook to record verification results to .map/verification_results_<branch>.json for machine-readable CI/CD status.

All 600 tests pass with no linting errors.
Force-pushed from 6550eb5 to 83b59f1.
Pull request overview
This PR implements MAP Framework enhancements for workflow artifacts and early termination support, adding JSON schemas for .map/ directory artifacts, early-finish detection with Russian phrase support, verification results recording for CI/CD integration, repository insight generation, and YAML frontmatter validation for agent templates.
Changes:
- Adds `won't_do` terminal status for workflows ended early by user request
- Implements verification results tracking in `.map/verification_results_<branch>.json`
- Creates repo insight generation for automatic language detection and suggested checks
- Adds comprehensive test coverage (600 tests mentioned in description)
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/mapify_cli/workflow_state.py | Adds WONT_DO phase and ended_early metadata support with save/load |
| src/mapify_cli/workflow_finalizer.py | New module for transitioning incomplete subtasks to won't_do |
| src/mapify_cli/verification_recorder.py | New module for atomic verification results recording |
| src/mapify_cli/repo_insight.py | New module for language detection and suggested checks |
| src/mapify_cli/intent_detector.py | New module for Russian early-finish phrase detection |
| src/mapify_cli/schemas.py | Adds JSON schemas for state, verification, and repo insight artifacts |
| src/mapify_cli/templates/hooks/end-of-turn.sh | Integrates verification recording into hook workflow |
| .claude/hooks/end-of-turn.sh | Mirror of template hook with verification recording |
| tests/test_workflow_state.py | Comprehensive tests for won't_do status (169 new lines) |
| tests/test_workflow_finalizer.py | Full test coverage for finalization logic (434 lines) |
| tests/test_verification_recorder.py | Extensive verification recorder tests (611 lines) |
| tests/test_repo_insight.py | Complete repo insight tests (368 lines) |
| tests/test_intent_detector.py | Russian phrase detection tests (260 lines) |
| tests/test_agent_frontmatter.py | YAML frontmatter validation tests (202 lines) |
| docs/USAGE.md | Documents verification results and early termination (178 new lines) |
| docs/ARCHITECTURE.md | Documents .map/ artifact specifications (176 new lines) |
| docs/gm_map_framework_improvement_plan.md | New improvement plan document (162 lines) |
```python
try:
    # Write JSON to temp file
    with open(temp_fd, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
```
The value returned by tempfile.mkstemp() is an integer file descriptor, not a file object. Passing it straight to open(temp_fd, ...) does work, but the ownership of the descriptor is then implicit. It is better practice to wrap it with os.fdopen() (or close the descriptor explicitly before reopening by path) so the descriptor is guaranteed to be closed and no resources leak.
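One way to address this, as a sketch rather than the PR's exact code (the helper name `atomic_write_json` is illustrative), is to hand the descriptor to `os.fdopen()`, which takes ownership and closes it, then atomically replace the target file:

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_write_json(path: Path, data: dict) -> None:
    # mkstemp returns an open descriptor plus a path in the same directory,
    # so the final os.replace() stays on one filesystem and is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        # os.fdopen takes ownership of fd and closes it when the block exits
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
        os.replace(tmp_path, path)  # atomic rename; replaces any existing file
    except BaseException:
        os.unlink(tmp_path)  # clean up the temp file on any failure
        raise
```

This is the pattern the follow-up commit alludes to ("close fd before reopening by path"), written here as a self-contained sketch.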
```python
"""Test that partial phrases don't match (word boundaries)."""
# These contain the letters but not as whole words
assert detect_finish_intent("незакончили") is False
assert detect_finish_intent("прекращайте") is False  # Different word form
```
The comment on this assertion is misleading. "прекращайте" is indeed a different word form (plural imperative vs. the singular "прекращай"), but that is not what keeps the assertion true: "прекращайте" contains "прекращай" as a substring, and only the word-boundary regex (\b) prevents a match. The expected result (False) is correct, but the comment should clarify that the test exercises word boundaries, not merely a different word form.
```diff
-assert detect_finish_intent("прекращайте") is False  # Different word form
+assert detect_finish_intent("прекращайте") is False  # Contains 'прекращай' as substring but should not match due to word boundaries
```
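Python's `re` module treats Cyrillic letters as word characters, so `\b` behaves exactly as this test expects. A minimal sketch of the mechanism (the module's actual pattern is not shown in this thread; the pattern below is an assumption for one trigger phrase):

```python
import re

# \b requires a transition between word and non-word characters, and
# Cyrillic letters count as word characters in Python 3's re module.
pattern = re.compile(r"\bпрекращай\b")

assert pattern.search("прекращай работу") is not None  # whole word: matches
assert pattern.search("прекращайте") is None  # 'те' continues the word: no boundary, no match
```

This is why the assertion is passing for the right reason once the comment is fixed: the substring is present, but no word boundary follows it.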
> **Trigger phrases (Russian and English):**
> - "закончили" (finished)
> - "остановимся" (let's stop)
> - "хватит" (enough)
> - "дальше не делай" (don't continue)
The documentation lists Russian trigger phrases but doesn't include English equivalents, despite the heading saying "Russian and English". The intent_detector.py module only supports Russian phrases according to the implementation.
Either the documentation should be updated to remove "and English" from the heading, or English support should be noted as a future enhancement. Currently, only the Russian phrases listed in src/mapify_cli/intent_detector.py are implemented.
```diff
-**Trigger phrases (Russian and English):**
+**Trigger phrases (Russian; English support planned):**
 - "закончили" (finished)
 - "остановимся" (let's stop)
 - "хватит" (enough)
 - "дальше не делай" (don't continue)
+
+> Note: Currently, only these Russian trigger phrases are implemented in `intent_detector.py`. English equivalents are not yet supported and are planned as a future enhancement.
```
- Add pyyaml to test/dev dependencies (fixes CI ModuleNotFoundError)
- Fix file descriptor handling in verification_recorder.py (close fd before reopening by path)
- Improve test comment about word boundary matching
- Fix docs to clarify only Russian phrases are currently supported
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/mapify_cli/schemas.py`:
- Around line 228-232: The examples array for the "active_subtask_id" JSON
Schema uses a Python None which will break JSON serialization; update the
examples for the "active_subtask_id" entry in src/mapify_cli/schemas.py to be
JSON-safe (e.g., replace the Python None with a JSON-friendly placeholder string
like "null (JSON null)" or simply remove the None example) so the schema can be
serialized without errors.
- Around line 280-284: The "at_subtask_id" schema's examples currently include a
raw None literal which isn't valid JSON; update the examples array in schemas.py
for the at_subtask_id field to use a JSON-compatible null (or remove the null
example) instead of the Python None literal so the examples are valid when
serialized or consumed by JSON Schema tools.
🧹 Nitpick comments (3)
src/mapify_cli/repo_insight.py (3)
**57-68: Hardcoded test command appears project-specific.**

The Python suggested checks include `pytest tests/test_template_sync.py -v`, which references a specific test file. This may not be appropriate for arbitrary Python projects using this framework.

♻️ Consider using a generic pytest command:

```diff
 commands_by_language = {
     "python": [
         "make check",
-        "pytest tests/test_template_sync.py -v",
+        "pytest",
         "make sync-templates",
     ],
```
**84-118: Key directories limited to first 5 in hardcoded order.**

The function returns directories in the order they appear in `standard_dirs`, which may exclude important project-specific directories that come later in the list. For example, `.claude` and `.map` are last and may be excluded if `src`, `tests`, `lib`, `pkg`, and `cmd` all exist. Consider whether the ordering should be prioritized differently, or whether all existing standard directories should be returned (with the limit applied as a display constraint rather than a data constraint).
**151-161: Consider using `parents=True` for `.map` directory creation.**

If the `project_root` path doesn't exist or has other issues, `mkdir(exist_ok=True)` without `parents=True` could fail unexpectedly.

♻️ Add parents=True for robustness:

```diff
 # Ensure .map/ directory exists
 map_dir = project_root / ".map"
-map_dir.mkdir(exist_ok=True)
+map_dir.mkdir(parents=True, exist_ok=True)
```
…ion compatibility
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/mapify_cli/verification_recorder.py`:
- Around line 76-81: The branch value is used raw to build results_path, which
allows characters like "/" to create nested directories (e.g.,
verification_results_feature/foo.json); update verification_recorder.py to
sanitize/normalize branch before use (e.g., in the code that sets results_path
or just before it) by replacing path separators and other unsafe filename
characters with safe alternatives (underscore or hyphen), collapsing consecutive
unsafe chars, and possibly truncating long names; ensure you perform this
normalization where branch is referenced (the variables branch, map_dir, and
results_path) so results_path always points to a single file under .map.
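A minimal sanitizer along those lines (the helper name `sanitize_branch` and the 100-character limit are illustrative, not the PR's actual code):

```python
import re

def sanitize_branch(branch: str) -> str:
    # Replace path separators and other unsafe filename characters with "_",
    # collapsing runs of unsafe characters into a single underscore.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", branch)
    # Truncate long names and never return an empty filename component.
    return safe[:100] or "unknown"
```

With this in place, `sanitize_branch("feature/foo")` yields `feature_foo`, so `results_path` always points at a single file directly under `.map/` instead of creating a nested `feature/` directory.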
- Around line 82-105: The read-modify-write around results_path is vulnerable to
lost updates under concurrent access; wrap the entire
load->modify->_validate_verification_results_schema->_atomic_write_json sequence
with a file-level lock (e.g., using fcntl/portalocker) or implement an
optimistic retry loop that: reads results_path, merges the new recipe into
results_data["recipes"], computes overall via _compute_overall_status,
validates, then attempts the atomic write and, if the file changed (mtime/ETag)
between read and write, reloads, merges again and retries a bounded number of
times before failing; apply this to the function that performs the append so
_atomic_write_json is only called while holding the lock (or after a successful
retry/merge). If the module truly guarantees single-writer usage, add a clear
comment/docstring to that effect instead.
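The optimistic-retry variant the prompt describes could look roughly like this (function and field names are illustrative; the real module's helpers differ, and a small race window remains between the mtime check and the rename, which is why a lock is the stronger option):

```python
import json
import os
import tempfile
from pathlib import Path

def append_recipe_with_retry(path: Path, recipe: dict, attempts: int = 5) -> None:
    for _ in range(attempts):
        # Snapshot the file's mtime before reading so a concurrent writer
        # between our read and our write can be detected.
        before = path.stat().st_mtime_ns if path.exists() else None
        if path.exists():
            data = json.loads(path.read_text(encoding="utf-8"))
        else:
            data = {"overall": "unknown", "recipes": []}
        data["recipes"].append(recipe)
        current = path.stat().st_mtime_ns if path.exists() else None
        if current != before:
            continue  # file changed under us: reload, merge again, retry
        fd, tmp = tempfile.mkstemp(dir=path.parent)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp, path)  # atomic publish of the merged result
        return
    raise RuntimeError("gave up after repeated concurrent-write conflicts")
```

This preserves the existing atomic-rename behavior while bounding the number of merge retries, as the prompt suggests.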
```python
# Load existing results or create new structure
if results_path.exists():
    try:
        with results_path.open("r", encoding="utf-8") as f:
            results_data = json.load(f)
    except (json.JSONDecodeError, OSError):
        # If file is corrupted, start fresh
        results_data = {"overall": "unknown", "recipes": []}
else:
    # Create new results file
    results_data = {"overall": "unknown", "recipes": []}

# Append new recipe result (preserving existing entries)
results_data["recipes"].append(recipe)

# Update overall status based on aggregation rules
results_data["overall"] = _compute_overall_status(results_data["recipes"])

# Validate schema before writing
_validate_verification_results_schema(results_data)

# Atomic write: write to temp file, then rename
_atomic_write_json(results_path, results_data)
```
🧩 Analysis chain

🏁 Scripts executed against the repository (azalio/map-framework):

```shell
cat -n src/mapify_cli/verification_recorder.py | head -120
cat -n src/mapify_cli/verification_recorder.py | tail -100

# Search for where record_verification_result is called from
rg -n "record_verification_result" --type py -B 2 -A 2

# Search for any documentation or usage patterns mentioning concurrent/parallel execution
rg -i "concurrent|parallel|hook|subprocess" --type py -B 2 -A 2 | head -80

# Look at README or documentation for usage patterns
find . -name "README*" -o -name "*.md" | head -20 | xargs grep -l "verification\|hook" 2>/dev/null | head -5

# Check the plugin documentation for verification usage context
cat ./.claude-plugin/PLUGIN.md | head -200

# Search for verification hooks or scripts that might call the recorder
find . -name "*verification*" -o -name "*hook*" | grep -v __pycache__ | head -20

# Check if there are hooks that might call the verification recorder in parallel
cat ./.claude/settings.hooks.json

# Check the end-of-turn hook to see if it calls verification recorder
cat ./.claude/hooks/end-of-turn.sh

# Check if there are any comments or documentation about concurrency assumptions
rg -n "single.*writer|concurrent|parallel|race|thread|lock" src/mapify_cli/verification_recorder.py -B 2 -A 2

# Check if there's any test for concurrent writes
rg -n "concurrent|thread|parallel" tests/test_verification_recorder.py -B 2 -A 5

# Get the full concurrent writes test
sed -n '578,650p' tests/test_verification_recorder.py

# Check if hooks could run in parallel
cat ./.claude/settings.hooks.json | jq -r '.hooks | keys[]'

# Verify: can the end-of-turn hook be called concurrently, or is it strictly sequential per turn?
grep -r "concurrent\|parallel\|async" .claude* docs* --include="*.md" --include="*.json" 2>/dev/null | head -20

# Search for any documentation on how Claude Code executes hooks (parallel vs sequential)
find . -name "*.md" -exec grep -l "hook" {} \; | xargs grep -i "sequential\|parallel\|concurrent" 2>/dev/null | head -20

# Check if there's any documentation or code pattern showing parallel verification calls expected
rg -n "parallel\|concurrent" docs/ --type md 2>/dev/null | grep -i "verif\|hook\|record" | head -20
```
Read-modify-write race condition allows lost updates under concurrent access.
The atomic write pattern (temp file + rename) prevents file corruption but not lost updates. Multiple processes reading, modifying, and writing the results file concurrently will drop updates (last writer wins). If this function can be called from parallel hooks or multiple processes simultaneously, add file-level locking or implement a retry-on-change loop to ensure all recipe results are preserved. If single-writer usage is guaranteed, document that concurrency assumption explicitly.
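A file-level lock is the simplest way to close that window. A sketch using POSIX `fcntl` (Unix-only; the function name `append_recipe_locked` is illustrative, not the module's actual API — a cross-platform alternative is the third-party `portalocker` package):

```python
import fcntl
import json
from pathlib import Path

def append_recipe_locked(path: Path, recipe: dict) -> None:
    # A sidecar lock file serializes the whole read-modify-write cycle,
    # so concurrent hook invocations cannot drop each other's entries.
    lock_path = path.with_suffix(".lock")
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until we hold the lock
        try:
            if path.exists():
                data = json.loads(path.read_text(encoding="utf-8"))
            else:
                data = {"overall": "unknown", "recipes": []}
            data["recipes"].append(recipe)
            path.write_text(json.dumps(data, indent=2), encoding="utf-8")
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Because the lock covers both the read and the write, "last writer wins" cannot discard an entry; the atomic-rename write from the original code can still be used inside the locked region.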
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/mapify_cli/schemas.py`:
- Around line 336-340: The "language" property's examples array contains a
Python None literal which causes JSON serialization problems; update the
examples for the "language" schema (the "language" property in schemas.py) to
remove the Python None or replace it with a JSON-safe value (e.g., the string
"null" or simply drop the None entry) so the examples array contains only
serializable values like "python", "javascript", "go", "rust", "typescript".
- Around line 424-428: The examples array for the "language" field in
REPO_INSIGHT_SCHEMA contains a Python None literal which breaks JSON
serialization; update the REPO_INSIGHT_SCHEMA "language" examples to be
JSON-compatible by removing the Python None or replacing it with a proper JSON
null representation (i.e., ensure the examples list contains only
JSON-serializable values) so the "language" key no longer includes the Python
None literal.
- Remove remaining Python None from JSON Schema examples (schemas.py:339, 427)
- Sanitize branch names in verification_recorder.py to avoid nested paths (e.g., feature/foo -> feature_foo)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/mapify_cli/verification_recorder.py`:
- Around line 85-93: Ensure skip_reason is only set when the record's status is
'skipped': before assigning recipe["skip_reason"], check the status variable
(e.g., status) and if status != "skipped" raise a ValueError indicating
skip_reason may only be provided for skipped records; keep the existing
non-negative validation for duration_ms and only attach recipe["skip_reason"]
when the status check passes so recipe remains consistent (references: recipe,
skip_reason, duration_ms, status).
♻️ Duplicate comments (1)
src/mapify_cli/verification_recorder.py (1)
102-124: Atomic rename doesn’t prevent lost updates under concurrent writers.
The read→modify→write cycle can still drop entries if multiple hooks/processes run at the same time. Please confirm single-writer execution or add a file-level lock / optimistic retry to preserve all recipe updates.
```python
# Add optional fields
if duration_ms is not None:
    if duration_ms < 0:
        raise ValueError(f"duration_ms must be >= 0, got {duration_ms}")
    recipe["duration_ms"] = duration_ms

if skip_reason is not None:
    recipe["skip_reason"] = skip_reason
```
Gate skip_reason to status='skipped'.
Right now skip_reason can be set for pass/fail, which can create inconsistent records.
🔧 Proposed fix

```diff
-    if skip_reason is not None:
-        recipe["skip_reason"] = skip_reason
+    if skip_reason is not None:
+        if status != "skipped":
+            raise ValueError("skip_reason is only valid when status='skipped'")
+        recipe["skip_reason"] = skip_reason
```
- verification_recorder.py: Add TypedDict, MAX_RECIPES limit, warning logs, os.fdopen for atomic writes, CLI validation
- end-of-turn.sh: Add get_time_ms() for POSIX-compatible timing
- workflow_finalizer.py: Use None instead of empty string for at_subtask_id, add by_user parameter
- workflow_state.py: Update mark_ended_early signature
- intent_detector.py: Lazy regex compilation
- tests: Update assertions for None vs empty string, fix os.fdopen mock
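A POSIX-compatible millisecond timer for a hook like `end-of-turn.sh` might look like the sketch below (the PR's actual `get_time_ms()` implementation is not shown in this thread; the `python3` fallback is an assumption):

```shell
#!/bin/sh
# Millisecond timestamps without relying on GNU date's %N
# (BSD/macOS date prints a literal "N"; busybox may print something else).
get_time_ms() {
  ns=$(date +%N 2>/dev/null)
  case "$ns" in
    ''|*[!0-9]*)
      # %N unsupported: fall back to python3 for millisecond precision
      python3 -c 'import time; print(int(time.time() * 1000))'
      ;;
    *)
      # %s%N concatenates epoch seconds and nanoseconds; divide down to ms
      echo $(( $(date +%s%N) / 1000000 ))
      ;;
  esac
}

get_time_ms
```

The hook can then compute `duration_ms` per check as the difference of two `get_time_ms` readings, matching the `compute duration_ms` step in the sequence diagram.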
**Summary**

- JSON schemas for `.map/` artifacts (state, verification_results, repo_insight)
- `won't_do` terminal status for early-terminated workflows

**New Modules**

- `intent_detector.py`
- `workflow_finalizer.py`
- `verification_recorder.py` (writes `.map/verification_results_<branch>.json`)
- `repo_insight.py`

**Key Features**
**Verification Results Recording**

The `end-of-turn.sh` hook now records verification results to `.map/verification_results_<branch>.json`:

```json
{
  "overall": "pass",
  "recipes": [
    {"id": "check_ruff", "status": "pass", "summary": "ruff passed", "duration_ms": 1200}
  ]
}
```

**Early Termination with `won't_do`**

When users request early termination (e.g., "закончили"), all pending subtasks are marked `won't_do` and the workflow state records `ended_early` metadata.

**Repo Insight Generation**
Automatic language detection from marker files (pyproject.toml, tsconfig.json, go.mod, etc.) with suggested check commands.
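Detection from marker files can be sketched like so (the mapping below is an assumption based on the marker files named above, not the module's actual table):

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker-file -> language mapping
MARKER_LANGUAGES = {
    "pyproject.toml": "python",
    "tsconfig.json": "typescript",
    "go.mod": "go",
    "Cargo.toml": "rust",
    "package.json": "javascript",
}

def detect_language(project_root: Path) -> Optional[str]:
    # Return the language of the first marker file found, in mapping order.
    for marker, language in MARKER_LANGUAGES.items():
        if (project_root / marker).exists():
            return language
    return None  # unknown project type
```

The detected language then selects the suggested check commands (e.g., `make check` and `pytest` for Python), as discussed in the repo_insight.py nitpicks above.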
**Test plan**

- All 600 tests pass (`make check`)
- Templates synced (`make sync-templates`)

**Summary by CodeRabbit**

- New Features
- Documentation
- Tests