fix: rehydrate execution state from map-plan artifacts#102
Conversation
There was a problem hiding this comment.
Pull request overview
Updates the MAP plan → execution handoff contract so execution state is rehydrated from /map-plan artifacts (spec/task plan/blueprint/manifest) rather than relying on a planning-time step_state.json, and adds regression coverage to prevent skipping resume_from_plan.
Changes:
- Removes planning-time
step_state.jsoncreation guidance from/map-plantemplates and skills; adds explicit handoff messaging to start execution viaresume_from_plan. - Updates execution templates to rehydrate runtime state when existing state is missing or “planning-shaped”.
- Enhances orchestrator resume to recover
aag_contractsfromblueprint.jsonsubtasks and to parse/persist spec constraints; adjusts plan artifact readiness logic and adds regression tests.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_map_step_runner.py | Updates/extends tests for new plan artifact readiness semantics (step_state no longer required). |
| tests/test_map_orchestrator.py | Adds regressions for planning-only state behavior and resume_from_plan rehydration (AAG contracts + constraints). |
| tests/test_command_templates.py | Verifies templates reflect the new handoff contract and rehydration detection guidance. |
| src/mapify_cli/templates/map/scripts/map_step_runner.py | Treats task_plan + blueprint as sufficient for plan readiness; step_state becomes optional. |
| src/mapify_cli/templates/map/scripts/map_orchestrator.py | Adds constraint parsing from spec, persists aag_contracts on resume, and rehydrates state from plan artifacts. |
| src/mapify_cli/templates/commands/map-plan.md | Removes planning-time step_state contract; clarifies execution must start via resume_from_plan. |
| src/mapify_cli/templates/commands/map-efficient.md | Adds explicit “rehydrate when stale/missing” detection logic and calls resume_from_plan. |
| src/mapify_cli/templates/codex/skills/map-plan/SKILL.md | Aligns Codex skill with new plan-only artifact contract and execution rehydration. |
| .codex/skills/map-plan/SKILL.md | Mirrors the Codex skill contract updates for plan-only artifacts and resume-based execution init. |
| .claude/commands/map-plan.md | Aligns Claude command template with plan-only artifact contract and explicit execution handoff. |
| .claude/commands/map-efficient.md | Mirrors the new stale/missing state detection + resume_from_plan guidance. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| key, value = line.strip().split(":", 1) | ||
| normalized = value.strip() | ||
| if normalized in {"null", "None", ""}: | ||
| parsed[key] = None | ||
| elif key in {"max_files", "max_subtasks", "time_budget"}: | ||
| try: | ||
| parsed[key] = int(normalized) | ||
| except ValueError: | ||
| parsed[key] = normalized | ||
| else: | ||
| parsed[key] = normalized.strip('"\'') |
There was a problem hiding this comment.
_load_constraints_from_spec falls back to storing the raw string when int parsing fails for time_budget/max_* (e.g., quoted numbers like "45" or floats like 45.0). Downstream, the workflow-gate time_budget check silently ignores non-numeric values (TypeError), which effectively disables enforcement. Consider normalizing by stripping quotes before parsing, supporting floats where appropriate, and/or treating non-numeric values as an error (or None) so constraints can’t be bypassed accidentally.
| planning_only_workflow = state.get("workflow") == "map-plan" | ||
|
|
||
| should_resume = ( | ||
| workflow_status in {"", "INITIALIZED"} | ||
| or current_phase in {"", "INITIALIZED"} | ||
| or (pending_steps == [] and bool(subtask_sequence)) | ||
| or planning_only_workflow |
There was a problem hiding this comment.
The should_resume heuristic will also return true for a completed runtime state because pending_steps becomes an empty list at completion (see map_orchestrator.validate_step setting current_step_phase to COMPLETE while leaving pending_steps empty). With a plan file present, this would re-run resume_from_plan after completion and overwrite the final runtime state. Consider explicitly excluding COMPLETE (and/or checking current_step_id/current_step_phase == "COMPLETE") from the resume condition, and tightening the pending_steps == [] branch to only match known planning-shaped state (e.g., workflow == "map-plan" and current_phase == "INITIALIZED").
| planning_only_workflow = state.get("workflow") == "map-plan" | |
| should_resume = ( | |
| workflow_status in {"", "INITIALIZED"} | |
| or current_phase in {"", "INITIALIZED"} | |
| or (pending_steps == [] and bool(subtask_sequence)) | |
| or planning_only_workflow | |
| workflow_name = str(state.get("workflow") or "").strip() | |
| is_complete = workflow_status == "COMPLETE" or current_phase == "COMPLETE" | |
| planning_artifact_state = ( | |
| workflow_name == "map-plan" | |
| and current_phase in {"", "INITIALIZED"} | |
| and pending_steps == [] | |
| and bool(subtask_sequence) | |
| ) | |
| should_resume = ( | |
| not is_complete | |
| and ( | |
| workflow_status in {"", "INITIALIZED"} | |
| or current_phase in {"", "INITIALIZED"} | |
| or planning_artifact_state | |
| ) |
|
|
||
| should_resume = ( | ||
| workflow_status in {"", "INITIALIZED"} | ||
| or current_phase in {"", "INITIALIZED"} | ||
| or (pending_steps == [] and bool(subtask_sequence)) |
There was a problem hiding this comment.
Same issue as the template version: pending_steps == [] and bool(subtask_sequence) will also be true for a completed runtime state (where current_step_phase is COMPLETE), causing /map-efficient to re-run resume_from_plan and overwrite the completed execution state whenever a plan file exists. Exclude COMPLETE from the resume heuristic and/or tighten the empty-pending_steps check to only match planning-shaped state.
| should_resume = ( | |
| workflow_status in {"", "INITIALIZED"} | |
| or current_phase in {"", "INITIALIZED"} | |
| or (pending_steps == [] and bool(subtask_sequence)) | |
| planning_shaped_pending_state = ( | |
| pending_steps == [] | |
| and bool(subtask_sequence) | |
| and current_phase != "COMPLETE" | |
| and ( | |
| workflow_status in {"", "INITIALIZED"} | |
| or planning_only_workflow | |
| ) | |
| ) | |
| should_resume = ( | |
| workflow_status in {"", "INITIALIZED"} | |
| or current_phase in {"", "INITIALIZED"} | |
| or planning_shaped_pending_state |
| try: | ||
| src_data = json.loads(source_file.read_text(encoding="utf-8")) | ||
| aag_contracts = src_data.get("aag_contracts", {}) | ||
| if not aag_contracts and isinstance(src_data.get("subtasks"), list): | ||
| aag_contracts = { | ||
| subtask.get("id"): subtask.get("aag_contract", "") | ||
| for subtask in src_data["subtasks"] | ||
| if isinstance(subtask, dict) | ||
| and subtask.get("id") | ||
| and subtask.get("aag_contract") | ||
| } |
There was a problem hiding this comment.
resume_from_plan currently prefers reading aag_contracts from an existing step_state.json before blueprint.json. With the new contract, blueprint.json is the canonical planning artifact, while step_state.json may be stale/partial runtime state (or an old planning-shaped file). Consider preferring blueprint.json as the primary source and only falling back to step_state.json when blueprint is missing or doesn’t contain contracts.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -1586,6 +1652,7 @@ def resume_from_plan(branch: str) -> dict: | |||
| plan_approved=True, | |||
| execution_mode="batch", | |||
| workflow_status="IN_PROGRESS", | |||
| constraints=constraints, | |||
| ) | |||
There was a problem hiding this comment.
resume_from_plan now persists constraints (including time_budget), but the newly created StepState relies on the dataclass default started_at=datetime.now().isoformat() (naive / no timezone). workflow-gate.py’s time_budget enforcement uses datetime.now(timezone.utc) - start, which raises TypeError for naive start and silently skips enforcement. Set started_at explicitly to an unambiguous UTC timestamp (e.g., _utc_timestamp()) when constructing the state (or ensure StepState.started_at is always timezone-aware).
| for raw_line in match.group("body").splitlines(): | ||
| line = raw_line.split("#", 1)[0].rstrip() | ||
| if not line.strip() or ":" not in line: | ||
| continue |
There was a problem hiding this comment.
Constraint parsing strips comments via raw_line.split("#", 1), which will corrupt valid quoted YAML values containing # (e.g., scope_glob: "src/#tmp/**") and can lead to silently wrong constraints being persisted. Consider only treating # as a comment delimiter when it occurs outside quotes (or require that values containing # are fully preserved).
Summary
/map-planfrom instructing users to create a planning-onlystep_state.json/map-efficientexplicitly rehydrate runtime state from plan artifacts when state is missing or still in planning shapeaag_contractsand spec constraints duringresume_from_planWhy
/map-planwas producing a planning-onlystep_state.jsoncontract that did not match runtime expectations inmap_orchestrator.py.That allowed
/map-efficientto skipresume_from_planand, in the worst case, jump straight toST-002.This PR makes planning produce only planning artifacts (
spec,task_plan,blueprint, manifest), and makes execution rebuild canonical runtime state from those artifacts.Changes
map-plan.md/$map-planskill:step_state.jsoncreationresume_from_planmap-efficient.md:COMPLETEruntime state as resumable plan statemap_orchestrator.py:aag_contractsfromblueprint.subtasks[].aag_contractaag_contractsinto runtimeStepStateconstraintsduringresume_from_planmap_step_runner.py:task_plan + blueprintas sufficient for plan readinessTest Plan
Result:
357 passed, 5 skipped