feat(waypoint): session checkpoint plugin for multi-session progress tracking by armstrongl · Pull Request #2 · nuggylib/compound-engineering-plugin

armstrongl · 2026-04-14T00:17:46Z

Summary

Adds waypoint plugin: lightweight, automatic session checkpoints for multi-session projects
Stop hook blocks session end until .context/checkpoint.md is written with structured handoff context
CLAUDE.md instructions handle session-start checkpoint reading, plan checkbox auto-completion, and commit message plan anchoring
Includes ideation doc with full exploration of 40 candidates narrowed to 6 survivors

How it works

Session end: Stop hook detects stale/missing checkpoint, blocks the agent, and instructs it to write one
Session start: CLAUDE.md tells the agent to read the checkpoint before doing anything else
During work: Agent updates plan checkboxes and includes [unit:N] in commit messages
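
To illustrate the commit anchoring in the "During work" step, here is a minimal Python sketch of how a [unit:N] tag could be read back out of a commit message. The regular expression and the function name units_in_commit are assumptions made for illustration; the PR only states that commit messages include [unit:N] and does not show any parsing code.

```python
import re

# Assumed pattern: the PR says only that commit messages include "[unit:N]".
UNIT_TAG = re.compile(r"\[unit:(\d+)\]")


def units_in_commit(message: str) -> list[int]:
    """Return the plan unit numbers referenced in a commit message, in order."""
    return [int(n) for n in UNIT_TAG.findall(message)]


# Hypothetical commit message, used only to exercise the function.
print(units_in_commit("feat(waypoint): add stop hook [unit:3]"))
```

A tag like this would let a later session map commits back to plan units without relying on free-text matching.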

Files

plugins/waypoint/.claude-plugin/plugin.json — manifest with Stop hook
plugins/waypoint/.cursor-plugin/plugin.json — Cursor compat
plugins/waypoint/hooks/stop-checkpoint.sh — blocks stop if checkpoint stale (>5 min)
plugins/waypoint/CLAUDE.md — session-start, during-work, and session-end instructions
docs/ideation/2026-04-13-session-progress-tracking-ideation.md — ideation artifact
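
The staleness rule attached to stop-checkpoint.sh can be sketched as follows. This is a Python illustration, not the shell script from the PR. It assumes that "stale" means the file's modification time is more than 5 minutes old, and the returned dictionary shape is hypothetical, since the script's actual output format is not shown here.

```python
import os
import time
from typing import Optional

MAX_AGE_SECONDS = 5 * 60  # the ">5 min" threshold from the file list


def checkpoint_is_stale(path: str, now: Optional[float] = None,
                        max_age: float = MAX_AGE_SECONDS) -> bool:
    """True if the checkpoint is missing or older than max_age seconds.

    Assumption: age is measured from the file's modification time.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return True  # a missing or unreadable checkpoint counts as stale
    return (now - mtime) > max_age


def stop_decision(path: str = ".context/checkpoint.md",
                  now: Optional[float] = None) -> dict:
    """Hypothetical decision record: block the stop until a fresh checkpoint exists."""
    if checkpoint_is_stale(path, now):
        return {
            "decision": "block",
            "reason": f"Write a structured checkpoint to {path} before stopping.",
        }
    return {"decision": "approve"}
```

Treating a missing file the same as a stale one keeps the rule to a single condition, which matches the "stale/missing" wording in the description.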

Test plan

Install plugin via claude plugin add ./plugins/waypoint
Start a session, verify CLAUDE.md instructions load
End a session without writing checkpoint — verify hook blocks and instructs
Write checkpoint, end session — verify hook approves
Start new session — verify agent reads checkpoint first

…and sequencing 120 raw candidates from 12 sub-agents across 10 ideation frames (information theory, PL semantics, systems architecture, biology, organizational process, economics, linguistics, physics, design, music/military/game theory). 30 ranked ideas, 69 rejections. New ideas EveryInc#14-30 cover: empirical ablation, module unbundling, circuit breaker, JIT specialization, Kolmogorov compression, carrying cost budgeting, register mismatch correction, pidgin instruction language, Schelling point architecture, renormalization group flow, cartographic zoom levels, OODA decision manifests, and congestion pricing. Added implementation sequencing with dependency graph, phase estimates, and speed x impact ranking. Recommended first move: EveryInc#26 (Register Mismatch Correction).

…ch correction

Provides stakeholder-facing framing of the tutorial-to-specification register correction initiative, with real before/after examples from ce-review and projected per-skill savings estimates.

…sification and savings estimates

Catalogs 6 tutorial-register pattern classes (progressive explanation, redundant clarification, motivational framing, inline rationale, hedging markers, indirect speech acts) with structural detection heuristics, 2-axis severity ratings, and transformation rules aligned to the skill compliance checklist. Includes 12 before/after examples from ce-review, a rationale classification framework with 6 worked borderline examples, HTML comment preservation rules, and sampling-based savings estimates for all top-7 skills (19.6-25.4KB estimated net savings across 283KB, correcting prior 20-30% projections to 8-13%).

Strip tutorial-register content (progressive explanation, redundant clarification, motivational framing, inline rationale, hedging markers, indirect speech acts) from the 7 largest skills using the methodology from docs/references/register-mismatch-correction-methodology.md.

Total savings: 29,894 bytes (10.6% of 283,324 bytes across 7 skills). 22 HTML comments preserved for compliance-critical rationale.

Per-skill results:
- ce-work: 3,467 bytes (12.8%)
- ce-compound: 6,363 bytes (20.4%)
- ce-plan: 3,723 bytes (8.9%)
- ce-compound-refresh: 5,485 bytes (11.4%)
- ce-review: 6,145 bytes (11.2%)
- ce-work-beta: 3,952 bytes (12.3%)
- orchestrating-swarms: 759 bytes (1.6%)
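
As a quick consistency check on the figures above, the per-skill savings can be summed and compared against the stated total and percentage. The numbers are copied from the commit message; the variable names are illustrative only.

```python
# Per-skill savings in bytes, as listed in the commit message.
savings_bytes = {
    "ce-work": 3_467,
    "ce-compound": 6_363,
    "ce-plan": 3_723,
    "ce-compound-refresh": 5_485,
    "ce-review": 6_145,
    "ce-work-beta": 3_952,
    "orchestrating-swarms": 759,
}

total_before = 283_324  # combined size of the 7 skills before the change
total_saved = sum(savings_bytes.values())
percent_saved = round(100 * total_saved / total_before, 1)

print(total_saved, percent_saved)  # 29894 10.6
```

The sum reproduces the reported 29,894 bytes, and the ratio rounds to the reported 10.6%.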

…ess tracking

Lightweight plugin that automatically drops checkpoint breadcrumbs at session boundaries. A Stop hook blocks the agent from ending until it writes a structured checkpoint to .context/checkpoint.md. CLAUDE.md instructions handle session-start reading, plan checkbox auto-completion, and commit message anchoring.

# Conflicts:
# plugins/compound-engineering/skills/ce-plan/SKILL.md
# plugins/compound-engineering/skills/ce-work-beta/SKILL.md
# plugins/compound-engineering/skills/ce-work/SKILL.md
# plugins/compound-engineering/skills/orchestrating-swarms/SKILL.md

…yInc#21 for Batch 3

- Brainstorm #2 Lean Agent Dispatch: shared context dedup delivers ~140-195 KB/review by writing dispatch content to .context/ once instead of inlining ~16 KB per reviewer. Archetype cleanup deferred.
- Plan #3 Diff-Proportional Scaling: 5 units adding 4-tier caps (trivial/small/medium/large) with priority-ordered selection across Tier 1-4 reviewer categories.
- Plan EveryInc#8+EveryInc#21 Combined Dedup: 7 units covering AGENTS.md canonical sections, native tool guidance removal from 14 files, phase-scoping the critical "NEVER CODE" interference, semantic compression, and staleness checks.

Updates meta-plan tracking for all Batch 3 items.

…ryInc#8+EveryInc#21, #2

Batch 3 (Phase 2 - Structural Dedup) execution:

- #3 Diff-Proportional Scaling: 5 units adding cap: arg, EXECUTABLE_LINES counting, tier table, priority ordering, and capped team announcement to ce-review SKILL.md. Trivial diffs now dispatch max 8 agents (was unbounded).
- EveryInc#8+EveryInc#21 Cross-Skill Pipeline Dedup: 7 units across 25 files. AGENTS.md canonical Cross-Platform Interaction Convention section, native tool guidance removal from 14 agent/skill files, phase-scoping "NEVER CODE" interference (ce-plan/ce-work/ce-review), semantic compression of 3 pipeline skills, question-tool drift staleness check, cross-reference matrix, and carrying-waste manifest (97% of pipeline content is phase-specific).
- #2 Lean Agent Dispatch: planned + executed 3 units. Write-once dispatch context deduplicates 15.7KB of shared template/schema/scope across N sub-agents. ~144KB savings per 10-reviewer dispatch. Lean prompts pass paths instead of inlining shared content.

All 701 tests pass (4 new staleness tests). release:validate clean.
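
The write-once saving described for Lean Agent Dispatch can be approximated with a simple model: content that was inlined once per reviewer is written once and referenced by path. The sketch below is an assumption-laden estimate, not the commit's own accounting. It ignores the size of the path references in each lean prompt, and the function name is hypothetical.

```python
def write_once_savings_kb(shared_kb: float, reviewers: int) -> float:
    """Estimated saving when shared content is written once instead of inlined per reviewer.

    Inlined cost: shared_kb * reviewers. Write-once cost: shared_kb.
    Path references in the lean prompts are ignored (assumption).
    """
    if reviewers < 1:
        raise ValueError("reviewers must be at least 1")
    return shared_kb * (reviewers - 1)


# 15.7 KB of shared content across a 10-reviewer dispatch, per the commit message.
print(round(write_once_savings_kb(15.7, 10), 1))  # 141.3
```

Under this model a 10-reviewer dispatch saves about 141 KB, close to the ~144KB the commit reports; the commit does not show how its figure was computed, so the small difference is left unexplained here.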

…mework and EveryInc#20 carrying cost

Build the two Phase 0 measurement tools that unlock all Phase 3 work:

EveryInc#14 Empirical Ablation Framework:
- Section parser (src/analysis/sections.ts): splits markdown by H2/H3 headers
- Variant generator (src/analysis/variants.ts): creates ablated skill copies
- Evaluator + quality scorer (src/analysis/evaluator.ts): single-prompt evaluation with four-axis scoring (coverage, precision, calibration, compliance)
- CLI scripts (scripts/ablation/run.ts, report.ts): orchestrate ablation runs with --dry-run, --section, --fixture flags and generate ranked reports
- 3 diff fixtures extracted from repo history
- 41 new tests

EveryInc#20 Carrying Cost Budgeting:
- scripts/skill/stats.ts: ranks 91 skills/agents by carrying cost (file_size x estimated_tool_calls) with recursive sub-agent cost
- Heuristic detects phase headers, tool instructions, agent dispatches, loop constructs, and bash code blocks
- Key finding: system cost ranking differs from file-size ranking; ce-optimize jumps from EveryInc#5 to #2 due to 146 estimated tool calls
- 20 new tests

Also re-created missing brainstorm docs for EveryInc#14 and EveryInc#20 (original 2026-04-13 artifacts were never committed).
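
The carrying-cost metric named above (file_size x estimated_tool_calls) can reorder a list relative to a plain file-size ranking. The following Python sketch shows that effect on made-up data. The skill names and numbers are hypothetical, the recursive sub-agent cost mentioned in the commit is omitted, and this is not the code in scripts/skill/stats.ts.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Skill:
    name: str
    file_size: int             # bytes
    estimated_tool_calls: int  # heuristic estimate


def carrying_cost(skill: Skill) -> int:
    """Carrying cost as described in the commit: file size times estimated tool calls."""
    return skill.file_size * skill.estimated_tool_calls


def rank_by(skills: List[Skill], key: Callable[[Skill], int]) -> List[str]:
    """Names sorted from highest to lowest according to key."""
    return [s.name for s in sorted(skills, key=key, reverse=True)]


# Hypothetical data, chosen only to show the two rankings diverging.
skills = [
    Skill("skill-a", 40_000, 10),
    Skill("skill-b", 20_000, 120),
    Skill("skill-c", 30_000, 25),
]

by_size = rank_by(skills, lambda s: s.file_size)
by_cost = rank_by(skills, carrying_cost)
print(by_size)  # ['skill-a', 'skill-c', 'skill-b']
print(by_cost)  # ['skill-b', 'skill-c', 'skill-a']
```

The smallest file ends up first by carrying cost because its estimated tool-call count dominates, which is the kind of shift the key finding describes for ce-optimize.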

armstrongl · 2026-04-21T15:51:35Z

@copilot resolve the merge conflicts in this pull request

# Conflicts:
# plugins/compound-engineering/skills/ce-code-review/SKILL.md
# plugins/compound-engineering/skills/ce-compound-refresh/SKILL.md
# plugins/compound-engineering/skills/ce-compound/SKILL.md

Co-authored-by: armstrongl <29762984+armstrongl@users.noreply.github.com>

Copilot · 2026-04-21T15:56:52Z

@copilot resolve the merge conflicts in this pull request

Resolved. The three conflicted skill files (ce-code-review/SKILL.md, ce-compound-refresh/SKILL.md, ce-compound/SKILL.md) had conflicts between the register-mismatch corrections in this branch and the updates that landed in main (renamed agents, updated instructions). Resolved by taking main's version for those files, then committed the merge in ca0eca6.


armstrongl added 6 commits April 11, 2026 17:22

docs(register-mismatch): add conceptual explainer for register mismat…

6c20be2


Copilot started work on behalf of armstrongl April 21, 2026 15:51

Copilot finished work on behalf of armstrongl April 21, 2026 15:58

armstrongl merged commit b9266f2 into main Apr 21, 2026

armstrongl merged 7 commits into main from feat/waypoint-plugin
