Merged
24 commits
33b487c
test: add failing tests for purgeAllPreV2Knowledge (Fix 3)
Apr 14, 2026
9a8dfbb
feat: widen legacy-knowledge purge with v3 migration (Fix 3)
Apr 14, 2026
3d6eb8b
test: add failing tests for reconciler self-heal (Fix 2)
Apr 14, 2026
c7fd2bf
feat: add self-heal to reconcile-manifest for render-ready crash wind…
Apr 14, 2026
5fb3503
test: add failing tests for /resolve knowledge citation (Fix 1)
Apr 14, 2026
a4babb4
feat: /resolve reads and cites project knowledge (Fix 1)
Apr 14, 2026
b14f96f
docs: update CLAUDE.md, self-learning.md, CHANGELOG.md for v2.0.0 shi…
Apr 14, 2026
77a97fa
refactor: simplify v2.0.0 ship-blocker fix implementations
Apr 14, 2026
bd1c92f
fix: scrutiny issues from 9-pillar review on v2.0.0 ship-blockers
Apr 14, 2026
083c1c7
docs: move v2 fixes to Unreleased + document marker requirement
Apr 14, 2026
9f8cfdc
refactor(cli): align v3 migration pattern, extract withKnowledgeFiles…
Apr 14, 2026
471e232
refactor(reconcile-manifest): extract helpers, eliminate duplication,…
Apr 14, 2026
fb235d9
refactor: remove unreachable (none) fallback in loadKnowledgeContext
Apr 14, 2026
c3184e9
test: RED for loadKnowledgeIndex + CLI dispatch + observability
Apr 14, 2026
cc85295
feat: loadKnowledgeIndex + subcommand dispatch for knowledge-context.cjs
Apr 14, 2026
53ebb59
test: RED for apply-knowledge skill structure
Apr 14, 2026
2113800
feat: add shared/skills/apply-knowledge skill
Apr 14, 2026
d4c6a83
test: RED for four-command knowledge index adoption
Apr 14, 2026
30f14c2
refactor: wire knowledge index across resolve/plan/self-review/code-r…
Apr 14, 2026
8db9c35
docs: CLAUDE.md + self-learning.md + CHANGELOG for knowledge index pa…
Apr 14, 2026
4ad4f71
refactor: simplify knowledge-context.cjs and clean up new test files
Apr 14, 2026
597f1a2
fix: address scrutiny issues on knowledge index pattern
Apr 14, 2026
e6a5d80
fix: prune stale prose assertions and update reviewer citation sentence
Apr 14, 2026
6542beb
fix: resolve 31 review issues from PR #182 code review walkthrough
Apr 16, 2026
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -11,8 +11,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Self-learning system: detects repeated workflows and creates slash commands/skills automatically
- **Learning**: `devflow learn --purge` command to remove invalid entries from learning log
- **Learning**: debug logging mode (`devflow learn --configure`) — logs to `~/.devflow/logs/`
- **Knowledge citations in review & resolve outputs**: Resolvers and Reviewers cite matching ADR-NNN/PF-NNN IDs inline with an explicit hallucination guard (verbatim-only, no inference). `/resolve` aggregates cited IDs into a `## Knowledge Citations` section at the top of `resolution-summary.md`.

### Changed
- **Knowledge index + on-demand Read pattern across all knowledge-consuming commands**: `/resolve`, `/plan`, `/self-review`, `/code-review`, and `/debug` (plus their Teams variants and ambient orch equivalents `resolve:orch`, `plan:orch`, `review:orch`, `debug:orch`) now fan a compact index instead of the full ADR/PF corpus. Downstream agents (resolver, designer, simplifier, scrutinizer, reviewer) Read full entry bodies on demand. For `/debug`, knowledge stays orchestrator-local (hypothesis generation) and is not fanned to Explore investigators. Shared algorithm extracted to new `devflow:apply-knowledge` skill. Unified placeholder convention: all 11 invocation sites use `"{worktree}"`. Closes PF-011 and fills pre-existing ambient gaps for plan:orch, review:orch, and debug:orch. Token savings: ~75K/run at 10 resolvers with current corpus; scales as O(1) instead of O(entries × agents) as corpus grows.
- **Learning**: Moved from Stop → SessionEnd hook with 3-session batching (adaptive: 5 at 15+ observations)
- **Learning**: Raised procedural thresholds from 2 to 3 observations with 24h+ temporal spread for both types
- **Learning**: Reduced default `max_daily_runs` from 10 to 5
@@ -26,6 +28,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Learning**: Race condition in batch file handoff (atomic `mv` replaces `cp`+`rm`)
- **Learning**: `--enable` now auto-upgrades legacy Stop hook to SessionEnd
- **Learning**: `--status` detects legacy hook and shows upgrade instructions
- **Self-learning reconciler self-heal**: `reconcile-manifest` now recovers from `render-ready` crash-window states. When a knowledge file contains an ADR/PF anchor absent from the manifest, and exactly one `status: 'ready'` log observation matches by normalized pattern, the observation is upgraded to `status: 'created'` and the manifest entry is reconstructed. Zero matches are treated as user-curated (left alone); multiple matches are silently skipped as ambiguous. Adds `healed` counter to all reconcile-manifest output shapes. Heal is gated by the `- **Source**: self-learning:` marker on the knowledge-file section, preventing false-positive heals against pre-v2 seeded entries.
- **Legacy knowledge purge v3 migration** (`purge-legacy-knowledge-v3`): sweeps all remaining pre-v2 seeded knowledge entries using the `- **Source**: self-learning:` format discriminator. Any ADR/PF section lacking this marker is removed. Replaces the v2 hardcoded allow-list approach with a format-based approach that catches entries the v2 migration missed. Self-learning-generated entries and user-opted-in entries (entries containing the source marker) survive.

---

12 changes: 6 additions & 6 deletions CLAUDE.md
@@ -42,13 +42,13 @@ Commands with Teams Variant ship as `{name}.md` (parallel subagents) and `{name}

**Ambient Mode**: Three-layer architecture for always-on intent classification. SessionStart hook (`session-start-classification`) reads lean classification rules (`~/.claude/skills/devflow:router/references/classification-rules.md`, ~30 lines) and injects as `additionalContext` — once per session, deterministic, zero model overhead. UserPromptSubmit hook (`preamble`) injects a one-sentence prompt per message triggering classification + router loading via Skill tool. Router SKILL.md is a pure skill lookup table (~50 lines) loaded on-demand only for GUIDED/ORCHESTRATED depth — maps intent×depth to domain and orchestration skills. Toggleable via `devflow ambient --enable/--disable/--status` or `devflow init`.

**Self-Learning**: A SessionEnd hook (`session-end-learning`) accumulates session IDs and triggers a background `claude -p --model sonnet` every 3 sessions (5 at 15+ observations) to detect **4 observation types** — workflow, procedural, decision, and pitfall — from batch transcripts. Transcript content is split into two channels by `scripts/hooks/lib/transcript-filter.cjs`: `USER_SIGNALS` (plain user messages, feeds workflow/procedural detection) and `DIALOG_PAIRS` (prior-assistant + user turns, feeds decision/pitfall detection). Detection uses per-type linguistic markers and quality gates stored in each observation as `quality_ok`. Per-type thresholds govern promotion (workflow: 3 required; procedural: 4 required; decision/pitfall: 2 required), each with independent temporal spread requirements. Observations accumulate in `.memory/learning-log.jsonl`; their lifecycle is `observing → ready → created → deprecated`. When thresholds are met, `json-helper.cjs render-ready` renders deterministically to 4 targets: slash commands (`.claude/commands/self-learning/`), skills (`.claude/skills/{slug}/`), decisions.md ADR entries, and pitfalls.md PF entries. A session-start feedback reconciler (`json-helper.cjs reconcile-manifest`) checks the manifest at `.memory/.learning-manifest.json` against the filesystem to detect deletions (applies 0.3× confidence penalty) and edits (ignored per D13). Loaded artifacts are reinforced locally (no LLM) on each session end. Single toggle mechanism: hook presence in `settings.json` IS the enabled state — no `enabled` field in `learning.json`. Toggleable via `devflow learn --enable/--disable/--status` or `devflow init --learn/--no-learn`. Configurable model/throttle/caps/debug via `devflow learn --configure`. Use `devflow learn --reset` to remove all artifacts + log + transient state. Use `devflow learn --purge` to remove invalid observations. Use `devflow learn --review` to inspect observations needing attention. 
Debug logs stored at `~/.devflow/logs/{project-slug}/`. The `knowledge-persistence` skill is a format specification only; the actual writer is `scripts/hooks/background-learning` via `json-helper.cjs render-ready`.
**Self-Learning**: A SessionEnd hook (`session-end-learning`) accumulates session IDs and triggers a background `claude -p --model sonnet` every 3 sessions (5 at 15+ observations) to detect **4 observation types** — workflow, procedural, decision, and pitfall — from batch transcripts. Transcript content is split into two channels by `scripts/hooks/lib/transcript-filter.cjs`: `USER_SIGNALS` (plain user messages, feeds workflow/procedural detection) and `DIALOG_PAIRS` (prior-assistant + user turns, feeds decision/pitfall detection). Detection uses per-type linguistic markers and quality gates stored in each observation as `quality_ok`. Per-type thresholds govern promotion (workflow: 3 required; procedural: 4 required; decision/pitfall: 2 required), each with independent temporal spread requirements. Observations accumulate in `.memory/learning-log.jsonl`; their lifecycle is `observing → ready → created → deprecated`. When thresholds are met, `json-helper.cjs render-ready` renders deterministically to 4 targets: slash commands (`.claude/commands/self-learning/`), skills (`.claude/skills/{slug}/`), decisions.md ADR entries, and pitfalls.md PF entries. A session-start feedback reconciler (`json-helper.cjs reconcile-manifest`) checks the manifest at `.memory/.learning-manifest.json` against the filesystem to detect deletions (applies 0.3× confidence penalty) and edits (ignored per D13). The reconciler also **self-heals** from render-ready crash-window states: when a knowledge file contains an ADR/PF anchor that is absent from the manifest *and* the section carries the `- **Source**: self-learning:` marker, the heal scans the log for `status: 'ready'` observations matching by normalized pattern (exactly one match = upgrade to `status: 'created'` and reconstruct manifest entry; zero or multiple matches = silently skipped). The marker check excludes pre-v2 seeded entries from the heal path so they cannot be falsely paired with a current ready obs. 
Loaded artifacts are reinforced locally (no LLM) on each session end. Single toggle mechanism: hook presence in `settings.json` IS the enabled state — no `enabled` field in `learning.json`. Toggleable via `devflow learn --enable/--disable/--status` or `devflow init --learn/--no-learn`. Configurable model/throttle/caps/debug via `devflow learn --configure`. Use `devflow learn --reset` to remove all artifacts + log + transient state. Use `devflow learn --purge` to remove invalid observations. Use `devflow learn --review` to inspect observations needing attention. Debug logs stored at `~/.devflow/logs/{project-slug}/`. The `knowledge-persistence` skill is a format specification only; the actual writer is `scripts/hooks/background-learning` via `json-helper.cjs render-ready`.

**Claude Code Flags**: Typed registry (`src/cli/utils/flags.ts`) for managing Claude Code feature flags (env vars and top-level settings). Pure functions `applyFlags`/`stripFlags`/`getDefaultFlags` follow the `applyTeamsConfig`/`stripTeamsConfig` pattern. Initial flags: `tool-search`, `lsp`, `clear-context-on-plan` (default ON), `brief`, `disable-1m-context` (default OFF). Manageable via `devflow flags --enable/--disable/--status/--list`. Stored in manifest `features.flags: string[]`.

**Two-Mode Init**: `devflow init` offers Recommended (sensible defaults, quick setup) or Advanced (full interactive flow) after plugin selection. `--recommended` / `--advanced` CLI flags for non-interactive use. Recommended applies: ambient ON, memory ON, learn ON, HUD ON, teams OFF, default-ON flags, .claudeignore ON, auto-install safe-delete if trash CLI detected, user-mode security deny list.

**Migrations**: Run-once migrations execute automatically on `devflow init`, tracked at `~/.devflow/migrations.json` (scope-independent; single file regardless of user-scope vs local-scope installs). Registry: append an entry to `MIGRATIONS` in `src/cli/utils/migrations.ts`. Scopes: `global` (runs once per machine, no project context) vs `per-project` (sweeps all discovered Claude-enabled projects in parallel). Failures are non-fatal — migrations retry on next init. **D37 edge case**: a project cloned *after* migrations have run won't be swept (the marker is global, not per-project). Recovery: `rm ~/.devflow/migrations.json` forces a re-sweep on next `devflow init`.
**Migrations**: Run-once migrations execute automatically on `devflow init`, tracked at `~/.devflow/migrations.json` (scope-independent; single file regardless of user-scope vs local-scope installs). Registry: append an entry to `MIGRATIONS` in `src/cli/utils/migrations.ts`. Scopes: `global` (runs once per machine, no project context) vs `per-project` (sweeps all discovered Claude-enabled projects in parallel). Failures are non-fatal — migrations retry on next init. Currently registered per-project migrations include `purge-legacy-knowledge-v2` (removes 4 hardcoded pre-v2 ADR/PF IDs and orphan `PROJECT-PATTERNS.md`) and `purge-legacy-knowledge-v3` (v3: sweeps all remaining pre-v2 seeded entries using the `- **Source**: self-learning:` format discriminator — any ADR/PF section lacking this marker is removed; entries the user edited to include the marker survive). **D37 edge case**: a project cloned *after* migrations have run won't be swept (the marker is global, not per-project). Recovery: `rm ~/.devflow/migrations.json` forces a re-sweep on next `devflow init`.

## Project Structure

@@ -142,12 +142,12 @@ Working memory files live in a dedicated `.memory/` directory:
## Agent & Command Roster

**Orchestration commands** (spawn agents, never do agent work in main session):
- `/plan` — Skimmer + Explore + Designer + Synthesizer + Plan + Designer → design artifact
- `/plan` — Skimmer + Explore + Designer + Synthesizer + Plan + Designer → design artifact; consumes knowledge via index + on-demand Read via `devflow:apply-knowledge`
- `/implement` — Git + Coder + Validator + Simplifier + Scrutinizer + Evaluator + Tester → PR (accepts plan documents, issues, or task descriptions)
- `/code-review` — 7-11 Reviewer agents + Git + Synthesizer
- `/resolve` — N Resolver agents + Git
- `/code-review` — 7-11 Reviewer agents + Git + Synthesizer; consumes knowledge via index + on-demand Read via `devflow:apply-knowledge`
- `/resolve` — N Resolver agents + Git; loads compact knowledge index (`knowledge-context.cjs index`) per worktree and passes it as `KNOWLEDGE_CONTEXT` to each Resolver; Resolvers use `devflow:apply-knowledge` to Read full bodies on demand; aggregates cited ADR-NNN/PF-NNN IDs into a `## Knowledge Citations` section at the top of `resolution-summary.md`
- `/debug` — Agent Teams competing hypotheses
- `/self-review` — Simplifier then Scrutinizer (sequential)
- `/self-review` — Simplifier then Scrutinizer (sequential); consumes knowledge via index + on-demand Read via `devflow:apply-knowledge`
- `/audit-claude` — CLAUDE.md audit (optional plugin)

**Shared agents** (12): git, synthesizer, skimmer, simplifier, coder, reviewer, resolver, evaluator, tester, scrutinizer, validator, designer
67 changes: 66 additions & 1 deletion docs/self-learning.md
@@ -84,6 +84,67 @@ On session start, `json-helper.cjs reconcile-manifest <cwd>` compares manifest e

This creates a feedback loop: deleting a generated artifact reduces its observation's confidence, eventually causing it to stop promoting.

#### Self-Heal: Crash-Window Recovery

`render-ready` writes to the knowledge file first, then updates the log and manifest. If the process crashes in the window between file write and log update, the knowledge file contains the new ADR/PF entry but the log still shows `status: 'ready'` — a duplicate would be written on the next render-ready call.

The reconciler detects and heals these orphans automatically:

1. Scans `decisions.md` and `pitfalls.md` for ADR/PF anchors not tracked in the manifest. Only sections containing the `- **Source**: self-learning:` marker qualify; pre-v2 seeded entries (which lack the marker) are excluded so they cannot be falsely paired with a current `status: 'ready'` observation.
2. For each unmanaged anchor, searches the log for `status: 'ready'` observations whose normalized pattern matches the anchor's heading text.
3. **Exactly one match** → upgrades the observation to `status: 'created'`, reconstructs the manifest entry, and registers usage. The `healed` counter in the reconcile output increments.
4. **Zero matches** → the entry is user-curated (written manually). Left untouched.
5. **Multiple matches** → ambiguous; silently skipped. The `healed` counter does not increment.

The `healed` field is present in all three reconcile-manifest output shapes (main path and both early-return paths) and is backward-compatible — callers that discard the output are unaffected.
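
The decision rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `reconcile-manifest` code: the `normalize`, `healSection`, and `reconcile` helpers, the in-memory shapes of sections and observations, and the use of a `Set` for the manifest are all hypothetical.

```javascript
// Hypothetical sketch of the heal decision; not the real reconcile-manifest implementation.
const SOURCE_MARKER = '- **Source**: self-learning:';

// Assumed normalization: lowercase, collapse whitespace, trim.
function normalize(text) {
  return text.toLowerCase().replace(/\s+/g, ' ').trim();
}

// section: { id, heading, body }; observations: [{ id, status, pattern }];
// manifestIds: Set of anchor IDs already tracked in the manifest (simplified).
function healSection(section, observations, manifestIds) {
  if (manifestIds.has(section.id)) return { action: 'tracked' };
  // Gate: sections without the marker (e.g. pre-v2 seeded entries) never heal.
  if (!section.body.includes(SOURCE_MARKER)) return { action: 'skip-unmarked' };

  const matches = observations.filter(
    (o) => o.status === 'ready' && normalize(o.pattern) === normalize(section.heading)
  );
  if (matches.length === 0) return { action: 'user-curated' }; // left untouched
  if (matches.length > 1) return { action: 'ambiguous' };      // silently skipped

  matches[0].status = 'created'; // upgrade the single matching observation
  manifestIds.add(section.id);   // stand-in for reconstructing the manifest entry
  return { action: 'healed', observationId: matches[0].id };
}

// Returns the `healed` counter for one reconcile pass.
function reconcile(sections, observations, manifestIds) {
  let healed = 0;
  for (const section of sections) {
    if (healSection(section, observations, manifestIds).action === 'healed') healed += 1;
  }
  return { healed };
}
```

Returning a distinct `action` for each branch keeps the zero-match and multiple-match cases separable in tests, even though both leave the files unchanged.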

## Knowledge Index + On-Demand Read Pattern

Knowledge consumers (slash commands and orch skills) do not fan the full ADR/PF corpus to spawned agents. Instead they use a two-step pattern:

### Step 1: Load compact index at orchestrator

```bash
KNOWLEDGE_CONTEXT=$(node scripts/hooks/lib/knowledge-context.cjs index "{worktree}")
```

This produces a compact index listing each active entry's ID, truncated title, status, and area:

```
Decisions (2):
ADR-001 Use Result types instead of thrown errors [Active]
...

Pitfalls (3):
PF-004 Background hook scripts become god scripts [Active] — scripts/hooks/
...

ADR-NNN entries live in /path/to/project/.memory/knowledge/decisions.md
PF-NNN entries live in /path/to/project/.memory/knowledge/pitfalls.md
Read the relevant file and locate the matching `## ADR-NNN:` or `## PF-NNN:` heading for the full body.
```

> **Note**: Pre-v2 seeded entries may show `[unknown]` instead of `[Active]` if they predate the standard `- **Status**: Active` line format. New entries created by the learning system always include the status line.
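
For illustration, index lines of the shape shown above could be derived from a knowledge file roughly like this. It is a sketch only: `buildIndexLines` and the 60-character truncation width are assumptions, the area suffix is omitted, and the real `knowledge-context.cjs` may parse entries differently.

```javascript
// Hypothetical sketch; knowledge-context.cjs may build the index differently.
const MAX_TITLE = 60; // assumed truncation width, not taken from the source

function buildIndexLines(markdown) {
  const lines = [];
  // Split before every "## ADR-NNN:" or "## PF-NNN:" heading.
  const sections = markdown.split(/^(?=## (?:ADR|PF)-\d+:)/m);
  for (const section of sections) {
    const head = section.match(/^## ((?:ADR|PF)-\d+):[ \t]*(.*)$/m);
    if (!head) continue; // preamble or unrelated text
    const id = head[1];
    const rawTitle = head[2];
    // Entries without a "- **Status**: ..." line fall back to "unknown".
    const status = section.match(/^- \*\*Status\*\*:[ \t]*(.+)$/m);
    const title =
      rawTitle.length > MAX_TITLE ? rawTitle.slice(0, MAX_TITLE - 3) + '...' : rawTitle;
    lines.push(`${id} ${title} [${status ? status[1].trim() : 'unknown'}]`);
  }
  return lines;
}
```

The `unknown` fallback mirrors the note above: an entry with no status line still appears in the index instead of being dropped.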

### Step 2: Agent reads full body on demand

Agents that receive `KNOWLEDGE_CONTEXT` follow the `devflow:apply-knowledge` skill algorithm:

1. Scan the index and identify plausibly-relevant entries for the current task
2. Use `Read` on the knowledge file and locate the matching `## ADR-NNN:` or `## PF-NNN:` heading
3. Read the full entry body
4. Cite `applies ADR-NNN` / `avoids PF-NNN` inline — verbatim IDs only, no fabrication
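
The lookup and citation steps can be sketched in code. In practice an agent performs them with the `Read` tool, so this only illustrates the logic; `readEntry` and `citeIfKnown` are hypothetical names, not part of the `devflow:apply-knowledge` skill.

```javascript
// Hypothetical sketch of steps 2-4; not the skill's actual implementation.

// Returns the full section for `id` (heading plus body), or null if absent.
function readEntry(markdown, id) {
  if (!/^(?:ADR|PF)-\d+$/.test(id)) return null; // only well-formed IDs
  const lines = markdown.split('\n');
  const start = lines.findIndex((line) => line.startsWith(`## ${id}:`));
  if (start === -1) return null;
  // The entry ends at the next level-2 heading or at end of file.
  let end = lines.findIndex((line, i) => i > start && line.startsWith('## '));
  if (end === -1) end = lines.length;
  return lines.slice(start, end).join('\n').trim();
}

// Guard against fabricated citations: cite only IDs that appear in the index.
function citeIfKnown(verb, id, indexIds) {
  return indexIds.has(id) ? `${verb} ${id}` : null;
}
```

A caller would pass the IDs parsed from `KNOWLEDGE_CONTEXT` as `indexIds`, so an ID that never appeared in the index cannot be cited.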

### Commands using this pattern

| Command / Orch | Agents that consume |
|----------------|---------------------|
| `/resolve`, `resolve:orch` | Resolver |
| `/plan`, `plan:orch` | Designer, Explore |
| `/self-review` | Simplifier, Scrutinizer |
| `/code-review`, `review:orch` | Reviewer |
| `debug:orch` | Orchestrator-local (not fanned to Explore) |

## CLI Commands

```bash
@@ -97,7 +158,11 @@ npx devflow-kit learn --purge # Remove invalid/corrupted entri
npx devflow-kit learn --review # Inspect observations needing attention (stale, capped, low-quality)
```

Removal of pre-v2 low-signal knowledge entries (ADR-002, PF-001, PF-003, PF-005) and orphan `PROJECT-PATTERNS.md` now runs automatically as a one-time migration on `devflow init` — no CLI flag needed. Migration state is tracked at `~/.devflow/migrations.json`.
Two one-time migrations run automatically on `devflow init` to remove pre-v2 seeded knowledge entries — no CLI flag needed. Migration state is tracked at `~/.devflow/migrations.json`.

**v2 migration (`purge-legacy-knowledge-v2`)**: Removes 4 hardcoded low-signal IDs (ADR-002, PF-001, PF-003, PF-005) and the orphan `PROJECT-PATTERNS.md` file seeded by earlier devflow versions.

**v3 migration (`purge-legacy-knowledge-v3`)**: Sweeps all remaining pre-v2 seeded entries using a format discriminator. Any ADR/PF section in `decisions.md` or `pitfalls.md` that lacks the line `- **Source**: self-learning:` is treated as pre-v2 seeded content and removed. Self-learning-generated entries all carry this marker, so they are preserved. User-edited entries survive too — add the `- **Source**: self-learning:manual_xxx` line to any entry you want to keep through future migrations.
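
As a rough illustration of the format discriminator, the following sketch drops every ADR/PF section that lacks the marker. `purgeUnmarkedSections` is a hypothetical helper working on a string in memory; the actual migration's file handling and section parsing are not shown in this document and may differ.

```javascript
// Hypothetical sketch of the v3 discriminator; not the actual migration code.
const SOURCE_MARKER = '- **Source**: self-learning:';

function purgeUnmarkedSections(markdown) {
  // Split before every "## ADR-NNN:" or "## PF-NNN:" heading.
  const parts = markdown.split(/^(?=## (?:ADR|PF)-\d+:)/m);
  const kept = [];
  const removed = [];
  for (const part of parts) {
    const head = part.match(/^## ((?:ADR|PF)-\d+):/);
    if (!head) {
      kept.push(part); // file preamble is always preserved
    } else if (part.includes(SOURCE_MARKER)) {
      kept.push(part); // self-learning or user-opted-in entry survives
    } else {
      removed.push(head[1]); // treated as pre-v2 seeded content
    }
  }
  return { content: kept.join(''), removed };
}
```

Because the check is a plain substring match on the marker line, an entry with a manually added `- **Source**: self-learning:manual_xxx` line survives the same way a generated one does.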

## HUD Row
