From b3440b412eade61513cd0f9667f082d94fadb17d Mon Sep 17 00:00:00 2001 From: Dean Sharon Date: Sun, 19 Apr 2026 00:08:47 +0300 Subject: [PATCH 1/5] fix: suppress QUICK classification announcement and deduplicate classification rules The preamble hook now asks the model to load devflow:router only when a request is classified GUIDED or ORCHESTRATED, which suppresses the noisy classification announcement for QUICK requests. Move classification-rules.md from router/references/ to router/ to eliminate redundant loading and streamline the preamble execution path. - Preamble hook: conditional router loading based on depth - Session-start-classification: read rules from router/, with a fallback to the legacy router/references/ path - Tests: updated classification-rules.md paths; ambient activation tests now assert no classification output for QUICK - CLAUDE.md: updated classification-rules.md path Co-Authored-By: Claude --- CLAUDE.md | 2 +- scripts/hooks/preamble | 2 +- scripts/hooks/session-start-classification | 8 ++++---- .../router/{references => }/classification-rules.md | 0 tests/ambient.test.ts | 4 ++-- tests/integration/ambient-activation.test.ts | 4 ++++ tests/integration/helpers.ts | 4 ++-- tests/skill-references.test.ts | 2 +- 8 files changed, 15 insertions(+), 11 deletions(-) rename shared/skills/router/{references => }/classification-rules.md (100%) diff --git a/CLAUDE.md b/CLAUDE.md index ff749c8..f60124f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -40,7 +40,7 @@ Commands with Teams Variant ship as `{name}.md` (parallel subagents) and `{name} **Working Memory**: Four shell-script hooks (`scripts/hooks/`) provide automatic session continuity. Toggleable via `devflow memory --enable/--disable/--status` or `devflow init --memory/--no-memory`. UserPromptSubmit (`prompt-capture-memory`) captures user prompt to `.memory/.pending-turns.jsonl` queue. Stop hook captures `assistant_message` (on `end_turn` only) to same queue, then spawns throttled background `claude -p --model haiku` updater (skips if triggered <2min ago; concurrent sessions serialize via mkdir-based lock). 
Background updater uses `mv`-based atomic handoff to process all pending turns in batch (capped at 10 most recent), with crash recovery via `.pending-turns.processing` file. Updates `.memory/WORKING-MEMORY.md` with structured sections (`## Now`, `## Progress`, `## Decisions`, `## Modified Files`, `## Context`, `## Session Log`). SessionStart hook → injects previous memory + git state as `additionalContext` on `/clear`, startup, or compact (warns if >1h stale; injects pre-compact memory snapshot when compaction happened mid-session). PreCompact hook → saves git state + WORKING-MEMORY.md snapshot + bootstraps minimal WORKING-MEMORY.md if none exists. Disabling memory removes all four hooks. Use `devflow memory --clear` to clean up pending queue files across projects. Zero-ceremony context preservation. -**Ambient Mode**: Three-layer architecture for always-on intent classification. SessionStart hook (`session-start-classification`) reads lean classification rules (`~/.claude/skills/devflow:router/references/classification-rules.md`, ~30 lines) and injects as `additionalContext` — once per session, deterministic, zero model overhead. UserPromptSubmit hook (`preamble`) injects a one-sentence prompt per message triggering classification + router loading via Skill tool. Router SKILL.md is a pure skill lookup table (~50 lines) loaded on-demand only for GUIDED/ORCHESTRATED depth — maps intent×depth to domain and orchestration skills. Toggleable via `devflow ambient --enable/--disable/--status` or `devflow init`. +**Ambient Mode**: Three-layer architecture for always-on intent classification. SessionStart hook (`session-start-classification`) reads lean classification rules (`~/.claude/skills/devflow:router/classification-rules.md`, ~30 lines) and injects as `additionalContext` — once per session, deterministic, zero model overhead. 
UserPromptSubmit hook (`preamble`) injects a one-sentence prompt per message triggering classification + conditional router loading via Skill tool. Router SKILL.md is a pure skill lookup table (~50 lines) loaded on-demand only for GUIDED/ORCHESTRATED depth — maps intent×depth to domain and orchestration skills. Toggleable via `devflow ambient --enable/--disable/--status` or `devflow init`. **Self-Learning**: A SessionEnd hook (`session-end-learning`) accumulates session IDs and triggers a background `claude -p --model sonnet` every 3 sessions (5 at 15+ observations) to detect **4 observation types** — workflow, procedural, decision, and pitfall — from batch transcripts. Transcript content is split into two channels by `scripts/hooks/lib/transcript-filter.cjs`: `USER_SIGNALS` (plain user messages, feeds workflow/procedural detection) and `DIALOG_PAIRS` (prior-assistant + user turns, feeds decision/pitfall detection). Detection uses per-type linguistic markers and quality gates stored in each observation as `quality_ok`. Per-type thresholds govern promotion (workflow: 3 required; procedural: 4 required; decision/pitfall: 2 required), each with independent temporal spread requirements. Observations accumulate in `.memory/learning-log.jsonl`; their lifecycle is `observing → ready → created → deprecated`. When thresholds are met, `json-helper.cjs render-ready` renders deterministically to 4 targets: slash commands (`.claude/commands/self-learning/`), skills (`.claude/skills/{slug}/`), decisions.md ADR entries, and pitfalls.md PF entries. A session-start feedback reconciler (`json-helper.cjs reconcile-manifest`) checks the manifest at `.memory/.learning-manifest.json` against the filesystem to detect deletions (applies 0.3× confidence penalty) and edits (ignored per D13). 
The reconciler also **self-heals** from render-ready crash-window states: when a knowledge file contains an ADR/PF anchor that is absent from the manifest *and* the section carries the `- **Source**: self-learning:` marker, the heal scans the log for `status: 'ready'` observations matching by normalized pattern (exactly one match = upgrade to `status: 'created'` and reconstruct manifest entry; zero or multiple matches = silently skipped). The marker check excludes pre-v2 seeded entries from the heal path so they cannot be falsely paired with a current ready obs. Loaded artifacts are reinforced locally (no LLM) on each session end. Single toggle mechanism: hook presence in `settings.json` IS the enabled state — no `enabled` field in `learning.json`. Toggleable via `devflow learn --enable/--disable/--status` or `devflow init --learn/--no-learn`. Configurable model/throttle/caps/debug via `devflow learn --configure`. Use `devflow learn --reset` to remove all artifacts + log + transient state. Use `devflow learn --purge` to remove invalid observations. Use `devflow learn --review` to inspect observations needing attention. Debug logs stored at `~/.devflow/logs/{project-slug}/`. The `knowledge-persistence` skill is a format specification only; the actual writer is `scripts/hooks/background-learning` via `json-helper.cjs render-ready`. diff --git a/scripts/hooks/preamble b/scripts/hooks/preamble index 234c050..2c46255 100755 --- a/scripts/hooks/preamble +++ b/scripts/hooks/preamble @@ -34,6 +34,6 @@ fi # Minimal preamble — classification rules injected at SessionStart, not here. # SYNC: must match tests/ambient.test.ts preamble drift detection -PREAMBLE="Classify this request's intent and depth, then load devflow:router via Skill tool." +PREAMBLE="Classify this request's intent and depth. If GUIDED or ORCHESTRATED, load devflow:router via Skill tool." 
json_prompt_output "$PREAMBLE" diff --git a/scripts/hooks/session-start-classification b/scripts/hooks/session-start-classification index aa13500..f5297d1 100755 --- a/scripts/hooks/session-start-classification +++ b/scripts/hooks/session-start-classification @@ -15,12 +15,12 @@ INPUT=$(cat) CWD=$(printf '%s' "$INPUT" | json_field "cwd" "") if [ -z "$CWD" ]; then exit 0; fi -CLASSIFICATION_RULES="$HOME/.claude/skills/devflow:router/references/classification-rules.md" +CLASSIFICATION_RULES="$HOME/.claude/skills/devflow:router/classification-rules.md" +CLASSIFICATION_RULES_LEGACY="$HOME/.claude/skills/devflow:router/references/classification-rules.md" if [ -f "$CLASSIFICATION_RULES" ]; then CONTEXT=$(cat "$CLASSIFICATION_RULES") -elif [ -f "$HOME/.claude/skills/devflow:router/SKILL.md" ]; then - # Fallback for upgrade window: old install without classification-rules.md - CONTEXT=$(awk '/^---$/{n++; next} n>=2' "$HOME/.claude/skills/devflow:router/SKILL.md") +elif [ -f "$CLASSIFICATION_RULES_LEGACY" ]; then + CONTEXT=$(cat "$CLASSIFICATION_RULES_LEGACY") else exit 0 fi diff --git a/shared/skills/router/references/classification-rules.md b/shared/skills/router/classification-rules.md similarity index 100% rename from shared/skills/router/references/classification-rules.md rename to shared/skills/router/classification-rules.md diff --git a/tests/ambient.test.ts b/tests/ambient.test.ts index 35ecd75..69a6bcc 100644 --- a/tests/ambient.test.ts +++ b/tests/ambient.test.ts @@ -489,7 +489,7 @@ function parseClassificationIntents(content: string): string[] { describe('router structural validation', () => { const routerPath = path.resolve(__dirname, '../shared/skills/router/SKILL.md'); - const rulesPath = path.resolve(__dirname, '../shared/skills/router/references/classification-rules.md'); + const rulesPath = path.resolve(__dirname, '../shared/skills/router/classification-rules.md'); const sharedSkillsDir = path.resolve(__dirname, '../shared/skills'); it('router covers all 
ORCHESTRATED intents (every non-CHAT intent has a row)', async () => { @@ -594,7 +594,7 @@ describe('preamble drift detection', () => { }); it('classification-rules.md contains required classification elements', async () => { - const rulesPath = path.resolve(__dirname, '../shared/skills/router/references/classification-rules.md'); + const rulesPath = path.resolve(__dirname, '../shared/skills/router/classification-rules.md'); const rulesContent = await fs.readFile(rulesPath, 'utf-8'); // Must contain Intent Signals heading diff --git a/tests/integration/ambient-activation.test.ts b/tests/integration/ambient-activation.test.ts index ccd56a1..cdc66b8 100644 --- a/tests/integration/ambient-activation.test.ts +++ b/tests/integration/ambient-activation.test.ts @@ -4,6 +4,7 @@ import { runClaudeStreaming, runClaudeStreamingWithRetry, hasSkillInvocations, + hasClassification, getSkillInvocations, hasRequiredSkills, } from './helpers.js'; @@ -34,12 +35,14 @@ describe.skipIf(!isClaudeAvailable())('devflow classification', () => { // "thanks" is ≤2 words — preamble's word-count filter skips it before classification runs const result = await runClaudeStreaming('thanks', { timeout: 20000 }); expect(hasSkillInvocations(result)).toBe(false); + expect(hasClassification(result)).toBe(false); console.log(`preamble filter (single-word): no skills (${result.durationMs}ms)`); }); it('QUICK — explore: "where is the config?" 
loads no skills', async () => { const result = await runClaudeStreaming('where is the config file?', { timeout: 20000 }); expect(hasSkillInvocations(result)).toBe(false); + expect(hasClassification(result)).toBe(false); console.log(`QUICK explore: no skills (${result.durationMs}ms)`); }); @@ -47,6 +50,7 @@ describe.skipIf(!isClaudeAvailable())('devflow classification', () => { // Passes preamble's word-count filter (>2 words) but classified CHAT/QUICK — no skills loaded const result = await runClaudeStreaming('sounds good, thanks for explaining that', { timeout: 20000 }); expect(hasSkillInvocations(result)).toBe(false); + expect(hasClassification(result)).toBe(false); console.log(`CHAT/QUICK (multi-word): no skills (${result.durationMs}ms)`); }); diff --git a/tests/integration/helpers.ts b/tests/integration/helpers.ts index dc02414..9c6668d 100644 --- a/tests/integration/helpers.ts +++ b/tests/integration/helpers.ts @@ -21,13 +21,13 @@ export function isClaudeAvailable(): boolean { * Simulates SessionStart injection for integration tests. */ function loadRouterContext(): string { - const rulesPath = resolve(import.meta.dirname, '../../shared/skills/router/references/classification-rules.md'); + const rulesPath = resolve(import.meta.dirname, '../../shared/skills/router/classification-rules.md'); return readFileSync(rulesPath, 'utf-8').trim(); } // Simulates SessionStart injection (classification rules) + per-message preamble const DEVFLOW_PREAMBLE = loadRouterContext() + - '\nClassify this request\'s intent and depth, then load devflow:router via Skill tool.'; + '\nClassify this request\'s intent and depth. 
If GUIDED or ORCHESTRATED, load devflow:router via Skill tool.'; /** Result from a streaming claude invocation */ export interface StreamResult { diff --git a/tests/skill-references.test.ts b/tests/skill-references.test.ts index 6e1940e..b5428eb 100644 --- a/tests/skill-references.test.ts +++ b/tests/skill-references.test.ts @@ -703,7 +703,7 @@ describe('Test infrastructure skill references', () => { it('DEVFLOW_PREAMBLE reads classification-rules.md which has valid refs', () => { // helpers.ts loads DEVFLOW_PREAMBLE from classification-rules.md at runtime. // Verify the classification rules reference devflow:router (loaded via Skill tool). - const rulesPath = path.join(ROOT, 'shared', 'skills', 'router', 'references', 'classification-rules.md'); + const rulesPath = path.join(ROOT, 'shared', 'skills', 'router', 'classification-rules.md'); const rulesContent = readFileSync(rulesPath, 'utf-8'); const rulesRefs = extractPrefixedRefs(rulesContent); From 95e8ed68d56677e6036b76591d2b9cc285181176 Mon Sep 17 00:00:00 2001 From: Dean Sharon Date: Sun, 19 Apr 2026 14:27:50 +0300 Subject: [PATCH 2/5] test(ambient): add missing hasClassification assertion to slash-command preamble test The slash-command preamble filter test was missing the `expect(hasClassification(result)).toBe(false)` assertion present in the other three QUICK-tier tests, leaving the contract for that code path incompletely specified. 
Co-Authored-By: Claude --- tests/integration/ambient-activation.test.ts | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/integration/ambient-activation.test.ts b/tests/integration/ambient-activation.test.ts index cdc66b8..1163e28 100644 --- a/tests/integration/ambient-activation.test.ts +++ b/tests/integration/ambient-activation.test.ts @@ -58,6 +58,7 @@ describe.skipIf(!isClaudeAvailable())('devflow classification', () => { // Preamble filters prompts starting with "/" — no classification or skill loading const result = await runClaudeStreaming('/help with something', { timeout: 20000 }); expect(hasSkillInvocations(result)).toBe(false); + expect(hasClassification(result)).toBe(false); console.log(`preamble filter (slash command): no skills (${result.durationMs}ms)`); }); From d776dc6d0458b5eb714eff639b084526a0d4792f Mon Sep 17 00:00:00 2001 From: Dean Sharon Date: Sun, 19 Apr 2026 22:06:20 +0300 Subject: [PATCH 3/5] refactor(test): remove hasDevFlowBranding duplicate, extract parseStreamEvent MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Remove hasDevFlowBranding (identical to hasClassification) and its tests - Extract parseStreamEvent from runClaudeStreaming to reduce nesting (5→3 levels) - Add parseStreamEvent unit tests (5 cases) and classification pattern variation guard --- tests/ambient.test.ts | 66 +++++++++++++++++++++++++++--- tests/integration/helpers.ts | 79 +++++++++++++++++++----------------- 2 files changed, 102 insertions(+), 43 deletions(-) diff --git a/tests/ambient.test.ts b/tests/ambient.test.ts index 69a6bcc..07eacf4 100644 --- a/tests/ambient.test.ts +++ b/tests/ambient.test.ts @@ -7,8 +7,8 @@ import { hasClassification, extractIntent, extractDepth, - hasDevFlowBranding, hasSkillInvocations, + parseStreamEvent, } from './integration/helpers.js'; /** Helper to create a StreamResult from text for unit-testing classification helpers. 
*/ @@ -421,12 +421,68 @@ describe('classification helpers', () => { expect(extractDepth(textResult('no classification here'))).toBeNull(); }); - it('detects Devflow branding', () => { - expect(hasDevFlowBranding(textResult('Devflow: IMPLEMENT/GUIDED. Loading: devflow:patterns.'))).toBe(true); + it('CLASSIFICATION_PATTERN matches model output variations', () => { + // Canonical format (model instruction says "Devflow: INTENT/DEPTH") + expect(hasClassification(textResult('Devflow: IMPLEMENT/GUIDED'))).toBe(true); + // Lowercase (model might vary casing) + expect(hasClassification(textResult('devflow: implement/guided'))).toBe(true); + // No space after colon + expect(hasClassification(textResult('Devflow:CHAT/QUICK'))).toBe(true); + // Extra whitespace + expect(hasClassification(textResult('Devflow: PLAN / ORCHESTRATED'))).toBe(true); }); +}); - it('returns false for non-Devflow branding', () => { - expect(hasDevFlowBranding(textResult('Some random text without branding.'))).toBe(false); +describe('parseStreamEvent', () => { + it('extracts skills from assistant tool_use events', () => { + const event = { + type: 'assistant', + message: { content: [ + { type: 'tool_use', name: 'Skill', input: { skill: 'devflow:router' } }, + ] }, + }; + const parsed = parseStreamEvent(event); + expect(parsed.skills).toEqual(['devflow:router']); + expect(parsed.textFragments).toEqual([]); + }); + + it('extracts text from assistant text blocks', () => { + const event = { + type: 'assistant', + message: { content: [ + { type: 'text', text: 'Devflow: IMPLEMENT/GUIDED' }, + ] }, + }; + const parsed = parseStreamEvent(event); + expect(parsed.textFragments).toEqual(['Devflow: IMPLEMENT/GUIDED']); + expect(parsed.skills).toEqual([]); + }); + + it('returns empty arrays for non-assistant events', () => { + expect(parseStreamEvent({ type: 'user', message: { content: [] } })).toEqual({ skills: [], textFragments: [] }); + expect(parseStreamEvent({ type: 'system', message: { content: [] } 
})).toEqual({ skills: [], textFragments: [] }); + }); + + it('returns empty arrays for malformed events', () => { + expect(parseStreamEvent(null)).toEqual({ skills: [], textFragments: [] }); + expect(parseStreamEvent(undefined)).toEqual({ skills: [], textFragments: [] }); + expect(parseStreamEvent({})).toEqual({ skills: [], textFragments: [] }); + expect(parseStreamEvent({ type: 'assistant' })).toEqual({ skills: [], textFragments: [] }); + expect(parseStreamEvent({ type: 'assistant', message: {} })).toEqual({ skills: [], textFragments: [] }); + }); + + it('handles mixed content blocks', () => { + const event = { + type: 'assistant', + message: { content: [ + { type: 'text', text: 'Devflow: DEBUG/ORCHESTRATED' }, + { type: 'tool_use', name: 'Skill', input: { skill: 'devflow:debug:orch' } }, + { type: 'text', text: 'Loading debug orchestrator.' }, + ] }, + }; + const parsed = parseStreamEvent(event); + expect(parsed.skills).toEqual(['devflow:debug:orch']); + expect(parsed.textFragments).toEqual(['Devflow: DEBUG/ORCHESTRATED', 'Loading debug orchestrator.']); }); }); diff --git a/tests/integration/helpers.ts b/tests/integration/helpers.ts index 9c6668d..2416308 100644 --- a/tests/integration/helpers.ts +++ b/tests/integration/helpers.ts @@ -29,6 +29,41 @@ function loadRouterContext(): string { const DEVFLOW_PREAMBLE = loadRouterContext() + '\nClassify this request\'s intent and depth. If GUIDED or ORCHESTRATED, load devflow:router via Skill tool.'; +/** Parsed fields from a single streaming event */ +export interface ParsedStreamEvent { + skills: string[]; + textFragments: string[]; +} + +/** + * Extract skill invocations and text fragments from a single streaming event. + * Only processes assistant messages with content arrays. 
+ */ +export function parseStreamEvent(event: unknown): ParsedStreamEvent { + const skills: string[] = []; + const textFragments: string[] = []; + + if ( + typeof event !== 'object' || event === null || + (event as Record).type !== 'assistant' || + !Array.isArray((event as { message?: { content?: unknown } }).message?.content) + ) { + return { skills, textFragments }; + } + + const msg = event as { type: string; message: { content: Record[] } }; + for (const block of msg.message.content) { + if (block.type === 'tool_use' && block.name === 'Skill' && typeof (block.input as Record)?.skill === 'string') { + skills.push((block.input as Record).skill as string); + } + if (block.type === 'text' && typeof block.text === 'string') { + textFragments.push(block.text as string); + } + } + + return { skills, textFragments }; +} + /** Result from a streaming claude invocation */ export interface StreamResult { /** Skill tool invocations detected (skill names) */ @@ -106,33 +141,14 @@ export function runClaudeStreaming( if (!line.trim()) continue; try { const event: unknown = JSON.parse(line); + const parsed = parseStreamEvent(event); + skills.push(...parsed.skills); + textFragments.push(...parsed.textFragments); - // Detect Skill tool_use in assistant messages - if ( - typeof event === 'object' && event !== null && - (event as Record).type === 'assistant' && - Array.isArray((event as Record).message?.content) - ) { - const msg = event as { type: string; message: { content: Record[] } }; - for (const block of msg.message.content) { - // tool_use block for Skill - if (block.type === 'tool_use' && block.name === 'Skill' && typeof (block.input as Record)?.skill === 'string') { - skills.push((block.input as Record).skill as string); - } - // text block — capture for classification detection - if (block.type === 'text' && typeof block.text === 'string') { - textFragments.push(block.text); - } - } - - // Once we have skills, give a brief window for more, then finish - if 
(skills.length > 0 && !graceTimer) { - graceTimer = setTimeout(() => { - finish(true); - }, 8000); // 8s grace for additional skill loads after first detection - } + // Once we have skills, give a brief window for more, then finish + if (skills.length > 0 && !graceTimer) { + graceTimer = setTimeout(() => finish(true), 8000); } - } catch { // Partial JSON line, skip } @@ -210,19 +226,6 @@ export function extractDepth(result: StreamResult): string | null { return match ? match[2].toUpperCase() : null; } -/** - * Check whether the result contains a Devflow classification tag. - * - * @see hasClassification — functionally identical after both helpers were - * unified on {@link CLASSIFICATION_PATTERN}. Kept as a distinct export so - * existing test assertions that describe "branding presence" (vs. "a - * classification exists") remain self-documenting at the call site. - */ -export function hasDevFlowBranding(result: StreamResult): boolean { - const text = result.textFragments.join(' '); - return CLASSIFICATION_PATTERN.test(text); -} - /** * Check if required skills are present in the result. * Uses bounded matching: exact match, namespace-suffixed, or devflow-prefixed. From d70998ae0bade6e11685f1d26f51771a556d6bdb Mon Sep 17 00:00:00 2001 From: Dean Sharon Date: Sun, 19 Apr 2026 23:50:22 +0300 Subject: [PATCH 4/5] fix(test): check all subagent transcripts to avoid mtime race condition MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit getLatestSubagentPreloadedSkills picked the single most-recent transcript by mtime. When Claude spawns auxiliary subagents alongside the target agent, the wrong transcript could win the race — causing the Designer test to read Git agent skills instead. Now returns all transcripts so tests can match against the correct one. 
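The "at least one transcript matches" check described above can be sketched as a small predicate. This is an illustrative sketch, not part of the patch: `someTranscriptHasAll` is a hypothetical name, and the sample skill lists are made up to show an auxiliary Git transcript sorting ahead of the target agent's transcript.

```typescript
// Hypothetical helper (not in the patch): true when at least one transcript's
// preloaded-skill list contains every expected skill.
function someTranscriptHasAll(allPreloads: string[][], expected: string[]): boolean {
  return allPreloads.some((preloads) => expected.every((skill) => preloads.includes(skill)));
}

// Made-up data: the auxiliary Git transcript has the newest mtime, so it sorts first.
const allPreloads: string[][] = [
  ['git', 'worktree-support'],
  ['worktree-support', 'apply-knowledge', 'gap-analysis', 'design-review'],
];
const designerSkills = ['worktree-support', 'apply-knowledge', 'gap-analysis', 'design-review'];

// Old behaviour: only the newest transcript is inspected, so the Designer check fails.
const newestOnly = designerSkills.every((skill) => allPreloads[0].includes(skill));
// New behaviour: any transcript written since the spawn may satisfy the expectation.
const anyTranscript = someTranscriptHasAll(allPreloads, designerSkills);

console.log({ newestOnly, anyTranscript });
```

A helper along these lines could also replace the six repeated `allPreloads.some(...)` expressions in subagent-skill-preload.test.ts, though the patch keeps them inline.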
--- tests/integration/helpers.ts | 15 ++-- .../subagent-skill-preload.test.ts | 76 ++++++++++++------- 2 files changed, 59 insertions(+), 32 deletions(-) diff --git a/tests/integration/helpers.ts b/tests/integration/helpers.ts index 2416308..1dd2a66 100644 --- a/tests/integration/helpers.ts +++ b/tests/integration/helpers.ts @@ -363,13 +363,18 @@ function parsePreloadedSkills(transcriptPath: string): string[] { } /** - * Find the most recent subagent transcript written at or after `since` and - * return the preloaded skill names from its initial user message. + * Find all subagent transcripts written at or after `since` and return the + * preloaded skill names from each transcript's initial user message. * - * Returns an empty array if no transcript is found or the directory structure + * Returns one string[] per transcript. The caller can assert that at least one + * transcript contains the expected skills — this avoids a race condition where + * Claude spawns auxiliary subagents (e.g., Git) alongside the target agent, + * and the auxiliary transcript has a later mtime. + * + * Returns an empty array if no transcripts are found or the directory structure * has changed (graceful degradation). */ -export function getLatestSubagentPreloadedSkills(since: Date): string[] { +export function getAllSubagentPreloadedSkills(since: Date): string[][] { const homeDir = process.env.HOME ?? process.env.USERPROFILE ?? 
''; const cwd = process.cwd(); // Claude Code encodes the project path by replacing / with - @@ -382,7 +387,7 @@ export function getLatestSubagentPreloadedSkills(since: Date): string[] { // Most recent transcript first transcripts.sort((a, b) => b.mtime.getTime() - a.mtime.getTime()); - return parsePreloadedSkills(transcripts[0].path); + return transcripts.map((t) => parsePreloadedSkills(t.path)); } catch { // Project dir doesn't exist or structure changed — return empty gracefully return []; diff --git a/tests/integration/subagent-skill-preload.test.ts b/tests/integration/subagent-skill-preload.test.ts index b8eea2d..aec68c5 100644 --- a/tests/integration/subagent-skill-preload.test.ts +++ b/tests/integration/subagent-skill-preload.test.ts @@ -2,25 +2,28 @@ import { describe, it, expect } from 'vitest'; import { isClaudeAvailable, runClaudeAndWait, - getLatestSubagentPreloadedSkills, + getAllSubagentPreloadedSkills, } from './helpers.js'; /** - * Spawn an agent by name and return the preloaded skills from its transcript. - * Asserts that a transcript was actually found before returning. + * Spawn an agent by name and return ALL subagent transcripts' preloaded skills. + * + * Returns string[][] — one skill list per transcript. The caller asserts that + * at least one transcript contains the expected skills, avoiding a race where + * Claude spawns auxiliary subagents whose transcript mtime beats the target's. */ -async function spawnAgentAndGetPreloads(agentType: string, prompt: string): Promise<string[]> { +async function spawnAgentAndGetAllPreloads(agentType: string, prompt: string): Promise<string[][]> { const since = new Date(); const result = await runClaudeAndWait( `Use the Agent tool with subagent_type="${agentType}" to ${prompt}. 
Only spawn the agent, do not do any other work.`, { timeout: 60000, model: 'haiku', allowedTools: 'Agent' }, ); - const preloaded = getLatestSubagentPreloadedSkills(since); + const allPreloads = getAllSubagentPreloadedSkills(since); expect( - preloaded.length, + allPreloads.length, `No subagent transcript found for ${agentType} (exit=${result.exitCode}, ${result.durationMs}ms, cwd=${process.cwd()})`, ).toBeGreaterThan(0); - return preloaded; + return allPreloads; } /** @@ -39,43 +42,62 @@ async function spawnAgentAndGetPreloads(agentType: string, prompt: string): Prom describe.skipIf(!isClaudeAvailable())('subagent skill preload', () => { it('Simplifier preloads software-design and worktree-support', async () => { - const preloaded = await spawnAgentAndGetPreloads('Simplifier', 'simplify this trivial function: function add(a, b) { return a + b; }'); - expect(preloaded).toEqual(expect.arrayContaining(['software-design', 'worktree-support'])); + const allPreloads = await spawnAgentAndGetAllPreloads('Simplifier', 'simplify this trivial function: function add(a, b) { return a + b; }'); + const expected = ['software-design', 'worktree-support']; + expect( + allPreloads.some((p) => expected.every((s) => p.includes(s))), + `No transcript contains ${expected.join(', ')}. 
Found: ${JSON.stringify(allPreloads)}`, + ).toBe(true); // Simplifier must NOT have apply-knowledge (PR #182 explicit assertion) - expect(preloaded).not.toContain('apply-knowledge'); + const simplifierTranscript = allPreloads.find((p) => expected.every((s) => p.includes(s)))!; + expect(simplifierTranscript).not.toContain('apply-knowledge'); }, 90000); it('Scrutinizer preloads quality-gates, software-design, worktree-support, apply-knowledge', async () => { - const preloaded = await spawnAgentAndGetPreloads('Scrutinizer', 'evaluate this code: const x = 1;'); - expect(preloaded).toEqual(expect.arrayContaining([ - 'quality-gates', 'software-design', 'worktree-support', 'apply-knowledge', - ])); + const allPreloads = await spawnAgentAndGetAllPreloads('Scrutinizer', 'evaluate this code: const x = 1;'); + const expected = ['quality-gates', 'software-design', 'worktree-support', 'apply-knowledge']; + expect( + allPreloads.some((p) => expected.every((s) => p.includes(s))), + `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`, + ).toBe(true); }, 90000); it('Reviewer preloads review-methodology, worktree-support, apply-knowledge', async () => { - const preloaded = await spawnAgentAndGetPreloads('Reviewer', 'review this code: const y = 2;'); - expect(preloaded).toEqual(expect.arrayContaining([ - 'review-methodology', 'worktree-support', 'apply-knowledge', - ])); + const allPreloads = await spawnAgentAndGetAllPreloads('Reviewer', 'review this code: const y = 2;'); + const expected = ['review-methodology', 'worktree-support', 'apply-knowledge']; + expect( + allPreloads.some((p) => expected.every((s) => p.includes(s))), + `No transcript contains ${expected.join(', ')}. 
Found: ${JSON.stringify(allPreloads)}`, + ).toBe(true); }, 90000); it('Coder preloads all 8 declared core skills', async () => { - const preloaded = await spawnAgentAndGetPreloads('Coder', 'implement a no-op task'); - expect(preloaded).toEqual(expect.arrayContaining([ + const allPreloads = await spawnAgentAndGetAllPreloads('Coder', 'implement a no-op task'); + const expected = [ 'software-design', 'git', 'patterns', 'testing', 'test-driven-development', 'research', 'boundary-validation', 'worktree-support', - ])); + ]; + expect( + allPreloads.some((p) => expected.every((s) => p.includes(s))), + `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`, + ).toBe(true); }, 90000); it('Designer preloads worktree-support, apply-knowledge, gap-analysis, design-review', async () => { - const preloaded = await spawnAgentAndGetPreloads('Designer', 'analyze this design: "Add a cache layer."'); - expect(preloaded).toEqual(expect.arrayContaining([ - 'worktree-support', 'apply-knowledge', 'gap-analysis', 'design-review', - ])); + const allPreloads = await spawnAgentAndGetAllPreloads('Designer', 'analyze this design: "Add a cache layer."'); + const expected = ['worktree-support', 'apply-knowledge', 'gap-analysis', 'design-review']; + expect( + allPreloads.some((p) => expected.every((s) => p.includes(s))), + `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`, + ).toBe(true); }, 90000); it('Git agent preloads git and worktree-support', async () => { - const preloaded = await spawnAgentAndGetPreloads('Git', 'run git status'); - expect(preloaded).toEqual(expect.arrayContaining(['git', 'worktree-support'])); + const allPreloads = await spawnAgentAndGetAllPreloads('Git', 'run git status'); + const expected = ['git', 'worktree-support']; + expect( + allPreloads.some((p) => expected.every((s) => p.includes(s))), + `No transcript contains ${expected.join(', ')}. 
Found: ${JSON.stringify(allPreloads)}`, + ).toBe(true); }, 90000); }); From dc8298ce46fa6664cf44ea3ab5c2888024142149 Mon Sep 17 00:00:00 2001 From: Dean Sharon Date: Mon, 20 Apr 2026 00:06:26 +0300 Subject: [PATCH 5/5] chore: remove dead legacy fallback path from classification hook MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The legacy path can never fire — the hook and the classification-rules.md file are always installed together by devflow init. --- scripts/hooks/session-start-classification | 3 --- 1 file changed, 3 deletions(-) diff --git a/scripts/hooks/session-start-classification b/scripts/hooks/session-start-classification index f5297d1..217495d 100755 --- a/scripts/hooks/session-start-classification +++ b/scripts/hooks/session-start-classification @@ -16,11 +16,8 @@ CWD=$(printf '%s' "$INPUT" | json_field "cwd" "") if [ -z "$CWD" ]; then exit 0; fi CLASSIFICATION_RULES="$HOME/.claude/skills/devflow:router/classification-rules.md" -CLASSIFICATION_RULES_LEGACY="$HOME/.claude/skills/devflow:router/references/classification-rules.md" if [ -f "$CLASSIFICATION_RULES" ]; then CONTEXT=$(cat "$CLASSIFICATION_RULES") -elif [ -f "$CLASSIFICATION_RULES_LEGACY" ]; then - CONTEXT=$(cat "$CLASSIFICATION_RULES_LEGACY") else exit 0 fi
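Taken together, the series gates router loading on classification depth. The sketch below restates that flow in TypeScript under stated assumptions: `TAG_PATTERN` is a hypothetical stand-in for `CLASSIFICATION_PATTERN` (whose definition is not shown in these patches), written only to accept the four tag variations asserted in PATCH 3/5, and `parseClassification` and `shouldLoadRouter` are illustrative names rather than functions in the repository.

```typescript
// Hypothetical stand-in for CLASSIFICATION_PATTERN; the real definition is not
// shown in this series. Group 1 is the intent, group 2 is the depth.
const TAG_PATTERN = /devflow:\s*([a-z]+)\s*\/\s*([a-z]+)/i;

interface Classification {
  intent: string;
  depth: string;
}

// Parse a "Devflow: INTENT/DEPTH" tag out of model output, or return null.
function parseClassification(text: string): Classification | null {
  const match = TAG_PATTERN.exec(text);
  if (!match) return null;
  return { intent: match[1].toUpperCase(), depth: match[2].toUpperCase() };
}

// Mirrors the new preamble wording: only GUIDED and ORCHESTRATED load devflow:router.
function shouldLoadRouter(depth: string): boolean {
  return depth === 'GUIDED' || depth === 'ORCHESTRATED';
}

// The four tag shapes from the PATCH 3/5 variation guard, plus a non-match.
const samples = [
  'Devflow: IMPLEMENT/GUIDED',
  'devflow: implement/guided',
  'Devflow:CHAT/QUICK',
  'Devflow: PLAN / ORCHESTRATED',
  'no classification here',
];

for (const sample of samples) {
  const parsed = parseClassification(sample);
  const loadRouter = parsed !== null && shouldLoadRouter(parsed.depth);
  console.log(sample, '->', parsed, 'load router:', loadRouter);
}
```

In the actual hooks the decision is made by the model following the preamble sentence, not by code; the sketch only makes the intended QUICK versus GUIDED/ORCHESTRATED split explicit.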