diff --git a/CLAUDE.md b/CLAUDE.md
index ff749c8..f60124f 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -40,7 +40,7 @@ Commands with Teams Variant ship as `{name}.md` (parallel subagents) and `{name}
 
 **Working Memory**: Four shell-script hooks (`scripts/hooks/`) provide automatic session continuity. Toggleable via `devflow memory --enable/--disable/--status` or `devflow init --memory/--no-memory`. UserPromptSubmit (`prompt-capture-memory`) captures user prompt to `.memory/.pending-turns.jsonl` queue. Stop hook captures `assistant_message` (on `end_turn` only) to same queue, then spawns throttled background `claude -p --model haiku` updater (skips if triggered <2min ago; concurrent sessions serialize via mkdir-based lock). Background updater uses `mv`-based atomic handoff to process all pending turns in batch (capped at 10 most recent), with crash recovery via `.pending-turns.processing` file. Updates `.memory/WORKING-MEMORY.md` with structured sections (`## Now`, `## Progress`, `## Decisions`, `## Modified Files`, `## Context`, `## Session Log`). SessionStart hook → injects previous memory + git state as `additionalContext` on `/clear`, startup, or compact (warns if >1h stale; injects pre-compact memory snapshot when compaction happened mid-session). PreCompact hook → saves git state + WORKING-MEMORY.md snapshot + bootstraps minimal WORKING-MEMORY.md if none exists. Disabling memory removes all four hooks. Use `devflow memory --clear` to clean up pending queue files across projects. Zero-ceremony context preservation.
 
-**Ambient Mode**: Three-layer architecture for always-on intent classification. SessionStart hook (`session-start-classification`) reads lean classification rules (`~/.claude/skills/devflow:router/references/classification-rules.md`, ~30 lines) and injects as `additionalContext` — once per session, deterministic, zero model overhead. UserPromptSubmit hook (`preamble`) injects a one-sentence prompt per message triggering classification + router loading via Skill tool. Router SKILL.md is a pure skill lookup table (~50 lines) loaded on-demand only for GUIDED/ORCHESTRATED depth — maps intent×depth to domain and orchestration skills. Toggleable via `devflow ambient --enable/--disable/--status` or `devflow init`.
+**Ambient Mode**: Three-layer architecture for always-on intent classification. SessionStart hook (`session-start-classification`) reads lean classification rules (`~/.claude/skills/devflow:router/classification-rules.md`, ~30 lines) and injects as `additionalContext` — once per session, deterministic, zero model overhead. UserPromptSubmit hook (`preamble`) injects a one-sentence prompt per message triggering classification + conditional router loading via Skill tool. Router SKILL.md is a pure skill lookup table (~50 lines) loaded on-demand only for GUIDED/ORCHESTRATED depth — maps intent×depth to domain and orchestration skills. Toggleable via `devflow ambient --enable/--disable/--status` or `devflow init`.
 
 **Self-Learning**: A SessionEnd hook (`session-end-learning`) accumulates session IDs and triggers a background `claude -p --model sonnet` every 3 sessions (5 at 15+ observations) to detect **4 observation types** — workflow, procedural, decision, and pitfall — from batch transcripts. Transcript content is split into two channels by `scripts/hooks/lib/transcript-filter.cjs`: `USER_SIGNALS` (plain user messages, feeds workflow/procedural detection) and `DIALOG_PAIRS` (prior-assistant + user turns, feeds decision/pitfall detection). Detection uses per-type linguistic markers and quality gates stored in each observation as `quality_ok`. Per-type thresholds govern promotion (workflow: 3 required; procedural: 4 required; decision/pitfall: 2 required), each with independent temporal spread requirements. Observations accumulate in `.memory/learning-log.jsonl`; their lifecycle is `observing → ready → created → deprecated`. When thresholds are met, `json-helper.cjs render-ready` renders deterministically to 4 targets: slash commands (`.claude/commands/self-learning/`), skills (`.claude/skills/{slug}/`), decisions.md ADR entries, and pitfalls.md PF entries. A session-start feedback reconciler (`json-helper.cjs reconcile-manifest`) checks the manifest at `.memory/.learning-manifest.json` against the filesystem to detect deletions (applies 0.3× confidence penalty) and edits (ignored per D13). The reconciler also **self-heals** from render-ready crash-window states: when a knowledge file contains an ADR/PF anchor that is absent from the manifest *and* the section carries the `- **Source**: self-learning:` marker, the heal scans the log for `status: 'ready'` observations matching by normalized pattern (exactly one match = upgrade to `status: 'created'` and reconstruct manifest entry; zero or multiple matches = silently skipped). The marker check excludes pre-v2 seeded entries from the heal path so they cannot be falsely paired with a current ready obs. Loaded artifacts are reinforced locally (no LLM) on each session end. Single toggle mechanism: hook presence in `settings.json` IS the enabled state — no `enabled` field in `learning.json`. Toggleable via `devflow learn --enable/--disable/--status` or `devflow init --learn/--no-learn`. Configurable model/throttle/caps/debug via `devflow learn --configure`. Use `devflow learn --reset` to remove all artifacts + log + transient state. Use `devflow learn --purge` to remove invalid observations. Use `devflow learn --review` to inspect observations needing attention. Debug logs stored at `~/.devflow/logs/{project-slug}/`. The `knowledge-persistence` skill is a format specification only; the actual writer is `scripts/hooks/background-learning` via `json-helper.cjs render-ready`.
 
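Reviewer note: the Ambient Mode change above makes router loading conditional on depth. The sketch below illustrates that gate as a standalone TypeScript function. It is an illustration only: in the PR the decision is made by the model following the preamble sentence, not by code. `CLASSIFICATION_SKETCH`, `parseClassification`, and `shouldLoadRouter` are hypothetical names, and the regex is an assumption inferred from the output variations the tests exercise (case-insensitive, optional space after the colon, optional whitespace around the slash). The real `CLASSIFICATION_PATTERN` is not shown in this diff.

```typescript
// Hypothetical stand-in for CLASSIFICATION_PATTERN (the real one is not in this diff).
// Captures: [1] intent, [2] depth.
const CLASSIFICATION_SKETCH = /devflow:\s*([a-z]+)\s*\/\s*([a-z]+)/i;

interface Classification {
  intent: string;
  depth: string;
}

/** Parse a "Devflow: INTENT/DEPTH" tag out of free text, or return null. */
function parseClassification(text: string): Classification | null {
  const match = CLASSIFICATION_SKETCH.exec(text);
  if (!match) return null;
  return {
    intent: (match[1] ?? '').toUpperCase(),
    depth: (match[2] ?? '').toUpperCase(),
  };
}

/** Mirror of the new preamble rule: only GUIDED or ORCHESTRATED loads the router. */
function shouldLoadRouter(text: string): boolean {
  const classification = parseClassification(text);
  if (classification === null) return false;
  return classification.depth === 'GUIDED' || classification.depth === 'ORCHESTRATED';
}
```

Under this sketch a QUICK classification, or text with no classification tag at all, never triggers a router load, which matches the "no skills" integration tests for QUICK prompts.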
diff --git a/scripts/hooks/preamble b/scripts/hooks/preamble
index 234c050..2c46255 100755
--- a/scripts/hooks/preamble
+++ b/scripts/hooks/preamble
@@ -34,6 +34,6 @@ fi
 
 # Minimal preamble — classification rules injected at SessionStart, not here.
 # SYNC: must match tests/ambient.test.ts preamble drift detection
-PREAMBLE="Classify this request's intent and depth, then load devflow:router via Skill tool."
+PREAMBLE="Classify this request's intent and depth. If GUIDED or ORCHESTRATED, load devflow:router via Skill tool."
 
 json_prompt_output "$PREAMBLE"
diff --git a/scripts/hooks/session-start-classification b/scripts/hooks/session-start-classification
index aa13500..217495d 100755
--- a/scripts/hooks/session-start-classification
+++ b/scripts/hooks/session-start-classification
@@ -15,12 +15,9 @@ INPUT=$(cat)
 CWD=$(printf '%s' "$INPUT" | json_field "cwd" "")
 if [ -z "$CWD" ]; then exit 0; fi
 
-CLASSIFICATION_RULES="$HOME/.claude/skills/devflow:router/references/classification-rules.md"
+CLASSIFICATION_RULES="$HOME/.claude/skills/devflow:router/classification-rules.md"
 if [ -f "$CLASSIFICATION_RULES" ]; then
   CONTEXT=$(cat "$CLASSIFICATION_RULES")
-elif [ -f "$HOME/.claude/skills/devflow:router/SKILL.md" ]; then
-  # Fallback for upgrade window: old install without classification-rules.md
-  CONTEXT=$(awk '/^---$/{n++; next} n>=2' "$HOME/.claude/skills/devflow:router/SKILL.md")
 else
   exit 0
 fi
diff --git a/shared/skills/router/references/classification-rules.md b/shared/skills/router/classification-rules.md
similarity index 100%
rename from shared/skills/router/references/classification-rules.md
rename to shared/skills/router/classification-rules.md
diff --git a/tests/ambient.test.ts b/tests/ambient.test.ts
index 35ecd75..07eacf4 100644
--- a/tests/ambient.test.ts
+++ b/tests/ambient.test.ts
@@ -7,8 +7,8 @@ import {
   hasClassification,
   extractIntent,
   extractDepth,
-  hasDevFlowBranding,
   hasSkillInvocations,
+  parseStreamEvent,
 } from './integration/helpers.js';
 
 /** Helper to create a StreamResult from text for unit-testing classification helpers. */
@@ -421,12 +421,68 @@ describe('classification helpers', () => {
     expect(extractDepth(textResult('no classification here'))).toBeNull();
   });
 
-  it('detects Devflow branding', () => {
-    expect(hasDevFlowBranding(textResult('Devflow: IMPLEMENT/GUIDED. Loading: devflow:patterns.'))).toBe(true);
+  it('CLASSIFICATION_PATTERN matches model output variations', () => {
+    // Canonical format (model instruction says "Devflow: INTENT/DEPTH")
+    expect(hasClassification(textResult('Devflow: IMPLEMENT/GUIDED'))).toBe(true);
+    // Lowercase (model might vary casing)
+    expect(hasClassification(textResult('devflow: implement/guided'))).toBe(true);
+    // No space after colon
+    expect(hasClassification(textResult('Devflow:CHAT/QUICK'))).toBe(true);
+    // Extra whitespace
+    expect(hasClassification(textResult('Devflow: PLAN / ORCHESTRATED'))).toBe(true);
   });
+});
 
-  it('returns false for non-Devflow branding', () => {
-    expect(hasDevFlowBranding(textResult('Some random text without branding.'))).toBe(false);
+describe('parseStreamEvent', () => {
+  it('extracts skills from assistant tool_use events', () => {
+    const event = {
+      type: 'assistant',
+      message: { content: [
+        { type: 'tool_use', name: 'Skill', input: { skill: 'devflow:router' } },
+      ] },
+    };
+    const parsed = parseStreamEvent(event);
+    expect(parsed.skills).toEqual(['devflow:router']);
+    expect(parsed.textFragments).toEqual([]);
+  });
+
+  it('extracts text from assistant text blocks', () => {
+    const event = {
+      type: 'assistant',
+      message: { content: [
+        { type: 'text', text: 'Devflow: IMPLEMENT/GUIDED' },
+      ] },
+    };
+    const parsed = parseStreamEvent(event);
+    expect(parsed.textFragments).toEqual(['Devflow: IMPLEMENT/GUIDED']);
+    expect(parsed.skills).toEqual([]);
+  });
+
+  it('returns empty arrays for non-assistant events', () => {
+    expect(parseStreamEvent({ type: 'user', message: { content: [] } })).toEqual({ skills: [], textFragments: [] });
+    expect(parseStreamEvent({ type: 'system', message: { content: [] } })).toEqual({ skills: [], textFragments: [] });
+  });
+
+  it('returns empty arrays for malformed events', () => {
+    expect(parseStreamEvent(null)).toEqual({ skills: [], textFragments: [] });
+    expect(parseStreamEvent(undefined)).toEqual({ skills: [], textFragments: [] });
+    expect(parseStreamEvent({})).toEqual({ skills: [], textFragments: [] });
+    expect(parseStreamEvent({ type: 'assistant' })).toEqual({ skills: [], textFragments: [] });
+    expect(parseStreamEvent({ type: 'assistant', message: {} })).toEqual({ skills: [], textFragments: [] });
+  });
+
+  it('handles mixed content blocks', () => {
+    const event = {
+      type: 'assistant',
+      message: { content: [
+        { type: 'text', text: 'Devflow: DEBUG/ORCHESTRATED' },
+        { type: 'tool_use', name: 'Skill', input: { skill: 'devflow:debug:orch' } },
+        { type: 'text', text: 'Loading debug orchestrator.' },
+      ] },
+    };
+    const parsed = parseStreamEvent(event);
+    expect(parsed.skills).toEqual(['devflow:debug:orch']);
+    expect(parsed.textFragments).toEqual(['Devflow: DEBUG/ORCHESTRATED', 'Loading debug orchestrator.']);
   });
 });
 
@@ -489,7 +545,7 @@ function parseClassificationIntents(content: string): string[] {
 
 describe('router structural validation', () => {
   const routerPath = path.resolve(__dirname, '../shared/skills/router/SKILL.md');
-  const rulesPath = path.resolve(__dirname, '../shared/skills/router/references/classification-rules.md');
+  const rulesPath = path.resolve(__dirname, '../shared/skills/router/classification-rules.md');
   const sharedSkillsDir = path.resolve(__dirname, '../shared/skills');
 
   it('router covers all ORCHESTRATED intents (every non-CHAT intent has a row)', async () => {
@@ -594,7 +650,7 @@ describe('preamble drift detection', () => {
   });
 
   it('classification-rules.md contains required classification elements', async () => {
-    const rulesPath = path.resolve(__dirname, '../shared/skills/router/references/classification-rules.md');
+    const rulesPath = path.resolve(__dirname, '../shared/skills/router/classification-rules.md');
     const rulesContent = await fs.readFile(rulesPath, 'utf-8');
 
     // Must contain Intent Signals heading
diff --git a/tests/integration/ambient-activation.test.ts b/tests/integration/ambient-activation.test.ts
index ccd56a1..1163e28 100644
--- a/tests/integration/ambient-activation.test.ts
+++ b/tests/integration/ambient-activation.test.ts
@@ -4,6 +4,7 @@ import {
   runClaudeStreaming,
   runClaudeStreamingWithRetry,
   hasSkillInvocations,
+  hasClassification,
   getSkillInvocations,
   hasRequiredSkills,
 } from './helpers.js';
@@ -34,12 +35,14 @@ describe.skipIf(!isClaudeAvailable())('devflow classification', () => {
     // "thanks" is ≤2 words — preamble's word-count filter skips it before classification runs
     const result = await runClaudeStreaming('thanks', { timeout: 20000 });
     expect(hasSkillInvocations(result)).toBe(false);
+    expect(hasClassification(result)).toBe(false);
     console.log(`preamble filter (single-word): no skills (${result.durationMs}ms)`);
   });
 
   it('QUICK — explore: "where is the config?" loads no skills', async () => {
     const result = await runClaudeStreaming('where is the config file?', { timeout: 20000 });
     expect(hasSkillInvocations(result)).toBe(false);
+    expect(hasClassification(result)).toBe(false);
     console.log(`QUICK explore: no skills (${result.durationMs}ms)`);
   });
 
@@ -47,6 +50,7 @@ describe.skipIf(!isClaudeAvailable())('devflow classification', () => {
     // Passes preamble's word-count filter (>2 words) but classified CHAT/QUICK — no skills loaded
     const result = await runClaudeStreaming('sounds good, thanks for explaining that', { timeout: 20000 });
     expect(hasSkillInvocations(result)).toBe(false);
+    expect(hasClassification(result)).toBe(false);
     console.log(`CHAT/QUICK (multi-word): no skills (${result.durationMs}ms)`);
   });
 
@@ -54,6 +58,7 @@ describe.skipIf(!isClaudeAvailable())('devflow classification', () => {
     // Preamble filters prompts starting with "/" — no classification or skill loading
     const result = await runClaudeStreaming('/help with something', { timeout: 20000 });
     expect(hasSkillInvocations(result)).toBe(false);
+    expect(hasClassification(result)).toBe(false);
     console.log(`preamble filter (slash command): no skills (${result.durationMs}ms)`);
   });
 
diff --git a/tests/integration/helpers.ts b/tests/integration/helpers.ts
index dc02414..1dd2a66 100644
--- a/tests/integration/helpers.ts
+++ b/tests/integration/helpers.ts
@@ -21,13 +21,48 @@ export function isClaudeAvailable(): boolean {
  * Simulates SessionStart injection for integration tests.
  */
 function loadRouterContext(): string {
-  const rulesPath = resolve(import.meta.dirname, '../../shared/skills/router/references/classification-rules.md');
+  const rulesPath = resolve(import.meta.dirname, '../../shared/skills/router/classification-rules.md');
   return readFileSync(rulesPath, 'utf-8').trim();
 }
 
 // Simulates SessionStart injection (classification rules) + per-message preamble
 const DEVFLOW_PREAMBLE = loadRouterContext() +
-  '\nClassify this request\'s intent and depth, then load devflow:router via Skill tool.';
+  '\nClassify this request\'s intent and depth. If GUIDED or ORCHESTRATED, load devflow:router via Skill tool.';
+
+/** Parsed fields from a single streaming event */
+export interface ParsedStreamEvent {
+  skills: string[];
+  textFragments: string[];
+}
+
+/**
+ * Extract skill invocations and text fragments from a single streaming event.
+ * Only processes assistant messages with content arrays.
+ */
+export function parseStreamEvent(event: unknown): ParsedStreamEvent {
+  const skills: string[] = [];
+  const textFragments: string[] = [];
+
+  if (
+    typeof event !== 'object' || event === null ||
+    (event as Record<string, unknown>).type !== 'assistant' ||
+    !Array.isArray((event as { message?: { content?: unknown } }).message?.content)
+  ) {
+    return { skills, textFragments };
+  }
+
+  const msg = event as { type: string; message: { content: Record<string, unknown>[] } };
+  for (const block of msg.message.content) {
+    if (block.type === 'tool_use' && block.name === 'Skill' && typeof (block.input as Record<string, unknown>)?.skill === 'string') {
+      skills.push((block.input as Record<string, unknown>).skill as string);
+    }
+    if (block.type === 'text' && typeof block.text === 'string') {
+      textFragments.push(block.text as string);
+    }
+  }
+
+  return { skills, textFragments };
+}
 
 /** Result from a streaming claude invocation */
 export interface StreamResult {
@@ -106,33 +141,14 @@ export function runClaudeStreaming(
         if (!line.trim()) continue;
         try {
           const event: unknown = JSON.parse(line);
+          const parsed = parseStreamEvent(event);
+          skills.push(...parsed.skills);
+          textFragments.push(...parsed.textFragments);
 
-          // Detect Skill tool_use in assistant messages
-          if (
-            typeof event === 'object' && event !== null &&
-            (event as Record<string, unknown>).type === 'assistant' &&
-            Array.isArray((event as Record<string, unknown>).message?.content)
-          ) {
-            const msg = event as { type: string; message: { content: Record<string, unknown>[] } };
-            for (const block of msg.message.content) {
-              // tool_use block for Skill
-              if (block.type === 'tool_use' && block.name === 'Skill' && typeof (block.input as Record<string, unknown>)?.skill === 'string') {
-                skills.push((block.input as Record<string, unknown>).skill as string);
-              }
-              // text block — capture for classification detection
-              if (block.type === 'text' && typeof block.text === 'string') {
-                textFragments.push(block.text);
-              }
-            }
-
-            // Once we have skills, give a brief window for more, then finish
-            if (skills.length > 0 && !graceTimer) {
-              graceTimer = setTimeout(() => {
-                finish(true);
-              }, 8000); // 8s grace for additional skill loads after first detection
-            }
+          // Once we have skills, give a brief window for more, then finish
+          if (skills.length > 0 && !graceTimer) {
+            graceTimer = setTimeout(() => finish(true), 8000);
           }
-
         } catch {
           // Partial JSON line, skip
         }
@@ -210,19 +226,6 @@ export function extractDepth(result: StreamResult): string | null {
   return match ? match[2].toUpperCase() : null;
 }
 
-/**
- * Check whether the result contains a Devflow classification tag.
- *
- * @see hasClassification — functionally identical after both helpers were
- * unified on {@link CLASSIFICATION_PATTERN}. Kept as a distinct export so
- * existing test assertions that describe "branding presence" (vs. "a
- * classification exists") remain self-documenting at the call site.
- */
-export function hasDevFlowBranding(result: StreamResult): boolean {
-  const text = result.textFragments.join(' ');
-  return CLASSIFICATION_PATTERN.test(text);
-}
-
 /**
  * Check if required skills are present in the result.
  * Uses bounded matching: exact match, namespace-suffixed, or devflow-prefixed.
@@ -360,13 +363,18 @@ function parsePreloadedSkills(transcriptPath: string): string[] {
 }
 
 /**
- * Find the most recent subagent transcript written at or after `since` and
- * return the preloaded skill names from its initial user message.
+ * Find all subagent transcripts written at or after `since` and return the
+ * preloaded skill names from each transcript's initial user message.
+ *
+ * Returns one string[] per transcript. The caller can assert that at least one
+ * transcript contains the expected skills — this avoids a race condition where
+ * Claude spawns auxiliary subagents (e.g., Git) alongside the target agent,
+ * and the auxiliary transcript has a later mtime.
  *
- * Returns an empty array if no transcript is found or the directory structure
+ * Returns an empty array if no transcripts are found or the directory structure
  * has changed (graceful degradation).
  */
-export function getLatestSubagentPreloadedSkills(since: Date): string[] {
+export function getAllSubagentPreloadedSkills(since: Date): string[][] {
   const homeDir = process.env.HOME ?? process.env.USERPROFILE ?? '';
   const cwd = process.cwd();
   // Claude Code encodes the project path by replacing / with -
@@ -379,7 +387,7 @@ export function getLatestSubagentPreloadedSkills(since: Date): string[] {
     // Most recent transcript first
     transcripts.sort((a, b) => b.mtime.getTime() - a.mtime.getTime());
 
-    return parsePreloadedSkills(transcripts[0].path);
+    return transcripts.map((t) => parsePreloadedSkills(t.path));
   } catch {
     // Project dir doesn't exist or structure changed — return empty gracefully
     return [];
diff --git a/tests/integration/subagent-skill-preload.test.ts b/tests/integration/subagent-skill-preload.test.ts
index b8eea2d..aec68c5 100644
--- a/tests/integration/subagent-skill-preload.test.ts
+++ b/tests/integration/subagent-skill-preload.test.ts
@@ -2,25 +2,28 @@ import { describe, it, expect } from 'vitest';
 import {
   isClaudeAvailable,
   runClaudeAndWait,
-  getLatestSubagentPreloadedSkills,
+  getAllSubagentPreloadedSkills,
 } from './helpers.js';
 
 /**
- * Spawn an agent by name and return the preloaded skills from its transcript.
- * Asserts that a transcript was actually found before returning.
+ * Spawn an agent by name and return ALL subagent transcripts' preloaded skills.
+ *
+ * Returns string[][] — one skill list per transcript. The caller asserts that
+ * at least one transcript contains the expected skills, avoiding a race where
+ * Claude spawns auxiliary subagents whose transcript mtime beats the target's.
  */
-async function spawnAgentAndGetPreloads(agentType: string, prompt: string): Promise<string[]> {
+async function spawnAgentAndGetAllPreloads(agentType: string, prompt: string): Promise<string[][]> {
   const since = new Date();
   const result = await runClaudeAndWait(
     `Use the Agent tool with subagent_type="${agentType}" to ${prompt}. Only spawn the agent, do not do any other work.`,
     { timeout: 60000, model: 'haiku', allowedTools: 'Agent' },
   );
-  const preloaded = getLatestSubagentPreloadedSkills(since);
+  const allPreloads = getAllSubagentPreloadedSkills(since);
   expect(
-    preloaded.length,
+    allPreloads.length,
     `No subagent transcript found for ${agentType} (exit=${result.exitCode}, ${result.durationMs}ms, cwd=${process.cwd()})`,
   ).toBeGreaterThan(0);
-  return preloaded;
+  return allPreloads;
 }
 
 /**
@@ -39,43 +42,62 @@ async function spawnAgentAndGetPreloads(agentType: string, prompt: string): Prom
 
 describe.skipIf(!isClaudeAvailable())('subagent skill preload', () => {
   it('Simplifier preloads software-design and worktree-support', async () => {
-    const preloaded = await spawnAgentAndGetPreloads('Simplifier', 'simplify this trivial function: function add(a, b) { return a + b; }');
-    expect(preloaded).toEqual(expect.arrayContaining(['software-design', 'worktree-support']));
+    const allPreloads = await spawnAgentAndGetAllPreloads('Simplifier', 'simplify this trivial function: function add(a, b) { return a + b; }');
+    const expected = ['software-design', 'worktree-support'];
+    expect(
+      allPreloads.some((p) => expected.every((s) => p.includes(s))),
+      `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`,
+    ).toBe(true);
     // Simplifier must NOT have apply-knowledge (PR #182 explicit assertion)
-    expect(preloaded).not.toContain('apply-knowledge');
+    const simplifierTranscript = allPreloads.find((p) => expected.every((s) => p.includes(s)))!;
+    expect(simplifierTranscript).not.toContain('apply-knowledge');
   }, 90000);
 
   it('Scrutinizer preloads quality-gates, software-design, worktree-support, apply-knowledge', async () => {
-    const preloaded = await spawnAgentAndGetPreloads('Scrutinizer', 'evaluate this code: const x = 1;');
-    expect(preloaded).toEqual(expect.arrayContaining([
-      'quality-gates', 'software-design', 'worktree-support', 'apply-knowledge',
-    ]));
+    const allPreloads = await spawnAgentAndGetAllPreloads('Scrutinizer', 'evaluate this code: const x = 1;');
+    const expected = ['quality-gates', 'software-design', 'worktree-support', 'apply-knowledge'];
+    expect(
+      allPreloads.some((p) => expected.every((s) => p.includes(s))),
+      `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`,
+    ).toBe(true);
   }, 90000);
 
   it('Reviewer preloads review-methodology, worktree-support, apply-knowledge', async () => {
-    const preloaded = await spawnAgentAndGetPreloads('Reviewer', 'review this code: const y = 2;');
-    expect(preloaded).toEqual(expect.arrayContaining([
-      'review-methodology', 'worktree-support', 'apply-knowledge',
-    ]));
+    const allPreloads = await spawnAgentAndGetAllPreloads('Reviewer', 'review this code: const y = 2;');
+    const expected = ['review-methodology', 'worktree-support', 'apply-knowledge'];
+    expect(
+      allPreloads.some((p) => expected.every((s) => p.includes(s))),
+      `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`,
+    ).toBe(true);
   }, 90000);
 
   it('Coder preloads all 8 declared core skills', async () => {
-    const preloaded = await spawnAgentAndGetPreloads('Coder', 'implement a no-op task');
-    expect(preloaded).toEqual(expect.arrayContaining([
+    const allPreloads = await spawnAgentAndGetAllPreloads('Coder', 'implement a no-op task');
+    const expected = [
       'software-design', 'git', 'patterns', 'testing',
       'test-driven-development', 'research', 'boundary-validation', 'worktree-support',
-    ]));
+    ];
+    expect(
+      allPreloads.some((p) => expected.every((s) => p.includes(s))),
+      `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`,
+    ).toBe(true);
   }, 90000);
 
   it('Designer preloads worktree-support, apply-knowledge, gap-analysis, design-review', async () => {
-    const preloaded = await spawnAgentAndGetPreloads('Designer', 'analyze this design: "Add a cache layer."');
-    expect(preloaded).toEqual(expect.arrayContaining([
-      'worktree-support', 'apply-knowledge', 'gap-analysis', 'design-review',
-    ]));
+    const allPreloads = await spawnAgentAndGetAllPreloads('Designer', 'analyze this design: "Add a cache layer."');
+    const expected = ['worktree-support', 'apply-knowledge', 'gap-analysis', 'design-review'];
+    expect(
+      allPreloads.some((p) => expected.every((s) => p.includes(s))),
+      `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`,
+    ).toBe(true);
   }, 90000);
 
   it('Git agent preloads git and worktree-support', async () => {
-    const preloaded = await spawnAgentAndGetPreloads('Git', 'run git status');
-    expect(preloaded).toEqual(expect.arrayContaining(['git', 'worktree-support']));
+    const allPreloads = await spawnAgentAndGetAllPreloads('Git', 'run git status');
+    const expected = ['git', 'worktree-support'];
+    expect(
+      allPreloads.some((p) => expected.every((s) => p.includes(s))),
+      `No transcript contains ${expected.join(', ')}. Found: ${JSON.stringify(allPreloads)}`,
+    ).toBe(true);
   }, 90000);
 });
diff --git a/tests/skill-references.test.ts b/tests/skill-references.test.ts
index 6e1940e..b5428eb 100644
--- a/tests/skill-references.test.ts
+++ b/tests/skill-references.test.ts
@@ -703,7 +703,7 @@ describe('Test infrastructure skill references', () => {
   it('DEVFLOW_PREAMBLE reads classification-rules.md which has valid refs', () => {
     // helpers.ts loads DEVFLOW_PREAMBLE from classification-rules.md at runtime.
     // Verify the classification rules reference devflow:router (loaded via Skill tool).
-    const rulesPath = path.join(ROOT, 'shared', 'skills', 'router', 'references', 'classification-rules.md');
+    const rulesPath = path.join(ROOT, 'shared', 'skills', 'router', 'classification-rules.md');
     const rulesContent = readFileSync(rulesPath, 'utf-8');
     const rulesRefs = extractPrefixedRefs(rulesContent);
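Reviewer note: the subagent preload tests repeat the same "some transcript contains every expected skill" check six times. A minimal sketch of that check as a shared helper follows. `findTranscriptWithSkills` is a hypothetical name that does not appear in the PR, and the sample data in the usage lines is invented for illustration.

```typescript
/**
 * Return the first transcript whose preloaded skills include every expected
 * skill, or undefined when no transcript qualifies.
 * Hypothetical helper: not part of tests/integration/helpers.ts in this diff.
 */
function findTranscriptWithSkills(
  allPreloads: string[][],
  expected: string[],
): string[] | undefined {
  return allPreloads.find((preloads) =>
    expected.every((skill) => preloads.includes(skill)),
  );
}

// Illustrative data: an auxiliary Git transcript alongside the target agent's.
const samplePreloads: string[][] = [
  ['git', 'worktree-support'],
  ['software-design', 'worktree-support'],
];

const sampleMatch = findTranscriptWithSkills(samplePreloads, [
  'software-design',
  'worktree-support',
]);
```

Returning `undefined` instead of using a non-null assertion would let a test such as the Simplifier case fail with an explicit message when no transcript matches, before it checks that `apply-knowledge` is absent.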