Skip to content

feat: per-feature knowledge bases#193

Open
dean0x wants to merge 44 commits intomainfrom
feat/per-feature-knowledge-bases
Open

feat: per-feature knowledge bases#193
dean0x wants to merge 44 commits intomainfrom
feat/per-feature-knowledge-bases

Conversation

@dean0x
Copy link
Copy Markdown
Owner

@dean0x dean0x commented Apr 23, 2026

Adds a new knowledge layer where each feature area gets a KNOWLEDGE.md file in .features/, with creation during planning, loading across all workflows, and staleness detection.

Changes

  • Runtime module (feature-kb.cjs) — loads and caches KNOWLEDGE.md files per feature area
  • Two skills (feature-kb, apply-feature-kb) — KB creation and consumption across workflows
  • KB Builder agent — generates KNOWLEDGE.md during planning phase
  • CLI command (devflow kb) — manual KB creation and inspection
  • Integration across orchestration — wired into all 8 orchestration skills (implement:orch, explore:orch, plan:orch, review:orch, resolve:orch, debug:orch, pipeline:orch)
  • Agent integration — loaded by 5 key agents (Coder, Reviewer, Resolver, Designer, Evaluator)
  • Base command integration — available in 6 base commands (implement, code-review, resolve, plan, debug, self-review)
  • Teams variants — 5 command teams variants updated (implement-teams, code-review-teams, resolve-teams, plan-teams, debug-teams)

Breaking Changes

None.

Testing

  • Knowledge base creation tested during planning phase
  • Loading across workflows tested in integration flows
  • Staleness detection verified with timestamp checks

Related Issues

None.

Dean Sharon and others added 5 commits April 23, 2026 17:30
Runtime module, two skills, one agent, CLI command, plugin registration,
init integration, and tests for the per-feature knowledge base system.

- scripts/hooks/lib/feature-kb.cjs: CJS runtime for KB CRUD, staleness
  detection via git log, mkdir-based lock protocol, and CLI interface
- shared/skills/feature-kb/SKILL.md: 4-phase KB creation skill (Scan,
  Extract, Distill, Forge) with category templates and quality checks
- shared/skills/apply-feature-kb/SKILL.md: 3-step consumption algorithm
  for FEATURE_KNOWLEDGE variable with staleness handling and skip guard
- shared/agents/kb-builder.md: Sonnet agent that structures exploration
  outputs into KNOWLEDGE.md files under .features/{slug}/
- src/cli/commands/kb.ts: devflow kb list|create|check|refresh|remove
  subcommands with @clack/prompts TUI and claude -p integration
- Plugin registration: feature-kb + apply-feature-kb skills added to
  core-skills, plan, ambient; apply-feature-kb to implement, code-review,
  resolve, debug, self-review; kb-builder agent to plan and ambient
- Init integration: creates .features/index.json on local-scope installs;
  adds .features/.kb.lock to .gitignore entries
- 41 new tests across 4 test files (all 1108 tests pass)
- Security: execFileSync with array args for git commands (fixes CWE-78)

Co-Authored-By: Claude <noreply@anthropic.com>
…ills, agents, and commands

Integrates the feature-kb system into all 7 orchestration skills, 5 shared
agents, 6 base commands, and 5 teams command variants. Each component loads
relevant feature KBs from .features/index.json based on task context, passes
FEATURE_KNOWLEDGE to downstream agents, and marks KBs stale after files change.

- plan:orch: Phase 0.5 loads KBs; Phase 8.5 spawns KB Builder for new areas
- implement:orch: Phase 1.5 loads KBs; Phase 6 marks stale after changes
- review:orch: Phase 2b loads KBs; Reviewers receive FEATURE_KNOWLEDGE
- resolve:orch: Phase 1.5 loads KBs; Resolvers receive FEATURE_KNOWLEDGE
- debug:orch, explore:orch: load KBs orchestrator-local (not passed to workers)
- quality-gates: P0-Design pillar checks FEATURE_KNOWLEDGE compliance
- Agents (coder, reviewer, designer, resolver, scrutinizer): add apply-feature-kb skill + input field
Security (P0): Add validateSlug() with kebab-case enforcement to prevent
path traversal via malicious slugs in loadKBContent, checkStaleness,
updateIndex, removeEntry. CLI commands (kb create, kb remove) validate
at boundary before any filesystem operations.

Functionality (P0): Remove scope === 'local' guard on .features/ directory
creation in init.ts -- .features/ is committed to the project repo and
should be created whenever a gitRoot exists, regardless of install scope.

Documentation (P2): Update CLAUDE.md skill count (41 -> 44) and agent
count (12 -> 13) to reflect new feature-kb/apply-feature-kb skills and
kb-builder agent.

Tests: Add 6 validateSlug tests covering valid slugs, path traversal,
slashes, dot-prefix, and non-kebab-case rejection.
Distributes KB Builder agent to devflow-ambient and devflow-plan
plugins at build time.

Co-Authored-By: Claude <noreply@anthropic.com>
@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 23, 2026

Code Review: High-Confidence Issues (≥80%)

Consolidated findings from 8 specialized reviewers. These are BLOCKING issues that should be fixed before merge.

1. Security: Hardcoded --dangerously-skip-permissions (92%)

Files: src/cli/commands/kb.ts:206-207, src/cli/commands/kb.ts:270-285

The kb create and kb refresh commands spawn claude -p with unrestricted permissions, bypassing all safety prompts. For a user-initiated CLI command, document the elevated permissions in the spinner message or prompt for confirmation.

Suggestion: Update spinner to show: 'Running KB Builder agent (unrestricted tool access)...'


2. Security: User Input Interpolated Without Sanitization (85%)

File: src/cli/commands/kb.ts:178-200

User-provided name is interpolated into the LLM prompt without validation. Combined with --dangerously-skip-permissions, this creates a prompt injection surface.

Fix: Validate name to alphanumeric + spaces + hyphens only:

validate: (v) => {
  if (v.trim().length < 3) return 'Name must be at least 3 characters';
  if (!/^[a-zA-Z0-9 \-]+$/.test(v.trim())) return 'Name must contain only letters, numbers, spaces, and hyphens';
  return undefined;
},

3. Regression: Missing Directory Creation (90%)

Files: scripts/hooks/lib/feature-kb.cjs:227, scripts/hooks/lib/feature-kb.cjs:291

Both updateIndex and removeEntry will throw ENOENT if .features/ directory doesn't exist. Lock acquisition fails when called from worktrees without .features/ initialized.

Fix: Add before acquireLock in both functions:

fs.mkdirSync(featuresDir, { recursive: true });

4. Performance: N+1 Process Spawning (92%)

File: scripts/hooks/lib/feature-kb.cjs:153-161

checkAllStaleness spawns N+2 redundant processes per slug—each call re-reads index.json and spawns git rev-parse (invariant). For 20 KBs this adds 300-600ms of blocking I/O.

Fix: Hoist loadIndex and git rev-parse above the loop, reuse for all slug checks.


5. Complexity: Sequential if Blocks Instead of Dispatch Table (85%)

File: scripts/hooks/lib/feature-kb.cjs:339-463

CLI interface uses 5 sequential if (subcommand === '...') blocks with duplicated validation. Cyclomatic complexity ~12; could be ~60 lines with a dispatch table.

Fix: Extract worktree validation helper and use a command map.


6. Consistency: Missing FEATURE_KNOWLEDGE in Coder Templates (92%)

Files: plugins/devflow-implement/commands/implement.md:117-169, plugins/devflow-implement/commands/implement-teams.md:110-162

FEATURE_KNOWLEDGE is added to SINGLE_CODER but omitted from SEQUENTIAL_CODERS and PARALLEL_CODERS templates. ~20% of implement invocations silently drop feature knowledge context.

Fix: Add FEATURE_KNOWLEDGE: {feature_knowledge} to all four Coder spawn templates in both files.


7. Consistency: Missing Phase 14.5 in Architecture Diagram (95%)

Files: plugins/devflow-plan/commands/plan.md:428-486, plugins/devflow-plan/commands/plan-teams.md:480-528

Phase 14.5 (Feature KB Generation) exists in documentation but is missing from the ASCII architecture diagram. The orchestrator may skip this phase.

Fix: Add Phase 14.5 to the diagram under Block 6: Output, after Phase 14.


8. TypeScript: Missing validateSlug in refresh Command (85%)

File: src/cli/commands/kb.ts:242

The create and remove commands validate slugs upfront, but refresh does not. Malformed slugs will cause confusing internal errors instead of user-facing validation messages.

Fix: Add validation at the start of the refresh action handler.


Lower-Confidence Suggestions (60-79%)

These should be addressed but are not blocking:

  • Missing validateSlug in specific commands (82%)
  • Inconsistent FEATURE_KNOWLEDGE loading instructions (84%)
  • Duplicate feature-kb in plugin manifests (85%)
  • acquireLock deeply nested error recovery (82%)
  • removeEntry early return inside finally (90%)

See .docs/reviews/feat-per-feature-knowledge-bases/2026-04-23_1941/ for complete review reports and detailed context.


Authored by: Claude Code Review Git Agent
Review Depth: Full multi-domain (Architecture, Complexity, Consistency, Performance, Regression, Security, Testing, TypeScript)

Dean Sharon and others added 9 commits April 24, 2026 12:44
- H5: Fix N+1 in checkAllStaleness — load index + git-dir check once
- H6: Fix updateIndex crash on missing .features/ with mkdirSync
- H7: Fix removeEntry crash on missing .features/ with early return
- H8/H9: Refactor CLI to dispatch table with requireWorktree validation
- M2: Fix findOverlapping prefix matching to use directory boundary
- M3: Simplify removeEntry control flow (no early return inside lock)
- S2: Extract tryBreakStaleLock helper from acquireLock
- S3: Rename markStale → findOverlapping (more accurate name)
- S4: Add requireWorktree() to validate CLI worktree arg is a directory
- T1: Add optional lockTimeoutMs param to updateIndex/removeEntry

Co-Authored-By: Claude <noreply@anthropic.com>
- H1: Remove Bash from --allowedTools in kb create/refresh claude invocations
  (KB Builder only needs Read,Grep,Glob,Write — Bash is unnecessary surface)
- H8: Add validateSlug() call in refresh action when slug is provided
- Update FeatureKbModule interface to reflect findOverlapping rename

Co-Authored-By: Claude <noreply@anthropic.com>
- M4: Remove feature-kb from devflow-core-skills (it belongs in plan/ambient)
- M5: Remove trigger:agent-loaded from apply-feature-kb (no such trigger exists)
- M6: Simplify apply-feature-kb allowed-tools to inline format matching apply-knowledge
- S1: Add apply-feature-kb, apply-knowledge to kb-builder agent skills list

Co-Authored-By: Claude <noreply@anthropic.com>
Constrain background claude -p invocations to the minimum required tool set:
- background-learning: --allowedTools 'Read' (analysis only, no writes)
- background-memory-update: --allowedTools 'Read,Write' (reads context, writes WORKING-MEMORY.md)

Pairing --dangerously-skip-permissions with --allowedTools provides a tighter
permission boundary without requiring interactive confirmation prompts.

Co-Authored-By: Claude <noreply@anthropic.com>
…gin.json sync

- Rename markStale → findOverlapping in feature-kb.test.ts import and describe block
- Add boundary-matching test case for findOverlapping (no false positives on prefix match)
- Remove feature-kb from devflow-core-skills plugin.json (sync with plugins.ts fix)

Co-Authored-By: Claude <noreply@anthropic.com>
Rename fractional and lettered phase designators to sequential integers
across all 6 orch skills and 2 plan command files:

- plan:orch: Phase 0→1, 0.5→2, 0.6→3, 1→4, 2→5, 3→6, 4→7, 5→8, 6→9, 7→10, 8→11, 8.5→12
- implement:orch: Phase 1.5→2, 2→3, 3→4, 4→5, 5→6, 6→7; also renames
  mark-stale to find-overlapping per updated CJS API
- explore:orch: Phase 0.5→1, 1→2, 2→3, 3→4, 4→5
- resolve:orch: Phase 1.5→2, 2→3, 3→4, 4→5, 5→6, 6→7
- review:orch: Phase 2b→3, 3→4, 4→5, 5→6, 6→7
- debug:orch: Phase 0→1, 1→2, 2→3, 3→4, 4→5, 5→6
- plan.md, plan-teams.md: Phase 14.5→15

All cross-references, Phase Completion Checklists, and descriptive text
updated to match. Tests updated to reference new phase numbers.

Co-Authored-By: Claude <noreply@anthropic.com>
Dean Sharon and others added 3 commits April 24, 2026 13:05
…d CLI tests (T1-T6)

- T1: lock failure test — throws when .kb.lock dir is pre-created
- T2: positive staleness in a real git repo — detects changed referenced files
- T3: directory boundary matching — referencedFiles prefix vs exact-file disambiguation
- T4: updateIndex creates missing .features/ directory automatically
- T5: removeEntry is a no-op when .features/ directory does not exist
- T6: CLI tests for unknown subcommand, invalid worktree, find-overlapping

Co-Authored-By: Claude <noreply@anthropic.com>
- P0: Restore Bash to KB_AGENT_TOOLS — the KB Builder agent needs Bash
  to run `node scripts/hooks/lib/feature-kb.cjs update-index` after
  writing KNOWLEDGE.md. Previous hardening (e2d8a09) incorrectly
  removed it, breaking the create/refresh CLI flows.
- P1: Commit uncommitted Coder refactoring (NOT_STALE sentinel,
  parseGitChangedFiles helper, exitOnInvalidSlug DRY, nullish coalescing).
- P2: Fix "mark stale" wording in implement commands to match the actual
  operation name "find-overlapping" (consistency with implement:orch).
@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 24, 2026

Review Finding: plan:orch GUIDED step numbering duplicated

File: shared/skills/plan:orch/SKILL.md
Lines: 27-31
Confidence: 92-95%
Category: Documentation (BLOCKING)

The GUIDED Behavior section was partially renumbered in the phase refactoring. Steps 1 and 2 were renumbered, but subsequent steps restarted from 1 instead of continuing the sequence:

1. **Discover** — ...
2. **Load Feature KBs** — ...
1. **Spawn Skimmer** — ...  ← Should be 3
2. **Design** — ...          ← Should be 4  
3. **Present** — ...         ← Should be 5

This produces duplicate step numbers (1, 2, 1, 2, 3) that will confuse agents following the GUIDED path.

Fix: Continue the sequence:

1. **Discover** — ...
2. **Load Feature KBs** — ...
3. **Spawn Skimmer** — ...
4. **Design** — ...
5. **Present** — ...

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 24, 2026

Review Finding: pipeline:orch phase ranges stale after orch renumbering

File: shared/skills/pipeline:orch/SKILL.md
Lines: 35, 56, 78
Confidence: 95%
Category: Documentation (BLOCKING)

All three sub-orchestrator phase ranges say "Phases 1-6" but after the fractional-to-integer renumbering, each now has 7 phases. The descriptions also omit the new knowledge-loading phase:

  • Line 35: implement:orch "Phases 1-6" → Should be "Phases 1-7 (pre-flight → load feature knowledge → ...→ completion)"
  • Line 56: review:orch "Phases 1-6" → Should be "Phases 1-7 (pre-flight → incremental detection → load knowledge → ...→ finalize)"
  • Line 78: resolve:orch "Phases 1-6" → Should be "Phases 1-7 (target review directory → load project knowledge → ...→ report)"

This inconsistency could cause agents to skip Phase 7 (Completion/Finalize/Report) during pipeline runs.

Fix: Update each parenthetical to include the knowledge phase and correct the phase count from 6 to 7.

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 24, 2026

Review Finding: review:orch "Phase 3" reference stale after renumbering

File: shared/skills/review:orch/SKILL.md
Line: 99
Confidence: 92%
Category: Documentation (BLOCKING)

After the phase renumbering, file analysis moved from Phase 3 to Phase 4. The conditional reviewers description still references the old phase:

**Conditional reviewers** (from Phase 3 file analysis):

Other references in the same file were correctly updated (line 62 says "Phase 4 file analysis", lines 107-108 correctly reference "Phase 3" for knowledge loading). This is an incomplete migration within the same file.

Fix: Change to **Conditional reviewers** (from Phase 4 file analysis):

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 24, 2026

Review Finding: checkAllStaleness duplicates staleness logic from checkStaleness

File: scripts/hooks/lib/feature-kb.cjs
Lines: 166-198
Confidence: 85%
Category: Architecture (HIGH)

The refactored checkAllStaleness inlines the git-log staleness check logic rather than delegating to checkStaleness. While the N+1 optimization (checking git-dir once instead of N times) is sound, the implementation duplicates the core staleness algorithm:

  • Both functions call git log --after=${lastUpdated} --name-only --pretty=format: -- ...files
  • Both parse results with the same logic
  • Both return the same shape { stale, changedFiles }

This violates DRY — if the staleness algorithm changes, two code paths must be updated in lockstep.

Fix: Extract the per-entry staleness computation into a shared helper (e.g., checkEntryStaleness):

function checkEntryStaleness(worktreePath, entry) {
  const files = entry.referencedFiles || [];
  if (files.length === 0) return NOT_STALE;
  try {
    const result = execFileSync('git',
      ['log', `--after=${entry.lastUpdated}`, '--name-only', '--pretty=format:', '--', ...files],
      { cwd: worktreePath, encoding: 'utf8', stdio: ['pipe', 'pipe', 'pipe'] });
    return { stale: parseGitChangedFiles(result).length > 0, changedFiles: parseGitChangedFiles(result) };
  } catch { return NOT_STALE; }
}

Then both checkStaleness and checkAllStaleness call this shared helper.

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 24, 2026

Review Finding: removeEntry behavior change on corrupt index uncovered

File: scripts/hooks/lib/feature-kb.cjs
Line: 355
Confidence: 90%
Category: Testing (HIGH)

The old removeEntry returned early when index.json could not be parsed (} catch { return; }). The new code catches the parse error silently (} catch { /* no index to modify */ }) and falls through, which writes { version: 1, features: {} } to disk.

This means a corrupt but recoverable index.json will be silently overwritten with an empty index. This behavioral change is a breaking semantic shift from "no-op when nothing to remove" to "overwrite with empty index when index is corrupt or missing."

There is no test covering this new behavior, which represents an untested code path.

Fix: Either:

  1. Restore the early-return on parse failure (preserve old behavior), or
  2. Add a test that explicitly documents and validates the new behavior:
it('overwrites corrupt index.json with empty index on remove', () => {
  const tmp = makeTmpFeatureWorktree();
  writeFileSync(path.join(tmp, '.features', 'index.json'), 'not-valid-json');
  removeEntry(tmp, 'nonexistent');
  const index = loadIndex(tmp);
  expect(index).toEqual({ version: 1, features: {} });
});

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 24, 2026

Review Finding: Broad tool surface in KB Builder spawning

File: src/cli/commands/kb.ts
Lines: 214-218, 301-305
Confidence: 82%
Category: Security (MEDIUM)

The kb create and kb refresh commands spawn claude -p with --dangerously-skip-permissions and a broad --allowedTools Read,Grep,Glob,Write,Bash set. The combination of:

  • --dangerously-skip-permissions (no permission prompts)
  • Bash tool access (execute arbitrary shell commands)
  • User-supplied prompt content (slug, name, worktreePath, directoriesRaw)

...means a specially crafted slug or name could inject prompt instructions that influence the spawned agent's behavior.

The KB Builder agent's frontmatter declares exactly Read,Grep,Glob,Write tools. Bash is not declared, suggesting it may not be necessary.

Fix: Review whether Bash is actually required for KB Builder execution. If the node scripts/hooks/lib/feature-kb.cjs update-index call must happen within the agent, document this as an accepted risk with an explicit code comment (as done elsewhere in the codebase). If Bash can be avoided, restrict --allowedTools to exclude it.

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 24, 2026

Code Review Summary

Date: 2026-04-24 | Branch: feat/per-feature-knowledge-bases → main | Reviewers: 9 specialists

Blocking Issues (≥80% confidence)

All identified via inline comments above. Key areas:

  • Documentation: 4 phase reference mismatches (plan:orch GUIDED, pipeline:orch ranges, review:orch cross-ref)
  • Architecture: 1 DRY violation (checkAllStaleness logic duplication)
  • Testing: 1 untested behavior change (removeEntry on corrupt index)
  • Security: 1 overly broad tool surface (KB Builder Bash access)

Should-Fix Issues (≥80% confidence, not blocking)

file-organization.md shared agent count stale (95% confidence)

  • File: docs/reference/file-organization.md
  • Lines: 18, 141
  • Issue: Lists "12 shared agents" but there are 13 (missing kb-builder)
  • Fix: Update both locations to reflect 13 agents including kb-builder

Lock failure test takes 515ms (82% confidence)

  • File: tests/feature-kb/feature-kb.test.ts:221-239
  • Issue: Slowest test by 3x due to 500ms timeout; suite will degrade with more similar tests
  • Fix: Reduce timeout to 200ms (still provides 2 retry cycles)

Non-null assertions in test code (82% confidence)

  • File: tests/feature-kb/feature-kb.test.ts:179,195,214,258,275,285
  • Issue: Multiple index!.features[...] without guards; fails with opaque "Cannot read properties of null" if loadIndex returns null
  • Fix: Add explicit null checks: expect(index).not.toBeNull() before assertions

Pre-Existing Issues (Not blocking, noted for context)

acquireLock busy-wait fallback (80-82% confidence)

  • File: scripts/hooks/lib/feature-kb.cjs:242-243
  • Issue: Comment says "Node < 16 fallback: busy-wait" but catch block is empty; would spin at 100% CPU on older runtimes
  • Context: Not new to this PR but touched by refactoring

CONTRIBUTING.md shared agents count (95% confidence)

  • File: CONTRIBUTING.md:28
  • Issue: Still says "12 shared agents"; not touched by this PR, recommend separate fix

Lower-Confidence Findings (60-79% confidence)

These warrant consideration but are not critical:

  • User-supplied feature name unescaped in prompt (65% confidence) — Slug validation is thorough and slug value is user-provided (low risk), but defense-in-depth could escape prompt strings
  • checkAllStaleness positive-path test coverage gap (85% confidence) — The refactored git-log path has no direct test; covered indirectly through checkStaleness but new code path should have explicit coverage
  • tryBreakStaleLock extracted but not directly tested (80% confidence) — Function is only exercised indirectly through lock-failure test; stale-lock-breaking path untested
  • ** asymmetric with trailing slashes** (70% confidence) — No validation prevents trailing slashes in referencedFiles; current tests use no-slash convention

Positive Findings

The codebase demonstrates strong practices:

  • Security: execFileSync with array args throughout (prevents shell injection)
  • Validation: validateSlug is thorough with path traversal prevention and test coverage
  • Testing: 85 new tests, all passing; good fixture patterns and cleanup
  • Performance: N+1 git-dir optimization correct and achieved
  • Architecture: 12+ refactorings improve readability (dispatch table, DRY helpers, renames)

Overall Assessment: CHANGES_REQUESTED — Approve after addressing the 6 blocking issues above (mostly documentation/test coverage fixes). The functional changes are sound and well-tested.

Claude Code | devflow:git

Dean Sharon and others added 3 commits April 24, 2026 17:41
…view:orch

- plan:orch GUIDED section: renumber steps 1,2,3 → 3,4,5 to fix duplicate numbering after prior renumbering of first two steps
- pipeline:orch: update all three sub-orchestrator descriptions from "Phases 1-6" to "Phases 1-7" and add the knowledge-loading phase to each phase name list
- review:orch: fix Phase 5 conditional reviewer source reference from "Phase 3" to "Phase 4" after file analysis was renumbered
…try early-return

Extract a private `checkEntryFiles(worktreePath, entry)` helper that runs the
git-log staleness check for a single entry. Both `checkStaleness` and
`checkAllStaleness` now delegate to it, eliminating the duplicated git-log +
parseGitChangedFiles pattern. `checkAllStaleness` retains its git-dir-once
optimization — only the per-entry loop body changes.

Also restore early-return in `removeEntry`'s parse-failure catch block: when
the index file is absent or corrupt, release the lock and return immediately
rather than falling through to write an empty index that could clobber a
recoverable file.

Co-Authored-By: Claude <noreply@anthropic.com>
Rename shared/agents/kb-builder.md → knowledge.md and update all
references across plugin manifests, CLI source, commands, and skills.
Update shared agent count from 12 to 13 in file-organization.md.

Co-Authored-By: Claude <noreply@anthropic.com>
Comment thread tests/feature-kb/feature-kb.test.ts Outdated
});

it('exits non-zero and prints usage when worktree argument is missing', () => {
let threw = false;
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-Idiomatic Test Pattern (85% confidence, HIGH)

Five tests use manual try/catch with a threw boolean flag instead of expect().toThrow(), which is already used elsewhere in this file.

Occurrences: Lines 509, 525, 538, 549, 561 (5 total)

Fix: Replace with expect(() => execFileSync(...)).toThrow()

Comment thread scripts/hooks/background-kb-refresh Outdated
# Read existing KB content
KB_PATH="$CWD/.features/$SLUG/KNOWLEDGE.md"
[ -f "$KB_PATH" ] || { log "KB file missing for $SLUG — skipping"; continue; }
EXISTING=$(cat "$KB_PATH")
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prompt Size Risk: Full KB Content in Shell Variable (85% confidence, MEDIUM)

The entire KNOWLEDGE.md (~500 lines = ~25KB) is read into a shell variable and passed as CLI arg to claude -p. Could fail on systems with low ARG_MAX limits.

Fix: Pass the prompt via stdin instead of as a CLI argument.

Comment thread scripts/hooks/background-kb-refresh Outdated
# invocation with Sonnet to refresh up to 3 stale KBs.
# On failure: logs error, does nothing (missing updates are better than corrupt data).

set -e
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error Handling Fragility: set -e with Background Processes (82% confidence, MEDIUM)

The script uses set -e but relies on || true patterns throughout. The interaction with backgrounded wait is notoriously fragile across bash versions.

Fix: Add || true to the wait call: wait "$CLAUDE_PID" 2>/dev/null || true

[ -z "$CWD" ] && exit 0

# Quick guards
[ -f "$CWD/.features/index.json" ] || exit 0
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State Consistency Risk: Dual-Gate System (80% confidence, MEDIUM)

KB feature is gated by two independent mechanisms: (1) hook presence in settings.json and (2) .features/.disabled sentinel. If they drift out of sync, state becomes inconsistent.

Fix: Document the dual-gate design or have the hook check both conditions.

--createdBy=\"devflow-kb\""

# Spawn claude -p with 180s watchdog (POSIX-compatible, no GNU timeout)
DEVFLOW_BG_KB_REFRESH=1 "$CLAUDE_BIN" -p \
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broad Tool Surface: Unrestricted Bash Access (84% confidence, MEDIUM)

Background KB refresh runs with --dangerously-skip-permissions --allowedTools 'Read,Grep,Glob,Write,Bash'. This allows arbitrary shell execution on repository-committed data.

Consider: Can Bash be removed or restricted to only node scripts/hooks/lib/feature-kb.cjs update-index?

Comment thread src/cli/commands/kb.ts
s.start(`Refreshing ${kbSlug}...`);

const staleInfo = featureKb.checkStaleness(worktreePath, kbSlug);
const kbEntry = kbs.find((k: { slug: string }) => k.slug === kbSlug);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeScript: Redundant Type Annotation (85% confidence, MEDIUM)

The callback (k: { slug: string }) explicitly annotates a narrower type than the array element type already inferred from FeatureKbModule interface (line 22).

Fix: const kbEntry = kbs.find((k) => k.slug === kbSlug);

Copy link
Copy Markdown
Owner Author

@dean0x dean0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary: Lower-Confidence Findings (60-79%)

The following suggestions were found at 60-79% confidence and are NOT blocking:

  • Lock directory race window in background-kb-refresh (65%) - Refresh timestamp written immediately after lock acquire, before work completes
  • Session-start memory creates .memory/knowledge/ without validation (62%) - Self-healing on every session start could create symlink targets
  • Watchdog PID leak on set -e failure (65%) - If main script exits early, watchdog sleep may linger 180s
  • Explore:orch asymmetric knowledge pattern (70%) - Differs from plan:orch, needs documentation
  • init.ts KB enable/disable has no re-entrancy guard (62%) - Possible race on concurrent devflow init calls
  • Magic number 7200 in session-end-kb-refresh throttle (65%) - Should be THROTTLE_SECONDS=7200 constant
  • Repeated index boilerplate pattern (70%) - JSON.stringify pattern duplicated in init.ts and kb.ts
  • Duplicate test coverage (82%) - CLI subcommand tests overlap between feature-kb.test.ts and kb-command.test.ts; consolidate into one location

Overall Assessment: 8 high-confidence blocking/should-fix issues identified across security, architecture, complexity, testing, and TypeScript categories. All are addressable without major architectural changes. The feature KB system demonstrates solid defensive patterns and thoughtful integration across the codebase.

Dean Sharon added 4 commits April 27, 2026 11:38
Remove set -e from all 3 background hooks (existing || true guards
suffice). Remove Bash from KB agent allowedTools to prevent prompt
injection via shell commands. Switch to sidecar pattern: agent writes
.refresh-result.json / .create-result.json, caller reads it and
updates index directly. Pass pre-computed stale slugs from
session-end to background hook. Touch lock between iterations to
prevent watchdog from breaking it during sequential refreshes.
Replace 4 instances of let-threw-false pattern with
expect().toThrow(expect.objectContaining({ status: 1 })). Remove 5
duplicate stale-slugs/refresh-context tests from kb-command.test.ts
(kept in feature-kb.test.ts with stronger assertions).
6 tests covering all guard clauses: DEVFLOW_BG_KB_REFRESH recursion,
DEVFLOW_BG_UPDATER background, missing index.json, .disabled
sentinel, throttle marker, and no-stale-KBs (non-git worktree).
…tion

Add knowledge-context.cjs index invocation to plan:orch GUIDED step 2
so GUIDED plans have the same knowledge context as ORCHESTRATED.
Update file-organization.md: skills count 41→44, add KB hooks and
lib/ directory to source tree, add KB hook rows to hooks table,
mention KB SessionEnd hook in settings override.
@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

scripts/hooks/background-kb-refresh:163-168

Shell injection via sidecar path interpolation

The $SIDECAR variable is interpolated directly into a JavaScript string literal. If the path contains a single quote (possible if $CWD has one), the JS string breaks and executes arbitrary code. Flagged by: Security, Architecture, Performance reviewers (85%+ confidence).

Fix: Pass the path via environment variable instead:

REF_FILES=$(SIDECAR_PATH="$SIDECAR" node -e "
  try {
    const d = JSON.parse(require('fs').readFileSync(process.env.SIDECAR_PATH,'utf8'));
    console.log(JSON.stringify(d.referencedFiles || []));
  } catch { console.log('[]'); }
" 2>/dev/null || echo "[]")

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

scripts/hooks/background-kb-refresh:8-13

Missing error guards after set -e removal

set -e was removed but critical setup (sourcing log-paths, get-mtime) has no guards. If source "$SCRIPT_DIR/log-paths" fails, downstream log calls and file redirects fail silently. Flagged by: Security, Performance, Regression reviewers (82-83%+ confidence).

Fix: Add explicit error checks after sourcing:

source "$SCRIPT_DIR/log-paths" || { echo "Failed to source log-paths" >&2; exit 1; }
source "$SCRIPT_DIR/get-mtime" || { echo "Failed to source get-mtime" >&2; exit 1; }

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

src/cli/commands/kb.ts:412

Missing --model pin in kb create command

Unlike the background refresh (which pins --model sonnet), kb create spawns claude -p without --model. This means cost/behavior varies with user's default model. Flagged by: Security reviewer (82% confidence).

Fix: Add '--model', 'sonnet' to execFileSync arguments on line 412.

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

src/cli/commands/kb.ts:421, 532

Sidecar JSON parsed without schema validation

The sidecar files (written by LLM agent) are parsed with JSON.parse but types are not validated. Malformed but parseable JSON (e.g., referencedFiles as a string instead of array) propagates unchecked. Flagged by: Security, TypeScript reviewers (80-82% confidence).

Fix: Add runtime type guards:

const raw: unknown = JSON.parse(await fs.readFile(sidecarPath, 'utf8'));
const sidecar = (typeof raw === 'object' && raw !== null) ? raw as Record<string, unknown> : {};
const referencedFiles = Array.isArray(sidecar.referencedFiles) 
  ? sidecar.referencedFiles.filter((f): f is string => typeof f === 'string') 
  : [];

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

src/cli/commands/kb.ts:539

Unsafe double type assertion bypasses type safety

(kbEntry as Record<string, unknown>)?.referencedFiles as string[] chains two unsafe casts that defeat TypeScript's type system. Root cause: FeatureKbModule.listKBs return type omits referencedFiles. Flagged by: TypeScript (95%), Architecture (82%), Consistency (90%) reviewers.

Fix: Add referencedFiles: string[] to the listKBs return type interface (line 22), then simplify to:

referencedFiles: sidecar.referencedFiles ?? kbEntry?.referencedFiles ?? [],

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

src/cli/commands/kb.ts:22

FeatureKbModule.listKBs return type incomplete

The interface omits referencedFiles from the return type, yet the actual CJS implementation spreads it in. This forces unsafe casts elsewhere and violates the "Type everything" principle. Flagged by: TypeScript (92%), Architecture (88%), Consistency (90%) reviewers.

Fix: Update the listKBs return type to include all fields from FeatureEntry:

listKBs: (worktreePath: string) => Array<{
  slug: string; name: string; description: string;
  category: string; directories: string[];
  referencedFiles: string[]; lastUpdated: string;
  createdBy: string;
}>;

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

Sidecar pattern duplication (scripts/hooks/background-kb-refresh:108-175, src/cli/commands/kb.ts:381-434, 497-544)

Sidecar lifecycle duplicated across three call sites

The pattern (pre-clean → spawn agent → read sidecar → updateIndex → cleanup) appears three times with identical structure but slightly different field handling. This violates DRY and OCP. Flagged by: Architecture (85%), Complexity (85%) reviewers.

Fix: Extract a shared helper in feature-kb.cjs or consolidate the lifecycle logic:

node "$SCRIPT_DIR/lib/feature-kb.cjs" update-index-from-sidecar "$CWD" "$SLUG" "$SIDECAR"

This eliminates the inline node -e and centralizes the lifecycle.

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

src/cli/commands/kb.ts:492

Redundant checkStaleness calls in refresh loop

checkAllStaleness is called once on line 472 to determine stale slugs, then checkStaleness is called again inside the per-slug loop on line 492. Each call runs git log — duplicating I/O. Flagged by: Performance (80%) reviewer.

Fix: Reuse the staleness result from the initial call:

const staleness = featureKb.checkAllStaleness(worktreePath);
// ...
for (const kbSlug of slugsToRefresh) {
  const staleInfo = staleness[kbSlug]; // Already computed
}

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

Sidecar files not covered by .gitignore

Transient sidecar files could be accidentally committed

.features/$SLUG/.create-result.json and .features/$SLUG/.refresh-result.json are cleaned up after use, but a crash/timeout could leave them in a committed directory. Flagged by: Consistency (85%) reviewer.

Fix: Add to .gitignore:

.features/*/.create-result.json
.features/*/.refresh-result.json

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

Silent sidecar failure (scripts/hooks/background-kb-refresh:176-178)

Missing sidecar leaves index in inconsistent state

If the agent fails to write the sidecar (but exits 0), the code logs "skipping index update" but the throttle has advanced via .kb-last-refresh. Next session, KB appears stale but cannot refresh until 2-hour throttle expires. Flagged by: Regression (82%) reviewer.

Fix: When sidecar is missing but agent exited successfully, still update lastUpdated with existing metadata (like the CLI refresh path does at lines 535-542).

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

Summary: Findings Consolidated

10 inline comments created for issues with ≥80% confidence across Security, Architecture, Performance, Complexity, Consistency, Regression, Testing, and TypeScript reviews.

Additional Suggestions (60-79% confidence, not blocking)

  • Unvalidated referencedFiles could reference paths outside worktree (scripts/hooks/lib/feature-kb.cjs:289, 65% confidence) -- While git log is read-only, paths with ../ traversal could leak information about files outside the project. Recommend path validation in checkStaleness.

  • Shell variable word-splitting in loop (scripts/hooks/background-kb-refresh:101, 70% confidence) -- The for SLUG in $STALE_SLUGS loop relies on word splitting. Safe in practice since slugs are validated to kebab-case, but explicit assertion would be defensive.

  • Assertion weakening in test (tests/feature-kb/feature-kb.test.ts:528, 85% confidence) -- Changed from toBeInstanceOf(Array) to not.toThrow() — the new assertion is strictly weaker and doesn't validate the behavioral contract. Recommend restoring the Array check.

  • Test behavior not fully verified (tests/shell-hooks.test.ts:1512-1588, 82% confidence) -- Shell hook guard-clause tests verify exit code but not that background spawn was actually skipped. For throttle case, recommend asserting stderr/stdout does not contain "background-kb-refresh".

Summary of Consolidated Issues

Aspect Count
Blocking HIGH findings 10
Pre-existing issues noted 3
Reviewers engaged 8
Total consensus themes Shell injection, type safety gaps, sidecar duplication, error guard removal

Deduplication: Findings appearing in multiple reviewer reports (e.g., sidecar shell injection flagged by Security, Architecture, Performance) were consolidated into single comments referencing all reviewers.

Next steps for author: Address the 10 inline comments before merge. Test refactoring improvements (try/catch → .toThrow()) are well-executed and do not require changes.


Created by devflow:git agent on 2026-04-27. Rate limit check: API calls within safe limits.

Dean Sharon and others added 7 commits April 27, 2026 23:31
Guard all source commands in background-kb-refresh, background-learning,
and background-memory-update so failures produce a clear diagnostic message
instead of silently continuing with undefined functions.

Co-Authored-By: Claude <noreply@anthropic.com>
The Knowledge agent writes KNOWLEDGE.md and a sidecar JSON via Write tool;
it does not need Bash to run CLI commands. The host process (CLI or background
hook) reads the sidecar and calls feature-kb.cjs update-index itself.

Removes the Bash tool and rewrites responsibility 7 to reflect the sidecar
handoff pattern. Updates the test to match.

Co-Authored-By: Claude <noreply@anthropic.com>
The inline node -e block in background-kb-refresh interpolated $SIDECAR
directly into a JavaScript string literal — a shell injection vector if the
path contained quotes or special characters. Replace with a new json-helper
read-sidecar subcommand that receives the path as a CLI argument processed
through safePath().

Co-Authored-By: Claude <noreply@anthropic.com>
Category added complexity without value — callers passed it as a constant
and it was never used for filtering or routing. Remove from:
- feature-kb.cjs typedef, updateIndex, refresh-context output (4→3 fields)
- background-kb-refresh IFS read and update-index call
- kb.ts FeatureKbModule types, list output, create/refresh prompts and calls
- feature-kb SKILL.md frontmatter template and Category Templates section
- All test fixtures and assertions

Co-Authored-By: Claude <noreply@anthropic.com>
… pin model

- Extract and export readSidecar() with validated JSON parsing — filters
  referencedFiles to string-only elements, handles all error cases with {}
- Expand listKBs return type with optional description/createdBy/referencedFiles
  so refresh path can fall back to existing index values without type cast
- Remove double cast in refresh referencedFiles fallback
- Reuse staleness map from checkAllStaleness in batch refresh loop instead of
  calling checkStaleness per-slug (avoids redundant git log calls)
- Pin --model sonnet in kb create for consistent agent behavior

Co-Authored-By: Claude <noreply@anthropic.com>
- Empty-index stale-slugs test: verify no output for { features: {} }
- json-helper read-sidecar: 5 tests covering valid, missing, malformed,
  non-array field, and missing args cases
- readSidecar TypeScript helper: 5 unit tests covering valid sidecar,
  missing file, invalid JSON, string-instead-of-array, and mixed-type
  array filtering

Co-Authored-By: Claude <noreply@anthropic.com>
- Pin --model sonnet on kb refresh command (was missing, create had it)
- Remove Bash from feature-kb skill allowed-tools (agent no longer has it)
- Distribute updated agent to plugin copies
- Include Simplifier changes: source shared get-mtime, disambiguate test names
@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented Apr 27, 2026

Summary: Consolidated Review Findings

From 8 specialized reviewers (8 scan types), here are key blocking issues ≥80% confidence not yet addressed:

BLOCKING REGRESSIONS

1. Knowledge agent cannot update KB index from plan:orch (90% confidence)

  • Issue: Removing Bash from the agent breaks KB creation during /plan workflow. Agent writes sidecar file but plan:orch Phase 12 has no handler to read it and call updateIndex. Result: KBs created via /plan exist on disk but are missing from .features/index.json.
  • Files: shared/agents/knowledge.md:38, shared/skills/feature-kb/SKILL.md:107-117
  • Fix: Either (a) add post-spawn sidecar handling in plan:orch, or (b) restore Bash to the agent.

2. Skill contradicts agent tools (92% confidence)

  • Issue: shared/skills/feature-kb/SKILL.md:107-117 instructs agent to run node ... update-index via Bash, but Knowledge agent no longer has Bash in its tool list.
  • Fix: Update Integration section to describe sidecar pattern: "After writing KNOWLEDGE.md, write .features/{slug}/.create-result.json with referencedFiles and description. Host process reads and updates index."

CRITICAL CODE ISSUES

3. Unused import fsPromises (95% confidence)

  • File: tests/feature-kb/feature-kb.test.ts:6
  • Fix: Remove unused import (test uses only synchronous fs API).

4. CJS vs TS readSidecar behavioral divergence (82% confidence)

  • Files: scripts/hooks/json-helper.cjs:1823 vs src/cli/commands/kb.ts:37-39
  • Issue: TS version filters non-string array elements; CJS does not.
  • Fix: Add .filter(v => typeof v === 'string') to CJS version for consistency.

5. Duplicated sidecar parsing logic (85% confidence)

  • Files: src/cli/commands/kb.ts:26-44 + scripts/hooks/json-helper.cjs:1813-1828
  • Fix: Consolidate into shared feature-kb.cjs readSidecar() export consumed by both paths.

CODE QUALITY

6. Structural duplication in create/refresh handlers (85% confidence)

  • File: src/cli/commands/kb.ts:375-467, 476-574
  • Issue: Both follow identical ~95-line sidecar-lifecycle pattern.
  • Fix: Extract runKbAgent(opts) helper to reduce to ~25 lines each.

7. Inconsistent error guards on source statements (85% confidence)

  • Files: Multiple hook files
  • Fix: Apply || { echo; exit 1; } pattern consistently across all hooks (currently only in background-* hooks).

Other medium-confidence issues: Layer separation (readSidecar in command module), inline type lengths, missing test coverage, redundant staleness checks — see detailed reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant