feat: path command, expanded benchmarks, docs updates #121

Merged: carlos-alm merged 10 commits into main from feat/path-command on Feb 26, 2026

Conversation

@carlos-alm commented Feb 26, 2026

Summary

  • codegraph path: New A→B symbol pathfinding command with BFS traversal, exposed in CLI, programmatic API, and MCP
  • Incremental rebuild fix: Edge preservation during incremental builds (#116 "bug(builder): incremental rebuild drops edges when re-parsing a file"; #120 "fix: incremental rebuild drops edges from unchanged files (#116)")
  • Expanded benchmarks: benchmark.js now measures no-op rebuilds, 1-file rebuilds, and query latency (fn-deps, fn-impact, path, roles); report pipeline renders new sections in BUILD-BENCHMARKS.md and README
  • Token savings benchmark: New standalone benchmark comparing codegraph-assisted navigation vs raw file reads
  • Documentation: Titan Paradigm use case, CLI/MCP examples, roadmap updates, recommended practices

Test plan

  • All 624 tests pass (npx vitest run)
  • node scripts/benchmark.js outputs JSON with noopRebuildMs, oneFileRebuildMs, and queries
  • Report pipeline generates correct Incremental Rebuilds and Query Latency sections
  • README ### Lightweight Footprint section preserved after update
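
For readers who want a picture of the JSON the test plan checks for, here is a minimal sketch of a median-based timing helper. Only the keys `noopRebuildMs`, `oneFileRebuildMs`, and `queries` come from this PR; the helper names (`median`, `timeMedian`, `buildReport`), the run count, and the shape of the `queries` object are assumptions, not the contents of scripts/benchmark.js.

```javascript
// Hypothetical helpers; not taken from scripts/benchmark.js.

// Median of a list of numbers (average of the two middle values for even lengths).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

// Run fn several times and report the median wall-clock time in milliseconds.
function timeMedian(fn, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  return median(samples);
}

// Assemble a report using the key names listed in the test plan.
// The per-query keys are an assumption based on the query names in the summary.
function buildReport({ noopRebuild, oneFileRebuild, queries }) {
  const queryTimes = {};
  for (const [name, fn] of Object.entries(queries)) {
    queryTimes[name] = timeMedian(fn);
  }
  return {
    noopRebuildMs: timeMedian(noopRebuild),
    oneFileRebuildMs: timeMedian(oneFileRebuild),
    queries: queryTimes,
  };
}
```

A median is less sensitive than a mean to a single slow run (for example a cold cache on the first iteration), which is presumably why the benchmark reports it.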

greptile-apps bot commented Feb 26, 2026

Greptile Summary

This PR adds A→B symbol pathfinding (codegraph path), expands benchmarks with incremental rebuild metrics and query latency measurements, and includes comprehensive documentation updates.

Core changes:

  • Implemented BFS-based pathfinding in pathData with bidirectional traversal, alternate path counting, and proper edge filtering
  • Added path CLI command and symbol_path MCP tool with full option support
  • Extended benchmarks to measure no-op rebuilds, 1-file rebuilds, and query latency (fn-deps, fn-impact, path, roles)
  • Added token savings benchmarks comparing codegraph-assisted navigation vs raw file reads
  • Updated all tool count references from 19 to 21 (22 in multi-repo mode)

Testing:

  • 12 new tests for pathData covering all edge cases (direct/multi-hop paths, reverse, self-paths, maxDepth, nonexistent symbols, alternate paths)
  • All 624 tests pass per test plan

Documentation:

  • Comprehensive CLI and MCP examples for the new path command
  • New Titan Paradigm use case document describing multi-agent codebase orchestration patterns
  • Updated roadmap and recommended practices

The implementation is clean, well-tested, and properly integrated across CLI, programmatic API, and MCP interfaces.

Confidence Score: 5/5

  • This PR is safe to merge with high confidence
  • The implementation is well-architected with proper BFS algorithm, comprehensive test coverage (12 new tests for pathData alone), clean integration across all interfaces (CLI, API, MCP), and thorough documentation. All 624 tests pass, and the code follows established patterns in the codebase.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/queries.js | Added pathData and symbolPath functions implementing BFS-based A→B pathfinding with proper edge handling, alternate path counting, and bidirectional traversal |
| src/cli.js | Added path command with comprehensive options (max-depth, kinds, reverse, file disambiguation) |
| src/mcp.js | Added symbol_path MCP tool with proper schema and handler dispatch |
| tests/integration/queries.test.js | Added 12 comprehensive tests for pathData covering direct paths, multi-hop, reverse, self-paths, maxDepth, nonexistent symbols, noTests filter, and alternate paths |
| scripts/benchmark.js | Added incremental rebuild benchmarks (no-op, 1-file) and query latency benchmarks (fn-deps, fn-impact, path, roles) with median calculation |
| scripts/update-benchmark-report.js | Added report sections for incremental rebuilds and query latency with trend indicators |
| README.md | Updated tool count to 21 (22 in multi-repo), added path command examples, updated feature table |
| docs/examples/CLI.md | Added comprehensive examples for path, roles, and co-change commands |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[path command entry] --> B[symbolPath]
    B --> C[pathData]
    C --> D{Find from/to nodes}
    D -->|Not found| E[Return error]
    D -->|Found| F{Same node?}
    F -->|Yes| G[Return 0-hop path]
    F -->|No| H[BFS traversal]
    H --> I{Direction?}
    I -->|Forward| J[source_id → target_id]
    I -->|Reverse| K[target_id → source_id]
    J --> L[Build neighbor query]
    K --> L
    L --> M[Queue traversal up to maxDepth]
    M --> N{Target found?}
    N -->|Yes| O[Reconstruct path]
    N -->|No| P[Return not found]
    O --> Q[Count alternates]
    Q --> R[Return result]
    
    S[CLI: src/cli.js] --> A
    T[MCP: src/mcp.js] --> C
    U[API: src/index.js] --> C
    
    V[Tests] --> W[queries.test.js]
    V --> X[cli.test.js]
    V --> Y[mcp.test.js]
    
    Z[Benchmarks] --> AA[benchmark.js]
    AA --> AB[Query latency: path]
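
The traversal in the flowchart can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in src/queries.js: it works on a hypothetical in-memory edge list instead of the SQLite graph, the function name `shortestPath` and the result fields are invented for the example, and alternate-path counting and edge-kind filtering are left out. The default `maxDepth` of 10 matches the value mentioned later in this conversation.

```javascript
// Hypothetical in-memory version; the real command queries edges from the database.
function shortestPath(edges, from, to, { reverse = false, maxDepth = 10 } = {}) {
  // Same node: a 0-hop path, as in the "Same node?" branch of the flowchart.
  if (from === to) return { found: true, hops: 0, path: [from] };

  // Build an adjacency map in the chosen direction
  // (forward: source -> target, reverse: target -> source).
  const neighbors = new Map();
  for (const { source, target } of edges) {
    const [a, b] = reverse ? [target, source] : [source, target];
    if (!neighbors.has(a)) neighbors.set(a, []);
    neighbors.get(a).push(b);
  }

  // parent doubles as the visited set, so cycles cannot loop forever.
  const parent = new Map([[from, null]]);
  let frontier = [from];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const node of frontier) {
      for (const n of neighbors.get(node) ?? []) {
        if (parent.has(n)) continue;
        parent.set(n, node);
        if (n === to) {
          // Walk the parent links back to the start to reconstruct the path.
          const path = [];
          for (let cur = to; cur !== null; cur = parent.get(cur)) path.unshift(cur);
          return { found: true, hops: path.length - 1, path };
        }
        next.push(n);
      }
    }
    frontier = next;
  }
  return { found: false, hops: null, path: [] };
}
```

Because breadth-first search expands the graph one hop at a time, the first time the target is reached is guaranteed to be along a path with the fewest hops, which is why no edge weights or priority queue are needed here.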

Last reviewed commit: d55f8ea

greptile-apps bot left a comment

25 files reviewed, 5 comments

Enable `--multi-repo` to let the agent query any registered repository, or use `--repos` to restrict access to a specific set of repos.

- The server exposes 17 tools: `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `context`, `explain`, `where`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, `hotspots`, and `list_repos` (multi-repo only). See the [AI Agent Guide MCP reference](./ai-agent-guide.md#mcp-server-reference) for the full tool-to-CLI mapping table.
+ The server exposes 20 tools: `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `symbol_path`, `context`, `explain`, `where`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, `hotspots`, `node_roles`, `co_changes`, and `list_repos` (multi-repo only). See the [AI Agent Guide MCP reference](./ai-agent-guide.md#mcp-server-reference) for the full tool-to-CLI mapping table.
Contributor

tool count mismatch with README — README says "19 MCP tools" (single-repo default), this says "20 tools" which includes list_repos (multi-repo only)

Suggested change
- The server exposes 20 tools: `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `symbol_path`, `context`, `explain`, `where`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, `hotspots`, `node_roles`, `co_changes`, and `list_repos` (multi-repo only). See the [AI Agent Guide MCP reference](./ai-agent-guide.md#mcp-server-reference) for the full tool-to-CLI mapping table.
+ The server exposes 19 tools (20 in multi-repo mode): `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `symbol_path`, `context`, `explain`, `where`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, `hotspots`, `node_roles`, `co_changes`, and `list_repos` (multi-repo only). See the [AI Agent Guide MCP reference](./ai-agent-guide.md#mcp-server-reference) for the full tool-to-CLI mapping table.

Contributor Author

Good catch — fixed. Updated to "19 tools (20 in multi-repo mode)" to match README and clarify that list_repos is multi-repo only.

Add `codegraph path <from> <to>` — BFS shortest-path search on the
call graph. Given two symbol names, finds the shortest call chain
with hop count, intermediate nodes, edge kinds, and alternate path
count. Supports --reverse, --max-depth, --kinds, --from-file/--to-file,
-T, -j, -k flags. Exposed as symbol_path MCP tool.

Impact: 4 functions changed, 3 affected
- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities
- Restore Phase 3 (Architectural Refactoring) to ROADMAP
- Renumber phases 4-8 and all cross-references
- Fix MCP tool count per Greptile review

@carlos-alm

@greptileai

claude bot commented Feb 26, 2026

Claude finished @carlos-alm's task


PR Review Complete ✅

Todo List:

  • Phase 1: Root Cause Analysis - Understand problem being solved
  • Phase 2: Configuration Change Detection - Check for high-risk changes
  • Phase 3: Test Integrity - Verify test quality and coverage
  • Phase 4: Code Quality - Review implementation details
  • Final Assessment - Provide rating and recommendation

Phase 1 Root Cause Analysis ✅

Problem Being Solved: Adding BFS pathfinding between symbols to trace calling relationships (A calls B calls C). This fills a gap in the query capabilities - users can see direct callers/callees but couldn't trace transitive paths.

Why This Approach: The implementation uses proper BFS algorithm with depth limiting, bidirectional search (forward/reverse), and edge-kind filtering. This is the correct approach for shortest-path finding in dependency graphs.

Risk Assessment: Low risk - this is additive functionality with no breaking changes to existing APIs.

Phase 2 Configuration Change Detection ✅

No High-Risk Changes Detected:

  • ✅ No biome.json or linting rule modifications
  • ✅ No test timeout increases or coverage threshold reductions
  • ✅ No build configuration changes
  • ✅ vitest.config.js unchanged (still 30s timeout, no exclusions added)

This is purely additive functionality with proper testing.

Phase 3 Test Integrity ✅

Excellent Test Coverage:

  • 11 comprehensive test cases for pathData function covering all scenarios
  • Edge cases: nonexistent symbols, self-path, maxDepth limiting, reverse direction
  • Core functionality: 1-hop, multi-hop, alternate path counting, candidate population
  • Filtering: noTests option properly tested
  • Proper assertions: Tests verify exact hop counts, path structure, edge kinds
  • No test weakening: All assertions are strict and meaningful

Phase 4 Code Quality ✅

Excellent Implementation Quality:

Security:

  • SQL injection prevention: Uses parameterized queries with db.prepare() and ? placeholders (lines 654-660 in queries.js)
  • No shell injection risk: Uses execFileSync (secure) instead of execSync throughout codebase
  • Proper input validation: Edge kinds filtered through predefined arrays, numeric inputs parsed/validated

Performance:

  • Efficient BFS algorithm: Proper queue-based implementation with visited set to prevent cycles
  • Query optimization: Single prepared statement for neighbor lookup, node caching for path reconstruction (lines 728-734)
  • Memory management: Database properly closed in all code paths
  • Depth limiting: Configurable maxDepth prevents infinite traversal (default 10)

Code Quality:

  • Clean algorithm: Textbook BFS with clear variable names and logic flow
  • Robust error handling: Graceful handling of nonexistent symbols, no paths found
  • Comprehensive output: Returns candidates, alternate path counts, search metadata
  • Consistent API: Follows existing patterns in queries.js (same option structure, error formats)

Documentation:

  • Titan Paradigm use case (286 lines): High-quality documentation mapping multi-agent architecture to codegraph capabilities
  • Complete examples: CLI.md and MCP.md updated with real path command output
  • Accurate backlog updates: Properly marks shipped items as DONE

Root Cause Analysis

Problem Being Solved: Adding graph pathfinding to enable tracing call chains across codebases - essential for dependency analysis and refactoring workflows.
Why This Approach: BFS is the optimal algorithm for shortest paths in unweighted graphs. The implementation is correct and efficient.
Risk Assessment: Zero risk - purely additive with comprehensive test coverage.

Critical Concerns

None identified. The implementation follows security best practices, uses efficient algorithms, and has excellent test coverage.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a high-quality feature implementation with 249 lines of well-tested BFS pathfinding code, comprehensive documentation, and proper security practices. The only reason it doesn't get 5 stars is that it's a straightforward algorithm implementation rather than exceptional innovation. Ready to merge.

Comment on lines +36 to +46
| 21 | Composite audit command | Single `codegraph audit <file-or-function>` that combines `explain`, `fn-impact`, and code health metrics into one structured report per function. Core version uses graph data; enhanced version includes Phase 4.4 `risk_score`/`complexity_notes`/`side_effects` when available. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) Gauntlet phase. | Orchestration | Each sub-agent in a multi-agent swarm gets everything it needs to assess a function in one call instead of 3-4 — directly reduces token waste and round-trips | ✓ | ✓ | 4 | No |
| 22 | Batch querying | Accept a list of targets (file or JSON) and return all query results in one JSON payload. Applies to `audit`, `fn-impact`, `context`, and other per-symbol commands. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) swarm pattern. | Orchestration | A swarm of 20+ agents auditing different files can be fed from a single orchestrator call instead of N sequential invocations — reduces overhead and enables parallel dispatch | ✓ | ✓ | 4 | No |
| 23 | Triage priority queue | Single `codegraph triage` command that merges `map` connectivity, `hotspots` fan-in/fan-out, node roles, and optionally git churn + `risk_score` into one ranked audit queue. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) RECON phase. | Orchestration | Orchestrating agent gets a single prioritized list of what to audit first — replaces manual synthesis of 3+ commands, saves RECON phase from burning tokens on orientation | ✓ | ✓ | 4 | No |
| 24 | Change validation predicates | `codegraph check --staged` with configurable predicates: `--no-new-cycles`, `--max-blast-radius N`, `--no-signature-changes`, `--no-boundary-violations`. Returns exit code 0/1 for CI gates and state machines. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) STATE MACHINE phase. | CI | Automated rollback triggers without parsing JSON — orchestrators and CI pipelines get first-class pass/fail signals for blast radius, cycles, and contract changes | ✓ | ✓ | 4 | No |
| 26 | MCP orchestration tools | Expose `audit`, `triage`, and `check` as MCP tools alongside existing tools. Enables multi-agent orchestrators (Claude Code agent teams, custom MCP clients) to run the full Titan Paradigm loop through the MCP protocol without CLI overhead. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md). | Embeddability | Agents query the graph through MCP with zero CLI overhead — fewer tokens, faster round-trips, native integration with AI agent frameworks | ✓ | ✓ | 4 | No |
| 5 | TF-IDF lightweight search | SQLite FTS5 + TF-IDF as a middle tier (~50MB) between "no search" and full transformer embeddings (~500MB). Provides decent keyword search with near-zero overhead. Inspired by codexray. | Search | Users get useful search without the 500MB embedding model download; faster startup for small projects | ✓ | ✓ | 3 | No |
| 13 | Architecture boundary rules | User-defined rules for allowed/forbidden dependencies between modules (e.g., "controllers must not import from other controllers"). Violations flagged in `diff-impact` and CI. Inspired by codegraph-rust, stratify. | Architecture | Prevents architectural decay in CI; agents are warned before introducing forbidden cross-module dependencies | ✓ | ✓ | 3 | No |
| 15 | Hybrid BM25 + semantic search | Combine BM25 keyword matching with embedding-based semantic search using Reciprocal Rank Fusion. Better recall than either approach alone. Inspired by GitNexus, claude-context-local. | Search | Search results improve dramatically — keyword matches catch exact names, embeddings catch conceptual matches, RRF merges both | ✓ | ✓ | 3 | No |
| 18 | CODEOWNERS integration | Map graph nodes to CODEOWNERS entries. Show who owns each function, surface ownership boundaries in `diff-impact`. Inspired by CKB. | Developer Experience | `diff-impact` tells agents which teams to notify; ownership-aware impact analysis reduces missed reviews | ✓ | ✓ | 3 | No |
| 22 | Manifesto-driven pass/fail | User-defined rule engine with custom thresholds (e.g. "cognitive > 15 = fail", "cyclomatic > 10 = fail", "imports > 10 = decompose"). Outputs pass/fail per function/file. Generalizes ID 13 (boundary rules) into a generic rule system. | Analysis | Enables autonomous multi-agent audit workflows (GAUNTLET pattern); CI integration for code health gates with configurable thresholds | ✓ | ✓ | 3 | No |
| 25 | Graph snapshots | `codegraph snapshot save <name>` / `codegraph snapshot restore <name>` for lightweight SQLite DB backup and restore. Enables orchestrators to checkpoint before refactoring passes and instantly rollback without rebuilding. After Phase 4, also preserves embeddings and semantic metadata. Inspired by [Titan Paradigm](../docs/use-cases/titan-paradigm.md) STATE MACHINE phase. | Orchestration | Multi-agent workflows get instant rollback without re-running expensive builds or LLM calls — orchestrator checkpoints before each pass and restores on failure | ✓ | ✓ | 3 | No |
Contributor

Duplicate item IDs — new items 21-26 collide with existing items 21-22

The backlog already had:

  • ID 21: "Cognitive + cyclomatic complexity" (line 50)
  • ID 22: "Manifesto-driven pass/fail" (line 45)

New items need different IDs. Suggest renumbering to 27-32:

  • 27: Composite audit command
  • 28: Batch querying
  • 29: Triage priority queue
  • 30: Change validation predicates
  • 31: MCP orchestration tools (currently 26)
  • 32: Graph snapshots (currently 25)

Contributor Author

Fixed in ab57fb6 — renumbered the new orchestration items to IDs 27-32 to avoid colliding with existing IDs 21-22.
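
A collision like this one can be caught mechanically. The sketch below is a hypothetical checker, not part of this PR: the function name `findDuplicateIds` is invented, and it assumes every backlog row starts with `| <numeric id> | <title> |` as in the rows quoted above.

```javascript
// Hypothetical checker; assumes each backlog row starts with "| <numeric id> | <title> |".
function findDuplicateIds(markdown) {
  const seen = new Map(); // id -> titles of the rows that use that id
  for (const line of markdown.split("\n")) {
    const match = line.match(/^\|\s*(\d+)\s*\|\s*([^|]*?)\s*\|/);
    if (!match) continue; // skips header, separator, and non-table lines
    const id = Number(match[1]);
    if (!seen.has(id)) seen.set(id, []);
    seen.get(id).push(match[2]);
  }
  // Keep only the ids that appear on more than one row.
  return [...seen]
    .filter(([, titles]) => titles.length > 1)
    .map(([id, titles]) => ({ id, titles }));
}
```

Run against the rows in this thread, it would report ID 22 as shared by "Batch querying" and "Manifesto-driven pass/fail"; a non-empty result could fail a CI step.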

@@ -158,7 +158,7 @@ Full agent setup: [AI Agent Guide](docs/guides/ai-agent-guide.md) &middot; [CLAU
| 🔍 | **Symbol search** | Find any function, class, or method by name — exact match priority, relevance scoring, `--file` and `--kind` filters |
Contributor

Tool count is incorrect — says "19 MCP tools" but BASE_TOOLS in src/mcp.js contains 21 tools after adding symbol_path

Should be "21 MCP tools" (22 in multi-repo mode with list_repos)

Contributor Author

Fixed — updated all 3 occurrences in README.md from "19-tool" to "21-tool" (lines 100, 147, 173). BASE_TOOLS has 21 tools, 22 in multi-repo with list_repos.

Enable `--multi-repo` to let the agent query any registered repository, or use `--repos` to restrict access to a specific set of repos.

- The server exposes 17 tools: `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `context`, `explain`, `where`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, `hotspots`, and `list_repos` (multi-repo only). See the [AI Agent Guide MCP reference](./ai-agent-guide.md#mcp-server-reference) for the full tool-to-CLI mapping table.
+ The server exposes 19 tools (20 in multi-repo mode): `query_function`, `file_deps`, `impact_analysis`, `find_cycles`, `module_map`, `fn_deps`, `fn_impact`, `symbol_path`, `context`, `explain`, `where`, `diff_impact`, `semantic_search`, `export_graph`, `list_functions`, `structure`, `hotspots`, `node_roles`, `co_changes`, and `list_repos` (multi-repo only). See the [AI Agent Guide MCP reference](./ai-agent-guide.md#mcp-server-reference) for the full tool-to-CLI mapping table.
Contributor

Tool count is incorrect — says "19 tools (20 in multi-repo mode)" but should be "21 tools (22 in multi-repo mode)"

BASE_TOOLS array in src/mcp.js has 21 tools (including the newly added symbol_path), plus list_repos in multi-repo mode makes 22

Contributor Author

Fixed in ab57fb6 — now reads "21 tools (22 in multi-repo mode)" which matches the actual BASE_TOOLS count.
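
One way to keep such counts from drifting is to derive the sentence from the tool list instead of typing the number by hand. The sketch below is only an illustration: the names `BASE_TOOL_NAMES`, `toolNames`, and `describeToolCount` are hypothetical, and the 21 entries are assembled from the list quoted in this thread plus `execution_flow` and `list_entry_points` from the follow-up commit message. Whether that matches `BASE_TOOLS` in src/mcp.js exactly is an assumption.

```javascript
// Assumed contents: 19 names from the docs line quoted above, plus the two tools
// named in the follow-up commit message. Not copied from src/mcp.js.
const BASE_TOOL_NAMES = [
  "query_function", "file_deps", "impact_analysis", "find_cycles", "module_map",
  "fn_deps", "fn_impact", "symbol_path", "context", "explain", "where",
  "diff_impact", "semantic_search", "export_graph", "list_functions", "structure",
  "hotspots", "node_roles", "co_changes", "execution_flow", "list_entry_points",
];
const LIST_REPOS_TOOL_NAME = "list_repos"; // only exposed in multi-repo mode

// Tool names exposed for the given mode.
function toolNames({ multiRepo = false } = {}) {
  return multiRepo ? [...BASE_TOOL_NAMES, LIST_REPOS_TOOL_NAME] : [...BASE_TOOL_NAMES];
}

// Build the documentation phrase from the actual list lengths.
function describeToolCount() {
  const single = toolNames().length;
  const multi = toolNames({ multiRepo: true }).length;
  return `${single} tools (${multi} in multi-repo mode)`;
}
```

With the list above, `describeToolCount()` yields the same wording the fix landed on, and adding a tool to the array would update the phrase automatically.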

- What testing approaches am I missing?
- **Cross-command pipelines:** Have I tested `build` → `embed` → `search` → modify → `build` → `search`? Have I tested `watch` detecting changes then `diff-impact`?
- - **MCP server:** Have I tested the `mcp` command? Initialize via JSON-RPC on stdin, send `tools/list`, verify all 17 tools are present. Test single-repo mode (default — `list_repos` should be absent, no `repo` parameter on tools) vs `--multi-repo` mode.
+ - **MCP server:** Have I tested the `mcp` command? Initialize via JSON-RPC on stdin, send `tools/list`, verify all 18 tools are present. Test single-repo mode (default — `list_repos` should be absent, no `repo` parameter on tools) vs `--multi-repo` mode.
Contributor

Tool count is incorrect — says "18 tools" but should be "21 tools"

BASE_TOOLS in src/mcp.js contains 21 tools after adding symbol_path in this PR

Contributor Author

Fixed in ab57fb6 — now reads "21 tools" matching BASE_TOOLS count in src/mcp.js.
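
The `tools/list` check described in the skill file could be automated along these lines. This is a sketch under assumptions: `checkToolsList` is a hypothetical helper, the response is taken to be an already-parsed JSON-RPC reply with tool entries under `result.tools[].name`, and the expected names are supplied by the caller instead of being read from src/mcp.js.

```javascript
// Hypothetical helper: compare a parsed tools/list response with the expected names.
function checkToolsList(response, expectedNames) {
  const actual = new Set(response.result.tools.map((tool) => tool.name));
  // Expected tools that the server did not report.
  const missing = expectedNames.filter((name) => !actual.has(name));
  // Reported tools that were not expected (for example list_repos in single-repo mode).
  const unexpected = [...actual].filter((name) => !expectedNames.includes(name));
  return { ok: missing.length === 0 && unexpected.length === 0, missing, unexpected };
}
```

Comparing names in both directions, not just counts, covers the single-repo case too: a stray `list_repos` shows up under `unexpected` even when the total happens to match.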

Address Greptile review comments on #121:
- Update MCP tool counts from 18/19 to 21 (22 in multi-repo mode)
  across README, recommended-practices, dogfood skill, titan-paradigm
- Add missing execution_flow and list_entry_points to tool enumeration
- Renumber new backlog items 21-26 → 27-32 to avoid collision with
  existing items 21-22

Adds a benchmark suite that measures how much codegraph reduces token
usage when AI agents navigate the Next.js codebase (~4k TS files).

- scripts/token-benchmark-issues.js: 5 real Next.js PRs as test cases
- scripts/token-benchmark.js: runner using Claude Agent SDK (baseline
  vs codegraph MCP), with --perf flag for build/query benchmarks
- scripts/update-token-report.js: JSON → markdown report generator
- docs/benchmarks/: methodology docs and placeholder report

Impact: 21 functions changed, 7 affected

claude bot commented Feb 26, 2026

Claude finished @carlos-alm's task


Claude finished reviewing this PR

PR Review Complete

  • Phase 1: Root Cause Analysis - Understand problem being solved
  • Phase 2: Configuration Change Detection - Check for high-risk changes
  • Phase 3: Test Integrity - Verify test quality and coverage
  • Phase 4: Code Quality - Review implementation details
  • Address Greptile's identified issues (tool counts, duplicate IDs)
  • Final Assessment - Provide rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Adding BFS pathfinding between symbols to trace calling relationships (A calls B calls C). This fills a gap in the query capabilities - users can see direct callers/callees but couldn't trace transitive paths.

Why This Approach: The implementation uses proper BFS algorithm with depth limiting, bidirectional search (forward/reverse), and edge-kind filtering. This is the correct approach for shortest-path finding in dependency graphs.

Risk Assessment: Low risk - this is additive functionality with no breaking changes to existing APIs.

Phase 2: Configuration Change Detection ✅

No High-Risk Changes Detected:

  • ✅ No biome.json or linting rule modifications
  • ✅ No test timeout increases or coverage threshold reductions
  • ✅ No build configuration changes
  • ✅ vitest.config.js unchanged (still 30s timeout, no exclusions added)

This is purely additive functionality with proper testing.

Phase 3: Test Integrity ✅

Excellent Test Coverage:

  • 11 comprehensive test cases for pathData function covering all scenarios
  • Edge cases: nonexistent symbols, self-path, maxDepth limiting, reverse direction
  • Core functionality: 1-hop, multi-hop, alternate path counting, candidate population
  • Filtering: noTests option properly tested
  • Proper assertions: Tests verify exact hop counts, path structure, edge kinds
  • No test weakening: All assertions are strict and meaningful

Phase 4: Code Quality ✅

Excellent Implementation Quality:

Security:

  • SQL injection prevention: Uses parameterized queries with db.prepare() and ? placeholders (lines 654-660 in queries.js)
  • No shell injection risk: Uses execFileSync (secure) instead of execSync throughout codebase
  • Proper input validation: Edge kinds filtered through predefined arrays, numeric inputs parsed/validated

Performance:

  • Efficient BFS algorithm: Proper queue-based implementation with visited set to prevent cycles
  • Query optimization: Single prepared statement for neighbor lookup, node caching for path reconstruction (lines 728-734)
  • Memory management: Database properly closed in all code paths
  • Depth limiting: Configurable maxDepth prevents infinite traversal (default 10)

Code Quality:

  • Clean algorithm: Textbook BFS with clear variable names and logic flow
  • Robust error handling: Graceful handling of nonexistent symbols, no paths found
  • Comprehensive output: Returns candidates, alternate path counts, search metadata
  • Consistent API: Follows existing patterns in queries.js (same option structure, error formats)

Documentation:

  • Titan Paradigm use case (286 lines): High-quality documentation mapping multi-agent architecture to codegraph capabilities
  • Complete examples: CLI.md and MCP.md updated with real path command output
  • Accurate backlog updates: Properly marks shipped items as DONE

Issues Identified (Same as Greptile)

1. Duplicate Backlog IDs: ✅ CONFIRMED

  • Lines 45 and 50 in BACKLOG.md have existing items:
    • ID 21: "Cognitive + cyclomatic complexity" (line 50)
    • ID 22: "Manifesto-driven pass/fail" (line 45)
  • New items numbered 21-26 collide with these existing IDs
  • Solution: Renumber the new items to 27-32

2. MCP Tool Count Inaccuracies: ✅ CONFIRMED

  • BASE_TOOLS contains 21 tools (counted from src/mcp.js lines 20-406)
  • Multi-repo mode: 22 tools (BASE_TOOLS + LIST_REPOS_TOOL)
  • Incorrect references:
    • README.md line 58: says "21 MCP tools" ✅ Actually correct
    • README.md line 147: says "19-tool MCP server" ❌ Should be "21-tool"
    • README.md line 173: says "19-tool MCP server" ❌ Should be "21-tool"
    • docs/guides/recommended-practices.md line 146: says "21 tools (22 in multi-repo)" ✅ Actually correct
    • .claude/skills/dogfood/SKILL.md line 206: says "21 tools" ✅ Actually correct

Root Cause Analysis

Problem Being Solved: Adding graph pathfinding to enable tracing call chains across codebases - essential for dependency analysis and refactoring workflows.
Why This Approach: BFS is the optimal algorithm for shortest paths in unweighted graphs. The implementation is correct and efficient.
Risk Assessment: Low risk - purely additive with comprehensive test coverage.

Critical Concerns

Minor Documentation Issues Only: Two instances in README.md incorrectly state "19-tool MCP server" instead of "21-tool". These are simple documentation accuracy issues with no runtime impact.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE WITH CONDITIONS
  • Conditions: Fix the 2 incorrect tool count references in README.md (lines 147 & 173: "19-tool" → "21-tool")
  • Reasoning: This is a high-quality feature implementation with 249 lines of well-tested BFS pathfinding code, comprehensive documentation, and proper security practices. The duplicate backlog IDs are cosmetic and don't affect functionality. The tool count discrepancies are minor documentation accuracy issues that should be corrected but don't block the core feature. Ready to merge after fixing the tool counts.

@carlos-alm
Contributor Author

Added: Token Savings Benchmark

New commit 51fedb4 adds a benchmark suite that quantifies codegraph's token savings for AI agent navigation.

New files:

  • scripts/token-benchmark-issues.js — 5 real Next.js PRs as test cases
  • scripts/token-benchmark.js — Runner using Claude Agent SDK (baseline vs codegraph MCP)
  • scripts/update-token-report.js — JSON → Markdown report generator
  • docs/benchmarks/README.md — Methodology and instructions
  • docs/benchmarks/TOKEN-SAVINGS.md — Placeholder report

Usage: node scripts/token-benchmark.js --perf > result.json && node scripts/update-token-report.js result.json

…verage

benchmark.js now measures no-op rebuilds, 1-file rebuilds, and query
latency (fn-deps, fn-impact, path, roles) alongside full builds.
update-benchmark-report.js renders new Incremental Rebuilds and Query
Latency sections in BUILD-BENCHMARKS.md and adds incremental/query rows
to the README performance table. All new fields are additive for backward
compatibility.

Impact: 5 functions changed, 2 affected
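As a rough illustration of what such a benchmark result could look like, the sketch below times stand-in workloads and emits the noopRebuildMs, oneFileRebuildMs, and queries fields named in the PR's test plan. Everything else is an assumption: the median-of-N approach, the helper names, the keys inside `queries`, and the placeholder workloads are hypothetical and not taken from benchmark.js.

```javascript
// Median of a list of numeric samples.
function median(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Run fn several times and return the median wall-clock time in milliseconds.
function medianMs(fn, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  return median(samples);
}

// Placeholder workload standing in for a real rebuild or query call.
const standIn = () => JSON.parse(JSON.stringify({ nodes: [1, 2, 3] }));

// New fields sit alongside whatever the report already contains, so older
// consumers that ignore unknown keys keep working.
const benchmarkResult = {
  noopRebuildMs: medianMs(standIn),
  oneFileRebuildMs: medianMs(standIn),
  queries: {
    fnDepsMs: medianMs(standIn),
    fnImpactMs: medianMs(standIn),
    pathMs: medianMs(standIn),
    rolesMs: medianMs(standIn),
  },
};

console.log(JSON.stringify(benchmarkResult, null, 2));
```

Taking a median rather than a single run is one common way to damp outliers; the real script may aggregate differently.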
@claude

claude bot commented Feb 26, 2026

Claude finished @carlos-alm's task —— View job



@carlos-alm carlos-alm changed the title from "docs: Titan Paradigm use case, roles/co-change/path across all docs" to "feat: path command, expanded benchmarks, docs updates" Feb 26, 2026
Extract version from benchmark result JSON and include it in branch
names, commit messages, PR titles, and PR bodies across all 4 benchmark
jobs (build, embedding, query, incremental).
@carlos-alm
Contributor Author

Addressing Review Feedback

Greptile findings — all resolved:

  1. BACKLOG.md duplicate IDs — Fixed in ab57fb6. New orchestration items renumbered to 27-32 to avoid colliding with existing IDs 21-22.

  2. MCP tool counts — Fixed across all files:

    • README.md lines 100, 147, 173: updated from "19-tool" → "21-tool" ✅
    • docs/guides/recommended-practices.md: already reads "21 tools (22 in multi-repo)" ✅ (fixed in ab57fb6)
    • .claude/skills/dogfood/SKILL.md: already reads "21 tools" ✅ (fixed in ab57fb6)

Claude review — acknowledged:

  • Rating: 4/5, APPROVE WITH CONDITIONS. No code changes requested; the one condition (README tool counts) and the duplicate-ID finding overlap with Greptile's and are addressed above.

Remaining: merge conflicts in benchmark.yml, README.md, BUILD-BENCHMARKS.md, and update-benchmark-report.js need resolution before merge.

Resolve conflicts:
- benchmark.yml: keep version-in-branch-name convention
- README.md: take main's dual-engine performance table
- BUILD-BENCHMARKS.md: main's dual-engine data + incremental/query sections
- update-benchmark-report.js: take main's clearer regex comment

Impact: 12 functions changed, 13 affected
@carlos-alm
Contributor Author

@greptileai

Take main's latest benchmark numbers (PR #125) while preserving
the new incremental rebuild and query latency fields from the branch.
@claude

claude bot commented Feb 26, 2026

Claude finished @carlos-alm's task —— View job



@carlos-alm carlos-alm merged commit ef0ea81 into main Feb 26, 2026
23 of 26 checks passed
@carlos-alm carlos-alm deleted the feat/path-command branch February 26, 2026 09:15
carlos-alm added a commit that referenced this pull request Feb 26, 2026
* feat: add codegraph path for A→B symbol pathfinding

Add `codegraph path <from> <to>` — BFS shortest-path search on the
call graph. Given two symbol names, finds the shortest call chain
with hop count, intermediate nodes, edge kinds, and alternate path
count. Supports --reverse, --max-depth, --kinds, --from-file/--to-file,
-T, -j, -k flags. Exposed as symbol_path MCP tool.

Impact: 4 functions changed, 3 affected

* docs: add Titan Paradigm use case, update docs with roles/co-change/path

- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities

* docs: restore Architecture Refactoring phase, fix references

- Restore Phase 3 (Architectural Refactoring) to ROADMAP
- Renumber phases 4-8 and all cross-references
- Fix MCP tool count per Greptile review

* fix: correct MCP tool counts and backlog ID collisions

Address Greptile review comments on #121:
- Update MCP tool counts from 18/19 to 21 (22 in multi-repo mode)
  across README, recommended-practices, dogfood skill, titan-paradigm
- Add missing execution_flow and list_entry_points to tool enumeration
- Renumber new backlog items 21-26 → 27-32 to avoid collision with
  existing items 21-22

* feat: add token savings benchmark (codegraph vs raw navigation)

Adds a benchmark suite that measures how much codegraph reduces token
usage when AI agents navigate the Next.js codebase (~4k TS files).

- scripts/token-benchmark-issues.js: 5 real Next.js PRs as test cases
- scripts/token-benchmark.js: runner using Claude Agent SDK (baseline
  vs codegraph MCP), with --perf flag for build/query benchmarks
- scripts/update-token-report.js: JSON → markdown report generator
- docs/benchmarks/: methodology docs and placeholder report

Impact: 21 functions changed, 7 affected

* feat: extend benchmarks with incremental builds and expanded query coverage

benchmark.js now measures no-op rebuilds, 1-file rebuilds, and query
latency (fn-deps, fn-impact, path, roles) alongside full builds.
update-benchmark-report.js renders new Incremental Rebuilds and Query
Latency sections in BUILD-BENCHMARKS.md and adds incremental/query rows
to the README performance table. All new fields are additive for backward
compatibility.

Impact: 5 functions changed, 2 affected

* ci: include version in automated benchmark commits and PRs

Extract version from benchmark result JSON and include it in branch
names, commit messages, PR titles, and PR bodies across all 4 benchmark
jobs (build, embedding, query, incremental).

* fix: update remaining 19-tool references to 21-tool in README

* docs: remove "viral" from titan paradigm LinkedIn reference

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
carlos-alm added a commit that referenced this pull request Feb 26, 2026
* fix: use endLine for scope-aware caller selection in nested functions

Nested/closure functions (e.g. nodeId inside exportMermaid) were
incorrectly classified as [dead] because the caller selection loop
picked the last definition where line <= call.line, creating self-call
edges that got filtered out. Now uses endLine to find the innermost
enclosing scope, so calls within an outer function correctly attribute
the outer function as caller rather than the nested function itself.

Fixes false-positive [dead] for nodeId in branch-compare.js, export.js,
and queries.js.

Impact: 1 functions changed, 17 affected

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
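The scope-aware caller selection described in the `fix: use endLine` commit above can be sketched as follows. This is an illustration under assumptions: the record shape `{ name, line, endLine }`, both function names, and the line numbers are hypothetical; only the nodeId and exportMermaid symbol names come from the commit message.

```javascript
// Old heuristic: the last definition that starts at or before the call line.
// Assumes definitions are sorted by starting line.
function lastDefinitionBefore(definitions, callLine) {
  let picked = null;
  for (const def of definitions) {
    if (def.line <= callLine) picked = def;
  }
  return picked;
}

// Scope-aware variant: among definitions whose [line, endLine] range contains
// the call, pick the smallest range, i.e. the innermost enclosing scope.
function findEnclosingCaller(definitions, callLine) {
  let best = null;
  for (const def of definitions) {
    const encloses = def.line <= callLine && callLine <= def.endLine;
    if (!encloses) continue;
    if (best === null || def.endLine - def.line < best.endLine - best.line) {
      best = def;
    }
  }
  return best;
}

const definitions = [
  { name: 'exportMermaid', line: 10, endLine: 40 },
  { name: 'nodeId', line: 12, endLine: 15 }, // nested helper
];

// A call to nodeId() on line 20: inside exportMermaid, after nodeId's body ends.
const callLine = 20;
const oldCaller = lastDefinitionBefore(definitions, callLine);
const newCaller = findEnclosingCaller(definitions, callLine);
console.log(oldCaller.name, newCaller.name);
```

In this example the old heuristic attributes the call to nodeId itself, producing the self-call edge that was filtered out, while the range check attributes it to exportMermaid.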
carlos-alm added a commit that referenced this pull request Feb 26, 2026
* feat: add cognitive & cyclomatic complexity metrics

Compute per-function complexity during build via single-traversal DFS
of tree-sitter ASTs: cognitive (SonarSource), cyclomatic (McCabe), and
max nesting depth. Stores results in new function_complexity table
(migration v8) and surfaces them in stats, context, explain, and a
dedicated `complexity` CLI command + MCP tool.

Adds manifesto config section with warn thresholds (cognitive: 15,
cyclomatic: 10, maxNesting: 4) seeding the future rule engine.

Phase 1 supports JS/TS/TSX; unsupported languages are skipped gracefully.

Impact: 18 functions changed, 32 affected

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
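A simplified sketch of the single-traversal idea from the complexity commit above: one DFS over a syntax tree that accumulates cyclomatic complexity and maximum nesting depth together. The node shape `{ type, children }`, the two type sets, and the function name are assumptions for illustration; they are not the tree-sitter bindings or the rules used in codegraph, and cognitive complexity is deliberately left out. The warn thresholds are the ones quoted in the commit message.

```javascript
// Hypothetical node types that add a decision point (McCabe-style counting).
const BRANCH_TYPES = new Set([
  'if_statement', 'for_statement', 'while_statement',
  'switch_case', 'catch_clause', 'ternary_expression',
]);
// Hypothetical node types that open a new nesting level.
const NESTING_TYPES = new Set([
  'if_statement', 'for_statement', 'while_statement',
  'switch_statement', 'catch_clause',
]);

// Iterative DFS: each node is visited once, updating both metrics.
function measureComplexity(root) {
  let cyclomatic = 1; // one path through a function with no branches
  let maxNesting = 0;
  const stack = [{ node: root, depth: 0 }];
  while (stack.length > 0) {
    const { node, depth } = stack.pop();
    if (BRANCH_TYPES.has(node.type)) cyclomatic++;
    const childDepth = NESTING_TYPES.has(node.type) ? depth + 1 : depth;
    if (childDepth > maxNesting) maxNesting = childDepth;
    for (const child of node.children ?? []) {
      stack.push({ node: child, depth: childDepth });
    }
  }
  return { cyclomatic, maxNesting };
}

// Toy tree: a for loop containing two nested ifs, followed by a while loop.
const tree = {
  type: 'function_declaration',
  children: [
    { type: 'for_statement', children: [
      { type: 'if_statement', children: [
        { type: 'if_statement', children: [{ type: 'return_statement' }] },
      ] },
    ] },
    { type: 'while_statement', children: [{ type: 'expression_statement' }] },
  ],
};

const metrics = measureComplexity(tree);
const warn = { cyclomatic: 10, maxNesting: 4 }; // thresholds from the commit message
const exceeds = metrics.cyclomatic > warn.cyclomatic || metrics.maxNesting > warn.maxNesting;
console.log(metrics, exceeds);
```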