From ee9f4877d5457533063c8a710d48f26698d5f28e Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 2 Mar 2026 17:43:05 -0700 Subject: [PATCH 1/8] docs: add competitive deep-dive for Joern and reorganize competitive folder Move COMPETITIVE_ANALYSIS.md into generated/competitive/ and add a comprehensive feature-by-feature comparison against joernio/joern (our #1-ranked competitor). Covers parsing, graph model, query language, performance, installation, AI/MCP integration, security analysis, developer productivity, and ecosystem across 100+ individual features. Update FOUNDATION.md reference to the new path. --- FOUNDATION.md | 2 +- .../{ => competitive}/COMPETITIVE_ANALYSIS.md | 0 generated/competitive/joern.md | 338 ++++++++++++++++++ 3 files changed, 339 insertions(+), 1 deletion(-) rename generated/{ => competitive}/COMPETITIVE_ANALYSIS.md (100%) create mode 100644 generated/competitive/joern.md diff --git a/FOUNDATION.md b/FOUNDATION.md index 8db549a8..80234f1d 100644 --- a/FOUNDATION.md +++ b/FOUNDATION.md @@ -133,7 +133,7 @@ Staying in our lane means we can be embedded inside IDEs, AI agents, CI pipeline ## Competitive Position -As of February 2026, codegraph is **#7 out of 22** in the code intelligence tool space (see [COMPETITIVE_ANALYSIS.md](./COMPETITIVE_ANALYSIS.md)). +As of February 2026, codegraph is **#7 out of 22** in the code intelligence tool space (see [COMPETITIVE_ANALYSIS.md](./generated/competitive/COMPETITIVE_ANALYSIS.md)). Six tools rank above us on feature breadth and community size. 
But none of them can answer yes to all three questions: diff --git a/generated/COMPETITIVE_ANALYSIS.md b/generated/competitive/COMPETITIVE_ANALYSIS.md similarity index 100% rename from generated/COMPETITIVE_ANALYSIS.md rename to generated/competitive/COMPETITIVE_ANALYSIS.md diff --git a/generated/competitive/joern.md b/generated/competitive/joern.md new file mode 100644 index 00000000..0b7d0487 --- /dev/null +++ b/generated/competitive/joern.md @@ -0,0 +1,338 @@ +# Competitive Deep-Dive: Codegraph vs Joern + +**Date:** 2026-03-02 +**Competitors:** `@optave/codegraph` v0.x (Apache-2.0) vs `joernio/joern` v4.x (Apache-2.0) +**Context:** Both are Apache-2.0-licensed code analysis tools with CLI interfaces. Joern is ranked #1 in our [competitive analysis](./COMPETITIVE_ANALYSIS.md) with a score of 4.5 vs codegraph's 4.0 at #8. + +--- + +## Executive Summary + +Joern and codegraph solve fundamentally **different problems** using code graphs as a shared substrate: + +| Dimension | Joern | Codegraph | +|-----------|-------|-----------| +| **Primary mission** | Vulnerability discovery & security research | Always-current structural code intelligence for developers and AI agents | +| **Target user** | Security researchers, pentesters, auditors | Developers, AI coding agents, CI pipelines | +| **Graph model** | Code Property Graph (AST + CFG + PDG + DDG) | Structural dependency graph (symbols + call edges + imports) | +| **Core question answered** | "Can attacker-controlled data reach this dangerous sink?" | "What breaks if I change this function?" | +| **Rebuild model** | Full re-import on every change (minutes) | Incremental sub-second rebuilds (milliseconds) | +| **Runtime** | JVM (Scala) — 4-100 GB heap | Node.js — <100 MB typical | + +**Bottom line:** Joern is deeper (taint analysis, control flow, data dependence). Codegraph is faster, lighter, and purpose-built for the developer/AI-agent loop. They are complementary tools, not direct substitutes. 
Where they overlap (structural queries, call graphs, language support), codegraph wins on speed and simplicity; Joern wins on analysis depth. + +--- + +## Problem Alignment with FOUNDATION.md + +Codegraph's foundation document defines the problem as: *"Fast local analysis with no AI, or powerful AI features that require full re-indexing through cloud APIs on every change. None of them give you an always-current graph."* + +### Principle-by-principle evaluation + +| # | Principle | Codegraph | Joern | Verdict | +|---|-----------|-----------|-------|---------| +| 1 | **The graph is always current** — rebuild on every commit/save/agent loop | File-level MD5 hashing. Change 1 file in 3,000 → <500ms rebuild. Watch mode, commit hooks, agent loops all practical | Full re-import always. Small project: 19-30s. Linux kernel: 6+ hours. No incremental mode. Unusable in tight feedback loops | **Codegraph wins decisively.** This is the single most important differentiator. Joern cannot participate in commit hooks or agent-driven loops | +| 2 | **Native speed, universal reach** — dual engine (Rust + WASM) | Native napi-rs with rayon parallelism + automatic WASM fallback. `npm install` on any platform | JVM/Scala. Requires JDK 19+. Pre-built binaries or Docker. No cross-platform auto-detection | **Codegraph wins.** Automatic platform detection with native performance + universal fallback vs. manual JVM setup | +| 3 | **Confidence over noise** — scored results | 6-level import resolution with 0.0-1.0 confidence on every edge. False-positive filtering. Graph quality score | Overapproximation by default (assumes full taint propagation for unresolved methods). Requires manual semantic definitions to reduce false positives | **Codegraph wins.** Scored results by default vs. noise-by-default requiring manual tuning | +| 4 | **Zero-cost core, LLM-enhanced when you choose** | Full pipeline local, zero API keys. Optional embeddings with user's LLM provider | Fully local, zero API keys. 
No LLM enhancement path | **Codegraph wins.** Both are local-first, but codegraph adds optional AI enhancement that Joern lacks entirely | +| 5 | **Functional CLI, embeddable API** | 35+ CLI commands + 18-tool MCP server + full programmatic JS API | Interactive Scala REPL + server mode + script execution. No MCP. Python client library | **Codegraph wins.** Purpose-built MCP for AI agents + embeddable npm package vs. Scala REPL that requires JVM expertise | +| 6 | **One registry, one schema, no magic** | `LANGUAGE_REGISTRY` — add a language in <100 lines, 2 files | Each language has a separate frontend (Eclipse CDT, JavaParser, GraalVM, etc.) — fundamentally different parsers per language | **Codegraph wins.** Uniform tree-sitter extraction vs. heterogeneous parser zoo | +| 7 | **Security-conscious defaults** — multi-repo opt-in | Single-repo MCP default. `apiKeyCommand` for secrets. `--multi-repo` opt-in | Server mode has no sandboxing (docs explicitly warn: "raw interpreter access"). No MCP isolation concept | **Codegraph wins.** Security-by-default vs. "trust the user" | +| 8 | **Honest about what we're not** | Code intelligence engine. Not an app, not a coding tool, not an agent | Code analysis platform for security research. Not a CI tool, not a developer productivity tool | **Tie.** Both are honest about scope. Different scopes | + +**Score: Codegraph 7, Joern 0, Tie 1** — against codegraph's own principles, codegraph wins overwhelmingly. This is expected: the principles were designed around codegraph's unique value proposition. The comparison below examines where Joern's strengths matter despite these principle misalignments. + +--- + +## Feature-by-Feature Comparison + +### A. Parsing & Language Support + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **Parser technology** | tree-sitter (WASM + native Rust) | Language-specific frontends (Eclipse CDT, JavaParser, GraalVM JS, etc.)
| **Joern** for depth per language (type-aware); **Codegraph** for uniformity and extensibility | +| **JavaScript** | tree-sitter (native + WASM) | GraalVM JS parser | **Codegraph** — native Rust speed + uniform extraction | +| **TypeScript** | tree-sitter (native + WASM) | GraalVM JS parser (TS via JS) | **Codegraph** — first-class TS + TSX support | +| **Python** | tree-sitter | JavaCC-based parser | **Tie** — both handle standard Python | +| **Go** | tree-sitter | go.parser | **Tie** | +| **Rust** | tree-sitter | Not directly supported (LLVM bitcode only) | **Codegraph** — direct source parsing vs. requiring LLVM compilation | +| **Java** | tree-sitter | JavaParser + Soot (bytecode) | **Joern** — bytecode analysis + type-aware parsing | +| **C/C++** | tree-sitter | Eclipse CDT (fuzzy parsing) | **Joern** — fuzzy parsing handles macros and incomplete code better | +| **C#** | tree-sitter | Roslyn (.NET) | **Joern** — compiler-grade .NET analysis | +| **PHP** | tree-sitter | PHP-Parser | **Tie** | +| **Ruby** | tree-sitter | ANTLR | **Tie** | +| **Kotlin** | Not supported | IntelliJ PSI | **Joern** | +| **Swift** | Not supported | SwiftSyntax | **Joern** | +| **Terraform/HCL** | tree-sitter | Not supported | **Codegraph** | +| **Binary analysis (x86/x64)** | Not supported | Ghidra disassembler | **Joern** | +| **JVM bytecode** | Not supported | Soot framework | **Joern** | +| **LLVM bitcode** | Not supported | LLVM frontend | **Joern** | +| **Language count** | 11 source languages | 13 source + 3 binary/bytecode/IR | **Joern** (16 vs 11) | +| **Adding a new language** | 1 registry entry + 1 extractor (<100 lines, 2 files) | New frontend module (thousands of lines, custom parser integration) | **Codegraph** — dramatically lower barrier | +| **Incomplete/non-compilable code** | Requires syntactically valid input (tree-sitter) | Fuzzy parsing handles partial/broken code | **Joern** — critical for security audits of partial codebases | +| **Incremental parsing** | 
File-level hash tracking — only changed files re-parsed | Full re-import always | **Codegraph** — orders of magnitude faster for iterative work | + +**Summary:** Joern covers more languages and handles edge cases (binaries, bytecode, broken code) that codegraph cannot. Codegraph is faster, simpler to extend, and has better support for modern web languages (TSX, Terraform). For codegraph's target users (developers, AI agents), codegraph's coverage is sufficient. For security researchers auditing compiled artifacts, Joern is essential. + +--- + +### B. Graph Model & Analysis Depth + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **Graph type** | Structural dependency graph (symbols + edges) | Code Property Graph (AST + CFG + PDG merged) | **Joern** for depth; **Codegraph** for speed | +| **Node types** | 10 kinds: `function`, `method`, `class`, `interface`, `type`, `struct`, `enum`, `trait`, `record`, `module` | 45+ node types across 18 layers (METHOD, CALL, IDENTIFIER, LITERAL, CONTROL_STRUCTURE, BLOCK, LOCAL, etc.) | **Joern** — 4x more granular | +| **Edge types** | `calls`, `imports` (with confidence scores) | 20+ types: AST, CFG, CDG, REACHING_DEF, CALL, ARGUMENT, RECEIVER, CONTAINS, EVAL_TYPE, REF, BINDS, DOMINATE, POST_DOMINATE, etc. 
| **Joern** — 10x more edge types, representing fundamentally different relationships | +| **Abstract Syntax Tree** | Extracted for complexity metrics, not stored in graph | Full AST stored and queryable | **Joern** | +| **Control Flow Graph** | Not available | Full CFG with dominator/post-dominator trees | **Joern** | +| **Data Dependence Graph** | Not available | Reaching definitions (def-use chains) across procedures | **Joern** | +| **Program Dependence Graph** | Not available | Combined control + data dependence | **Joern** | +| **Taint analysis** | Not available | Full interprocedural taint tracking (sources → sinks) | **Joern** — Joern's killer feature | +| **Call graph** | Import-aware resolution with 6-level confidence scoring, qualified call filtering | Pre-computed CALL edges, caller/callee traversal | **Codegraph** for precision (confidence scoring, false-positive filtering); **Joern** for completeness (type-aware resolution) | +| **Import resolution** | 6-level priority system with confidence scoring (import-aware → same-file → directory → parent → global → method hierarchy) | Type-based resolution via language frontends | **Codegraph** for transparency (scores); **Joern** for accuracy (type information) | +| **Dead code detection** | Node role classification: `roles --role dead` lists unreferenced non-exported symbols | No built-in dead code command (queryable via CPG traversals) | **Codegraph** — built-in command vs. 
manual query writing | +| **Complexity metrics** | Cognitive, cyclomatic, Halstead, MI, nesting depth per function | Not built-in (would require custom CPG queries) | **Codegraph** | +| **Node role classification** | Auto-tags every symbol: `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` based on fan-in/fan-out | Not available | **Codegraph** | +| **Community detection** | Louvain algorithm with drift analysis | Not built-in | **Codegraph** | +| **Impact analysis** | `fn-impact` (function-level), `diff-impact` (git-aware), `impact` (file-level) | Not purpose-built (achievable via CPG traversals) | **Codegraph** — first-class impact commands vs. manual graph traversal | +| **Shortest path** | `path ` — BFS between any two symbols | Not purpose-built (achievable via CPG traversals) | **Codegraph** — built-in command | +| **Custom data-flow semantics** | Not applicable | User-defined taint propagation rules for external methods | **Joern** | +| **Binary analysis** | Not available | Ghidra frontend: disassembly → CPG | **Joern** | +| **Execution flow tracing** | `flow` — traces from entry points (routes, commands, events) through callees to leaves | Achievable via CFG + call graph traversals | **Codegraph** — purpose-built command; **Joern** — more precise with CFG | + +**Summary:** Joern's CPG is fundamentally deeper — it captures control flow, data dependence, and taint propagation that codegraph's structural graph cannot represent. Codegraph compensates with purpose-built commands (impact analysis, complexity, roles, communities) that would require expert CPG query writing in Joern. For vulnerability discovery, Joern is irreplaceable. For developer productivity and AI agent consumption, codegraph's pre-built commands are more accessible. + +--- + +### C. 
Query Language & Interface + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **Query interface** | Fixed CLI commands with flags + SQL under the hood | Interactive Scala REPL with tab completion + arbitrary graph traversals | **Depends on user.** Codegraph for instant answers; Joern for exploratory research | +| **Query language** | CLI flags (`--kind`, `--file`, `--role`, `--json`) | CPGQL (Scala-based DSL): `cpg.method.name("foo").callee.name.l` | **Joern** for expressiveness; **Codegraph** for accessibility | +| **Learning curve** | Zero — standard CLI with `--help` | Steep — requires Scala/FP knowledge + graph theory | **Codegraph** | +| **AI agent interface** | 18-tool MCP server with structured JSON responses | Community MCP server (mcp-joern). REST/WebSocket server mode | **Codegraph** — first-party MCP vs. community add-on | +| **Compound queries** | `context` (source + deps + callers + tests in 1 call), `explain` (structural summary), `audit` (explain + impact + health) | Must compose via CPGQL chaining | **Codegraph** — purpose-built for agent token efficiency | +| **Batch queries** | `batch` command for multi-target dispatch | Script mode (`--script`) for batch execution | **Tie** — different approaches, both work | +| **JSON output** | `--json` flag on every command | `.toJsonPretty` method on query results | **Tie** | +| **Syntax-highlighted output** | Colored terminal output | `.dump` for syntax-highlighted code display | **Tie** | +| **Visualization** | DOT, Mermaid, JSON export | DOT, GraphML, GraphSON, Neo4j CSV export + interactive `.plotDotCfg` | **Joern** — more formats + interactive plotting | +| **Script execution** | Not available (but full programmatic JS API) | `--script test.sc` with params and imports | **Joern** for scripting; **Codegraph** for API embedding | +| **Plugin system** | Not available | JVM plugins (ZIP/JAR), DiffGraph API, schema extension | **Joern** | +| **Regex in 
queries** | Glob-style filtering on names | Full regex in all query steps + semantic definitions | **Joern** | + +**Summary:** Joern's CPGQL is vastly more expressive — you can write arbitrary graph traversals that codegraph simply cannot express. But this power comes with a steep learning curve (Scala + graph theory). Codegraph's fixed commands with flags are instantly usable by any developer or AI agent. For the target users defined in FOUNDATION.md (developers and AI agents, not security researchers), codegraph's approach is better. + +--- + +### D. Performance & Resource Usage + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **Cold build (small project, ~100 files)** | <2 seconds | 19-30 seconds | **Codegraph** (10-15x faster) | +| **Cold build (medium project, ~1,000 files)** | 5-15 seconds | 1-5 minutes | **Codegraph** (10-20x faster) | +| **Cold build (large project, ~50,000 files)** | 30-120 seconds (native Rust) | 30 minutes to hours | **Codegraph** (10-60x faster) | +| **Cold build (Linux kernel, ~30M LOC)** | Not benchmarked (estimated: minutes) | 6+ hours, 30-100 GB heap | **Codegraph** (estimated orders of magnitude faster) | +| **Incremental rebuild (1 file changed)** | <500ms | Full re-import (same as cold build) | **Codegraph** (100-10,000x faster) | +| **Memory usage (small project)** | <100 MB | 4-8 GB heap recommended | **Codegraph** (40-80x less memory) | +| **Memory usage (medium project)** | 100-300 MB | 8-16 GB heap | **Codegraph** (30-50x less memory) | +| **Memory usage (large project)** | 300 MB - 1 GB | 30-100 GB heap | **Codegraph** (30-100x less memory) | +| **Startup time** | <100ms (Node.js) | 5-15 seconds (JVM cold start) | **Codegraph** (50-150x faster) | +| **Storage format** | SQLite file (compact, portable) | Flatgraph binary (columnar, in-memory) | **Codegraph** — SQLite is universally readable; flatgraph is opaque | +| **Disk usage** | Typically <10 MB for medium projects | 
Linux kernel: 625 MB (flatgraph) | **Codegraph** (60x+ smaller) | +| **Overflow to disk** | SQLite handles this natively | Flatgraph has no overflow — entire graph must fit in memory | **Codegraph** — can handle repos larger than available RAM | +| **Parallel parsing** | Native Rust engine uses rayon for parallel tree-sitter | Language frontends may parallelize internally | **Codegraph** — explicit parallel architecture | +| **Watch mode** | Built-in `watch` command for live incremental rebuilds | Not available | **Codegraph** | +| **Commit hook viability** | Yes — <500ms rebuilds are invisible to developers | No — 19+ second minimum makes hooks impractical | **Codegraph** | +| **CI pipeline viability** | Yes — full build in seconds, `check` command returns exit code 0/1 | Possible but slow — Joern itself is "not yet well-suited as a CI/CD SAST tool" (per comparative analysis) | **Codegraph** | + +**Summary:** Codegraph is 10-10,000x faster than Joern depending on scenario. Joern's JVM overhead, full re-import model, and in-memory graph requirement make it unsuitable for tight feedback loops. This is codegraph's single most important competitive advantage (FOUNDATION.md Principle 1). + +--- + +### E. Installation & Deployment + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **Install method** | `npm install @optave/codegraph` | Shell script (`joern-install.sh`) or Docker or build from source (sbt) | **Codegraph** — one command vs. 
multi-step | +| **Runtime dependency** | Node.js >= 20 | JDK 19+ (JDK 21 recommended) | **Codegraph** — Node.js is more ubiquitous in developer environments | +| **External database** | None (SQLite embedded) | None (flatgraph embedded) | **Tie** | +| **Docker required** | No | No (but Docker images available) | **Tie** | +| **Platform binaries** | Auto-resolved per platform (`@optave/codegraph-{platform}-{arch}`) | Pre-built binaries for major platforms | **Codegraph** — npm handles platform resolution automatically | +| **Disk footprint (tool itself)** | ~50 MB (with WASM grammars) | ~500 MB+ (JVM + all frontends) | **Codegraph** (10x smaller) | +| **Offline capability** | Full functionality offline | Full functionality offline | **Tie** | +| **Configuration** | `.codegraphrc.json` + env vars + `apiKeyCommand` | JVM flags (`-Xmx`), workspace settings | **Codegraph** — simpler, declarative | +| **Uninstall** | `npm uninstall` | Manual removal of install directory | **Codegraph** | + +**Summary:** Codegraph is dramatically simpler to install and manage: a single `npm install` vs. downloading a shell script and ensuring JDK compatibility. For the FOUNDATION.md goal of "`npm install` and done" (Principles 2 and 5), codegraph is the clear winner. + +--- + +### F. 
AI Agent & MCP Integration + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **MCP server** | First-party, 18 tools, single-repo default, `--multi-repo` opt-in | Community-built (mcp-joern), Python wrapper around Joern | **Codegraph** — first-party, security-conscious, production-ready | +| **MCP tools count** | 18 purpose-built tools | ~10 tools (community MCP) | **Codegraph** | +| **Token efficiency** | `context`/`explain`/`audit` compound commands reduce agent round-trips by 50-80% | Raw query results, no compound optimization | **Codegraph** | +| **Structured JSON output** | Every command supports `--json` | `.toJsonPretty` on query results | **Tie** | +| **Pagination** | Built-in pagination helpers with configurable limits | Not built-in | **Codegraph** | +| **REST API** | Not available (MCP + programmatic API) | Server mode with REST + WebSocket | **Joern** for HTTP integration; **Codegraph** for MCP | +| **Python client** | Not available | `cpgqls-client-python` | **Joern** for Python ecosystems | +| **Programmatic embedding** | Full JS API: `import { buildGraph, queryNameData } from '@optave/codegraph'` | JVM-only: Scala/Java library | **Codegraph** for JS/TS ecosystems; **Joern** for JVM ecosystems | +| **Multi-repo support** | Registry-based, opt-in via `--multi-repo` or `--repos` | Workspace with multiple projects | **Tie** — different approaches | + +**Summary:** Codegraph is purpose-built for AI agent consumption (FOUNDATION.md Principle 5). Joern's community MCP exists but is a wrapper, not a first-class integration. For the AI-agent-driven development workflow that codegraph targets, codegraph is the clear choice. + +--- + +### G. 
Security Analysis + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **Taint analysis** | Not available | Full interprocedural source-to-sink tracking | **Joern** — this is Joern's raison d'etre | +| **Vulnerability scanning** | Not available | `joern-scan` with predefined query bundles, tag-based selection | **Joern** | +| **Data-flow tracking** | Not available | Reaching definitions, def-use chains across procedures | **Joern** | +| **Control-flow analysis** | Not available | Full CFG with dominator trees | **Joern** | +| **Custom security rules** | Not available | CPGQL-based custom queries + data-flow semantics | **Joern** | +| **Binary vulnerability analysis** | Not available | Ghidra integration for x86/x64 | **Joern** | +| **OWASP/CWE detection** | Not available (roadmap) | Achievable via custom CPGQL queries | **Joern** | +| **Secret scanning** | Not available | Not built-in | **Tie** — neither has it built-in | +| **CPG slicing** | Not available | `joern-slice` with data-flow and usages modes | **Joern** | + +**Summary:** Joern dominates security analysis completely. Codegraph has no security features today. This is by design — FOUNDATION.md Principle 8 says "we are not a security tool." OWASP pattern detection is on the roadmap as lightweight AST-based checks, not full taint analysis. + +--- + +### H. 
Developer Productivity Features + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **Impact analysis (function-level)** | `fn-impact ` — transitive callers + downstream impact | Achievable via CPGQL (not purpose-built) | **Codegraph** | +| **Impact analysis (git-aware)** | `diff-impact --staged` / `diff-impact main` — shows what functions break from git changes | Not available | **Codegraph** | +| **CI gate** | `check --staged` — exit code 0/1 for CI pipelines (cycles, complexity, blast radius, boundaries) | Not purpose-built for CI | **Codegraph** | +| **Complexity metrics** | `complexity` — cognitive, cyclomatic, Halstead, MI per function | Not built-in | **Codegraph** | +| **Code health manifesto** | `manifesto` — configurable rule engine with warn/fail thresholds | Not available | **Codegraph** | +| **Structure analysis** | `structure` — directory hierarchy with cohesion scores + per-file metrics | Not available | **Codegraph** | +| **Hotspot detection** | `hotspots` — files/dirs with extreme fan-in/fan-out/density | Not available | **Codegraph** | +| **Co-change analysis** | `co-change` — git history analysis for files that change together | Not available | **Codegraph** | +| **Branch comparison** | `branch-compare` — structural diff between branches | Not available | **Codegraph** | +| **Triage/risk ranking** | `triage` — ranked audit queue by composite risk score | Not available | **Codegraph** | +| **CODEOWNERS integration** | `owners` — maps functions to code owners | Not available | **Codegraph** | +| **Semantic search** | `search` — natural language function search with optional embeddings | Not available | **Codegraph** | +| **Watch mode** | `watch` — live incremental rebuilds on file changes | Not available | **Codegraph** | +| **Snapshot management** | `snapshot save/restore` — DB backup and restore | Workspace save/undo | **Tie** | +| **Execution flow tracing** | `flow` — traces from entry points 
through callees | Achievable via CFG traversals (more precise) | **Codegraph** for convenience; **Joern** for precision | +| **Module overview** | `map` — high-level module map with most-connected nodes | Not purpose-built | **Codegraph** | +| **Cycle detection** | `cycles` — circular dependency detection | Achievable via CPGQL | **Codegraph** — built-in command | +| **Export formats** | DOT, Mermaid, JSON | DOT, GraphML, GraphSON, Neo4j CSV | **Joern** — more export formats | + +**Summary:** Codegraph has 15+ purpose-built developer productivity commands that Joern either lacks entirely or requires expert CPGQL queries to achieve. This is where codegraph's value proposition is strongest for its target audience. + +--- + +### I. Ecosystem & Community + +| Feature | Codegraph | Joern | Best Approach | +|---------|-----------|-------|---------------| +| **GitHub stars** | New project (growing) | ~2,968 | **Joern** | +| **Contributors** | Small team | 64 | **Joern** | +| **Release cadence** | As needed | **Daily automated releases** | **Joern** — impressive automation | +| **Academic backing** | None | IEEE S&P 2014 paper (Test-of-Time Award 2024), TU Braunschweig, Stellenbosch University | **Joern** | +| **Commercial backing** | Optave AI Solutions Inc. 
| Qwiet AI (formerly ShiftLeft), Privado, Whirly Labs | **Joern** — multiple sponsors | +| **Documentation** | CLAUDE.md + CLI `--help` + programmatic API docs | docs.joern.io + cpg.joern.io + blog + query database | **Joern** — comprehensive docs site | +| **Community channels** | GitHub Issues | Discord + GitHub Issues + Twitter | **Joern** — more channels | +| **Plugin ecosystem** | Not available | JVM plugin system with sample plugin | **Joern** | +| **Client libraries** | JS/TS (first-party) | Python client (first-party), any language via REST | **Tie** — different language ecosystems | +| **License** | Apache-2.0 | Apache-2.0 | **Tie** | + +**Summary:** Joern has a massive head start — 7 years of development, academic foundation, commercial backing, and a mature community. Codegraph is a new entrant competing on a different value proposition. + +--- + +## Where Each Tool is the Better Choice + +### Choose Codegraph when: + +1. **You need the graph to stay current in tight feedback loops** — commit hooks, watch mode, AI agent loops. Joern's 19+ second minimum rebuild makes this impossible. +2. **You're building AI-agent-driven workflows** — MCP server, compound commands, structured JSON, token-efficient responses. Codegraph is purpose-built for this. +3. **You want zero-configuration setup** — `npm install` vs. JDK + shell script + heap tuning. +4. **Memory is constrained** — <100 MB vs. 4-100 GB. Codegraph runs on any developer machine; Joern may require dedicated infrastructure for large repos. +5. **You need developer productivity features** — impact analysis, complexity metrics, code health rules, co-change analysis, hotspots, structure analysis. These don't exist in Joern. +6. **You're working with modern web stacks** — TSX, Terraform, and tree-sitter's broad but uniform coverage. Joern's web language support is secondary to its C/C++/Java strength. +7. **You want scored, confidence-ranked results** — every edge has a confidence score. 
Joern overapproximates by default. +8. **You're integrating into CI/CD** — `check --staged` returns exit code 0/1 in seconds. Joern is "not yet well-suited" for CI/CD. + +### Choose Joern when: + +1. **You're doing security research or vulnerability discovery** — taint analysis, CPG traversals, binary analysis. Codegraph has zero security analysis features. +2. **You need control-flow or data-dependence analysis** — CFG, PDG, DDG, dominator trees. Codegraph's structural graph doesn't capture these. +3. **You're analyzing compiled artifacts** — JVM bytecode, LLVM bitcode, x86/x64 binaries. Codegraph is source-only. +4. **You need exploratory graph queries** — CPGQL lets you write arbitrary traversals. Codegraph's fixed commands can't express ad-hoc queries. +5. **You're auditing C/C++ code** — Eclipse CDT's fuzzy parsing handles macros, `#ifdef`, and incomplete code that tree-sitter cannot. +6. **You need to analyze non-compilable code** — partial codebases, broken builds, code fragments. Joern's fuzzy parsing handles these; tree-sitter requires syntactically valid input. +7. **You want academic-grade analysis** — Joern is backed by published research with IEEE recognition. Its CPG model is formally specified. +8. **You're in a JVM ecosystem** — Scala/Java/Kotlin interop, Soot bytecode analysis, plugin system. + +### Use both together when: + +- **CI pipeline**: Codegraph for fast structural checks on every commit (`check --staged`), Joern for periodic deep security scans (weekly/release-gated). +- **AI agent workflow**: Codegraph's MCP provides structural context in agent loops; Joern's server mode provides deep analysis for security-focused queries. +- **Pre-commit + pre-release**: Codegraph in commit hooks (fast), Joern in release gates (thorough). 
+
+---
+
+## Gap Analysis: What Codegraph Could Learn from Joern
+
+### Worth adopting (adapted to codegraph's model)
+
+| Joern Feature | Adaptation for Codegraph | Effort | Priority |
+|---------------|--------------------------|--------|----------|
+| **CPG slicing** | Lightweight call-chain slicing — extract a subgraph around a function (callers + callees to depth N) as standalone JSON. Not full PDG slicing, but useful for AI context windows | Medium | High — directly serves AI agent use case |
+| **More export formats** | Add GraphML and Neo4j CSV to `export` command alongside existing DOT/Mermaid/JSON | Low | Medium |
+| **Interactive plotting** | `plotDotCfg`-style browser-based visualization from `export --format html` | Medium | Medium — on roadmap as "interactive HTML visualization" |
+| **Script/batch automation** | Already have `batch` command. Could add a simple query script format for CI pipelines | Low | Low |
+| **Custom query language** | Not worth building a DSL. Instead, expand `--filter` expressions on existing commands (e.g. `where --filter "fanIn > 5 AND kind = function"`) | Medium | Medium |
+
+### Not worth adopting (violates FOUNDATION.md)
+
+| Joern Feature | Why Not |
+|---------------|---------|
+| **Full CPG (AST + CFG + PDG)** | Would require fundamentally different parsing — we'd be rebuilding Joern. Violates Principle 1 (rebuild speed) and Principle 6 (one registry). Tree-sitter + lightweight dataflow is the pragmatic path |
+| **Taint analysis** | Requires control-flow and data-dependence graphs we don't have. Adding these would 10-100x our build time, violating Principle 1 |
+| **Scala DSL** | Our users are developers and AI agents, not security researchers. Fixed commands with flags serve them better (Principle 5) |
+| **JVM binary analysis** | Violates Principle 8 (honest about what we're not) — we're a source code tool |
+| **Plugin system** | Premature complexity. Programmatic API + MCP tools are sufficient interfaces today |
+| **Workspace with multiple loaded CPGs** | Our registry + `--multi-repo` achieves this without loading multiple graphs into memory simultaneously |
+
+---
+
+## Competitive Positioning Statement
+
+> **Joern is the gold standard for security-focused code analysis** — if you need taint tracking, control-flow analysis, or binary vulnerability discovery, nothing else comes close. But its JVM overhead (4-100 GB heap), full re-import model (minutes to hours), and Scala learning curve make it impractical for the fast-feedback, AI-agent-driven development workflows that modern teams need.
+>
+> **Codegraph occupies a different niche entirely:** always-current structural intelligence that rebuilds in milliseconds, runs with zero configuration, and serves AI agents via purpose-built MCP tools. Where Joern answers "can attacker data reach this sink?", codegraph answers "what breaks if I change this function?" — and answers it 1,000x faster.
+>
+> They are not substitutes. They are complements. The team that uses codegraph in their commit hooks and Joern in their release gates gets the best of both worlds.
+
+---
+
+## Key Metrics Summary
+
+| Metric | Codegraph | Joern | Winner |
+|--------|-----------|-------|--------|
+| Incremental rebuild speed | <500ms | N/A (full re-import) | Codegraph |
+| Cold build speed | Seconds | Minutes to hours | Codegraph |
+| Memory usage | <100 MB typical | 4-100 GB | Codegraph |
+| Install complexity | `npm install` | JDK + shell script | Codegraph |
+| Analysis depth (structural) | High | Very High | Joern |
+| Analysis depth (security) | None | Best in class | Joern |
+| AI agent integration | 18-tool MCP (first-party) | Community MCP wrapper | Codegraph |
+| Developer productivity commands | 35+ built-in | ~5 built-in + custom CPGQL | Codegraph |
+| Language support | 11 | 16 (incl. binary/bytecode) | Joern |
+| Query expressiveness | Fixed commands | Arbitrary graph traversals | Joern |
+| Community & maturity | New | 7 years, IEEE award, 2,968 stars | Joern |
+| CI/CD readiness | Yes (`check --staged`) | Limited | Codegraph |
+
+**Final score against FOUNDATION.md principles: Codegraph 6, Joern 0, Tie 2.**
+Joern doesn't compete on codegraph's principles — it competes on analysis depth and security research, which are outside codegraph's stated scope.

From d3c4daef43221872419444718b00e805ca53ca60 Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 2 Mar 2026 17:50:37 -0700
Subject: [PATCH 2/8] fix: update broken links to moved COMPETITIVE_ANALYSIS.md

README.md and docs/roadmap/BACKLOG.md still referenced the old path at generated/COMPETITIVE_ANALYSIS.md after the file was moved to generated/competitive/COMPETITIVE_ANALYSIS.md in #260.
---
 README.md | 2 +-
 docs/roadmap/BACKLOG.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 6784fe9a..4f79017c 100644
--- a/README.md
+++ b/README.md
@@ -69,7 +69,7 @@ That's it. No config files, no Docker, no JVM, no API keys, no accounts. The gra
 
 ### Feature comparison
 
-Comparison last verified: March 2026. Full analysis: COMPETITIVE_ANALYSIS.md
+Comparison last verified: March 2026. Full analysis: COMPETITIVE_ANALYSIS.md
 
 | Capability | codegraph | [joern](https://github.com/joernio/joern) | [narsil-mcp](https://github.com/postrv/narsil-mcp) | [code-graph-rag](https://github.com/vitali87/code-graph-rag) | [cpg](https://github.com/Fraunhofer-AISEC/cpg) | [GitNexus](https://github.com/abhigyanpatwari/GitNexus) | [CodeMCP](https://github.com/SimplyLiz/CodeMCP) | [axon](https://github.com/harshkedia177/axon) |
 |---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
diff --git a/docs/roadmap/BACKLOG.md b/docs/roadmap/BACKLOG.md
index 7b6e7fd2..5e4aa7f7 100644
--- a/docs/roadmap/BACKLOG.md
+++ b/docs/roadmap/BACKLOG.md
@@ -1,7 +1,7 @@
 # Codegraph Feature Backlog
 
 **Last updated:** 2026-03-02
-**Source:** Features derived from [COMPETITIVE_ANALYSIS.md](../../generated/COMPETITIVE_ANALYSIS.md) and internal roadmap discussions.
+**Source:** Features derived from [COMPETITIVE_ANALYSIS.md](../../generated/competitive/COMPETITIVE_ANALYSIS.md) and internal roadmap discussions.
 
 ---
 

From 3b4da909be5b9bbe6d49a89f08586c17c9993d5f Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 2 Mar 2026 18:12:08 -0700
Subject: [PATCH 3/8] docs: add Joern-inspired feature candidates with BACKLOG-style grading
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Append a new "Joern-Inspired Feature Candidates" section to the Joern competitive deep-dive. Lists 11 actionable features extracted from Parsing & Language Support, Graph Model & Analysis Depth, and Query Language & Interface sections — assessed with the same tier/grading system used in BACKLOG.md (zero-dep, foundation-aligned, problem-fit, breaking). Tier 1 non-breaking: call-chain slicing, type-informed resolution, error-tolerant parsing, regex filtering, Kotlin, Swift, script execution. Tier 1 breaking: expanded node/edge types, intraprocedural CFG, stored AST. Not adopted: 9 features with FOUNDATION.md reasoning. Cross-references BACKLOG IDs 14 and 7.
---
 generated/competitive/joern.md | 54 ++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/generated/competitive/joern.md b/generated/competitive/joern.md
index 0b7d0487..a6960682 100644
--- a/generated/competitive/joern.md
+++ b/generated/competitive/joern.md
@@ -336,3 +336,57 @@ Codegraph's foundation document defines the problem as: *"Fast local analysis wi
 
 **Final score against FOUNDATION.md principles: Codegraph 6, Joern 0, Tie 2.**
 Joern doesn't compete on codegraph's principles — it competes on analysis depth and security research, which are outside codegraph's stated scope.
+
+---
+
+## Joern-Inspired Feature Candidates
+
+Features extracted from sections **A. Parsing & Language Support**, **B. Graph Model & Analysis Depth**, and **C. Query Language & Interface** above, assessed using the [BACKLOG.md](../../docs/roadmap/BACKLOG.md) tier and grading system. See the [Scoring Guide](../../docs/roadmap/BACKLOG.md#scoring-guide) for column definitions.
+
+### Tier 1 — Zero-dep + Foundation-aligned (build these first)
+
+Non-breaking, ordered by problem-fit:
+
+| ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking |
+|----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|
+| J1 | Lightweight call-chain slicing | Extract a bounded subgraph around a function (callers + callees to depth N) as standalone JSON/DOT/Mermaid. Not full PDG slicing — structural BFS on existing edges, exported as a self-contained artifact. Inspired by Joern's `joern-slice`. | Navigation | Agents get precisely-scoped subgraphs that fit context windows instead of full graph dumps — directly reduces token waste | ✓ | ✓ | 4 | No |
+| J2 | Type-informed call resolution | Extract type annotations from tree-sitter AST (TypeScript types, Java types, Go types, Python type hints) and use them to disambiguate call targets during import resolution. Improves edge accuracy without full type inference. Inspired by Joern's type-aware language frontends. | Analysis | Call graphs become more precise — fewer false edges means less noise in `fn-impact` and agents don't chase phantom dependencies | ✓ | ✓ | 4 | No |
+| J3 | Error-tolerant partial parsing | Leverage tree-sitter's built-in error recovery to extract symbols from syntactically incomplete or broken files instead of skipping them entirely. Surface partial results with a quality indicator per file. Currently codegraph requires syntactically valid input; Joern's fuzzy parsing handles partial/broken code. | Parsing | Agents can analyze WIP branches, partial checkouts, and code mid-refactor — essential for real-world AI-agent loops where code is often in a broken state | ✓ | ✓ | 3 | No |
+| J4 | Kotlin language support | Add tree-sitter-kotlin to `LANGUAGE_REGISTRY`. 1 registry entry + 1 extractor function (<100 lines, 2 files). Covers functions, classes, interfaces, objects, data classes, companion objects, call sites. Kotlin is one of Joern's strongest languages (via IntelliJ PSI). | Parsing | Extends coverage to Android/KMP ecosystem — one of the most-requested missing languages and a gap vs. Joern | ✓ | ✓ | 2 | No |
+| J5 | Swift language support | Add tree-sitter-swift to `LANGUAGE_REGISTRY`. 1 registry entry + 1 extractor function (<100 lines, 2 files). Covers functions, classes, structs, protocols, enums, extensions, call sites. Joern supports Swift via SwiftSyntax. | Parsing | Extends coverage to Apple/iOS ecosystem — currently a gap vs. Joern. tree-sitter-swift is mature enough for production use | ✓ | ✓ | 2 | No |
+| J10 | Regex filtering in queries | Upgrade name filtering from glob-style to full regex on `where`, `list-functions`, `roles`, and other symbol-listing commands. Add `--regex` flag alongside existing glob behavior. Joern supports full regex in all CPGQL query steps. | Query | Agents and power users can express precise symbol patterns (e.g. `--regex "^(get\|set)[A-Z]"`) — reduces result noise and round-trips for targeted queries | ✓ | ✓ | 3 | No |
+| J11 | Query script execution | Simple `.codegraph` script format: a sequence of CLI commands executed in order, with variable substitution and JSON piping between steps. Not a DSL — just a thin automation layer over existing commands. Inspired by Joern's `--script test.sc` with params and imports. | Automation | CI pipelines and agent orchestrators can run multi-step analysis sequences in one invocation instead of chaining shell commands — reduces boilerplate and ensures consistent execution | ✓ | ✓ | 2 | No |
+
+Breaking (penalized to end of tier):
+
+| ID | Title | Description | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking |
+|----|-------|-------------|----------|---------|----------|-------------------|-------------------|----------|
+| J6 | Expanded node types | Extract parameters, local variables, return types, and control structures as first-class graph nodes. Expands from 10 `SYMBOL_KINDS` to ~20. Enables richer queries like "which functions take a `Request` parameter?" without reading source. Inspired by Joern's 45+ node types across 18 layers. | Graph Model | Agents can answer structural questions about function signatures and internal shape from the graph alone — fewer source-reading round-trips | ✓ | ✓ | 3 | Yes |
+| J7 | Expanded edge types | Add `contains`, `parameter_of`, `return_type`, `receiver`, `type_of` edges alongside existing `calls`/`imports`. Expands from 2 edge types to ~7. Enables structural queries across containment and type relationships. Inspired by Joern's 20+ edge types (AST, CDG, REACHING_DEF, ARGUMENT, RECEIVER, etc.). | Graph Model | Richer graph structure supports more precise impact analysis and enables queries that currently require source reading | ✓ | ✓ | 3 | Yes |
+| J8 | Intraprocedural control flow graph | Build lightweight CFG within functions from tree-sitter AST: basic blocks, branches, loops, early returns. Store as edges with type `cfg`. Does not require language-specific compiler frontends — tree-sitter control structure nodes are sufficient. Prerequisite for dataflow analysis ([BACKLOG ID 14](../../docs/roadmap/BACKLOG.md)). Inspired by Joern's full CFG with dominator/post-dominator trees. | Graph Model | Enables complexity-aware impact analysis and opens the path to lightweight dataflow — bridges the gap between structural-only and Joern's full CPG without violating P1 rebuild speed | ✓ | ✓ | 3 | Yes |
+| J9 | Stored queryable AST | Persist selected AST nodes (statements, expressions, literals) in a dedicated SQLite table alongside symbols. Queryable via CLI/MCP for pattern matching (e.g. "find all `eval()` calls", "find hardcoded strings"). Currently AST is extracted for complexity metrics but not stored in the graph. Inspired by Joern's full AST storage and queryability. | Graph Model | Enables lightweight AST-based pattern detection (security patterns, anti-patterns) without re-parsing source files — foundation for [BACKLOG ID 7](../../docs/roadmap/BACKLOG.md) (OWASP/CWE patterns) | ✓ | ✓ | 3 | Yes |
+
+### Not adopted (violates FOUNDATION.md)
+
+These Joern features were evaluated and deliberately excluded:
+
+| Joern Feature | Section | Why Not |
+|---------------|---------|---------|
+| **Full CPG (AST + CFG + PDG merged)** | B | Would require fundamentally different parsing — we'd be rebuilding Joern. Violates P1 (rebuild speed) and P6 (one registry). Tree-sitter + lightweight dataflow is the pragmatic path |
+| **Interprocedural taint analysis** | B | Requires control-flow and data-dependence graphs we don't have. Adding these would 10-100x build time, violating P1. Joern's killer feature, but outside our scope |
+| **Program Dependence Graph (PDG)** | B | Combined control + data dependence requires full CFG + DDG. The lightweight CFG in J8 is a deliberate subset — full PDG is Joern territory |
+| **Custom data-flow semantics** | B | User-defined taint propagation rules require the taint infrastructure we've chosen not to build. Joern's `Semantics` DSL is powerful but orthogonal to our goals |
+| **JVM bytecode analysis** | A | Violates P8 (honest about what we're not) — we're a source code tool. Requires Soot or equivalent JVM dependency |
+| **LLVM bitcode analysis** | A | Violates P8 — requires LLVM toolchain. We analyze source, not compiler intermediate representations |
+| **Binary analysis (x86/x64)** | A | Violates P8 — requires Ghidra or equivalent disassembler. Fundamentally different problem domain |
+| **Language-specific compiler frontends** | A | Violates P6 (one registry, one schema, no magic). Joern uses Eclipse CDT for C/C++, JavaParser for Java, Roslyn for C#, IntelliJ PSI for Kotlin — each is a separate, heavyweight parser. Tree-sitter uniformity is a deliberate advantage worth preserving |
+| **Plugin system (JVM plugins, DiffGraph API)** | C | Premature complexity. Programmatic JS API + MCP tools are sufficient extension interfaces today. JVM-style plugin architecture (ZIP/JAR, schema extension) adds maintenance burden without clear user demand. Revisit if extension points become a bottleneck |
+
+### Cross-references to existing BACKLOG items
+
+These Joern-inspired capabilities are already tracked in [BACKLOG.md](../../docs/roadmap/BACKLOG.md):
+
+| BACKLOG ID | Title | Joern Equivalent | Relationship |
+|------------|-------|------------------|--------------|
+| 14 | Dataflow analysis | Data Dependence Graph (def-use chains) | The lightweight codegraph equivalent of Joern's DDG — `flows_to`/`returns`/`mutates` edge types. Already Tier 1 Breaking. J8 (intraprocedural CFG) is a prerequisite |
+| 7 | OWASP/CWE pattern detection | Vulnerability scanning (`joern-scan`) | Lightweight AST-based security checks — the codegraph-appropriate alternative to Joern's taint-based vulnerability scanning. Already Tier 3. J9 (stored queryable AST) is a prerequisite |

From 0faf02c9e60a4c225d1d01ae71915f069cead9bf Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 2 Mar 2026 18:31:40 -0700
Subject: [PATCH 4/8] docs: add competitive deep-dive for Narsil-MCP with feature candidates

Comprehensive comparison across 10 dimensions: parsing (32 vs 11 languages), graph model (CFG/DFG/type inference vs complexity/roles/communities), search (similarity/chunking vs RRF hybrid), security (147 rules vs none), queries (90 tools vs 21 + compound commands), performance (cold start vs incremental), install, MCP integration, developer productivity, and ecosystem.
Feature candidates section covers all comparison sections:
- Tier 1 non-breaking (10): MCP presets, AST chunking, code similarity, git blame/symbol history, remote repo indexing, config wizard, Kotlin, Swift, Bash, Scala language support
- Tier 1 breaking (1): export map per module
- Tier 2 (2): interactive HTML viz, multiple embedding backends
- Tier 3 (2): OWASP patterns, SBOM generation
- Not adopted (10): taint, type inference, SPARQL/RDF, CCG, in-memory arch, 90-tool surface, browser WASM, Forgemax, LSP, license scanning
- Cross-references to BACKLOG IDs 7, 8, 10, 14 and Joern candidates J4, J5, J8, J9
---
 generated/competitive/narsil-mcp.md | 415 ++++++++++++++++++++++++++++
 1 file changed, 415 insertions(+)
 create mode 100644 generated/competitive/narsil-mcp.md

diff --git a/generated/competitive/narsil-mcp.md b/generated/competitive/narsil-mcp.md
new file mode 100644
index 00000000..0bab58d0
--- /dev/null
+++ b/generated/competitive/narsil-mcp.md
@@ -0,0 +1,415 @@
+# Competitive Deep-Dive: Codegraph vs Narsil-MCP
+
+**Date:** 2026-03-02
+**Competitors:** `@optave/codegraph` v2.x (Apache-2.0) vs `postrv/narsil-mcp` v1.6 (Apache-2.0 OR MIT)
+**Context:** Both are Apache-2.0-licensed code analysis tools with MCP interfaces. Narsil-MCP is ranked #2 in our [competitive analysis](./COMPETITIVE_ANALYSIS.md) with a score of 4.5 vs codegraph's 4.0 at #8.
+
+---
+
+## Executive Summary
+
+Narsil-MCP and codegraph share more DNA than any other pair in the competitive landscape — both use tree-sitter, both serve AI agents via MCP, both are local-first. But they diverge sharply in philosophy:
+
+| Dimension | Narsil-MCP | Codegraph |
+|-----------|------------|-----------|
+| **Primary mission** | Maximum-breadth code intelligence in a single binary | Always-current structural intelligence with sub-second rebuilds |
+| **Target user** | AI agents needing comprehensive analysis (security, types, dataflow) | Developers, AI coding agents, CI pipelines needing fast feedback |
+| **Architecture** | MCP-first, no standalone CLI queries | Full CLI + MCP server + programmatic JS API |
+| **Core question answered** | "Tell me everything about this code" (90 tools) | "What breaks if I change this function?" (focused commands) |
+| **Rebuild model** | In-memory index, opt-in persistence, file watcher | SQLite-persisted, incremental hash-based rebuilds |
+| **Runtime** | Single Rust binary (~30 MB) | Node.js + optional native Rust addon |
+
+**Bottom line:** Narsil-MCP is broader (90 tools, 32 languages, security scanning, taint analysis, SBOM, type inference). Codegraph is deeper on developer productivity (impact analysis, complexity metrics, community detection, architecture boundaries, manifesto rules) and faster for iterative workflows (incremental rebuilds, CI gates). Where they overlap (call graphs, dead code, search, MCP), narsil has more tools while codegraph has more purpose-built commands. They are the closest competitors in the landscape.
+
+---
+
+## Problem Alignment with FOUNDATION.md
+
+Codegraph's foundation document defines the problem as: *"Fast local analysis with no AI, or powerful AI features that require full re-indexing through cloud APIs on every change. None of them give you an always-current graph."*
+
+### Principle-by-principle evaluation
+
+| # | Principle | Codegraph | Narsil-MCP | Verdict |
+|---|-----------|-----------|------------|---------|
+| 1 | **The graph is always current** — rebuild on every commit/save/agent loop | File-level MD5 hashing, SQLite persistence. Change 1 file → <500ms rebuild. Watch mode, commit hooks, agent loops all practical | In-memory by default. `--watch` flag for auto-reindex. `--persist` for disk saves. Indexing is fast (2.1s for 50K symbols) but full re-index, not incremental | **Codegraph wins.** Narsil is fast but re-indexes everything. Codegraph only re-parses changed files — orders of magnitude faster for single-file changes in large repos |
+| 2 | **Native speed, universal reach** — dual engine (Rust + WASM) | Native napi-rs with rayon parallelism + automatic WASM fallback. `npm install` on any platform | Pure Rust binary. Prebuilt for macOS/Linux/Windows. Also has WASM build (~3 MB) for browsers | **Tie.** Different approaches, both effective. Narsil is a single binary; codegraph is an npm package with native addon. Both have WASM stories |
+| 3 | **Confidence over noise** — scored results | 6-level import resolution with 0.0-1.0 confidence on every edge. Graph quality score. Relevance-ranked search | BM25 ranking on search. No confidence scores on call graph edges. No graph quality metric | **Codegraph wins.** Every edge has a trust score; narsil's call graph edges are unscored |
+| 4 | **Zero-cost core, LLM-enhanced when you choose** | Full pipeline local, zero API keys. Optional embeddings with user's LLM provider | Core is local. Neural search requires `--neural` flag + API key (Voyage AI/OpenAI) or local ONNX model | **Tie.** Both are local-first with optional AI enhancement. Narsil offers more backend choices (Voyage AI, OpenAI, ONNX); codegraph uses HuggingFace Transformers locally |
+| 5 | **Functional CLI, embeddable API** | 35+ CLI commands + 18-tool MCP server + full programmatic JS API | MCP-first with 90 tools. `narsil-mcp config/tools` management commands but no standalone query CLI. No programmatic library API | **Codegraph wins.** Full CLI experience + embeddable API. Narsil is MCP-only for queries — useless without an MCP client |
+| 6 | **One registry, one schema, no magic** | `LANGUAGE_REGISTRY` — add a language in <100 lines, 2 files | Tree-sitter for all 32 languages. Unified parser, but extractors are in compiled Rust — harder to contribute | **Codegraph wins slightly.** Both use tree-sitter uniformly. Codegraph's JS extractors are more accessible to contributors than narsil's compiled Rust |
+| 7 | **Security-conscious defaults** — multi-repo opt-in | Single-repo MCP default. `apiKeyCommand` for secrets. `--multi-repo` opt-in | Multi-repo by default (`--repos` accepts multiple paths). `discover_repos` auto-finds repos. No sandboxing concept | **Codegraph wins.** Single-repo isolation by default vs. multi-repo by default |
+| 8 | **Honest about what we're not** | Code intelligence engine. Not an app, not a coding tool, not an agent | Code intelligence MCP server. Also not an agent — but the open-core model adds commercial cloud features (narsil-cloud) | **Tie.** Both are honest about scope. Narsil's commercial layer is a legitimate business model |
+
+**Score: Codegraph 4, Narsil 0, Tie 4** — codegraph wins on its own principles but the gap is much smaller than vs. Joern. Narsil is the closest philosophical competitor.
+
+---
+
+## Feature-by-Feature Comparison
+
+### A. Parsing & Language Support
+
+| Feature | Codegraph | Narsil-MCP | Best Approach |
+|---------|-----------|------------|---------------|
+| **Parser technology** | tree-sitter (WASM + native Rust) | tree-sitter (compiled Rust) | **Tie** — same parser, different build strategies |
+| **JavaScript/TypeScript/TSX** | First-class, separate grammars | Supported (JS + TS) | **Codegraph** — explicit TSX support |
+| **Python** | tree-sitter | tree-sitter | **Tie** |
+| **Go** | tree-sitter | tree-sitter | **Tie** |
+| **Rust** | tree-sitter | tree-sitter | **Tie** |
+| **Java** | tree-sitter | tree-sitter | **Tie** |
+| **C/C++** | tree-sitter | tree-sitter | **Tie** |
+| **C#** | tree-sitter | tree-sitter | **Tie** |
+| **PHP** | tree-sitter | tree-sitter | **Tie** |
+| **Ruby** | tree-sitter | tree-sitter | **Tie** |
+| **Terraform/HCL** | tree-sitter | Not listed | **Codegraph** |
+| **Kotlin** | Not supported | tree-sitter | **Narsil** |
+| **Swift** | Not supported | tree-sitter | **Narsil** |
+| **Scala** | Not supported | tree-sitter | **Narsil** |
+| **Lua** | Not supported | tree-sitter | **Narsil** |
+| **Haskell** | Not supported | tree-sitter | **Narsil** |
+| **Elixir/Erlang** | Not supported | tree-sitter | **Narsil** |
+| **Dart** | Not supported | tree-sitter | **Narsil** |
+| **Julia/R/Perl** | Not supported | tree-sitter | **Narsil** |
+| **Zig** | Not supported | tree-sitter | **Narsil** |
+| **Verilog/SystemVerilog** | Not supported | tree-sitter | **Narsil** |
+| **Fortran/PowerShell/Nix** | Not supported | tree-sitter | **Narsil** |
+| **Bash** | Not supported | tree-sitter | **Narsil** |
+| **Language count** | 11 | 32 | **Narsil** (3x more languages) |
+| **Adding a new language** | 1 registry entry + 1 JS extractor (<100 lines, 2 files) | Rust code + recompile binary | **Codegraph** — dramatically lower barrier for contributors |
+| **Incremental parsing** | File-level hash tracking — only changed files re-parsed | Full re-index (fast but complete) | **Codegraph** — orders of magnitude faster for single-file changes |
+| **Callback pattern extraction** | Commander `.command().action()`, Express routes, event handlers | Not documented | **Codegraph** — framework-aware symbol extraction |
+
+**Summary:** Narsil covers 3x more languages (32 vs 11) using the same parser technology (tree-sitter). Codegraph has better incremental parsing, easier extensibility, and unique framework callback extraction. For codegraph's target users (JS/TS/Python/Go developers), codegraph's coverage is sufficient. Narsil's breadth matters for polyglot enterprises.
+
+---
+
+### B. Graph Model & Analysis Depth
+
+| Feature | Codegraph | Narsil-MCP | Best Approach |
+|---------|-----------|------------|---------------|
+| **Graph type** | Structural dependency graph (symbols + edges) in SQLite | In-memory symbol/file caches (DashMap) + optional RDF knowledge graph | **Codegraph** for persistence; **Narsil** for RDF expressiveness |
+| **Node types** | 10 kinds: `function`, `method`, `class`, `interface`, `type`, `struct`, `enum`, `trait`, `record`, `module` | Functions, classes, methods, variables, imports, exports + more | **Narsil** — more granular |
+| **Edge types** | `calls`, `imports` (with confidence scores) | Calls, imports, data flow, control flow, type relationships | **Narsil** — fundamentally more edge types |
+| **Call graph** | Import-aware resolution with 6-level confidence scoring, qualified call filtering | `get_call_graph`, `get_callers`, `get_callees`, `find_call_path` | **Codegraph** for precision (confidence scoring); **Narsil** for completeness |
+| **Control flow graph** | Not available | `get_control_flow` — basic blocks + branch conditions | **Narsil** |
+| **Data flow analysis** | `flows_to`/`returns`/`mutates` edges (BACKLOG ID 14, recently shipped) | `get_data_flow`, `get_reaching_definitions`, `find_uninitialized`, `find_dead_stores` | **Narsil** — more mature with 4 dedicated tools |
+| **Type inference** | Not available | `infer_types`, `check_type_errors` for Python/JS/TS | **Narsil** |
+| **Dead code detection** | `roles --role dead` — unreferenced non-exported symbols | `find_dead_code` — unreachable code paths via CFG | **Both** — complementary approaches (structural vs. control-flow) |
+| **Complexity metrics** | Cognitive, cyclomatic, Halstead, MI, nesting depth per function | Cyclomatic complexity only | **Codegraph** — 5 metrics vs 1 |
+| **Node role classification** | Auto-tags: `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` | Not available | **Codegraph** |
+| **Community detection** | Louvain algorithm with drift analysis | Not available | **Codegraph** |
+| **Impact analysis** | `fn-impact`, `diff-impact` (git-aware), `impact` (file-level) | Not purpose-built | **Codegraph** — first-class impact commands |
+| **Shortest path** | `path ` — BFS between symbols | `find_call_path` — between functions | **Tie** |
+| **SPARQL / Knowledge graph** | Not available | RDF graph via Oxigraph, SPARQL queries, predefined templates | **Narsil** — unique capability |
+| **Code Context Graph (CCG)** | Not available | 4-layer hierarchical context (L0-L3) with JSON-LD/N-Quads export | **Narsil** — unique capability |
+
+**Summary:** Narsil has broader analysis (CFG, dataflow, type inference, SPARQL, CCG). Codegraph is deeper on developer-facing metrics (5 complexity metrics, node roles, community detection, Louvain drift) and has unique impact analysis commands. Narsil's knowledge graph and CCG layering are genuinely novel features with no codegraph equivalent.
+
+---
+
+### C. Search & Retrieval
+
+| Feature | Codegraph | Narsil-MCP | Best Approach |
+|---------|-----------|------------|---------------|
+| **Keyword search** | BM25 via SQLite FTS5 | BM25 via Tantivy | **Tie** — different engines, same algorithm |
+| **Semantic search** | HuggingFace Transformers (local, ~500 MB model) | TF-IDF (local) or neural (Voyage AI/OpenAI/ONNX) | **Narsil** — more backend choices |
+| **Hybrid search** | BM25 + semantic with Reciprocal Rank Fusion | BM25 + TF-IDF hybrid | **Codegraph** — RRF fusion with full embeddings is higher quality |
+| **Code similarity** | Not available | `find_similar_code`, `find_similar_to_symbol` | **Narsil** |
+| **Semantic clone detection** | Not available | `find_semantic_clones` (Type-3/4 clones) | **Narsil** |
+| **AST-aware chunking** | Not available | `get_chunks`, `get_chunk_stats` — respects AST boundaries | **Narsil** |
+| **Symbol search** | `where` with name, kind, file, role filters | `find_symbols`, `workspace_symbol_search`, `find_references`, `find_symbol_usages` | **Narsil** — more search modes |
+| **Export map** | `list-functions` with filters | `get_export_map` — all exported symbols per module | **Tie** — different interfaces, similar data |
+| **Search latency** | Depends on FTS5/embedding model | <1μs exact, 16μs fuzzy, 80μs BM25, 130μs TF-IDF, 151μs hybrid | **Narsil** — published sub-millisecond benchmarks |
+
+**Summary:** Narsil has more search tools (similarity, clone detection, AST chunking) and more embedding backends. Codegraph has higher-quality hybrid search (RRF with full transformer embeddings vs. TF-IDF). For AI agent context preparation, narsil's AST-aware chunking is a notable gap.
+
+---
+
+### D. Security Analysis
+
+| Feature | Codegraph | Narsil-MCP | Best Approach |
+|---------|-----------|------------|---------------|
+| **Taint analysis** | Not available | `trace_taint`, `get_taint_sources`, `get_typed_taint_flow` | **Narsil** |
+| **Vulnerability scanning** | Not available | `scan_security` with 147 built-in YAML rules | **Narsil** |
+| **OWASP Top 10** | Not available | `check_owasp_top10` — dedicated compliance check | **Narsil** |
+| **CWE Top 25** | Not available | `check_cwe_top25` — dedicated compliance check | **Narsil** |
+| **Secret scanning** | Not available | Rules in `secrets.yaml` | **Narsil** |
+| **SBOM generation** | Not available | `generate_sbom` — Software Bill of Materials | **Narsil** |
+| **License compliance** | Not available | `check_licenses` | **Narsil** |
+| **Dependency vulnerabilities** | Not available | `check_dependencies` — CVE checking | **Narsil** |
+| **Vulnerability explanation** | Not available | `explain_vulnerability`, `suggest_fix` | **Narsil** |
+| **Crypto misuse detection** | Not available | Rules in `crypto.yaml` | **Narsil** |
+| **IaC security** | Not available | Rules in `iac.yaml` | **Narsil** |
+| **Language-specific rules** | Not available | Rust, Elixir, Go, Java, C#, Kotlin, Bash rule files | **Narsil** |
+
+**Summary:** Narsil dominates security analysis completely with 147 rules across 12+ rule files. Codegraph has zero security features today — by design (FOUNDATION.md P8). OWASP pattern detection is on the roadmap as lightweight AST-based checks (BACKLOG ID 7), not taint analysis.
+
+---
+
+### E. Query Language & Interface
+
+| Feature | Codegraph | Narsil-MCP | Best Approach |
+|---------|-----------|------------|---------------|
+| **Primary interface** | Full CLI with 35+ commands + MCP server | MCP server (primary) + config management CLI | **Codegraph** — usable without MCP client |
+| **Standalone CLI queries** | `where`, `fn`, `explain`, `context`, `deps`, `impact`, `map`, etc. | Not available — all queries via MCP tools | **Codegraph** — narsil requires an MCP client for any query |
+| **MCP tools count** | 21 purpose-built tools | 90 tools across 14 categories | **Narsil** — 4x more tools |
+| **Compound queries** | `context` (source + deps + callers + tests), `explain`, `audit` | No compound tools — each tool is atomic | **Codegraph** — purpose-built for agent token efficiency |
+| **Batch queries** | `batch` command for multi-target dispatch | No batch mechanism | **Codegraph** |
+| **JSON output** | `--json` flag on every command | MCP JSON responses | **Tie** |
+| **NDJSON streaming** | `--ndjson` with `--limit`/`--offset` on ~14 commands | `--streaming` flag for large results | **Tie** |
+| **Pagination** | Universal `limit`/`offset` on all 21 MCP tools with per-tool defaults | Not documented | **Codegraph** |
+| **SPARQL queries** | Not available | `sparql_query`, predefined templates | **Narsil** — unique expressiveness |
+| **Configuration presets** | Not available | Minimal (~26 tools), Balanced (~51), Full (75+), Security-focused | **Narsil** — manages token cost per preset |
+| **Visualization** | DOT, Mermaid, JSON export | Built-in web UI (Cytoscape.js) with interactive graphs | **Narsil** — interactive browser visualization |
+| **Programmatic API** | Full JS API: `import { buildGraph, queryNameData } from '@optave/codegraph'` | No library API | **Codegraph** — embeddable in JS/TS projects |
+
+**Summary:** Codegraph is more accessible (full CLI + API + MCP). Narsil has more MCP tools (90 vs 21) but no standalone query interface — completely dependent on MCP clients. Codegraph's compound commands (`context`, `explain`, `audit`) reduce agent round-trips; narsil requires multiple atomic tool calls for equivalent context. Narsil's configuration presets are a smart approach to managing MCP tool token costs.
+
+---
+
+### F. Performance & Resource Usage
+
+| Feature | Codegraph | Narsil-MCP | Best Approach |
+|---------|-----------|------------|---------------|
+| **Cold build (small, ~50 files)** | <2 seconds | ~220ms | **Narsil** (faster cold start) |
+| **Cold build (medium, ~3,000 files)** | 5-15 seconds | ~2 seconds (50K symbols) | **Narsil** (faster cold start) |
+| **Incremental rebuild (1 file changed)** | <500ms | Full re-index | **Codegraph** (100-1,000x faster for incremental) |
+| **Memory usage** | <100 MB typical (SQLite-backed) | In-memory — grows with codebase size | **Codegraph** — predictable, bounded by SQLite |
+| **Persistence** | SQLite by default — always persisted | In-memory by default. `--persist` opt-in | **Codegraph** — survives restarts without flag |
+| **Startup time** | <100ms (Node.js, reads existing DB) | Index from scratch unless persisted | **Codegraph** — always has a warm DB |
+| **Storage format** | SQLite file (compact, portable, universally readable) | Custom binary format (Tantivy + DashMap serialization) | **Codegraph** — SQLite is universally inspectable |
+| **Symbol lookup** | SQL query on indexed column | <1μs (DashMap in-memory) | **Narsil** — in-memory is faster for hot lookups |
+| **Search latency** | FTS5/embedding dependent | 80μs BM25, 130μs TF-IDF | **Narsil** — published sub-ms benchmarks |
+| **Binary size** | ~50 MB (with WASM grammars) | ~30 MB (native feature set) | **Narsil** (smaller) |
+| **Watch mode** | Built-in `watch` command | `--watch` flag | **Tie** |
+| **Commit hook viability** | Yes — <500ms incremental rebuilds | Possible but re-indexes fully | **Codegraph** — incremental makes hooks invisible |
+| **CI pipeline viability** | `check --staged` returns exit code 0/1 | No CI-specific tooling | **Codegraph** |
+
+**Summary:** Narsil is faster for cold starts and hot lookups (pure Rust + in-memory). Codegraph is vastly faster for incremental workflows — the 1-file-changed scenario that defines developer loops, commit hooks, and agent iterations. Codegraph's SQLite persistence means no re-indexing on restart; narsil defaults to in-memory and loses state.
+
+---
+
+### G. Installation & Deployment
+
+| Feature | Codegraph | Narsil-MCP | Best Approach |
+|---------|-----------|------------|---------------|
+| **Install method** | `npm install @optave/codegraph` | brew, scoop, cargo, npm, AUR, nix, install scripts | **Narsil** — more package managers |
+| **Runtime dependency** | Node.js >= 20 | None (single binary) | **Narsil** — zero runtime deps |
+| **Docker** | Not required | Not required | **Tie** |
+| **Platform binaries** | npm auto-resolves `@optave/codegraph-{platform}-{arch}` | Prebuilt for macOS/Linux/Windows | **Tie** |
+| **Browser build** | Not available | WASM package `@narsil-mcp/wasm` (~3 MB) | **Narsil** |
+| **Configuration** | `.codegraphrc.json` + env vars + `apiKeyCommand` | `.narsil.yaml` + env vars + presets + interactive wizard | **Narsil** — more options including wizard |
+| **Config management** | Manual file editing | `narsil-mcp config init/show/validate` | **Narsil** — built-in config tooling |
+| **Editor integration** | Claude Code MCP config | Pre-built configs for Claude Code, Cursor, VS Code, Zed, JetBrains | **Narsil** — more pre-built editor configs |
+| **Uninstall** | `npm uninstall` | Package manager dependent | **Tie** |
+
+**Summary:** Narsil is easier to install (single binary, more package managers, no Node.js required) and has better editor integration configs. Codegraph's npm-based install is simpler for Node.js developers but requires Node.js. Narsil's interactive config wizard and preset system lower the barrier to entry.
+
+---
+
+### H.
AI Agent & MCP Integration + +| Feature | Codegraph | Narsil-MCP | Best Approach | +|---------|-----------|------------|---------------| +| **MCP tools** | 21 purpose-built tools | 90 tools across 14 categories | **Narsil** (4x more tools) | +| **Token efficiency** | `context`/`explain`/`audit` compound commands reduce round-trips 50-80% | Atomic tools only. Forgemax integration collapses 90 → 2 tools (~1,000 vs ~12,000 tokens) | **Codegraph** natively; **Narsil** via Forgemax | +| **Tool token cost** | ~4,000 tokens for 21 tool definitions | ~12,000 tokens for full set. Presets: Minimal ~4,600, Balanced ~8,900 | **Codegraph** — lower base cost. Narsil presets help | +| **Pagination** | Universal `limit`/`offset` on all tools with per-tool defaults, hard cap 1,000 | `--streaming` for large results | **Codegraph** — structured pagination metadata | +| **Multi-repo support** | Registry-based, opt-in via `--multi-repo` or `--repos` | Multi-repo by default, `discover_repos` auto-detection | **Narsil** for convenience; **Codegraph** for security | +| **Single-repo isolation** | Default — tools have no `repo` property unless `--multi-repo` | Not default — multi-repo access is always available | **Codegraph** — security-conscious default | +| **Programmatic embedding** | Full JS API for VS Code extensions, CI pipelines, other MCP servers | No library API | **Codegraph** | +| **CCG context layers** | Not available | L0-L3 hierarchical context for progressive disclosure | **Narsil** — novel approach to context management | +| **Remote repo indexing** | Not available | `add_remote_repo` clones and indexes GitHub repos | **Narsil** | + +**Summary:** Narsil has 4x more MCP tools but higher token overhead. Codegraph's compound commands are more token-efficient per query. Narsil's CCG layering and configuration presets are innovative approaches to managing AI agent context budgets. Codegraph's programmatic API enables embedding scenarios narsil cannot serve. + +--- + +### I. 
Developer Productivity Features + +| Feature | Codegraph | Narsil-MCP | Best Approach | +|---------|-----------|------------|---------------| +| **Impact analysis (function-level)** | `fn-impact ` — transitive callers + downstream | Not purpose-built | **Codegraph** | +| **Impact analysis (git-aware)** | `diff-impact --staged` / `diff-impact main` | Not available | **Codegraph** | +| **CI gate** | `check --staged` — exit code 0/1 (cycles, complexity, blast radius, boundaries) | Not available | **Codegraph** | +| **Complexity metrics** | Cognitive, cyclomatic, Halstead, MI, nesting depth per function | Cyclomatic only (`get_complexity`) | **Codegraph** (5 metrics vs 1) | +| **Code health manifesto** | Configurable rule engine with warn/fail thresholds | Not available | **Codegraph** | +| **Structure analysis** | `structure` — directory hierarchy with cohesion scores | `get_project_structure` — directory tree only | **Codegraph** — includes cohesion metrics | +| **Hotspot detection** | `hotspots` — files/dirs with extreme fan-in/fan-out/density | `get_function_hotspots` — most-called/most-complex + git churn hotspots | **Tie** — different hotspot types | +| **Co-change analysis** | `co-change` — git history for files that change together | Not available | **Codegraph** | +| **Branch comparison** | `branch-compare` — structural diff between branches | Not available | **Codegraph** | +| **Triage/risk ranking** | `triage` — ranked audit queue by composite risk score | Not available | **Codegraph** | +| **CODEOWNERS integration** | `owners` — maps functions to code owners | Not available | **Codegraph** | +| **Semantic search** | `search` — BM25 + semantic with RRF | `semantic_search`, `hybrid_search` | **Tie** | +| **Watch mode** | `watch` — live incremental rebuilds | `--watch` flag for auto-reindex | **Tie** | +| **Snapshot management** | `snapshot save/restore` — DB backup/restore | Not available | **Codegraph** | +| **Execution flow tracing** | `flow` — from entry 
points through callees | `get_control_flow` — within a function | **Codegraph** for cross-function; **Narsil** for intraprocedural | +| **Module overview** | `map` — high-level module map with most-connected nodes | Not purpose-built | **Codegraph** | +| **Cycle detection** | `cycles` — circular dependency detection | `find_circular_imports` — circular import chains | **Tie** | +| **Architecture boundaries** | Configurable rules with onion preset | Not available | **Codegraph** | +| **Node role classification** | `entry`/`core`/`utility`/`adapter`/`dead`/`leaf` per symbol | Not available | **Codegraph** | +| **Audit command** | `audit` — explain + impact + health in one call | Not available | **Codegraph** | +| **Git integration** | `diff-impact`, `co-change`, `branch-compare` | `get_blame`, `get_file_history`, `get_recent_changes`, `get_symbol_history`, `get_contributors`, `get_hotspots` | **Narsil** for git data breadth; **Codegraph** for git-aware analysis | +| **Export formats** | DOT, Mermaid, JSON | Cytoscape.js interactive UI, JSON-LD, N-Quads, RDF | **Narsil** — more formats + interactive visualization | + +**Summary:** Codegraph has 15+ purpose-built developer productivity commands that narsil lacks (impact analysis, manifesto, triage, boundaries, co-change, branch-compare, audit, structure, CODEOWNERS). Narsil has richer git integration tools (blame, contributors, symbol history) and interactive visualization. For the "what breaks if I change this?" workflow, codegraph is the clear choice. + +--- + +### J. 
Ecosystem & Community + +| Feature | Codegraph | Narsil-MCP | Best Approach | +|---------|-----------|------------|---------------| +| **GitHub stars** | Growing | 120 | **Narsil** (slightly) | +| **License** | Apache-2.0 | Apache-2.0 OR MIT (dual) | **Narsil** — dual license is more permissive | +| **Release cadence** | As needed | Regular (v1.6.1 latest, Feb 2026) | **Tie** | +| **Test suite** | Vitest | 1,763+ tests + criterion benchmarks | **Narsil** — more tests, published benchmarks | +| **Documentation** | CLAUDE.md + CLI `--help` | narsilmcp.com + README + editor configs | **Narsil** — dedicated docs site | +| **Commercial backing** | Optave AI Solutions Inc. | Open-core model (narsil-cloud private repo) | **Both** — different business models | +| **Integration ecosystem** | MCP + programmatic API | Forgemax, Ralph, Claude Code plugin | **Narsil** — more third-party integrations | +| **Browser story** | Not available | WASM package for browser-based analysis | **Narsil** | +| **CCG standard** | Not available | Code Context Graph — a proposed standard for AI code context | **Narsil** — potential industry standard | + +**Summary:** Narsil has a more developed ecosystem (docs site, editor configs, third-party integrations, browser build, CCG standard). Both are commercially backed. Narsil's open-core model (commercial cloud features in private repo) is a viable business approach. + +--- + +## Where Each Tool is the Better Choice + +### Choose Codegraph when: + +1. **You need the graph to stay current in tight feedback loops** — commit hooks, watch mode, AI agent loops. Codegraph's incremental <500ms rebuilds vs. narsil's full re-index. +2. **You need a standalone CLI** — `codegraph where`, `codegraph explain`, `codegraph context` work without any MCP client. Narsil requires an MCP client for all queries. +3. **You need impact analysis** — `diff-impact --staged` tells you what breaks before committing. Narsil has no equivalent. +4. 
**You need CI gates** — `check --staged` returns exit 0/1 for cycles, complexity, blast radius, boundaries. Narsil has no CI tooling. +5. **You need developer productivity features** — complexity metrics (5 types), manifesto rules, architecture boundaries, co-change analysis, triage. These don't exist in narsil. +6. **You want confidence-scored results** — every call edge has a 0.0-1.0 confidence score. Narsil's edges are unscored. +7. **You're embedding in a JS/TS project** — full programmatic API. Narsil has no library API. +8. **You want single-repo security by default** — codegraph's MCP exposes only one repo unless you opt in to multi-repo. + +### Choose Narsil-MCP when: + +1. **You need security analysis** — taint tracking, OWASP/CWE compliance, SBOM, license scanning, 147 built-in rules. Codegraph has zero security features. +2. **You need broad language coverage** — 32 languages vs 11. Critical for polyglot enterprises. +3. **You need control flow or data flow analysis** — CFG, reaching definitions, dead stores, uninitialized variables. Codegraph's dataflow is nascent. +4. **You need type inference** — infer types for untyped Python/JS/TS code. Codegraph has no type analysis. +5. **You want interactive visualization** — built-in Cytoscape.js web UI with drill-down, overlays, and clustering. Codegraph exports static DOT/Mermaid. +6. **You need a single binary with no runtime deps** — `brew install narsil-mcp` and done. No Node.js required. +7. **You're building an MCP-first agent pipeline** — 90 tools cover nearly every code analysis need. One server, one config. +8. **You want a browser-based analysis tool** — narsil's WASM build runs analysis in the browser. +9. **You need SPARQL/RDF knowledge graph** — unique capability for semantic code querying. +10. **You need code similarity / clone detection** — `find_similar_code`, `find_semantic_clones`. Codegraph has no similarity tools. 
+ +### Use both together when: + +- **CI pipeline**: Codegraph for fast structural checks on every commit (`check --staged`), narsil for periodic security scans. +- **AI agent workflow**: Codegraph's compound commands for fast structural context; narsil's security tools for vulnerability assessment. +- **Pre-commit + periodic audit**: Codegraph in commit hooks (fast, incremental), narsil for weekly security/compliance reports. + +--- + +## Key Metrics Summary + +| Metric | Codegraph | Narsil-MCP | Winner | +|--------|-----------|------------|--------| +| Incremental rebuild speed | <500ms | N/A (full re-index) | Codegraph | +| Cold build speed | Seconds | Sub-seconds to seconds | Narsil | +| Memory usage | <100 MB typical | Grows with codebase (in-memory) | Codegraph | +| Install complexity | `npm install` (requires Node.js) | Single binary (brew/scoop/cargo) | Narsil | +| Analysis depth (structural) | High (impact, complexity, roles) | High (CFG, DFG, type inference) | Tie | +| Analysis depth (security) | None | Best in class (147 rules, taint) | Narsil | +| AI agent integration | 21-tool MCP + compound commands | 90-tool MCP + presets + CCG | Narsil for breadth; Codegraph for efficiency | +| Developer productivity | 15+ purpose-built commands | Git tools only | Codegraph | +| Language support | 11 | 32 | Narsil | +| Standalone CLI | Full CLI experience | Config/tools management only | Codegraph | +| Programmatic API | Full JS API | None | Codegraph | +| Community & maturity | New | Newer (Dec 2025), growing fast | Tie | +| CI/CD readiness | Yes (`check --staged`) | No CI tooling | Codegraph | +| Visualization | DOT/Mermaid/JSON export | Interactive Cytoscape.js web UI | Narsil | +| Search backends | FTS5 + HuggingFace local | Tantivy + TF-IDF + Voyage/OpenAI/ONNX | Narsil | + +**Final score against FOUNDATION.md principles: Codegraph 4, Narsil 0, Tie 4.** +Narsil competes much more closely on codegraph's principles than Joern does. 
The gap is in incremental rebuilds (P1), confidence scoring (P3), CLI + API (P5), and single-repo isolation (P7). + +--- + +## Narsil-Inspired Feature Candidates + +Features extracted from **all comparison sections** above, assessed using the [BACKLOG.md](../../docs/roadmap/BACKLOG.md) tier and grading system. See the [Scoring Guide](../../docs/roadmap/BACKLOG.md#scoring-guide) for column definitions. + +### Tier 1 — Zero-dep + Foundation-aligned (build these first) + +Non-breaking, ordered by problem-fit: + +| ID | Title | Description | Source | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking | +|----|-------|-------------|--------|----------|---------|----------|-------------------|-------------------|----------| +| N1 | MCP tool presets | Configurable MCP tool subsets (minimal/balanced/full/custom) that control which tools are registered. Reduces tool-definition token cost from ~4,000 to ~2,000 for minimal sets. Inspired by narsil's preset system (Minimal ~4,600 tokens, Balanced ~8,900, Full ~12,000). | E, H | Embeddability | Agents with small context windows get only the tools they need — directly reduces token waste on tool definitions | ✓ | ✓ | 5 | No | +| N2 | AST-aware code chunking | Split files into semantic chunks that respect AST boundaries (functions, classes, blocks) instead of naive line splits. Expose as MCP tool and CLI command. Inspired by narsil's `get_chunks`/`get_chunk_stats`. | C | Navigation | Agents get correctly-bounded code snippets for context windows — no more mid-function splits that confuse LLMs | ✓ | ✓ | 5 | No | +| N3 | Code similarity search | Find code structurally similar to a given snippet or symbol using AST fingerprinting or embedding cosine similarity on existing search infrastructure. Inspired by narsil's `find_similar_code`/`find_similar_to_symbol`. 
| C | Search | Agents can find related implementations for refactoring, deduplication, and pattern learning — reduces re-invention and catches copy-paste drift | ✓ | ✓ | 4 | No | +| N4 | Git blame & symbol history | Surface `git blame` data per function and track how symbols change over commits. Complement existing `co-change` with per-symbol history. Inspired by narsil's `get_blame`/`get_symbol_history`/`get_contributors`. | I | Analysis | Agents know who last touched a function and how it evolved — critical context for review, ownership, and understanding intent behind changes | ✓ | ✓ | 4 | No | +| N5 | Remote repo indexing | Allow `codegraph build ` to clone and index a remote repository. Useful for comparing dependencies, upstream libraries, or reviewing PRs on forks. Inspired by narsil's `add_remote_repo`. | H | Developer Experience | Agents can analyze dependencies and upstream repos without manual cloning — enables cross-repo context gathering in one command | ✓ | ✓ | 3 | No | +| N6 | Configuration wizard | Interactive `codegraph init` that detects project structure, suggests `.codegraphrc.json` settings, and auto-configures MCP for the user's editor. Inspired by narsil's `config init` wizard and pre-built editor configs. | G | Developer Experience | Reduces setup friction — new users get a working config in seconds instead of reading docs | ✓ | ✓ | 2 | No | +| N7 | Kotlin language support | Add tree-sitter-kotlin to `LANGUAGE_REGISTRY`. 1 registry entry + 1 extractor. Narsil covers 32 languages; Kotlin is the highest-value gap for codegraph's target audience (Android/KMP). | A | Parsing | Extends coverage to Android/KMP — closes the most impactful language gap vs. narsil | ✓ | ✓ | 2 | No | +| N8 | Swift language support | Add tree-sitter-swift to `LANGUAGE_REGISTRY`. 1 registry entry + 1 extractor. Narsil covers Swift; codegraph does not. 
| A | Parsing | Extends coverage to Apple/iOS — closes a visible language gap | ✓ | ✓ | 2 | No | +| N9 | Bash language support | Add tree-sitter-bash to `LANGUAGE_REGISTRY`. 1 registry entry + 1 extractor. Bash scripts are ubiquitous in CI/CD and developer tooling. | A | Parsing | Covers CI scripts, Dockerfiles, and developer tooling — commonly co-located with source code | ✓ | ✓ | 2 | No | +| N10 | Scala language support | Add tree-sitter-scala to `LANGUAGE_REGISTRY`. 1 registry entry + 1 extractor. Relevant for JVM ecosystem coverage. | A | Parsing | Closes language gap for JVM polyglot codebases | ✓ | ✓ | 2 | No | + +Breaking (penalized to end of tier): + +| ID | Title | Description | Source | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking | +|----|-------|-------------|--------|----------|---------|----------|-------------------|-------------------|----------| +| N11 | Export map per module | Dedicated `exports ` command listing all exported symbols with types, roles, and consumers. Inspired by narsil's `get_export_map`. Currently inferable from `explain` but not first-class. | B | Navigation | Agents quickly understand a module's public API surface without reading source — useful for import resolution and interface discovery | ✓ | ✓ | 3 | Yes | + +### Tier 2 — Foundation-aligned, needs dependencies + +Ordered by problem-fit: + +| ID | Title | Description | Source | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking | +|----|-------|-------------|--------|----------|---------|----------|-------------------|-------------------|----------| +| N12 | Interactive HTML visualization | `codegraph viz` opens a browser-based interactive graph (Cytoscape.js or vis.js) with drill-down, clustering, complexity overlays, and vulnerability highlighting. Inspired by narsil's built-in visualization frontend. Already on roadmap (BACKLOG ID 10). 
| E, J | Visualization | Developers and teams visually explore architecture — useful for onboarding, code reviews, and spotting structural problems | ✗ | ✓ | 1 | No | +| N13 | Multiple embedding backends | Support Voyage AI, OpenAI, and ONNX as alternative embedding providers alongside existing HuggingFace Transformers. Inspired by narsil's `--neural-backend api\|onnx` with model selection. Already partially on roadmap (BACKLOG ID 8). | C | Search | Users who already pay for an LLM provider get better embeddings at no extra cost — and local ONNX gives a lighter alternative to the 500MB transformer model | ✗ | ✓ | 3 | No | + +### Tier 3 — Not foundation-aligned (needs deliberate exception) + +| ID | Title | Description | Source | Category | Benefit | Zero-dep | Foundation-aligned | Problem-fit (1-5) | Breaking | +|----|-------|-------------|--------|----------|---------|----------|-------------------|-------------------|----------| +| N14 | OWASP/CWE pattern detection | Lightweight AST-based security scanning using YAML rule files. Not taint analysis — pattern matching on AST nodes (e.g. `eval()`, hardcoded secrets, SQL string concatenation). Inspired by narsil's 147-rule security engine. Already on roadmap (BACKLOG ID 7). | D | Security | Catches low-hanging security issues during `diff-impact`; agents flag risky patterns before they're committed | ✓ | ✗ | 1 | No | +| N15 | SBOM generation | Generate a Software Bill of Materials from `package.json`/`requirements.txt`/`go.mod`. Lightweight — parse manifest files already in scope. Inspired by narsil's `generate_sbom`. | D | Security | Supply chain visibility without external tools — useful for compliance audits | ✓ | ✗ | 1 | No | + +### Not adopted (violates FOUNDATION.md) + +These narsil-mcp features were evaluated and deliberately excluded: + +| Narsil Feature | Section | Why Not | +|----------------|---------|---------| +| **Taint analysis** | D | Requires control-flow and data-dependence infrastructure. 
Would 10-100x build time, violating P1. Narsil's tree-sitter-based taint is impressive but trades performance for depth | +| **Type inference engine** | B | Requires language-specific type solvers beyond tree-sitter AST. Violates P6 (one registry, no magic). Lightweight type annotation extraction (Joern-inspired J2) is the pragmatic alternative | +| **SPARQL / RDF knowledge graph** | B, E | Requires Oxigraph dependency. SQLite + existing query commands serve our use case. RDF/SPARQL is overkill for structural code intelligence — powerful but orthogonal to our goals | +| **Code Context Graph (CCG) standard** | B, H | Interesting concept but tightly coupled to narsil's architecture and commercial model. Our MCP pagination + compound commands solve the progressive-disclosure problem differently | +| **In-memory-first architecture** | F | Violates P1 (graph must survive restarts to stay always-current). SQLite persistence is a deliberate choice — narsil's opt-in persistence means state loss on every restart by default | +| **90-tool MCP surface** | E, H | More tools = more token overhead per agent session. Our 21 purpose-built tools + compound commands are more token-efficient. Narsil compensates with presets; we compensate with fewer, smarter tools | +| **Browser WASM build** | G, J | Different product category. We're a CLI/MCP engine, not a browser tool (P8). Narsil's WASM build is a legitimate capability, but building a browser runtime is outside our scope | +| **Forgemax-style tool collapsing** | H | Collapses 90 tools to 2 (`search`/`execute`). We don't need this because we already have ~21 tools — small enough that collapsing adds complexity without meaningful savings | +| **LSP integration** | B | Requires running language servers alongside codegraph. Violates zero-dependency goal. Tree-sitter + confidence scoring is our approach; LSP is a different architectural bet | +| **License compliance scanning** | D | Tangential to code intelligence. 
Better served by dedicated tools (FOSSA, Snyk, etc.) | + +### Cross-references to existing BACKLOG items + +These narsil-inspired capabilities are already tracked in [BACKLOG.md](../../docs/roadmap/BACKLOG.md): + +| BACKLOG ID | Title | Narsil Equivalent | Relationship | +|------------|-------|-------------------|--------------| +| 7 | OWASP/CWE pattern detection | `scan_security` with 147 rules | Lightweight AST-based alternative to narsil's full rule engine. N14 above. Already Tier 3 | +| 8 | Optional LLM provider integration | `--neural-backend api\|onnx` | Multiple embedding providers. N13 above. Already Tier 2 | +| 10 | Interactive HTML visualization | Built-in Cytoscape.js frontend | Browser-based interactive graph. N12 above. Already Tier 3 | +| 14 | Dataflow analysis | `get_data_flow`, `get_reaching_definitions` | Lightweight def-use chains. Narsil has 4 dedicated dataflow tools. Already Tier 1 Breaking | + +### Cross-references to Joern-inspired candidates + +Some features identified in this analysis overlap with [Joern-inspired candidates](./joern.md#joern-inspired-feature-candidates): + +| Joern ID | Title | Narsil Equivalent | Note | +|----------|-------|-------------------|------| +| J4 | Kotlin language support | Narsil's 32-language coverage | Same feature, dual motivation. Listed here as N7 | +| J5 | Swift language support | Narsil's 32-language coverage | Same feature, dual motivation. Listed here as N8 | +| J8 | Intraprocedural CFG | `get_control_flow` | Narsil has it; validates priority of J8 | +| J9 | Stored queryable AST | AST-aware chunking + pattern matching | N2 (chunking) is a lighter alternative; J9 is the deeper version | From a6013cd52a407ebf290bfbc2dc18d9944b19d43b Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 2 Mar 2026 19:12:51 -0700 Subject: [PATCH 5/8] feat: add dedicated `exports ` command with per-symbol consumers Implements feature N11 from the Narsil competitive analysis. 
The new command provides a focused export map showing which symbols a file exports and who calls each one, filling the gap between `explain` (public/internal split without consumers) and `where --file` (just export names). Adds exportsData/fileExports to queries.js, CLI command, MCP tool, batch support, programmatic API, and integration tests. Impact: 7 functions changed, 15 affected --- src/batch.js | 2 + src/cli.js | 31 +++++- src/index.js | 10 +- src/mcp.js | 67 +++++++++++++ src/paginate.js | 1 + src/queries.js | 160 ++++++++++++++++++++++++++++++ tests/integration/queries.test.js | 78 +++++++++++++++ tests/unit/mcp.test.js | 23 +++++ 8 files changed, 370 insertions(+), 2 deletions(-) diff --git a/src/batch.js b/src/batch.js index 2a703a3c..17494dc0 100644 --- a/src/batch.js +++ b/src/batch.js @@ -11,6 +11,7 @@ import { flowData } from './flow.js'; import { contextData, explainData, + exportsData, fileDepsData, fnDepsData, fnImpactData, @@ -34,6 +35,7 @@ export const BATCH_COMMANDS = { query: { fn: fnDepsData, sig: 'name' }, impact: { fn: impactAnalysisData, sig: 'file' }, deps: { fn: fileDepsData, sig: 'file' }, + exports: { fn: exportsData, sig: 'file' }, flow: { fn: flowData, sig: 'name' }, dataflow: { fn: dataflowData, sig: 'name' }, complexity: { fn: complexityData, sig: 'dbOnly' }, diff --git a/src/cli.js b/src/cli.js index ddd853aa..bd3daa79 100644 --- a/src/cli.js +++ b/src/cli.js @@ -25,6 +25,7 @@ import { diffImpact, explain, fileDeps, + fileExports, fnDeps, fnImpact, impactAnalysis, @@ -97,10 +98,18 @@ program .description('Parse repo and build graph in .codegraph/graph.db') .option('--no-incremental', 'Force full rebuild (ignore file hashes)') .option('--dataflow', 'Extract data flow edges (flows_to, returns, mutates)') + .option('--scope ', 'Rebuild only specified files (for agent-level rollback)') + .option('--no-reverse-deps', 'Skip reverse dependency cascade (only meaningful with --scope)') .action(async (dir, opts) => { const root = 
path.resolve(dir || '.'); const engine = program.opts().engine; - await buildGraph(root, { incremental: opts.incremental, engine, dataflow: opts.dataflow }); + await buildGraph(root, { + incremental: opts.incremental, + engine, + dataflow: opts.dataflow, + scope: opts.scope, + noReverseDeps: opts.reverseDeps === false, + }); }); program @@ -217,6 +226,26 @@ program }); }); +program + .command('exports ') + .description('Show exported symbols with per-symbol consumers (who calls each export)') + .option('-d, --db ', 'Path to graph.db') + .option('-T, --no-tests', 'Exclude test/spec files from results') + .option('--include-tests', 'Include test/spec files (overrides excludeTests config)') + .option('-j, --json', 'Output as JSON') + .option('--limit ', 'Max results to return') + .option('--offset ', 'Skip N results (default: 0)') + .option('--ndjson', 'Newline-delimited JSON output') + .action((file, opts) => { + fileExports(file, opts.db, { + noTests: resolveNoTests(opts), + json: opts.json, + limit: opts.limit ? parseInt(opts.limit, 10) : undefined, + offset: opts.offset ? 
parseInt(opts.offset, 10) : undefined, + ndjson: opts.ndjson, + }); + }); + program .command('fn-impact ') .description('Function-level impact: what functions break if this one changes') diff --git a/src/index.js b/src/index.js index 03be6853..594eed2e 100644 --- a/src/index.js +++ b/src/index.js @@ -21,7 +21,13 @@ export { evaluateBoundaries, PRESETS, validateBoundaryConfig } from './boundarie // Branch comparison export { branchCompareData, branchCompareMermaid } from './branch-compare.js'; // Graph building -export { buildGraph, collectFiles, loadPathAliases, resolveImportPath } from './builder.js'; +export { + buildGraph, + collectFiles, + loadPathAliases, + purgeFilesFromGraph, + resolveImportPath, +} from './builder.js'; // Check (CI validation predicates) export { check, checkData } from './check.js'; // Co-change analysis @@ -111,9 +117,11 @@ export { diffImpactData, diffImpactMermaid, explainData, + exportsData, FALSE_POSITIVE_CALLER_THRESHOLD, FALSE_POSITIVE_NAMES, fileDepsData, + fileExports, fnDepsData, fnImpactData, impactAnalysisData, diff --git a/src/mcp.js b/src/mcp.js index 405b09c2..416e8077 100644 --- a/src/mcp.js +++ b/src/mcp.js @@ -82,6 +82,20 @@ const BASE_TOOLS = [ required: ['file'], }, }, + { + name: 'file_exports', + description: + 'Show exported symbols of a file with per-symbol consumers — who calls each export and from where', + inputSchema: { + type: 'object', + properties: { + file: { type: 'string', description: 'File path (partial match supported)' }, + no_tests: { type: 'boolean', description: 'Exclude test files', default: false }, + ...PAGINATION_PROPS, + }, + required: ['file'], + }, + }, { name: 'impact_analysis', description: 'Show files affected by changes to a given file (transitive)', @@ -667,6 +681,31 @@ const BASE_TOOLS = [ }, }, }, + // Write tool — intentional for multi-agent orchestration (Titan Paradigm). 
+ // Allows an agent to surgically rebuild only its changed files without + // nuking every other agent's graph state. + { + name: 'scoped_rebuild', + description: + 'Rebuild the graph for specific files only, leaving all other data untouched. Designed for agent-level rollback: revert source files via git, then call this to update the graph surgically.', + inputSchema: { + type: 'object', + properties: { + files: { + type: 'array', + items: { type: 'string' }, + description: 'Relative file paths to rebuild (deleted files are purged from graph)', + }, + no_reverse_deps: { + type: 'boolean', + description: + 'Skip reverse dependency cascade — use when exports did not change (e.g. reverting to the exact same version)', + default: false, + }, + }, + required: ['files'], + }, + }, ]; const LIST_REPOS_TOOL = { @@ -740,6 +779,7 @@ export async function startMCPServer(customDbPath, options = {}) { fnImpactData, pathData, contextData, + exportsData, explainData, whereData, diffImpactData, @@ -825,6 +865,13 @@ export async function startMCPServer(customDbPath, options = {}) { offset: args.offset ?? 0, }); break; + case 'file_exports': + result = exportsData(args.file, dbPath, { + noTests: args.no_tests, + limit: Math.min(args.limit ?? MCP_DEFAULTS.file_exports, MCP_MAX_LIMIT), + offset: args.offset ?? 0, + }); + break; case 'impact_analysis': result = impactAnalysisData(args.file, dbPath, { noTests: args.no_tests, @@ -1204,6 +1251,26 @@ export async function startMCPServer(customDbPath, options = {}) { }); break; } + case 'scoped_rebuild': { + if (!args.files || args.files.length === 0) { + result = { error: 'files array is required and must not be empty' }; + break; + } + const path = await import('node:path'); + const rootDir = dbPath + ? 
path.dirname(path.dirname(dbPath)) + : process.cwd(); + const { buildGraph } = await import('./builder.js'); + await buildGraph(rootDir, { + scope: args.files, + noReverseDeps: args.no_reverse_deps, + }); + result = { + rebuilt: args.files, + noReverseDeps: !!args.no_reverse_deps, + }; + break; + } case 'list_repos': { const { listRepos, pruneRegistry } = await import('./registry.js'); pruneRegistry(); diff --git a/src/paginate.js b/src/paginate.js index 8802b65a..79bfaa27 100644 --- a/src/paginate.js +++ b/src/paginate.js @@ -18,6 +18,7 @@ export const MCP_DEFAULTS = { context: 5, explain: 10, file_deps: 20, + file_exports: 20, diff_impact: 30, impact_analysis: 20, semantic_search: 20, diff --git a/src/queries.js b/src/queries.js index 5ee87b0c..7fb28d9c 100644 --- a/src/queries.js +++ b/src/queries.js @@ -3006,6 +3006,166 @@ export function roles(customDbPath, opts = {}) { } } +// ─── exportsData ───────────────────────────────────────────────────── + +function exportsFileImpl(db, target, noTests, getFileLines) { + const fileNodes = db + .prepare(`SELECT * FROM nodes WHERE file LIKE ? AND kind = 'file'`) + .all(`%${target}%`); + if (fileNodes.length === 0) return []; + + return fileNodes.map((fn) => { + const symbols = db + .prepare(`SELECT * FROM nodes WHERE file = ? AND kind != 'file' ORDER BY line`) + .all(fn.file); + + // IDs of symbols that have incoming calls from other files (exported) + const exportedIds = new Set( + db + .prepare( + `SELECT DISTINCT e.target_id FROM edges e + JOIN nodes caller ON e.source_id = caller.id + JOIN nodes target ON e.target_id = target.id + WHERE target.file = ? AND caller.file != ? 
AND e.kind = 'calls'`, + ) + .all(fn.file, fn.file) + .map((r) => r.target_id), + ); + + const exported = symbols.filter((s) => exportedIds.has(s.id)); + const internalCount = symbols.length - exported.length; + + const results = exported.map((s) => { + const fileLines = getFileLines(fn.file); + + let consumers = db + .prepare( + `SELECT n.name, n.file, n.line FROM edges e JOIN nodes n ON e.source_id = n.id + WHERE e.target_id = ? AND e.kind = 'calls'`, + ) + .all(s.id); + if (noTests) consumers = consumers.filter((c) => !isTestFile(c.file)); + + return { + name: s.name, + kind: s.kind, + line: s.line, + endLine: s.end_line ?? null, + role: s.role || null, + signature: fileLines ? extractSignature(fileLines, s.line) : null, + summary: fileLines ? extractSummary(fileLines, s.line) : null, + consumers: consumers.map((c) => ({ name: c.name, file: c.file, line: c.line })), + consumerCount: consumers.length, + }; + }); + + // Reexport edges from this file node + const reexports = db + .prepare( + `SELECT n.file FROM edges e JOIN nodes n ON e.target_id = n.id + WHERE e.source_id = ? 
AND e.kind = 'reexports'`, + ) + .all(fn.id) + .map((r) => ({ file: r.file })); + + return { + file: fn.file, + results, + reexports, + totalExported: exported.length, + totalInternal: internalCount, + }; + }); +} + +export function exportsData(file, customDbPath, opts = {}) { + const db = openReadonlyOrFail(customDbPath); + const noTests = opts.noTests || false; + + const dbFilePath = findDbPath(customDbPath); + const repoRoot = path.resolve(path.dirname(dbFilePath), '..'); + + const fileCache = new Map(); + function getFileLines(file) { + if (fileCache.has(file)) return fileCache.get(file); + try { + const absPath = safePath(repoRoot, file); + if (!absPath) { + fileCache.set(file, null); + return null; + } + const lines = fs.readFileSync(absPath, 'utf-8').split('\n'); + fileCache.set(file, lines); + return lines; + } catch { + fileCache.set(file, null); + return null; + } + } + + const fileResults = exportsFileImpl(db, file, noTests, getFileLines); + db.close(); + + if (fileResults.length === 0) { + return paginateResult( + { file, results: [], reexports: [], totalExported: 0, totalInternal: 0 }, + 'results', + { limit: opts.limit, offset: opts.offset }, + ); + } + + // For single-file match return flat; for multi-match return first (like explainData) + const first = fileResults[0]; + const base = { + file: first.file, + results: first.results, + reexports: first.reexports, + totalExported: first.totalExported, + totalInternal: first.totalInternal, + }; + return paginateResult(base, 'results', { limit: opts.limit, offset: opts.offset }); +} + +export function fileExports(file, customDbPath, opts = {}) { + const data = exportsData(file, customDbPath, opts); + if (opts.ndjson) { + printNdjson(data, 'results'); + return; + } + if (opts.json) { + console.log(JSON.stringify(data, null, 2)); + return; + } + + if (data.results.length === 0) { + console.log(`No exported symbols found for "${file}". 
Run "codegraph build" first.`); + return; + } + + console.log( + `\n# ${data.file} — ${data.totalExported} exported, ${data.totalInternal} internal\n`, + ); + + for (const sym of data.results) { + const icon = kindIcon(sym.kind); + const sig = sym.signature?.params ? `(${sym.signature.params})` : ''; + const role = sym.role ? ` [${sym.role}]` : ''; + console.log(` ${icon} ${sym.name}${sig}${role} :${sym.line}`); + if (sym.consumers.length === 0) { + console.log(' (no consumers)'); + } else { + for (const c of sym.consumers) { + console.log(` <- ${c.name} (${c.file}:${c.line})`); + } + } + } + + if (data.reexports.length > 0) { + console.log(`\n Re-exports: ${data.reexports.map((r) => r.file).join(', ')}`); + } + console.log(); +} + export function fnImpact(name, customDbPath, opts = {}) { const data = fnImpactData(name, customDbPath, opts); if (opts.ndjson) { diff --git a/tests/integration/queries.test.js b/tests/integration/queries.test.js index 0bb3b7dc..e991991c 100644 --- a/tests/integration/queries.test.js +++ b/tests/integration/queries.test.js @@ -28,6 +28,7 @@ import { initSchema } from '../../src/db.js'; import { diffImpactData, explainData, + exportsData, fileDepsData, fnDepsData, fnImpactData, @@ -734,3 +735,80 @@ describe('stable symbol schema', () => { expect(fn.fileHash).toBe('hash_auth_js'); }); }); + +// ─── exportsData ────────────────────────────────────────────────────── + +describe('exportsData', () => { + test('returns exported symbols with consumers for auth.js', () => { + const data = exportsData('auth.js', dbPath); + expect(data.file).toBe('auth.js'); + expect(data.totalExported).toBeGreaterThanOrEqual(2); + + const names = data.results.map((r) => r.name); + expect(names).toContain('authenticate'); + expect(names).toContain('validateToken'); + }); + + test('consumers include cross-file callers', () => { + const data = exportsData('auth.js', dbPath); + const auth = data.results.find((r) => r.name === 'authenticate'); + 
expect(auth).toBeDefined(); + const consumerNames = auth.consumers.map((c) => c.name); + // authMiddleware calls authenticate from middleware.js (cross-file) + expect(consumerNames).toContain('authMiddleware'); + }); + + test('noTests filters test file consumers', () => { + const all = exportsData('auth.js', dbPath); + const filtered = exportsData('auth.js', dbPath, { noTests: true }); + + const allAuth = all.results.find((r) => r.name === 'authenticate'); + const filteredAuth = filtered.results.find((r) => r.name === 'authenticate'); + + const allConsumers = allAuth.consumers.map((c) => c.name); + const filteredConsumers = filteredAuth.consumers.map((c) => c.name); + + // testAuthenticate should be in unfiltered consumers + expect(allConsumers).toContain('testAuthenticate'); + // testAuthenticate should be excluded with noTests + expect(filteredConsumers).not.toContain('testAuthenticate'); + }); + + test('returns empty results for unknown file', () => { + const data = exportsData('nonexistent.js', dbPath); + expect(data.results).toHaveLength(0); + expect(data.totalExported).toBe(0); + expect(data.totalInternal).toBe(0); + }); + + test('reexports field is present', () => { + const data = exportsData('auth.js', dbPath); + expect(data).toHaveProperty('reexports'); + expect(Array.isArray(data.reexports)).toBe(true); + }); + + test('pagination limits results', () => { + const data = exportsData('auth.js', dbPath, { limit: 1, offset: 0 }); + expect(data.results).toHaveLength(1); + expect(data._pagination).toBeDefined(); + expect(data._pagination.total).toBeGreaterThanOrEqual(2); + expect(data._pagination.hasMore).toBe(true); + }); + + test('result shape has expected fields', () => { + const data = exportsData('auth.js', dbPath); + expect(data.results.length).toBeGreaterThan(0); + const sym = data.results[0]; + expect(sym).toHaveProperty('name'); + expect(sym).toHaveProperty('kind'); + expect(sym).toHaveProperty('line'); + expect(sym).toHaveProperty('consumers'); + 
expect(sym).toHaveProperty('consumerCount'); + expect(sym).toHaveProperty('role'); + expect(sym).toHaveProperty('signature'); + expect(sym).toHaveProperty('summary'); + expect(sym).toHaveProperty('endLine'); + expect(Array.isArray(sym.consumers)).toBe(true); + expect(typeof sym.consumerCount).toBe('number'); + }); +}); diff --git a/tests/unit/mcp.test.js b/tests/unit/mcp.test.js index fc610c4b..0ae273d6 100644 --- a/tests/unit/mcp.test.js +++ b/tests/unit/mcp.test.js @@ -11,6 +11,7 @@ import { buildToolList, TOOLS } from '../../src/mcp.js'; const ALL_TOOL_NAMES = [ 'query', 'file_deps', + 'file_exports', 'impact_analysis', 'find_cycles', 'module_map', @@ -37,6 +38,7 @@ const ALL_TOOL_NAMES = [ 'branch_compare', 'dataflow', 'check', + 'scoped_rebuild', 'list_repos', ]; @@ -250,6 +252,13 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(() => ({ name: 'test', results: [] })), contextData: vi.fn(() => ({ name: 'test', results: [] })), explainData: vi.fn(() => ({ target: 'test', kind: 'function', results: [] })), + exportsData: vi.fn(() => ({ + file: 'test', + results: [], + reexports: [], + totalExported: 0, + totalInternal: 0, + })), whereData: vi.fn(() => ({ target: 'test', mode: 'symbol', results: [] })), diffImpactData: vi.fn(() => ({ changedFiles: 0, affectedFunctions: [] })), listFunctionsData: vi.fn(() => ({ count: 0, functions: [] })), @@ -313,6 +322,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -372,6 +382,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: fnImpactMock, contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -428,6 +439,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: 
vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: diffImpactMock, listFunctionsData: vi.fn(), @@ -487,6 +499,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: listFnMock, @@ -547,6 +560,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -605,6 +619,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -657,6 +672,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -711,6 +727,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -775,6 +792,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -832,6 +850,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -880,6 +899,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: 
vi.fn(), @@ -928,6 +948,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -976,6 +997,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), @@ -1025,6 +1047,7 @@ describe('startMCPServer handler dispatch', () => { fnImpactData: vi.fn(), contextData: vi.fn(), explainData: vi.fn(), + exportsData: vi.fn(), whereData: vi.fn(), diffImpactData: vi.fn(), listFunctionsData: vi.fn(), From c0178221017afd5514a2489528b0a5b17a55d581 Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 2 Mar 2026 19:23:44 -0700 Subject: [PATCH 6/8] feat: add scoped rebuild for parallel agent rollback Extract purgeFilesFromGraph() from the inline deletion cascade in buildGraph() for reuse. Add opts.scope and opts.noReverseDeps to buildGraph() so agents can surgically rebuild only their changed files without nuking other agents' graph state. 
- `--scope ` on `build` skips collectFiles/getChangedFiles - `--no-reverse-deps` skips reverse-dep cascade (safe when exports unchanged) - New `scoped_rebuild` MCP tool for multi-agent orchestration - purgeFilesFromGraph exported from programmatic API - Unit tests for purge function, integration tests for scoped rebuild - Documented agent-level rollback workflow in titan-paradigm.md Impact: 3 functions changed, 20 affected --- docs/use-cases/titan-paradigm.md | 27 +++ src/builder.js | 224 ++++++++++++++--------- src/mcp.js | 4 +- tests/integration/scoped-rebuild.test.js | 174 ++++++++++++++++++ tests/unit/purge-files.test.js | 184 +++++++++++++++++++ 5 files changed, 528 insertions(+), 85 deletions(-) create mode 100644 tests/integration/scoped-rebuild.test.js create mode 100644 tests/unit/purge-files.test.js diff --git a/docs/use-cases/titan-paradigm.md b/docs/use-cases/titan-paradigm.md index 3b9402e8..e4096f8e 100644 --- a/docs/use-cases/titan-paradigm.md +++ b/docs/use-cases/titan-paradigm.md @@ -191,6 +191,33 @@ codegraph snapshot save pre-gauntlet codegraph snapshot restore pre-gauntlet ``` +For **agent-level rollback**, use scoped rebuild instead of full snapshot restore. 
This lets one agent revert its files without nuking every other agent's graph state: + +```bash +# Agent reverts its own files via git +git checkout -- src/parser.js src/resolve.js + +# Rebuild only those files in the graph — other agents' data is untouched +codegraph build --scope src/parser.js src/resolve.js + +# If exports didn't change (exact same version), skip reverse-dep cascade +codegraph build --scope src/parser.js src/resolve.js --no-reverse-deps +``` + +The MCP equivalent for AI agents: + +```json +{ + "tool": "scoped_rebuild", + "arguments": { + "files": ["src/parser.js", "src/resolve.js"], + "no_reverse_deps": false + } +} +``` + +Use full `snapshot save/restore` for orchestrator-level checkpoints (before the Gauntlet starts), and scoped rebuild for per-agent rollback during the Gauntlet. + Use `manifesto` as an additional CI gate — it exits with code 1 when any function exceeds a fail-level threshold: ```bash diff --git a/src/builder.js b/src/builder.js index a9ae11d4..24021f55 100644 --- a/src/builder.js +++ b/src/builder.js @@ -338,6 +338,76 @@ function getChangedFiles(db, allFiles, rootDir) { return { changed, removed, isFullBuild: false }; } +/** + * Purge all graph data for the specified files. + * Deletes: embeddings → edges (in+out) → node_metrics → function_complexity → dataflow → nodes. + * Handles missing tables gracefully (embeddings, complexity, dataflow may not exist in older DBs). 
+ * + * @param {import('better-sqlite3').Database} db - Open writable database + * @param {string[]} files - Relative file paths to purge + * @param {object} [options] + * @param {boolean} [options.purgeHashes=true] - Also delete file_hashes entries + */ +export function purgeFilesFromGraph(db, files, options = {}) { + const { purgeHashes = true } = options; + if (!files || files.length === 0) return; + + // Check if embeddings table exists + let hasEmbeddings = false; + try { + db.prepare('SELECT 1 FROM embeddings LIMIT 1').get(); + hasEmbeddings = true; + } catch { + /* table doesn't exist */ + } + + const deleteEmbeddingsForFile = hasEmbeddings + ? db.prepare('DELETE FROM embeddings WHERE node_id IN (SELECT id FROM nodes WHERE file = ?)') + : null; + const deleteNodesForFile = db.prepare('DELETE FROM nodes WHERE file = ?'); + const deleteEdgesForFile = db.prepare(` + DELETE FROM edges WHERE source_id IN (SELECT id FROM nodes WHERE file = @f) + OR target_id IN (SELECT id FROM nodes WHERE file = @f) + `); + const deleteMetricsForFile = db.prepare( + 'DELETE FROM node_metrics WHERE node_id IN (SELECT id FROM nodes WHERE file = ?)', + ); + let deleteComplexityForFile; + try { + deleteComplexityForFile = db.prepare( + 'DELETE FROM function_complexity WHERE node_id IN (SELECT id FROM nodes WHERE file = ?)', + ); + } catch { + deleteComplexityForFile = null; + } + let deleteDataflowForFile; + try { + deleteDataflowForFile = db.prepare( + 'DELETE FROM dataflow WHERE source_id IN (SELECT id FROM nodes WHERE file = ?) 
OR target_id IN (SELECT id FROM nodes WHERE file = ?)', + ); + } catch { + deleteDataflowForFile = null; + } + let deleteHashForFile; + if (purgeHashes) { + try { + deleteHashForFile = db.prepare('DELETE FROM file_hashes WHERE file = ?'); + } catch { + deleteHashForFile = null; + } + } + + for (const relPath of files) { + deleteEmbeddingsForFile?.run(relPath); + deleteEdgesForFile.run({ f: relPath }); + deleteMetricsForFile.run(relPath); + deleteComplexityForFile?.run(relPath); + deleteDataflowForFile?.run(relPath, relPath); + deleteNodesForFile.run(relPath); + if (purgeHashes) deleteHashForFile?.run(relPath); + } +} + export async function buildGraph(rootDir, opts = {}) { const dbPath = path.join(rootDir, '.codegraph', 'graph.db'); const db = openDb(dbPath); @@ -384,19 +454,46 @@ export async function buildGraph(rootDir, opts = {}) { ); } - const collected = collectFiles(rootDir, [], config, new Set()); - const files = collected.files; - const discoveredDirs = collected.directories; - info(`Found ${files.length} files to parse`); - - // Check for incremental build - const { changed, removed, isFullBuild } = incremental - ? 
getChangedFiles(db, files, rootDir) - : { changed: files.map((f) => ({ file: f })), removed: [], isFullBuild: true }; - - // Separate metadata-only updates (mtime/size self-heal) from real changes - const parseChanges = changed.filter((c) => !c.metadataOnly); - const metadataUpdates = changed.filter((c) => c.metadataOnly); + // ── Scoped rebuild: rebuild only specified files ────────────────── + let files, discoveredDirs, parseChanges, metadataUpdates, removed, isFullBuild; + + if (opts.scope) { + const scopedFiles = opts.scope.map((f) => normalizePath(f)); + const existing = []; + const missing = []; + for (const rel of scopedFiles) { + const abs = path.join(rootDir, rel); + if (fs.existsSync(abs)) { + existing.push({ file: abs, relPath: rel }); + } else { + missing.push(rel); + } + } + files = existing.map((e) => e.file); + // Derive discoveredDirs from scoped files' parent directories + discoveredDirs = new Set(existing.map((e) => path.dirname(e.file))); + parseChanges = existing; + metadataUpdates = []; + removed = missing; + isFullBuild = false; + info(`Scoped rebuild: ${existing.length} files to rebuild, ${missing.length} to purge`); + } else { + const collected = collectFiles(rootDir, [], config, new Set()); + files = collected.files; + discoveredDirs = collected.directories; + info(`Found ${files.length} files to parse`); + + // Check for incremental build + const increResult = incremental + ? 
getChangedFiles(db, files, rootDir) + : { changed: files.map((f) => ({ file: f })), removed: [], isFullBuild: true }; + removed = increResult.removed; + isFullBuild = increResult.isFullBuild; + + // Separate metadata-only updates (mtime/size self-heal) from real changes + parseChanges = increResult.changed.filter((c) => !c.metadataOnly); + metadataUpdates = increResult.changed.filter((c) => c.metadataOnly); + } if (!isFullBuild && parseChanges.length === 0 && removed.length === 0) { // Still update metadata for self-healing even when no real changes @@ -446,29 +543,33 @@ export async function buildGraph(rootDir, opts = {}) { // Find files with edges pointing TO changed/removed files. // Their nodes stay intact (preserving IDs), but outgoing edges are // deleted so they can be rebuilt during the edge-building pass. - const changedRelPaths = new Set(); - for (const item of parseChanges) { - changedRelPaths.add(item.relPath || normalizePath(path.relative(rootDir, item.file))); - } - for (const relPath of removed) { - changedRelPaths.add(relPath); - } - + // When opts.noReverseDeps is true (e.g. agent rollback to same version), + // skip this cascade — the agent knows exports didn't change. const reverseDeps = new Set(); - if (changedRelPaths.size > 0) { - const findReverseDeps = db.prepare(` - SELECT DISTINCT n_src.file FROM edges e - JOIN nodes n_src ON e.source_id = n_src.id - JOIN nodes n_tgt ON e.target_id = n_tgt.id - WHERE n_tgt.file = ? 
AND n_src.file != n_tgt.file AND n_src.kind != 'directory' - `); - for (const relPath of changedRelPaths) { - for (const row of findReverseDeps.all(relPath)) { - if (!changedRelPaths.has(row.file) && !reverseDeps.has(row.file)) { - // Verify the file still exists on disk - const absPath = path.join(rootDir, row.file); - if (fs.existsSync(absPath)) { - reverseDeps.add(row.file); + if (!opts.noReverseDeps) { + const changedRelPaths = new Set(); + for (const item of parseChanges) { + changedRelPaths.add(item.relPath || normalizePath(path.relative(rootDir, item.file))); + } + for (const relPath of removed) { + changedRelPaths.add(relPath); + } + + if (changedRelPaths.size > 0) { + const findReverseDeps = db.prepare(` + SELECT DISTINCT n_src.file FROM edges e + JOIN nodes n_src ON e.source_id = n_src.id + JOIN nodes n_tgt ON e.target_id = n_tgt.id + WHERE n_tgt.file = ? AND n_src.file != n_tgt.file AND n_src.kind != 'directory' + `); + for (const relPath of changedRelPaths) { + for (const row of findReverseDeps.all(relPath)) { + if (!changedRelPaths.has(row.file) && !reverseDeps.has(row.file)) { + // Verify the file still exists on disk + const absPath = path.join(rootDir, row.file); + if (fs.existsSync(absPath)) { + reverseDeps.add(row.file); + } } } } @@ -482,57 +583,16 @@ export async function buildGraph(rootDir, opts = {}) { debug(`Changed files: ${parseChanges.map((c) => c.relPath).join(', ')}`); if (removed.length > 0) debug(`Removed files: ${removed.join(', ')}`); // Remove embeddings/metrics/edges/nodes for changed and removed files - // Embeddings must be deleted BEFORE nodes (we need node IDs to find them) - const deleteEmbeddingsForFile = hasEmbeddings - ? 
db.prepare('DELETE FROM embeddings WHERE node_id IN (SELECT id FROM nodes WHERE file = ?)') - : null; - const deleteNodesForFile = db.prepare('DELETE FROM nodes WHERE file = ?'); - const deleteEdgesForFile = db.prepare(` - DELETE FROM edges WHERE source_id IN (SELECT id FROM nodes WHERE file = @f) - OR target_id IN (SELECT id FROM nodes WHERE file = @f) - `); - const deleteOutgoingEdgesForFile = db.prepare( - 'DELETE FROM edges WHERE source_id IN (SELECT id FROM nodes WHERE file = ?)', - ); - const deleteMetricsForFile = db.prepare( - 'DELETE FROM node_metrics WHERE node_id IN (SELECT id FROM nodes WHERE file = ?)', + const changePaths = parseChanges.map( + (item) => item.relPath || normalizePath(path.relative(rootDir, item.file)), ); - let deleteComplexityForFile; - try { - deleteComplexityForFile = db.prepare( - 'DELETE FROM function_complexity WHERE node_id IN (SELECT id FROM nodes WHERE file = ?)', - ); - } catch { - deleteComplexityForFile = null; - } - let deleteDataflowForFile; - try { - deleteDataflowForFile = db.prepare( - 'DELETE FROM dataflow WHERE source_id IN (SELECT id FROM nodes WHERE file = ?) 
OR target_id IN (SELECT id FROM nodes WHERE file = ?)', - ); - } catch { - deleteDataflowForFile = null; - } - for (const relPath of removed) { - deleteEmbeddingsForFile?.run(relPath); - deleteEdgesForFile.run({ f: relPath }); - deleteMetricsForFile.run(relPath); - deleteComplexityForFile?.run(relPath); - deleteDataflowForFile?.run(relPath, relPath); - deleteNodesForFile.run(relPath); - } - for (const item of parseChanges) { - const relPath = item.relPath || normalizePath(path.relative(rootDir, item.file)); - deleteEmbeddingsForFile?.run(relPath); - deleteEdgesForFile.run({ f: relPath }); - deleteMetricsForFile.run(relPath); - deleteComplexityForFile?.run(relPath); - deleteDataflowForFile?.run(relPath, relPath); - deleteNodesForFile.run(relPath); - } + purgeFilesFromGraph(db, [...removed, ...changePaths], { purgeHashes: false }); // Process reverse deps: delete only outgoing edges (nodes/IDs preserved) // then add them to the parse list so they participate in edge building + const deleteOutgoingEdgesForFile = db.prepare( + 'DELETE FROM edges WHERE source_id IN (SELECT id FROM nodes WHERE file = ?)', + ); for (const relPath of reverseDeps) { deleteOutgoingEdgesForFile.run(relPath); } diff --git a/src/mcp.js b/src/mcp.js index 416e8077..14aa7a35 100644 --- a/src/mcp.js +++ b/src/mcp.js @@ -1257,9 +1257,7 @@ export async function startMCPServer(customDbPath, options = {}) { break; } const path = await import('node:path'); - const rootDir = dbPath - ? path.dirname(path.dirname(dbPath)) - : process.cwd(); + const rootDir = dbPath ? 
path.dirname(path.dirname(dbPath)) : process.cwd(); const { buildGraph } = await import('./builder.js'); await buildGraph(rootDir, { scope: args.files, diff --git a/tests/integration/scoped-rebuild.test.js b/tests/integration/scoped-rebuild.test.js new file mode 100644 index 00000000..fd4d8a12 --- /dev/null +++ b/tests/integration/scoped-rebuild.test.js @@ -0,0 +1,174 @@ +/** + * Integration tests for scoped rebuild (opts.scope + opts.noReverseDeps). + * + * Uses the sample-project fixture (math.js, utils.js, index.js) to build + * a real graph, then verifies that scoped rebuilds surgically update only + * targeted files while leaving everything else intact. + */ + +import fs from 'node:fs'; +import os from 'node:os'; +import path from 'node:path'; +import { afterAll, beforeAll, describe, expect, test } from 'vitest'; +import { buildGraph } from '../../src/builder.js'; + +const FIXTURE_DIR = path.join(import.meta.dirname, '..', 'fixtures', 'sample-project'); + +let tmpDir; + +function copyFixture() { + const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-scoped-')); + for (const file of fs.readdirSync(FIXTURE_DIR)) { + fs.copyFileSync(path.join(FIXTURE_DIR, file), path.join(dir, file)); + } + return dir; +} + +function openDb(dir) { + const Database = require('better-sqlite3'); + return new Database(path.join(dir, '.codegraph', 'graph.db'), { readonly: true }); +} + +function nodeCount(db, file) { + return db.prepare('SELECT COUNT(*) as c FROM nodes WHERE file = ?').get(file).c; +} + +function edgeCount(db) { + return db.prepare('SELECT COUNT(*) as c FROM edges').get().c; +} + +beforeAll(async () => { + tmpDir = copyFixture(); + // Build the initial full graph + await buildGraph(tmpDir, { incremental: false }); +}); + +afterAll(() => { + if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true }); +}); + +describe('scoped rebuild', () => { + test('scoped rebuild updates only targeted file, preserves others', async () => { + const db1 = openDb(tmpDir); + 
const mathNodesBefore = nodeCount(db1, 'math.js');
+    const utilsNodesBefore = nodeCount(db1, 'utils.js');
+    const indexNodesBefore = nodeCount(db1, 'index.js');
+    db1.close();
+
+    expect(mathNodesBefore).toBeGreaterThan(0);
+    expect(utilsNodesBefore).toBeGreaterThan(0);
+
+    // Scoped rebuild only math.js (no content change — should re-parse same result)
+    await buildGraph(tmpDir, { scope: ['math.js'] });
+
+    const db2 = openDb(tmpDir);
+    const mathNodesAfter = nodeCount(db2, 'math.js');
+    const utilsNodesAfter = nodeCount(db2, 'utils.js');
+    const indexNodesAfter = nodeCount(db2, 'index.js');
+    db2.close();
+
+    // math.js should be rebuilt with same node count
+    expect(mathNodesAfter).toBe(mathNodesBefore);
+    // utils.js and index.js should be untouched
+    expect(utilsNodesAfter).toBe(utilsNodesBefore);
+    expect(indexNodesAfter).toBe(indexNodesBefore);
+  });
+
+  test('scoped rebuild with deleted file purges it from graph', async () => {
+    // Create a temporary extra file, build it in, then delete and scope-rebuild
+    const extraPath = path.join(tmpDir, 'extra.js');
+    fs.writeFileSync(extraPath, 'function extra() { return 1; }\nmodule.exports = { extra };\n');
+
+    // Full rebuild to pick up the new file
+    await buildGraph(tmpDir, { incremental: false });
+
+    const db1 = openDb(tmpDir);
+    const extraBefore = nodeCount(db1, 'extra.js');
+    const mathBefore = nodeCount(db1, 'math.js');
+    db1.close();
+    expect(extraBefore).toBeGreaterThan(0);
+
+    // Delete the file and scope-rebuild it
+    fs.unlinkSync(extraPath);
+    await buildGraph(tmpDir, { scope: ['extra.js'] });
+
+    const db2 = openDb(tmpDir);
+    const extraAfter = nodeCount(db2, 'extra.js');
+    const mathAfter = nodeCount(db2, 'math.js');
+    db2.close();
+
+    // extra.js should be completely purged
+    expect(extraAfter).toBe(0);
+    // math.js should be untouched
+    expect(mathAfter).toBe(mathBefore);
+  });
+
+  test('reverse-dep cascade rebuilds importers edges', async () => {
+    // Full rebuild to get clean state
+    await buildGraph(tmpDir, { incremental: false });
+
+    const db1 = openDb(tmpDir);
+    const edgesBefore = edgeCount(db1);
+    db1.close();
+
+    // Scoped rebuild of math.js with default (reverse deps enabled)
+    // utils.js and index.js import math.js, so their edges should be rebuilt
+    await buildGraph(tmpDir, { scope: ['math.js'] });
+
+    const db2 = openDb(tmpDir);
+    const edgesAfter = edgeCount(db2);
+    db2.close();
+
+    // Edge count should be comparable (rebuilt edges for math.js + reverse deps)
+    expect(edgesAfter).toBeGreaterThan(0);
+    // Should not lose edges dramatically
+    expect(edgesAfter).toBeGreaterThanOrEqual(edgesBefore - 2);
+  });
+
+  test('noReverseDeps: true skips the cascade', async () => {
+    // Full rebuild to get clean state
+    await buildGraph(tmpDir, { incremental: false });
+
+    // Scoped rebuild with noReverseDeps — only math.js edges are rebuilt
+    await buildGraph(tmpDir, { scope: ['math.js'], noReverseDeps: true });
+
+    const db2 = openDb(tmpDir);
+    const edgesAfter = edgeCount(db2);
+    const mathNodes = nodeCount(db2, 'math.js');
+    const utilsNodes = nodeCount(db2, 'utils.js');
+    db2.close();
+
+    // math.js and utils.js should still have nodes
+    expect(mathNodes).toBeGreaterThan(0);
+    expect(utilsNodes).toBeGreaterThan(0);
+    // With noReverseDeps, we may lose some edges because importers weren't rebuilt
+    // but the graph should still be valid
+    expect(edgesAfter).toBeGreaterThan(0);
+  });
+
+  test('multiple files in scope', async () => {
+    // Full rebuild to get clean state
+    await buildGraph(tmpDir, { incremental: false });
+
+    const db1 = openDb(tmpDir);
+    const mathBefore = nodeCount(db1, 'math.js');
+    const utilsBefore = nodeCount(db1, 'utils.js');
+    const indexBefore = nodeCount(db1, 'index.js');
+    db1.close();
+
+    // Scope both math.js and utils.js
+    await buildGraph(tmpDir, { scope: ['math.js', 'utils.js'] });
+
+    const db2 = openDb(tmpDir);
+    const mathAfter = nodeCount(db2, 'math.js');
+    const utilsAfter = nodeCount(db2, 'utils.js');
+    const indexAfter = nodeCount(db2, 'index.js');
+    db2.close();
+
+    // Both scoped files should be rebuilt with same counts
+    expect(mathAfter).toBe(mathBefore);
+    expect(utilsAfter).toBe(utilsBefore);
+    // index.js untouched
+    expect(indexAfter).toBe(indexBefore);
+  });
+});
diff --git a/tests/unit/purge-files.test.js b/tests/unit/purge-files.test.js
new file mode 100644
index 00000000..9702899a
--- /dev/null
+++ b/tests/unit/purge-files.test.js
@@ -0,0 +1,184 @@
+/**
+ * Unit tests for purgeFilesFromGraph() — the extracted deletion cascade.
+ */
+
+import fs from 'node:fs';
+import os from 'node:os';
+import path from 'node:path';
+import Database from 'better-sqlite3';
+import { afterEach, describe, expect, test } from 'vitest';
+import { purgeFilesFromGraph } from '../../src/builder.js';
+import { initSchema } from '../../src/db.js';
+
+// ─── Helpers ───────────────────────────────────────────────────────────
+
+function insertNode(db, name, kind, file, line) {
+  return db
+    .prepare('INSERT INTO nodes (name, kind, file, line) VALUES (?, ?, ?, ?)')
+    .run(name, kind, file, line).lastInsertRowid;
+}
+
+function insertEdge(db, sourceId, targetId, kind, confidence = 1.0) {
+  db.prepare(
+    'INSERT INTO edges (source_id, target_id, kind, confidence, dynamic) VALUES (?, ?, ?, ?, 0)',
+  ).run(sourceId, targetId, kind, confidence);
+}
+
+// ─── Fixture ───────────────────────────────────────────────────────────
+
+// Track open DBs for cleanup (Windows locks DB files)
+let openDbs = [];
+
+afterEach(() => {
+  for (const db of openDbs) {
+    try {
+      db.close();
+    } catch {
+      /* already closed */
+    }
+  }
+  openDbs = [];
+});
+
+function makeDb() {
+  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-purge-'));
+  const dbPath = path.join(tmpDir, 'graph.db');
+  const db = new Database(dbPath);
+  db.pragma('journal_mode = WAL');
+  initSchema(db);
+  openDbs.push(db);
+  return db;
+}
+
+function seedGraph(db) {
+  // Two files: auth.js and utils.js
+  const fAuth = insertNode(db, 'auth.js', 'file', 'auth.js', 0);
+  const fUtils = insertNode(db, 'utils.js', 'file', 'utils.js', 0);
+  const authenticate = insertNode(db, 'authenticate', 'function', 'auth.js', 10);
+  const validate = insertNode(db, 'validateToken', 'function', 'auth.js', 25);
+  const format = insertNode(db, 'formatResponse', 'function', 'utils.js', 5);
+
+  insertEdge(db, authenticate, validate, 'calls');
+  insertEdge(db, fAuth, fUtils, 'imports');
+
+  // node_metrics (columns: node_id, fan_in, fan_out, etc.)
+  db.prepare('INSERT INTO node_metrics (node_id, fan_in) VALUES (?, ?)').run(fAuth, 2);
+  db.prepare('INSERT INTO node_metrics (node_id, fan_in) VALUES (?, ?)').run(fUtils, 1);
+
+  // file_hashes
+  try {
+    db.prepare(
+      'INSERT OR REPLACE INTO file_hashes (file, hash, mtime, size) VALUES (?, ?, 0, 0)',
+    ).run('auth.js', 'abc123');
+    db.prepare(
+      'INSERT OR REPLACE INTO file_hashes (file, hash, mtime, size) VALUES (?, ?, 0, 0)',
+    ).run('utils.js', 'def456');
+  } catch {
+    /* table may not exist in very old schemas */
+  }
+
+  return { fAuth, fUtils, authenticate, validate, format };
+}
+
+// ─── Tests ─────────────────────────────────────────────────────────────
+
+describe('purgeFilesFromGraph', () => {
+  test('purges nodes/edges/metrics for specified files, leaves others untouched', () => {
+    const db = makeDb();
+    seedGraph(db);
+
+    // Purge only auth.js
+    purgeFilesFromGraph(db, ['auth.js']);
+
+    // auth.js nodes should be gone
+    const authNodes = db.prepare("SELECT * FROM nodes WHERE file = 'auth.js'").all();
+    expect(authNodes).toHaveLength(0);
+
+    // utils.js nodes should remain
+    const utilsNodes = db.prepare("SELECT * FROM nodes WHERE file = 'utils.js'").all();
+    expect(utilsNodes.length).toBeGreaterThan(0);
+
+    // Edges involving auth.js nodes should be gone
+    const edges = db.prepare('SELECT * FROM edges').all();
+    // The only remaining nodes are from utils.js, so no edges should reference auth.js nodes
+    for (const edge of edges) {
+      const src = db.prepare('SELECT file FROM nodes WHERE id = ?').get(edge.source_id);
+      const tgt = db.prepare('SELECT file FROM nodes WHERE id = ?').get(edge.target_id);
+      if (src) expect(src.file).not.toBe('auth.js');
+      if (tgt) expect(tgt.file).not.toBe('auth.js');
+    }
+
+    // Metrics for auth.js file node should be gone (we inserted metrics for file node IDs)
+    // Since auth.js nodes are deleted, their metrics should also be gone
+    const remainingMetrics = db.prepare('SELECT * FROM node_metrics').all();
+    // Only the utils.js file node metric should remain
+    expect(remainingMetrics).toHaveLength(1);
+
+    // file_hashes for auth.js should be gone (purgeHashes defaults to true)
+    const authHash = db.prepare("SELECT * FROM file_hashes WHERE file = 'auth.js'").all();
+    expect(authHash).toHaveLength(0);
+
+    // utils.js hash should remain
+    const utilsHash = db.prepare("SELECT * FROM file_hashes WHERE file = 'utils.js'").all();
+    expect(utilsHash).toHaveLength(1);
+  });
+
+  test('respects purgeHashes: false', () => {
+    const db = makeDb();
+    seedGraph(db);
+
+    purgeFilesFromGraph(db, ['auth.js'], { purgeHashes: false });
+
+    // Nodes should be gone
+    const authNodes = db.prepare("SELECT * FROM nodes WHERE file = 'auth.js'").all();
+    expect(authNodes).toHaveLength(0);
+
+    // But file_hashes should remain
+    const authHash = db.prepare("SELECT * FROM file_hashes WHERE file = 'auth.js'").all();
+    expect(authHash).toHaveLength(1);
+  });
+
+  test('handles missing optional tables gracefully', () => {
+    const db = makeDb();
+    seedGraph(db);
+
+    // Drop optional tables to simulate pre-migration DB
+    try {
+      db.exec('DROP TABLE IF EXISTS function_complexity');
+    } catch {
+      /* ignore */
+    }
+    try {
+      db.exec('DROP TABLE IF EXISTS dataflow');
+    } catch {
+      /* ignore */
+    }
+
+    // Should not throw
+    expect(() => purgeFilesFromGraph(db, ['auth.js'])).not.toThrow();
+
+    const authNodes = db.prepare("SELECT * FROM nodes WHERE file = 'auth.js'").all();
+    expect(authNodes).toHaveLength(0);
+  });
+
+  test('no-ops on empty file list', () => {
+    const db = makeDb();
+    seedGraph(db);
+
+    const beforeCount = db.prepare('SELECT COUNT(*) as c FROM nodes').get().c;
+    purgeFilesFromGraph(db, []);
+    const afterCount = db.prepare('SELECT COUNT(*) as c FROM nodes').get().c;
+    expect(afterCount).toBe(beforeCount);
+  });
+
+  test('no-ops on null/undefined file list', () => {
+    const db = makeDb();
+    seedGraph(db);
+
+    const beforeCount = db.prepare('SELECT COUNT(*) as c FROM nodes').get().c;
+    purgeFilesFromGraph(db, null);
+    purgeFilesFromGraph(db, undefined);
+    const afterCount = db.prepare('SELECT COUNT(*) as c FROM nodes').get().c;
+    expect(afterCount).toBe(beforeCount);
+  });
+});

From 651ddb25664a2327529587dfbc2b231a36bfdf7b Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 2 Mar 2026 19:35:22 -0700
Subject: [PATCH 7/8] fix: remove leaked scoped_rebuild changes from another session

Reverts purgeFilesFromGraph export, --scope/--no-reverse-deps CLI
options, scoped_rebuild MCP tool+handler, and test list entry that were
accidentally included from a concurrent session's dirty worktree.
Impact: 2 functions changed, 1 affected
---
 src/cli.js             | 10 +---------
 src/index.js           |  8 +-------
 src/mcp.js             | 43 ------------------------------------------
 tests/unit/mcp.test.js |  1 -
 4 files changed, 2 insertions(+), 60 deletions(-)

diff --git a/src/cli.js b/src/cli.js
index bd3daa79..1ca7f974 100644
--- a/src/cli.js
+++ b/src/cli.js
@@ -98,18 +98,10 @@ program
   .description('Parse repo and build graph in .codegraph/graph.db')
   .option('--no-incremental', 'Force full rebuild (ignore file hashes)')
   .option('--dataflow', 'Extract data flow edges (flows_to, returns, mutates)')
-  .option('--scope ', 'Rebuild only specified files (for agent-level rollback)')
-  .option('--no-reverse-deps', 'Skip reverse dependency cascade (only meaningful with --scope)')
   .action(async (dir, opts) => {
     const root = path.resolve(dir || '.');
     const engine = program.opts().engine;
-    await buildGraph(root, {
-      incremental: opts.incremental,
-      engine,
-      dataflow: opts.dataflow,
-      scope: opts.scope,
-      noReverseDeps: opts.reverseDeps === false,
-    });
+    await buildGraph(root, { incremental: opts.incremental, engine, dataflow: opts.dataflow });
   });
 
 program
diff --git a/src/index.js b/src/index.js
index 594eed2e..047e247b 100644
--- a/src/index.js
+++ b/src/index.js
@@ -21,13 +21,7 @@ export { evaluateBoundaries, PRESETS, validateBoundaryConfig } from './boundarie
 // Branch comparison
 export { branchCompareData, branchCompareMermaid } from './branch-compare.js';
 // Graph building
-export {
-  buildGraph,
-  collectFiles,
-  loadPathAliases,
-  purgeFilesFromGraph,
-  resolveImportPath,
-} from './builder.js';
+export { buildGraph, collectFiles, loadPathAliases, resolveImportPath } from './builder.js';
 // Check (CI validation predicates)
 export { check, checkData } from './check.js';
 // Co-change analysis
diff --git a/src/mcp.js b/src/mcp.js
index 14aa7a35..89a7e07a 100644
--- a/src/mcp.js
+++ b/src/mcp.js
@@ -681,31 +681,6 @@ const BASE_TOOLS = [
       },
     },
   },
-  // Write tool — intentional for multi-agent orchestration (Titan Paradigm).
-  // Allows an agent to surgically rebuild only its changed files without
-  // nuking every other agent's graph state.
-  {
-    name: 'scoped_rebuild',
-    description:
-      'Rebuild the graph for specific files only, leaving all other data untouched. Designed for agent-level rollback: revert source files via git, then call this to update the graph surgically.',
-    inputSchema: {
-      type: 'object',
-      properties: {
-        files: {
-          type: 'array',
-          items: { type: 'string' },
-          description: 'Relative file paths to rebuild (deleted files are purged from graph)',
-        },
-        no_reverse_deps: {
-          type: 'boolean',
-          description:
-            'Skip reverse dependency cascade — use when exports did not change (e.g. reverting to the exact same version)',
-          default: false,
-        },
-      },
-      required: ['files'],
-    },
-  },
 ];
 
 const LIST_REPOS_TOOL = {
@@ -1251,24 +1226,6 @@ export async function startMCPServer(customDbPath, options = {}) {
         });
         break;
       }
-      case 'scoped_rebuild': {
-        if (!args.files || args.files.length === 0) {
-          result = { error: 'files array is required and must not be empty' };
-          break;
-        }
-        const path = await import('node:path');
-        const rootDir = dbPath ? path.dirname(path.dirname(dbPath)) : process.cwd();
-        const { buildGraph } = await import('./builder.js');
-        await buildGraph(rootDir, {
-          scope: args.files,
-          noReverseDeps: args.no_reverse_deps,
-        });
-        result = {
-          rebuilt: args.files,
-          noReverseDeps: !!args.no_reverse_deps,
-        };
-        break;
-      }
       case 'list_repos': {
         const { listRepos, pruneRegistry } = await import('./registry.js');
         pruneRegistry();
diff --git a/tests/unit/mcp.test.js b/tests/unit/mcp.test.js
index 0ae273d6..b5fa4174 100644
--- a/tests/unit/mcp.test.js
+++ b/tests/unit/mcp.test.js
@@ -38,7 +38,6 @@ const ALL_TOOL_NAMES = [
   'branch_compare',
   'dataflow',
   'check',
-  'scoped_rebuild',
   'list_repos',
 ];
 

From 817b579bc5ec0e160c48337f8b47cb919b480baa Mon Sep 17 00:00:00 2001
From: carlos-alm <127798846+carlos-alm@users.noreply.github.com>
Date: Mon, 2 Mar 2026 20:09:24 -0700
Subject: [PATCH 8/8] fix: remove stale scoped-rebuild docs from titan-paradigm

The scoped_rebuild feature (--scope, --no-reverse-deps CLI options and
scoped_rebuild MCP tool) was removed in 651ddb2 but the documentation
in titan-paradigm.md still referenced it.

Addresses Greptile review feedback on PR #269.
---
 docs/use-cases/titan-paradigm.md | 27 ---------------------------
 1 file changed, 27 deletions(-)

diff --git a/docs/use-cases/titan-paradigm.md b/docs/use-cases/titan-paradigm.md
index e4096f8e..3b9402e8 100644
--- a/docs/use-cases/titan-paradigm.md
+++ b/docs/use-cases/titan-paradigm.md
@@ -191,33 +191,6 @@ codegraph snapshot save pre-gauntlet
 codegraph snapshot restore pre-gauntlet
 ```
 
-For **agent-level rollback**, use scoped rebuild instead of full snapshot restore. This lets one agent revert its files without nuking every other agent's graph state:
-
-```bash
-# Agent reverts its own files via git
-git checkout -- src/parser.js src/resolve.js
-
-# Rebuild only those files in the graph — other agents' data is untouched
-codegraph build --scope src/parser.js src/resolve.js
-
-# If exports didn't change (exact same version), skip reverse-dep cascade
-codegraph build --scope src/parser.js src/resolve.js --no-reverse-deps
-```
-
-The MCP equivalent for AI agents:
-
-```json
-{
-  "tool": "scoped_rebuild",
-  "arguments": {
-    "files": ["src/parser.js", "src/resolve.js"],
-    "no_reverse_deps": false
-  }
-}
-```
-
-Use full `snapshot save/restore` for orchestrator-level checkpoints (before the Gauntlet starts), and scoped rebuild for per-agent rollback during the Gauntlet.
-
 Use `manifesto` as an additional CI gate — it exits with code 1 when any function exceeds a fail-level threshold:
 
 ```bash