From 21a670847ccfdf96befc7f3ca2142df1ade8a018 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <github-actions[bot]@users.noreply.github.com>
Date: Sun, 22 Feb 2026 02:49:26 -0700
Subject: [PATCH 1/3] docs: add competitive analysis and foundation principles

Analyze 21 code intelligence tools, rank codegraph #7/22, and
establish 8 core principles (zero-infrastructure, dual engine,
confidence scoring, incremental builds, embeddable-first,
single registry, security defaults, scope boundaries).
---
 COMPETITIVE_ANALYSIS.md | 177 ++++++++++++++++++++++++++++++++++++++++
 FOUNDATION.md           | 169 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 346 insertions(+)
 create mode 100644 COMPETITIVE_ANALYSIS.md
 create mode 100644 FOUNDATION.md

diff --git a/COMPETITIVE_ANALYSIS.md b/COMPETITIVE_ANALYSIS.md
new file mode 100644
index 00000000..d2cf7d94
--- /dev/null
+++ b/COMPETITIVE_ANALYSIS.md
@@ -0,0 +1,177 @@
+# Competitive Analysis — Code Graph / Code Intelligence Tools
+
+**Date:** 2026-02-22
+**Scope:** 21 code analysis tools compared against `@optave/codegraph`
+
+---
+
+## Overall Ranking
+
+Ranked by weighted score across 6 dimensions (each 1–5):
+
+| # | Score | Project | Stars | Lang | License | Summary |
+|---|-------|---------|-------|------|---------|---------|
+| 1 | 4.5 | [vitali87/code-graph-rag](https://github.com/vitali87/code-graph-rag) | 1,916 | Python | MIT | Graph RAG with Memgraph, multi-provider AI, code editing, semantic search, MCP |
+| 2 | 4.2 | [seatedro/glimpse](https://github.com/seatedro/glimpse) | 349 | Rust | MIT | Clipboard-first codebase-to-LLM tool with call graphs, token counting, LSP resolution |
+| 3 | 4.0 | [SimplyLiz/CodeMCP (CKB)](https://github.com/SimplyLiz/CodeMCP) | 59 | Go | Custom | SCIP-based indexing, compound operations (83% token savings), CODEOWNERS, secret scanning |
+| 4 | 3.9 | [harshkedia177/axon](https://github.com/harshkedia177/axon) | 29 | Python | None | 11-phase pipeline, KuzuDB, Leiden community detection, dead code, change coupling |
+| 5 | 3.8 | [anrgct/autodev-codebase](https://github.com/anrgct/autodev-codebase) | 111 | TypeScript | None | 40+ languages, 7 embedding providers, Cytoscape.js visualization, LLM reranking |
+| 6 | 3.7 | [Anandb71/arbor](https://github.com/Anandb71/arbor) | 85 | Rust | MIT | Native GUI, confidence scoring, architectural role classification, fuzzy search, MCP |
+| **7** | **3.6** | **[@optave/codegraph](https://github.com/optave/codegraph)** | — | **JS/Rust** | **Apache-2.0** | **Dual engine (native Rust + WASM), 11 languages, SQLite, MCP, semantic search, zero-cloud** |
+| 8 | 3.4 | [Durafen/Claude-code-memory](https://github.com/Durafen/Claude-code-memory) | 72 | Python | None | Memory Guard quality gate, persistent codebase memory, Voyage AI + Qdrant |
+| 9 | 3.3 | [NeuralRays/codexray](https://github.com/NeuralRays/codexray) | 2 | TypeScript | MIT | 16 MCP tools, TF-IDF semantic search (~50MB), dead code, complexity, path finding |
+| 10 | 3.2 | [al1-nasir/codegraph-cli](https://github.com/al1-nasir/codegraph-cli) | 11 | Python | MIT | CrewAI multi-agent system, 6 LLM providers, browser explorer, DOCX export |
+| 11 | 3.1 | [anasdayeh/claude-context-local](https://github.com/anasdayeh/claude-context-local) | 0 | Python | None | 100% local, Merkle DAG incremental indexing, sharded FAISS, hybrid BM25+vector, GPU accel |
+| 12 | 3.0 | [Vasu014/loregrep](https://github.com/Vasu014/loregrep) | 12 | Rust | Apache-2.0 | In-memory index library, Rust + Python bindings, AI-tool-ready schemas |
+| 13 | 2.9 | [rahulvgmail/CodeInteliMCP](https://github.com/rahulvgmail/CodeInteliMCP) | 8 | Python | None | DuckDB + ChromaDB (zero Docker), multi-repo, lightweight embedded DBs |
+| 14 | 2.8 | [Bikach/codeGraph](https://github.com/Bikach/codeGraph) | 6 | TypeScript | MIT | Neo4j graph, Claude Code slash commands, Kotlin support, 40-50% cost reduction |
+| 15 | 2.7 | [yumeiriowl/repo-graphrag-mcp](https://github.com/yumeiriowl/repo-graphrag-mcp) | 3 | Python | MIT | LightRAG + tree-sitter, entity merge (code ↔ docs), implementation planning tool |
+| 16 | 2.6 | [0xjcf/MCP_CodeAnalysis](https://github.com/0xjcf/MCP_CodeAnalysis) | 7 | Python/TS | None | Stateful tools (XState), Redis sessions, socio-technical analysis, dual language impl |
+| 17 | 2.5 | [RaheesAhmed/code-context-mcp](https://github.com/RaheesAhmed/code-context-mcp) | 0 | Python | MIT | Security pattern detection, auto architecture diagrams, code flow tracing |
+| 18 | 2.4 | [shantham/codegraph](https://github.com/shantham/codegraph) | 0 | TypeScript | MIT | Polished `npx` one-command installer, sqlite-vss, 7 MCP tools |
+| 19 | 2.3 | [0xd219b/codegraph](https://github.com/0xd219b/codegraph) | 0 | Rust | None | Pure Rust, HTTP server mode, Java + Go support |
+| 20 | 2.1 | [floydw1234/badger-graph](https://github.com/floydw1234/badger-graph) | 0 | Python | None | Dgraph backend (Docker), C struct field access tracking |
+| 21 | 2.0 | [khushil/code-graph-rag](https://github.com/khushil/code-graph-rag) | 0 | Python | MIT | Fork of vitali87/code-graph-rag with no modifications |
+| 22 | 1.8 | [m3et/CodeRAG](https://github.com/m3et/CodeRAG) | 0 | Python | None | Iterative RAG with self-reflection, ChromaDB, Azure OpenAI dependent |
+
+---
+
+## Scoring Breakdown
+
+| # | Project | Features | Analysis Depth | Deploy Simplicity | Lang Support | Code Quality | Community |
+|---|---------|----------|---------------|-------------------|-------------|-------------|-----------|
+| 1 | code-graph-rag | 5 | 4 | 3 | 4 | 4 | 5 |
+| 2 | glimpse | 4 | 4 | 5 | 3 | 5 | 5 |
+| 3 | CKB | 5 | 5 | 4 | 3 | 4 | 3 |
+| 4 | axon | 5 | 5 | 4 | 2 | 4 | 2 |
+| 5 | autodev-codebase | 5 | 3 | 3 | 5 | 3 | 4 |
+| 6 | arbor | 4 | 4 | 5 | 4 | 5 | 3 |
+| **7** | **codegraph (us)** | **3** | **3** | **5** | **4** | **4** | **2** |
+| 8 | Claude-code-memory | 4 | 3 | 3 | 3 | 4 | 3 |
+| 9 | codexray | 5 | 4 | 4 | 4 | 3 | 1 |
+| 10 | codegraph-cli | 5 | 3 | 3 | 2 | 3 | 2 |
+| 11 | claude-context-local | 4 | 3 | 3 | 4 | 4 | 1 |
+| 12 | loregrep | 3 | 3 | 4 | 3 | 5 | 2 |
+| 13 | CodeInteliMCP | 3 | 3 | 4 | 3 | 3 | 1 |
+| 14 | Bikach/codeGraph | 3 | 3 | 3 | 2 | 3 | 1 |
+| 15 | repo-graphrag-mcp | 3 | 3 | 3 | 4 | 3 | 1 |
+| 16 | MCP_CodeAnalysis | 4 | 3 | 3 | 2 | 3 | 1 |
+| 17 | code-context-mcp | 4 | 2 | 3 | 2 | 2 | 1 |
+| 18 | shantham/codegraph | 3 | 2 | 4 | 4 | 3 | 1 |
+| 19 | 0xd219b/codegraph | 2 | 3 | 4 | 1 | 4 | 1 |
+| 20 | badger-graph | 2 | 2 | 2 | 1 | 2 | 1 |
+| 21 | khushil/code-graph-rag | 5 | 4 | 3 | 4 | 4 | 1 |
+| 22 | CodeRAG | 3 | 2 | 2 | 1 | 2 | 1 |
+
+**Scoring criteria:**
+- **Features** (1-5): breadth of tools, MCP integration, search, visualization, export
+- **Analysis Depth** (1-5): how deep the code analysis goes (dead code, complexity, flow tracing, coupling)
+- **Deploy Simplicity** (1-5): ease of setup — zero Docker = 5, requires Docker = 3, complex multi-service = 1
+- **Lang Support** (1-5): number of well-supported programming languages
+- **Code Quality** (1-5): architecture, performance characteristics, engineering rigor
+- **Community** (1-5): stars, contributors, activity, documentation quality
+
+---
+
+## Where Codegraph Wins
+
+| Strength | Details |
+|----------|---------|
+| **Zero-dependency deployment** | `npm install` and done. No Docker, no cloud, no API keys needed. Most competitors require Docker (Memgraph, Neo4j, Dgraph, Qdrant) or cloud APIs |
+| **Dual engine architecture** | Only project with native Rust (napi-rs) + automatic WASM fallback. Others are pure Rust OR pure JS/Python — never both |
+| **Single-repo MCP isolation** | Security-conscious default: tools have no `repo` property unless `--multi-repo` or `--repos` is explicitly passed. Most competitors default to exposing everything |
+| **Incremental builds** | File-hash-based skip of unchanged files. Some competitors re-index everything |
+| **Platform binaries** | Published `@optave/codegraph-{platform}-{arch}` optional packages — true npm-native distribution |
+| **Import resolution depth** | 6-level priority system with confidence scoring — more sophisticated than most competitors' resolution |
+
+---
+
+## Where Codegraph Loses
+
+### vs code-graph-rag (#1, 1,916 stars)
+- **Graph query expressiveness**: Memgraph + Cypher enables arbitrary graph traversals; our SQL queries are more rigid
+- **AI-powered code editing**: they can surgically edit functions via AST targeting with visual diffs
+- **Provider flexibility**: they support Gemini/OpenAI/Claude/Ollama and can mix providers per task
+- **Community**: 1,916 stars — orders of magnitude more traction
+
+### vs glimpse (#2, 349 stars)
+- **LLM workflow optimization**: clipboard-first output + token counting + XML output mode — purpose-built for "code → LLM context"
+- **LSP-based call resolution**: compiler-grade accuracy vs our tree-sitter heuristic approach
+- **Web content processing**: can fetch URLs and convert HTML to markdown for context
+
+### vs CKB (#3, 59 stars)
+- **Indexing accuracy**: SCIP provides compiler-grade cross-file references (type-aware), fundamentally more accurate than tree-sitter for supported languages
+- **Compound operations**: `explore`/`understand`/`prepareChange` batch multiple queries into one call — 83% token reduction, 60-70% fewer tool calls
+- **CODEOWNERS + secret scanning**: enterprise features we lack entirely
+
+### vs axon (#4, 29 stars)
+- **Analysis depth**: their 11-phase pipeline includes community detection (Leiden), execution flow tracing, git change coupling, dead code detection — all features we lack
+- **Graph database**: KuzuDB with native Cypher is more expressive for complex graph queries than our SQLite
+- **Branch structural diff**: compares code structure between branches using git worktrees
+
+### vs autodev-codebase (#5, 111 stars)
+- **Language breadth**: 40+ languages vs our 11
+- **Interactive visualization**: Cytoscape.js call graph explorer in the browser — we only have static DOT/Mermaid
+- **LLM reranking**: secondary LLM pass to improve search relevance — more sophisticated retrieval pipeline
+
+### vs arbor (#6, 85 stars)
+- **Native GUI**: desktop app for interactive impact analysis (we're CLI/MCP only)
+- **Confidence scoring surfaced to users**: every result shows High/Medium/Low confidence
+- **Architectural role classification**: auto-tags symbols as Entry Point / Core Logic / Utility / Adapter
+- **Fuzzy symbol search**: typo tolerance with Jaro-Winkler matching
+
+---
+
+## Features to Adopt — Priority Roadmap
+
+### Tier 1: High impact, low effort
+| Feature | Inspired by | Why |
+|---------|------------|-----|
+| **Dead code detection** | axon, codexray, CKB | We have the graph — find nodes with zero incoming edges (minus entry points/exports). Agents constantly ask "is this used?" |
+| **Fuzzy symbol search** | arbor | Add Levenshtein/Jaro-Winkler to `fn` command. Currently requires exact match |
+| **Expose confidence scores** | arbor | Already computed internally in import resolution — just surface them |
+| **Shortest path A→B** | codexray, arbor | BFS on existing edges table. We have `fn` for single chains but no A→B pathfinding |
+
+### Tier 2: High impact, medium effort
+| Feature | Inspired by | Why |
+|---------|------------|-----|
+| **Compound MCP tools** | CKB | `explore`/`understand` meta-tools that batch deps + fn + map into single responses. Biggest token-savings opportunity |
+| **Token counting on responses** | glimpse, arbor | tiktoken-based counts so agents know context budget consumed |
+| **Node classification** | arbor | Auto-tag Entry Point / Core / Utility / Adapter from in-degree/out-degree patterns |
+| **TF-IDF lightweight search** | codexray | SQLite FTS5 + TF-IDF as a middle tier (~50MB) between "no search" and full transformers (~500MB) |
+
+### Tier 3: High impact, high effort
+| Feature | Inspired by | Why |
+|---------|------------|-----|
+| **Interactive HTML visualization** | autodev-codebase, codegraph-cli | `codegraph viz` → opens interactive vis.js/Cytoscape.js graph in browser |
+| **Git change coupling** | axon | Analyze git history for files that always change together — enhances `diff-impact` |
+| **Community detection** | axon | Leiden algorithm to discover natural module boundaries vs actual file organization |
+| **Execution flow tracing** | axon, code-context-mcp | Framework-aware entry point detection + BFS flow tracing |
+| **Security pattern scanning** | CKB, code-context-mcp | Detect hardcoded secrets, SQL injection patterns, XSS in parsed code |
+
+### Not worth copying
+| Feature | Why skip |
+|---------|----------|
+| Memgraph/Neo4j/KuzuDB | Our SQLite = zero Docker, simpler deployment. Query gap matters less than simplicity |
+| Multi-provider AI | We're deliberately cloud-free — that's a feature, not a limitation |
+| SCIP indexing | Would require maintaining SCIP toolchains per language. Tree-sitter + native Rust is the right bet |
+| CrewAI multi-agent | Overengineered for a code analysis tool. Keep the scope focused |
+| Clipboard/LLM-dump mode | Different product category (glimpse). We're a graph tool, not a context-packer |
+
+---
+
+## Irrelevant Repos (excluded from ranking)
+
+These repos from the initial list were not code analysis / graph tools:
+
+| Repo | What it actually is |
+|------|-------------------|
+| [susliko/tla.nvim](https://github.com/susliko/tla.nvim) | TLA+/PlusCal Neovim plugin for formal verification |
+| [akaash-nigam/AxionApps](https://github.com/akaash-nigam/AxionApps) | Portfolio of 17 Indian social impact mobile apps |
+| [jasonjckn/tree-sitter-clojure](https://github.com/jasonjckn/tree-sitter-clojure) | Fork of Clojure tree-sitter grammar, inactive since 2022 |
+| [omkargade04/sentinel-agent](https://github.com/omkargade04/sentinel-agent) | AI-powered GitHub PR reviewer agent |
+| [rupurt/tree-sitter-graph-nix](https://github.com/rupurt/tree-sitter-graph-nix) | Nix flake packaging for tree-sitter-graph (1.8KB of Nix) |
+| [shandianchengzi/tree_sitter_DataExtractor](https://github.com/shandianchengzi/tree_sitter_DataExtractor) | Academic research on program graph representations for GNNs |
+| [hasssanezzz/GoTypeGraph](https://github.com/hasssanezzz/GoTypeGraph) | Go-only struct/interface relationship visualizer |
+| [romiras/py-cmm-parser](https://github.com/romiras/py-cmm-parser) | Python-only canonical metadata parser with Pyright LSP |
+| [OrkeeAI/orkee](https://github.com/OrkeeAI/orkee) | AI agent orchestration platform (CLI/TUI/Web/Desktop) — adjacent but different category |
diff --git a/FOUNDATION.md b/FOUNDATION.md
new file mode 100644
index 00000000..1e078bf8
--- /dev/null
+++ b/FOUNDATION.md
@@ -0,0 +1,169 @@
+# Codegraph Foundation Document
+
+**Project:** `@optave/codegraph`
+**License:** Apache-2.0
+**Established:** 2026 | Optave AI Solutions Inc.
+
+---
+
+## Why Codegraph Exists
+
+There are 20+ code analysis and code graph tools in the open-source ecosystem. Most require Docker, Python environments, cloud API keys, or external databases. None of them ship as a single npm package with native performance.
+
+Codegraph exists to be **the code intelligence engine for the JavaScript ecosystem** — the one you `npm install` and it just works, on every platform, with nothing else to set up.
+
+---
+
+## Core Principles
+
+These principles define what codegraph is and is not. Every feature decision, PR review, and architectural choice should be measured against them.
+
+### 1. Zero-infrastructure deployment
+
+**Codegraph must never require anything beyond `npm install`.**
+
+No Docker. No external databases. No cloud accounts. No API keys for core functionality. No Python. No Go toolchain. No manual compilation steps.
+
+SQLite is our database because it's embedded. WASM grammars are our fallback because they run everywhere Node.js runs. Optional dependencies (`@huggingface/transformers`, `@modelcontextprotocol/sdk`) are lazy-loaded and degrade gracefully.
+
+This is our single most important differentiator. Every competitor that adds Docker to their install instructions loses users we should capture.
+
+*Test: can a developer on a fresh machine run `npm install @optave/codegraph && codegraph build .` with zero prior setup? If not, we broke this principle.*
+
+### 2. Native speed, universal reach
+
+**The dual engine is our architectural moat.**
+
+Native Rust via napi-rs (rayon-parallelized tree-sitter) for platforms we support. Automatic WASM fallback for everything else. The user never chooses — `--engine auto` detects the right path.
+
+We publish platform-specific optional packages (`@optave/codegraph-{platform}-{arch}`) that npm resolves automatically. This gives us 10-100x parsing speed on supported platforms with zero configuration, while never breaking on unsupported ones.
+
+No other tool in this space has both native performance and universal portability in a single npm package.
+
+*Test: does `codegraph build .` work on macOS ARM, macOS x64, Linux x64, and Windows x64 with native speed — and still work (slower) on any other Node.js-capable platform?*
+
+### 3. Confidence over noise
+
+**Every result should tell you how much to trust it.**
+
+Our 6-level import resolution scores every edge 0.0-1.0. Most tools return all matches (noise) or pick the first one (often wrong). We quantify uncertainty.
+
+This principle extends beyond import resolution. When we add features — dead code detection, impact analysis, search results — they should include confidence or relevance scores. AI agents and developers both benefit from ranked, scored results over raw dumps.
+
+*Test: does every query result include enough context for the consumer to judge its reliability?*
+
+### 4. Incremental by default
+
+**Never re-parse what hasn't changed.**
+
+File-level MD5 hashing tracks what changed between builds. Only modified files get re-parsed, and their stale nodes/edges are cleaned before re-insertion. This makes watch-mode and AI-agent loops practical — rebuilds drop from seconds to milliseconds.
+
+This is not a feature flag. It's the default behavior. The graph is always fresh with minimum work.
+
+*Test: after changing one file in a 1000-file project, does `codegraph build .` complete in under 500ms?*
+
+### 5. Embeddable first, CLI second
+
+**Codegraph is a library that happens to have a CLI, not the other way around.**
+
+Every capability is available through the programmatic API (`src/index.js`). The CLI (`src/cli.js`) and MCP server (`src/mcp.js`) are thin wrappers. This means codegraph can be imported into VS Code extensions, Electron apps, CI pipelines, other MCP servers, and any JavaScript tooling.
+
+Most competitors are CLI-first or server-first. We are library-first. The API surface is the product; the CLI is a convenience.
+
+*Test: can another npm package `import { buildGraph, queryFunction } from '@optave/codegraph'` and use the full feature set programmatically?*
+
+### 6. One registry, one schema, no magic
+
+**Adding a language is one data entry, not an architecture change.**
+
+`LANGUAGE_REGISTRY` in `parser.js` is a declarative list mapping each language to `{ id, extensions, grammarFile, extractor, required }`. `EXTENSIONS` in `constants.js` is derived from it. `SYMBOL_KINDS` in `queries.js` is the exhaustive list of node types.
+
+No language gets special-cased. No hidden configuration. No scattered if-else chains. When someone wants to add Kotlin or Swift support, they add one registry entry and one extractor function.
+
+*Test: can a contributor add a new language in under 100 lines of code, touching at most 2 files?*
+
+### 7. Security-conscious defaults
+
+**Multi-repo access is opt-in, never on by default.**
+
+The MCP server defaults to single-repo mode. Tools have no `repo` property and `list_repos` is not exposed. Only explicit `--multi-repo` or `--repos` flags enable cross-repo access. `allowedRepos` restricts what an MCP client can see.
+
+Credentials are resolved through `apiKeyCommand` (shelling out to external secret managers via `execFileSync` with no shell) — never stored in config files.
+
+This matters because codegraph runs inside AI agents that have broad tool access. Leaking cross-repo data or credentials through an MCP server is a real attack surface.
+
+*Test: does a default `codegraph mcp` invocation expose only the single repo it was pointed at?*
+
+### 8. Honest about what we're not
+
+**We are not a graph database. We are not a RAG system. We are not an AI agent.**
+
+We use SQLite, not Neo4j/Memgraph/KuzuDB. Our queries are hand-written SQL, not Cypher. This is intentional — it keeps us at zero infrastructure.
+
+We offer semantic search via optional embeddings, but we are not a RAG pipeline. We don't generate code, answer questions, or translate natural language to queries.
+
+We expose tools to AI agents via MCP, but we are not an agent ourselves. We don't make decisions, run multi-step workflows, or modify code.
+
+Staying in our lane means we can be embedded inside tools that do those things — without competing with them or duplicating their responsibilities.
+
+---
+
+## What We Build vs. What We Don't
+
+### We will build
+
+- Features that deepen **structural code understanding**: dead code detection, complexity metrics, path finding, community detection — all derivable from our existing graph
+- Features that improve **result quality**: fuzzy search, confidence scoring, node classification, compound queries that reduce agent round-trips
+- Features that improve **speed**: faster native parsing, smarter incremental builds, lighter-weight search alternatives (FTS5/TF-IDF alongside full embeddings)
+- Features that improve **embeddability**: better programmatic API, streaming results, output format options
+
+### We will not build
+
+- External database backends (Memgraph, Neo4j, Qdrant, etc.) — violates Principle 1
+- Cloud API integrations for core functionality — violates Principle 1
+- AI-powered code generation or editing — violates Principle 8
+- Multi-agent orchestration — violates Principle 8
+- Native desktop GUI — outside our lane; we're a library
+- Features that require non-npm dependencies — violates Principle 1
+
+---
+
+## Competitive Position
+
+As of February 2026, codegraph is **#7 out of 22** in the code intelligence tool space (see [COMPETITIVE_ANALYSIS.md](./COMPETITIVE_ANALYSIS.md)).
+
+Six tools rank above us on feature breadth and community size. But none of them occupy our niche: **the npm-native, zero-config, dual-engine code intelligence library.**
+
+| What competitors need | What codegraph needs |
+|-----------------------|----------------------|
+| Docker (Memgraph, Neo4j, Qdrant, Dgraph) | Nothing |
+| Python environment | Nothing |
+| Cloud API keys (OpenAI, Gemini, Voyage AI) | Nothing |
+| Manual Rust/Go compilation | Nothing |
+| External secret management setup | Nothing |
+| | `npm install @optave/codegraph`, and that's it |
+
+Our path to #1 is not feature parity with every competitor. It's making codegraph **the obvious default for any JavaScript developer or tool that needs code intelligence** — because it's the only one that doesn't ask them to leave the npm ecosystem.
+
+---
+
+## Landscape License Overview
+
+How the competitive field is licensed (relevant for understanding what's available to learn from, fork, or integrate):
+
+| License | Count | Projects |
+|---------|-------|----------|
+| **MIT** | 10 | [code-graph-rag](https://github.com/vitali87/code-graph-rag), [glimpse](https://github.com/seatedro/glimpse), [arbor](https://github.com/Anandb71/arbor), [codexray](https://github.com/NeuralRays/codexray), [codegraph-cli](https://github.com/al1-nasir/codegraph-cli), [Bikach/codeGraph](https://github.com/Bikach/codeGraph), [repo-graphrag-mcp](https://github.com/yumeiriowl/repo-graphrag-mcp), [code-context-mcp](https://github.com/RaheesAhmed/code-context-mcp), [shantham/codegraph](https://github.com/shantham/codegraph), [khushil/code-graph-rag](https://github.com/khushil/code-graph-rag) |
+| **Apache-2.0** | 2 | **[@optave/codegraph](https://github.com/optave/codegraph)** (us), [loregrep](https://github.com/Vasu014/loregrep) |
+| **Custom/Other** | 1 | [CodeMCP/CKB](https://github.com/SimplyLiz/CodeMCP) (non-standard license) |
+| **No license** | 9 | [axon](https://github.com/harshkedia177/axon), [autodev-codebase](https://github.com/anrgct/autodev-codebase), [Claude-code-memory](https://github.com/Durafen/Claude-code-memory), [claude-context-local](https://github.com/anasdayeh/claude-context-local), [CodeInteliMCP](https://github.com/rahulvgmail/CodeInteliMCP), [MCP_CodeAnalysis](https://github.com/0xjcf/MCP_CodeAnalysis), [0xd219b/codegraph](https://github.com/0xd219b/codegraph), [badger-graph](https://github.com/floydw1234/badger-graph), [CodeRAG](https://github.com/m3et/CodeRAG) |
+
+**Key implications:**
+- MIT-licensed projects (10/22) are fully open — their approaches, algorithms, and code can be studied and adapted freely
+- 9 projects have **no license at all**, so default copyright applies and all rights are reserved: beyond the viewing and forking that GitHub's Terms of Service permit on the platform, their code cannot legally be copied, modified, or redistributed, even though it's publicly visible
+- CKB (CodeMCP) has a custom license that should be reviewed before integrating or adapting any of its code
+- Our Apache-2.0 license includes an explicit patent grant to users, which MIT lacks, while remaining fully open source. This was a deliberate choice for enterprise adoption
+
+---
+
+*This document should be revisited when the competitive landscape shifts meaningfully, or when a proposed feature contradicts one of the core principles above.*

From aacb44c9f560b3a26269b50cbd5b50839cdeeb3a Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <github-actions[bot]@users.noreply.github.com>
Date: Sun, 22 Feb 2026 02:54:05 -0700
Subject: [PATCH 2/3] docs: add v1.5.0 release notes to CHANGELOG

---
 CHANGELOG.md | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e1d53e5e..d81a8f2d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,42 @@
 
 All notable changes to this project will be documented in this file. See [commit-and-tag-version](https://github.com/absolute-version/commit-and-tag-version) for commit guidelines.
 
+## [1.5.0](https://github.com/optave/codegraph/compare/v1.4.0...v1.5.0) (2026-02-22)
+
+**Phase 2.5 — Multi-Repo MCP & Structural Analysis.** This release adds multi-repo support for AI agents, structural analysis with architectural metrics, and hardens security across the MCP server and SQL layers.
+
+### ⚠ BREAKING CHANGES
+
+* **parser:** Node kinds now use language-native types — Go structs → `struct`, Rust structs/enums/traits → `struct`/`enum`/`trait`, Java enums → `enum`, C# structs/records/enums → `struct`/`record`/`enum`, PHP traits/enums → `trait`/`enum`, Ruby modules → `module`. Rebuild required: `codegraph build --no-incremental`. ([72535fb](https://github.com/optave/codegraph/commit/72535fba44e56312fb8d5b21e19bdcbec1ea9f5e))
+
+### Features
+
+* **mcp:** add multi-repo MCP support with global registry at `~/.codegraph/registry.json` — optional `repo` param on all 11 tools, new `list_repos` tool, auto-register on build ([54ea9f6](https://github.com/optave/codegraph/commit/54ea9f6c497f1c7ad4c2f0199b4a951af0a51c62))
+* **mcp:** default MCP server to single-repo mode for security isolation — multi-repo access requires explicit `--multi-repo` or `--repos` opt-in ([49c07ad](https://github.com/optave/codegraph/commit/49c07ad725421710af3dd3cce5b3fc7028ab94a8))
+* **registry:** harden multi-repo registry — `pruneRegistry()` removes stale entries, `--repos` allowlist for repo-level access control, auto-suffix name collisions ([a413ea7](https://github.com/optave/codegraph/commit/a413ea73ff2ab12b4d500d07bd7f71bc319c9f54))
+* **structure:** add structural analysis with directory nodes, containment edges, and metrics (symbol density, avg fan-out, cohesion scores) ([a413ea7](https://github.com/optave/codegraph/commit/a413ea73ff2ab12b4d500d07bd7f71bc319c9f54))
+* **cli:** add `codegraph structure [dir]`, `codegraph hotspots`, and `codegraph registry list|add|remove|prune` commands ([a413ea7](https://github.com/optave/codegraph/commit/a413ea73ff2ab12b4d500d07bd7f71bc319c9f54))
+* **export:** extend DOT/Mermaid export with directory clusters ([a413ea7](https://github.com/optave/codegraph/commit/a413ea73ff2ab12b4d500d07bd7f71bc319c9f54))
+* **parser:** add `SYMBOL_KINDS` constant and granular node types across both WASM and native Rust extractors ([72535fb](https://github.com/optave/codegraph/commit/72535fba44e56312fb8d5b21e19bdcbec1ea9f5e))
+
+### Bug Fixes
+
+* **security:** eliminate SQL interpolation in `hotspotsData` — replace dynamic string interpolation with static map of pre-built prepared statements ([f8790d7](https://github.com/optave/codegraph/commit/f8790d772989070903adbeeb30720789890591d9))
+* **parser:** break `parser.js` ↔ `constants.js` circular dependency by inlining path normalization ([36239e9](https://github.com/optave/codegraph/commit/36239e91de43a6c6747951a84072953ea05e2321))
+* **structure:** add `NULLS LAST` to hotspots `ORDER BY` clause ([a41668f](https://github.com/optave/codegraph/commit/a41668f55ff8c18acb6dde883b9e98c3113abf7d))
+* **ci:** add license scan allowlist for `@img/sharp-*` dual-licensed packages ([9fbb084](https://github.com/optave/codegraph/commit/9fbb0848b4523baca71b94e7bceeb569773c8b45))
+
+### Testing
+
+* add 18 unit tests for registry, 4 MCP integration tests, 4 CLI integration tests for multi-repo ([54ea9f6](https://github.com/optave/codegraph/commit/54ea9f6c497f1c7ad4c2f0199b4a951af0a51c62))
+* add 277 unit tests and 182 integration tests for structural analysis ([a413ea7](https://github.com/optave/codegraph/commit/a413ea73ff2ab12b4d500d07bd7f71bc319c9f54))
+* add MCP single-repo / multi-repo mode tests ([49c07ad](https://github.com/optave/codegraph/commit/49c07ad725421710af3dd3cce5b3fc7028ab94a8))
+* add registry hardening tests (pruning, allowlist, name collision) ([a413ea7](https://github.com/optave/codegraph/commit/a413ea73ff2ab12b4d500d07bd7f71bc319c9f54))
+
+### Documentation
+
+* add dogfooding guide for self-analysis with codegraph ([36239e9](https://github.com/optave/codegraph/commit/36239e91de43a6c6747951a84072953ea05e2321))
+
 ## [1.4.0](https://github.com/optave/codegraph/compare/v1.3.0...v1.4.0) (2026-02-22)
 
 **Phase 2 — Foundation Hardening** is complete. This release hardens the core infrastructure: a declarative parser registry, a full MCP server, significantly improved test coverage, and secure credential management.
@@ -31,7 +67,6 @@ All notable changes to this project will be documented in this file. See [commit
 * add license compliance workflow and CI testing pipeline ([eeeb68b](https://github.com/optave/codegraph/commit/eeeb68b))
 * add OIDC trusted publishing with `--provenance` for npm packages ([bc595f7](https://github.com/optave/codegraph/commit/bc595f7))
 * add automated semantic versioning and commit enforcement ([b8e5277](https://github.com/optave/codegraph/commit/b8e5277))
-* add Claude Code review action for PRs ([eb5d9f2](https://github.com/optave/codegraph/commit/eb5d9f2))
 * add Biome linter and formatter ([a6e6bd4](https://github.com/optave/codegraph/commit/a6e6bd4))
 
 ### Bug Fixes

From 1571f2a864cca2ae812327b180d63b46ab465077 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <github-actions[bot]@users.noreply.github.com>
Date: Sun, 22 Feb 2026 03:11:27 -0700
Subject: [PATCH 3/3] fix: harden publish workflow version resolution

The release trigger had no access to version-override inputs, causing
commit-and-tag-version to fall through to auto-detection, which silently
produced a stale version. The workflow now extracts the version from the
release tag, verifies that the bump actually happened, and checks the npm
registry before publishing to catch version conflicts early.
---
 .github/workflows/publish.yml | 33 ++++++++++++++++++++++++++--
 README.md                     | 41 +++++++++++++++++------------------
 2 files changed, 51 insertions(+), 23 deletions(-)
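
A minimal sketch of the resolution and verification logic from the
publish.yml hunk below, pulled out into standalone shell functions so it can
be exercised locally. The function names `resolve_override` and `verify_bump`
are hypothetical; the workflow inlines this logic in its `run:` step. The
parameter expansion `${tag_name#v}` is swapped in here for the workflow's
`sed 's/^v//'` and should behave the same for a single leading "v".

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of the workflow.

# Pick the version override: the release tag with a leading "v" removed
# for release events, otherwise the manually supplied input (may be empty).
resolve_override() {
  local event_name="$1" tag_name="$2" input_override="$3"
  if [ "$event_name" = "release" ]; then
    printf '%s\n' "${tag_name#v}"
  else
    printf '%s\n' "$input_override"
  fi
}

# Fail when the version did not change and no override explains why.
verify_bump() {
  local current="$1" new_version="$2" override="$3"
  if [ "$new_version" = "$current" ] && [ "$current" != "$override" ]; then
    echo "Version was not bumped (still $current)" >&2
    return 1
  fi
}
```

For example, `resolve_override release v1.4.1 ""` prints `1.4.1`, and
`verify_bump 1.4.0 1.4.0 ""` returns a non-zero status because nothing bumped
the version and no override accounts for it.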

diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml
index cd51fdf0..11ab9062 100644
--- a/.github/workflows/publish.yml
+++ b/.github/workflows/publish.yml
@@ -125,7 +125,15 @@ jobs:
         run: |
           git checkout -- package-lock.json
           CURRENT=$(node -p "require('./package.json').version")
-          OVERRIDE="${{ inputs.version-override }}"
+
+          # For release trigger, extract version from tag; for workflow_dispatch, use input
+          if [ "${{ github.event_name }}" = "release" ]; then
+            OVERRIDE=$(echo "${{ github.event.release.tag_name }}" | sed 's/^v//')
+            echo "Release trigger — using version from tag: $OVERRIDE"
+          else
+            OVERRIDE="${{ inputs.version-override }}"
+          fi
+
           if [ -n "$OVERRIDE" ] && [ "$CURRENT" = "$OVERRIDE" ]; then
             echo "Version already at $OVERRIDE — skipping bump"
           elif [ -n "$OVERRIDE" ]; then
@@ -133,13 +141,34 @@ jobs:
           else
             npx commit-and-tag-version
           fi
-          echo "new_version=$(node -p "require('./package.json').version")" >> "$GITHUB_OUTPUT"
+
+          NEW_VERSION=$(node -p "require('./package.json').version")
+          echo "new_version=$NEW_VERSION" >> "$GITHUB_OUTPUT"
+
+          # Verify the version was actually bumped (unless it already matched the override)
+          if [ "$NEW_VERSION" = "$CURRENT" ] && [ "$CURRENT" != "$OVERRIDE" ]; then
+            echo "::error::Version was not bumped (still $CURRENT). Check commit history or provide a version-override."
+            exit 1
+          fi
+
+          echo "Will publish version $NEW_VERSION (was $CURRENT)"
 
       - name: Download native artifacts
         uses: actions/download-artifact@v4
         with:
           path: artifacts/
 
+      - name: Verify version not already on npm
+        run: |
+          VERSION="${{ steps.version.outputs.new_version }}"
+          PKG="@optave/codegraph"
+          echo "Checking if $PKG@$VERSION already exists on npm..."
+          if [ -n "$(npm view "$PKG@$VERSION" version 2>/dev/null)" ]; then
+            echo "::error::$PKG@$VERSION is already published on npm. Bump to a higher version."
+            exit 1
+          fi
+          echo "$PKG@$VERSION is not yet published — proceeding"
+
       - name: Publish platform packages
         shell: bash
         run: |
diff --git a/README.md b/README.md
index b1f23880..49d12e8e 100644
--- a/README.md
+++ b/README.md
@@ -45,20 +45,19 @@ Most dependency graph tools only tell you which **files** import which — codeg
 
 ### Feature comparison
 
-| Capability | codegraph | Madge | dep-cruiser | Skott | Nx graph | Sourcetrail | GitNexus |
-|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-| Function-level analysis | **Yes** | — | — | — | — | **Yes** | **Yes** |
-| Multi-language | **11** | 1 | 1 | 1 | Any (project) | 4 | 9 |
-| Semantic search | **Yes** | — | — | — | — | — | **Yes** |
-| MCP / AI agent support | **Yes** | — | — | — | — | — | **Yes** |
-| Git diff impact | **Yes** | — | — | — | Partial | — | **Yes** |
-| Persistent database | **Yes** | — | — | — | — | Yes | **Yes** |
-| Watch mode | **Yes** | — | — | — | Daemon | — | — |
-| CI workflow included | **Yes** | — | Rules | — | Yes | — | — |
-| Cycle detection | **Yes** | Yes | Yes | Yes | — | — | — |
-| Zero config | **Yes** | Yes | — | Yes | — | — | **Yes** |
-| Fully local / no telemetry | **Yes** | Yes | Yes | Yes | Partial | Yes | **Yes** |
-| Free & open source | **Yes** | Yes | Yes | Yes | Partial | Archived | No |
+| Capability | codegraph | [code-graph-rag](https://github.com/vitali87/code-graph-rag) | [glimpse](https://github.com/seatedro/glimpse) | [CodeMCP](https://github.com/SimplyLiz/CodeMCP) | [axon](https://github.com/harshkedia177/axon) | [autodev-codebase](https://github.com/anrgct/autodev-codebase) | [arbor](https://github.com/Anandb71/arbor) | [Claude-code-memory](https://github.com/Durafen/Claude-code-memory) |
+|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+| Function-level analysis | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | **Yes** | — |
+| Multi-language | **11** | Multi | Multi | SCIP langs | Few | **40+** | Multi | — |
+| Semantic search | **Yes** | **Yes** | — | — | — | **Yes** | **Yes** | **Yes** |
+| MCP / AI agent support | **Yes** | **Yes** | — | **Yes** | — | — | **Yes** | **Yes** |
+| Git diff impact | **Yes** | — | — | — | **Yes** | — | — | — |
+| Watch mode | **Yes** | — | — | — | — | — | — | — |
+| CI workflow included | **Yes** | — | — | — | — | — | — | — |
+| Cycle detection | **Yes** | — | — | — | **Yes** | — | — | — |
+| Zero config | **Yes** | — | **Yes** | — | — | — | **Yes** | — |
+| Fully local / no telemetry | **Yes** | Partial | **Yes** | **Yes** | **Yes** | Partial | **Yes** | — |
+| Free & open source | **Yes** | Yes | Yes | Custom | — | — | Yes | — |
 
 ### What makes codegraph different
 
@@ -78,17 +77,17 @@ Many tools in this space are cloud-based or SaaS — meaning your code leaves yo
 
 | Tool | What it does well | Where it falls short |
 |---|---|---|
+| [code-graph-rag](https://github.com/vitali87/code-graph-rag) | Graph RAG with Memgraph, multi-provider AI, semantic search, code editing via AST | Requires Docker (Memgraph), depends on cloud AI providers, complex setup |
+| [glimpse](https://github.com/seatedro/glimpse) | Clipboard-first LLM context tool, call graphs, LSP resolution, token counting | Context-packing tool, not a dependency graph — no persistence, no queries |
+| [CodeMCP](https://github.com/SimplyLiz/CodeMCP) | SCIP compiler-grade indexing, compound operations (83% token savings), secret scanning | Custom license, requires SCIP toolchains per language, limited language coverage |
+| [axon](https://github.com/harshkedia177/axon) | 11-phase pipeline, KuzuDB, community detection, dead code, change coupling | No license, Python-focused, limited language support |
+| [autodev-codebase](https://github.com/anrgct/autodev-codebase) | 40+ languages, interactive Cytoscape.js visualization, LLM reranking | No license, some embedding providers require cloud APIs, complex setup |
+| [arbor](https://github.com/Anandb71/arbor) | Native GUI, confidence scoring, architectural role classification, fuzzy search | GUI-focused — no CLI pipeline, no watch mode, no CI integration |
+| [Claude-code-memory](https://github.com/Durafen/Claude-code-memory) | Persistent codebase memory for Claude Code, Memory Guard quality gate | Cloud-dependent (Voyage AI), requires Qdrant, not a code analysis tool |
 | [Madge](https://github.com/pahen/madge) | Simple file-level JS/TS dependency graphs | No function-level analysis, no impact tracing, JS/TS only |
 | [dependency-cruiser](https://github.com/sverweij/dependency-cruiser) | Architectural rule validation for JS/TS | Module-level only (function-level explicitly out of scope), requires config |
-| [Skott](https://github.com/antoine-music/skott) | Module graph with unused code detection | File-level only, JS/TS only, no persistent database |
 | [Nx graph](https://nx.dev/) | Monorepo project-level dependency graph | Requires Nx workspace, project-level only (not file or function) |
-| [Sourcetrail](https://github.com/CoatiSoftware/Sourcetrail) | Rich GUI with symbol-level graphs | Archived/discontinued (2021), no JS/TS, no CLI |
-| [Sourcegraph](https://sourcegraph.com/) | Enterprise code search and navigation | Cloud/SaaS — code sent to servers, $19+/user/mo, no longer open source |
-| [CodeSee](https://www.codesee.io/) | Visual codebase maps | Cloud-based — code leaves your machine, acquired by GitKraken |
-| [Understand](https://scitools.com/) | Deep multi-language static analysis | $100+/month per seat, proprietary, GUI-only, no CI or AI integration |
-| [Snyk Code](https://snyk.io/) | AI-powered security scanning | Cloud-based — code sent to Snyk servers for analysis, not a dependency graph tool |
 | [pyan](https://github.com/Technologicat/pyan) / [cflow](https://www.gnu.org/software/cflow/) | Function-level call graphs | Single-language each (Python / C only), no persistence, no queries |
-| [GitNexus](https://gitnexus.dev/) | Function-level graph with hybrid search and MCP | PolyForm Noncommercial license, no watch mode, no cycle detection, no CI workflow |
 
 ---