From 521038469b858fc103020a1d4a11e3015802ae75 Mon Sep 17 00:00:00 2001 From: carlos-alm <127798846+carlos-alm@users.noreply.github.com> Date: Mon, 2 Mar 2026 18:00:11 -0700 Subject: [PATCH] docs: revise architecture audit and roadmap for v2.6.0 Re-evaluate all architectural recommendations against the actual codebase as it grew from v1.4.0 (5K lines, 12 modules) to v2.6.0 (17,830 lines, 35 modules). Architecture audit: - Reprioritize: dual-function anti-pattern across 15 modules is now #1 (was analysis/formatting split at #3) - Downgrade parser plugin system from #1 to #20 (parser.js shrank to 404 lines after native engine took over) - Add 3 new recommendations: decompose complexity.js (2,163 lines), unified graph model for structure/cochange/communities, pagination standardization - Update all metrics and line counts to current state Roadmap: - Add Phase 2.5 (Analysis Expansion) documenting 18 modules shipped across v2.0.0-v2.6.0 (complexity, communities, structure, flow, cochange, manifesto, boundaries, check, audit, batch, triage, hybrid search, owners, snapshot, etc.) - Mark Phase 5.3 (Hybrid Search) as completed early in Phase 2.5 - Update Phase 3 priorities based on revised architecture analysis - Update version to 2.6.0, language count to 11, phase count to 10 - Add Phase 8 note referencing check command foundation from 2.5 --- docs/roadmap/ROADMAP.md | 755 ++++++++++++++++++++-------------- generated/architecture.md | 828 +++++++++++++++++++------------------- 2 files changed, 879 insertions(+), 704 deletions(-) diff --git a/docs/roadmap/ROADMAP.md b/docs/roadmap/ROADMAP.md index 4f484509..2da92156 100644 --- a/docs/roadmap/ROADMAP.md +++ b/docs/roadmap/ROADMAP.md @@ -1,8 +1,8 @@ # Codegraph Roadmap -> **Current version:** 1.4.0 | **Status:** Active development | **Updated:** February 2026 +> **Current version:** 2.6.0 | **Status:** Active development | **Updated:** March 2026 -Codegraph is a strong local-first code graph CLI. 
This roadmap describes planned improvements across nine phases — closing gaps with commercial code intelligence platforms while preserving codegraph's core strengths: fully local, open source, zero cloud dependency by default. +Codegraph is a strong local-first code graph CLI. This roadmap describes planned improvements across ten phases -- closing gaps with commercial code intelligence platforms while preserving codegraph's core strengths: fully local, open source, zero cloud dependency by default. **LLM strategy:** All LLM-powered features are **optional enhancements**. Everything works without an API key. When configured (OpenAI, Anthropic, Ollama, or any OpenAI-compatible endpoint), users unlock richer semantic search and natural language queries. @@ -14,11 +14,12 @@ Codegraph is a strong local-first code graph CLI. This roadmap describes planned |-------|-------|-----------------|--------| | [**1**](#phase-1--rust-core) | Rust Core | Rust parsing engine via napi-rs, parallel parsing, incremental tree-sitter, JS orchestration layer | **Complete** (v1.3.0) | | [**2**](#phase-2--foundation-hardening) | Foundation Hardening | Parser registry, complete MCP, test coverage, enhanced config, multi-repo MCP | **Complete** (v1.4.0) | -| [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring | Parser plugin system, repository pattern, pipeline builder, engine strategy, analysis/formatting split, domain errors, CLI commands, composable MCP, curated API | Planned | -| [**4**](#phase-4--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf → core → orchestration module migration, test migration | Planned | -| [**5**](#phase-5--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, hybrid search, build-time semantic metadata, module summaries | Planned | +| [**2.5**](#phase-25--analysis-expansion) | Analysis Expansion | Complexity metrics, community detection, flow tracing, co-change, manifesto, 
boundary rules, check, triage, audit, batch, hybrid search | **Complete** (v2.6.0) | +| [**3**](#phase-3--architectural-refactoring) | Architectural Refactoring | Command/query separation, repository pattern, queries.js decomposition, composable MCP, CLI commands, domain errors, curated API, unified graph model | Planned | +| [**4**](#phase-4--typescript-migration) | TypeScript Migration | Project setup, core type definitions, leaf -> core -> orchestration module migration, test migration | Planned | +| [**5**](#phase-5--intelligent-embeddings) | Intelligent Embeddings | LLM-generated descriptions, enhanced embeddings, build-time semantic metadata, module summaries | Planned | | [**6**](#phase-6--natural-language-queries) | Natural Language Queries | `ask` command, conversational sessions, LLM-narrated graph queries, onboarding tools | Planned | -| [**7**](#phase-7--expanded-language-support) | Expanded Language Support | 8 new languages (12 → 20), parser utilities | Planned | +| [**7**](#phase-7--expanded-language-support) | Expanded Language Support | 8 new languages (11 -> 19), parser utilities | Planned | | [**8**](#phase-8--github-integration--ci) | GitHub Integration & CI | Reusable GitHub Action, LLM-enhanced PR review, visual impact graphs, SARIF output | Planned | | [**9**](#phase-9--interactive-visualization--advanced-features) | Visualization & Advanced | Web UI, dead code detection, monorepo, agentic search, refactoring analysis | Planned | @@ -26,36 +27,37 @@ Codegraph is a strong local-first code graph CLI. 
This roadmap describes planned ``` Phase 1 (Rust Core) - └──→ Phase 2 (Foundation Hardening) - └──→ Phase 3 (Architectural Refactoring) - └──→ Phase 4 (TypeScript Migration) - ├──→ Phase 5 (Embeddings + Metadata) ──→ Phase 6 (NL Queries + Narration) - ├──→ Phase 7 (Languages) - └──→ Phase 8 (GitHub/CI) ←── Phase 5 (risk_score, side_effects) -Phases 1-6 ──→ Phase 9 (Visualization + Refactoring Analysis) + |--> Phase 2 (Foundation Hardening) + |--> Phase 2.5 (Analysis Expansion) + |--> Phase 3 (Architectural Refactoring) + |--> Phase 4 (TypeScript Migration) + |--> Phase 5 (Embeddings + Metadata) --> Phase 6 (NL Queries + Narration) + |--> Phase 7 (Languages) + |--> Phase 8 (GitHub/CI) <-- Phase 5 (risk_score, side_effects) +Phases 1-6 --> Phase 9 (Visualization + Refactoring Analysis) ``` --- -## Phase 1 — Rust Core ✅ +## Phase 1 -- Rust Core ✅ -> **Status:** Complete — shipped in v1.3.0 +> **Status:** Complete -- shipped in v1.3.0 **Goal:** Move the CPU-intensive parsing and graph engine to Rust, keeping JS for CLI orchestration, MCP, and embeddings. This unlocks parallel parsing, incremental tree-sitter, lower memory usage, and optional standalone binary distribution. -### 1.1 — Rust Workspace & napi-rs Setup ✅ +### 1.1 -- Rust Workspace & napi-rs Setup ✅ Bootstrap the Rust side of the project. - Create `crates/codegraph-core/` with a Cargo workspace -- Set up [napi-rs](https://napi.rs/) to compile Rust → `.node` native addon +- Set up [napi-rs](https://napi.rs/) to compile Rust -> `.node` native addon - Configure CI matrix for prebuilt binaries: `linux-x64`, `darwin-arm64`, `darwin-x64`, `win32-x64` - Add npm optionalDependencies for platform-specific packages (same pattern as SWC/esbuild) - Fallback to existing JS/WASM path if native addon is unavailable **Result:** `npm install` pulls a prebuilt binary; no Rust toolchain required for end users. 
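The SWC/esbuild-style platform-package pattern mentioned above can be sketched in `package.json` as follows -- the scoped package names here are illustrative placeholders, not the actual published packages:

```json
{
  "optionalDependencies": {
    "@codegraph/core-linux-x64": "1.3.0",
    "@codegraph/core-darwin-arm64": "1.3.0",
    "@codegraph/core-darwin-x64": "1.3.0",
    "@codegraph/core-win32-x64": "1.3.0"
  }
}
```

npm installs only the package whose platform constraints match the host, so each user downloads exactly one prebuilt `.node` binary; if none matches, install still succeeds and the JS/WASM fallback path is used.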
-### 1.2 — Native tree-sitter Parsing ✅ +### 1.2 -- Native tree-sitter Parsing ✅ Replace WASM-based parsing with native tree-sitter in Rust. @@ -68,7 +70,7 @@ Replace WASM-based parsing with native tree-sitter in Rust. **Affected files:** `src/parser.js` (becomes a thin JS wrapper over native addon) -### 1.3 — Incremental Parsing ✅ +### 1.3 -- Incremental Parsing ✅ Leverage native tree-sitter's `edit + re-parse` API. @@ -80,7 +82,7 @@ Leverage native tree-sitter's `edit + re-parse` API. **Affected files:** `src/watcher.js`, `src/parser.js` -### 1.4 — Import Resolution & Graph Algorithms in Rust ✅ +### 1.4 -- Import Resolution & Graph Algorithms in Rust ✅ Move the hot-path graph logic to Rust. @@ -91,12 +93,12 @@ Move the hot-path graph logic to Rust. **Result:** Import resolution and cycle detection run in Rust with full type safety. Complex state machines benefit from Rust's type system. -### 1.5 — Graceful Degradation & Migration ✅ +### 1.5 -- Graceful Degradation & Migration ✅ Ensure the transition is seamless. - Keep the existing JS/WASM parser as a fallback when the native addon is unavailable -- Auto-detect at startup: native addon available → use Rust path; otherwise → WASM path +- Auto-detect at startup: native addon available -> use Rust path; otherwise -> WASM path - No breaking changes to CLI, MCP, or programmatic API - Add `--engine native|wasm` flag for explicit selection - Migrate existing tests to validate both engines produce identical output @@ -105,13 +107,13 @@ Ensure the transition is seamless. --- -## Phase 2 — Foundation Hardening ✅ +## Phase 2 -- Foundation Hardening ✅ -> **Status:** Complete — shipped in v1.4.0 +> **Status:** Complete -- shipped in v1.4.0 **Goal:** Fix structural issues that make subsequent phases harder. -### 2.1 — Language Parser Registry ✅ +### 2.1 -- Language Parser Registry ✅ Replace scattered parser init/selection logic with a single declarative registry. 
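A registry of this kind might look like the following sketch -- the entries and grammar names are illustrative, not the actual `LANGUAGE_REGISTRY` contents:

```javascript
// Hypothetical declarative parser registry: one table maps languages to
// their extensions and grammars, replacing scattered if/else selection.
const LANGUAGE_REGISTRY = {
  javascript: { extensions: ['.js', '.jsx', '.mjs'], grammar: 'tree-sitter-javascript' },
  typescript: { extensions: ['.ts', '.tsx'], grammar: 'tree-sitter-typescript' },
  python: { extensions: ['.py'], grammar: 'tree-sitter-python' },
};

// Derived extension -> language lookup, built once from the registry.
const EXTENSION_MAP = new Map(
  Object.entries(LANGUAGE_REGISTRY).flatMap(([lang, { extensions }]) =>
    extensions.map((ext) => [ext, lang])
  )
);

function languageForFile(filePath) {
  const ext = filePath.slice(filePath.lastIndexOf('.'));
  return EXTENSION_MAP.get(ext) ?? null;
}
```

Adding a language becomes a one-line registry entry; selection logic never changes.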
@@ -125,9 +127,9 @@ Replace scattered parser init/selection logic with a single declarative registry **Affected files:** `src/parser.js`, `src/constants.js` -### 2.2 — Complete MCP Server ✅ +### 2.2 -- Complete MCP Server ✅ -Expose all CLI capabilities through MCP, going from 5 → 11 tools. +Expose all CLI capabilities through MCP, going from 5 -> 11 tools. | New tool | Wraps | Description | |----------|-------|-------------| @@ -136,11 +138,11 @@ Expose all CLI capabilities through MCP, going from 5 → 11 tools. | ✅ `diff_impact` | `diffImpactData` | Git diff impact analysis | | ✅ `semantic_search` | `searchData` | Embedding-powered search | | ✅ `export_graph` | export functions | DOT/Mermaid/JSON export | -| ✅ `list_functions` | — | List functions in a file or by pattern | +| ✅ `list_functions` | -- | List functions in a file or by pattern | **Affected files:** `src/mcp.js` -### 2.3 — Test Coverage Gaps ✅ +### 2.3 -- Test Coverage Gaps ✅ Add tests for currently untested modules. @@ -149,9 +151,9 @@ Add tests for currently untested modules. | ✅ `tests/unit/mcp.test.js` | All MCP tools (mock stdio transport) | | ✅ `tests/unit/config.test.js` | Config loading, defaults, env overrides, apiKeyCommand | | ✅ `tests/integration/cli.test.js` | End-to-end CLI smoke tests | -| ✅ `tests/unit/*.test.js` | Unit tests for 8 core modules (coverage 62% → 75%) | +| ✅ `tests/unit/*.test.js` | Unit tests for 8 core modules (coverage 62% -> 75%) | -### 2.4 — Enhanced Configuration ✅ +### 2.4 -- Enhanced Configuration ✅ New configuration options in `.codegraphrc.json`: @@ -171,11 +173,11 @@ New configuration options in `.codegraphrc.json`: ``` - ✅ Environment variable fallbacks: `CODEGRAPH_LLM_PROVIDER`, `CODEGRAPH_LLM_API_KEY`, `CODEGRAPH_LLM_MODEL` -- ✅ `apiKeyCommand` — shell out to external secret managers (1Password, Bitwarden, Vault, pass, macOS Keychain) at runtime via `execFileSync` (no shell injection). Priority: command output > env var > file config > defaults. 
Graceful fallback on failure. +- ✅ `apiKeyCommand` -- shell out to external secret managers (1Password, Bitwarden, Vault, pass, macOS Keychain) at runtime via `execFileSync` (no shell injection). Priority: command output > env var > file config > defaults. Graceful fallback on failure. **Affected files:** `src/config.js` -### 2.5 — Multi-Repo MCP ✅ +### 2.5 -- Multi-Repo MCP ✅ Support querying multiple codebases from a single MCP server instance. @@ -191,299 +193,457 @@ Support querying multiple codebases from a single MCP server instance. --- -## Phase 3 — Architectural Refactoring +## Phase 2.5 -- Analysis Expansion ✅ -**Goal:** Restructure the codebase for modularity, testability, and long-term maintainability. These are internal improvements — no new user-facing features, but they make every subsequent phase easier to build and maintain. +> **Status:** Complete -- shipped across v2.0.0 -> v2.6.0 -> Reference: [generated/architecture.md](../generated/architecture.md) — full analysis with code examples and rationale. +**Goal:** Build a comprehensive analysis toolkit on top of the graph -- complexity metrics, community detection, risk triage, architecture boundary enforcement, CI validation, and hybrid search. This phase emerged organically as features were needed and wasn't in the original roadmap. -### 3.1 — Parser Plugin System +### 2.5.1 -- Complexity Metrics ✅ -Split `parser.js` (2,200+ lines) into a modular directory structure with isolated per-language extractors. +Per-function complexity analysis using language-specific AST rules. 
-``` -src/parser/ - index.js # Public API: parseFileAuto, parseFilesAuto - registry.js # LANGUAGE_REGISTRY + extension mapping - engine.js # Native/WASM init, engine resolution, grammar loading - tree-utils.js # findChild, findParentClass, walkTree helpers - base-extractor.js # Shared walk loop + accumulator framework - extractors/ - javascript.js # JS/TS/TSX - python.js - go.js - rust.js - java.js - csharp.js - ruby.js - php.js - hcl.js -``` +- ✅ Cognitive complexity, cyclomatic complexity, max nesting depth for 8 languages +- ✅ Halstead metrics (vocabulary, volume, difficulty, effort, bugs) +- ✅ LOC, SLOC, comment lines per function +- ✅ Maintainability Index (MI) computation +- ✅ Native Rust engine support for all complexity metrics +- ✅ CLI: `codegraph complexity [target]` with `--sort`, `--limit`, `--kind` options +- ✅ `function_complexity` DB table for persistent storage -Introduce a `BaseExtractor` that owns the tree walk loop. Each language extractor declares a `nodeType → handler` map instead of reimplementing the traversal. Eliminates repeated walk-and-switch boilerplate across 9+ extractors. +**New file:** `src/complexity.js` (2,163 lines) -**Affected files:** `src/parser.js` → split into `src/parser/` +### 2.5.2 -- Community Detection & Drift ✅ -### 3.2 — Repository Pattern for Data Access +Louvain community detection at file or function level. -Consolidate all SQL into a single `Repository` class. Currently SQL is scattered across `builder.js`, `queries.js`, `embedder.js`, `watcher.js`, and `cycles.js`. 
+- ✅ Graphology-based Louvain algorithm for community assignment +- ✅ Modularity score computation +- ✅ Drift analysis: identify split/merge candidates between communities +- ✅ CLI: `codegraph communities` with `--level file|function` -``` -src/db/ - connection.js # Open, WAL mode, pragma tuning - migrations.js # Schema versions - repository.js # ALL data access methods (reads + writes) -``` +**New file:** `src/communities.js` (310 lines) -All prepared statements, index tuning, and schema knowledge live in one place. Consumers never see SQL. Enables an `InMemoryRepository` for fast unit tests. +### 2.5.3 -- Structure & Role Classification ✅ -**Affected files:** `src/db.js` → split into `src/db/`, SQL extracted from `builder.js`, `queries.js`, `embedder.js`, `watcher.js`, `cycles.js` +Directory structure graph with node role classification. -### 3.3 — Analysis / Formatting Separation +- ✅ Directory nodes and edges with cohesion, density, fan-in/fan-out metrics +- ✅ Node role classification: entry, core, utility, adapter, leaf, dead +- ✅ Framework entry point detection (route:, event:, command: prefixes) +- ✅ Hotspot detection: high fan-in x high complexity +- ✅ Module boundary analysis: high-cohesion directories with cross-boundary imports +- ✅ CLI: `codegraph structure`, `codegraph hotspots`, `codegraph roles` -Split `queries.js` (800+ lines) into pure analysis modules and presentation formatters. +**New file:** `src/structure.js` (668 lines) -``` -src/analysis/ # Pure data: take repository, return typed results - impact.js - call-chain.js - diff-impact.js - module-map.js - class-hierarchy.js +### 2.5.4 -- Execution Flow Tracing ✅ -src/formatters/ # Presentation: take data, produce strings - cli-formatter.js - json-formatter.js - table-formatter.js -``` +Forward BFS from framework entry points through callees to leaves. -Analysis modules return pure data. The CLI, MCP server, and programmatic API each pick their own formatter (or none). 
Eliminates the `*Data()` / `*()` dual-function pattern. +- ✅ Entry point enumeration with type classification +- ✅ Forward BFS trace with cycle detection +- ✅ CLI: `codegraph flow [name]` with `--list` and `--depth` options -**Affected files:** `src/queries.js` → split into `src/analysis/` + `src/formatters/` +**New file:** `src/flow.js` (362 lines) -### 3.4 — Builder Pipeline Architecture +### 2.5.5 -- Temporal Coupling (Co-change Analysis) ✅ -Refactor `buildGraph()` from a monolithic mega-function into explicit, independently testable pipeline stages. +Git history analysis for temporal file coupling. -```js -const pipeline = [ - collectFiles, // (rootDir, config) => filePaths[] - detectChanges, // (filePaths, db) => { changed, removed, isFullBuild } - parseFiles, // (filePaths, engineOpts) => Map - insertNodes, // (symbolMap, db) => nodeIndex - resolveImports, // (symbolMap, rootDir, aliases) => importEdges[] - buildCallEdges, // (symbolMap, nodeIndex) => callEdges[] - buildClassEdges, // (symbolMap, nodeIndex) => classEdges[] - resolveBarrels, // (edges, symbolMap) => resolvedEdges[] - insertEdges, // (allEdges, db) => stats -] -``` +- ✅ Jaccard similarity computation from commit history +- ✅ `co_changes`, `co_change_meta`, `file_commit_counts` DB tables +- ✅ Per-file and global co-change queries +- ✅ CLI: `codegraph co-change [file]` -Watch mode reuses the same stages (triggered per-file instead of per-project), eliminating the divergence between `watcher.js` and `builder.js` where bug fixes must be applied separately. +**New file:** `src/cochange.js` (502 lines) -**Affected files:** `src/builder.js`, `src/watcher.js` +### 2.5.6 -- Manifesto Rule Engine ✅ -### 3.5 — Unified Engine Interface +Configurable rule engine with warn/fail thresholds for function, file, and graph rules. -Replace scattered `engine.name === 'native'` branching with a Strategy pattern. Every consumer receives an engine object with the same API regardless of backend. 
+- ✅ Function rules: cognitive, cyclomatic, nesting depth +- ✅ File rules: imports, exports, LOC, fan-in, fan-out +- ✅ Graph rules: cycles, boundary violations +- ✅ Configurable via `.codegraphrc.json` `manifesto` section +- ✅ CLI: `codegraph manifesto` with table format -```js -const engine = createEngine(opts) // returns same interface for native or WASM -engine.parseFile(path, source) -engine.resolveImports(batch, rootDir, aliases) -engine.detectCycles(db) -``` +**New file:** `src/manifesto.js` (511 lines) -Consumers never branch on native vs WASM. Adding a third backend (e.g., remote parsing service) requires zero consumer changes. +### 2.5.7 -- Architecture Boundary Rules ✅ -**Affected files:** `src/parser.js`, `src/resolve.js`, `src/cycles.js`, `src/builder.js`, `src/native.js` +Architecture enforcement using glob patterns and presets. -### 3.6 — Qualified Names & Hierarchical Scoping +- ✅ Presets: hexagonal, layered, clean, onion +- ✅ Custom boundary definitions with allow/deny rules +- ✅ Violation detection from DB edges +- ✅ Integration with manifesto and check commands -Enrich the node model with scope information to reduce ambiguity. +**New file:** `src/boundaries.js` (347 lines) -```sql -ALTER TABLE nodes ADD COLUMN qualified_name TEXT; -- 'DateHelper.format' -ALTER TABLE nodes ADD COLUMN scope TEXT; -- 'DateHelper' -ALTER TABLE nodes ADD COLUMN visibility TEXT; -- 'public' | 'private' | 'protected' -``` +### 2.5.8 -- CI Validation Predicates (`check`) ✅ -Enables queries like "all methods of class X" without traversing edges. Reduces reliance on heuristic confidence scoring for name collisions. +Structured pass/fail checks for CI pipelines. 
-**Affected files:** `src/db.js`, `src/parser.js` (extractors), `src/queries.js`, `src/builder.js` +- ✅ `checkNoNewCycles` -- cycle predicate +- ✅ `checkMaxBlastRadius` -- blast radius predicate +- ✅ `checkNoSignatureChanges` -- signature stability predicate +- ✅ `checkNoBoundaryViolations` -- architecture predicate +- ✅ Composable result objects with pass/fail semantics +- ✅ MCP tool: `check` +- ✅ CLI: `codegraph check [ref]` with exit code 0/1 + +**New file:** `src/check.js` (433 lines) + +### 2.5.9 -- Composite Analysis Commands ✅ + +High-level commands that compose multiple analysis steps. + +- ✅ **Audit:** explain + impact + health + manifesto breaches in one call +- ✅ **Batch:** run same query against multiple targets for multi-agent dispatch +- ✅ **Triage:** risk-ranked audit queue using normalized fan-in, complexity, churn, MI signals + +**New files:** `src/audit.js` (424 lines), `src/batch.js` (91 lines), `src/triage.js` (274 lines) + +### 2.5.10 -- Hybrid Search ✅ + +BM25 keyword search + semantic vector search with RRF fusion. + +- ✅ FTS5 full-text index on node names and source previews +- ✅ BM25 keyword search via `ftsSearchData()` +- ✅ Hybrid search with configurable RRF fusion via `hybridSearchData()` +- ✅ Three search modes: `hybrid` (default), `semantic`, `keyword` +- ✅ 8 embedding model options (minilm, jina-small/base/code, nomic/v1.5, bge-large) + +**Affected file:** `src/embedder.js` (grew from 525 -> 1,113 lines) -### 3.7 — Composable MCP Tool Registry +### 2.5.11 -- Supporting Infrastructure ✅ -Replace the monolithic `TOOLS` array + `switch` dispatch in `mcp.js` with self-contained tool modules. +Cross-cutting utilities added during the expansion. 
+ +- ✅ **Pagination:** offset/limit with MCP defaults per command (`src/paginate.js`, 106 lines) +- ✅ **Snapshot:** SQLite DB backup/restore via VACUUM INTO (`src/snapshot.js`, 150 lines) +- ✅ **CODEOWNERS:** ownership integration for boundary analysis (`src/owners.js`, 360 lines) +- ✅ **Branch Compare:** structural diff between git refs (`src/branch-compare.js`, 569 lines) +- ✅ **Change Journal:** NDJSON event log for watch mode (`src/change-journal.js`, 131 lines) +- ✅ **Journal:** change journal validation/management (`src/journal.js`, 110 lines) +- ✅ **Update Check:** npm registry polling with 24h cache (`src/update-check.js`, 161 lines) + +### 2.5.12 -- MCP Tool Expansion ✅ + +MCP grew from 12 -> 25 tools, covering all new analysis capabilities. + +| New tool | Wraps | +|----------|-------| +| ✅ `structure` | `structureData` | +| ✅ `node_roles` | `rolesData` | +| ✅ `hotspots` | `hotspotsData` | +| ✅ `co_changes` | `coChangeData` | +| ✅ `execution_flow` | `flowData` | +| ✅ `list_entry_points` | `listEntryPointsData` | +| ✅ `complexity` | `complexityData` | +| ✅ `manifesto` | `manifestoData` | +| ✅ `communities` | `communitiesData` | +| ✅ `code_owners` | `ownersData` | +| ✅ `audit` | `auditData` | +| ✅ `batch_query` | `batchData` | +| ✅ `triage` | `triageData` | +| ✅ `branch_compare` | `branchCompareData` | +| ✅ `check` | `checkData` | + +**Affected file:** `src/mcp.js` (grew from 354 -> 1,212 lines) + +--- + +## Phase 3 -- Architectural Refactoring + +**Goal:** Restructure the codebase for modularity, testability, and long-term maintainability. These are internal improvements -- no new user-facing features, but they make every subsequent phase easier to build and maintain. + +> Reference: [generated/architecture.md](../../generated/architecture.md) -- full analysis with code examples and rationale. + +**Context:** Phase 2.5 added 18 modules and doubled the codebase without introducing shared abstractions. 
The original Phase 3 recommendations (designed for a 5K-line codebase) are now even more urgent at 17,830 lines. The priority ordering has been revised based on the actual growth patterns. + +### 3.1 -- Command/Query Separation ★ Critical + +Eliminate the `*Data()` / `*()` dual-function pattern replicated across 15 modules. Every analysis module (queries, audit, batch, check, cochange, communities, complexity, flow, manifesto, owners, structure, triage, branch-compare) currently implements both data extraction AND CLI formatting. + +Introduce a shared `CommandRunner` that handles the open-DB -> validate -> execute -> format -> paginate -> output lifecycle. Each command only implements unique query + analysis logic. Formatting is always separate and pluggable (CLI text, JSON, NDJSON, Mermaid). ``` -src/mcp/ - server.js # MCP server setup, transport, lifecycle - tool-registry.js # Dynamic tool registration + auto-discovery - tools/ - query-function.js # { schema, handler } per tool - file-deps.js - impact-analysis.js +src/ + commands/ # One file per command + query.js # { execute(args, ctx) -> data, format(data, opts) -> string } + impact.js + audit.js + check.js ... + + infrastructure/ + command-runner.js # Shared lifecycle + result-formatter.js # Shared formatting: table, JSON, NDJSON, Mermaid + test-filter.js # Shared --no-tests / isTestFile logic ``` -Adding a new MCP tool = adding a file. No other files change. +**Affected files:** All 15 modules with dual-function pattern, `src/cli.js`, `src/mcp.js` + +### 3.2 -- Repository Pattern for Data Access ★ Critical + +Consolidate all SQL into a single `Repository` class. Currently SQL is scattered across 20+ modules that each independently open the DB and write raw SQL inline. 
+ +``` +src/ + db/ + connection.js # Open, WAL mode, pragma tuning + migrations.js # Schema versions (currently 9 migrations) + repository.js # ALL data access methods across all 9+ tables + query-builder.js # Lightweight SQL builder for common filtered queries +``` -**Affected files:** `src/mcp.js` → split into `src/mcp/` +Add a query builder for the common pattern "find nodes WHERE kind IN (...) AND file NOT LIKE '%test%' ORDER BY ... LIMIT ? OFFSET ?". Not an ORM -- a thin SQL builder that eliminates string construction across 20 modules. -### 3.8 — CLI Command Objects +**Affected files:** `src/db.js` -> split into `src/db/`, SQL extracted from all modules -Move from inline Commander chains in `cli.js` to self-contained command modules. +### 3.3 -- Decompose queries.js (3,110 Lines) + +Split into pure analysis modules that return data and share no formatting concerns. ``` -src/cli/ - index.js # Commander setup, auto-discover commands - commands/ - build.js # { name, description, options, validate, execute } - query.js - impact.js - ... +src/ + analysis/ + symbol-lookup.js # queryNameData, whereData, listFunctionsData + impact.js # impactAnalysisData, fnImpactData, diffImpactData + dependencies.js # fileDepsData, fnDepsData, pathData + module-map.js # moduleMapData, statsData + context.js # contextData, explainData + roles.js # rolesData + + shared/ + constants.js # SYMBOL_KINDS, ALL_SYMBOL_KINDS, VALID_ROLES + filters.js # isTestFile, normalizeSymbol, kindIcon + generators.js # iterListFunctions, iterRoles, iterWhere ``` -Each command is independently testable by calling `execute()` directly. The CLI index auto-discovers and registers them. 
+**Affected files:** `src/queries.js` -> split into `src/analysis/` + `src/shared/` -**Affected files:** `src/cli.js` → split into `src/cli/` +### 3.4 -- Composable MCP Tool Registry -### 3.9 — Domain Error Hierarchy +Replace the monolithic 1,212-line `mcp.js` (25 tools in one switch dispatch) with self-contained tool modules. -Replace ad-hoc error handling (mix of thrown `Error`, returned `null`, `logger.warn()`, `process.exit(1)`) with structured domain errors. +``` +src/ + mcp/ + server.js # MCP server setup, transport, lifecycle + tool-registry.js # Auto-discovery + dynamic registration + middleware.js # Pagination, error handling, repo resolution + tools/ + query-function.js # { schema, handler } -- one per tool (25 files) + ... +``` + +Adding a new MCP tool = adding a file. No other files change. + +**Affected files:** `src/mcp.js` -> split into `src/mcp/` + +### 3.5 -- CLI Command Objects + +Move from 1,285 lines of inline Commander chains to self-contained command modules. -```js -class CodegraphError extends Error { constructor(message, { code, file, cause }) { ... } } -class ParseError extends CodegraphError { code = 'PARSE_FAILED' } -class DbError extends CodegraphError { code = 'DB_ERROR' } -class ConfigError extends CodegraphError { code = 'CONFIG_INVALID' } -class ResolutionError extends CodegraphError { code = 'RESOLUTION_FAILED' } -class EngineError extends CodegraphError { code = 'ENGINE_UNAVAILABLE' } +``` +src/ + cli/ + index.js # Commander setup, auto-discover commands + shared/ + output.js # --json, --ndjson, table, plain text + options.js # Shared options (--no-tests, --json, --db, etc.) + commands/ # 45 files, one per command + build.js # { name, description, options, validate, execute } + ... ``` -CLI catches domain errors and formats for humans. MCP returns structured error responses. No more `process.exit()` from library code. +Each command is independently testable by calling `execute()` directly. 
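As a hedged sketch of that testability -- the `{ name, description, options, execute }` shape comes from this section, but the specific command, repository methods, and context wiring below are invented for illustration:

```javascript
// Hypothetical command module: execute() returns plain data, so a unit
// test needs no Commander setup, no process.argv, and no real database.
const statsCommand = {
  name: 'stats',
  description: 'Summarize node counts by kind',
  options: [{ flag: '--json', description: 'Emit JSON instead of a table' }],
  execute(args, ctx) {
    // ctx.repository is injected, so tests can pass an in-memory fake.
    const nodes = ctx.repository.allNodes();
    const byKind = {};
    for (const node of nodes) byKind[node.kind] = (byKind[node.kind] ?? 0) + 1;
    return { total: nodes.length, byKind };
  },
};

// Direct invocation with a fake repository -- no CLI wiring involved.
const fakeRepo = {
  allNodes: () => [{ kind: 'function' }, { kind: 'function' }, { kind: 'class' }],
};
const result = statsCommand.execute({}, { repository: fakeRepo });
```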
-**New file:** `src/errors.js` +**Affected files:** `src/cli.js` -> split into `src/cli/` -### 3.10 — Curated Public API Surface +### 3.6 -- Curated Public API Surface -Reduce `index.js` from ~40 re-exports to a curated public API. Use `package.json` `exports` field to enforce module boundaries. +Reduce `index.js` from 120+ exports to ~30 curated exports. Use `package.json` `exports` field to enforce module boundaries. ```json { "exports": { ".": "./src/index.js", "./cli": "./src/cli.js" } } ``` -Internal modules become truly internal. Consumers can only import from documented entry points. +Export only `*Data()` functions (the command execute functions). Never export CLI formatters. Group by domain. **Affected files:** `src/index.js`, `package.json` -### 3.11 — Embedder Subsystem Extraction +### 3.7 -- Domain Error Hierarchy -Restructure `embedder.js` (525 lines) into a standalone subsystem with pluggable vector storage. +Replace ad-hoc error handling (mix of thrown `Error`, returned `null`, `logger.warn()`, `process.exit(1)`) across 35 modules with structured domain errors. -``` -src/embeddings/ - index.js # Public API - model-registry.js # Model definitions, batch sizes, loading - generator.js # Source → text preparation → batch embedding - store.js # Vector storage (pluggable: SQLite blob, HNSW index) - search.js # Similarity search, RRF multi-query fusion +```js +class CodegraphError extends Error { constructor(message, { code, file, cause }) { ... 
} } +class ParseError extends CodegraphError { code = 'PARSE_FAILED' } +class DbError extends CodegraphError { code = 'DB_ERROR' } +class ConfigError extends CodegraphError { code = 'CONFIG_INVALID' } +class ResolutionError extends CodegraphError { code = 'RESOLUTION_FAILED' } +class EngineError extends CodegraphError { code = 'ENGINE_UNAVAILABLE' } +class AnalysisError extends CodegraphError { code = 'ANALYSIS_FAILED' } +class BoundaryError extends CodegraphError { code = 'BOUNDARY_VIOLATION' } ``` -Decouples embedding schema from the graph DB. The pluggable store interface enables future O(log n) ANN search (e.g., `hnswlib-node`) when symbol counts reach 50K+. +The CLI catches domain errors and formats for humans. MCP returns structured error responses. No more `process.exit()` from library code. -**Affected files:** `src/embedder.js` → split into `src/embeddings/` +**New file:** `src/errors.js` -### 3.12 — Testing Pyramid +### 3.8 -- Decompose complexity.js (2,163 Lines) -Add proper unit test layer below the existing integration tests. +Split the largest source file into a rules/engine architecture mirroring the parser plugin concept. -- Pure unit tests for extractors (pass AST node, assert symbols — no file I/O) -- Pure unit tests for BFS/Tarjan algorithms (pass adjacency list, assert result) -- Pure unit tests for confidence scoring (pass parameters, assert score) -- Repository mock for query tests (in-memory data, no SQLite) -- E2E tests that invoke the CLI binary and assert exit codes + stdout +``` +src/ + complexity/ + index.js # Public API: computeComplexity, complexityData + metrics.js # Halstead, MI, LOC/SLOC (language-agnostic) + engine.js # Walk AST + apply rules -> raw values + rules/ + javascript.js # JS/TS/TSX rules + python.js + go.js + rust.js + java.js + csharp.js + php.js + ruby.js +``` -The repository pattern (3.2) directly enables this: unit tests use `InMemoryRepository`, integration tests use `SqliteRepository`. 
+**Affected files:** `src/complexity.js` -> split into `src/complexity/` -### 3.13 — Event-Driven Pipeline +### 3.9 -- Builder Pipeline Architecture -Add an event/streaming architecture to the build pipeline for progress reporting, cancellation, and large-repo support. +Refactor `buildGraph()` (1,173 lines) from a mega-function into explicit, independently testable pipeline stages. ```js -pipeline.on('file:parsed', (file, symbols) => { /* progress */ }) -pipeline.on('file:indexed', (file, nodeCount) => { /* progress */ }) -pipeline.on('build:complete', (stats) => { /* summary */ }) -pipeline.on('error', (file, err) => { /* continue or abort */ }) -await pipeline.run(rootDir) +const pipeline = [ + collectFiles, // (rootDir, config) => filePaths[] + detectChanges, // (filePaths, db) => { changed, removed, isFullBuild } + parseFiles, // (filePaths, engineOpts) => Map + insertNodes, // (symbolMap, db) => nodeIndex + resolveImports, // (symbolMap, rootDir, aliases) => importEdges[] + buildCallEdges, // (symbolMap, nodeIndex) => callEdges[] + buildClassEdges, // (symbolMap, nodeIndex) => classEdges[] + resolveBarrels, // (edges, symbolMap) => resolvedEdges[] + insertEdges, // (allEdges, db) => stats + buildStructure, // (db, fileSymbols, rootDir) => structureStats + classifyRoles, // (db) => roleStats + computeComplexity, // (db, rootDir, engine) => complexityStats + emitChangeJournal, // (rootDir, changes) => void +] ``` -Unifies build and watch code paths. Large builds stream results to the DB incrementally instead of buffering in memory. +Watch mode reuses the same stages triggered per-file, eliminating the `watcher.js` divergence. -**Affected files:** `src/builder.js`, `src/watcher.js`, `src/cli.js` +**Affected files:** `src/builder.js`, `src/watcher.js` -### 3.14 — Subgraph Export Filtering +### 3.10 -- Embedder Subsystem Extraction -Add focus/filter options to the export module so visualizations are usable for real projects. 
+Restructure `embedder.js` (1,113 lines) -- which now contains 3 search engines -- into a standalone subsystem. -```bash -codegraph export --format dot --focus src/builder.js --depth 2 -codegraph export --format mermaid --filter "src/api/**" --kind function -codegraph export --format json --changed +``` +src/ + embeddings/ + index.js # Public API + models.js # 8 model definitions, batch sizes, loading + generator.js # Source -> text preparation -> batch embedding + stores/ + sqlite-blob.js # Current O(n) cosine similarity + fts5.js # BM25 keyword search + search/ + semantic.js # Vector similarity + keyword.js # FTS5 BM25 + hybrid.js # RRF fusion + strategies/ + structured.js # Structured text preparation + source.js # Raw source preparation ``` -The export module receives a subgraph specification (focus node + depth, file pattern, kind filter) and extracts the relevant subgraph before formatting. +The pluggable store interface enables future O(log n) ANN search (e.g., `hnswlib-node`) when symbol counts reach 50K+. -**Affected files:** `src/export.js`, `src/cli.js` +**Affected files:** `src/embedder.js` -> split into `src/embeddings/` -### 3.15 — Transitive Import-Aware Confidence +### 3.11 -- Unified Graph Model -Before falling back to proximity heuristics, walk the import graph from the caller file. If any import path (even indirect through barrel files) reaches a candidate, score it 0.9. Only fall back to proximity when no import path exists. +Unify the three parallel graph representations (structure.js, cochange.js, communities.js) into a shared in-memory graph model. 
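A minimal sketch of the payoff, assuming a simplified model (class and function names are illustrative, not the proposed API): any algorithm written against the shared model runs unchanged on a graph produced by any builder.

```js
// Sketch: one in-memory graph model, many builders, shared algorithms.
class Graph {
  constructor() { this.adj = new Map(); }
  addEdge(from, to) {
    if (!this.adj.has(from)) this.adj.set(from, []);
    if (!this.adj.has(to)) this.adj.set(to, []);
    this.adj.get(from).push(to);
  }
  neighbors(node) { return this.adj.get(node) ?? []; }
}

// Shared algorithm: BFS reachability, oblivious to how the graph was built.
function reachable(graph, start) {
  const seen = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    for (const next of graph.neighbors(queue.shift())) {
      if (!seen.has(next)) { seen.add(next); queue.push(next); }
    }
  }
  return seen;
}

// A "dependency" builder (import edges) and a "temporal" builder (co-change
// edges) emit the same model, so reachable() works on both without changes.
const dependency = new Graph();
dependency.addEdge('cli.js', 'builder.js');
dependency.addEdge('builder.js', 'db.js');

const temporal = new Graph();
temporal.addEdge('builder.js', 'watcher.js');

console.log(reachable(dependency, 'cli.js').size); // 3
console.log(reachable(temporal, 'builder.js').size); // 2
```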
-**Affected files:** `src/resolve.js`, `src/builder.js` +``` +src/ + graph/ + model.js # Shared in-memory graph (nodes + edges + metadata) + builders/ + dependency.js # Build from SQLite edges + structure.js # Build from file/directory hierarchy + temporal.js # Build from git history (co-changes) + algorithms/ + bfs.js # Breadth-first traversal + shortest-path.js # Path finding + tarjan.js # Cycle detection + louvain.js # Community detection + centrality.js # Fan-in/fan-out, betweenness + clustering.js # Cohesion, coupling, density + classifiers/ + roles.js # Node role classification + risk.js # Risk scoring +``` -### 3.16 — Query Result Caching +Algorithms become composable -- run community detection on the dependency graph, the temporal graph, or a merged graph. -Add a TTL/LRU cache between the analysis layer and the repository. Particularly valuable for MCP where an agent session may repeatedly query related symbols. +**Affected files:** `src/structure.js`, `src/cochange.js`, `src/communities.js`, `src/cycles.js`, `src/triage.js` -```js -class QueryCache { - constructor(db, maxAge = 60_000) { ... } - get(key) { ... } // key = query name + args hash - set(key, value) { ... } - invalidate() { ... } // called after any DB mutation -} +### 3.12 -- Qualified Names & Hierarchical Scoping + +Enrich the node model with scope information to reduce ambiguity. + +```sql +ALTER TABLE nodes ADD COLUMN qualified_name TEXT; -- 'DateHelper.format' +ALTER TABLE nodes ADD COLUMN scope TEXT; -- 'DateHelper' +ALTER TABLE nodes ADD COLUMN visibility TEXT; -- 'public' | 'private' | 'protected' ``` -### 3.17 — Configuration Profiles +Enables queries like "all methods of class X" without traversing edges. Reduces reliance on heuristic confidence scoring. -Support profile-based configuration for monorepos with multiple services. 
+**Affected files:** `src/db.js`, `src/parser.js` (extractors), `src/queries.js`, `src/builder.js` -```json -{ - "profiles": { - "backend": { "include": ["services/api/**"], "build": { "dbPath": ".codegraph/api.db" } }, - "frontend": { "include": ["apps/web/**"], "build": { "dbPath": ".codegraph/web.db" } } - } -} -``` +### 3.13 -- Testing Pyramid with InMemoryRepository -```bash -codegraph build --profile backend -``` +The repository pattern (3.2) enables true unit testing: + +- Pure unit tests for graph algorithms (pass adjacency list, assert result) +- Pure unit tests for risk/confidence scoring (pass parameters, assert score) +- `InMemoryRepository` for query tests (no SQLite, instant setup) +- Existing 59 test files continue as integration tests + +**Current gap:** Many "unit" tests still hit SQLite because there's no repository abstraction. + +### 3.14 -- Remaining Items (Lower Priority) + +These items from the original Phase 3 are still valid but less urgent: -**Affected files:** `src/config.js`, `src/cli.js` +- **Event-driven pipeline:** Add event/streaming architecture for progress reporting, cancellation, and large-repo support. +- **Unified engine interface (Strategy):** Replace scattered `engine.name === 'native'` branching. Less critical now that native is the primary path. +- **Subgraph export filtering:** `codegraph export --focus src/builder.js --depth 2` for usable visualizations. +- **Transitive import-aware confidence:** Walk import graph before falling back to proximity heuristics. +- **Query result caching:** LRU/TTL cache between analysis layer and repository. More valuable now with 25 MCP tools. +- **Configuration profiles:** `--profile backend` for monorepos with multiple services. +- **Pagination standardization:** SQL-level LIMIT/OFFSET in repository + command runner shaping. 
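To make the testing pyramid in 3.13 concrete, here is a sketch of the `InMemoryRepository` idea under an assumed repository interface (method names are illustrative; the real surface would come from 3.2). Query-layer code written against the interface becomes unit-testable with zero I/O:

```js
// Sketch: same repository interface, two backends. Unit tests use the
// in-memory variant; integration tests keep using SQLite.
class InMemoryRepository {
  constructor() { this.nodes = []; this.edges = []; }
  insertNode(node) { this.nodes.push(node); return node.id; }
  insertEdge(edge) { this.edges.push(edge); }
  callersOf(nodeId) {
    return this.edges
      .filter((e) => e.to === nodeId && e.kind === 'calls')
      .map((e) => e.from);
  }
}

// A query-layer function written against the interface, not against SQLite.
function fanIn(repo, nodeId) {
  return repo.callersOf(nodeId).length;
}

// Instant, I/O-free unit test setup:
const repo = new InMemoryRepository();
repo.insertNode({ id: 1, name: 'save' });
repo.insertNode({ id: 2, name: 'handler' });
repo.insertEdge({ from: 2, to: 1, kind: 'calls' });
console.log(fanIn(repo, 1)); // 1
```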
---

-## Phase 4 — TypeScript Migration
+## Phase 4 -- TypeScript Migration

**Goal:** Migrate the codebase from plain JavaScript to TypeScript, leveraging the clean module boundaries established in Phase 3. Incremental module-by-module migration starting from leaf modules inward.

-**Why after Phase 3:** The architectural refactoring creates small, well-bounded modules with explicit interfaces (Repository, Engine, BaseExtractor, Pipeline stages, Command objects). These are natural type boundaries — typing monolithic 2,000-line files that are about to be split would be double work.
+**Why after Phase 3:** The architectural refactoring creates small, well-bounded modules with explicit interfaces (Repository, Engine, BaseExtractor, Pipeline stages, Command objects). These are natural type boundaries -- typing monolithic 2,000-line files that are about to be split would be double work.

-### 4.1 — Project Setup
+### 4.1 -- Project Setup

- Add `typescript` as a devDependency
- Create `tsconfig.json` with strict mode, ES module output, path aliases matching the Phase 3 module structure

@@ -494,7 +654,7 @@ codegraph build --profile backend

**Affected files:** `package.json`, `biome.json`, new `tsconfig.json`

-### 4.2 — Core Type Definitions
+### 4.2 -- Core Type Definitions

Define TypeScript interfaces for all abstractions introduced in Phase 3:

@@ -512,28 +672,28 @@ interface Extractor { language: string; handlers: Record<string, unknown>; }
interface Command { name: string; options: OptionDef[]; validate(args: unknown, opts: unknown): void; execute(args: unknown, opts: unknown): Promise<unknown>; }
```

-These interfaces serve as the migration contract — each module is migrated to satisfy its interface.
+These interfaces serve as the migration contract -- each module is migrated to satisfy its interface.
**New file:** `src/types.ts` -### 4.3 — Leaf Module Migration +### 4.3 -- Leaf Module Migration Migrate modules with no internal dependencies first: | Module | Notes | |--------|-------| -| `src/errors.ts` | Domain error hierarchy (Phase 3.9) | +| `src/errors.ts` | Domain error hierarchy (Phase 3.7) | | `src/logger.ts` | Minimal, no internal deps | | `src/constants.ts` | Pure data | | `src/config.ts` | Config types derived from `.codegraphrc.json` schema | | `src/db/connection.ts` | SQLite connection wrapper | | `src/db/migrations.ts` | Schema version management | -| `src/formatters/*.ts` | Pure input→string transforms | +| `src/formatters/*.ts` | Pure input->string transforms | | `src/paginate.ts` | Generic pagination helpers | Allow `.js` and `.ts` to coexist during migration (`allowJs: true` in tsconfig). -### 4.4 — Core Module Migration +### 4.4 -- Core Module Migration Migrate modules that implement Phase 3 interfaces: @@ -548,7 +708,7 @@ Migrate modules that implement Phase 3 interfaces: | `src/analysis/*.ts` | Typed analysis results (impact scores, call chains) | | `src/resolve.ts` | Import resolution with confidence types | -### 4.5 — Orchestration & Public API Migration +### 4.5 -- Orchestration & Public API Migration Migrate top-level orchestration and entry points: @@ -561,7 +721,7 @@ Migrate top-level orchestration and entry points: | `src/cli/*.ts` | Command objects with typed options | | `src/index.ts` | Curated public API with proper export types | -### 4.6 — Test Migration +### 4.6 -- Test Migration - Migrate test files from `.js` to `.ts` - Add type-safe test utilities and fixture builders @@ -570,15 +730,17 @@ Migrate top-level orchestration and entry points: **Verification:** All existing tests pass. `tsc --noEmit` succeeds with zero errors. No `any` escape hatches except at FFI boundaries (napi-rs addon, tree-sitter WASM). 
-**Affected files:** All `src/**/*.js` → `src/**/*.ts`, all `tests/**/*.js` → `tests/**/*.ts`, `package.json`, `biome.json` +**Affected files:** All `src/**/*.js` -> `src/**/*.ts`, all `tests/**/*.js` -> `tests/**/*.ts`, `package.json`, `biome.json` --- -## Phase 5 — Intelligent Embeddings +## Phase 5 -- Intelligent Embeddings **Goal:** Dramatically improve semantic search quality by embedding natural-language descriptions instead of raw code. -### 5.1 — LLM Description Generator +> **Phase 5.3 (Hybrid Search) was completed early** during Phase 2.5 -- FTS5 BM25 + semantic search with RRF fusion is already shipped in v2.6.0. + +### 5.1 -- LLM Description Generator For each function/method/class node, generate a concise natural-language description: @@ -606,7 +768,7 @@ For each function/method/class node, generate a concise natural-language descrip **New file:** `src/describer.js` -### 5.2 — Enhanced Embedding Pipeline +### 5.2 -- Enhanced Embedding Pipeline - When descriptions exist, embed the description text instead of raw code - Keep raw code as fallback when no description is available @@ -617,41 +779,32 @@ For each function/method/class node, generate a concise natural-language descrip **Affected files:** `src/embedder.js` -### 5.3 — Hybrid Search - -Combine vector similarity with keyword matching. - -- **Vector search:** Cosine similarity against embeddings (existing) -- **Keyword search:** SQLite FTS5 full-text index on `nodes.name` + `descriptions` -- **Fusion:** Weighted RRF — `score = a * vector_rank + (1-a) * keyword_rank` -- Default `a = 0.7` (favor semantic), configurable - -**New DB migration:** Add FTS5 virtual table for text search. +### ~~5.3 -- Hybrid Search~~ ✅ Completed in Phase 2.5 -**Affected files:** `src/embedder.js`, `src/db.js` +Shipped in v2.6.0. FTS5 BM25 keyword search + semantic vector search with RRF fusion. Three search modes: `hybrid` (default), `semantic`, `keyword`. 
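For reference, the RRF fusion at the core of hybrid mode can be expressed in a few lines. This sketch uses the conventional `k = 60` damping constant from the RRF literature; codegraph's actual constants and tie-breaking may differ:

```js
// Sketch: Reciprocal Rank Fusion over two ranked result lists.
// Each list is an array of result ids, best match first.
function rrfFuse(semanticIds, keywordIds, k = 60) {
  const scores = new Map();
  for (const list of [semanticIds, keywordIds]) {
    list.forEach((id, rank) => {
      // Rank is 0-based here, so the best item contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

// 'parseFile' ranks second in both lists, but appearing in both
// wins it the fused ranking over either single-list leader.
const fused = rrfFuse(['embedQuery', 'parseFile', 'openDb'], ['loadConfig', 'parseFile']);
console.log(fused[0]); // 'parseFile'
```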
-### 5.4 — Build-time Semantic Metadata
+### 5.4 -- Build-time Semantic Metadata

Enrich nodes with LLM-generated metadata beyond descriptions. Computed incrementally at build time (only for changed nodes), stored as columns on the `nodes` table.

| Column | Content | Example |
|--------|---------|---------|
| `side_effects` | Mutation/IO tags | `"writes DB"`, `"sends email"`, `"mutates state"` |
-| `complexity_notes` | Responsibility count, cohesion rating | `"3 responsibilities, low cohesion — consider splitting"` |
+| `complexity_notes` | Responsibility count, cohesion rating | `"3 responsibilities, low cohesion -- consider splitting"` |
| `risk_score` | Fragility metric from graph centrality + LLM assessment | `0.82` (high fan-in + complex logic) |

-- MCP tool: `assess <symbol>` — returns complexity rating + specific concerns
+- MCP tool: `assess <symbol>` -- returns complexity rating + specific concerns
- Cascade invalidation: when a node changes, mark dependents for re-enrichment

**Depends on:** 5.1 (LLM provider abstraction)

-### 5.5 — Module Summaries
+### 5.5 -- Module Summaries

Aggregate function descriptions + dependency direction into file-level narratives.
-- `module_summaries` table — one entry per file, re-rolled when any contained node changes -- MCP tool: `explain_module ` — returns module purpose, key exports, role in the system -- `naming_conventions` metadata per module — detected patterns (camelCase, snake_case, verb-first), flag outliers +- `module_summaries` table -- one entry per file, re-rolled when any contained node changes +- MCP tool: `explain_module ` -- returns module purpose, key exports, role in the system +- `naming_conventions` metadata per module -- detected patterns (camelCase, snake_case, verb-first), flag outliers **Depends on:** 5.1 (function-level descriptions must exist first) @@ -659,11 +812,11 @@ Aggregate function descriptions + dependency direction into file-level narrative --- -## Phase 6 — Natural Language Queries +## Phase 6 -- Natural Language Queries **Goal:** Allow developers to ask questions about their codebase in plain English. -### 6.1 — Query Engine +### 6.1 -- Query Engine ```bash codegraph ask "How does the authentication flow work?" @@ -685,11 +838,11 @@ codegraph ask "How does the authentication flow work?" - 1-hop caller/callee names for each match - Total context budget: ~8K tokens (configurable) -**Requires:** LLM API key configured (no fallback — this is inherently an LLM feature). +**Requires:** LLM API key configured (no fallback -- this is inherently an LLM feature). **New file:** `src/nlquery.js` -### 6.2 — Conversational Sessions +### 6.2 -- Conversational Sessions Multi-turn conversations with session memory. @@ -703,21 +856,21 @@ codegraph sessions clear - Store conversation history in SQLite table `sessions` - Include prior Q&A pairs in subsequent prompts -### 6.3 — MCP Integration +### 6.3 -- MCP Integration -New MCP tool: `ask_codebase` — natural language query via MCP. +New MCP tool: `ask_codebase` -- natural language query via MCP. Enables AI coding agents (Claude Code, Cursor, etc.) to ask codegraph questions about the codebase. 
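The context assembly described in 6.1 amounts to greedy packing under a token budget. A sketch, assuming a rough chars/4 token estimate (not codegraph's actual tokenizer or context field layout):

```js
// Sketch: pack the top-K matches into a fixed token budget, highest score first.
function packContext(matches, budgetTokens = 8000) {
  const estimateTokens = (text) => Math.ceil(text.length / 4); // rough heuristic
  const picked = [];
  let used = 0;
  for (const m of [...matches].sort((a, b) => b.score - a.score)) {
    const cost = estimateTokens(m.source);
    if (used + cost > budgetTokens) continue; // skip oversized, keep trying smaller snippets
    picked.push(m);
    used += cost;
  }
  return picked;
}

const matches = [
  { name: 'login', score: 0.9, source: 'x'.repeat(16000) },       // ~4000 tokens
  { name: 'verifyToken', score: 0.8, source: 'y'.repeat(16000) }, // ~4000 tokens
  { name: 'logout', score: 0.7, source: 'z'.repeat(8000) },       // ~2000 tokens, over budget
];
console.log(packContext(matches).map((m) => m.name)); // ['login', 'verifyToken']
```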
**Affected files:** `src/mcp.js`

-### 6.4 — LLM-Narrated Graph Queries
+### 6.4 -- LLM-Narrated Graph Queries

Graph traversal + LLM narration for questions that require both structural data and natural-language explanation. Each query walks the graph first, then sends the structural result to the LLM for narration.

| Query | Graph operation | LLM adds |
|-------|----------------|----------|
-| `trace_flow <entry>` | BFS from entry point to leaves | Sequential narrative: "1. handler validates → 2. calls createOrder → 3. writes DB" |
+| `trace_flow <entry>` | BFS from entry point to leaves | Sequential narrative: "1. handler validates -> 2. calls createOrder -> 3. writes DB" |
| `trace_upstream <symbol>` | Recursive caller walk | Ranked suspects: "most likely cause is X because it modifies the same state" |
| `effect_analysis <symbol>` | Full callee tree walk, aggregate `side_effects` | "Calling X will: write to DB (via Y), send email (via Z)" |
| `dependency_path <from> <to>` | Shortest path(s) between two symbols | Narrates each hop: "A imports X from B because A needs to validate tokens" |

Pre-computed `flow_narratives` table caches results for key entry points at build time.

**Depends on:** 5.4 (`side_effects` metadata), 5.1 (descriptions for narration context)

-### 6.5 — Onboarding & Navigation Tools
+### 6.5 -- Onboarding & Navigation Tools

Help new contributors and AI agents orient in an unfamiliar codebase.
-- `entry_points` query — graph finds roots (high fan-out, low fan-in) + LLM ranks by importance -- `onboarding_guide` command — generates a reading order based on dependency layers -- MCP tool: `get_started` — returns ordered list: "start here, then read this, then this" -- `change_plan ` — LLM reads description, graph identifies relevant modules, returns touch points and test coverage gaps +- `entry_points` query -- graph finds roots (high fan-out, low fan-in) + LLM ranks by importance +- `onboarding_guide` command -- generates a reading order based on dependency layers +- MCP tool: `get_started` -- returns ordered list: "start here, then read this, then this" +- `change_plan ` -- LLM reads description, graph identifies relevant modules, returns touch points and test coverage gaps **Depends on:** 5.5 (module summaries for context), 6.1 (query engine) --- -## Phase 7 — Expanded Language Support +## Phase 7 -- Expanded Language Support -**Goal:** Go from 12 → 20 supported languages. +**Goal:** Go from 11 -> 19 supported languages. -### 7.1 — Batch 1: High Demand +### 7.1 -- Batch 1: High Demand | Language | Extensions | Grammar | Effort | |----------|-----------|---------|--------| @@ -752,7 +905,7 @@ Help new contributors and AI agents orient in an unfamiliar codebase. | Kotlin | `.kt`, `.kts` | `tree-sitter-kotlin` | Low | | Swift | `.swift` | `tree-sitter-swift` | Medium | -### 7.2 — Batch 2: Growing Ecosystems +### 7.2 -- Batch 2: Growing Ecosystems | Language | Extensions | Grammar | Effort | |----------|-----------|---------|--------| @@ -761,7 +914,7 @@ Help new contributors and AI agents orient in an unfamiliar codebase. | Lua | `.lua` | `tree-sitter-lua` | Low | | Zig | `.zig` | `tree-sitter-zig` | Low | -### 7.3 — Parser Abstraction Layer +### 7.3 -- Parser Abstraction Layer Extract shared patterns from existing extractors into reusable helpers. @@ -777,20 +930,23 @@ Extract shared patterns from existing extractors into reusable helpers. 
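The target shape for these helpers: a generic walker plus a declarative node-type to handler map per language, so adding a grammar means declaring handlers instead of rewriting the traversal loop. A sketch with an illustrative Lua-flavored AST (names are hypothetical, not the real extractor API):

```js
// Sketch: generic walk + declarative per-language handler map.
function extract(ast, handlers) {
  const ctx = { definitions: [], calls: [], imports: [] };
  const stack = [ast];
  while (stack.length > 0) {
    const node = stack.pop();
    handlers[node.type]?.(node, ctx); // no-op for node types the language ignores
    for (const child of node.children ?? []) stack.push(child);
  }
  return ctx;
}

// Adding a language becomes declaring a map, not reimplementing the loop.
const luaHandlers = {
  function_declaration: (node, ctx) => ctx.definitions.push(node.name),
  function_call: (node, ctx) => ctx.calls.push(node.name),
};

const ast = {
  type: 'chunk',
  children: [
    { type: 'function_declaration', name: 'greet', children: [
      { type: 'function_call', name: 'print', children: [] },
    ] },
  ],
};
console.log(extract(ast, luaHandlers)); // -> definitions ['greet'], calls ['print']
```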
--- -## Phase 8 — GitHub Integration & CI +## Phase 8 -- GitHub Integration & CI **Goal:** Bring codegraph's analysis into pull request workflows. -### 8.1 — Reusable GitHub Action +> **Note:** Phase 2.5 delivered `codegraph check` (CI validation predicates with exit code 0/1), which provides the foundation for GitHub Action integration. The boundary violation, blast radius, and cycle detection predicates are already available. + +### 8.1 -- Reusable GitHub Action A reusable GitHub Action that runs on PRs: 1. `codegraph build` on the repository 2. `codegraph diff-impact` against the PR's base branch -3. `codegraph cycles` to detect new circular dependencies +3. `codegraph check --staged` to run CI predicates (cycles, blast radius, signatures, boundaries) 4. Posts a PR comment summarizing: - Number of affected functions and files - New cycles introduced (if any) + - Boundary violations - Top impacted functions with caller counts **Configuration via `.codegraphrc.json`:** @@ -799,11 +955,11 @@ A reusable GitHub Action that runs on PRs: { "ci": { "failOnCycles": true, "impactThreshold": 50 } } ``` -**Fail conditions:** Configurable — fail if new cycles or impact exceeds threshold. +**Fail conditions:** Configurable -- fail if new cycles or impact exceeds threshold. **New file:** `.github/actions/codegraph-ci/action.yml` -### 8.2 — PR Review Integration +### 8.2 -- PR Review Integration ```bash codegraph review --pr @@ -820,36 +976,36 @@ Requires `gh` CLI. 
For each changed function: **LLM-enhanced mode** (when LLM provider configured): - **Risk labels per node**: `low` (cosmetic / internal), `medium` (behavior change), `high` (breaking / public API) -- **Review focus ranking**: rank affected files by risk × blast radius — "review this file first" +- **Review focus ranking**: rank affected files by risk x blast radius -- "review this file first" - **Critical path highlighting**: shortest path from a changed function to a high-fan-in entry point - **Test coverage gaps**: cross-reference affected code with test file graph edges **New file:** `src/github.js` -### 8.3 — Visual Impact Graphs for PRs +### 8.3 -- Visual Impact Graphs for PRs Extend the existing `diff-impact --format mermaid` foundation with CI automation and LLM annotations. **CI automation** (GitHub Action): 1. `codegraph build .` (incremental, fast on CI cache) 2. `codegraph diff-impact $BASE_REF --format mermaid -T` to generate the graph -3. Post as PR comment — GitHub renders Mermaid natively in markdown +3. Post as PR comment -- GitHub renders Mermaid natively in markdown 4. 
Update on new pushes (edit the existing comment) **LLM-enriched annotations** (when provider configured): - For each changed function: one-line summary of WHAT changed (from diff hunks) -- For each affected caller: WHY it's affected — what behavior might change downstream -- Node colors shift from green → yellow → red based on risk labels +- For each affected caller: WHY it's affected -- what behavior might change downstream +- Node colors shift from green -> yellow -> red based on risk labels - Overall PR risk score (aggregate of node risks weighted by centrality) **Historical context overlay:** - Annotate nodes with churn data: "this function changed 12 times in the last 30 days" - Highlight fragile nodes: high churn + high fan-in = high breakage risk -- Track blast radius trends: "this PR's blast radius is 2× larger than your average" +- Track blast radius trends: "this PR's blast radius is 2x larger than your average" **Depends on:** 8.1 (GitHub Action), 5.4 (`risk_score`, `side_effects`) -### 8.4 — SARIF Output +### 8.4 -- SARIF Output Add SARIF output format for cycle detection. SARIF integrates with GitHub Code Scanning, showing issues inline in the PR. @@ -857,9 +1013,9 @@ Add SARIF output format for cycle detection. 
SARIF integrates with GitHub Code S --- -## Phase 9 — Interactive Visualization & Advanced Features +## Phase 9 -- Interactive Visualization & Advanced Features -### 9.1 — Interactive Web Visualization +### 9.1 -- Interactive Web Visualization ```bash codegraph viz @@ -867,19 +1023,21 @@ codegraph viz Opens a local web UI at `localhost:3000` with: -- Force-directed graph layout (D3.js, inline — no external dependencies) +- Force-directed graph layout (D3.js, inline -- no external dependencies) - Zoom, pan, click-to-expand - Node coloring by type (file=blue, function=green, class=purple) - Edge styling by type (imports=solid, calls=dashed, extends=bold) - Search bar for finding nodes by name - Filter panel: toggle node kinds, confidence thresholds, test files - Code preview on hover (reads from source files) +- **Role-based coloring:** entry=orange, core=blue, utility=green, adapter=yellow, dead=gray (from structure.js roles) +- **Community overlay:** color by Louvain community assignment **Data source:** Export JSON from DB, serve via lightweight HTTP server. **New file:** `src/visualizer.js` -### 9.2 — Dead Code Detection +### 9.2 -- Dead Code Detection ```bash codegraph dead @@ -888,9 +1046,11 @@ codegraph dead --exclude-exports --exclude-tests Find functions/methods/classes with zero incoming edges (never called). Filters for exports, test files, and entry points. +> **Note:** Phase 2.5 added role classification (`dead` role in structure.js) which provides the foundation. This extends it with a dedicated command and smarter filtering. + **Affected files:** `src/queries.js` -### 9.3 — Cross-Repository Support (Monorepo) +### 9.3 -- Cross-Repository Support (Monorepo) Support multi-package monorepos with cross-package edges. @@ -900,7 +1060,7 @@ Support multi-package monorepos with cross-package edges. 
- `codegraph build --workspace` to scan all packages - Impact analysis across package boundaries -### 9.4 — Agentic Search +### 9.4 -- Agentic Search Recursive reference-following search that traces connections. @@ -916,13 +1076,13 @@ codegraph agent-search "payment processing" 4. Follow the most relevant references (up to configurable depth) 5. Return the full chain of related code -**Use case:** "Find everything related to payment processing" → finds payment functions → follows to validation → follows to database layer → returns complete picture. +**Use case:** "Find everything related to payment processing" -> finds payment functions -> follows to validation -> follows to database layer -> returns complete picture. -**Requires:** LLM for relevance re-ranking (optional — degrades to BFS without LLM). +**Requires:** LLM for relevance re-ranking (optional -- degrades to BFS without LLM). **New file:** `src/agentic-search.js` -### 9.5 — Refactoring Analysis +### 9.5 -- Refactoring Analysis LLM-powered structural analysis that identifies refactoring opportunities. The graph provides the structural data; the LLM interprets it. @@ -935,16 +1095,18 @@ LLM-powered structural analysis that identifies refactoring opportunities. The g | `hotspots` | High fan-in + high fan-out + on many paths | Ranked fragility report with explanations, `risk_score` per node | | `boundary_analysis` | Graph clustering (tightly-coupled groups spanning modules) | Reorganization suggestions: "these 4 functions in 3 files all deal with auth" | +> **Note:** `hotspots` and `boundary_analysis` already have data foundations from Phase 2.5 (structure.js hotspots, boundaries.js evaluation). This phase adds LLM interpretation on top. 
+ **Depends on:** 5.4 (`risk_score`, `complexity_notes`), 5.5 (module summaries) -### 9.6 — Auto-generated Docstrings +### 9.6 -- Auto-generated Docstrings ```bash codegraph annotate codegraph annotate --changed-only ``` -LLM-generated docstrings aware of callers, callees, and types. Diff-aware: only regenerate for functions whose code or dependencies changed. Stores in `docstrings` column on nodes table — does not modify source files unless explicitly requested. +LLM-generated docstrings aware of callers, callees, and types. Diff-aware: only regenerate for functions whose code or dependencies changed. Stores in `docstrings` column on nodes table -- does not modify source files unless explicitly requested. **Depends on:** 5.1 (LLM provider abstraction), 5.4 (side effects context) @@ -960,13 +1122,14 @@ Each phase includes targeted verification: |-------|-------------| | **1** | Benchmark native vs WASM parsing on a large repo, verify identical output from both engines | | **2** | `npm test`, manual MCP client test for all tools, config loading tests | -| **3** | All existing tests pass; each refactored module produces identical output to the pre-refactoring version; unit tests for pure analysis modules | +| **2.5** | All 59 test files pass; integration tests for every new command; engine parity tests | +| **3** | All existing tests pass; each refactored module produces identical output to the pre-refactoring version; unit tests for pure analysis modules; InMemoryRepository tests | | **4** | `tsc --noEmit` passes with zero errors; all existing tests pass after migration; no runtime behavior changes | | **5** | Compare `codegraph search` quality before/after descriptions; verify `side_effects` and `risk_score` populated for LLM-enriched builds | | **6** | `codegraph ask "How does import resolution work?"` against codegraph itself; verify `trace_flow` and `get_started` produce coherent narration | | **7** | Parse sample files for each new language, verify 
definitions/calls/imports | | **8** | Test PR in a fork, verify GitHub Action comment with Mermaid graph and risk labels is posted | -| **9** | `codegraph viz` loads; `hotspots` returns ranked list; `split_analysis` produces actionable output | +| **9** | `codegraph viz` loads; `hotspots` returns ranked list with LLM commentary; `split_analysis` produces actionable output | **Full integration test** after all phases: @@ -988,8 +1151,8 @@ codegraph viz Technology changes to monitor that may unlock future improvements. -- **`node:sqlite` (Node.js built-in)** — **primary target.** Zero native dependencies, eliminates C++ addon breakage on Node major releases (`better-sqlite3` already broken on Node 24/25). Currently Stability 1.1 (Active Development) as of Node 25.x. Adopt when it reaches Stability 2, or use as a fallback alongside `better-sqlite3` (dual-engine pattern like native/WASM parsing). Backed by the Node.js project — no startup risk. -- **`libsql` (SQLite fork by Turso)** — monitor only. Drop-in `better-sqlite3` replacement with built-in DiskANN vector search. However, Turso is pivoting engineering focus to Limbo (full Rust SQLite rewrite), leaving libsql as legacy. Pre-1.0 (v0.5.x) with uncertain long-term maintenance. Low switching cost (API-compatible, data is standard SQLite), but not worth adopting until the Turso/Limbo situation clarifies. +- **`node:sqlite` (Node.js built-in)** -- **primary target.** Zero native dependencies, eliminates C++ addon breakage on Node major releases (`better-sqlite3` already broken on Node 24/25). Currently Stability 1.1 (Active Development) as of Node 25.x. Adopt when it reaches Stability 2, or use as a fallback alongside `better-sqlite3` (dual-engine pattern like native/WASM parsing). Backed by the Node.js project -- no startup risk. +- **`libsql` (SQLite fork by Turso)** -- monitor only. Drop-in `better-sqlite3` replacement with built-in DiskANN vector search. 
However, Turso is pivoting engineering focus to Limbo (full Rust SQLite rewrite), leaving libsql as legacy. Pre-1.0 (v0.5.x) with uncertain long-term maintenance. Low switching cost (API-compatible, data is standard SQLite), but not worth adopting until the Turso/Limbo situation clarifies. --- diff --git a/generated/architecture.md b/generated/architecture.md index 1c3f4db0..bc9e5fa6 100644 --- a/generated/architecture.md +++ b/generated/architecture.md @@ -1,522 +1,402 @@ -# Codegraph Architectural Audit — Cold Analysis +# Codegraph Architectural Audit — Revised Analysis > **Scope:** Unconstrained redesign proposals. No consideration for migration effort or backwards compatibility. What would the ideal architecture look like? +> +> **Revision context:** The original audit (Feb 22, 2026) analyzed v1.4.0 with ~12 source modules totaling ~5K lines. Since then, the codebase grew to v2.6.0 with 35 source modules totaling 17,830 lines — a 3.5x expansion. 18 new modules were added, MCP tools went from 12 to 25, CLI commands from ~20 to 45, and `index.js` exports from ~40 to 120+. This revision re-evaluates every recommendation against the actual codebase as it stands today. --- -## 1. parser.js Is a Monolith — Split Into a Plugin System +## What Changed Since the Original Audit -**Current state:** `parser.js` is 2,215 lines containing 9 language extractors, the WASM/native engine abstraction, the language registry, tree walking helpers, and the unified parse API — all in one file. +Before diving into recommendations, here's what happened: -**Problem:** Adding or modifying a language extractor forces you to work inside a 2K-line file alongside unrelated extractors. The extractors share repetitive patterns (walk tree → switch on node type → push to arrays) but each reimplements the loop. Testing a single language requires importing the entire parser surface. 
+| Metric | Feb 2026 (v1.4.0) | Mar 2026 (v2.6.0) | Growth | +|--------|-------------------|-------------------|--------| +| Source modules | ~12 | 35 | 2.9x | +| Total source lines | ~5,000 | 17,830 | 3.5x | +| `queries.js` | 823 lines | 3,110 lines | 3.8x | +| `mcp.js` | 354 lines | 1,212 lines | 3.4x | +| `cli.js` | -- | 1,285 lines | -- | +| `builder.js` | 554 lines | 1,173 lines | 2.1x | +| `embedder.js` | 525 lines | 1,113 lines | 2.1x | +| `complexity.js` | -- | 2,163 lines | New | +| MCP tools | 12 | 25 | 2.1x | +| CLI commands | ~20 | 45 | 2.3x | +| `index.js` exports | ~40 | 120+ | 3x | +| Test files | ~15 | 59 | 3.9x | -**Ideal architecture:** - -``` -src/ - parser/ - index.js # Public API: parseFileAuto, parseFilesAuto, resolveEngine - registry.js # LANGUAGE_REGISTRY + extension mapping - engine.js # Native/WASM init, engine resolution, grammar loading - tree-utils.js # findChild, findParentClass, walkTree helpers - base-extractor.js # Shared extraction framework (the walk loop + accumulator) - extractors/ - javascript.js # JS/TS/TSX extractor - python.js - go.js - rust.js - java.js - csharp.js - ruby.js - php.js - hcl.js -``` - -**Key design change:** Introduce a `BaseExtractor` that owns the tree walk loop and provides hook methods per node type. Each language extractor declares a node-type → handler map instead of reimplementing the traversal: - -```js -// Conceptual — not real API -export default { - language: 'python', - handlers: { - function_definition: (node, ctx) => { ctx.addDefinition(...) }, - call: (node, ctx) => { ctx.addCall(...) }, - import_statement: (node, ctx) => { ctx.addImport(...) }, - } -} -``` - -This eliminates the repeated walk-and-switch boilerplate across 9 extractors while keeping language-specific logic isolated. Each extractor becomes independently testable and the registration is declarative. 
+**Key pattern observed:** Every new feature (audit, batch, boundaries, check, cochange, communities, complexity, flow, manifesto, owners, structure, triage) was added as a standalone module following the same internal pattern: raw SQL + BFS/traversal logic + CLI formatting + JSON output + `*Data()` / `*()` dual functions. No shared abstractions were introduced. The original architectural debt wasn't addressed -- it was replicated 15 times. --- -## 2. The Database Layer Is Too Thin — Introduce a Repository Pattern +## 1. The Dual-Function Anti-Pattern Is Now the Dominant Architecture Problem -**Current state:** `db.js` is 130 lines — it opens SQLite, runs migrations, and that's it. All actual SQL lives scattered across `builder.js`, `queries.js`, `embedder.js`, `watcher.js`, and `cycles.js`. Every consumer writes raw SQL inline. +**Original analysis (S3):** `queries.js` mixes data access, graph algorithms, and presentation. The `*Data()` / `*()` dual-function pattern was identified as a workaround for coupling. -**Problems:** -- SQL duplication (similar node/edge lookups written multiple times in different modules) -- No single place to understand or optimize the query surface -- Schema knowledge leaks everywhere — if a column changes, you grep the entire codebase -- No abstraction boundary for swapping storage engines (e.g., moving to DuckDB or an in-memory graph for tests) - -**Ideal architecture:** +**What happened:** Every new module adopted the same pattern. There are now **15+ modules** each implementing both data extraction AND CLI formatting: ``` -src/ - db/ - connection.js # Open, WAL mode, pragma tuning - migrations.js # Schema versions - repository.js # ALL data access methods - types.js # TS-style JSDoc type defs for Node, Edge, Embedding +queries.js -> queryNameData() / queryName(), impactAnalysisData() / impactAnalysis(), ... 
+audit.js -> auditData() / audit() +batch.js -> batchData() / batch() +check.js -> checkData() / check() +cochange.js -> coChangeData() / coChange(), coChangeTopData() / coChangeTop() +communities.js -> communitiesData() / communities() +complexity.js -> complexityData() / complexity() +flow.js -> flowData() / flow() +manifesto.js -> manifestoData() / manifesto() +owners.js -> ownersData() / owners() +structure.js -> structureData() / structure(), hotspotsData() / hotspots() +triage.js -> triageData() / triage() +branch-compare -> branchCompareData() / branchCompare() ``` -`repository.js` would expose a complete data access API: +Each of these modules independently handles: DB opening, SQL execution, result shaping, pagination integration, CLI formatting, JSON output, and `--no-tests` filtering. The repetition is massive. -```js -// Writes -insertNode(node) -insertEdge(edge) -insertEmbeddings(batch) -upsertFileHash(file, hash, mtime) -deleteFileNodes(file) -deleteFileEdges(file) - -// Reads -findNodesByName(name, opts?) -findNodesByFile(file, opts?) -findEdgesFrom(nodeId, kind?) -findEdgesTo(nodeId, kind?) -getFileHash(file) -getChangedFiles(allFiles) -getAllEmbeddings() -getEmbeddingMeta() - -// Graph traversals (currently in queries.js as raw SQL + BFS) -getTransitiveCallers(nodeId, depth) -getTransitiveDependents(file, depth) -getClassHierarchy(classNodeId) -``` - -All prepared statements live here. All index tuning happens here. Consumers never see SQL. - -**Secondary benefit:** This enables an `InMemoryRepository` for tests — no temp file cleanup, instant setup, true unit isolation. - ---- - -## 3. queries.js Mixes Data Access, Graph Algorithms, and Presentation - -**Current state:** `queries.js` (823 lines) contains SQL queries, BFS traversal logic, formatting/printing, JSON serialization, and CLI output — all interleaved. Each "query command" exists as both a `*Data()` function (returns object) and a presentation function (prints to stdout). 
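Reduced to a runnable sketch, the dual-function shape looks like this (hypothetical `widgetData`/`widget` names and a stubbed DB -- not real codegraph functions):

```javascript
// Hypothetical module illustrating the *Data()/*() dual-function shape.
// Every affected module pairs a data function with a display wrapper.

// Data variant: runs the query, returns a plain object -- consumed by
// MCP and the programmatic API.
function widgetData(name, db) {
  const rows = db.query(name) // stand-in for a prepared SQL statement
  return { name, count: rows.length, rows }
}

// Display variant: calls the data variant, then formats for the CLI.
// This wrapper -- JSON branch included -- is what gets repeated per module.
function widget(name, db, opts = {}) {
  const data = widgetData(name, db)
  if (opts.json) return JSON.stringify(data)
  return `${data.name}: ${data.count} match(es)`
}

// Stub DB so the sketch runs without SQLite.
const stubDb = { query: () => [{ id: 1 }, { id: 2 }] }
console.log(widget('format', stubDb)) // format: 2 match(es)
```

Collapsing the pattern means keeping only the data half in each module and moving the formatting half behind one shared, pluggable layer.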
- -**Problem:** The presentation layer (stdout formatting, `kindIcon()`, table printing) is coupled to the analysis layer (BFS, impact scoring). You can't reuse the BFS logic in the MCP server without also pulling in the CLI formatting. The `*Data()`/`*()` dual-function pattern is a workaround for this coupling. - -**Ideal architecture — three layers:** +**Ideal architecture -- Command + Query separation with shared infrastructure:** ``` src/ - analysis/ - impact.js # impactAnalysis: BFS over edges, returns typed result - call-chain.js # fnDeps, fnImpact: transitive caller/callee traversal - diff-impact.js # Git diff → affected functions → blast radius - module-map.js # Connectivity ranking - class-hierarchy.js # Inheritance resolution - - formatters/ - cli-formatter.js # Human-readable stdout output - json-formatter.js # --json flag handling - table-formatter.js # Tabular output for module-map, list-functions + commands/ # One file per command + query.js # { execute(args, ctx) -> data, format(data, opts) -> string } + impact.js + audit.js + check.js + ... + + infrastructure/ + command-runner.js # Shared lifecycle: open DB -> validate -> execute -> format -> paginate + result-formatter.js # Shared formatting: table, JSON, NDJSON, Mermaid + pagination.js # Shared pagination with consistent interface + test-filter.js # Shared --no-tests / isTestFile logic + + analysis/ # Pure algorithms -- no I/O, no formatting + bfs.js # Graph traversals (BFS, DFS, shortest path) + impact.js # Blast radius computation + confidence.js # Import resolution scoring + clustering.js # Community detection, coupling analysis + risk.js # Triage scoring, hotspot detection ``` -Analysis modules take a repository and return pure data. Formatters take data and produce strings. The CLI, MCP server, and programmatic API all consume analysis modules directly and pick their own formatter (or none). - ---- - -## 4. 
builder.js Orchestrates Too Many Concerns — Extract a Pipeline - -**Current state:** `builder.js` (554 lines) handles file collection, config loading, alias resolution, incremental change detection, parsing, node insertion, edge building, barrel file resolution, and statistics — all in `buildGraph()`. - -**Problem:** `buildGraph()` is a mega-function that's hard to test in parts. You can't test edge building without running the full parse phase. You can't test barrel resolution without a populated database. - -**Ideal architecture — explicit pipeline stages:** - -```js -// Each stage is a pure-ish function: (input, config) => output -const pipeline = [ - collectFiles, // (rootDir, config) => filePaths[] - detectChanges, // (filePaths, db) => { changed, removed, isFullBuild } - parseFiles, // (filePaths, engineOpts) => Map - insertNodes, // (symbolMap, db) => nodeIndex - resolveImports, // (symbolMap, rootDir, aliases) => importEdges[] - buildCallEdges, // (symbolMap, nodeIndex) => callEdges[] - buildClassEdges, // (symbolMap, nodeIndex) => classEdges[] - resolveBarrels, // (edges, symbolMap) => resolvedEdges[] - insertEdges, // (allEdges, db) => stats -] -``` +The key insight: every command follows the same lifecycle -- `(args) -> open DB -> query -> analyze -> format -> output`. A shared `CommandRunner` handles the lifecycle. Each command only implements the unique query + analysis logic. Formatting is always separate and pluggable (CLI text, JSON, NDJSON, Mermaid). -Each stage is independently testable. The pipeline runner handles transactions, logging, and statistics. Stages can be composed differently for watch mode (skip collectFiles, skip detectChanges, run single-file variant). +This eliminates the dual-function pattern entirely. `index.js` exports `auditData` (the command's execute function) -- the CLI formatter is internal to the CLI layer and never exported. --- -## 5. Embedder Should Be a Standalone Subsystem +## 2. 
The Database Layer Needs a Repository -- Now More Than Ever -**Current state:** `embedder.js` (525 lines) creates its own DB tables (`embeddings`, `embedding_meta`), manages its own model lifecycle, and implements both vector storage and search. It's effectively a mini vector database bolted onto the side of the graph database. +**Original analysis (S2):** SQL scattered across `builder.js`, `queries.js`, `embedder.js`, `watcher.js`, `cycles.js`. -**Problem:** Embedding concerns bleed into the graph DB schema. The cosine similarity search is O(n) full scan — fine for thousands of symbols, will not scale. The model registry, embedding generation, and search are all tangled in one file. +**What happened:** SQL is now scattered across **20+ modules**: all of the above plus `audit.js`, `check.js`, `cochange.js`, `communities.js`, `complexity.js`, `flow.js`, `manifesto.js`, `owners.js`, `structure.js`, `triage.js`, `snapshot.js`, `branch-compare.js`. Each module opens the DB independently with `openDb()`, creates its own prepared statements, and writes raw SQL inline. -**Ideal architecture:** +The schema grew to 9 tables: `nodes`, `edges`, `node_metrics`, `file_hashes`, `co_changes`, `co_change_meta`, `file_commit_counts`, `build_meta`, `function_complexity`. Plus embeddings and FTS5 tables in `embedder.js`. + +**Ideal architecture** (unchanged from original, but now higher priority): ``` src/ - embeddings/ - index.js # Public API - model-registry.js # Model definitions, batch sizes, loading - generator.js # Source → text preparation → batch embedding - store.js # Vector storage (pluggable: SQLite blob, flat file, HNSW index) - search.js # Similarity search, RRF multi-query fusion -``` - -**Key design change:** Make the vector store pluggable. The current SQLite blob approach works but is a linear scan. A future `HNSWStore` (using `hnswlib-node` or similar) would give O(log n) approximate nearest neighbor search — critical when the symbol count reaches 50K+. 
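The pluggable boundary can stay small. A sketch with assumed method names, using the current linear cosine scan as the default backend:

```javascript
// Sketch of a pluggable vector store -- assumed names, not existing code.
// This backend preserves the O(n) cosine scan; an HNSW-backed store would
// implement the same four methods over an approximate index.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}

function createLinearScanStore() {
  const vectors = new Map() // nodeId -> { vector, preview }
  return {
    insert: (nodeId, vector, preview) => vectors.set(nodeId, { vector, preview }),
    delete: (nodeId) => vectors.delete(nodeId),
    search(queryVector, topK, minScore = 0) {
      // Full scan: score every stored vector, sort, truncate.
      return [...vectors.entries()]
        .map(([nodeId, { vector, preview }]) => ({
          nodeId,
          preview,
          score: cosine(queryVector, vector),
        }))
        .filter((r) => r.score >= minScore)
        .sort((a, b) => b.score - a.score)
        .slice(0, topK)
    },
    rebuild: () => {}, // no-op for a scan; an index-backed store re-indexes here
  }
}

const store = createLinearScanStore()
store.insert('a', [1, 0], 'fn a')
store.insert('b', [0, 1], 'fn b')
// query [1, 0.1] scores 'a' far above 'b'
```

Because callers only see `insert`/`search`/`delete`/`rebuild`, swapping the scan for an approximate-nearest-neighbor backend would not touch search logic.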
- -The store interface would be: - -```js -// Abstract store -insert(nodeId, vector, preview) -search(queryVector, topK, minScore) → results[] -delete(nodeId) -rebuild() + db/ + connection.js # Open, WAL mode, pragma tuning, connection pooling + migrations.js # Schema versions (currently 9 migrations) + repository.js # ALL read/write operations across all 9+ tables + types.js # JSDoc type definitions for all entities ``` -This also enables storing embeddings in a separate file from the graph DB, which avoids bloating `graph.db` with large binary blobs. +**New addition -- query builders for common patterns:** ---- - -## 6. The Native/WASM Abstraction Leaks - -**Current state:** `parser.js` has `resolveEngine()` that returns `{ name, native }`, then every call site branches on `engine.name === 'native'`. `resolve.js` has its own native check. `cycles.js` has its own native check. `builder.js` passes engine options through. - -**Problem:** The dual-engine strategy is a great idea but its implementation is scattered. Every consumer needs to know about native vs. WASM and handle both paths. - -**Ideal architecture — unified engine interface:** +Many modules do the same filtered query: "find nodes WHERE kind IN (...) AND file NOT LIKE '%test%' AND name LIKE ? ORDER BY ... LIMIT ? OFFSET ?". A lightweight query builder eliminates this SQL duplication: ```js -// engine.js — returns an object with the same API regardless of backend -export function createEngine(opts) { - const backend = resolveBackend(opts) // 'native' | 'wasm' - - return { - name: backend, - parseFile(filePath, source) { ... }, - parseFiles(filePaths, rootDir) { ... }, - resolveImport(from, source, rootDir, aliases) { ... }, - resolveImports(batch, rootDir, aliases) { ... }, - detectCycles(db) { ... }, - computeConfidence(caller, target, imported) { ... }, - createCache() { ... 
}, - } -} +repo.nodes() + .where({ kind: ['function', 'method'], file: { notLike: '%test%' } }) + .matching(name) + .orderBy('name') + .paginate(opts) + .all() ``` -Consumers receive an engine object and call methods on it. They never branch on native vs. WASM. The engine internally dispatches to the right implementation. This is the Strategy pattern properly applied. - -**Bonus:** This makes it trivial to add a third engine backend (e.g., a remote parsing service for very large repos) without touching any consumer code. +Not an ORM -- a thin SQL builder that generates the same prepared statements but eliminates string construction across 20 modules. --- -## 7. No Streaming / Event Architecture — Everything Is Batch - -**Current state:** The entire build pipeline is synchronous batch processing. Parse all files → insert all nodes → build all edges. The watcher does per-file updates but reimplements the pipeline in a simpler form. +## 3. queries.js at 3,110 Lines Must Be Decomposed -**Problem:** For large repos (10K+ files), the user waits for the entire pipeline to complete before seeing anything. There's no progress reporting during parsing. There's no way to cancel a build mid-flight. The watcher's simplified pipeline diverges from the main build path (different code, different edge cases). *(Note: two concrete edge cases — concurrent file edits causing EBUSY/EACCES during read, and symlink loops causing infinite recursion in `collectFiles` — have been fixed. `readFileSafe` retries on transient OS errors and is shared between `builder.js` and `watcher.js`. `collectFiles` tracks visited real paths to break symlink cycles.)* +**Original analysis (S3):** 823 lines mixing data access, algorithms, and presentation. -**Ideal architecture — event-driven pipeline:** +**Current state:** 3,110 lines -- nearly 4x growth. 
Contains 15+ data functions, 15+ display functions, constants (`SYMBOL_KINDS`, `ALL_SYMBOL_KINDS`, `VALID_ROLES`, `FALSE_POSITIVE_NAMES`), icon helpers (`kindIcon`), normalization (`normalizeSymbol`), test filtering (`isTestFile`), and generator functions (`iterListFunctions`, `iterRoles`, `iterWhere`). -```js -const pipeline = createPipeline(config) +This is now the second-largest file in the codebase (after `complexity.js` at 2,163 lines) and the most interconnected -- almost every other module imports from it. -pipeline.on('file:parsed', (file, symbols) => { /* progress */ }) -pipeline.on('file:indexed', (file, nodeCount) => { /* progress */ }) -pipeline.on('edge:built', (edge) => { /* streaming insert */ }) -pipeline.on('build:complete', (stats) => { /* summary */ }) -pipeline.on('error', (file, err) => { /* continue or abort */ }) +**Ideal decomposition:** -await pipeline.run(rootDir) -// or for watch mode: -await pipeline.watch(rootDir) // reuses same stages, different trigger ``` - -This unifies the build and watch code paths. Progress is naturally reported via events. Cancellation is a `pipeline.abort()`. Large builds can stream results to the DB incrementally instead of buffering everything in memory. - ---- - -## 8. Configuration Is Fine but Should Support Project Profiles - -**Current state:** Single `.codegraphrc.json` file, flat config, env var overrides. Clean and simple. - -**What's missing for real-world use:** - -**Profile-based configuration.** A monorepo with 3 services needs different settings per service (different `include`/`exclude`, different `ignoreDirs`, different `dbPath`). Currently you'd need 3 separate config files and run from 3 different directories. 
- -```json -{ - "profiles": { - "backend": { - "include": ["services/api/**"], - "build": { "dbPath": ".codegraph/api.db" } - }, - "frontend": { - "include": ["apps/web/**"], - "extensions": [".ts", ".tsx"], - "build": { "dbPath": ".codegraph/web.db" } - } - } -} -``` - -```bash -codegraph build --profile backend -codegraph build --profile frontend -codegraph build # default = all +src/ + analysis/ + symbol-lookup.js # queryNameData, whereData, listFunctionsData + impact.js # impactAnalysisData, fnImpactData, diffImpactData + dependencies.js # fileDepsData, fnDepsData, pathData + module-map.js # moduleMapData, statsData + context.js # contextData, explainData + roles.js # rolesData (currently delegates to structure.js) + + shared/ + constants.js # SYMBOL_KINDS, ALL_SYMBOL_KINDS, VALID_ROLES, FALSE_POSITIVE_NAMES + filters.js # isTestFile, normalizeSymbol, kindIcon + generators.js # iterListFunctions, iterRoles, iterWhere ``` -This maps cleanly to the multi-repo registry concept already in the codebase, but works within a single repo. - ---- - -## 9. Import Resolution Confidence Scoring Is Heuristic — Add Import-Graph Awareness - -**Current state:** `computeConfidence()` uses file proximity (same dir = 0.7, parent dir = 0.5, fallback = 0.3) to disambiguate when multiple functions share a name. - -**Problem:** Proximity is a weak signal. If `src/utils/format.js` exports `format()` and `src/api/format.js` also exports `format()`, and the caller is in `src/api/handler.js`, proximity correctly scores `src/api/format.js` higher. But if the caller explicitly imports from `src/utils/format.js`, the import graph already tells us the answer with certainty — and the current code does use imports when available (score 1.0). The gap is in the fallback path where there's no import but there IS an import chain (A imports B which imports C which exports the function). 
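That fallback gap is a reachability question over import edges already in the graph. A sketch under assumed data shapes (an adjacency map of file -> imported files; not the actual `computeConfidence()` code):

```javascript
// Sketch: boost a candidate's confidence when the caller's file
// transitively imports the candidate's file. Assumed shapes only.
function importReaches(importGraph, fromFile, toFile, maxDepth = 10) {
  const seen = new Set([fromFile])
  let frontier = [fromFile]
  for (let depth = 0; depth < maxDepth && frontier.length; depth++) {
    const next = []
    for (const file of frontier) {
      for (const imported of importGraph.get(file) || []) {
        if (imported === toFile) return true
        if (!seen.has(imported)) {
          seen.add(imported)
          next.push(imported)
        }
      }
    }
    frontier = next
  }
  return false
}

// Proximity heuristics only run when no import path exists at all.
function scoreCandidate(importGraph, callerFile, candidateFile, proximityScore) {
  return importReaches(importGraph, callerFile, candidateFile) ? 0.9 : proximityScore
}

const graph = new Map([
  ['src/api/handler.js', ['src/api/index.js']],
  ['src/api/index.js', ['src/utils/format.js']], // barrel re-export
])
// handler.js reaches utils/format.js through the barrel -> 0.9
```

The depth cap keeps the walk cheap even on large import graphs; direct imports continue to score 1.0 before this path is consulted.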
- -**Ideal enhancement — transitive import awareness:** - -Before falling back to proximity, walk the import graph from the caller file. If there's any import path (even indirect through barrel files) that reaches one of the candidates, that candidate gets a 0.9 score. Only if no import path exists at all do we fall back to proximity heuristics. - -This is a targeted algorithmic improvement, not a structural change, but it significantly improves edge accuracy for large codebases with many same-named functions. +Each analysis module is purely data -- no CLI output, no JSON formatting, no `console.log`. The `*Data()` suffix disappears because there's no `*()` counterpart. These are just functions that return data. --- -## 10. The MCP Server Should Be Composable, Not Monolithic +## 4. MCP at 1,212 Lines with 25 Tools Needs Composability -**Current state:** `mcp.js` (354 lines) has a hardcoded `TOOLS` array with 12 tool definitions, each with inline JSON schemas, and a `switch` statement dispatching to handler functions. +**Original analysis (S10):** 354 lines, 12 tools, monolithic switch dispatch. -**Problem:** Adding a new MCP tool requires editing the TOOLS array (schema), the switch statement (dispatch), and importing the handler — three changes in one file. The tool schemas are verbose JSON objects mixed with implementation logic. +**Current state:** 1,212 lines, 25 tools. The `buildToolList()` function dynamically builds tool definitions, and a large switch/dispatch handles all 25 tools. Adding a tool still requires editing the tool list, the dispatch block, and importing the handler -- three changes in one file. 
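One way to collapse those three edits into a single file addition is a registry keyed by each tool's own schema -- a sketch with assumed module shapes:

```javascript
// Sketch of a tool registry -- assumed shapes, not the actual mcp.js code.
// Each tool module exports { schema, handler }; listing and dispatch are
// derived from the modules themselves, replacing the hand-edited tool
// list and switch statement.
function createToolRegistry(toolModules) {
  const tools = new Map()
  for (const mod of toolModules) {
    if (!mod.schema || !mod.schema.name || typeof mod.handler !== 'function') {
      throw new Error('tool module must export { schema, handler }')
    }
    tools.set(mod.schema.name, mod)
  }
  return {
    // Serves the MCP tools/list request.
    list: () => [...tools.values()].map((t) => t.schema),
    // Serves tools/call: one lookup instead of a 25-case switch.
    dispatch(name, args, context) {
      const tool = tools.get(name)
      if (!tool) throw new Error(`unknown tool: ${name}`)
      return tool.handler(args, context)
    },
  }
}

// A hypothetical tool module, as it would live in tools/echo.js.
const echoTool = {
  schema: { name: 'echo', description: 'Echo arguments back', inputSchema: {} },
  handler: (args) => ({ echoed: args }),
}

const registry = createToolRegistry([echoTool])
```

In the real server the array would come from reading the `tools/` directory and importing each module, so adding a tool touches no existing file.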
-**Ideal architecture:** +**Ideal architecture** (unchanged from original, now critical): ``` src/ mcp/ - server.js # MCP server setup, transport, connection lifecycle - tool-registry.js # Dynamic tool registration + server.js # MCP server setup, transport, connection lifecycle + tool-registry.js # Auto-discovery + dynamic registration + middleware.js # Pagination, error handling, repo resolution tools/ - query-function.js # { schema, handler } per tool + query-function.js # { schema, handler } file-deps.js impact-analysis.js - find-cycles.js - semantic-search.js - ... + check.js + audit.js + complexity.js + co-changes.js + structure.js + ... (25 files, one per tool) ``` -Each tool is a self-contained module: +Each tool is self-contained: ```js -// tools/query-function.js export const schema = { - name: 'query_function', + name: 'audit', description: '...', inputSchema: { ... } } export async function handler(args, context) { - const dbPath = context.resolveDb(args.repo) - return queryNameData(args.name, dbPath) + return auditData(args.target, context.resolveDb(args.repo), args) } ``` -The registry auto-discovers tools from the `tools/` directory. Adding a tool = adding a file. No other files change. - ---- - -## 11. Testing Strategy Needs Layers - -**Current state:** Tests are a mix of integration tests (full pipeline through SQLite) and pseudo-unit tests that still often hit the filesystem or database. There's no clear boundary between "test the algorithm" and "test the integration." 
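The "test the algorithm" half of that boundary becomes trivial once traversal logic takes plain data. A sketch -- `reachable` is a hypothetical stand-in for the kind of BFS that impact analysis runs over edges, not an existing export:

```javascript
// Pure unit shape: the algorithm takes an adjacency list -- no SQLite,
// no filesystem, no temp directories to clean up.
function reachable(adjacency, start) {
  const seen = new Set([start])
  const queue = [start]
  while (queue.length) {
    const node = queue.shift()
    for (const next of adjacency[node] || []) {
      if (!seen.has(next)) {
        seen.add(next)
        queue.push(next)
      }
    }
  }
  seen.delete(start) // report only what start can reach, not start itself
  return [...seen].sort()
}

// The "unit" test is just data in, data out:
const edges = { a: ['b'], b: ['c'], c: [], d: ['a'] }
// reachable(edges, 'a') -> ['b', 'c']
```

Integration tests then only need to cover the wiring (SQL produces the right adjacency), not every traversal edge case.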
- -**Ideal testing pyramid:** - -``` - ╱╲ - ╱ ╲ E2E (2-3 tests) - ╱ E2E╲ Full CLI invocation, real project, assert output - ╱──────╲ - ╱ ╲ Integration (current tests, refined) - ╱Integration╲ Build pipeline, query results, MCP responses - ╱────────────╲ - ╱ ╲ Unit (new layer) - ╱ Unit ╲ Extractors, algorithms, formatters — no I/O - ╱──────────────────╲ -``` - -**What's missing:** -- **Pure unit tests** for extractors (pass AST node, assert symbols — no file I/O) -- **Pure unit tests** for BFS/Tarjan algorithms (pass adjacency list, assert result) -- **Pure unit tests** for confidence scoring (pass parameters, assert score) -- **Repository mock** for query tests (in-memory data, no SQLite) -- **E2E tests** that invoke the CLI binary on a real (small) project and assert exit codes + stdout - -The repository pattern from point #2 directly enables this: unit tests use `InMemoryRepository`, integration tests use `SqliteRepository`. +The registry auto-discovers tools from the directory. Shared middleware handles pagination (the `MCP_DEFAULTS` logic currently in `paginate.js`), error wrapping, and multi-repo resolution. Adding a tool = adding a file. --- -## 12. CLI Architecture — Move to Command Objects +## 5. CLI at 1,285 Lines with 45 Commands Needs Command Objects -**Current state:** `cli.js` defines all commands inline with Commander.js. Each command is a `.command().description().option().action()` chain that directly calls functions. +**Original analysis (S12):** CLI was mentioned as a future concern. -**Problem:** The CLI file grows linearly with every new command. Command logic (option parsing, validation, output formatting) is mixed with framework wiring. You can't test a command's behavior without invoking Commander. +**Current state:** 1,285 lines of inline Commander.js chains. 45 commands registered with `.command().description().option().action()` patterns. 
Each action handler directly calls module functions, handles `--json` output, and manages error display. **Ideal architecture:** ``` src/ cli/ - index.js # Commander setup, command registration + index.js # Commander setup, auto-discover commands + shared/ + output.js # --json, --ndjson, table, plain text output + options.js # Shared options (--no-tests, --json, --db, --engine, --limit, --offset) + validation.js # Argument validation, path resolution commands/ - build.js # { name, description, options, validate, execute } + build.js # { name, description, options, validate, execute } query.js impact.js - deps.js - export.js - search.js - watch.js - registry.js - ... + audit.js + check.js + ... (45 files) ``` -Each command is a plain object: +Each command: ```js export default { - name: 'impact', - description: 'Show what depends on a file', - arguments: [{ name: 'file', required: true }], + name: 'audit', + description: 'Combined explain + impact + health report', + arguments: [{ name: 'target', required: true }], options: [ - { flags: '--depth ', description: 'Traversal depth', default: 3 }, - { flags: '--json', description: 'JSON output' }, + { flags: '-T, --no-tests', description: 'Exclude test files' }, + { flags: '-j, --json', description: 'JSON output' }, + { flags: '--db ', description: 'Custom DB path' }, ], - validate(args, opts) { /* pre-flight checks */ }, - async execute(args, opts) { /* the actual work */ }, + async execute(args, opts) { + const data = await auditData(args.target, opts.db, opts) + return data // CommandRunner handles formatting + }, } ``` -The CLI index auto-discovers commands and registers them with Commander. Each command is independently testable by calling `execute()` directly. +The CLI index auto-discovers commands. Shared options (`--no-tests`, `--json`, `--db`, `--engine`, `--limit`, `--offset`) are applied uniformly. The `CommandRunner` handles the open-DB -> execute -> format -> output lifecycle. --- -## 13. 
Graph Model Is Flat — Consider Hierarchical Scoping +## 6. complexity.js at 2,163 Lines Is a Hidden Monolith -**Current state:** The `nodes` table has `(name, kind, file, line)`. A function named `format` in `src/a.js` and a method named `format` on class `DateHelper` in `src/b.js` are both just nodes with `name=format`. The class membership is encoded as an edge, not as a structural property. +**Not in original analysis** -- this module didn't exist in Feb 2026. -**Problem:** Name collisions are resolved through the confidence scoring heuristic. But the graph has no concept of scope — there's no way to express "this `format` belongs to `DateHelper`" as a structural property of the node. This makes queries ambiguous: `codegraph query format` returns all `format` symbols across the entire graph. +**Current state:** 2,163 lines containing language-specific AST complexity rules for 8 languages (JS/TS, Python, Go, Rust, Java, C#, PHP, Ruby), plus Halstead metrics computation, maintainability index calculation, LOC/SLOC counting, and CLI formatting. It's the largest file in the codebase. -**Ideal enhancement — qualified names:** +**Problem:** The file is structured as a giant map of language to rules, but the rules for each language are deeply nested objects with inline AST traversal logic. Adding a new language or modifying a rule requires working inside a 2K-line file. -```sql -CREATE TABLE nodes ( - id INTEGER PRIMARY KEY, - name TEXT NOT NULL, -- 'format' - qualified_name TEXT, -- 'DateHelper.format' or 'utils/date::format' - kind TEXT NOT NULL, - file TEXT NOT NULL, - scope TEXT, -- 'DateHelper' (parent class/module/namespace) - line INTEGER, - end_line INTEGER, - visibility TEXT, -- 'public' | 'private' | 'protected' | 'internal' - UNIQUE(qualified_name, kind, file) -); -``` +**Ideal architecture:** -The `qualified_name` gives every symbol a unique identity within its file. The `scope` field enables queries like "all methods of class X" without traversing edges. 
The `visibility` field enables filtering out private implementation details from impact analysis. +``` +src/ + complexity/ + index.js # Public API: computeComplexity, complexityData + metrics.js # Halstead, MI, LOC/SLOC computation (language-agnostic) + engine.js # Walk AST + apply rules -> raw metric values + rules/ + javascript.js # JS/TS/TSX complexity rules + python.js + go.js + rust.js + java.js + csharp.js + php.js + ruby.js +``` -This doesn't change the edge model — it enriches the node model to reduce ambiguity at the source. +Each rules file exports a declarative complexity rule set. The engine applies rules to AST nodes. Metrics computation is shared. This mirrors the parser plugin system concept -- same pattern, applied to complexity. --- -## 14. No Caching Layer Between DB and Queries +## 7. builder.js at 1,173 Lines -- Pipeline Architecture -**Current state:** Every query function opens the DB, runs SQL, returns results, and closes. There's no caching of query results, no materialized views, no precomputed aggregates. +**Original analysis (S4):** 554 lines, mega-function that's hard to test in parts. -**Fine for now.** SQLite is fast and the graph fits in memory. But as graphs grow (50K+ nodes), repeated queries (especially from MCP where an AI agent may query the same function multiple times in a conversation) will redundantly hit disk. +**Current state:** 1,173 lines -- doubled. Now includes change journal integration, structure building, role classification, incremental verification, and more complex edge building. The `buildGraph()` function is even more of a mega-function. -**Ideal enhancement — query result cache:** +**Ideal architecture** (unchanged, reinforced): ```js -class QueryCache { - constructor(db, maxAge = 60_000) { ... } - - // Cache key = query name + args hash - // Invalidated on DB write (build, watch update) - get(key) { ... } - set(key, value) { ... } - invalidate() { ... 
} // Called after any DB mutation -} +const pipeline = [ + collectFiles, // (rootDir, config) => filePaths[] + detectChanges, // (filePaths, db) => { changed, removed, isFullBuild } + parseFiles, // (filePaths, engineOpts) => Map + insertNodes, // (symbolMap, db) => nodeIndex + resolveImports, // (symbolMap, rootDir, aliases) => importEdges[] + buildCallEdges, // (symbolMap, nodeIndex) => callEdges[] + buildClassEdges, // (symbolMap, nodeIndex) => classEdges[] + resolveBarrels, // (edges, symbolMap) => resolvedEdges[] + insertEdges, // (allEdges, db) => stats + buildStructure, // (db, fileSymbols, rootDir) => structureStats + classifyRoles, // (db) => roleStats + computeComplexity, // (db, rootDir, engine) => complexityStats + emitChangeJournal, // (rootDir, changes) => void +] ``` -This is a simple LRU or TTL cache that sits between the analysis layer and the repository. It's transparent to consumers. Particularly valuable for MCP where the same agent session may repeatedly query related symbols. +The pipeline grew -- four new stages since the original analysis. This reinforces the need: each stage is independently testable and the pipeline runner handles transactions, logging, progress, and statistics. + +**Watch mode** reuses the same stages triggered per-file, eliminating the `watcher.js` divergence. `change-journal.js` and `journal.js` integrate as pipeline hooks rather than separate code paths. --- -## 15. Watcher and Builder Share Logic But Don't Share Code +## 8. embedder.js at 1,113 Lines -- Now Includes Three Search Engines + +**Original analysis (S5):** 525 lines, mini vector database bolted onto the graph DB. -**Current state:** `watcher.js` reimplements parts of `builder.js` — node insertion, edge building, prepared statement setup — in a simplified single-file form. The two implementations can drift. +**Current state:** 1,113 lines. 
Now contains: +- 8 embedding model definitions with batch sizes and dimensions +- 2 embedding strategies (structured, source) +- Vector storage in SQLite blobs +- Cosine similarity search (O(n) linear scan) +- **FTS5 full-text index with BM25 scoring** (new) +- **Hybrid search with RRF fusion** (new) +- Model lifecycle management (lazy loading, caching) -**Problem:** Bug fixes to edge building in `builder.js` must be separately applied to `watcher.js`. The watcher's edge building is simpler (no barrel resolution, simpler confidence) which means watch-mode graphs are subtly different from full-build graphs. +Hybrid search (originally planned as Phase 5.3) is already implemented -- but inside the monolith. -**Partial progress:** `readFileSafe` (exported from `builder.js`, imported by `watcher.js`) is the first shared utility between the two modules. It retries on transient OS errors (EBUSY/EACCES/EPERM) that occur when editors perform non-atomic saves, replacing bare `readFileSync` calls in both code paths. This is a small step toward the shared-stages goal. +**Ideal architecture** (updated): -**Ideal fix:** The pipeline architecture from point #4 eliminates this entirely. Watch mode uses the same pipeline stages, just triggered per-file instead of per-project. The `insertNodes` and `buildEdges` stages are literally the same functions. +``` +src/ + embeddings/ + index.js # Public API + models.js # Model definitions, batch sizes, loading + generator.js # Source -> text preparation -> batch embedding + stores/ + sqlite-blob.js # Current O(n) cosine similarity + fts5.js # BM25 keyword search via FTS5 + search/ + semantic.js # Vector similarity search + keyword.js # FTS5 BM25 search + hybrid.js # RRF fusion of semantic + keyword + strategies/ + structured.js # Structured text preparation + source.js # Raw source preparation +``` + +The three search modes (semantic, keyword, hybrid) become composable search strategies rather than three code paths in one file. 
The store abstraction enables future pluggable backends (HNSW, DiskANN) without touching search logic. --- -## 16. Export Module Should Support Filtering and Subgraph Extraction +## 9. parser.js Is No Longer a Monolith -- Downgrade Priority -**Current state:** `export.js` exports the entire graph or nothing. DOT/Mermaid/JSON always include all nodes and edges. +**Original analysis (S1):** 2,215 lines, 9 language extractors in one file. Highest priority. -**Problem:** For a 5K-node graph, the DOT output is unusable — Graphviz chokes, Mermaid renders an incomprehensible hairball. +**Current state:** 404 lines. The native Rust engine now handles the heavy parsing. `parser.js` is a thin WASM fallback with `LANGUAGE_REGISTRY`, engine resolution, and minimal extraction. The extractors still exist but are much smaller per-language. -**Ideal enhancement:** +**Revised recommendation:** This is no longer urgent. The Rust engine already implements the plugin system concept natively. The WASM path in `parser.js` at 404 lines is manageable. If the parser ever grows again (new languages added to WASM fallback), revisit -- but for now, this is fine. -```bash -codegraph export --format dot --focus src/builder.js --depth 2 -# Exports only builder.js and its 2-hop neighborhood +--- -codegraph export --format mermaid --filter "src/api/**" --kind function -# Only functions in the api directory +## 10. The Native/WASM Abstraction -- Less Critical Now -codegraph export --format json --changed # Only files changed since last commit -``` +**Original analysis (S6):** Scattered `engine.name === 'native'` branching across multiple files. -The export module receives a subgraph specification (focus node + depth, file pattern, kind filter) and extracts the relevant subgraph before formatting. This makes visualization actually useful for real projects. +**Current state:** The native engine is the primary path. WASM is a fallback. 
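
Even as a polish item, the target shape is worth pinning down. A minimal sketch of the unified interface, in which every name (`resolveEngine`, `supports`, `parseFile`, the stub bodies) is an assumption rather than existing codegraph code:

```js
// Unified engine interface (sketch). Both backends expose the same shape,
// so callers never branch on engine.name. The stub parseFile bodies stand
// in for the real napi-rs and WASM code paths. Hypothetical names throughout.
const WASM_LANGS = new Set(['javascript', 'typescript', 'python'])

const nativeEngine = {
  name: 'native',
  supports: () => true, // the Rust engine covers every registered language
  parseFile: path => ({ path, symbols: [], engine: 'native' }) // stub
}

const wasmEngine = {
  name: 'wasm',
  supports: lang => WASM_LANGS.has(lang),
  parseFile: path => ({ path, symbols: [], engine: 'wasm' }) // stub
}

// One resolution point; everywhere else receives an opaque engine object.
export function resolveEngine({ forceWasm = false } = {}) {
  return forceWasm ? wasmEngine : nativeEngine
}
```

With a shape like this, the scattered `engine.name === 'native'` checks collapse into the single resolver above.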
The branching still exists but is less problematic because most users never hit the WASM path. The unified engine interface is still the right design but it's a polish item, not a structural problem. + +**Revised priority:** Low-Medium. Do it when touching these files for other reasons. --- -## 17. Error Handling Is Ad-Hoc — Introduce Domain Errors +## 11. Qualified Names + Hierarchical Scoping -- Still Important -**Current state:** Errors are handled inconsistently: -- Some functions throw generic `Error` -- Some return null/undefined on failure -- Some call `logger.warn()` and continue -- Some call `process.exit(1)` +**Original analysis (S13):** Flat node model with name collisions resolved by heuristics. -**Problem:** Callers can't distinguish "file not found" from "parse failed" from "DB corrupt" without inspecting error message strings. The MCP server wraps everything in try-catch and returns generic error text. +**Current state:** Unchanged. The `nodes` table still has `(name, kind, file, line)` with no scope or qualified name. The `structure.js` module added `role` classification but not scoping. With the codebase now handling more complex analysis (communities, boundaries, flow tracing), the lack of qualified names creates more ambiguity in more places. -**Ideal architecture:** +**Ideal enhancement** (unchanged): + +```sql +ALTER TABLE nodes ADD COLUMN qualified_name TEXT; -- 'DateHelper.format' +ALTER TABLE nodes ADD COLUMN scope TEXT; -- 'DateHelper' +ALTER TABLE nodes ADD COLUMN visibility TEXT; -- 'public' | 'private' | 'protected' +``` + +--- + +## 12. Domain Error Hierarchy -- More Urgent with 35 Modules + +**Original analysis (S17):** Inconsistent error handling across ~12 modules. + +**Current state:** 35 modules with inconsistent error handling. Some throw, some return null, some `logger.warn()` and continue, some `process.exit(1)`. The MCP server wraps everything in generic try-catch. 
The `check.js` module returns structured pass/fail objects but other modules don't. + +**`check.js` already demonstrates the right pattern** -- structured result objects with clear pass/fail semantics. This should be generalized: ```js // errors.js export class CodegraphError extends Error { - constructor(message, { code, file, cause } = {}) { ... } + constructor(message, { code, file, cause } = {}) { + super(message) + this.code = code + this.file = file + this.cause = cause + } } export class ParseError extends CodegraphError { code = 'PARSE_FAILED' } @@ -524,32 +404,56 @@ export class DbError extends CodegraphError { code = 'DB_ERROR' } export class ConfigError extends CodegraphError { code = 'CONFIG_INVALID' } export class ResolutionError extends CodegraphError { code = 'RESOLUTION_FAILED' } export class EngineError extends CodegraphError { code = 'ENGINE_UNAVAILABLE' } +export class AnalysisError extends CodegraphError { code = 'ANALYSIS_FAILED' } +export class BoundaryError extends CodegraphError { code = 'BOUNDARY_VIOLATION' } ``` -The CLI catches domain errors and formats them for humans. The MCP server catches them and returns structured error responses. The programmatic API lets them propagate. No more `process.exit()` from library code. - --- -## 18. The Programmatic API (index.js) Exposes Too Much +## 13. Public API Surface -- 120+ Exports Is Unsustainable -**Current state:** `index.js` re-exports ~40 functions from every module — internal helpers, data functions, presentation functions, DB utilities, everything. +**Original analysis (S18):** ~40 re-exports, no distinction between public and internal. -**Problem:** There's no distinction between public API and internal implementation. A consumer importing `buildGraph` also sees `findChild` (a tree-sitter helper) and `openDb` (internal DB function). Any refactoring risks breaking unnamed consumers. +**Current state:** 120+ exports from `index.js`. 
Every `*Data()` function, every CLI display function, every constant, every utility is exported. The public API is the entire internal surface. -**Ideal architecture — explicit public surface:** +**The problem is now 3x worse** and directly blocks any refactoring -- every internal rename could break an unnamed consumer. + +**Ideal architecture** (reinforced): ```js -// index.js — curated public API only +// index.js -- curated public API (~30 exports) +// Build export { buildGraph } from './builder.js' -export { queryFunction, impactAnalysis, fileDeps, fnDeps, diffImpact } from './analysis/index.js' -export { search, multiSearch, embedSymbols } from './embeddings/index.js' + +// Analysis (data functions only -- no CLI formatters) +export { queryNameData, impactAnalysisData, fileDepsData, fnDepsData, + fnImpactData, diffImpactData, moduleMapData, statsData, + contextData, explainData, whereData, listFunctionsData, + rolesData } from './analysis/index.js' + +// New analysis modules +export { auditData } from './commands/audit.js' +export { checkData } from './commands/check.js' +export { complexityData } from './commands/complexity.js' +export { manifestoData } from './commands/manifesto.js' +export { triageData } from './commands/triage.js' +export { flowData } from './commands/flow.js' +export { communitiesData } from './commands/communities.js' + +// Search +export { searchData, hybridSearchData, embedSymbols } from './embeddings/index.js' + +// Infrastructure export { detectCycles } from './analysis/cycles.js' export { exportGraph } from './export.js' export { startMcpServer } from './mcp/server.js' export { loadConfig } from './config.js' + +// Constants +export { SYMBOL_KINDS, ALL_SYMBOL_KINDS } from './shared/constants.js' ``` -Everything else is internal. Use `package.json` `exports` field to enforce module boundaries: +Lock it with `package.json` exports: ```json { @@ -560,35 +464,143 @@ Everything else is internal. 
Use `package.json` `exports` field to enforce modul } ``` -Consumers can only import from the documented entry points. Internal modules are truly internal. +--- + +## 14. Structure + Cochange + Communities -- Parallel Graph Models Need Unification + +**Not in original analysis** -- these modules didn't exist. + +**Current state:** Three separate analytical subsystems each build their own graph representation: + +- **`structure.js`** (668 lines): Builds directory nodes, computes cohesion/density/coupling metrics, classifies roles (entry, core, utility, adapter, leaf, dead). Has its own BFS and metrics computation. +- **`cochange.js`** (502 lines): Builds temporal coupling graph from git history. Stores in `co_changes` table with Jaccard coefficients. Independent of the dependency graph. +- **`communities.js`** (310 lines): Uses graphology to build an in-memory graph from edges, runs Louvain community detection, computes modularity and drift. + +Each constructs its own graph representation independently. There's no shared graph abstraction they all operate on. + +**Ideal architecture -- unified graph model:** + +``` +src/ + graph/ + model.js # In-memory graph representation (nodes + edges + metadata) + builders/ + dependency.js # Build from SQLite edges (imports, calls, extends) + structure.js # Build from file/directory hierarchy + temporal.js # Build from git history (co-changes) + algorithms/ + bfs.js # Breadth-first traversal (used by impact, flow, etc.) 
+ shortest-path.js # Path finding (used by path command) + tarjan.js # Cycle detection (currently in cycles.js) + louvain.js # Community detection (currently uses graphology) + centrality.js # Fan-in/fan-out, betweenness (used by triage, hotspots) + clustering.js # Cohesion, coupling, density metrics + classifiers/ + roles.js # Node role classification + risk.js # Risk scoring (currently in triage.js) +``` + +The graph model is a shared in-memory structure that multiple builders can populate and multiple algorithms can query. This eliminates the repeated graph construction across modules and makes algorithms composable -- you can run community detection on the dependency graph, the temporal graph, or a merged graph. + +--- + +## 15. Pagination Pattern Needs Standardization + +**Not in original analysis** -- paginate.js was just introduced. + +**Current state:** `paginate.js` (106 lines) provides `paginate()` and `paginateResult()` helpers plus `MCP_DEFAULTS` with per-command limits. But each module integrates pagination differently -- some pass `opts` to paginate, some manually slice arrays, some use `LIMIT/OFFSET` in SQL, some paginate in memory after fetching all results. + +**Ideal architecture:** Pagination belongs in the repository layer (SQL `LIMIT/OFFSET`) for data fetching and in the command runner for result shaping. The current pattern of fetching all data then slicing in memory doesn't scale. The repository should accept pagination parameters directly: + +```js +// In repository +findNodes(filters, { limit, offset, orderBy }) { + // Generates SQL with LIMIT/OFFSET -- never fetches more than needed +} + +// In command runner (after execute) +runner.paginate(result, 'functions', opts) // Consistent shaping for all commands +``` + +--- + +## 16. Testing -- Good Coverage, Wrong Distribution + +**Original analysis (S11):** Missing proper unit tests. + +**Current state:** 59 test files -- major improvement. 
Tests exist across: +- `tests/unit/` -- 18 files +- `tests/integration/` -- 18 files +- `tests/parsers/` -- 8 files +- `tests/engines/` -- 2 files (parity tests) +- `tests/search/` -- 3 files +- `tests/incremental/` -- 2 files + +**What's still missing:** +- Unit tests for pure graph algorithms (BFS, Tarjan) in isolation +- Unit tests for confidence scoring with various inputs +- Unit tests for the triage risk scoring formula +- Mock-based tests (the repository pattern would enable `InMemoryRepository`) +- Many "unit" tests still hit SQLite -- they're integration tests in the unit directory + +The test count is adequate. The issue is that without the repository pattern, true unit testing is impossible for most modules -- they all need a real SQLite DB. + +--- + +## 17. Event-Driven Pipeline -- Still Relevant for Scale + +**Original analysis (S7):** Batch pipeline with no progress reporting. + +**Current state:** Still batch. The `change-journal.js` module adds NDJSON event logging for watch mode, which is a step toward events -- but the build pipeline itself is still synchronous batch. For repos with 10K+ files, users still see no progress during builds. + +**Ideal architecture** (unchanged, lower priority than structural issues): + +```js +pipeline.on('file:parsed', (file, symbols) => { /* progress */ }) +pipeline.on('file:indexed', (file, nodeCount) => { /* progress */ }) +pipeline.on('build:complete', (stats) => { /* summary */ }) +await pipeline.run(rootDir) +``` + +--- + +## Remaining Items (Unchanged from Original) + +- **Config profiles (S8):** Single flat config, no monorepo profiles. Still relevant but not blocking anything. +- **Transitive import-aware confidence (S9):** Walk import graph before falling back to proximity heuristics. Targeted algorithmic improvement. +- **Query result caching (S14):** LRU/TTL cache between analysis and repository. More valuable now with 25 MCP tools. +- **Subgraph export filtering (S16):** Export the full graph or nothing. 
Still relevant for usability. --- -## Summary — Priority Ordering by Architectural Impact - -| # | Change | Impact | Category | -|---|--------|--------|----------| -| 1 | Split parser.js into plugin system | High | Modularity | -| 2 | Repository pattern for data access | High | Testability, maintainability | -| 3 | Separate analysis / formatting layers | High | Separation of concerns | -| 4 | Pipeline architecture for builder | High | Testability, reuse | -| 6 | Unified engine interface (Strategy) | Medium-High | Abstraction | -| 5 | Embedder as standalone subsystem | Medium | Extensibility | -| 13 | Qualified names + scoping in graph model | Medium | Data model accuracy | -| 7 | Event-driven pipeline for streaming | Medium | Scalability, UX | -| 10 | Composable MCP tool registry | Medium | Extensibility | -| 12 | CLI command objects | Medium | Maintainability | -| 17 | Domain error hierarchy | Medium | Reliability | -| 18 | Curated public API surface | Medium | API stability | -| 11 | Testing pyramid with proper layers | Medium | Quality | -| 16 | Subgraph export with filtering | Low-Medium | Usability | -| 9 | Transitive import-aware confidence | Low-Medium | Accuracy | -| 14 | Query result caching | Low | Performance | -| 8 | Config profiles for monorepos | Low | Feature | -| 15 | Unify watcher/builder code paths | Low | Falls out of #4 (partial: `readFileSafe` shared) | - -Items 1–4 and 6 are foundational — they restructure the core and everything else becomes easier after them. Items 13 and 7 are the most impactful feature-level changes. Items 14–15 are natural consequences of earlier changes. 
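
Of the remaining items, the S14 query cache is concrete enough to sketch before the summary: an LRU with TTL sitting between the analysis layer and the repository. Everything below (`QueryCache`, the option names) is illustrative, not existing code:

```js
// S14 sketch: LRU + TTL cache in front of repository reads. Hypothetical
// class -- invalidate() would be called after any DB mutation (rebuilds,
// watch-mode updates) so cached MCP query results stay consistent.
export class QueryCache {
  constructor({ maxEntries = 500, ttlMs = 30_000 } = {}) {
    this.maxEntries = maxEntries
    this.ttlMs = ttlMs
    this.entries = new Map() // Map insertion order doubles as LRU order
  }

  get(key, compute) {
    const hit = this.entries.get(key)
    if (hit && hit.expires > Date.now()) {
      this.entries.delete(key) // re-insert to mark as most recently used
      this.entries.set(key, hit)
      return hit.value
    }
    const value = compute()
    this.entries.delete(key)
    this.entries.set(key, { value, expires: Date.now() + this.ttlMs })
    if (this.entries.size > this.maxEntries) {
      this.entries.delete(this.entries.keys().next().value) // evict oldest
    }
    return value
  }

  invalidate() { this.entries.clear() }
}
```

An agent session that repeatedly asks the 25 MCP tools about the same symbols would hit this cache instead of re-running SQL; the cost is one `invalidate()` call wired into the build and watch paths.

---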
+## Revised Summary -- Priority Ordering by Architectural Impact
+
+| # | Change | Impact | Category | Original # |
+|---|--------|--------|----------|------------|
+| **1** | **Command/Query separation -- eliminate dual-function pattern across 15 modules** | **Critical** | Separation of concerns | S3 (was High) |
+| **2** | **Repository pattern for data access -- SQL in 20+ modules** | **Critical** | Testability, maintainability | S2 (was High) |
+| **3** | **Decompose queries.js (3,110 lines) into analysis modules** | **Critical** | Modularity | S3 (was High) |
+| **4** | **Composable MCP tool registry (25 tools in 1,212 lines)** | **High** | Extensibility | S10 (was Medium) |
+| **5** | **CLI command objects (45 commands in 1,285 lines)** | **High** | Maintainability | S12 (was Medium) |
+| **6** | **Curated public API surface (120+ to ~30 exports)** | **High** | API stability | S18 (was Medium) |
+| **7** | **Domain error hierarchy (35 modules, inconsistent handling)** | **High** | Reliability | S17 (was Medium) |
+| **8** | **Decompose complexity.js (2,163 lines) into rules/engine** | **High** | Modularity | New |
+| **9** | **Builder pipeline architecture (1,173 lines)** | **High** | Testability, reuse | S4 (was High) |
+| **10** | **Embedder subsystem (1,113 lines, 3 search engines)** | **Medium-High** | Extensibility | S5 (was Medium) |
+| **11** | **Unified graph model for structure/cochange/communities** | **Medium-High** | Cohesion | New |
+| **12** | **Qualified names + hierarchical scoping** | **Medium** | Data model accuracy | S13 (unchanged) |
+| **13** | **Pagination standardization (SQL-level + command runner)** | **Medium** | Consistency | New |
+| **14** | **Testing pyramid with InMemoryRepository** | **Medium** | Quality | S11 (unchanged) |
+| **15** | **Event-driven pipeline for streaming** | **Medium** | Scalability, UX | S7 (unchanged) |
+| **16** | **Query result caching (25 MCP tools)** | **Low-Medium** | Performance | S14 (was Low) |
+| 
**17** | **Unified engine interface (Strategy)** | **Low-Medium** | Abstraction | S6 (was Medium-High) |
+| **18** | **Subgraph export with filtering** | **Low-Medium** | Usability | S16 (unchanged) |
+| **19** | **Transitive import-aware confidence** | **Low** | Accuracy | S9 (was Low-Medium) |
+| **20** | **Parser plugin system** | **Low** | Modularity | S1 (was High -- parser.js shrank to 404 lines) |
+| **21** | **Config profiles for monorepos** | **Low** | Feature | S8 (unchanged) |
+
+**The structural priority shifted.** In the original analysis, the parser monolith was #1 -- it's now #20 because the native engine solved it. The new #1 is the command/query separation: the dual-function anti-pattern replicated across 15 modules is the single biggest source of code duplication and coupling in the codebase. Items 1-3 are the foundation -- they restructure the core and everything else becomes easier. Items 4-7 are high-impact but can be done in parallel. Items 8-10 are large-file decompositions that follow naturally once the shared infrastructure exists.
 
 ---
 
-*Generated 2026-02-22. Cold architectural analysis — no implementation constraints applied.*
+*Revised 2026-03-02. Cold architectural analysis -- no implementation constraints applied.*