24 changes: 22 additions & 2 deletions .claude/skills/dogfood/SKILL.md
@@ -160,7 +160,7 @@ Test that incremental rebuilds, full rebuilds, and cross-feature state remain co

## Phase 4b — Performance Benchmarks

Run all four benchmark scripts from the codegraph source repo (not the temp install dir) and record results. These detect performance regressions between releases.
Run all four benchmark scripts from the codegraph source repo and record results. These detect performance regressions between releases.

| Benchmark | Script | What it measures | When it matters |
|-----------|--------|-----------------|-----------------|
@@ -169,6 +169,25 @@ Run all four benchmark scripts from the codegraph source repo (not the temp inst
| Query | `node scripts/query-benchmark.js` | Query depth scaling, diff-impact latency | Always |
| Embedding | `node scripts/embedding-benchmark.js` | Search recall (Hit@1/3/5/10) across models | Always |

### Pre-flight: verify native binary version

**Before running any benchmarks**, confirm the native binary in the source repo matches the version being dogfooded. A stale binary will produce misleading results (e.g., the Rust engine may lack features added in the release, causing silent WASM fallback during the complexity phase).

```bash
# In the codegraph source repo — adjust the platform package name as needed:
node -e "const r=require('module').createRequire(require('url').pathToFileURL(__filename).href); const pkg=r.resolve('@optave/codegraph-win32-x64-msvc/package.json'); const p=require(pkg); console.log('Native binary version:', p.version)"
```

If the version does **not** match `$ARGUMENTS`:
1. Update `optionalDependencies` in `package.json` to pin all `@optave/codegraph-*` packages to `$ARGUMENTS`.
2. Run `npm install` to fetch the correct binaries.
3. Verify with `npx codegraph info` that the native engine loads at the correct version.
4. Revert the `package.json` / `package-lock.json` changes after benchmarking (do not commit them on the fix branch).

**Why this matters:** The native engine computes complexity metrics during the Rust parse phase. If the binary is from an older release that lacks this, the complexity phase silently falls back to WASM — inflating native complexity time by 50-100x and making native appear slower than WASM.

### Running benchmarks

1. Run all four from the codegraph source repo directory.
2. Record the JSON output from each.
3. Compare with the previous release's numbers in `generated/BUILD-BENCHMARKS.md` (build benchmark) and previous dogfood reports.
@@ -177,7 +196,8 @@ Run all four benchmark scripts from the codegraph source repo (not the temp inst
- Query latency >2x slower → investigate
- Embedding recall (Hit@5) drops by >2% → investigate
- Incremental no-op >10ms → investigate
5. Include a **Performance Benchmarks** section in the report with tables for each benchmark.
5. **Sanity-check the complexity phase:** If native `complexityMs` is higher than WASM `complexityMs`, the native binary is likely stale — go back to the pre-flight step.
6. Include a **Performance Benchmarks** section in the report with tables for each benchmark.
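
The thresholds in the steps above can be expressed as a single check over the recorded JSON. This is a sketch only: the field names (`queryP50Ms`, `hitAt5`, `incrementalNoopMs`, `complexityMs`) are assumptions, not the benchmarks' actual output shape, so map them onto the real JSON before use:

```javascript
// Sketch: flag regressions per the investigate thresholds above.
// All field names are hypothetical placeholders for the benchmark JSON.
function findRegressions(prev, curr) {
  const issues = [];
  if (curr.queryP50Ms > prev.queryP50Ms * 2) {
    issues.push('query latency >2x slower');
  }
  if (prev.hitAt5 - curr.hitAt5 > 0.02) {
    issues.push('embedding recall (Hit@5) dropped by >2%');
  }
  if (curr.incrementalNoopMs > 10) {
    issues.push('incremental no-op >10ms');
  }
  // Sanity check: native slower than WASM on complexity suggests a stale binary.
  if (curr.native && curr.wasm &&
      curr.native.complexityMs > curr.wasm.complexityMs) {
    issues.push('native complexityMs exceeds WASM: binary likely stale');
  }
  return issues;
}
```

An empty return means the release clears every threshold; anything else goes into the report's investigation notes.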

**Note:** The native engine may not be available in the dev repo (no prebuilt binary in `node_modules`). Record WASM results at minimum. If native is available, record both.

12 changes: 6 additions & 6 deletions FOUNDATION.md
@@ -64,15 +64,15 @@ This dual-mode approach is unique in the competitive landscape. Competitors eith

*Test: does every core command (`build`, `query`, `fn`, `deps`, `impact`, `diff-impact`, `cycles`, `map`) work with zero API keys? Are LLM features additive, never blocking?*

### 5. Embeddable first, CLI second
### 5. Functional CLI, embeddable API

**Codegraph is a library that happens to have a CLI, not the other way around.**
**Codegraph is a CLI tool and MCP server that delivers code intelligence directly.**

Every capability is available through the programmatic API (`src/index.js`). The CLI (`src/cli.js`) and MCP server (`src/mcp.js`) are thin wrappers. This means codegraph can be imported into VS Code extensions, Electron apps, CI pipelines, other MCP servers, and any JavaScript tooling.
The CLI (`src/cli.js`) and MCP server (`src/mcp.js`) are the primary interfaces — the things we ship and the way people use codegraph. Every capability is also available through the programmatic API (`src/index.js`), so codegraph can be imported into VS Code extensions, CI pipelines, other MCP servers, and any JavaScript tooling.

Most competitors are CLI-first or server-first. We are library-first. The API surface is the product; the CLI is a convenience.
Most competitors are either library-only (requiring integration work) or server-only (requiring infrastructure). Codegraph works out of the box as a CLI, serves AI agents via MCP, and can be embedded when needed.

*Test: can another npm package `import { buildGraph, queryFunction } from '@optave/codegraph'` and use the full feature set programmatically?*
*Test: does every feature work from the CLI with zero integration effort? Can another npm package also `import { buildGraph, queryFunction } from '@optave/codegraph'` and use the full feature set programmatically?*

### 6. One registry, one schema, no magic

@@ -126,7 +126,7 @@ Staying in our lane means we can be embedded inside IDEs, AI agents, CI pipeline
- Cloud API calls in the core pipeline — violates Principle 1 (the graph must always rebuild in under a second) and Principle 4 (zero-cost core)
- AI-powered code generation or editing — violates Principle 8
- Multi-agent orchestration — violates Principle 8
- Native desktop GUI — outside our lane; we're a library
- Native desktop GUI — outside our lane; we're a CLI tool and engine, not a desktop app
- Features that require non-npm dependencies — keeps deployment simple

---