diff --git a/.github/workflows/claude-code-review.yml b/.github/workflows/claude-code-review.yml
index 00648268..047dd6c3 100644
--- a/.github/workflows/claude-code-review.yml
+++ b/.github/workflows/claude-code-review.yml
@@ -39,6 +39,7 @@ jobs:
           plugin_marketplaces: 'https://github.com/anthropics/claude-code.git'
           plugins: 'code-review@claude-code-plugins'
           prompt: '/code-review:code-review ${{ github.repository }}/pull/${{ github.event.pull_request.number }}'
+          allowed_tools: 'Bash(gh pr *),Bash(gh api *),Bash(git diff *),Bash(git log *),Read,Glob,Grep'
           # See https://github.com/anthropics/claude-code-action/blob/main/docs/usage.md
           # or https://code.claude.com/docs/en/cli-reference for available options
 
diff --git a/CLAUDE.md b/CLAUDE.md
index 5b15d0e9..da581aa6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -49,7 +49,7 @@ JS source is plain JavaScript (ES modules) in `src/`. No transpilation step. The
 | `cycles.js` | Circular dependency detection |
 | `export.js` | DOT/Mermaid/JSON graph export |
 | `watcher.js` | Watch mode for incremental rebuilds |
-| `config.js` | `.codegraphrc.json` loading |
+| `config.js` | `.codegraphrc.json` loading, env overrides, `apiKeyCommand` secret resolution |
 | `constants.js` | `EXTENSIONS` (derived from parser registry) and `IGNORE_DIRS` constants |
 | `native.js` | Native napi-rs addon loader with WASM fallback |
 | `resolve.js` | Import resolution (supports native batch mode) |
@@ -64,6 +64,7 @@ JS source is plain JavaScript (ES modules) in `src/`. No transpilation step. The
 - Non-required parsers (all except JS/TS/TSX) fail gracefully if their WASM grammar is unavailable
 - Import resolution uses a 6-level priority system with confidence scoring (import-aware → same-file → directory → parent → global → method hierarchy)
 - Incremental builds track file hashes in the DB to skip unchanged files
+- **Credential resolution:** `loadConfig` pipeline is `mergeConfig → applyEnvOverrides → resolveSecrets`. The `apiKeyCommand` config field shells out to an external secret manager via `execFileSync` (no shell). Priority: command output > env var > file config > defaults. On failure, warns and falls back gracefully.
 
 **Database:** SQLite at `.codegraph/graph.db` with tables: `nodes`, `edges`, `metadata`, `embeddings`
 
diff --git a/README.md b/README.md
index f645db3b..7cabe9dc 100644
--- a/README.md
+++ b/README.md
@@ -366,6 +366,7 @@ See **[docs/recommended-practices.md](docs/recommended-practices.md)** for integ
 - **CI/CD** — PR impact comments, threshold gates, graph caching
 - **AI agents** — MCP server, CLAUDE.md templates, Claude Code hooks
 - **Developer workflow** — watch mode, explore-before-you-edit, semantic search
+- **Secure credentials** — `apiKeyCommand` with 1Password, Bitwarden, Vault, macOS Keychain, `pass`
 
 ## 🔁 CI / GitHub Actions
 
@@ -395,6 +396,23 @@ Create a `.codegraphrc.json` in your project root to customize behavior:
 }
 ```
 
+### LLM credentials
+
+Codegraph supports an `apiKeyCommand` field for secure credential management. Instead of storing API keys in config files or environment variables, you can shell out to a secret manager at runtime:
+
+```json
+{
+  "llm": {
+    "provider": "openai",
+    "apiKeyCommand": "op read op://vault/openai/api-key"
+  }
+}
+```
+
+The command is split on whitespace and executed with `execFileSync` (no shell injection risk). Priority: **command output > `CODEGRAPH_LLM_API_KEY` env var > file config**. On failure, codegraph warns and falls back to the next source.
+
+Works with any secret manager: 1Password CLI (`op`), Bitwarden (`bw`), `pass`, HashiCorp Vault, macOS Keychain (`security`), AWS Secrets Manager, etc.
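The whitespace split described above is worth seeing in isolation. A minimal Node.js sketch (the command strings are illustrative, mirroring the README examples; this is not the codegraph module itself) of how a naive `/\s+/` split behaves, including the caveat that shell-style quoting is not parsed:

```javascript
// Mirrors the split-then-execFileSync approach described above (sketch only).
const cmd = "op read op://vault/openai/api-key";
const [executable, ...args] = cmd.trim().split(/\s+/);
console.log(executable); // "op"
console.log(args);       // ["read", "op://vault/openai/api-key"]

// Caveat of a naive whitespace split: quotes are not interpreted,
// so an argument containing spaces is broken into separate tokens.
const quoted = "security find-generic-password -s 'my service' -w";
console.log(quoted.trim().split(/\s+/));
// ["security", "find-generic-password", "-s", "'my", "service'", "-w"]
```

Because arguments are passed to `execFileSync` as an array, metacharacters in the command string are never interpreted by a shell; the trade-off is that arguments containing whitespace cannot be expressed.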
+
 ## 📖 Programmatic API
 
 Codegraph also exports a full API for use in your own tools:
 
diff --git a/docs/recommended-practices.md b/docs/recommended-practices.md
index afd9bce7..7825d5cd 100644
--- a/docs/recommended-practices.md
+++ b/docs/recommended-practices.md
@@ -332,6 +332,91 @@ codegraph search "catch exception; format error response; report failure to clie
 
 ---
 
+## Secure Credential Management
+
+Codegraph's LLM features (semantic search with LLM-generated descriptions, future `codegraph ask`) require an API key. Use `apiKeyCommand` to fetch it from a secret manager at runtime instead of hardcoding it in config files or leaking it through environment variables.
+
+### Why not environment variables?
+
+Environment variables are better than plaintext in config files, but they still leak via `ps e`, `/proc/<pid>/environ`, child processes, shell history, and CI logs. `apiKeyCommand` keeps the secret in your vault and only materializes it in process memory for the duration of the call.
+
+### Examples
+
+**1Password CLI:**
+
+```json
+{
+  "llm": {
+    "provider": "openai",
+    "apiKeyCommand": "op read op://Development/openai/api-key"
+  }
+}
+```
+
+**Bitwarden CLI:**
+
+```json
+{
+  "llm": {
+    "provider": "anthropic",
+    "apiKeyCommand": "bw get password anthropic-api-key"
+  }
+}
+```
+
+**macOS Keychain:**
+
+```json
+{
+  "llm": {
+    "provider": "openai",
+    "apiKeyCommand": "security find-generic-password -s codegraph-llm -w"
+  }
+}
+```
+
+**HashiCorp Vault:**
+
+```json
+{
+  "llm": {
+    "provider": "openai",
+    "apiKeyCommand": "vault kv get -field=api_key secret/codegraph/openai"
+  }
+}
+```
+
+**`pass` (GPG-encrypted):**
+
+```json
+{
+  "llm": {
+    "provider": "openai",
+    "apiKeyCommand": "pass show codegraph/openai-key"
+  }
+}
+```
+
+### Priority chain
+
+The resolution order is:
+
+1. **`apiKeyCommand`** output (highest priority)
+2. **`CODEGRAPH_LLM_API_KEY`** environment variable
+3. **`llm.apiKey`** in config file
+4. **`null`** (default)
+
+If the command fails (timeout, not found, non-zero exit), codegraph logs a warning and falls back to the next available source. The command has a 10-second timeout.
+
+### Security notes
+
+- The command is split on whitespace and executed with `execFileSync` (array args, no shell) — no shell injection risk
+- stdout is captured; stderr is discarded
+- The resolved key is held only in process memory, never written to disk
+- Keep `.codegraphrc.json` out of version control if it contains `apiKeyCommand` paths specific to your vault layout, or use a shared command that works across the team
+
+---
+
 ## .gitignore
 
 Add the codegraph database to `.gitignore` — it's a build artifact:
 
diff --git a/src/config.js b/src/config.js
index cda99c07..5294d25e 100644
--- a/src/config.js
+++ b/src/config.js
@@ -1,6 +1,7 @@
+import { execFileSync } from 'node:child_process';
 import fs from 'node:fs';
 import path from 'node:path';
-import { debug } from './logger.js';
+import { debug, warn } from './logger.js';
 
 export const CONFIG_FILES = ['.codegraphrc.json', '.codegraphrc', 'codegraph.config.json'];
 
@@ -18,6 +19,10 @@ export const DEFAULTS = {
     defaultDepth: 3,
     defaultLimit: 20,
   },
+  embeddings: { model: 'minilm', llmProvider: null },
+  llm: { provider: null, model: null, baseUrl: null, apiKey: null, apiKeyCommand: null },
+  search: { defaultMinScore: 0.2, rrfK: 60, topK: 15 },
+  ci: { failOnCycles: false, impactThreshold: null },
 };
 
 /**
@@ -33,13 +38,50 @@ export function loadConfig(cwd) {
         const raw = fs.readFileSync(filePath, 'utf-8');
         const config = JSON.parse(raw);
         debug(`Loaded config from ${filePath}`);
-        return mergeConfig(DEFAULTS, config);
+        return resolveSecrets(applyEnvOverrides(mergeConfig(DEFAULTS, config)));
       } catch (err) {
         debug(`Failed to parse config ${filePath}: ${err.message}`);
       }
     }
   }
-  return { ...DEFAULTS };
+  return resolveSecrets(applyEnvOverrides({ ...DEFAULTS }));
+}
+
+const ENV_LLM_MAP = {
+  CODEGRAPH_LLM_PROVIDER: 'provider',
+  CODEGRAPH_LLM_API_KEY: 'apiKey',
+  CODEGRAPH_LLM_MODEL: 'model',
+};
+
+export function applyEnvOverrides(config) {
+  for (const [envKey, field] of Object.entries(ENV_LLM_MAP)) {
+    if (process.env[envKey] !== undefined) {
+      config.llm[field] = process.env[envKey];
+    }
+  }
+  return config;
+}
+
+export function resolveSecrets(config) {
+  const cmd = config.llm.apiKeyCommand;
+  if (typeof cmd !== 'string' || cmd.trim() === '') return config;
+
+  const parts = cmd.trim().split(/\s+/);
+  const [executable, ...args] = parts;
+  try {
+    const result = execFileSync(executable, args, {
+      encoding: 'utf-8',
+      timeout: 10_000,
+      maxBuffer: 64 * 1024,
+      stdio: ['ignore', 'pipe', 'pipe'],
+    }).trim();
+    if (result) {
+      config.llm.apiKey = result;
+    }
+  } catch (err) {
+    warn(`apiKeyCommand failed: ${err.message}`);
+  }
+  return config;
 }
 
 function mergeConfig(defaults, overrides) {
diff --git a/tests/unit/config.test.js b/tests/unit/config.test.js
index 4512f8f7..c005e6cb 100644
--- a/tests/unit/config.test.js
+++ b/tests/unit/config.test.js
@@ -5,8 +5,22 @@
 import fs from 'node:fs';
 import os from 'node:os';
 import path from 'node:path';
-import { afterAll, beforeAll, describe, expect, it } from 'vitest';
-import { CONFIG_FILES, DEFAULTS, loadConfig } from '../../src/config.js';
+import { afterAll, afterEach, beforeAll, describe, expect, it, vi } from 'vitest';
+import {
+  applyEnvOverrides,
+  CONFIG_FILES,
+  DEFAULTS,
+  loadConfig,
+  resolveSecrets,
+} from '../../src/config.js';
+
+vi.mock('node:child_process', async (importOriginal) => {
+  const actual = await importOriginal();
+  return {
+    ...actual,
+    execFileSync: vi.fn(actual.execFileSync),
+  };
+});
 
 let tmpDir;
 
@@ -39,6 +53,28 @@
     expect(DEFAULTS.build).toHaveProperty('incremental', true);
     expect(DEFAULTS.query).toHaveProperty('defaultDepth', 3);
   });
+
+  it('has embeddings defaults', () => {
+    expect(DEFAULTS.embeddings).toEqual({ model: 'minilm', llmProvider: null });
+  });
+
+  it('has llm defaults', () => {
+    expect(DEFAULTS.llm).toEqual({
+      provider: null,
+      model: null,
+      baseUrl: null,
+      apiKey: null,
+      apiKeyCommand: null,
+    });
+  });
+
+  it('has search defaults', () => {
+    expect(DEFAULTS.search).toEqual({ defaultMinScore: 0.2, rrfK: 60, topK: 15 });
+  });
+
+  it('has ci defaults', () => {
+    expect(DEFAULTS.ci).toEqual({ failOnCycles: false, impactThreshold: null });
+  });
 });
 
 describe('loadConfig', () => {
@@ -115,4 +151,248 @@
     const config = loadConfig(dir);
     expect(config.include).toEqual(['src/**']);
   });
+
+  it('deep-merges new search section with defaults', () => {
+    const dir = fs.mkdtempSync(path.join(tmpDir, 'search-merge-'));
+    fs.writeFileSync(path.join(dir, '.codegraphrc.json'), JSON.stringify({ search: { topK: 50 } }));
+    const config = loadConfig(dir);
+    expect(config.search.topK).toBe(50);
+    expect(config.search.defaultMinScore).toBe(0.2);
+    expect(config.search.rrfK).toBe(60);
+  });
+});
+
+describe('applyEnvOverrides', () => {
+  const ENV_KEYS = ['CODEGRAPH_LLM_PROVIDER', 'CODEGRAPH_LLM_API_KEY', 'CODEGRAPH_LLM_MODEL'];
+
+  afterEach(() => {
+    for (const key of ENV_KEYS) {
+      delete process.env[key];
+    }
+  });
+
+  it('overrides llm.provider from env', () => {
+    process.env.CODEGRAPH_LLM_PROVIDER = 'anthropic';
+    const config = applyEnvOverrides({
+      llm: { provider: null, model: null, baseUrl: null, apiKey: null },
+    });
+    expect(config.llm.provider).toBe('anthropic');
+  });
+
+  it('overrides llm.apiKey from env', () => {
+    process.env.CODEGRAPH_LLM_API_KEY = 'sk-test-123';
+    const config = applyEnvOverrides({
+      llm: { provider: null, model: null, baseUrl: null, apiKey: null },
+    });
+    expect(config.llm.apiKey).toBe('sk-test-123');
+  });
+
+  it('overrides llm.model from env', () => {
+    process.env.CODEGRAPH_LLM_MODEL = 'gpt-4';
+    const config = applyEnvOverrides({
+      llm: { provider: null, model: null, baseUrl: null, apiKey: null },
+    });
+    expect(config.llm.model).toBe('gpt-4');
+  });
+
+  it('env vars take priority over file config', () => {
+    process.env.CODEGRAPH_LLM_PROVIDER = 'anthropic';
+    const dir = fs.mkdtempSync(path.join(tmpDir, 'env-priority-'));
+    fs.writeFileSync(
+      path.join(dir, '.codegraphrc.json'),
+      JSON.stringify({ llm: { provider: 'openai' } }),
+    );
+    const config = loadConfig(dir);
+    expect(config.llm.provider).toBe('anthropic');
+  });
+
+  it('leaves file config intact when env vars are not set', () => {
+    const dir = fs.mkdtempSync(path.join(tmpDir, 'env-absent-'));
+    fs.writeFileSync(
+      path.join(dir, '.codegraphrc.json'),
+      JSON.stringify({ llm: { provider: 'openai' } }),
+    );
+    const config = loadConfig(dir);
+    expect(config.llm.provider).toBe('openai');
+  });
+});
+
+describe('resolveSecrets', () => {
+  let mockExecFile;
+
+  beforeAll(async () => {
+    const cp = await import('node:child_process');
+    mockExecFile = cp.execFileSync;
+  });
+
+  afterEach(() => {
+    mockExecFile.mockReset();
+  });
+
+  it('resolves apiKey from command', () => {
+    mockExecFile.mockReturnValue('secret-key-123');
+    const config = {
+      llm: {
+        provider: null,
+        model: null,
+        baseUrl: null,
+        apiKey: null,
+        apiKeyCommand: 'op read secret/key',
+      },
+    };
+    resolveSecrets(config);
+    expect(mockExecFile).toHaveBeenCalledWith('op', ['read', 'secret/key'], {
+      encoding: 'utf-8',
+      timeout: 10_000,
+      maxBuffer: 64 * 1024,
+      stdio: ['ignore', 'pipe', 'pipe'],
+    });
+    expect(config.llm.apiKey).toBe('secret-key-123');
+  });
+
+  it('trims whitespace from command output', () => {
+    mockExecFile.mockReturnValue(' secret-key \n');
+    const config = {
+      llm: {
+        provider: null,
+        model: null,
+        baseUrl: null,
+        apiKey: null,
+        apiKeyCommand: 'cat keyfile',
+      },
+    };
+    resolveSecrets(config);
+    expect(config.llm.apiKey).toBe('secret-key');
+  });
+
+  it('skips when apiKeyCommand is null', () => {
+    const config = {
+      llm: { provider: null, model: null, baseUrl: null, apiKey: 'existing', apiKeyCommand: null },
+    };
+    resolveSecrets(config);
+    expect(mockExecFile).not.toHaveBeenCalled();
+    expect(config.llm.apiKey).toBe('existing');
+  });
+
+  it('skips when apiKeyCommand is not a string', () => {
+    const config = {
+      llm: { provider: null, model: null, baseUrl: null, apiKey: 'existing', apiKeyCommand: 42 },
+    };
+    resolveSecrets(config);
+    expect(mockExecFile).not.toHaveBeenCalled();
+    expect(config.llm.apiKey).toBe('existing');
+  });
+
+  it('warns and preserves existing apiKey on command failure', () => {
+    mockExecFile.mockImplementation(() => {
+      throw new Error('command not found');
+    });
+    const stderrSpy = vi.spyOn(process.stderr, 'write').mockImplementation(() => true);
+    const config = {
+      llm: {
+        provider: null,
+        model: null,
+        baseUrl: null,
+        apiKey: 'keep-me',
+        apiKeyCommand: 'bad-cmd',
+      },
+    };
+    resolveSecrets(config);
+    expect(config.llm.apiKey).toBe('keep-me');
+    expect(stderrSpy).toHaveBeenCalledWith(expect.stringContaining('apiKeyCommand failed'));
+    stderrSpy.mockRestore();
+  });
+
+  it('does not overwrite apiKey when command returns empty output', () => {
+    mockExecFile.mockReturnValue(' \n');
+    const config = {
+      llm: {
+        provider: null,
+        model: null,
+        baseUrl: null,
+        apiKey: 'original',
+        apiKeyCommand: 'echo ""',
+      },
+    };
+    resolveSecrets(config);
+    expect(config.llm.apiKey).toBe('original');
+  });
+
+  it('handles empty string command gracefully', () => {
+    const config = {
+      llm: { provider: null, model: null, baseUrl: null, apiKey: 'existing', apiKeyCommand: ' ' },
+    };
+    resolveSecrets(config);
+    expect(mockExecFile).not.toHaveBeenCalled();
+    expect(config.llm.apiKey).toBe('existing');
+  });
+});
+
+describe('apiKeyCommand integration', () => {
+  const ENV_KEYS = ['CODEGRAPH_LLM_PROVIDER', 'CODEGRAPH_LLM_API_KEY', 'CODEGRAPH_LLM_MODEL'];
+  let mockExecFile;
+
+  beforeAll(async () => {
+    const cp = await import('node:child_process');
+    mockExecFile = cp.execFileSync;
+  });
+
+  afterEach(() => {
+    mockExecFile.mockReset();
+    for (const key of ENV_KEYS) {
+      delete process.env[key];
+    }
+  });
+
+  it('command output beats file apiKey', () => {
+    mockExecFile.mockReturnValue('command-key');
+    const dir = fs.mkdtempSync(path.join(tmpDir, 'cmd-file-'));
+    fs.writeFileSync(
+      path.join(dir, '.codegraphrc.json'),
+      JSON.stringify({ llm: { apiKey: 'file-key', apiKeyCommand: 'vault get key' } }),
+    );
+    const config = loadConfig(dir);
+    expect(config.llm.apiKey).toBe('command-key');
+  });
+
+  it('command output beats env var', () => {
+    process.env.CODEGRAPH_LLM_API_KEY = 'env-key';
+    mockExecFile.mockReturnValue('command-key');
+    const dir = fs.mkdtempSync(path.join(tmpDir, 'cmd-env-'));
+    fs.writeFileSync(
+      path.join(dir, '.codegraphrc.json'),
+      JSON.stringify({ llm: { apiKeyCommand: 'vault get key' } }),
+    );
+    const config = loadConfig(dir);
+    expect(config.llm.apiKey).toBe('command-key');
+  });
+
+  it('env var still works when no apiKeyCommand is set', () => {
+    process.env.CODEGRAPH_LLM_API_KEY = 'env-key';
+    const dir = fs.mkdtempSync(path.join(tmpDir, 'env-no-cmd-'));
+    fs.writeFileSync(
+      path.join(dir, '.codegraphrc.json'),
+      JSON.stringify({ llm: { provider: 'openai' } }),
+    );
+    const config = loadConfig(dir);
+    expect(config.llm.apiKey).toBe('env-key');
+    expect(mockExecFile).not.toHaveBeenCalled();
+  });
+
+  it('graceful failure falls back to env var value', () => {
+    process.env.CODEGRAPH_LLM_API_KEY = 'env-fallback';
+    mockExecFile.mockImplementation(() => {
+      throw new Error('timeout');
+    });
+    const stderrSpy = vi.spyOn(process.stderr, 'write').mockImplementation(() => true);
+    const dir = fs.mkdtempSync(path.join(tmpDir, 'cmd-fail-'));
+    fs.writeFileSync(
+      path.join(dir, '.codegraphrc.json'),
+      JSON.stringify({ llm: { apiKeyCommand: 'vault get key' } }),
+    );
+    const config = loadConfig(dir);
+    expect(config.llm.apiKey).toBe('env-fallback');
+    expect(stderrSpy).toHaveBeenCalledWith(expect.stringContaining('apiKeyCommand failed'));
+    stderrSpy.mockRestore();
+  });
 });
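The full priority chain exercised by the tests above can also be demonstrated standalone. The sketch below is an illustrative re-implementation for clarity, not the actual codegraph module: `resolveApiKey` is a hypothetical name, and `echo` stands in for a real secret manager (so the example only runs where `echo` exists as an executable, e.g. POSIX systems):

```javascript
import { execFileSync } from 'node:child_process';

// Illustrative re-implementation of the documented resolution order:
// command output > CODEGRAPH_LLM_API_KEY env var > file config > null.
function resolveApiKey(llm) {
  let key = llm.apiKey ?? null;                      // 3. file config (else 4. null)
  if (process.env.CODEGRAPH_LLM_API_KEY !== undefined) {
    key = process.env.CODEGRAPH_LLM_API_KEY;         // 2. env var overrides file config
  }
  const cmd = llm.apiKeyCommand;
  if (typeof cmd === 'string' && cmd.trim() !== '') {
    const [executable, ...args] = cmd.trim().split(/\s+/); // naive whitespace split
    try {
      const out = execFileSync(executable, args, { encoding: 'utf-8', timeout: 10_000 }).trim();
      if (out) key = out;                            // 1. command output wins
    } catch {
      // warn-and-fall-back behavior: keep whichever source already resolved
    }
  }
  return key;
}

// `echo` stands in for a real secret manager here.
console.log(resolveApiKey({ apiKey: 'file-key', apiKeyCommand: 'echo cmd-key' }));
console.log(resolveApiKey({ apiKey: 'file-key', apiKeyCommand: 'no-such-cmd-xyz' }));
```

Assuming `CODEGRAPH_LLM_API_KEY` is unset, the first call resolves from the command and the second falls back to the file value, matching the "graceful failure" tests in the diff.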