Skip to content

feat(config): add apiKeyCommand for secure credential resolution#12

Merged
carlos-alm merged 2 commits intomainfrom
feat/mcp-complete-server
Feb 22, 2026
Merged

feat(config): add apiKeyCommand for secure credential resolution#12
carlos-alm merged 2 commits intomainfrom
feat/mcp-complete-server

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Add apiKeyCommand config field that shells out to external secret managers (1Password, Bitwarden, Vault, pass, macOS Keychain) at runtime to resolve llm.apiKey
  • New resolveSecrets() function using execFileSync with array args (no shell injection risk), 10s timeout, graceful fallback on failure
  • Priority chain: command output > env var > file config > defaults
  • 11 new tests (7 unit + 4 integration) covering resolution, trimming, skip conditions, failure fallback
  • Documentation in README, CLAUDE.md, and docs/recommended-practices.md with examples for 5 secret managers

Test plan

  • npx vitest run tests/unit/config.test.js — 31 tests pass
  • npm test — 296 tests pass (full suite)
  • npm run lint — clean

🤖 Generated with Claude Code

Claude Code was denied Bash permission during PR reviews, preventing
it from running gh pr diff to read the changeset. Scope tool access
to read-only gh/git commands plus file reading tools.
Add a command-based secret resolver that shells out to external secret
managers (1Password, Bitwarden, Vault, pass, macOS Keychain) at runtime.
Priority chain: command output > env var > file config > defaults.
Uses execFileSync with array args to prevent shell injection.
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 22, 2026

Greptile Summary

This PR adds secure credential management through apiKeyCommand, allowing codegraph to fetch API keys from external secret managers at runtime. The implementation uses execFileSync with array args (preventing shell injection), includes proper timeouts, and gracefully falls back on failure.

Key changes:

  • New resolveSecrets() function shells out to secret managers with 10s timeout and 64KB buffer limit
  • Priority chain: command output > env var > file config > defaults
  • applyEnvOverrides() enables CODEGRAPH_LLM_* env var support
  • Comprehensive test suite with 11 new tests covering edge cases and integration scenarios
  • Extensive documentation with examples for 1Password, Bitwarden, Vault, pass, and macOS Keychain

Security posture:

  • Uses execFileSync with array args (no shell parsing) to prevent command injection
  • Captures stdout only; stderr discarded
  • Secrets held only in process memory, never written to disk
  • Graceful degradation on command failure with warning logs

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - well-tested security feature with proper safeguards
  • Clean implementation with no shell injection risk, comprehensive test coverage (11 new tests all passing), thorough documentation, and graceful error handling. The execution order correctly implements the documented priority chain.
  • No files require special attention

Important Files Changed

Filename Overview
src/config.js Added apiKeyCommand secret resolution with execFileSync, env overrides, and graceful fallback - secure implementation with proper timeout and no shell injection
tests/unit/config.test.js Comprehensive test coverage with 11 new tests covering resolution, trimming, skip conditions, failure fallback, and priority chain
README.md Added LLM credentials section with apiKeyCommand usage examples and security notes
docs/recommended-practices.md Detailed secure credential management guide with 5 secret manager examples and priority chain documentation

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[loadConfig called] --> B[Search for config file]
    B --> C{File exists?}
    C -->|Yes| D[Parse JSON & mergeConfig with DEFAULTS]
    C -->|No| E[Use DEFAULTS]
    D --> F[applyEnvOverrides]
    E --> F
    F --> G{CODEGRAPH_LLM_API_KEY set?}
    G -->|Yes| H[Override config.llm.apiKey]
    G -->|No| I[Keep file/default value]
    H --> J[resolveSecrets]
    I --> J
    J --> K{apiKeyCommand defined?}
    K -->|No| L[Return config]
    K -->|Yes| M[execFileSync with 10s timeout]
    M --> N{Command succeeds?}
    N -->|Yes| O[Set config.llm.apiKey to output]
    N -->|No| P[Warn & keep existing value]
    O --> L
    P --> L
Loading

Last reviewed commit: f3ab237

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@carlos-alm carlos-alm merged commit b2c533f into main Feb 22, 2026
13 of 14 checks passed
@carlos-alm carlos-alm deleted the feat/mcp-complete-server branch February 22, 2026 07:23
carlos-alm pushed a commit that referenced this pull request Feb 26, 2026
- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities
carlos-alm pushed a commit that referenced this pull request Feb 26, 2026
- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities
carlos-alm added a commit that referenced this pull request Feb 26, 2026
* feat: add codegraph path for A→B symbol pathfinding

Add `codegraph path <from> <to>` — BFS shortest-path search on the
call graph. Given two symbol names, finds the shortest call chain
with hop count, intermediate nodes, edge kinds, and alternate path
count. Supports --reverse, --max-depth, --kinds, --from-file/--to-file,
-T, -j, -k flags. Exposed as symbol_path MCP tool.

Impact: 4 functions changed, 3 affected

* docs: add Titan Paradigm use case, update docs with roles/co-change/path

- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities

* docs: restore Architecture Refactoring phase, fix references

- Restore Phase 3 (Architectural Refactoring) to ROADMAP
- Renumber phases 4-8 and all cross-references
- Fix MCP tool count per Greptile review

* fix: correct MCP tool counts and backlog ID collisions

Address Greptile review comments on #121:
- Update MCP tool counts from 18/19 to 21 (22 in multi-repo mode)
  across README, recommended-practices, dogfood skill, titan-paradigm
- Add missing execution_flow and list_entry_points to tool enumeration
- Renumber new backlog items 21-26 → 27-32 to avoid collision with
  existing items 21-22

* feat: add token savings benchmark (codegraph vs raw navigation)

Adds a benchmark suite that measures how much codegraph reduces token
usage when AI agents navigate the Next.js codebase (~4k TS files).

- scripts/token-benchmark-issues.js: 5 real Next.js PRs as test cases
- scripts/token-benchmark.js: runner using Claude Agent SDK (baseline
  vs codegraph MCP), with --perf flag for build/query benchmarks
- scripts/update-token-report.js: JSON → markdown report generator
- docs/benchmarks/: methodology docs and placeholder report

Impact: 21 functions changed, 7 affected

* feat: extend benchmarks with incremental builds and expanded query coverage

benchmark.js now measures no-op rebuilds, 1-file rebuilds, and query
latency (fn-deps, fn-impact, path, roles) alongside full builds.
update-benchmark-report.js renders new Incremental Rebuilds and Query
Latency sections in BUILD-BENCHMARKS.md and adds incremental/query rows
to the README performance table. All new fields are additive for backward
compatibility.

Impact: 5 functions changed, 2 affected

* ci: include version in automated benchmark commits and PRs

Extract version from benchmark result JSON and include it in branch
names, commit messages, PR titles, and PR bodies across all 4 benchmark
jobs (build, embedding, query, incremental).

* fix: update remaining 19-tool references to 21-tool in README

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
carlos-alm added a commit that referenced this pull request Feb 26, 2026
* feat: add codegraph path for A→B symbol pathfinding

Add `codegraph path <from> <to>` — BFS shortest-path search on the
call graph. Given two symbol names, finds the shortest call chain
with hop count, intermediate nodes, edge kinds, and alternate path
count. Supports --reverse, --max-depth, --kinds, --from-file/--to-file,
-T, -j, -k flags. Exposed as symbol_path MCP tool.

Impact: 4 functions changed, 3 affected

* docs: add Titan Paradigm use case, update docs with roles/co-change/path

- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities

* docs: restore Architecture Refactoring phase, fix references

- Restore Phase 3 (Architectural Refactoring) to ROADMAP
- Renumber phases 4-8 and all cross-references
- Fix MCP tool count per Greptile review

* fix: correct MCP tool counts and backlog ID collisions

Address Greptile review comments on #121:
- Update MCP tool counts from 18/19 to 21 (22 in multi-repo mode)
  across README, recommended-practices, dogfood skill, titan-paradigm
- Add missing execution_flow and list_entry_points to tool enumeration
- Renumber new backlog items 21-26 → 27-32 to avoid collision with
  existing items 21-22

* feat: add token savings benchmark (codegraph vs raw navigation)

Adds a benchmark suite that measures how much codegraph reduces token
usage when AI agents navigate the Next.js codebase (~4k TS files).

- scripts/token-benchmark-issues.js: 5 real Next.js PRs as test cases
- scripts/token-benchmark.js: runner using Claude Agent SDK (baseline
  vs codegraph MCP), with --perf flag for build/query benchmarks
- scripts/update-token-report.js: JSON → markdown report generator
- docs/benchmarks/: methodology docs and placeholder report

Impact: 21 functions changed, 7 affected

* feat: extend benchmarks with incremental builds and expanded query coverage

benchmark.js now measures no-op rebuilds, 1-file rebuilds, and query
latency (fn-deps, fn-impact, path, roles) alongside full builds.
update-benchmark-report.js renders new Incremental Rebuilds and Query
Latency sections in BUILD-BENCHMARKS.md and adds incremental/query rows
to the README performance table. All new fields are additive for backward
compatibility.

Impact: 5 functions changed, 2 affected

* ci: include version in automated benchmark commits and PRs

Extract version from benchmark result JSON and include it in branch
names, commit messages, PR titles, and PR bodies across all 4 benchmark
jobs (build, embedding, query, incremental).

* fix: update remaining 19-tool references to 21-tool in README

* docs: remove "viral" from titan paradigm LinkedIn reference

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
carlos-alm added a commit that referenced this pull request Feb 26, 2026
* feat: add codegraph path for A→B symbol pathfinding

Add `codegraph path <from> <to>` — BFS shortest-path search on the
call graph. Given two symbol names, finds the shortest call chain
with hop count, intermediate nodes, edge kinds, and alternate path
count. Supports --reverse, --max-depth, --kinds, --from-file/--to-file,
-T, -j, -k flags. Exposed as symbol_path MCP tool.

Impact: 4 functions changed, 3 affected

* docs: add Titan Paradigm use case, update docs with roles/co-change/path

- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities

* docs: restore Architecture Refactoring phase, fix references

- Restore Phase 3 (Architectural Refactoring) to ROADMAP
- Renumber phases 4-8 and all cross-references
- Fix MCP tool count per Greptile review

* fix: correct MCP tool counts and backlog ID collisions

Address Greptile review comments on #121:
- Update MCP tool counts from 18/19 to 21 (22 in multi-repo mode)
  across README, recommended-practices, dogfood skill, titan-paradigm
- Add missing execution_flow and list_entry_points to tool enumeration
- Renumber new backlog items 21-26 → 27-32 to avoid collision with
  existing items 21-22

* feat: add token savings benchmark (codegraph vs raw navigation)

Adds a benchmark suite that measures how much codegraph reduces token
usage when AI agents navigate the Next.js codebase (~4k TS files).

- scripts/token-benchmark-issues.js: 5 real Next.js PRs as test cases
- scripts/token-benchmark.js: runner using Claude Agent SDK (baseline
  vs codegraph MCP), with --perf flag for build/query benchmarks
- scripts/update-token-report.js: JSON → markdown report generator
- docs/benchmarks/: methodology docs and placeholder report

Impact: 21 functions changed, 7 affected

* feat: extend benchmarks with incremental builds and expanded query coverage

benchmark.js now measures no-op rebuilds, 1-file rebuilds, and query
latency (fn-deps, fn-impact, path, roles) alongside full builds.
update-benchmark-report.js renders new Incremental Rebuilds and Query
Latency sections in BUILD-BENCHMARKS.md and adds incremental/query rows
to the README performance table. All new fields are additive for backward
compatibility.

Impact: 5 functions changed, 2 affected

* ci: include version in automated benchmark commits and PRs

Extract version from benchmark result JSON and include it in branch
names, commit messages, PR titles, and PR bodies across all 4 benchmark
jobs (build, embedding, query, incremental).

* fix: update remaining 19-tool references to 21-tool in README

* docs: remove "viral" from titan paradigm LinkedIn reference

* fix: use endLine for scope-aware caller selection in nested functions

Nested/closure functions (e.g. nodeId inside exportMermaid) were
incorrectly classified as [dead] because the caller selection loop
picked the last definition where line <= call.line, creating self-call
edges that got filtered out. Now uses endLine to find the innermost
enclosing scope, so calls within an outer function correctly attribute
the outer function as caller rather than the nested function itself.

Fixes false-positive [dead] for nodeId in branch-compare.js, export.js,
and queries.js.

Impact: 1 functions changed, 17 affected

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
carlos-alm added a commit that referenced this pull request Feb 26, 2026
* feat: add codegraph path for A→B symbol pathfinding

Add `codegraph path <from> <to>` — BFS shortest-path search on the
call graph. Given two symbol names, finds the shortest call chain
with hop count, intermediate nodes, edge kinds, and alternate path
count. Supports --reverse, --max-depth, --kinds, --from-file/--to-file,
-T, -j, -k flags. Exposed as symbol_path MCP tool.

Impact: 4 functions changed, 3 affected

* docs: add Titan Paradigm use case, update docs with roles/co-change/path

- Create docs/use-cases/titan-paradigm.md — maps Johannes R.'s multi-agent
  codebase cleanup architecture (RECON, GAUNTLET, GLOBAL SYNC, STATE MACHINE)
  to codegraph commands, roadmap items, and post-LLM-integration recommendations

- Update roadmap/BACKLOG.md: mark #4 (node classification), #9 (git change
  coupling), #1 (dead code), #2 (shortest path), #12 (execution flow) as DONE;
  add 6 new Titan Paradigm-inspired items (#21-#26): composite audit, batch
  querying, triage priority queue, change validation predicates, graph
  snapshots, MCP orchestration tools

- Update README.md: add roles + co-change to features table, differentiators,
  commands section, agent template, common flags, comparison table; update MCP
  tool count 18 → 19

- Update docs/recommended-practices.md: update MCP tool count and tool list,
  add roles/co-change/path to CLAUDE.md template and developer workflow, add
  "Understand architectural roles" and "Surface hidden coupling" sections,
  add co-change step to setup checklist

- Add full examples with real output for roles, co-change, and path to
  docs/examples/CLI.md and docs/examples/MCP.md

- Update GitHub repo description with new capabilities

* docs: restore Architecture Refactoring phase, fix references

- Restore Phase 3 (Architectural Refactoring) to ROADMAP
- Renumber phases 4-8 and all cross-references
- Fix MCP tool count per Greptile review

* fix: correct MCP tool counts and backlog ID collisions

Address Greptile review comments on #121:
- Update MCP tool counts from 18/19 to 21 (22 in multi-repo mode)
  across README, recommended-practices, dogfood skill, titan-paradigm
- Add missing execution_flow and list_entry_points to tool enumeration
- Renumber new backlog items 21-26 → 27-32 to avoid collision with
  existing items 21-22

* feat: add token savings benchmark (codegraph vs raw navigation)

Adds a benchmark suite that measures how much codegraph reduces token
usage when AI agents navigate the Next.js codebase (~4k TS files).

- scripts/token-benchmark-issues.js: 5 real Next.js PRs as test cases
- scripts/token-benchmark.js: runner using Claude Agent SDK (baseline
  vs codegraph MCP), with --perf flag for build/query benchmarks
- scripts/update-token-report.js: JSON → markdown report generator
- docs/benchmarks/: methodology docs and placeholder report

Impact: 21 functions changed, 7 affected

* feat: extend benchmarks with incremental builds and expanded query coverage

benchmark.js now measures no-op rebuilds, 1-file rebuilds, and query
latency (fn-deps, fn-impact, path, roles) alongside full builds.
update-benchmark-report.js renders new Incremental Rebuilds and Query
Latency sections in BUILD-BENCHMARKS.md and adds incremental/query rows
to the README performance table. All new fields are additive for backward
compatibility.

Impact: 5 functions changed, 2 affected

* ci: include version in automated benchmark commits and PRs

Extract version from benchmark result JSON and include it in branch
names, commit messages, PR titles, and PR bodies across all 4 benchmark
jobs (build, embedding, query, incremental).

* fix: update remaining 19-tool references to 21-tool in README

* docs: remove "viral" from titan paradigm LinkedIn reference

* fix: use endLine for scope-aware caller selection in nested functions

Nested/closure functions (e.g. nodeId inside exportMermaid) were
incorrectly classified as [dead] because the caller selection loop
picked the last definition where line <= call.line, creating self-call
edges that got filtered out. Now uses endLine to find the innermost
enclosing scope, so calls within an outer function correctly attribute
the outer function as caller rather than the nested function itself.

Fixes false-positive [dead] for nodeId in branch-compare.js, export.js,
and queries.js.

Impact: 1 functions changed, 17 affected

* feat: add cognitive & cyclomatic complexity metrics

Compute per-function complexity during build via single-traversal DFS
of tree-sitter ASTs: cognitive (SonarSource), cyclomatic (McCabe), and
max nesting depth. Stores results in new function_complexity table
(migration v8) and surfaces them in stats, context, explain, and a
dedicated `complexity` CLI command + MCP tool.

Adds manifesto config section with warn thresholds (cognitive: 15,
cyclomatic: 10, maxNesting: 4) seeding the future rule engine.

Phase 1 supports JS/TS/TSX; unsupported languages are skipped gracefully.

Impact: 18 functions changed, 32 affected

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant