Skip to content

refactor: split cli.js into self-contained command modules (ROADMAP 3.6)#427

Merged
carlos-alm merged 8 commits intomainfrom
refactor/cli-command-objects
Mar 13, 2026
Merged

refactor: split cli.js into self-contained command modules (ROADMAP 3.6)#427
carlos-alm merged 8 commits intomainfrom
refactor/cli-command-objects

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Split the monolithic 1,525-line src/cli.js into src/cli/ with auto-discovery of command modules
  • 40 independently testable command files in src/cli/commands/, each exporting { name, description, options?, queryOpts?, validate?, execute }
  • Shared utilities extracted to src/cli/shared/ (query options, output formatting)
  • src/cli/index.js provides registerCommand() + discoverCommands() — new commands are added by dropping a file into commands/
  • src/cli.js reduced to a 5-line thin wrapper

Test plan

  • All 1,608 existing tests pass (no behavioral changes)
  • codegraph --help lists all 40 commands correctly
  • codegraph --version works
  • Subcommands (registry, snapshot) nest correctly
  • Lint passes (Biome)
  • Pre-commit hooks pass (cycles, dead exports, diff-impact)

… (ROADMAP 3.6)

Move 1,525 lines of inline Commander chains from src/cli.js into
src/cli/ with auto-discovery of command modules:

- src/cli/index.js: Commander setup + registerCommand() + discoverCommands()
- src/cli/commands/: 40 independently testable command modules
- src/cli/shared/: options.js (query opts, resolveNoTests) + output.js
- src/cli.js: thin 5-line wrapper calling src/cli/index.js

Each command exports { name, description, options?, queryOpts?, validate?, execute }
and is auto-registered at startup. Subcommands (registry, snapshot) nest naturally.

Impact: 68 functions changed, 4 affected
@claude
Copy link

claude bot commented Mar 12, 2026

Claude finished @carlos-alm's task —— View job


PR Review: CLI Refactor Analysis

Completed comprehensive analysis of this major CLI refactor that splits 1,525 lines into 40 modular commands.

Root Cause Analysis

Problem Being Solved: The monolithic src/cli.js (1,525 lines) was becoming unwieldy for maintenance and testing. The refactor addresses ROADMAP 3.6 by creating a modular command architecture.

Why This Approach: Auto-discovery pattern with standardized command interfaces enables:

  • Independent testing of each command
  • Clear separation of concerns
  • Easier addition of new commands (just drop a file in commands/)
  • Consistent option handling via shared utilities

Risk Assessment: LOW RISK - This is a pure architectural refactor with no behavioral changes.

Critical Concerns: NONE FOUND

No configuration relaxations - biome.json, vitest.config.js unchanged
No test weakening - All 1,608 tests pass, comprehensive CLI coverage
No rule bypasses - Pre-commit hooks, cycles, and dead export checks pass
No shortcuts - Proper validation, error handling, and dynamic imports throughout

Architecture Analysis

Strong Design Patterns:

  • Consistent interface: All commands export { name, description, options?, queryOpts?, validate?, execute }
  • Shared utilities: Common query options and output formatting centralized in src/cli/shared/
  • Auto-discovery: discoverCommands() dynamically loads modules, enabling plugin-like extensibility
  • Proper separation: Command logic delegates to existing functions in src/commands/, maintaining single responsibility

Implementation Quality:

  • Validation: Input validation (e.g. EVERY_SYMBOL_KIND checks in triage.js:48)
  • Error handling: Proper exit codes and error messages
  • Lazy loading: Dynamic imports for optional dependencies
  • Subcommand support: Registry commands properly nest via subcommands array

Test Coverage Analysis:

  • 73 CLI test cases covering all commands including edge cases
  • Integration tests verify JSON output, error conditions, deprecation warnings
  • Registry subcommands thoroughly tested with isolation
  • No behavioral changes detected in test expectations

Code Quality Review

Strengths:

  • Clean, readable command modules averaging ~30 lines each
  • Consistent parameter parsing and option handling
  • Proper use of async/await and error boundaries
  • Good separation between CLI layer and business logic

Minor Observations:

  • Dynamic imports throughout command modules could benefit from error handling, but existing pattern is acceptable
  • Some commands import large modules (e.g. src/commands/complexity.js) but this is necessary
  • File path resolution using __cliDir is robust and cross-platform

Security & Performance:

  • No security issues detected
  • Lazy loading of modules improves startup performance
  • File system operations properly use absolute paths

Final Recommendation

## Root Cause Analysis
**Problem Being Solved**: Monolithic 1,525-line CLI needed modularization for maintainability
**Why This Approach**: Auto-discovery pattern enables independent testing and easy extensibility
**Risk Assessment**: LOW - Pure architectural refactor with comprehensive test coverage

## Critical Concerns
None found. This is exemplary refactoring work.

## Final Recommendation
- Rating: ⭐⭐⭐⭐⭐ (5/5) - Exceptional refactor
- Action: **APPROVE**
- Reasoning: Perfect execution of architectural improvement. No behavioral changes, comprehensive tests pass, clean modular design with consistent interfaces. This sets an excellent foundation for future CLI extensions.

This PR represents exceptional refactoring work that significantly improves codebase maintainability while preserving all existing functionality.


@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR completes ROADMAP 3.6 by splitting the 1,525-line monolithic src/cli.js into 40 self-contained command modules under src/cli/commands/, with shared utilities in src/cli/shared/ and a clean auto-discovery + registration engine in src/cli/index.js. The new src/cli.js is a 5-line wrapper.

Key changes:

  • registerCommand() uses a regex-based argCount to correctly slice Commander's actionArgs for all positional patterns (required, optional, variadic) — verified correct for all 40 command shapes
  • discoverCommands() uses readdirSync().sort() for deterministic cross-platform ordering, and pathToFileURL for cross-platform ESM dynamic imports
  • All five issues flagged in previous review rounds have been resolved: search now forwards offset+ndjson; embed restores the explicit DEFAULT_MODEL fallback; dataflow parses depth with parseInt; command discovery is sorted; and run() has a .catch() handler in cli.js
  • Shared ctx object ({ config, resolveNoTests, formatSize, outputResult, program }) gives every command module access to config, utilities, and the root Commander instance — the program reference is used by build, watch, info, and branch-compare to read the global --engine flag
  • Behavioral parity is preserved across all 40 commands — triage's raw minScore string passthrough, diff-impact's intentional absence of -j/--json, and check's synchronous inner calls all match the original exactly

Confidence Score: 5/5

  • This PR is safe to merge — it is a pure structural refactoring with no behavioral changes.
  • All 1,608 existing tests pass with no behavioral changes. Every command module was cross-checked against the original cli.js diff and found to be faithful. All five issues from previous review rounds are resolved. The registerCommand abstraction correctly handles every positional-arg pattern in the codebase. No new logic was introduced — only reorganization of existing logic into self-contained files.
  • No files require special attention.

Important Files Changed

Filename Overview
src/cli/index.js Core orchestrator: registers commands via registerCommand(), auto-discovers modules from commands/, adds .sort() for deterministic ordering, and .catch() handler is in cli.js. The argCount regex correctly covers all positional arg patterns (required, optional, variadic). ctx object cleanly exposes shared utilities.
src/cli.js Reduced to 8 lines — thin wrapper importing run() from src/cli/index.js and attaching a .catch() handler for clean error output on startup failure.
src/cli/shared/options.js Shared query options (applyQueryOpts), resolveNoTests, formatSize, and config — faithfully extracted from the original monolith with no behavioral changes.
src/cli/commands/check.js Faithful extraction of the dual-mode check command (manifesto/diff-predicates). Calls to check() and manifesto() are synchronous (as confirmed in underlying implementations), so the non-awaited pattern is correct. Behavior is identical to the original.
src/cli/commands/embed.js Restores explicit DEFAULT_MODEL fallback (`opts.model
src/cli/commands/search.js Correctly forwards offset and ndjson to search() (the fix from e0a0af0). All options including rrfK, mode, filePattern are forwarded correctly.
src/cli/commands/dataflow.js Correctly parses depth with parseInt(opts.depth, 10) as fixed in e6518de, removing the previous raw-string inconsistency.
src/cli/commands/batch.js Variadic [targets...] argument is correctly handled by the argCount calculation in registerCommand — Commander passes the variadic array as actionArgs[1], so args = [commandStr, targetsArray] is correct.
src/cli/commands/triage.js Passes minScore: opts.minScore as a raw string, matching original behavior. triageData() handles conversion with Number(opts.minScore) internally. Inline kind/role validation is intentional since they're conditional on the level flag.
src/cli/commands/build.js Correctly accesses ctx.program.opts().engine for the global --engine flag — the program reference in ctx is the Commander root instance.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["src/cli.js\n(5-line wrapper)"] --> B["run() — src/cli/index.js"]
    B --> C["discoverCommands()\nreaddirSync(commands/).sort()"]
    C --> D["import(pathToFileURL(file))"]
    D --> E{mod.command\nor mod.commands?}
    E -->|"mod.command"| F["registerCommand(program, def)"]
    E -->|"mod.commands[]"| F
    F --> G{def.queryOpts?}
    G -->|yes| H["applyQueryOpts(cmd)\nsrc/cli/shared/options.js"]
    G -->|no| I["cmd.option(...) for\neach def.options entry"]
    H --> I
    I --> J{def.execute?}
    J -->|yes| K["cmd.action()\nargCount = positional arg count\nargs = actionArgs.slice(0, argCount)\nopts = actionArgs[argCount]"]
    J -->|no| L["(no action — subcommands only)"]
    K --> M["def.validate?(args, opts, ctx)\n→ error string or falsy"]
    M -->|error| N["console.error + process.exit(1)"]
    M -->|ok| O["def.execute(args, opts, ctx)"]
    L --> P{def.subcommands?}
    J -->|no| P
    P -->|yes| Q["registerCommand(cmd, sub)\nfor each subcommand"]
    B --> R["program.parse()"]
    O --> S["ctx = \{ config, resolveNoTests,\nformatSize, outputResult, program \}"]
Loading

Last reviewed commit: 3fa03bd

Comment on lines +33 to +40
kind: opts.kind,
filePattern: opts.file,
rrfK: parseInt(opts.rrfK, 10),
mode: opts.mode,
json: opts.json,
});
},
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offset and ndjson options declared but never forwarded

Both --offset and --ndjson are listed in the command's options array (lines 21–22) but are not included in the object passed to search(). This means any user relying on these flags in the original monolithic cli.js will silently get no pagination or newline-delimited output.

Suggested change
kind: opts.kind,
filePattern: opts.file,
rrfK: parseInt(opts.rrfK, 10),
mode: opts.mode,
json: opts.json,
});
},
};
await search(query, opts.db, {
limit: parseInt(opts.limit, 10),
offset: opts.offset ? parseInt(opts.offset, 10) : undefined,
noTests: ctx.resolveNoTests(opts),
minScore: parseFloat(opts.minScore),
model: opts.model,
kind: opts.kind,
filePattern: opts.file,
rrfK: parseInt(opts.rrfK, 10),
mode: opts.mode,
json: opts.json,
ndjson: opts.ndjson,
});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e0a0af0offset and ndjson are now forwarded to search().

Comment on lines +87 to +94
}

/**
* Auto-discover and register all command modules from src/cli/commands/.
* Each module must export a `command` (single definition) or `commands` (array).
*/
async function discoverCommands() {
const commandsDir = path.join(__cliDir, 'commands');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readdirSync result is not sorted — command order is filesystem-dependent

fs.readdirSync returns entries in filesystem-defined order, which is creation order on Linux ext4 but alphabetical on macOS HFS+ and Windows NTFS. The original monolithic cli.js had a fixed, deterministic registration order. Without an explicit .sort() here the --help command listing may differ between developer machines and CI, making output order non-deterministic in a multi-platform repo.

Suggested change
}
/**
* Auto-discover and register all command modules from src/cli/commands/.
* Each module must export a `command` (single definition) or `commands` (array).
*/
async function discoverCommands() {
const commandsDir = path.join(__cliDir, 'commands');
const files = fs.readdirSync(commandsDir).filter((f) => f.endsWith('.js')).sort();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e0a0af0.sort() added to readdirSync result for deterministic command order.

- Pass offset and ndjson options through to search() so the CLI flags
  actually take effect
- Sort readdirSync result in discoverCommands() for deterministic
  command ordering across platforms

Impact: 2 functions changed, 37 affected
Impact: 1 functions changed, 1 affected
@carlos-alm
Copy link
Contributor Author

Fixed both Greptile findings in cbfb3fe:

  1. search.jsoffset and ndjson options are now forwarded to search()
  2. index.js — added .sort() to readdirSync chain for deterministic command ordering across platforms

@greptileai

},
async execute([dir], opts, ctx) {
const root = path.resolve(dir || '.');
const model = opts.model || ctx.config.embeddings?.model;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEFAULT_MODEL fallback silently removed

The original cli.js resolved the model with an explicit three-part fallback:

const model = opts.model || config.embeddings?.model || DEFAULT_MODEL;

The new code drops the || DEFAULT_MODEL guard, so when neither --model is passed nor embeddings.model is set in config, model becomes undefined. buildEmbeddings still works correctly because getModelConfig(undefined) falls back to DEFAULT_MODEL internally — but this now relies on an undocumented implementation detail of embedder.js. Consider restoring the explicit fallback to keep the call site self-documenting and resilient to future changes in buildEmbeddings's internal defaults:

Suggested change
const model = opts.model || ctx.config.embeddings?.model;
const model = opts.model || ctx.config.embeddings?.model || DEFAULT_MODEL;

(and re-add DEFAULT_MODEL to the import on line 2: import { buildEmbeddings, DEFAULT_MODEL, EMBEDDING_STRATEGIES } from '../../embedder.js';)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 672c261 — restored explicit DEFAULT_MODEL fallback and re-added the import.

Restores the three-part model resolution (--model || config || DEFAULT_MODEL)
that was present in the original cli.js, so the call site doesn't silently
depend on getModelConfig's internal default.

Impact: 1 functions changed, 0 affected
@carlos-alm
Copy link
Contributor Author

Fixed in 672c261: restored explicit DEFAULT_MODEL fallback in the embed command so the three-part resolution (--model || config || DEFAULT_MODEL) matches the original cli.js behavior.

@greptileai

limit: opts.limit ? parseInt(opts.limit, 10) : undefined,
offset: opts.offset ? parseInt(opts.offset, 10) : undefined,
impact: opts.impact,
depth: opts.depth,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depth forwarded as a string instead of a number

Every other command that exposes --depth converts it in the CLI layer with parseInt(opts.depth, 10) before passing it down (e.g. fn-impact.js:20, sequence.js:21, query.js:38). Here the raw Commander string ('5') is passed instead.

This is functionally safe today because commands/dataflow.jsdataflowImpact() explicitly handles the conversion with Number(opts.depth), and dataflowData() doesn't use depth at all. However, it's an inconsistency that could silently break if dataflowImpactData is ever refactored to consume depth directly (it currently does const maxDepth = opts.depth || 5, which would then receive a string).

Suggested change
depth: opts.depth,
depth: parseInt(opts.depth, 10),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e6518dedepth is now parsed with parseInt(opts.depth, 10) for consistency.

@carlos-alm
Copy link
Contributor Author

Fixed in e6518de: depth is now parsed with parseInt(opts.depth, 10) in the dataflow command, consistent with fn-impact, sequence, and query commands.

@greptileai

src/cli.js Outdated
});

program.parse();
run();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unhandled promise rejection on startup failure

run() is called without .catch(). If discoverCommands() fails — for example because a new command file has a syntax error or a top-level import throws — the error propagates as an unhandled promise rejection. In Node.js 20+ this terminates the process, but the message the user sees is a raw UnhandledPromiseRejection stack dump rather than a clean codegraph error. The original synchronous code would have surfaced a clean module-loading error.

Suggested change
run();
run().catch((err) => {
console.error(`codegraph: fatal error — ${err.message || err}`);
process.exit(1);
});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3fa03bdrun() now has a .catch() handler that prints a clean error message and exits with code 1.

@carlos-alm
Copy link
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit 6de3105 into main Mar 13, 2026
34 checks passed
@carlos-alm carlos-alm deleted the refactor/cli-command-objects branch March 13, 2026 05:32
@github-actions github-actions bot locked and limited conversation to collaborators Mar 13, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant