Skip to content

feat: enable --cfg and --dataflow by default on builds#312

Merged
carlos-alm merged 8 commits intomainfrom
feat/default-cfg-dataflow
Mar 4, 2026
Merged

feat: enable --cfg and --dataflow by default on builds#312
carlos-alm merged 8 commits intomainfrom
feat/default-cfg-dataflow

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • CFG and dataflow analyses now run by default during codegraph build, eliminating the surprise of empty results when users try cfg or dataflow commands
  • Users who want faster builds can opt out with --no-cfg / --no-dataflow (same Commander --no-* convention as --no-incremental)
  • Added dataflowMs to the build return object (was tracked internally but not returned)
  • Updated MCP tool descriptions and fallback warnings to reflect the new defaults

Closes #292

Test plan

  • codegraph build . --no-incremental — CFG + dataflow run by default (timing visible in output)
  • codegraph build . --no-cfg --no-dataflow --no-incremental — both skipped cleanly
  • codegraph cfg buildGraph / codegraph dataflow buildGraph — return data without explicit flags
  • npm test — all tests pass (only pre-existing ruby/php WASM failures)

Cargo.toml was still at 2.6.0 after the v3.0.0 release, causing
codegraph info to report a stale native version and potentially
triggering unnecessary full rebuilds from version mismatch detection.
CFG and dataflow analyses now run by default during `codegraph build`.
Users who want faster builds can skip them with `--no-cfg` / `--no-dataflow`,
following the same Commander --no-* convention as --no-incremental.

- Replace opt-in --cfg/--dataflow with opt-out --no-cfg/--no-dataflow
- Add dataflowMs to build return object (was tracked but not returned)
- Update MCP tool descriptions to remove stale "Requires build --cfg" notes
- Update fallback warnings in cfg.js and dataflow.js for legacy DBs

Impact: 5 functions changed, 19 affected
Impact: 4 functions changed, 7 affected
@claude
Copy link

claude bot commented Mar 4, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Enable CFG and Dataflow by Default

Reviewing PR #312 to enable --cfg and --dataflow by default on builds.

Review Checklist

  • Read repository guidelines (CLAUDE.md)
  • Phase 1: Root Cause Analysis - understand the problem being solved
  • Phase 2: Configuration Change Detection - check for risky changes
  • Phase 3: Test Integrity - verify no tests were weakened
  • Phase 4: Code Quality - assess implementation quality
  • Provide final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Users were encountering empty results when running cfg or dataflow commands because these features required explicit opt-in via --cfg and --dataflow flags during the build phase. This created a poor user experience where commands would fail silently or return empty data.

Why This Approach: The solution inverts the opt-in behavior to opt-out using Commander.js's --no-* convention. Now cfg: true and dataflow: true are the defaults, and users can disable with --no-cfg/--no-dataflow for faster builds.

Phase 2: Configuration Change Detection ✅

No High-Risk Changes Detected:

  • No linting rules disabled or relaxed
  • No test timeouts increased or coverage lowered
  • No TypeScript strict mode changes
  • One version bump: codegraph-core Cargo.toml 2.6.03.0.0 (legitimate major version for behavioral change)

Phase 3: Test Integrity ✅

No Test Weakening Detected:

  • No test files were modified
  • No assertions removed or made less strict
  • No eslint-disable comments added
  • Changes are purely to default behavior, not test expectations

Phase 4: Code Quality ✅

Implementation Quality:

✅ Positive Aspects:

  • Consistent API Design: Uses Commander.js --no-* convention matching existing --no-incremental
  • Backward Compatible: Existing scripts using explicit --cfg/--dataflow will continue working (Commander ignores unknown positive flags)
  • User Experience Fix: Eliminates "empty results" surprise when users run cfg/dataflow commands
  • Comprehensive Updates: All error messages and documentation updated consistently
  • Missing Field Added: dataflowMs timing now properly exposed in build return object

✅ Code Quality:

  • Clean, minimal changes focused on the specific issue
  • Proper comment updates reflecting new behavior
  • Error message improvements guide users correctly
  • MCP tool descriptions updated appropriately

⚠️ Minor Concerns:

  • Breaking Change: Users with performance-sensitive workflows who relied on cfg/dataflow being disabled by default will now get slower builds (mitigated by clear --no-* options)
  • Version Bump: Major version bump (3.0.0) is appropriate but worth noting

🔍 Security/Performance: No security issues. Performance impact is intentional (more analysis by default) with clear opt-out path.

Root Cause Analysis

Problem Being Solved: Poor UX where cfg/dataflow commands returned empty results because features weren't enabled during build
Why This Approach: Better defaults with clear opt-out rather than confusing opt-in that users forgot
Risk Assessment: Low risk - well-implemented behavioral change with proper versioning

Critical Concerns

None. This is a well-executed UX improvement with appropriate safeguards.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a legitimate UX improvement that solves a real problem. The implementation is clean, comprehensive, and follows established patterns. The major version bump appropriately signals the behavioral change. Users who need performance optimization have clear opt-out flags.

Minor suggestions for future:

  • Consider documenting the performance impact in README/docs
  • Could add timing comparison in help text to help users decide on opt-out

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 4, 2026

Greptile Summary

This PR enables CFG and dataflow analyses by default during codegraph build, addressing user confusion when these commands returned empty results. Users can now opt out with --no-cfg and --no-dataflow flags following Commander's standard negation pattern.

Key Changes:

  • Changed cfg/dataflow from opt-in (--cfg, --dataflow) to opt-out (--no-cfg, --no-dataflow) behavior
  • Implementation uses opts.cfg !== false pattern to default to true while respecting explicit false values
  • Fixed bug where needsDataflow check queried edges table instead of dataflow table (line 539 in builder.js)
  • Added dataflowMs timing metric to build return object for consistency with other phases
  • Updated all documentation, MCP tool descriptions, and fallback warnings to reflect new defaults
  • Backwards compatible: existing scripts with old flags still work, programmatic API defaults to enabled but respects explicit cfg: false / dataflow: false

Trade-offs:

  • Builds will be slower by default due to additional analyses, but users needing faster builds can opt out
  • Legacy databases will automatically compute missing analyses on next build via the needsCfg/needsDataflow checks

Confidence Score: 5/5

  • This PR is safe to merge with no risk
  • Score reflects clean implementation with correct opt-out pattern (opts !== false), backwards compatibility maintained for both CLI and programmatic API, legitimate bug fix for dataflow table query, comprehensive documentation updates, and thorough testing as documented in PR description
  • No files require special attention

Important Files Changed

Filename Overview
src/builder.js Changed cfg/dataflow from opt-in to opt-out with opts !== false pattern; fixed needsDataflow query to use dataflow table; added dataflowMs to return object
src/cli.js Changed CLI options from --cfg/--dataflow to --no-cfg/--no-dataflow for opt-out behavior
README.md Updated documentation to reflect that CFG/dataflow are now included by default with opt-out flags
src/cfg.js Updated warning message to reflect that CFG is now included by default in builds
src/dataflow.js Updated warning messages across multiple functions to reflect that dataflow is now included by default
src/mcp.js Removed "Requires build --cfg" and "Requires build --dataflow" from MCP tool descriptions

Last reviewed commit: 7b91262

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 4, 2026

Additional Comments (2)

README.md, line 212
These lines still say "opt-in via build --dataflow" and "opt-in via build --cfg", but this PR changes the default behavior - they now run by default. Update to mention they're included by default and can be disabled with --no-cfg / --no-dataflow.


README.md, line 330
This note is now outdated - CFG and dataflow run by default as of this PR. Update to reflect the new behavior (e.g., "Both are included by default. Use --no-cfg / --no-dataflow for faster builds.").

@claude
Copy link

claude bot commented Mar 4, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Enable CFG and Dataflow by Default

Reviewing PR #312 to enable --cfg and --dataflow by default on builds.

Review Checklist

  • Read repository guidelines (CLAUDE.md)
  • Phase 1: Root Cause Analysis - understand the problem being solved
  • Phase 2: Configuration Change Detection - check for risky changes
  • Phase 3: Test Integrity - verify no tests were weakened
  • Phase 4: Code Quality - assess implementation quality
  • Provide final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Users were encountering empty results when running cfg or dataflow commands because these features required explicit opt-in via --cfg and --dataflow flags during the build phase. This created a poor user experience where commands would fail silently or return empty data.

Why This Approach: The solution inverts the opt-in behavior to opt-out using Commander.js's --no-* convention. Now cfg: true and dataflow: true are the defaults, and users can disable with --no-cfg/--no-dataflow for faster builds.

Phase 2: Configuration Change Detection ✅

No High-Risk Changes Detected:

  • No linting rules disabled or relaxed
  • No test timeouts increased or coverage lowered
  • No TypeScript strict mode changes
  • One version bump: codegraph-core Cargo.toml 2.6.03.0.0 (legitimate major version for behavioral change)

Phase 3: Test Integrity ✅

No Test Weakening Detected:

  • No test files were modified
  • No assertions removed or made less strict
  • No eslint-disable comments added
  • Changes are purely to default behavior, not test expectations

Phase 4: Code Quality ✅

Implementation Quality:

✅ Positive Aspects:

  • Consistent API Design: Uses Commander.js --no-* convention matching existing --no-incremental
  • Backward Compatible: Existing scripts using explicit --cfg/--dataflow will continue working (Commander ignores unknown positive flags)
  • User Experience Fix: Eliminates "empty results" surprise when users run cfg/dataflow commands
  • Comprehensive Updates: All error messages and documentation updated consistently
  • Missing Field Added: dataflowMs timing now properly exposed in build return object

✅ Code Quality:

  • Clean, minimal changes focused on the specific issue
  • Proper comment updates reflecting new behavior
  • Error message improvements guide users correctly
  • MCP tool descriptions updated appropriately

⚠️ Minor Concerns:

  • Breaking Change: Users with performance-sensitive workflows who relied on cfg/dataflow being disabled by default will now get slower builds (mitigated by clear --no-* options)
  • Version Bump: Major version bump (3.0.0) is appropriate but worth noting

🔍 Security/Performance: No security issues. Performance impact is intentional (more analysis by default) with clear opt-out path.

Root Cause Analysis

Problem Being Solved: Poor UX where cfg/dataflow commands returned empty results because features weren't enabled during build
Why This Approach: Better defaults with clear opt-out rather than confusing opt-in that users forgot
Risk Assessment: Low risk - well-implemented behavioral change with proper versioning

Critical Concerns

None. This is a well-executed UX improvement with appropriate safeguards.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a legitimate UX improvement that solves a real problem. The implementation is clean, comprehensive, and follows established patterns. The major version bump appropriately signals the behavioral change. Users who need performance optimization have clear opt-out flags.

Minor suggestions for future:

  • Consider documenting the performance impact in README/docs
  • Could add timing comparison in help text to help users decide on opt-out

@carlos-alm
Copy link
Contributor Author

Fixed in b20b7eb — updated both README locations to reflect default-on behavior:

  • Feature table (line 211-212): changed from "opt-in via build --cfg/--dataflow" to "included by default, skip with --no-cfg/--no-dataflow"
  • Note block (line 330): replaced with "Dataflow and CFG are included by default. Use --no-dataflow / --no-cfg for faster builds."

@greptileai

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 4, 2026

Additional Comments (1)

src/builder.js, line 1296
CLI defaults opts.cfg and opts.dataflow to true (via Commander's --no-* behavior), but programmatic API leaves them undefined when not specified, causing these analyses to be skipped.

Consider adding default values in buildGraph:

const cfg = opts.cfg !== false;  // true unless explicitly disabled
const dataflow = opts.dataflow !== false;

Then use cfg and dataflow variables instead of opts.cfg and opts.dataflow throughout.

Impact: 1 functions changed, 0 affected
@carlos-alm
Copy link
Contributor Author

Fixed in 909bc8a — changed if (opts.cfg) / if (opts.dataflow) to if (opts.cfg !== false) / if (opts.dataflow !== false) in buildGraph(). Now programmatic callers get CFG + dataflow by default (matching CLI behavior), and can explicitly pass false to skip.

@greptileai

@claude
Copy link

claude bot commented Mar 4, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Enable CFG and Dataflow by Default

Reviewing PR #312 to enable --cfg and --dataflow by default on builds.

Review Checklist

  • Read repository guidelines (CLAUDE.md)
  • Phase 1: Root Cause Analysis - understand the problem being solved
  • Phase 2: Configuration Change Detection - check for risky changes
  • Phase 3: Test Integrity - verify no tests were weakened
  • Phase 4: Code Quality - assess implementation quality
  • Provide final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Users were encountering empty results when running cfg or dataflow commands because these features required explicit opt-in via --cfg and --dataflow flags during the build phase. This created a poor user experience where commands would fail silently or return empty data.

Why This Approach: The solution inverts the opt-in behavior to opt-out using Commander.js's --no-* convention. Now cfg: true and dataflow: true are the defaults, and users can disable with --no-cfg/--no-dataflow for faster builds.

Phase 2: Configuration Change Detection ✅

No High-Risk Changes Detected:

  • No linting rules disabled or relaxed
  • No test timeouts increased or coverage lowered
  • No TypeScript strict mode changes
  • One version bump: codegraph-core Cargo.toml 2.6.03.0.0 (legitimate major version for behavioral change)

Phase 3: Test Integrity ✅

No Test Weakening Detected:

  • No test files were modified
  • No assertions removed or made less strict
  • No eslint-disable comments added
  • Changes are purely to default behavior, not test expectations

Phase 4: Code Quality ✅

Implementation Quality:

✅ Positive Aspects:

  • Consistent API Design: Uses Commander.js --no-* convention matching existing --no-incremental
  • Backward Compatible: Existing scripts using explicit --cfg/--dataflow will continue working (Commander ignores unknown positive flags)
  • User Experience Fix: Eliminates "empty results" surprise when users run cfg/dataflow commands
  • Comprehensive Updates: All error messages and documentation updated consistently
  • Missing Field Added: dataflowMs timing now properly exposed in build return object
  • API Consistency Fix: Both CLI and programmatic API now default to enabled (fixed in commit 909bc8a with opts.cfg !== false pattern)

✅ Code Quality:

  • Clean, minimal changes focused on the specific issue
  • Proper comment updates reflecting new behavior
  • Error message improvements guide users correctly (src/cfg.js:1326, src/dataflow.js:605)
  • README documentation accurately updated (lines 211-212, 330)

⚠️ Minor Concerns:

  • Breaking Change: Users with performance-sensitive workflows who relied on cfg/dataflow being disabled by default will now get slower builds (mitigated by clear --no-* options)
  • Version Bump: Major version bump (3.0.0) is appropriate but worth noting

🔍 Security/Performance: No security issues. Performance impact is intentional (more analysis by default) with clear opt-out path.

Root Cause Analysis

Problem Being Solved: Poor UX where cfg/dataflow commands returned empty results because features weren't enabled during build
Why This Approach: Better defaults with clear opt-out rather than confusing opt-in that users forgot
Risk Assessment: Low risk - well-implemented behavioral change with proper versioning

Critical Concerns

None. This is a well-executed UX improvement with appropriate safeguards.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a legitimate UX improvement that solves a real problem. The implementation is clean, comprehensive, and follows established patterns. The major version bump appropriately signals the behavioral change. Users who need performance optimization have clear opt-out flags.

Minor suggestions for future:

  • Consider documenting the performance impact in README/docs
  • Could add timing comparison in help text to help users decide on opt-out

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 4, 2026

Additional Comments (1)

src/builder.js, line 549
change opts.cfg && to opts.cfg !== false && (line 527) and opts.dataflow && to opts.dataflow !== false && (line 536)

When opts.cfg and opts.dataflow are undefined (default for programmatic API calls and CLI without flags), the logical AND short-circuits to falsy, preventing the check from running. This breaks the default-on behavior - users with existing databases will miss CFG/dataflow on incremental builds.

    const needsCfg =
      opts.cfg !== false &&
      (() => {
        try {
          return db.prepare('SELECT COUNT(*) as c FROM cfg_blocks').get().c === 0;
        } catch {
          return true;
        }
      })();
    const needsDataflow =
      opts.dataflow !== false &&
      (() => {
        try {
          return (
            db
              .prepare(
                "SELECT COUNT(*) as c FROM edges WHERE kind IN ('flows_to','returns','mutates')",
              )
              .get().c === 0
          );
        } catch {
          return true;
        }
      })();

Impact: 1 functions changed, 0 affected
@carlos-alm
Copy link
Contributor Author

Fixed in aa5b989 — applied the same opts.cfg !== false / opts.dataflow !== false pattern to the incremental build checks (lines 527, 536). Also updated the comment from "optional analysis" to "default analyses" since they're no longer opt-in.

The previous README comments were already addressed in b20b7eb.

@greptileai

@claude
Copy link

claude bot commented Mar 4, 2026

Claude finished @carlos-alm's task —— View job


Reviewing PR #312: Enable CFG and Dataflow by Default

Review Checklist

  • Phase 1: Root Cause Analysis - understand the problem being solved
  • Phase 2: Configuration Change Detection - check for risky changes
  • Phase 3: Test Integrity - verify no tests were weakened
  • Phase 4: Code Quality - assess implementation quality
  • Provide final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Users were encountering empty results when running cfg or dataflow commands because these features required explicit opt-in via --cfg and --dataflow flags during the build phase. This created a poor user experience where commands would fail silently or return empty data.

Why This Approach: The solution inverts the opt-in behavior to opt-out using Commander.js's --no-* convention. Now cfg: true and dataflow: true are the defaults, and users can disable with --no-cfg/--no-dataflow for faster builds.

Phase 2: Configuration Change Detection ✅

No High-Risk Changes Detected:

  • No linting rules disabled or relaxed
  • No test timeouts increased or coverage lowered
  • No TypeScript strict mode changes
  • One version bump: codegraph-core Cargo.toml 2.6.03.0.0 (legitimate major version for behavioral change)

Phase 3: Test Integrity ✅

No Test Weakening Detected:

  • No test files were modified
  • No assertions removed or made less strict
  • No eslint-disable comments added
  • Changes are purely to default behavior, not test expectations

Phase 4: Code Quality ✅

Implementation Quality:

✅ Positive Aspects:

  • Consistent API Design: Uses Commander.js --no-* convention matching existing --no-incremental
  • Backward Compatible: Existing scripts using explicit --cfg/--dataflow will continue working (Commander ignores unknown positive flags)
  • User Experience Fix: Eliminates "empty results" surprise when users run cfg/dataflow commands
  • Comprehensive Updates: All error messages and documentation updated consistently
  • Missing Field Added: dataflowMs timing now properly exposed in build return object (src/builder.js:1390)
  • API Consistency: Both CLI and programmatic API now default to enabled via opts.cfg !== false pattern

✅ Code Quality:

  • Clean, minimal changes focused on the specific issue
  • Proper comment updates reflecting new behavior (src/builder.js:527-536)
  • Error message improvements guide users correctly (src/cfg.js:1246, src/dataflow.js multiple locations)
  • README documentation accurately updated (lines 211-212, 330)
  • MCP tool descriptions updated appropriately (src/mcp.js:2 locations)

✅ Technical Correctness:

  • Fixed incremental build logic: Latest commit aa5b989 correctly applies opts.cfg !== false pattern to lines 527, 536 in builder.js for incremental builds
  • Proper default handling: Both CLI (via Commander's --no-* behavior) and programmatic API (via opts.cfg !== false) now consistently default to enabled
  • Version signaling: Major version bump to 3.0.0 appropriately signals breaking behavioral change

⚠️ Minor Concerns:

  • Breaking Change: Users with performance-sensitive workflows who relied on cfg/dataflow being disabled by default will now get slower builds (mitigated by clear --no-* options)
  • Version Bump: Major version bump (3.0.0) is appropriate but worth noting

🔍 Security/Performance: No security issues. Performance impact is intentional (more analysis by default) with clear opt-out path.

Root Cause Analysis

Problem Being Solved: Poor UX where cfg/dataflow commands returned empty results because features weren't enabled during build
Why This Approach: Better defaults with clear opt-out rather than confusing opt-in that users forgot
Risk Assessment: Low risk - well-implemented behavioral change with proper versioning

Critical Concerns

None. This is a well-executed UX improvement with appropriate safeguards. The critical bug identified by Greptile in the incremental build logic has been fixed in commit aa5b989.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a legitimate UX improvement that solves a real problem. The implementation is clean, comprehensive, and follows established patterns. The major version bump appropriately signals the behavioral change. Users who need performance optimization have clear opt-out flags. All technical issues have been addressed.

Minor suggestions for future:

  • Consider documenting the performance impact in README/docs
  • Could add timing comparison in help text to help users decide on opt-out

View job run: https://github.com/optave/codegraph/actions/runs/22653353523

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 4, 2026

Additional Comments (1)

src/builder.js, line 549
pre-existing bug: checking wrong table for dataflow data

dataflow edges are stored in the dataflow table (see db.js migration v10, dataflow.js line 549), not in the edges table. This check always returns true (or throws), causing unnecessary re-parsing on incremental builds when dataflow data already exists.

    const needsDataflow =
      opts.dataflow !== false &&
      (() => {
        try {
          return db.prepare('SELECT COUNT(*) as c FROM dataflow').get().c === 0;
        } catch {
          return true;
        }
      })();

@claude
Copy link

claude bot commented Mar 4, 2026

Claude finished @carlos-alm's task —— View job


Claude finished reviewing this PR —— View job


PR Review: Enable CFG and Dataflow by Default

Reviewing PR #312 to enable --cfg and --dataflow by default on builds.

Review Checklist

  • Read repository guidelines (CLAUDE.md)
  • Phase 1: Root Cause Analysis - understand the problem being solved
  • Phase 2: Configuration Change Detection - check for risky changes
  • Phase 3: Test Integrity - verify no tests were weakened
  • Phase 4: Code Quality - assess implementation quality
  • Provide final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Users were encountering empty results when running cfg or dataflow commands because these features required explicit opt-in via --cfg and --dataflow flags during the build phase. This created a poor user experience where commands would fail silently or return empty data.

Why This Approach: The solution inverts the opt-in behavior to opt-out using Commander.js's --no-* convention. Now cfg: true and dataflow: true are the defaults, and users can disable with --no-cfg/--no-dataflow for faster builds.

Phase 2: Configuration Change Detection ✅

No High-Risk Changes Detected:

  • No linting rules disabled or relaxed
  • No test timeouts increased or coverage lowered
  • No TypeScript strict mode changes
  • One version bump: codegraph-core Cargo.toml 2.6.03.0.0 (legitimate major version for behavioral change)

Phase 3: Test Integrity ✅

No Test Weakening Detected:

  • No test files were modified
  • No assertions removed or made less strict
  • No eslint-disable comments added
  • Changes are purely to default behavior, not test expectations

Phase 4: Code Quality ✅

Implementation Quality:

✅ Positive Aspects:

  • Consistent API Design: Uses Commander.js --no-* convention matching existing --no-incremental
  • Backward Compatible: Existing scripts using explicit --cfg/--dataflow will continue working (Commander ignores unknown positive flags)
  • User Experience Fix: Eliminates "empty results" surprise when users run cfg/dataflow commands
  • Comprehensive Updates: All error messages and documentation updated consistently
  • Missing Field Added: dataflowMs timing now properly exposed in build return object (src/builder.js:1390)
  • API Consistency: Both CLI and programmatic API now default to enabled via opts.cfg !== false pattern

✅ Code Quality:

  • Clean, minimal changes focused on the specific issue
  • Proper comment updates reflecting new behavior (src/builder.js:527-536)
  • Error message improvements guide users correctly (src/cfg.js:1245, src/dataflow.js:738,881,1011)
  • README documentation accurately updated (lines 211-212, 330)
  • MCP tool descriptions updated appropriately (confirmed in commit history)

⚠️ Pre-Existing Bug Identified:

  • src/builder.js:542: Checking wrong table for dataflow data detection. Code checks edges table for flows_to/returns/mutates but dataflow data is stored in the dataflow table (created in db.js migration v10, inserted at dataflow.js:549)
  • This will cause unnecessary dataflow re-parsing on incremental builds when dataflow data already exists
  • This is a pre-existing issue, not introduced by this PR, but becomes more visible with dataflow enabled by default

⚠️ Minor Concerns:

  • Breaking Change: Users with performance-sensitive workflows who relied on cfg/dataflow being disabled by default will now get slower builds (mitigated by clear --no-* options)
  • Version Bump: Major version bump (3.0.0) is appropriate but worth noting

🔍 Security/Performance: No security issues. Performance impact is intentional (more analysis by default) with clear opt-out path.

Root Cause Analysis

Problem Being Solved: Poor UX where cfg/dataflow commands returned empty results because features weren't enabled during build
Why This Approach: Better defaults with clear opt-out rather than confusing opt-in that users forgot
Risk Assessment: Low risk - well-implemented behavioral change with proper versioning

Critical Concerns

Pre-existing bug should be fixed in follow-up: The dataflow detection logic in src/builder.js:542 checks the wrong table, causing unnecessary re-parsing. This doesn't block the PR but should be addressed soon since dataflow will now run more frequently.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a legitimate UX improvement that solves a real problem. The implementation is clean, comprehensive, and follows established patterns. The major version bump appropriately signals the behavioral change. Users who need performance optimization have clear opt-out flags. The pre-existing dataflow table bug doesn't impact the core functionality of this PR.

Follow-up suggestion: Fix the dataflow table check bug in src/builder.js:542 by changing the query to SELECT COUNT(*) as c FROM dataflow instead of checking the edges table.


@carlos-alm
Copy link
Contributor Author

Fixed in 7b91262 — the needsDataflow check was querying the edges table for flows_to/returns/mutates kinds, but dataflow edges are stored in the separate dataflow table. This caused every incremental no-change build to re-run dataflow analysis (always finding 0 in the wrong table). Now queries SELECT COUNT(*) FROM dataflow instead.

All 16 build integration tests pass locally.

@greptileai

@claude
Copy link

claude bot commented Mar 4, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Enable CFG and Dataflow by Default

Reviewing PR #312 to enable --cfg and --dataflow by default on builds.

Review Checklist

  • Read repository guidelines (CLAUDE.md)
  • Phase 1: Root Cause Analysis - understand the problem being solved
  • Phase 2: Configuration Change Detection - check for risky changes
  • Phase 3: Test Integrity - verify no tests were weakened
  • Phase 4: Code Quality - assess implementation quality
  • Provide final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Users were encountering empty results when running cfg or dataflow commands because these features required explicit opt-in via --cfg and --dataflow flags during the build phase. This created a poor user experience where commands would fail silently or return empty data.

Why This Approach: The solution inverts the opt-in behavior to opt-out using Commander.js's --no-* convention. Now cfg: true and dataflow: true are the defaults, and users can disable with --no-cfg/--no-dataflow for faster builds.

Phase 2: Configuration Change Detection ✅

No High-Risk Changes Detected:

  • No linting rules disabled or relaxed
  • No test timeouts increased or coverage lowered
  • No TypeScript strict mode changes
  • One version bump: codegraph-core Cargo.toml 2.6.03.0.0 (legitimate major version for behavioral change)

Phase 3: Test Integrity ✅

No Test Weakening Detected:

  • No test files were modified
  • No assertions removed or made less strict
  • No eslint-disable comments added
  • Changes are purely to default behavior, not test expectations

Phase 4: Code Quality ✅

Implementation Quality:

✅ Positive Aspects:

  • Consistent API Design: Uses Commander.js --no-* convention matching existing --no-incremental
  • Backward Compatible: Existing scripts using explicit --cfg/--dataflow will continue working (Commander ignores unknown positive flags)
  • User Experience Fix: Eliminates "empty results" surprise when users run cfg/dataflow commands
  • Comprehensive Updates: All error messages and documentation updated consistently
  • Missing Field Added: dataflowMs timing now properly exposed in build return object (src/builder.js:1384)
  • API Consistency: Both CLI and programmatic API now default to enabled via opts.cfg !== false pattern

✅ Code Quality:

  • Clean, minimal changes focused on the specific issue
  • Proper comment updates reflecting new behavior (src/builder.js:525-536)
  • Error message improvements guide users correctly (src/cfg.js:1245, src/dataflow.js:738,881,1011)
  • README documentation accurately updated (lines 211-212, 330)
  • MCP tool descriptions updated appropriately (src/mcp.js:641,661)

✅ Technical Correctness:

  • Fixed incremental build logic: Latest commit 7b91262 correctly fixed the dataflow table check bug
  • Proper default handling: Both CLI (via Commander's --no-* behavior) and programmatic API (via opts.cfg !== false) now consistently default to enabled
  • Version signaling: Major version bump to 3.0.0 appropriately signals breaking behavioral change

⚠️ Minor Concerns:

  • Breaking Change: Users with performance-sensitive workflows who relied on cfg/dataflow being disabled by default will now get slower builds (mitigated by clear --no-* options)
  • Version Bump: Major version bump (3.0.0) is appropriate but worth noting

🔍 Security/Performance: No security issues. Performance impact is intentional (more analysis by default) with clear opt-out path.

Root Cause Analysis

Problem Being Solved: Poor UX where cfg/dataflow commands returned empty results because features weren't enabled during build
Why This Approach: Better defaults with clear opt-out rather than confusing opt-in that users forgot
Risk Assessment: Low risk - well-implemented behavioral change with proper versioning

Critical Concerns

None. This is a well-executed UX improvement with appropriate safeguards. Previous technical issues identified by Greptile have been addressed in subsequent commits.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a legitimate UX improvement that solves a real problem. The implementation is clean, comprehensive, and follows established patterns. The major version bump appropriately signals the behavioral change. Users who need performance optimization have clear opt-out flags. All technical issues have been addressed through iterative commits.

Minor suggestions for future:

  • Consider documenting the performance impact in README/docs
  • Could add timing comparison in help text to help users decide on opt-out

@carlos-alm carlos-alm merged commit 13e3a09 into main Mar 4, 2026
16 checks passed
@carlos-alm carlos-alm deleted the feat/default-cfg-dataflow branch March 4, 2026 03:38
@github-actions github-actions bot locked and limited conversation to collaborators Mar 4, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant