feat: extend CFG to all supported languages #283

Merged
carlos-alm merged 2 commits into main from feat/cfg-all-languages
Mar 3, 2026

@carlos-alm
Contributor

Summary

  • Extend CFG support from JS/TS/TSX to all 10 supported languages by adding Python, Go, Rust, Java, C#, Ruby, and PHP
  • Refactor core algorithm to handle 3 else-if patterns (wrapper, siblings, direct alternative), flexible switch/try-catch body extraction, expression_statement unwrapping (Rust), and new constructs (infinite loop, unless/until)
  • Fix C# language ID mismatch `c_sharp` → `csharp` in COMPLEXITY_RULES, HALSTEAD_RULES, and COMMENT_PREFIXES (was silently broken since C# was added)
  • Fix C# `for_each_statement` → `foreach_statement` to match actual tree-sitter node type
  • 43 new unit tests covering all 7 new languages (empty fn, if/else, loops, break/continue, try/catch, switch/match, language-specific features)
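
To illustrate why the `c_sharp` / `csharp` mismatch could fail silently, here is a minimal sketch. The map contents and the `rulesFor` helper are hypothetical and are not the project's code; the only detail taken from this PR is that the parser reports `csharp` while the rule maps were keyed by `c_sharp`.

```javascript
// Hypothetical stand-in for a rule map keyed by language ID (before the fix).
const BROKEN_RULES = new Map([
  ['javascript', { branchNodes: ['if_statement'] }],
  ['c_sharp', { branchNodes: ['if_statement'] }], // key differs from the parser's ID
]);

// Same map after renaming the key to match the parser's ID.
const FIXED_RULES = new Map([
  ['javascript', { branchNodes: ['if_statement'] }],
  ['csharp', { branchNodes: ['if_statement'] }],
]);

// Hypothetical lookup: a missing entry yields null instead of throwing,
// so callers quietly skip the language rather than reporting an error.
function rulesFor(map, langId) {
  return map.get(langId) ?? null;
}

console.log(rulesFor(BROKEN_RULES, 'csharp')); // null
console.log(rulesFor(FIXED_RULES, 'csharp') !== null); // true
```

Because the lookup returns a "no rules" value instead of raising, nothing flags the mismatch until someone checks C# results by hand.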

Changes

| File | Changes |
| --- | --- |
| `src/cfg.js` | 7 language rule objects, `CFG_RULES`/`CFG_LANG_IDS` registration, `processIf` 3-pattern refactor, `processInfiniteLoop`, `expression_statement` unwrapping, `blockNodes` set in `getStatements`, flexible `processSwitch`/`processTryCatch` |
| `src/complexity.js` | `c_sharp` → `csharp` (3 maps), `for_each_statement` → `foreach_statement`, `elseViaAlternative: true` for C# |
| `tests/unit/cfg.test.js` | 43 new tests across Python, Go, Rust, Java, C#, Ruby, PHP |
| `tests/unit/complexity.test.js` | `c_sharp` → `csharp` in test assertions |

Test plan

  • All 67 CFG unit tests pass (24 existing + 43 new)
  • All 11 CFG integration tests pass
  • All 1357 tests in full suite pass (0 failures)
  • Lint clean on all changed files
  • Verify build --cfg on a multi-language project end-to-end

Add control flow graph support for Python, Go, Rust, Java, C#, Ruby,
and PHP. The core algorithm was already language-agnostic; this adds
per-language CFG_RULES mapping tree-sitter node types and refactors
processIf (3 else-if patterns), getStatements (blockNodes set),
processStatement (expression_statement unwrapping, unless/until), and
processSwitch/processTryCatch for cross-language flexibility. Adds
processInfiniteLoop for Rust's `loop {}`.

Also fixes C# language ID mismatch (`c_sharp` → `csharp`) in
COMPLEXITY_RULES, HALSTEAD_RULES, and COMMENT_PREFIXES, and corrects
`for_each_statement` → `foreach_statement` to match tree-sitter.

Impact: 8 functions changed, 11 affected
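
As a rough illustration of the per-language rule mapping described in the commit message, a registry could look like the sketch below. Only `elifNode`, `elseClause`, and `elseViaAlternative` are names that appear in this PR; `ifNode`, `whileNode`, and the `_SKETCH` identifiers are assumptions made for the example, and the real rule objects in `src/cfg.js` carry more fields.

```javascript
// Sketch only: field names other than elifNode, elseClause, and
// elseViaAlternative are illustrative assumptions, not the real schema.
const PYTHON_CFG = {
  ifNode: 'if_statement',
  elifNode: 'elif_clause', // elif/else clauses are siblings inside the if node
  elseClause: 'else_clause',
  elseViaAlternative: false,
  whileNode: 'while_statement',
};

const GO_CFG = {
  ifNode: 'if_statement',
  elifNode: null,
  elseClause: null,
  elseViaAlternative: true, // the alternative field is the else block or a nested if
  whileNode: null, // Go expresses every loop with `for`
};

// Registry keyed by the language ID that the parser reports.
const CFG_RULES_SKETCH = new Map([
  ['python', PYTHON_CFG],
  ['go', GO_CFG],
]);

// The set of CFG-capable languages is derived from the registry,
// so registering a rule object is the only step needed to enable a language.
const CFG_LANG_IDS_SKETCH = new Set(CFG_RULES_SKETCH.keys());

console.log(CFG_LANG_IDS_SKETCH.has('go')); // true
```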
@claude

claude bot commented Mar 3, 2026

Claude finished @carlos-alm's task


PR Review: Extend CFG to All Supported Languages

Conducting comprehensive review following the 4-phase methodology:

Todo List

  • Phase 1: Root Cause Analysis - Understand problem being solved and approach justification
  • Phase 2: Configuration Change Detection - Check for rule relaxations or workarounds
  • Phase 3: Test Integrity - Verify test quality and coverage
  • Phase 4: Code Quality Review - Examine implementation details and potential issues
  • Final Assessment - Provide rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved: CFG (Control Flow Graph) construction was previously limited to JavaScript/TypeScript/TSX. This PR adds the remaining seven languages (Python, Go, Rust, Java, C#, Ruby, PHP), so all 10 supported languages get the same control flow analysis.

Why This Approach: The approach is well-justified:

  • Declarative rule-based system: Each language defines its CFG rules mapping tree-sitter node types to control flow constructs
  • Unified algorithm: Core CFG building logic remains the same, with language-specific node type mappings
  • Fixes real bugs: Addresses C# language ID mismatch (`c_sharp` vs `csharp`) and incorrect node type (`for_each_statement` → `foreach_statement`)

Risk Assessment: LOW RISK - This is pure feature expansion with extensive test coverage. No existing functionality is modified.

Phase 2: Configuration Change Detection ✅

No problematic configuration changes detected. The changes are:

  1. Language rule additions: 7 new CFG rule objects for new languages - appropriate expansion
  2. Bug fixes: `c_sharp` → `csharp` and `for_each_statement` → `foreach_statement` - these fix actual mismatches between parser.js (which uses `csharp`) and complexity.js (which was using `c_sharp`)

These are genuine fixes, not workarounds to silence errors.

Phase 3: Test Integrity ✅

Excellent test coverage:

  • 43 new unit tests added across all 7 new languages
  • No weakened assertions - tests verify proper CFG structure (entry/exit blocks, correct edge types, loop constructs)
  • Language-specific features tested: Ruby's unless/until, Rust's loop {}, Go's type switches, etc.
  • Integration maintained: Existing 24 JS/TS tests remain unchanged

The test design follows established patterns and maintains high quality standards.

Phase 4: Code Quality Review ⭐

Strengths:

  1. Excellent architectural design:

    • Clean separation of language-specific rules from core algorithm
    • Handles 3 different else-if patterns (wrapper, siblings, direct alternative) correctly
    • Proper abstraction with CFG_RULES registry
  2. Robust implementation:

    • Expression statement unwrapping for Rust expressions properly handled
    • Flexible body extraction for switch/try-catch with field access fallback
    • Infinite loop support for Rust's loop {} construct
    • Unless/until support for Ruby
  3. Bug fixes are correct:

    • `c_sharp` → `csharp` aligns with parser.js L251: id: 'csharp'
    • `for_each_statement` → `foreach_statement` matches actual tree-sitter C# grammar
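
The expression statement unwrapping mentioned above can be pictured with the sketch below. It is not the code from `src/cfg.js`: `mockNode`, `CONTROL_FLOW_TYPES`, and `unwrapExpressionStatement` are hypothetical names, and the mock mimics only the `type`, `namedChildCount`, and `namedChild(i)` members of a tree-sitter node. The node type names are those of the tree-sitter Rust grammar.

```javascript
// Mock of the small subset of the tree-sitter node API used in this sketch.
function mockNode(type, children = []) {
  return {
    type,
    namedChildCount: children.length,
    namedChild: (i) => children[i] ?? null,
  };
}

// Rust control flow constructs are expressions, so as statements they
// appear wrapped in an expression_statement node.
const CONTROL_FLOW_TYPES = new Set([
  'if_expression',
  'loop_expression',
  'match_expression',
]);

// Return the wrapped control flow node when there is one, otherwise the
// original node, so the caller can dispatch on node.type as usual.
function unwrapExpressionStatement(node) {
  if (node.type === 'expression_statement' && node.namedChildCount === 1) {
    const inner = node.namedChild(0);
    if (inner && CONTROL_FLOW_TYPES.has(inner.type)) return inner;
  }
  return node;
}

const wrappedIf = mockNode('expression_statement', [mockNode('if_expression')]);
console.log(unwrapExpressionStatement(wrappedIf).type); // if_expression
```

Plain statements such as a wrapped function call pass through unchanged, so they still end up as ordinary statements in the current basic block.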

Minor concerns:

  1. Large rule objects (lines 19-294): While comprehensive, the rule definitions are verbose. Consider if some patterns could be consolidated.

  2. No validation: The CFG_RULES map doesn't validate that rule objects have required fields. Could add runtime validation.

  3. WASM fallback complexity (lines 1122-1137): Multiple parser imports could be simplified, but this is existing technical debt, not introduced by this PR.
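
For the validation concern, one possible shape is a small check run when rule objects are registered. This is a sketch under assumptions: `REQUIRED_FIELDS` and `validateRules` are hypothetical, and the field list is a guess based on the option names mentioned in this PR, not the project's actual schema.

```javascript
// Hypothetical list of fields every rule object must define.
// A field may be null (meaning "not applicable"), but it must be present.
const REQUIRED_FIELDS = ['elifNode', 'elseClause', 'elseViaAlternative'];

// Throw a descriptive error at registration time instead of letting a
// missing field surface later as a silently skipped construct.
function validateRules(langId, rules) {
  const missing = REQUIRED_FIELDS.filter((field) => !(field in rules));
  if (missing.length > 0) {
    throw new Error(`CFG rules for "${langId}" are missing: ${missing.join(', ')}`);
  }
  return rules;
}

// Passes: all three fields are present, even though two are null.
validateRules('csharp', { elifNode: null, elseClause: null, elseViaAlternative: true });
```

Requiring presence rather than truthiness keeps `null` usable as an explicit "this language has no such node" marker.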

Security/Performance: No concerns. Pure AST processing with no external input or command execution.


Root Cause Analysis

Problem Being Solved: Extend CFG functionality from JS/TS/TSX-only to all 10 supported languages, providing consistent control flow analysis across the entire codebase.

Why This Approach: Declarative rule-based system allows each language to define its control flow constructs while reusing the core CFG algorithm. Also fixes real bugs in C# language ID and node type mismatches.

Risk Assessment: LOW - Pure feature expansion with comprehensive test coverage. No existing functionality modified.

Critical Concerns (if any)

None identified. This is a well-implemented feature expansion with proper testing.

Final Recommendation

  • Rating: ⭐⭐⭐⭐ (4/5)
  • Action: APPROVE
  • Reasoning: Excellent implementation that extends CFG support to all languages with comprehensive testing. Fixes real bugs and follows established architectural patterns. The only missing element preventing 5 stars is the lack of end-to-end validation mentioned in the test plan.

Recommended next steps: After merge, verify build --cfg on a multi-language project as noted in the test plan checklist.


@greptile-apps
Contributor

greptile-apps bot commented Mar 3, 2026

Greptile Summary

Extends CFG support from JS/TS/TSX to all 10 supported languages by adding Python, Go, Rust, Java, C#, Ruby, and PHP.

Major changes:

  • Added 7 language-specific rule objects in cfg.js with proper node type mappings for each language's control flow constructs
  • Refactored processIf to handle 3 distinct else-if patterns: Wrapper (JS/TS/Rust), Siblings (Python/Ruby/PHP), and Direct (Go/Java/C#)
  • Added processInfiniteLoop for Rust's loop {} construct (exits only via break, no condition check)
  • Implemented flexible body extraction for switch/try-catch to accommodate different tree-sitter grammar structures
  • Added expression_statement unwrapping to handle Rust's expression-based control flow
  • Fixed critical C# bugs: language ID mismatch (`c_sharp` → `csharp`) affecting 3 maps, incorrect node type (`for_each_statement` → `foreach_statement`), and corrected if/else pattern (Pattern A → Pattern C)
  • 43 comprehensive tests covering empty functions, if/else chains, loops, break/continue, try/catch, switch/match, and language-specific features (unless/until for Ruby, infinite loop for Rust)

The implementation is thorough and handles the nuances of each language's syntax tree structure correctly.
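
To make the infinite loop handling concrete, the sketch below builds the blocks and edges for a Rust-style `loop {}` with an optional `break`. It is an illustration, not the project's `processInfiniteLoop`: the function name, block labels, and edge kinds are all assumptions.

```javascript
// Build a tiny CFG for `loop { body }`. There is no condition, so the
// header has a single outgoing edge and the exit block is reachable
// only through a break edge.
function buildInfiniteLoopCfg(bodyHasBreak) {
  const blocks = [];
  const edges = [];
  const newBlock = (label) => {
    const block = { id: blocks.length, label };
    blocks.push(block);
    return block;
  };
  const addEdge = (from, to, kind) => edges.push({ from: from.id, to: to.id, kind });

  const entry = newBlock('entry');
  const header = newBlock('loop_header');
  const body = newBlock('loop_body');
  const exit = newBlock('loop_exit');

  addEdge(entry, header, 'fallthrough');
  addEdge(header, body, 'fallthrough'); // unconditional: no true/false branch
  addEdge(body, header, 'loop_back');
  if (bodyHasBreak) addEdge(body, exit, 'break');

  return { blocks, edges, exitId: exit.id };
}

const withBreak = buildInfiniteLoopCfg(true);
console.log(withBreak.edges.filter((e) => e.to === withBreak.exitId).length); // 1
```

Contrast this with a `while` loop, where the header would also carry a false edge straight to the exit block.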

Confidence Score: 5/5

  • This PR is safe to merge with high confidence
  • The changes are well-structured with comprehensive test coverage (43 new tests across 7 languages), all existing tests pass, and the implementation correctly handles language-specific nuances. The C# fixes resolve real bugs that would have caused silent failures. The refactored control flow pattern handling is elegant and maintainable.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| `src/cfg.js` | Added 7 language rule objects, refactored `processIf` for 3 patterns, added `processInfiniteLoop`, flexible switch/try-catch body extraction, `expression_statement` unwrapping |
| `src/complexity.js` | Fixed C# language ID mismatch (`c_sharp` → `csharp`), corrected `for_each_statement` → `foreach_statement`, updated C# to use Pattern C (`elseViaAlternative: true`) |
| `tests/unit/cfg.test.js` | Added 43 comprehensive tests covering all 7 new languages with language-specific features (unless/until for Ruby, loop for Rust, etc.) |
| `tests/unit/complexity.test.js` | Updated test assertions to use `csharp` instead of `c_sharp` language ID |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    Start[processIf: Handle if/else chain] --> CheckElif{rules.elifNode<br/>defined?}
    
    CheckElif -->|Yes| PatternB[Pattern B: Siblings<br/>Python, Ruby, PHP]
    CheckElif -->|No| CheckAlt{alternative<br/>node exists?}
    
    PatternB --> ProcessSiblings[processElifSiblings<br/>iterate through children<br/>find elif/else nodes]
    
    CheckAlt -->|No| NoElse[Add false edge<br/>to join block]
    CheckAlt -->|Yes| CheckElseVia{elseViaAlternative<br/>AND not elseClause?}
    
    CheckElseVia -->|Yes| PatternC[Pattern C: Direct<br/>Go, Java, C#]
    CheckElseVia -->|No| CheckWrapper{alternative.type<br/>== elseClause?}
    
    PatternC --> IsIfNode{alternative is<br/>if_statement?}
    IsIfNode -->|Yes| RecurseC[Recurse processIf<br/>for else-if]
    IsIfNode -->|No| ElseBlockC[Process else block]
    
    CheckWrapper -->|Yes| PatternA[Pattern A: Wrapper<br/>JS/TS, Rust]
    CheckWrapper -->|No| NoElse
    
    PatternA --> CheckInner{inner child is<br/>if_statement?}
    CheckInner -->|Yes| RecurseA[Recurse processIf<br/>for else-if]
    CheckInner -->|No| ElseBlockA[Process else block]
    
    ProcessSiblings --> Join[Return join block]
    RecurseC --> Join
    ElseBlockC --> Join
    RecurseA --> Join
    ElseBlockA --> Join
    NoElse --> Join

Last reviewed commit: 2f3ac78
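
The decision order in the flowchart can be restated as a small classifier. This is a sketch, not the real `processIf`: `classifyIfPattern` and `mockIf` are hypothetical, the mock exposes only `childForFieldName` from the tree-sitter node API, and the "not elseClause" check is read here as "the alternative node is not an else-clause wrapper", which is one interpretation of the diagram.

```javascript
// Mock if-node exposing only the field lookup used below.
function mockIf(alternativeType) {
  const alternative = alternativeType ? { type: alternativeType } : null;
  return {
    type: 'if_statement',
    childForFieldName: (name) => (name === 'alternative' ? alternative : null),
  };
}

// Returns 'A' (wrapper), 'B' (siblings), 'C' (direct), or 'none' (no else),
// checking the conditions in the same order as the flowchart.
function classifyIfPattern(rules, ifNode) {
  if (rules.elifNode) return 'B';
  const alternative = ifNode.childForFieldName('alternative');
  if (!alternative) return 'none';
  if (rules.elseViaAlternative && alternative.type !== rules.elseClause) return 'C';
  if (alternative.type === rules.elseClause) return 'A';
  return 'none';
}

const jsRules = { elifNode: null, elseClause: 'else_clause', elseViaAlternative: false };
const csharpRules = { elifNode: null, elseClause: null, elseViaAlternative: true };

console.log(classifyIfPattern(jsRules, mockIf('else_clause'))); // A
console.log(classifyIfPattern(csharpRules, mockIf('if_statement'))); // C
```

With `elseClause: null` and `elseViaAlternative: true`, the C# configuration lands in Pattern C, which is the mismatch the review comment on the docstring points out.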

Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment

src/cfg.js Outdated
/**
* Process an if/else-if/else chain.
* Handles three patterns:
* A) Wrapper: alternative → else_clause → nested if or block (JS/TS, C#, Rust)

C# is listed under Pattern A (Wrapper), but its config has elseClause: null + elseViaAlternative: true, which routes it to Pattern C (Direct).

Suggested change
- * A) Wrapper: alternative → else_clause → nested if or block (JS/TS, C#, Rust)
+ * A) Wrapper: alternative → else_clause → nested if or block (JS/TS, Rust)

Contributor Author

Fixed in 2f3ac78 — moved C# from Pattern A to Pattern C in the comment to match the actual config (elseClause: null, elseViaAlternative: true).

Impact: 1 functions changed, 1 affected

@carlos-alm
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit aa8fb42 into main Mar 3, 2026
20 checks passed
@carlos-alm carlos-alm deleted the feat/cfg-all-languages branch March 3, 2026 06:31
@github-actions github-actions bot locked and limited conversation to collaborators Mar 3, 2026