feat: Halstead/LOC/MI metrics + benchmark dev/release versioning by carlos-alm · Pull Request #161 · optave/codegraph

carlos-alm · 2026-02-28T01:42:24Z

Summary

Native Halstead, LOC, and MI metrics: Extends the Rust native engine with Halstead complexity (volume, difficulty, effort, bugs), lines-of-code (total, logical, comment), and Maintainability Index computation — matching the JS implementation
Benchmark dev/release versioning: Fixes benchmark labeling so release benchmarks install from npm and measure the actual published package, while dev benchmarks are labeled "dev" and rolling (only keep latest)
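For reference, the metrics named above can be sketched from their textbook definitions. This is a minimal illustration, not the project's code: the function names, the `volume / 3000` bug estimate, and the normalized Maintainability Index variant are assumptions, and the actual JS and Rust implementations may count operators and operands or scale MI differently.

```javascript
// Textbook Halstead metrics from operator/operand counts.
// n1/n2: distinct operators/operands; N1/N2: total operators/operands.
function halstead({ n1, n2, N1, N2 }) {
  const vocabulary = n1 + n2;
  const length = N1 + N2;
  // Guard against log2(0) and division by zero on empty input.
  const volume = vocabulary > 0 ? length * Math.log2(vocabulary) : 0;
  const difficulty = n2 > 0 ? (n1 / 2) * (N2 / n2) : 0;
  const effort = difficulty * volume;
  const bugs = volume / 3000; // assumed variant of the delivered-bugs estimate
  return { vocabulary, length, volume, difficulty, effort, bugs };
}

// Maintainability Index, normalized to the 0..100 range (assumed variant).
function maintainabilityIndex(volume, cyclomatic, sloc) {
  const v = Math.max(volume, 1); // keep ln() defined
  const l = Math.max(sloc, 1);
  const raw = 171 - 5.2 * Math.log(v) - 0.23 * cyclomatic - 16.2 * Math.log(l);
  return Math.min(100, Math.max(0, (raw * 100) / 171));
}

// Made-up counts for a tiny function: 4 distinct operators and operands, 6 uses each.
const h = halstead({ n1: 4, n2: 4, N1: 6, N2: 6 });
const mi = maintainabilityIndex(h.volume, 1, 3);
console.log(h.volume, h.difficulty, h.effort, mi);
```

The `Math.max(..., 1)` guards play the same role as the division-by-zero and underflow guards the reviews mention: they keep the logarithms defined for empty or degenerate functions.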

Benchmark changes

New scripts/lib/bench-config.js shared helper with --version and --npm CLI flags
All 4 benchmark runners use CLI-driven version instead of reading package.json
All 4 report updaters: dev entries are rolling, releases remove dev, trends skip dev
Workflow: workflow_dispatch accepts version input, detects npm vs local mode, adds npm propagation wait
Re-benchmarked 2.4.0 from actual npm package (123 files, 801 nodes)
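A minimal sketch of how the `--version` and `--npm` flags could be parsed, assuming `--version` takes the next argument as its value and defaults to "dev". The helper name `parseBenchArgs` and its return shape are hypothetical and are not taken from `scripts/lib/bench-config.js`.

```javascript
// Hypothetical helper: parse benchmark CLI flags from an argv-style array.
function parseBenchArgs(argv) {
  let version = 'dev'; // default label for local (unpublished) runs
  let npm = false;
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === '--npm') {
      npm = true;
    } else if (argv[i] === '--version') {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith('--')) {
        throw new Error('--version requires a value');
      }
      version = value;
      i++; // skip the value we just consumed
    }
  }
  return { version, npm };
}

console.log(parseBenchArgs([]));
console.log(parseBenchArgs(['--npm', '--version', '2.4.0']));
```

In a real script the input would typically be `process.argv.slice(2)`.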

Test plan

All 832 existing tests pass
node scripts/benchmark.js outputs "version": "dev" by default
node scripts/benchmark.js --npm --version 2.4.0 installs from npm, outputs "version": "2.4.0"
Report update with dev entry: "dev" row at top, trends compare against latest release
Report update with release entry: release row appears, dev row removed
CI workflow_dispatch with version "dev" creates PR with dev entry
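The report-update rules in the test plan can be sketched as follows. This is an illustration under assumptions: the history is modeled as a newest-first array of entries with a `version` field, the helper names are hypothetical, and every number other than the 801 nodes reported for 2.4.0 is made up.

```javascript
// Hypothetical report history: newest entry first, each with a `version` label.
function addEntry(history, entry) {
  // Both dev and release updates drop any existing dev row:
  // dev entries are rolling, and a release supersedes the dev row.
  const withoutDev = history.filter((e) => e.version !== 'dev');
  return [entry, ...withoutDev];
}

// Trend baseline: the most recent non-dev entry other than `entry` itself.
function findPreviousRelease(history, entry) {
  return history.find((e) => e !== entry && e.version !== 'dev') ?? null;
}

let history = [{ version: '2.4.0', nodes: 801 }];
history = addEntry(history, { version: 'dev', nodes: 810 }); // made-up value
history = addEntry(history, { version: 'dev', nodes: 815 }); // replaces the earlier dev row
console.log(history.map((e) => e.version)); // dev row first, then 2.4.0
console.log(findPreviousRelease(history, history[0])); // the 2.4.0 entry

// A release (hypothetical version number) removes the dev row.
const released = addEntry(history, { version: '2.5.0', nodes: 820 });
console.log(released.map((e) => e.version));
```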

claude · 2026-02-28T01:42:35Z

Claude finished @carlos-alm's task

PR Review - Complete Analysis

Todo List:

Phase 1: Root Cause Analysis - understand the problem being solved
Phase 2: Configuration Change Detection - check for risky config changes
Phase 3: Test Integrity - verify no weakened assertions or lowered standards
Phase 4: Code Quality Review - analyze implementation quality
Provide final rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved:

Missing native metrics: The Rust native engine was only computing basic complexity (cognitive, cyclomatic, nesting) while the JS implementation had Halstead complexity, LOC, and Maintainability Index metrics. This created feature parity gaps.
Benchmark labeling confusion: Dev builds and release builds were using the same version labeling, making it impossible to distinguish between local testing vs. published package benchmarks.

Why This Approach:

Extends the native Rust engine to match JS implementation feature-for-feature
Adds proper versioning infrastructure to separate dev/release benchmark results
Uses shared configuration to ensure consistency across all 4 benchmark scripts

Risk Assessment: Medium - Large implementation with complex metric calculations, but addresses real feature gaps rather than hiding issues.

Phase 2: Configuration Change Detection ✅

Analysis: No risky configuration changes detected:

biome.json unchanged - no linting rules relaxed
No test timeouts increased or coverage thresholds lowered
No TypeScript strict mode flags disabled
No build configuration modified inappropriately

Phase 3: Test Integrity ✅

Analysis: Tests appear sound:

No test files modified in this PR
Existing complexity tests already support extended metrics (LOC, Halstead, MI)
No weakened assertions or skipped validations
PR claims "All 832 existing tests pass"

Phase 4: Code Quality Review ✅

Strengths:

Comprehensive implementation: All 8 language extractors updated consistently to use compute_all_metrics
Faithful port: Rust implementation mirrors JS logic precisely with equivalent language rules
Proper error handling: Uses Option<ComplexityMetrics> return type, graceful fallbacks
Clean architecture: Single entry point compute_all_metrics for unified metric computation
Safe math: Division-by-zero guards, proper f64 rounding, saturation arithmetic

Code Quality Issues Found:

1. Potential Integer Overflow (crates/codegraph-core/src/complexity.rs:973):
```
let sloc = (loc_total.saturating_sub(blank_lines).saturating_sub(comment_lines)).max(1);
```
   Good use of saturating_sub to prevent underflow.
2. Test-only unsafe operations (crates/codegraph-core/src/complexity.rs:1274,1445,1479):
   - Multiple unwrap() calls in test code only - acceptable for tests
   - Production code uses safe error handling
3. Benchmark Infrastructure (scripts/lib/bench-config.js):
   - Robust npm installation retry logic with exponential backoff
   - Proper cleanup of temp directories
   - Clear separation of dev vs release modes
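The retry-with-backoff pattern mentioned for the npm install step can be sketched as below. This is a generic illustration, not the code in `scripts/lib/bench-config.js`: the helper names, the default retry count, and the base delay are assumptions. It is written synchronously because the install itself uses `execFileSync`.

```javascript
// Blocking sleep for synchronous scripts (Atomics.wait is allowed on Node's main thread).
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Hypothetical helper: run `attempt` up to `retries` times, doubling the delay
// between failures (baseMs, 2 * baseMs, 4 * baseMs, ...).
function retryWithBackoff(attempt, { retries = 5, baseMs = 1000, sleep = sleepSync } = {}) {
  let lastError;
  for (let i = 0; i < retries; i++) {
    try {
      return attempt(i);
    } catch (err) {
      lastError = err;
      // No point sleeping after the final failed attempt.
      if (i < retries - 1) sleep(baseMs * 2 ** i);
    }
  }
  throw lastError;
}

// Usage: fail twice, then succeed; a recording sleep avoids real waiting.
const delays = [];
const result = retryWithBackoff(
  (i) => {
    if (i < 2) throw new Error('not yet published');
    return 'installed';
  },
  { sleep: (ms) => delays.push(ms) },
);
console.log(result, delays);
```

Injecting `sleep` keeps the helper testable without waiting for real time to pass.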

Minor Concerns:

Line 1002-1006: Redundant unwrap_or pattern could be simplified
Very long function signatures due to comprehensive metric collection (acceptable given functionality)

Critical Concerns

None identified. This is a legitimate feature enhancement that:

Solves real parity gaps between native/JS engines
Implements complex algorithms correctly with proper safeguards
Does not weaken any existing functionality or tests
Follows established patterns in the codebase

Final Recommendation

## Root Cause Analysis
**Problem Being Solved**: Native Rust engine lacked Halstead/LOC/MI metrics vs JS implementation; benchmark versioning couldn't distinguish dev vs release builds
**Why This Approach**: Faithful port of JS complexity algorithms to Rust with shared benchmark configuration for consistency
**Risk Assessment**: Medium complexity but addresses real feature gaps, not hiding issues

## Critical Concerns
None identified. Implementation follows established patterns, uses proper error handling, and extends functionality without weakening existing code.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)  
- Action: **APPROVE WITH CONDITIONS**
- Reasoning: High-quality implementation solving real feature parity gaps. Comprehensive test coverage already exists. Minor optimization opportunities but no blocking issues. Condition: Verify the 832 existing tests actually pass in CI.

The implementation is technically sound with proper safeguards. The benchmark infrastructure improvements are well-architected. This enhances the native engine capabilities substantially while maintaining code quality standards.

greptile-apps · 2026-02-28T01:46:12Z

Greptile Summary

Extends native Rust engine with comprehensive complexity metrics (Halstead volume/difficulty/effort/bugs, LOC, MI) matching the JS implementation, and fixes benchmark versioning to correctly distinguish dev vs release benchmarks by installing from npm for releases.

Key changes:

Added compute_all_metrics to Rust complexity module for single-pass metric computation
All 9 language extractors updated to use new metrics function
Created shared bench-config.js helper for CLI-driven version resolution
All 4 benchmark scripts and 4 report updaters refactored for dev/release handling
Workflow accepts version input, waits for npm propagation, benchmarks actual published packages
Re-benchmarked v2.4.0 from npm (123 files, 801 nodes)

Confidence Score: 5/5

Safe to merge with minimal risk
Well-structured refactor with clear separation of concerns. Rust implementation mirrors tested JS logic. Only minor style improvement suggested (removing unnecessary shell flag).
No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/lib/bench-config.js | New shared benchmark config helper; handles CLI args and npm installation with retry logic |
| crates/codegraph-core/src/complexity.rs | Adds Halstead, LOC, and MI metrics to Rust native engine; mirrors JS implementation |
| src/complexity.js | Adds computeAllMetrics for single-pass complexity+Halstead+LOC+MI calculation |
| scripts/update-benchmark-report.js | Handles dev/release versioning: dev entries are rolling, releases remove dev |
| .github/workflows/benchmark.yml | Accepts version input, detects npm vs local mode, adds npm propagation wait |
| crates/codegraph-core/src/types.rs | Adds HalsteadMetrics, LocMetrics to ComplexityMetrics type definition |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Benchmark Script] --> B{Parse CLI Args}
    B --> C{--npm flag?}
    C -->|Yes| D[npm mode]
    C -->|No| E[local mode]
    E --> F[Use repo src/]
    E --> G[version = 'dev']
    D --> H[Install from npm]
    H --> I[Retry with backoff]
    I --> J[Extract installed src/]
    I --> K[version = semver]
    F --> L[Run Benchmarks]
    J --> L
    L --> M[Update Report]
    M --> N{version == 'dev'?}
    N -->|Yes| O[Rolling: replace existing dev]
    N -->|No| P[Release: remove dev, add release]
    O --> Q[Generate Markdown]
    P --> Q
    Q --> R[Update README]

    S[Language Extractor] --> T[Parse AST Node]
    T --> U[compute_all_metrics]
    U --> V[Single DFS Walk]
    V --> W[Track Complexity]
    V --> X[Track Halstead]
    W --> Y[Compute LOC]
    X --> Y
    Y --> Z[Compute MI]
    Z --> AA[Return Full Metrics]
```

Last reviewed commit: 1071297

greptile-apps

24 files reviewed, 1 comment

greptile-apps · 2026-02-28T01:46:15Z

scripts/lib/bench-config.js

```diff
+execFileSync('npm', ['install', `@optave/codegraph@${version}`, '--no-audit', '--no-fund'], {
+	cwd: tmpDir,
+	stdio: 'pipe',
+	timeout: 120_000,
+	shell: true,
+});
```


shell: true is unnecessary with execFileSync and reduces security. execFileSync doesn't use shell by default when args are separate.

Suggested change

```diff
-execFileSync('npm', ['install', `@optave/codegraph@${version}`, '--no-audit', '--no-fund'], {
-	cwd: tmpDir,
-	stdio: 'pipe',
-	timeout: 120_000,
-	shell: true,
-});
+execFileSync('npm', ['install', `@optave/codegraph@${version}`, '--no-audit', '--no-fund'], {
+	cwd: tmpDir,
+	stdio: 'pipe',
+	timeout: 120_000,
+});
```

shell: true is required here for Windows compatibility — npm is a .cmd file on Windows and execFileSync without shell cannot resolve it (ENOENT). Since the arguments are passed as an array, Node.js still handles escaping properly, so there is no injection risk.
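One possible middle ground, sketched here as an assumption rather than as what the PR does, is to enable the shell only on Windows and to validate the version string before it reaches the command line. Node's `child_process` documentation cautions against passing unsanitized input when `shell` is enabled, so a strict pattern check is a cheap extra safeguard. The helper names and the version pattern below are hypothetical.

```javascript
// Hypothetical validation: accept only a plain semver-like string,
// so nothing with shell metacharacters can reach the command line.
const VERSION_PATTERN = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

function assertSafeVersion(version) {
  if (!VERSION_PATTERN.test(version)) {
    throw new Error(`refusing to install unexpected version: ${version}`);
  }
  return version;
}

// Build execFileSync options, enabling the shell only where npm is a .cmd shim.
function npmExecOptions(cwd, platform = process.platform) {
  return {
    cwd,
    stdio: 'pipe',
    timeout: 120_000,
    shell: platform === 'win32',
  };
}

console.log(npmExecOptions('/tmp/bench', 'linux').shell);
console.log(npmExecOptions('C:\\bench', 'win32').shell);
console.log(assertSafeVersion('2.4.0'));
```

The options object would be passed as the third argument to `execFileSync('npm', [...], options)`, with `assertSafeVersion(version)` called before the package spec is built.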

Add shared bench-config.js with --version and --npm CLI flags so benchmark scripts can install and measure the actual published npm package instead of always using local source labeled with package.json version.

- New scripts/lib/bench-config.js: parseArgs(), resolveBenchmarkSource(), srcImport() with retry/backoff for npm propagation
- All 4 benchmark runners use CLI-driven version (default "dev") and srcImport() for dynamic imports; call cleanup() on exit
- All 4 report updaters: dev entries are rolling (one at a time), releases remove dev entry, findPrevRelease() skips dev for trends
- Workflow: workflow_dispatch accepts version input, detects npm vs local mode, adds npm propagation wait, removes artifact download steps
- Re-benchmarked 2.4.0 from actual npm package (123 files, 801 nodes)

Impact: 10 functions changed, 10 affected

Impact: 1 functions changed, 0 affected

claude · 2026-02-28T02:07:14Z

Claude finished @carlos-alm's task

PR Review - Comprehensive Analysis

Todo List:

Phase 1: Root Cause Analysis - understand the problem being solved
Phase 2: Configuration Change Detection - check for risky config changes
Phase 3: Test Integrity - verify no weakened assertions or lowered standards
Phase 4: Code Quality Review - analyze implementation quality
Provide final rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved:

Missing native metrics: The Rust native engine was only computing basic complexity (cognitive, cyclomatic, nesting) while the JS implementation had Halstead complexity, LOC, and Maintainability Index metrics. This created feature parity gaps.
Benchmark labeling confusion: Dev builds and release builds were using the same version labeling, making it impossible to distinguish between local testing vs. published package benchmarks.

Why This Approach:

Extends the native Rust engine to match JS implementation feature-for-feature
Adds proper versioning infrastructure to separate dev/release benchmark results
Uses shared configuration to ensure consistency across all 4 benchmark scripts

Risk Assessment: Medium - Large implementation with complex metric calculations, but addresses real feature gaps rather than hiding issues.

Phase 2: Configuration Change Detection ✅

Analysis: No risky configuration changes detected:

biome.json unchanged - no linting rules relaxed
No test timeouts increased or coverage thresholds lowered
No build configuration or module resolution modified
No TypeScript strict mode flags disabled

Phase 3: Test Integrity ✅

Analysis: Tests appear sound:

No test files modified in this PR
Existing complexity tests already support extended metrics (LOC, Halstead, MI)
No weakened assertions or skipped validations
PR claims "All 832 existing tests pass"

Phase 4: Code Quality Review ✅

Strengths:

Comprehensive implementation: All 8 language extractors updated consistently to use compute_all_metrics
Faithful port: Rust implementation mirrors JS logic precisely with equivalent language rules
Proper error handling: Uses Option<ComplexityMetrics> return type, graceful fallbacks
Clean architecture: Single entry point compute_all_metrics for unified metric computation
Safe math: Division-by-zero guards, proper f64 rounding, saturation arithmetic

Code Quality Issues Found:

1. Potential Integer Overflow (crates/codegraph-core/src/complexity.rs:973):
```
let sloc = (loc_total.saturating_sub(blank_lines).saturating_sub(comment_lines)).max(1);
```
   Good use of saturating_sub to prevent underflow.
2. Test-only unsafe operations (crates/codegraph-core/src/complexity.rs:1274,1445,1479):
   - Multiple unwrap() calls in test code only - acceptable for tests
   - Production code uses safe error handling
3. Benchmark Infrastructure (scripts/lib/bench-config.js):
   - Robust npm installation retry logic with exponential backoff
   - Proper cleanup of temp directories
   - Clear separation of dev vs release modes

Minor Concerns:

Line 1002-1006: Redundant unwrap_or pattern could be simplified
Very long function signatures due to comprehensive metric collection (acceptable given functionality)

Critical Concerns

None identified. This is a legitimate feature enhancement that:

Solves real parity gaps between native/JS engines
Implements complex algorithms correctly with proper safeguards
Does not weaken any existing functionality or tests
Follows established patterns in the codebase

Final Recommendation

## Root Cause Analysis
**Problem Being Solved**: Native Rust engine lacked Halstead/LOC/MI metrics vs JS implementation; benchmark versioning couldn't distinguish dev vs release builds
**Why This Approach**: Faithful port of JS complexity algorithms to Rust with shared benchmark configuration for consistency
**Risk Assessment**: Medium complexity but addresses real feature gaps, not hiding issues

## Critical Concerns
None identified. Implementation follows established patterns, uses proper error handling, and extends functionality without weakening existing code.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)  
- Action: **APPROVE WITH CONDITIONS**
- Reasoning: High-quality implementation solving real feature parity gaps. Comprehensive test coverage already exists. Minor optimization opportunities but no blocking issues. Condition: Verify the 832 existing tests actually pass in CI.

The implementation is technically sound with proper safeguards. The benchmark infrastructure improvements are well-architected. This enhances the native engine capabilities substantially while maintaining code quality standards.

carlos-alm · 2026-02-28T02:44:44Z

Addressing review feedback

Minor concern — redundant unwrap_or (line 1002-1006): Fixed in 150c3eb — simplified Some(halstead.unwrap_or(default)) to halstead.or(Some(default)).

Condition — verify 832 tests pass in CI: Tests pass locally on Windows (Node 22). CI will confirm on Linux once the PR merges. The Halstead/LOC/MI commit already passed CI as PR #159.

shell: true on execFileSync (Greptile): Required for Windows — npm is a .cmd file and execFileSync without shell gets ENOENT. Args passed as array so escaping is safe.

All other findings were positive — no action needed.

carlos-alm · 2026-02-28T02:51:17Z

@claude

claude · 2026-02-28T02:51:30Z

Claude finished @carlos-alm's task

PR Status Check

Todo List:

Acknowledge the mention and assess current PR status
Review recent changes addressing feedback
Provide updated assessment
Ask for specific guidance if needed

Current Status ✅

Great to see you've tagged me! Here's the current status of PR #161:

Review Feedback Addressed:

✅ Redundant unwrap_or pattern fixed in commit 150c3eb - I can confirm the code now uses halstead.or(Some(default)) instead of Some(halstead.unwrap_or(default))
✅ All CI checks passing - 6/6 checks successful including full CI suite
✅ Tests verified - You confirmed 832 tests pass locally, and CI confirms success

Outstanding Items from Previous Reviews:

The shell: true discussion with Greptile is resolved - you correctly explained the Windows compatibility requirement
No critical concerns remain from either review

Current Assessment:
This PR is in excellent shape! The implementation is solid, feedback has been addressed, and all automated checks are green. The feature parity between Rust native engine and JS implementation is now complete with proper benchmark versioning.

What do you need?

Since you didn't include specific instructions with your @claude mention, could you let me know:

Do you need me to review any specific aspect?
Are you looking for final approval/merge recommendation?
Is there something specific you'd like me to check or help with?

The PR appears ready for merge from a technical perspective! 🚀

greptile-apps bot reviewed Feb 28, 2026

github-actions bot added 2 commits February 27, 2026 19:06

refactor: simplify redundant unwrap_or pattern in complexity.rs

150c3eb

Impact: 1 functions changed, 0 affected

carlos-alm force-pushed the feat/native-halstead-loc-mi branch from 5ddbe87 to 150c3eb, February 28, 2026 02:07

carlos-alm merged commit 0e41ceb into main Feb 28, 2026
19 checks passed

carlos-alm deleted the feat/native-halstead-loc-mi branch February 28, 2026 03:04
