feat: Halstead/LOC/MI metrics + benchmark dev/release versioning #161

Merged
carlos-alm merged 2 commits into main from feat/native-halstead-loc-mi on Feb 28, 2026

@carlos-alm
Contributor

Summary

  • Native Halstead, LOC, and MI metrics: Extends the Rust native engine with Halstead complexity (volume, difficulty, effort, bugs), lines-of-code (total, logical, comment), and Maintainability Index computation — matching the JS implementation
  • Benchmark dev/release versioning: Fixes benchmark labeling so release benchmarks install from npm and measure the actual published package, while dev benchmarks are labeled "dev" and rolling (only keep latest)
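
As a point of reference for the Halstead values named in the summary (volume, difficulty, effort, bugs), here is a minimal sketch using the textbook definitions. The function name and input shape are hypothetical, and the project's exact formulas (for instance, how it estimates bugs) may differ.

```javascript
// Textbook Halstead metrics from operator/operand counts.
// n1/n2: distinct operators/operands; N1/N2: total operators/operands.
// Hypothetical helper, not the project's implementation.
function halstead({ n1, n2, N1, N2 }) {
  const vocabulary = n1 + n2;
  const length = N1 + N2;
  // Guard log2(0) and division by zero for empty or operand-free input.
  const volume = vocabulary > 0 ? length * Math.log2(vocabulary) : 0;
  const difficulty = n2 > 0 ? (n1 / 2) * (N2 / n2) : 0;
  const effort = difficulty * volume;
  const bugs = volume / 3000; // classic "delivered bugs" estimate
  return { vocabulary, length, volume, difficulty, effort, bugs };
}

// Example: 4 distinct operators and operands, each used 8 times in total.
console.log(halstead({ n1: 4, n2: 4, N1: 8, N2: 8 }));
```

With these inputs the vocabulary is 8 and the length is 16, so the volume is 16 × log2(8) = 48.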

Benchmark changes

  • New scripts/lib/bench-config.js shared helper with --version and --npm CLI flags
  • All 4 benchmark runners use CLI-driven version instead of reading package.json
  • All 4 report updaters: dev entries are rolling, releases remove dev, trends skip dev
  • Workflow: workflow_dispatch accepts version input, detects npm vs local mode, adds npm propagation wait
  • Re-benchmarked 2.4.0 from actual npm package (123 files, 801 nodes)
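
The flag handling described above could look roughly like the following. This is a sketch under assumptions, not the contents of scripts/lib/bench-config.js: the parsing loop and the rule that --npm needs an explicit --version are hypothetical.

```javascript
// Hypothetical sketch of --version / --npm parsing; the real helper may differ.
function parseArgs(argv) {
  let version = 'dev'; // default label for local (non-npm) benchmarks
  let npm = false;
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === '--npm') {
      npm = true;
    } else if (argv[i] === '--version' && i + 1 < argv.length) {
      version = argv[++i];
    }
  }
  // Assumption: installing "dev" from npm makes no sense, so reject it.
  if (npm && version === 'dev') {
    throw new Error('--npm requires an explicit --version');
  }
  return { version, npm };
}

console.log(parseArgs([]));
console.log(parseArgs(['--npm', '--version', '2.4.0']));
```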

Test plan

  • All 832 existing tests pass
  • node scripts/benchmark.js outputs "version": "dev" by default
  • node scripts/benchmark.js --npm --version 2.4.0 installs from npm, outputs "version": "2.4.0"
  • Report update with dev entry: "dev" row at top, trends compare against latest release
  • Report update with release entry: release row appears, dev row removed
  • CI workflow_dispatch with version "dev" creates PR with dev entry
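
The report-update behavior in the test plan (a single rolling dev row, releases removing it, trends ignoring it) can be sketched as below. The history shape, the `addEntry` name, and the timing numbers are placeholders for illustration; only `findPrevRelease` is a name that appears in this PR, and its body here is an assumption.

```javascript
// Hypothetical history: newest entry first. The ms values are placeholders,
// not measured results.
function addEntry(history, entry) {
  // Any existing dev row is dropped either way: a new dev run replaces it,
  // and a release supersedes it.
  const withoutDev = history.filter((e) => e.version !== 'dev');
  return [entry, ...withoutDev];
}

// Trends compare against the nearest older release, never against dev.
function findPrevRelease(history, index) {
  for (let i = index + 1; i < history.length; i++) {
    if (history[i].version !== 'dev') return history[i];
  }
  return null;
}

let history = [{ version: '2.4.0', ms: 120 }, { version: '2.3.0', ms: 130 }];
history = addEntry(history, { version: 'dev', ms: 110 });
history = addEntry(history, { version: 'dev', ms: 105 }); // still one dev row
console.log(history.map((e) => e.version)); // [ 'dev', '2.4.0', '2.3.0' ]
console.log(findPrevRelease(history, 0).version); // 2.4.0
```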

@claude
claude bot commented Feb 28, 2026

Claude finished @carlos-alm's task


PR Review - Complete Analysis

Todo List:

  • Phase 1: Root Cause Analysis - understand the problem being solved
  • Phase 2: Configuration Change Detection - check for risky config changes
  • Phase 3: Test Integrity - verify no weakened assertions or lowered standards
  • Phase 4: Code Quality Review - analyze implementation quality
  • Provide final rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved:

  1. Missing native metrics: The Rust native engine was only computing basic complexity (cognitive, cyclomatic, nesting) while the JS implementation had Halstead complexity, LOC, and Maintainability Index metrics. This created feature parity gaps.
  2. Benchmark labeling confusion: Dev builds and release builds were using the same version labeling, making it impossible to distinguish between local testing vs. published package benchmarks.

Why This Approach:

  • Extends the native Rust engine to match JS implementation feature-for-feature
  • Adds proper versioning infrastructure to separate dev/release benchmark results
  • Uses shared configuration to ensure consistency across all 4 benchmark scripts

Risk Assessment: Medium - Large implementation with complex metric calculations, but addresses real feature gaps rather than hiding issues.

Phase 2: Configuration Change Detection ✅

Analysis: No risky configuration changes detected:

  • biome.json unchanged - no linting rules relaxed
  • No test timeouts increased or coverage thresholds lowered
  • No TypeScript strict mode flags disabled
  • No build configuration modified inappropriately

Phase 3: Test Integrity ✅

Analysis: Tests appear sound:

  • No test files modified in this PR
  • Existing complexity tests already support extended metrics (LOC, Halstead, MI)
  • No weakened assertions or skipped validations
  • PR claims "All 832 existing tests pass"

Phase 4: Code Quality Review ✅

Strengths:

  • Comprehensive implementation: All 8 language extractors updated consistently to use compute_all_metrics
  • Faithful port: Rust implementation mirrors JS logic precisely with equivalent language rules
  • Proper error handling: Uses Option<ComplexityMetrics> return type, graceful fallbacks
  • Clean architecture: Single entry point compute_all_metrics for unified metric computation
  • Safe math: Division-by-zero guards, proper f64 rounding, saturation arithmetic
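
To illustrate what the "safe math" point can mean for the Maintainability Index, here is a sketch of the classic three-term formula rescaled to 0..100, with the logarithm inputs clamped. The function name is hypothetical, and the project may use a different MI variant (for example, one with a comment-ratio term).

```javascript
// Classic Maintainability Index, rescaled to 0..100.
// Assumption: this is the common three-term variant, not necessarily the
// one implemented in this PR.
function maintainabilityIndex(volume, cyclomatic, sloc) {
  // ln(0) is -Infinity, so clamp both logarithm inputs to at least 1.
  const v = Math.max(volume, 1);
  const loc = Math.max(sloc, 1);
  const raw = 171 - 5.2 * Math.log(v) - 0.23 * cyclomatic - 16.2 * Math.log(loc);
  const scaled = (raw * 100) / 171;
  // Clamp to the 0..100 range and round to one decimal place.
  return Math.round(Math.min(100, Math.max(0, scaled)) * 10) / 10;
}

console.log(maintainabilityIndex(0, 0, 0)); // empty function: no NaN or Infinity
console.log(maintainabilityIndex(1000, 5, 50));
```

Clamping before taking the logarithm plays the same role as the `.max(1)` on the SLOC value in the Rust code quoted in this review.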

Code Quality Issues Found:

  1. Underflow handling, crates/codegraph-core/src/complexity.rs:973:

    let sloc = (loc_total.saturating_sub(blank_lines).saturating_sub(comment_lines)).max(1);

    Good use of saturating_sub to prevent underflow; no change needed.

  2. Test-only unwrap() calls, crates/codegraph-core/src/complexity.rs:1274,1445,1479:

    • Multiple unwrap() calls in test code only - acceptable for tests
    • Production code uses safe error handling

  3. Benchmark infrastructure, scripts/lib/bench-config.js:

    • Robust npm installation retry logic with exponential backoff
    • Proper cleanup of temp directories
    • Clear separation of dev vs release modes
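
For readers unfamiliar with the retry pattern mentioned in item 3, a generic synchronous version might look like this. It is a sketch only: the helper names, attempt count, and delay schedule are assumptions, not what bench-config.js does.

```javascript
// Hypothetical retry helper with exponential backoff (delay doubles each time).
function retryWithBackoff(fn, { attempts = 5, baseDelayMs = 1000, sleep = sleepSync } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
      // No point waiting after the final failed attempt.
      if (attempt < attempts - 1) sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Blocking sleep, usable in a synchronous script built around execFileSync.
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// Usage with an injected sleep so the example runs instantly.
const delays = [];
let calls = 0;
const outcome = retryWithBackoff(
  () => {
    calls++;
    if (calls < 3) throw new Error('package not visible on npm yet');
    return 'installed';
  },
  { baseDelayMs: 100, sleep: (ms) => delays.push(ms) },
);
console.log(outcome, delays); // installed [ 100, 200 ]
```

Injecting `sleep` keeps the schedule testable without real waits; in a real script the default blocking sleep would apply.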

Minor Concerns:

  • Lines 1002-1006: Redundant unwrap_or pattern could be simplified
  • Very long function signatures due to comprehensive metric collection (acceptable given functionality)

Critical Concerns

None identified. This is a legitimate feature enhancement that:

  • Solves real parity gaps between native/JS engines
  • Implements complex algorithms correctly with proper safeguards
  • Does not weaken any existing functionality or tests
  • Follows established patterns in the codebase

Final Recommendation

## Root Cause Analysis
**Problem Being Solved**: Native Rust engine lacked Halstead/LOC/MI metrics vs JS implementation; benchmark versioning couldn't distinguish dev vs release builds
**Why This Approach**: Faithful port of JS complexity algorithms to Rust with shared benchmark configuration for consistency
**Risk Assessment**: Medium complexity but addresses real feature gaps, not hiding issues

## Critical Concerns
None identified. Implementation follows established patterns, uses proper error handling, and extends functionality without weakening existing code.

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)  
- Action: **APPROVE WITH CONDITIONS**
- Reasoning: High-quality implementation solving real feature parity gaps. Comprehensive test coverage already exists. Minor optimization opportunities but no blocking issues. Condition: Verify the 832 existing tests actually pass in CI.

The implementation is technically sound with proper safeguards. The benchmark infrastructure improvements are well-architected. This enhances the native engine capabilities substantially while maintaining code quality standards.


@greptile-apps
Contributor

greptile-apps bot commented Feb 28, 2026

Greptile Summary

Extends native Rust engine with comprehensive complexity metrics (Halstead volume/difficulty/effort/bugs, LOC, MI) matching the JS implementation, and fixes benchmark versioning to correctly distinguish dev vs release benchmarks by installing from npm for releases.

Key changes:

  • Added compute_all_metrics to Rust complexity module for single-pass metric computation
  • All 9 language extractors updated to use new metrics function
  • Created shared bench-config.js helper for CLI-driven version resolution
  • All 4 benchmark scripts and 4 report updaters refactored for dev/release handling
  • Workflow accepts version input, waits for npm propagation, benchmarks actual published packages
  • Re-benchmarked v2.4.0 from npm (123 files, 801 nodes)
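
The single-pass idea behind compute_all_metrics / computeAllMetrics can be sketched on a toy AST: one depth-first walk feeds both the cyclomatic counter and the Halstead operator/operand tallies. The node shape, the node-type sets, and the function name below are assumptions for illustration and do not reflect the real tree-sitter grammar rules.

```javascript
// Toy AST and node-type sets: assumptions for illustration only.
const BRANCH_TYPES = new Set(['if', 'for', 'while']);
const OPERATOR_TYPES = new Set(['if', 'for', 'while', 'binary_op', 'assign', 'return']);
const OPERAND_TYPES = new Set(['identifier', 'number']);

function computeAllMetricsSketch(root) {
  const operators = new Map(); // operator text -> occurrences
  const operands = new Map(); // operand text -> occurrences
  let cyclomatic = 1;
  const bump = (map, key) => map.set(key, (map.get(key) || 0) + 1);

  // Iterative DFS: each node is visited once and updates every counter.
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (BRANCH_TYPES.has(node.type)) cyclomatic++;
    if (OPERATOR_TYPES.has(node.type)) bump(operators, node.text || node.type);
    if (OPERAND_TYPES.has(node.type)) bump(operands, node.text);
    for (const child of node.children || []) stack.push(child);
  }

  const total = (map) => [...map.values()].reduce((a, b) => a + b, 0);
  return {
    cyclomatic,
    n1: operators.size,
    n2: operands.size,
    N1: total(operators),
    N2: total(operands),
  };
}

// Toy tree for: if (x > 1) return x
const ast = {
  type: 'if',
  children: [
    {
      type: 'binary_op',
      text: '>',
      children: [
        { type: 'identifier', text: 'x' },
        { type: 'number', text: '1' },
      ],
    },
    { type: 'return', children: [{ type: 'identifier', text: 'x' }] },
  ],
};
console.log(computeAllMetricsSketch(ast));
```

For this tree the walk finds three distinct operators (if, >, return), two distinct operands (x, 1), and a cyclomatic complexity of 2.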

Confidence Score: 5/5

  • Safe to merge with minimal risk
  • Well-structured refactor with clear separation of concerns. Rust implementation mirrors tested JS logic. Only minor style improvement suggested (removing unnecessary shell flag).
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/lib/bench-config.js | New shared benchmark config helper; handles CLI args and npm installation with retry logic |
| crates/codegraph-core/src/complexity.rs | Adds Halstead, LOC, and MI metrics to Rust native engine; mirrors JS implementation |
| src/complexity.js | Adds computeAllMetrics for single-pass complexity+Halstead+LOC+MI calculation |
| scripts/update-benchmark-report.js | Handles dev/release versioning: dev entries are rolling, releases remove dev |
| .github/workflows/benchmark.yml | Accepts version input, detects npm vs local mode, adds npm propagation wait |
| crates/codegraph-core/src/types.rs | Adds HalsteadMetrics, LocMetrics to ComplexityMetrics type definition |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Benchmark Script] --> B{Parse CLI Args}
    B --> C{--npm flag?}
    C -->|Yes| D[npm mode]
    C -->|No| E[local mode]
    E --> F[Use repo src/]
    E --> G[version = 'dev']
    D --> H[Install from npm]
    H --> I[Retry with backoff]
    I --> J[Extract installed src/]
    I --> K[version = semver]
    F --> L[Run Benchmarks]
    J --> L
    L --> M[Update Report]
    M --> N{version == 'dev'?}
    N -->|Yes| O[Rolling: replace existing dev]
    N -->|No| P[Release: remove dev, add release]
    O --> Q[Generate Markdown]
    P --> Q
    Q --> R[Update README]
    
    S[Language Extractor] --> T[Parse AST Node]
    T --> U[compute_all_metrics]
    U --> V[Single DFS Walk]
    V --> W[Track Complexity]
    V --> X[Track Halstead]
    W --> Y[Compute LOC]
    X --> Y
    Y --> Z[Compute MI]
    Z --> AA[Return Full Metrics]

Last reviewed commit: 1071297

@greptile-apps greptile-apps bot left a comment

24 files reviewed, 1 comment

Comment on lines +69 to +74

    execFileSync('npm', ['install', `@optave/codegraph@${version}`, '--no-audit', '--no-fund'], {
      cwd: tmpDir,
      stdio: 'pipe',
      timeout: 120_000,
      shell: true,
    });

Contributor

shell: true is unnecessary with execFileSync and reduces security. execFileSync doesn't use shell by default when args are separate.

Suggested change

    - execFileSync('npm', ['install', `@optave/codegraph@${version}`, '--no-audit', '--no-fund'], {
    -   cwd: tmpDir,
    -   stdio: 'pipe',
    -   timeout: 120_000,
    -   shell: true,
    - });
    + execFileSync('npm', ['install', `@optave/codegraph@${version}`, '--no-audit', '--no-fund'], {
    +   cwd: tmpDir,
    +   stdio: 'pipe',
    +   timeout: 120_000,
    + });

Contributor Author

shell: true is required here for Windows compatibility — npm is a .cmd file on Windows and execFileSync without shell cannot resolve it (ENOENT). Since the arguments are passed as an array, Node.js still handles escaping properly, so there is no injection risk.
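
One caveat worth sketching: Node's child_process documentation warns against passing unsanitized input when the shell option is enabled, because shell metacharacters in the command line can trigger arbitrary command execution. A defensive option, if the version ever comes from an untrusted workflow input, is to validate it before it reaches npm install. The helper name and the pattern below are hypothetical and not part of this PR.

```javascript
// Hypothetical guard: accept only plain semver-like versions before they are
// interpolated into a command that runs through a shell.
const SEMVER_LIKE = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

function assertSafeVersion(version) {
  if (typeof version !== 'string' || !SEMVER_LIKE.test(version)) {
    throw new Error(`Refusing to install unexpected version: ${JSON.stringify(version)}`);
  }
  return version;
}

console.log(assertSafeVersion('2.4.0'));
try {
  assertSafeVersion('2.4.0 && echo injected');
} catch (err) {
  console.log(err.message);
}
```

A check like this would sit just before the execFileSync call, so `shell: true` can stay for Windows while the only interpolated value is constrained to digits, dots, and a pre-release suffix.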

Add shared bench-config.js with --version and --npm CLI flags so
benchmark scripts can install and measure the actual published npm
package instead of always using local source labeled with package.json
version.

- New scripts/lib/bench-config.js: parseArgs(), resolveBenchmarkSource(),
  srcImport() with retry/backoff for npm propagation
- All 4 benchmark runners use CLI-driven version (default "dev") and
  srcImport() for dynamic imports; call cleanup() on exit
- All 4 report updaters: dev entries are rolling (one at a time),
  releases remove dev entry, findPrevRelease() skips dev for trends
- Workflow: workflow_dispatch accepts version input, detects npm vs local
  mode, adds npm propagation wait, removes artifact download steps
- Re-benchmarked 2.4.0 from actual npm package (123 files, 801 nodes)

Impact: 10 functions changed, 10 affected

@carlos-alm carlos-alm force-pushed the feat/native-halstead-loc-mi branch from 5ddbe87 to 150c3eb Compare February 28, 2026 02:07

@carlos-alm
Contributor Author

Addressing review feedback

Minor concern — redundant unwrap_or (line 1002-1006): Fixed in 150c3eb — simplified Some(halstead.unwrap_or(default)) to halstead.or(Some(default)).

Condition — verify 832 tests pass in CI: Tests pass locally on Windows (Node 22). CI will confirm on Linux once the PR merges. The Halstead/LOC/MI commit already passed CI as PR #159.

shell: true on execFileSync (Greptile): Required for Windows — npm is a .cmd file and execFileSync without shell gets ENOENT. Args passed as array so escaping is safe.

All other findings were positive — no action needed.

@carlos-alm
Contributor Author

@claude

@claude
claude bot commented Feb 28, 2026

Claude finished @carlos-alm's task


PR Status Check

Todo List:

  • Acknowledge the mention and assess current PR status
  • Review recent changes addressing feedback
  • Provide updated assessment
  • Ask for specific guidance if needed

Current Status ✅

Great to see you've tagged me! Here's the current status of PR #161:

Review Feedback Addressed:

  • Redundant unwrap_or pattern fixed in commit 150c3eb - I can confirm the code now uses halstead.or(Some(default)) instead of Some(halstead.unwrap_or(default))
  • All CI checks passing - 6/6 checks successful including full CI suite
  • Tests verified - You confirmed 832 tests pass locally, and CI confirms success

Outstanding Items from Previous Reviews:

  • The shell: true discussion with Greptile is resolved - you correctly explained the Windows compatibility requirement
  • No critical concerns remain from either review

Current Assessment:
This PR is in excellent shape! The implementation is solid, feedback has been addressed, and all automated checks are green. The feature parity between Rust native engine and JS implementation is now complete with proper benchmark versioning.

What do you need?

Since you didn't include specific instructions with your @claude mention, could you let me know:

  • Do you need me to review any specific aspect?
  • Are you looking for final approval/merge recommendation?
  • Is there something specific you'd like me to check or help with?

The PR appears ready for merge from a technical perspective! 🚀

@carlos-alm carlos-alm merged commit 0e41ceb into main Feb 28, 2026
19 checks passed
@carlos-alm carlos-alm deleted the feat/native-halstead-loc-mi branch February 28, 2026 03:04