
fix: benchmark uses stale native addon from npm #156

Merged
carlos-alm merged 2 commits into main from fix/benchmark-stale-addon on Feb 27, 2026

Conversation

@carlos-alm
Contributor

Summary

  • Downloads the just-built native artifact from the triggering Publish workflow and overlays it on top of what npm ci installed
  • Ensures all 4 benchmark jobs (build, embedding, query, incremental) test the freshly compiled addon rather than the last stable npm release (v2.4.0)
  • Adds actions: read permission to each job for cross-workflow artifact access
  • Both new steps guarded by if: github.event_name == 'workflow_run' — manual workflow_dispatch runs skip them gracefully

Test plan

  • Push to main → Publish triggers → Benchmark triggers → verify "Install native addon from Publish" step runs
  • Native benchmark numbers should reflect the just-compiled addon
  • Manual workflow_dispatch still works (skips artifact download, falls back to npm)
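The overlay described above boils down to copying the freshly built `.node` binary over the one `npm ci` installed. A minimal shell sketch of that step, with a temp directory standing in for the artifact that `actions/download-artifact` would place under `${{ runner.temp }}` in CI (the simulated file contents are illustrative only):

```shell
# Simulate the downloaded native-linux-x64 artifact for local illustration;
# in CI this directory is populated by actions/download-artifact.
ARTIFACT_DIR="$(mktemp -d)"
printf 'fake-binary' > "$ARTIFACT_DIR/codegraph-core.node"

# Overlay the freshly built addon over what `npm ci` installed.
TARGET_DIR="node_modules/@optave/codegraph-linux-x64-gnu"
mkdir -p "$TARGET_DIR"
cp "$ARTIFACT_DIR/codegraph-core.node" "$TARGET_DIR/"
echo "Installed fresh native addon into $TARGET_DIR"
```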

@claude

claude bot commented Feb 27, 2026

Claude finished @carlos-alm's task


Reviewing PR: fix: benchmark uses stale native addon from npm

Todo List:

  • Read and analyze the workflow file changes
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity review
  • Phase 4: Code quality assessment
  • Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: Benchmark workflows were testing the last published npm version (v2.4.0) instead of the freshly built native addon from the current Publish workflow run. This creates a gap between code changes and performance measurements.

Why This Approach: The solution downloads the native-linux-x64 artifact from the triggering Publish workflow and overlays it on the npm-installed package. This ensures benchmarks reflect the actual performance of the code being published.

Risk Assessment: LOW RISK - The change is additive with proper fallbacks and security controls.

Code Quality Assessment

Strengths

  1. Proper conditional guards: All new steps use if: github.event_name == 'workflow_run', ensuring manual runs fall back gracefully to npm packages
  2. Consistent implementation: The same artifact download/install pattern is applied across all 4 benchmark jobs (build, embedding, query, incremental)
  3. Security: Appropriate actions: read permission added for cross-workflow artifact access
  4. Error handling: Creates package.json if missing to ensure require() resolution works
  5. Clear logging: Includes workflow run ID in installation messages for debugging

⚠️ Minor Concerns

  1. Hardcoded artifact name: Uses native-linux-x64 specifically - works for Ubuntu runners but could be more flexible (lines 41, 133, 237, 329)
  2. Missing error handling: No explicit handling if artifact download fails or .node file is missing
  3. Shell globbing risk: Uses *.node wildcard in cp commands (lines 51, 143, 247, 339) - could theoretically copy unintended files

💡 Suggestions

  • Consider adding error checking: [ -f "${{ runner.temp }}/native-artifact"/*.node ] before cp
  • The artifact name could use a variable if other platforms are supported later
  • Consider logging the native addon version/hash for traceability
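A note on the error-checking suggestion above: a glob inside `[ -f ... ]` behaves inconsistently (it errors when several files match and tests a literal pattern when none do), so an explicit filename is more robust. A sketch, with a temp directory simulating the downloaded artifact:

```shell
# Check for the addon by its explicit name rather than a *.node glob,
# so the test cannot misfire when zero or several files match.
ARTIFACT_DIR="$(mktemp -d)"   # stand-in for the artifact download directory
printf 'fake-binary' > "$ARTIFACT_DIR/codegraph-core.node"

ADDON="$ARTIFACT_DIR/codegraph-core.node"
if [ ! -f "$ADDON" ]; then
  # ::error is a GitHub Actions workflow command that annotates the log.
  echo "::error::codegraph-core.node not found in $ARTIFACT_DIR" >&2
  ls -la "$ARTIFACT_DIR"
  exit 1
fi
echo "native addon present: $ADDON"
```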

Phase Analysis

Phase 1 ✅: Solves a real problem - ensures benchmarks test current code, not stale npm packages

Phase 2 ✅: Configuration changes are additive and properly scoped. No rules relaxed or thresholds lowered

Phase 3 ✅: No test integrity issues - this improves test accuracy rather than weakening it

Phase 4 ✅: Code quality is good with proper conditionals, permissions, and fallback behavior

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a well-implemented fix for a real issue. The approach is sound, properly guarded, and maintains backward compatibility. Minor suggestions above could be addressed in follow-up if needed, but they don't block approval.

The PR effectively solves the core problem of benchmark staleness while maintaining safety through conditional execution and proper permissions. The implementation is consistent across all benchmark jobs and includes appropriate fallbacks.


@greptile-apps
Contributor

greptile-apps bot commented Feb 27, 2026

Greptile Summary

This PR ensures benchmark workflows test freshly compiled native addons rather than the last stable npm release (v2.4.0).

Key changes:

  • Downloads the just-built native-linux-x64 artifact from the triggering Publish workflow
  • Overlays the fresh .node file into node_modules/@optave/codegraph-linux-x64-gnu/
  • Adds actions: read permission to all 4 benchmark jobs for cross-workflow artifact access
  • Guards both download and install steps with if: github.event_name == 'workflow_run' so manual workflow_dispatch runs skip gracefully

Integration with existing code:

  • The overlay approach works correctly with the native loader in src/native.js, which uses require('@optave/codegraph-linux-x64-gnu') to load platform-specific packages
  • Matches the artifact naming convention (native-linux-x64) established in the Publish workflow
  • Creates the minimal package.json structure needed for Node.js module resolution
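The "minimal package.json" point can be sketched as a shell step: Node's resolver only needs a `name` and `main` field to satisfy `require('@optave/codegraph-linux-x64-gnu')`. The exact field values below (notably the placeholder `version`) are illustrative, not what the workflow necessarily writes:

```shell
# Create the minimal package.json Node's resolver needs, but only if
# npm ci didn't already install the platform package.
PKG_DIR="node_modules/@optave/codegraph-linux-x64-gnu"
mkdir -p "$PKG_DIR"
if [ ! -f "$PKG_DIR/package.json" ]; then
  cat > "$PKG_DIR/package.json" <<'EOF'
{
  "name": "@optave/codegraph-linux-x64-gnu",
  "version": "0.0.0-ci",
  "main": "codegraph-core.node"
}
EOF
fi
cat "$PKG_DIR/package.json"
```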

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The implementation correctly addresses the stated problem using GitHub Actions' cross-workflow artifact access pattern. The changes are isolated to CI configuration, properly guarded with conditionals for graceful fallback, and consistent across all four benchmark jobs. The directory structure and package.json format match what the native loader expects.
  • No files require special attention

Important Files Changed

.github/workflows/benchmark.yml: Adds native addon overlay from Publish workflow artifacts, ensuring benchmarks test the freshly compiled code rather than the stale npm version

Last reviewed commit: d1c763c


@greptile-apps greptile-apps bot left a comment


1 file reviewed, no comments


Download the just-built native artifact from the triggering Publish
workflow and overlay it on top of what npm ci installed. This ensures
benchmarks test the freshly compiled addon rather than the last stable
npm release.

Both new steps are guarded by `if: github.event_name == 'workflow_run'`
so manual workflow_dispatch runs skip them and fall back to npm.
@carlos-alm carlos-alm force-pushed the fix/benchmark-stale-addon branch from d1c763c to 3fd2bb0 on February 27, 2026 06:16

- Validate codegraph-core.node exists before copying (fail with
  ::error annotation and directory listing if missing)
- Use explicit file path instead of *.node glob to avoid copying
  unintended files
- Log sha256 hash of installed addon for traceability
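The three hardening points in this commit message can be sketched as one shell step (the temp directory stands in for the artifact downloaded to `${{ runner.temp }}`; file contents are simulated for illustration):

```shell
ARTIFACT_DIR="$(mktemp -d)"   # stand-in for the downloaded artifact
printf 'fake-binary' > "$ARTIFACT_DIR/codegraph-core.node"

ADDON="$ARTIFACT_DIR/codegraph-core.node"
TARGET_DIR="node_modules/@optave/codegraph-linux-x64-gnu"

# 1. Validate the file exists; fail with an ::error annotation and a
#    directory listing if it doesn't.
if [ ! -f "$ADDON" ]; then
  echo "::error::codegraph-core.node missing from artifact" >&2
  ls -la "$ARTIFACT_DIR"
  exit 1
fi

# 2. Copy the explicit file path, never a *.node glob.
mkdir -p "$TARGET_DIR"
cp "$ADDON" "$TARGET_DIR/codegraph-core.node"

# 3. Log the sha256 of what was actually installed, for traceability.
sha256sum "$TARGET_DIR/codegraph-core.node"
```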
@claude

claude bot commented Feb 27, 2026

Claude finished @carlos-alm's task


Reviewing PR: fix: benchmark uses stale native addon from npm

Todo List:

  • Read and analyze the workflow file changes
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity review
  • Phase 4: Code quality assessment
  • Provide final rating and recommendation

Root Cause Analysis

Problem Being Solved: Benchmark workflows were testing the last published npm version (v2.4.0) instead of the freshly built native addon from the current Publish workflow run. This creates a gap between code changes and performance measurements.

Why This Approach: The solution downloads the native-linux-x64 artifact from the triggering Publish workflow and overlays it on the npm-installed package. This ensures benchmarks reflect the actual performance of the code being published.

Risk Assessment: LOW RISK - The change is additive with proper fallbacks and security controls.

Code Quality Assessment

Strengths

  1. Proper conditional guards: All new steps use if: github.event_name == 'workflow_run', ensuring manual runs fall back gracefully to npm packages
  2. Consistent implementation: The same artifact download/install pattern is applied across all 4 benchmark jobs (build, embedding, query, incremental)
  3. Security: Appropriate actions: read permission added for cross-workflow artifact access
  4. Error handling: Proper file existence checks and error logging (lines 51-54, 150-153, 261-264, 360-363)
  5. Clear logging: Includes workflow run ID and SHA256 hash for debugging and verification
  6. Fallback behavior: Manual workflow_dispatch runs skip artifact download gracefully

⚠️ Minor Concerns

  1. Hardcoded artifact name: Uses native-linux-x64 specifically - works for Ubuntu runners but could be more flexible
  2. Directory creation: Uses mkdir -p, which is safe but could mask permission issues

💡 Suggestions

  • The error handling is actually quite good with explicit file checks and detailed logging
  • The SHA256 logging adds good traceability for debugging performance differences

Phase Analysis

Phase 1 ✅: Solves a real problem - ensures benchmarks test current code, not stale npm packages

Phase 2 ✅: Configuration changes are additive and properly scoped. No rules relaxed or thresholds lowered

Phase 3 ✅: No test integrity issues - this improves test accuracy rather than weakening it

Phase 4 ✅: Code quality is good with proper conditionals, permissions, error handling, and fallback behavior

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a well-implemented fix for a real issue. The approach is sound, properly guarded, and maintains backward compatibility. The implementation includes good error handling, logging, and consistent patterns across all benchmark jobs.

The PR effectively solves the core problem of benchmark staleness while maintaining safety through conditional execution and proper permissions. The artifact overlay approach is clean and doesn't interfere with the existing npm package structure.


@carlos-alm carlos-alm merged commit 83f2d4e into main Feb 27, 2026
14 checks passed
@carlos-alm carlos-alm deleted the fix/benchmark-stale-addon branch February 27, 2026 06:39