
feat: Rust core engine via napi-rs (Phase 1) #8

Merged
carlos-alm merged 12 commits into main from feat/rust-core on Feb 22, 2026

@carlos-alm

Summary

Add a native Rust parsing engine via napi-rs as an optional high-performance alternative to the existing WASM pipeline. The WASM path remains the always-available baseline; the native engine activates automatically when platform binaries are installed.

Phase 1 — Rust core (napi-rs dual-engine)

  • All 11 language extractors (JS/TS/TSX, Python, Go, Rust, Java, C#, Ruby, PHP, HCL) ported to Rust with a SymbolExtractor trait
  • Rayon-based parallel file parsing
  • 6-level import resolution with confidence scoring
  • Tarjan's SCC cycle detection
  • Incremental parse tree cache for watch mode
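
The cycle-detection bullet names Tarjan's strongly connected components algorithm. As a rough illustration only (written in JavaScript rather than Rust, and not taken from the crate), a minimal version over an adjacency map could look like the following. The `findCycles` name and the `Map<string, string[]>` graph shape are assumptions made for this example.

```javascript
// Hypothetical sketch: Tarjan's SCC over a Map<string, string[]> adjacency list.
// Returns only the components that form a cycle (size > 1, or a self-loop).
function findCycles(graph) {
  let nextIndex = 0;
  const index = new Map();   // discovery order per node
  const lowlink = new Map(); // smallest index reachable from the node
  const stack = [];
  const onStack = new Set();
  const cycles = [];

  function visit(node) {
    index.set(node, nextIndex);
    lowlink.set(node, nextIndex);
    nextIndex += 1;
    stack.push(node);
    onStack.add(node);

    for (const next of graph.get(node) ?? []) {
      if (!index.has(next)) {
        visit(next);
        lowlink.set(node, Math.min(lowlink.get(node), lowlink.get(next)));
      } else if (onStack.has(next)) {
        lowlink.set(node, Math.min(lowlink.get(node), index.get(next)));
      }
    }

    // node is the root of a component: pop the stack down to it.
    if (lowlink.get(node) === index.get(node)) {
      const component = [];
      let member;
      do {
        member = stack.pop();
        onStack.delete(member);
        component.push(member);
      } while (member !== node);
      const selfLoop = (graph.get(node) ?? []).includes(node);
      if (component.length > 1 || selfLoop) cycles.push(component);
    }
  }

  for (const node of graph.keys()) {
    if (!index.has(node)) visit(node);
  }
  return cycles;
}

// Three files importing each other in a ring, plus one file outside the ring.
const graph = new Map([
  ['a.js', ['b.js']],
  ['b.js', ['c.js']],
  ['c.js', ['a.js']],
  ['d.js', ['a.js']],
]);
const cycles = findCycles(graph);
console.log(cycles.length); // prints "1"
```

The recursion keeps the sketch short; a very deep import chain would likely call for an iterative formulation instead.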

Phase 1.2 — Unified API

  • parseFileAuto, parseFilesAuto, getActiveEngine as the single entry point in parser.js
  • Builder and watcher use the unified API — no direct native imports
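
To make the single-entry-point idea concrete, here is a small sketch of how an engine option might be resolved and a parse call dispatched. It is not the code in parser.js: `resolveEngine`, `parseWithEngine`, and the `backends` object are hypothetical, and the silent fallback to WASM when native is requested but missing is an assumption of this sketch.

```javascript
// Hypothetical sketch of engine selection; names and fallback policy are assumptions.
const ENGINES = ['native', 'wasm', 'auto'];

function resolveEngine(requested = 'auto', nativeAvailable = false) {
  if (!ENGINES.includes(requested)) {
    throw new Error(`Unknown engine "${requested}"; expected one of ${ENGINES.join(', ')}`);
  }
  if (requested === 'wasm') return 'wasm';
  // 'native' and 'auto' both prefer native; this sketch assumes a silent fallback to WASM.
  return nativeAvailable ? 'native' : 'wasm';
}

// Dispatch a parse call to whichever backend was chosen.
function parseWithEngine(filePath, source, opts, backends) {
  const engine = resolveEngine(opts.engine, Boolean(backends.native));
  const backend = engine === 'native' ? backends.native : backends.wasm;
  return { engine, symbols: backend.parseFile(filePath, source) };
}

// Stand-in backends so the example runs without any real parser.
const fakeBackends = {
  native: null, // pretend no platform binary is installed
  wasm: { parseFile: (filePath) => [{ name: 'main', file: filePath }] },
};
const result = parseWithEngine('src/app.js', 'function main() {}', { engine: 'auto' }, fakeBackends);
console.log(result.engine); // prints "wasm"
```

Keeping the decision in one function means callers such as a builder or watcher only pass an option through and never need to know which backend ran.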

Phase 1.3 — Incremental parsing

  • Native ParseTreeCache exposed via napi-rs for watch mode old-tree hints
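
The old-tree hint can be pictured as a per-file cache that hands the previous tree back to the parser on the next change. The sketch below is illustrative only: `TreeCache`, `parseIncremental`, and the injected `parse(source, oldTree)` callback are hypothetical names, and the actual ParseTreeCache is implemented in Rust.

```javascript
// Hypothetical sketch: keep the last tree per file and hand it back as a hint.
class TreeCache {
  constructor() {
    this.trees = new Map(); // file path -> last parse tree
  }

  // parse is any function (source, oldTree) => tree; oldTree is undefined on a cold parse.
  parseIncremental(filePath, source, parse) {
    const oldTree = this.trees.get(filePath);
    const tree = parse(source, oldTree);
    this.trees.set(filePath, tree);
    return tree;
  }

  // Drop the entry when a file is deleted so stale trees do not accumulate.
  remove(filePath) {
    return this.trees.delete(filePath);
  }
}

// Stand-in parser that only records whether it received a hint.
const fakeParse = (source, oldTree) => ({
  length: source.length,
  reusedHint: oldTree !== undefined,
});

const cache = new TreeCache();
const first = cache.parseIncremental('a.js', 'let x = 1;', fakeParse);
const second = cache.parseIncremental('a.js', 'let x = 2;', fakeParse);
console.log(first.reusedHint, second.reusedHint); // prints "false true"
```

With tree-sitter itself, the old tree normally has to be told about the edit before it is reused; that step is left out here to keep the example small.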

Phase 1.4 — Native import resolution

  • resolveImportPath and computeConfidence extracted to src/resolve.js with native dispatch
  • Batch pre-resolution of all imports in a single native call

Phase 1.5 — Graceful degradation & diagnostics

  • --engine flag plumbed to watch command
  • codegraph info diagnostic command
  • CI workflows for cross-platform test matrix and native build pipeline

Changes

  • 40 files changed, ~5,500 lines added
  • New Rust crate at crates/codegraph-core/
  • New JS modules: src/native.js, src/resolve.js
  • Refactored: src/builder.js, src/parser.js, src/cli.js, src/watcher.js, src/cycles.js
  • CI: build-native.yml (4-platform matrix), ci.yml
  • Tests: engine parity, unified API, import resolution parity, build parity, incremental cache

Test plan

  • npm test — full WASM test suite passes (no native required)
  • Cross-engine parity tests pass when native binary is available
  • Build parity test confirms identical graph output from both engines
  • codegraph info reports correct engine status
  • --engine wasm forces WASM even when native is available
  • Watch mode falls back gracefully when native unavailable
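
A parity check of the kind listed above can be approximated by comparing both engines' output independent of ordering. The following is a sketch under assumptions, not the contents of the repository's test files: `canonicalize`, `diffSymbols`, and the `{ kind, name, line }` symbol shape are invented for illustration.

```javascript
// Hypothetical sketch: compare two engines' symbol lists independent of ordering.
function canonicalize(symbols) {
  return symbols.map((s) => `${s.kind}:${s.name}:${s.line}`).sort();
}

function diffSymbols(wasmSymbols, nativeSymbols) {
  const wasm = new Set(canonicalize(wasmSymbols));
  const native = new Set(canonicalize(nativeSymbols));
  return {
    onlyInWasm: [...wasm].filter((s) => !native.has(s)),
    onlyInNative: [...native].filter((s) => !wasm.has(s)),
  };
}

// Same symbols, different order: the engines agree.
const wasmOut = [
  { kind: 'function', name: 'main', line: 1 },
  { kind: 'class', name: 'Graph', line: 10 },
];
const nativeOut = [
  { kind: 'class', name: 'Graph', line: 10 },
  { kind: 'function', name: 'main', line: 1 },
];
const diff = diffSymbols(wasmOut, nativeOut);
console.log(diff.onlyInWasm.length + diff.onlyInNative.length); // prints "0"
```

Reporting the two one-sided differences, rather than a single boolean, tends to make a failing parity run easier to diagnose.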

🤖 Generated with Claude Code

Move CPU-intensive parsing, import resolution, and cycle detection to
Rust via napi-rs while keeping JS for CLI, SQLite, MCP, and embeddings.
The WASM path remains as a fallback for unsupported platforms.

Rust crate (crates/codegraph-core):
- All 11 language extractors (JS/TS/TSX, Python, Go, Rust, Java, C#,
  Ruby, PHP, HCL) ported with SymbolExtractor trait
- Rayon-based parallel file parsing
- 6-level import resolution with confidence scoring
- Tarjan's SCC cycle detection
- Incremental parse tree cache for watch mode
- napi-rs API: parseFile, parseFiles, resolveImport, resolveImports,
  computeConfidence, detectCycles, engineName, engineVersion

JS integration:
- src/native.js: platform-aware addon loader with graceful fallback
- src/builder.js: dual-engine path (native fast path + WASM fallback)
- src/cycles.js: dispatches to native detectCycles when available
- src/watcher.js: uses native parseFile in watch mode
- src/cli.js: --engine <native|wasm|auto> global option
- src/index.js: exports isNativeAvailable
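
The "platform-aware addon loader with graceful fallback" can be sketched roughly as below. This is an assumption-laden illustration rather than the contents of src/native.js: `loadNative` and `nativeLoaded` are hypothetical, the `platform-arch` package suffix is a guess at the naming scheme, and the `require` function is injected so the example runs without any installed binary.

```javascript
// Hypothetical sketch of a platform-aware loader; the package naming scheme is an assumption.
function loadNative(requireFn, platform = process.platform, arch = process.arch) {
  const packageName = `@optave/codegraph-${platform}-${arch}`;
  try {
    return { addon: requireFn(packageName), packageName, error: null };
  } catch (err) {
    // Missing or incompatible binary: record why, and let the caller use WASM.
    return { addon: null, packageName, error: err.message };
  }
}

function nativeLoaded(loaded) {
  return loaded.addon !== null;
}

// Simulate a machine where no platform package is installed.
const missing = loadNative((name) => {
  throw new Error(`Cannot find module '${name}'`);
}, 'linux', 'x64');
console.log(nativeLoaded(missing)); // prints "false"
```

Returning the error message instead of swallowing it gives a diagnostic command something concrete to report.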

CI & packaging:
- GitHub Actions workflow for 4-platform matrix build
- Platform optionalDependencies in package.json

Tests:
- Rust unit tests for JS, Python, Go extractors and cycle detection
- Cross-engine parity tests for all 11 languages (skip when native N/A)
- Full build parity test comparing SQLite output
- Extended JS cycle tests for findCyclesJS
…se 1.2)

Consolidate dual-engine dispatch into parser.js so builder.js and
watcher.js never import native.js directly. The new parseFileAuto,
parseFilesAuto, and getActiveEngine exports handle engine detection,
delegation, and snake_case→camelCase normalization in one place.
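
As an illustration of the snake_case→camelCase step, a recursive key rename could look like this. The helper names and the field names in the sample object are invented for the example and are not taken from parser.js.

```javascript
// Hypothetical sketch: recursively rename snake_case keys to camelCase.
function snakeToCamel(key) {
  return key.replace(/_([a-z0-9])/g, (_, ch) => ch.toUpperCase());
}

function normalizeKeys(value) {
  if (Array.isArray(value)) return value.map(normalizeKeys);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [snakeToCamel(key), normalizeKeys(inner)]),
    );
  }
  return value; // primitives pass through untouched
}

// Field names below are invented for the example.
const nativeResult = {
  file_path: 'src/app.js',
  definitions: [{ name: 'main', start_line: 1, end_line: 3 }],
};
const normalized = normalizeKeys(nativeResult);
console.log(normalized.definitions[0].startLine); // prints "1"
```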

- parser.js: add unified API (parseFileAuto, parseFilesAuto, getActiveEngine)
- builder.js: replace ~150 lines of dual-engine code with single parseFilesAuto call
- watcher.js: replace dual-path updateFile with parseFileAuto call
- index.js: export new unified API functions
- tests/parsers/unified.test.js: 12 new tests for the unified API

Expose the Rust ParseTreeCache to JS via napi-rs so watch mode can
reuse cached parse trees (old-tree hint) instead of full re-parses.
When native is unavailable, watch mode falls back to plain
parseFileAuto on the WASM path.
… 1.4)

Extract resolveImportPath, computeConfidence into src/resolve.js with
native dispatch and JS fallback. Batch pre-resolve all imports in a
single native call before edge building. Add parity tests for import
resolution, confidence scoring, and cycle detection.
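
The batch pre-resolution described here amounts to crossing the native boundary once and then looking results up from a table. The sketch below shows that pattern with a JS fallback. `preResolveImports` and `resolveImportJS` are hypothetical, the toy fallback does not implement the real 6-level resolution, and the argument and return shape assumed for `resolveImports` is a guess.

```javascript
// Hypothetical sketch: resolve every import once, up front, then look results up by key.
function resolveImportJS(fromFile, specifier) {
  // Toy fallback: only handles "./" specifiers relative to the importing file's directory.
  if (!specifier.startsWith('./')) return specifier;
  const dir = fromFile.includes('/') ? fromFile.slice(0, fromFile.lastIndexOf('/')) : '.';
  return `${dir}/${specifier.slice(2)}`;
}

function preResolveImports(requests, native) {
  const keyOf = (r) => `${r.fromFile}|${r.specifier}`;
  const resolved = new Map();

  if (native && typeof native.resolveImports === 'function') {
    // One call across the boundary instead of one per import (assumed signature).
    const results = native.resolveImports(requests);
    requests.forEach((r, i) => resolved.set(keyOf(r), results[i]));
  } else {
    for (const r of requests) {
      resolved.set(keyOf(r), resolveImportJS(r.fromFile, r.specifier));
    }
  }
  return resolved;
}

const requests = [
  { fromFile: 'src/app.js', specifier: './util.js' },
  { fromFile: 'src/app.js', specifier: 'lodash' },
];
const table = preResolveImports(requests, null);
console.log(table.get('src/app.js|./util.js')); // prints "src/util.js"
```

Edge building can then read from the table without caring which path filled it.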
…ation

Proposal for (when, do, what) triplets as a complement to session-graph's
(subject, predicate, object) knowledge graph, enabling scoped behavioral
rules attached to codegraph's function-level dependency nodes.

- Plumb --engine flag to watch command (was hardcoded to 'auto')
- Add `codegraph info` diagnostic command for engine/platform reporting
- Stage CI workflows: cross-platform test matrix and native build pipeline

The 0.4 version constraint had no matching crate on crates.io;
the published version is 1.1.0.
@greptile-apps

greptile-apps bot commented Feb 22, 2026

Greptile Summary

This PR introduces a Rust-based native parsing engine as an optional high-performance alternative to the existing WASM pipeline, maintaining backward compatibility through a dual-engine architecture.

Key architecture decisions:

  • Native engine activates automatically when platform binaries are installed via optional dependencies
  • WASM remains the always-available baseline requiring no native compilation
  • All 11 language extractors ported to Rust with a unified SymbolExtractor trait
  • Rayon-based parallel file parsing for performance improvements
  • Unified API (parseFileAuto, parseFilesAuto) abstracts engine selection
  • Import resolution and cycle detection also ported to Rust with JS fallbacks
  • Incremental parsing via ParseTreeCache for watch mode with old-tree hints

Implementation quality:

  • Clean separation of concerns with src/native.js handling graceful loading
  • Comprehensive test coverage including cross-engine parity tests for all languages
  • Build parity test ensures identical SQLite graph output from both engines
  • CI matrix tests on 3 platforms × 2 Node versions without requiring native builds
  • Platform-specific npm packages published separately via build-native.yml workflow

Deployment strategy:

  • 4 platform-specific packages as optionalDependencies (@optave/codegraph-{platform})
  • Native binaries distributed separately, main package works without them
  • --engine flag allows explicit selection (native/wasm/auto)
  • New codegraph info command for diagnostics

Confidence Score: 5/5

  • This PR is safe to merge with excellent architecture and comprehensive testing
  • The implementation demonstrates exceptional quality: (1) clean dual-engine architecture with graceful degradation, (2) comprehensive parity tests ensuring WASM and native produce identical output, (3) proper use of optional dependencies for platform binaries, (4) extensive cross-platform CI coverage, (5) well-structured Rust code mirroring the JS extractors, (6) proper error handling and fallbacks throughout. The incremental approach across phases shows careful planning, and the unified API prevents breaking changes.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/codegraph-core/src/lib.rs | Clean Rust napi-rs entry point exposing parsing, import resolution, and cycle detection functions |
| src/native.js | Graceful native addon loader with platform-specific package resolution and WASM fallback |
| src/parser.js | Unified parsing API with engine auto-detection, normalizes native/WASM output, adds parseTreeCache support |
| src/resolve.js | Import resolution layer with native dispatch and JS fallback, includes batch optimization |
| src/builder.js | Refactored to use unified parseFilesAuto API instead of direct WASM calls, cleaner separation of concerns |
| crates/codegraph-core/src/extractors/javascript.rs | Rust implementation of JS/TS/TSX extractor mirroring the WASM extraction logic |
| crates/codegraph-core/src/import_resolution.rs | Rust port of import resolution with alias support and extension probing |
| crates/codegraph-core/src/incremental.rs | ParseTreeCache for watch mode with SendWrapper to satisfy napi Send bounds |
| .github/workflows/build-native.yml | 4-platform build matrix generating platform-specific npm packages for native binaries |
| .github/workflows/ci.yml | Cross-platform test matrix and Rust compilation check, runs on all platforms without requiring native build |
| tests/engines/parity.test.js | Cross-engine parity tests verifying identical output from native and WASM for all 11 languages |
| tests/integration/build-parity.test.js | End-to-end build parity test comparing full SQLite graphs from both engines |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User: npm install] --> B{Native binary<br/>available?}
    B -->|Yes| C[Load platform package<br/>via native.js]
    B -->|No| D[Use WASM fallback]
    
    C --> E[parseFilesAuto]
    D --> E
    
    E --> F{engine option}
    F -->|'native'| G[Native Rust Parser<br/>Rayon parallel]
    F -->|'wasm'| H[WASM tree-sitter]
    F -->|'auto'| I{Native available?}
    
    I -->|Yes| G
    I -->|No| H
    
    G --> J[Rust extractors<br/>11 languages]
    H --> K[JS extractors<br/>11 languages]
    
    J --> L[normalizeNativeSymbols]
    K --> L
    
    L --> M[builder.js]
    M --> N[Import resolution<br/>resolve.js + native]
    N --> O[Cycle detection<br/>cycles.js + native]
    O --> P[SQLite graph.db]
    
    Q[Watch mode] --> R[ParseTreeCache<br/>incremental.rs]
    R --> G

Last reviewed commit: 151fceb

@greptile-apps greptile-apps bot left a comment

40 files reviewed, no comments

carlos-alm and others added 5 commits February 21, 2026 19:11
tree-sitter-php 0.23 exports LANGUAGE_PHP directly without
a feature gate. The "php" feature doesn't exist in this crate.

- Pin @napi-rs/cli to v3 to match napi crate dependencies
- Remove invalid --platform flag (v2 only) from napi build command
- Bump napi-build from v2 to v3 to align with napi/napi-derive v3
- Replace find with shell glob for Windows runner compatibility

The review action completed successfully but couldn't post comments
because it only had read access to pull requests.
…sion

The claude-code-review action requires the workflow file to be identical
on the PR branch and default branch. Also napi-build v3 does not exist
on crates.io — latest is 2.x.
@carlos-alm carlos-alm merged commit 9944f37 into main Feb 22, 2026
9 checks passed
@carlos-alm carlos-alm deleted the feat/rust-core branch February 22, 2026 02:58
carlos-alm added a commit that referenced this pull request Feb 22, 2026
feat: Rust core engine via napi-rs (Phase 1)

Add a native Rust parsing engine as an optional high-performance
  alternative to the existing WASM pipeline. The WASM path remains the
  always-available baseline; the native engine activates automatically
  when platform binaries are installed.

  Rust crate (crates/codegraph-core):
  - All 11 language extractors ported with SymbolExtractor trait
  - Rayon-based parallel file parsing
  - 6-level import resolution with confidence scoring
  - Tarjan's SCC cycle detection
  - Incremental parse tree cache for watch mode
  - napi-rs bindings: parseFile, parseFiles, resolveImport,
    resolveImports, computeConfidence, detectCycles

  JS integration:
  - Unified API via parseFileAuto/parseFilesAuto/getActiveEngine
  - Builder and watcher use unified API — no direct native imports
  - Import resolution extracted to src/resolve.js with native dispatch
  - --engine <native|wasm|auto> CLI flag and `codegraph info` command
  - Graceful fallback when native binary is unavailable

  CI & packaging:
  - 4-platform matrix build workflow (build-native.yml)
  - Cross-platform test matrix (ci.yml)

  40 files changed, ~5200 additions
@greptile-apps greptile-apps bot mentioned this pull request Mar 3, 2026
3 tasks