
Migrate runtime from AST post-order execution to bytecode IR + VM with execution-only benchmarks and VM stack optimization#64

Merged
ashyanSpada merged 8 commits into master from copilot/transform-execution-engine
Apr 10, 2026
Conversation

Contributor

Copilot AI commented Apr 9, 2026

This PR transitions execution from recursive AST post-order traversal to a linear bytecode pipeline while preserving public API behavior and expression semantics. Expressions are still parsed into AST, then compiled into bytecode, and executed by a stack-based VM.

  • Runtime architecture

    • Introduces an explicit 3-stage pipeline: parse (AST) → compile (bytecode IR) → execute (VM).
    • Keeps parse_expression and AST types intact for compatibility.
  • Bytecode IR and compiler

    • Adds src/bytecode.rs with:
      • Program { instructions, constants }
      • Instruction opcodes for literals, references, function calls, unary/binary/postfix ops, list/map construction, statement chaining, and control-flow jumps.
    • Implements AST→bytecode compilation with parity-preserving evaluation order.
    • Encodes setter ops (=, +=, etc.) with explicit reference target metadata.
  • VM executor

    • Adds a linear instruction-pointer interpreter with operand stack semantics.
    • Preserves dynamic runtime resolution via existing managers:
      • InfixOpManager, PrefixOpManager, PostfixOpManager
      • Context function lookup with fallback to InnerFunctionManager
    • Preserves current ternary semantics and error mapping behavior (including type and operator/function errors).
    • Optimizes value-stack handling with:
      • preallocated stack capacity based on instruction count,
      • explicit stack pointer tracking,
      • capacity growth before push when full.
  • API integration

    • Updates execute to use parse_expression → compile_expression → execute_program.
    • Leaves AST .exec() path in place for compatibility and parity reference.
    • Exposes bytecode module publicly to support direct execution benchmarking.
  • Performance benchmarking additions

    • Extends benches/execute_expression.rs with execution-only comparisons between:
      • ast.exec(&mut ctx)
      • bytecode::execute_program(&program, &mut ctx)
    • Covers multiple scenarios: short expression, long expression, function call, list/map mix, and ternary chain.
    • Adds benchmarks_execution_only.md with captured benchmark results and summary.
  • Testing/coverage

    • Adds VM-focused unit tests covering stack growth/reserve behavior, jump target validation, non-bool jump conditions, and pop underflow to raise execution-level coverage.
  • Docs update

    • Updates crate-level docs and README wording to reflect parse/compile/execute architecture.
```rust
pub fn execute(expr: &str, mut ctx: context::Context) -> Result<Value> {
    let ast = parse_expression(expr)?;
    let program = bytecode::compile_expression(&ast)?;
    bytecode::execute_program(&program, &mut ctx)
}
```

@ashyanSpada
Owner

@copilot give me a comparative overview of the execution performance of these two execution algorithms

@codecov

codecov bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 91.39785% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.96%. Comparing base (fddc8ae) to head (9879a9f).
⚠️ Report is 1 commit behind head on master.

Files with missing lines | Patch % | Lines
src/bytecode.rs | 91.25% | 16 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #64      +/-   ##
==========================================
+ Coverage   90.66%   90.96%   +0.29%     
==========================================
  Files          11       12       +1     
  Lines        1211     1394     +183     
==========================================
+ Hits         1098     1268     +170     
- Misses        113      126      +13     

☔ View full report in Codecov by Sentry.

Contributor Author

Copilot AI commented Apr 9, 2026

@copilot give me a comparative overview of the execution performance of these two execution algorithms

I benchmarked both versions with the existing Criterion target (cargo bench --bench execute_expression).

  • AST execution (baseline fddc8ae)
    • execute_expression: 3.4835–3.5751 µs
    • parse_expression: 2.1322–2.1451 µs
  • Bytecode VM execution (current ceb93eb)
    • execute_expression: 3.6680–3.7981 µs
    • parse_expression: 2.0876–2.0910 µs

So on this benchmark, execution is currently ~5–7% slower with VM, while parse-only is slightly faster (small difference, likely noise-level). The current execute() path still pays parse+compile+run each call; the VM advantage should become clearer when we cache/reuse compiled bytecode across repeated evaluations.

Copilot AI requested a review from ashyanSpada April 9, 2026 07:44
@ashyanSpada
Owner

@copilot compare the execution performance from multiple dimensions. I'm curious why the bytecode-based algorithm is slower

@ashyanSpada ashyanSpada marked this pull request as ready for review April 9, 2026 09:14
Copilot AI review requested due to automatic review settings April 9, 2026 09:14

Copilot AI left a comment


Pull request overview

This PR migrates the expression runtime from recursive AST execution to a bytecode compilation step executed by a stack-based VM, while keeping the existing parse-to-AST API and expression semantics intact.

Changes:

  • Updated execute() to parse → compile (bytecode) → execute (VM).
  • Added a new bytecode IR, compiler (AST→IR), and VM interpreter with parity tests against AST execution.
  • Updated crate/docs wording to reflect the parse/compile/execute pipeline.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/lib.rs Routes execute() through AST parsing, bytecode compilation, and VM execution; updates crate-level docs.
src/bytecode.rs Introduces Program/Instruction, compiler, VM executor, and AST-vs-VM parity tests.
README.md Updates introduction to describe the new AST→bytecode→VM architecture.


Comment on lines +319 to +337
```rust
mod tests {
    use super::{compile_expression, execute_program, Instruction};
    use crate::{create_context, parse_expression, Value};

    fn run_ast(expr: &str) -> crate::Result<Value> {
        let ast = parse_expression(expr)?;
        let mut ctx = create_context!(
            "d" => 3,
            "f" => Arc::new(|_| Ok(Value::from(3)))
        );
        ast.exec(&mut ctx)
    }

    fn run_vm(expr: &str) -> crate::Result<Value> {
        let ast = parse_expression(expr)?;
        let program = compile_expression(&ast)?;
        let mut ctx = create_context!(
            "d" => 3,
            "f" => Arc::new(|_| Ok(Value::from(3)))
        );
```

Copilot AI Apr 9, 2026


Arc is used in the test context setup (Arc::new(...)) but the test module does not import it, so this file won’t compile. Add use std::sync::Arc; in the tests module (or qualify the uses) to fix the compilation error.

Contributor Author

Copilot AI commented Apr 9, 2026

@copilot compare the execution performance from multiple dimensions. I'm curious why the bytecode-based algorithm is slower

From the same Criterion run (cargo bench --bench execute_expression), here’s a multi-dimensional comparison:

  • End-to-end latency (current API path)

    • AST baseline (fddc8ae): 3.48–3.58 µs
    • Bytecode VM (ceb93eb): 3.67–3.80 µs
    • Delta: VM is ~5–7% slower in this benchmark.
  • Parse stage latency

    • AST baseline: 2.132–2.145 µs
    • VM commit: 2.088–2.091 µs
    • Parse is slightly faster now; slowdown is not from parsing.
  • Non-parse runtime budget (approx = execute - parse)

    • AST path: ~1.35–1.44 µs
    • VM path: ~1.58–1.71 µs
    • Extra ~0.2–0.3 µs is where VM currently loses.

Why VM is slower right now (likely contributors):

  1. Extra compile stage per call: execute() does parse + compile + run every invocation, while old path was parse + direct AST eval.
  2. Interpreter dispatch overhead: VM adds an instruction loop (match per opcode + stack push/pop) instead of direct recursive AST calls.
  3. Allocation/churn overhead: transient vectors/reversals for call args/list/map construction in VM execution.
  4. Dynamic lookup overhead in hot loop: operator/function manager access in each opcode path adds lock/lookups.

So the current numbers are expected for one-shot execution. The VM architecture should show its advantage when we reuse compiled bytecode across repeated evaluations (amortizing compile cost).
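The amortization point above can be sketched as a cache of compiled programs keyed by expression text. ProgramCache, compile, and the placeholder Program type here are hypothetical illustrations, not the crate's API:

```rust
use std::collections::HashMap;

// Placeholder for the crate's bytecode::Program.
#[derive(Clone)]
struct Program {
    code: Vec<i64>, // stand-in instruction stream
}

// Hypothetical compile step: expensive, done once per distinct expression.
fn compile(expr: &str) -> Program {
    Program { code: expr.bytes().map(|b| b as i64).collect() }
}

// Cache that amortizes compile cost across repeated evaluations.
struct ProgramCache {
    programs: HashMap<String, Program>,
}

impl ProgramCache {
    fn new() -> Self {
        Self { programs: HashMap::new() }
    }

    // Compile on first use; reuse the cached Program afterwards, so
    // repeated evaluations pay only the VM run cost.
    fn get_or_compile(&mut self, expr: &str) -> &Program {
        self.programs
            .entry(expr.to_string())
            .or_insert_with(|| compile(expr))
    }
}
```

With such a cache, the per-call cost for a hot expression drops from parse + compile + run to run alone, which is where the VM design is expected to win.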

@ashyanSpada
Owner

@copilot Compare the execution performance solely at the execution level, for instance executing AST and bytecode directly. The benchmark should cover multiple scenarios like short expression, long expression, function call, and so on. Finally, please dump the benchmark results into an md file.

Contributor Author

Copilot AI commented Apr 9, 2026

@copilot Compare the execution performance solely from the execution level, for instance execute AST and Byte code directly. The benchmark should cover multiple scenarios like short expression, long e...

Implemented in commit cb75c79.

I added execution-only benchmarks that compare direct ast.exec() vs direct bytecode::execute_program() across multiple scenarios: short expression, long expression, function call, list/map mix, and ternary chain.

I also dumped the measured benchmark output into /home/runner/work/expression_engine_rs/expression_engine_rs/benchmarks_execution_only.md.

No UI changes in this PR, so there is no UI screenshot to provide.

Copilot AI changed the title Migrate runtime from AST post-order execution to bytecode IR + VM Migrate runtime from AST post-order execution to bytecode IR + VM with execution-only benchmarks Apr 9, 2026
Owner

@ashyanSpada ashyanSpada left a comment


@copilot The current implementation of the value stack is rather slow. Consider preallocating the value stack with a limited length and using a current pointer to push and pop values. Note that if the current pointer would exceed the value stack's capacity, expand it first for more space.

Contributor Author

Copilot AI commented Apr 9, 2026

@copilot current implementation of value stack is rather slow. Consider pre allocating value stack with limit length and using current pointer to push and pop value. Notice if current pointer exceed v...

Implemented in commit 38a730f (with the core stack change introduced in f620115).

The VM now preallocates stack capacity from instruction count, tracks an explicit stack_top pointer, and grows capacity when full before push. Push/pop paths now use pointer-based invariants to avoid repeated dynamic-growth overhead in the hot loop.
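The preallocate-then-track-a-pointer strategy described here can be sketched roughly as follows. ValueStack, its field names, and the doubling growth policy are illustrative assumptions rather than the actual VM code:

```rust
// Sketch of the value-stack strategy: preallocate storage up front
// (e.g. sized from the instruction count), keep an explicit top index,
// and grow only when a push finds the stack full.
struct ValueStack {
    values: Vec<i64>, // the real VM stores the crate's Value type
    top: usize,       // explicit stack pointer: index of the next free slot
}

impl ValueStack {
    // Preallocate capacity up front to avoid growth in the hot loop.
    fn with_capacity(cap: usize) -> Self {
        ValueStack { values: vec![0; cap.max(1)], top: 0 }
    }

    fn push(&mut self, v: i64) {
        if self.top == self.values.len() {
            // Grow before push when full; doubling keeps pushes amortized O(1).
            let new_len = self.values.len() * 2;
            self.values.resize(new_len, 0);
        }
        self.values[self.top] = v;
        self.top += 1;
    }

    fn pop(&mut self) -> Option<i64> {
        if self.top == 0 {
            return None; // underflow surfaces as an error in the real VM
        }
        self.top -= 1;
        Some(self.values[self.top])
    }
}
```

Because the common case is a bounds check plus an index write, this avoids the per-push capacity bookkeeping that a plain Vec push/pop pattern repeats inside the dispatch loop.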

Screenshot: N/A (no UI changes in this backend/runtime update).

Copilot AI changed the title Migrate runtime from AST post-order execution to bytecode IR + VM with execution-only benchmarks Migrate runtime from AST post-order execution to bytecode IR + VM with execution-only benchmarks and VM stack optimization Apr 9, 2026
Copilot AI requested a review from ashyanSpada April 9, 2026 19:05
@@ -0,0 +1,33 @@
# Execution-only benchmark: AST vs Bytecode VM
Owner


@copilot According to the current implementation, update the performance comparison data.

Contributor Author


Updated in commit aa0d38b.

benchmarks_execution_only.md now reflects freshly re-run execution-only Criterion data from the current implementation (including the VM stack optimization), with updated per-scenario deltas and takeaways.

Screenshot: N/A (no UI changes; benchmark documentation update only).

Copilot AI requested a review from ashyanSpada April 9, 2026 21:48
@ashyanSpada
Owner

@codex[agent] Please increase the coverage rate of unit tests.

Co-authored-by: ashyanSpada <22587148+ashyanSpada@users.noreply.github.com>
@Codex

Codex AI commented Apr 10, 2026

@codex[agent] Please increase the coverage rate of unit tests.

Added VM-focused unit tests covering stack growth/reserve, jump validation bounds, JumpIfFalse type errors, and pop underflow to raise execution-level coverage (commit 9879a9f).
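In isolation, the underflow and jump-bound conditions those tests exercise look roughly like this (toy helpers with simplified signatures, not the VM's real error handling):

```rust
// Pop underflow must surface as a recoverable error, never a panic:
// an empty stack (top == 0) yields None instead of indexing out of bounds.
fn checked_pop(stack: &[i64], top: &mut usize) -> Option<i64> {
    if *top == 0 {
        return None;
    }
    *top -= 1;
    Some(stack[*top])
}

// A jump target must stay within the instruction stream; anything at or
// past the end is rejected before the instruction pointer moves.
fn jump_in_bounds(target: usize, instruction_count: usize) -> bool {
    target < instruction_count
}
```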

@ashyanSpada ashyanSpada merged commit eb0041b into master Apr 10, 2026
14 checks passed