feat: Domain Specific Language (DSL) using JIT or AOT #252
Pull request overview
Note: Copilot was unable to run its full agentic suite in this review.
This PR introduces a new “Proposal 2” DSL runtime pipeline (JIT via Cranelift, Native AoT, and WASM), adds benchmarking/examples around the runtime backends, and adds an example R package (pharmsolr) that compiles and simulates DSL models via extendr.
Changes:
- Add a new DSL frontend + runtime compilation targets (JIT/AoT/WASM), plus a shared Rust backend code emitter and AoT loader/exporter.
- Add Criterion benchmarks and multiple examples to compare runtime backends and demonstrate usage.
- Remove legacy JSON/exa codegen/loading paths and add a new pharmsolr R package wrapper for the DSL runtime.
Reviewed changes
Copilot reviewed 59 out of 86 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/json/errors.rs | Removed legacy JSON model error types (JSON pipeline removal). |
| src/json/codegen/sde.rs | Removed legacy JSON SDE codegen placeholder module. |
| src/json/codegen/ode.rs | Removed legacy JSON ODE codegen placeholder module. |
| src/json/codegen/mod.rs | Removed legacy JSON code generator implementation and tests. |
| src/json/codegen/closures.rs | Removed legacy JSON closure generator implementation and tests. |
| src/exa/mod.rs | Removed legacy exa module entrypoint. |
| src/exa/load.rs | Removed legacy dynamic library loader for exa. |
| src/exa/build.rs | Removed legacy cargo-template compilation logic for exa. |
| src/error/mod.rs | Added a new out-of-range error variant for output equation indices. |
| src/dsl/mod.rs | Added the new DSL module structure and public exports behind feature gates. |
| src/dsl/ast.rs | Added DSL AST types and pretty-printer for structured-block models. |
| src/dsl/diagnostic.rs | Added span/diagnostic rendering for parse/semantic errors. |
| src/dsl/lexer.rs | Added lexer/tokenizer for the new DSL syntax. |
| src/dsl/ir.rs | Added typed IR definitions for semantic/lowering phases. |
| src/dsl/rust_backend.rs | Added Rust source emitter for native/WASM runtime kernels + ABI symbols. |
| src/dsl/runtime.rs | Added runtime compilation/load entrypoints and cross-backend tests. |
| src/dsl/aot.rs | Added native AoT export/load implementation with API versioning. |
| src/build_support.rs | Introduced shared cargo-template build utilities used by runtime backends. |
| benches/proposal_runtime_matrix.rs | Added benchmark matrix for compile + prediction across backends/kinds. |
| examples/proposal_dsl_runtime_jit.rs | Added runtime JIT example for ODE/analytical/SDE. |
| examples/proposal_dsl_runtime_native_aot.rs | Added runtime Native AoT example for ODE/analytical/SDE. |
| examples/proposal_dsl_runtime_wasm.rs | Added runtime WASM example for ODE/analytical/SDE. |
| examples/proposal_dsl_runtime_meta.rs | Added example comparing the same model across JIT/AoT/WASM. |
| examples/bimodal_ke_entrypoint_meta.rs | Added entrypoint comparison example across public DSL APIs/backends. |
| examples/bimodal_ke_dsl_wasm.rs | Added direct DSL→WASM artifact compile+load example. |
| examples/bimodal_ke_dsl_aot.rs | Added direct DSL→AoT artifact compile+load example. |
| examples/bimodal_ke_dsl_runtime_jit.rs | Added runtime JIT example for a small ODE model. |
| examples/bimodal_ke_dsl_runtime_native_aot.rs | Added runtime Native AoT example for a small ODE model. |
| examples/bimodal_ke_dsl_runtime_wasm.rs | Added runtime WASM example for a small ODE model. |
| examples/exa.rs | Removed legacy exa example. |
| examples/json_exa.rs | Removed legacy JSON+exa comparison example. |
| pharmsolr/src/rust/Cargo.toml | Added an embedded Rust staticlib crate for the R package via extendr. |
| pharmsolr/src/rust/src/lib.rs | Added extendr bindings exposing compile+simulate and metadata helpers. |
| pharmsolr/src/entrypoint.c | Added R routine registration for extendr-generated wrappers. |
| pharmsolr/src/Makevars | Added Unix build rules to compile the Rust staticlib during R install. |
| pharmsolr/src/Makevars.win | Added Windows build rules (GNU target) for Rust staticlib during R install. |
| pharmsolr/R/pharmsolr.R | Added R-friendly data.frame API and name→index resolution helpers. |
| pharmsolr/R/extendr-wrappers.R | Added generated extendr wrapper stubs and dynlib registration. |
| pharmsolr/man/compile_model.Rd | Added R documentation for compile_model(). |
| pharmsolr/man/simulate_subject.Rd | Added R documentation for simulate_subject(). |
| pharmsolr/man/model_metadata.Rd | Added R documentation for routes/outputs/params/covariates accessors. |
| pharmsolr/man/route.Rd | Added R documentation for route lookup helpers. |
| pharmsolr/man/outeq.Rd | Added R documentation for output lookup helpers. |
| pharmsolr/inst/examples/onecmt.R | Added end-to-end R example using the name-based API. |
| pharmsolr/inst/examples/bench.R | Added a simple R benchmark script for compile+simulate throughput. |
| pharmsolr/README.md | Added package README describing installation, usage, and DSL format. |
| pharmsolr/NAMESPACE | Added roxygen-generated exports and dynlib registration. |
| pharmsolr/DESCRIPTION | Added R package metadata for pharmsolr. |
| pharmsolr/.gitignore | Added ignore rules for R + embedded Rust build artifacts. |
| Cargo.toml | Added DSL backend features and new optional deps (Cranelift/wasmtime) + bench entry. |
| CHANGELOG.md | Added changelog entries for runtime benchmarks and a WASM buffer fix note. |
| idx[is.na(col)] <- 1L | ||
| as.integer(idx - 1L) | ||
| } else { | ||
| as.integer(col) |
For non-character cmt/outeq columns, resolve_indices() returns as.integer(col) without handling NA. R stores NA_integer_ as -2147483648 (the minimum 32-bit integer), so Rust sees it as a negative i32 and rejects it, even when the field is irrelevant for the given evid. Consider mirroring the character-path behavior by mapping NA to a safe default (e.g., 0) and/or validating bounds before calling into Rust.
| as.integer(col) | |
| idx <- as.integer(col) | |
| idx[is.na(col)] <- 0L | |
| bad <- !is.na(col) & (idx < 0L | idx >= length(names_vec)) | |
| if (any(bad)) { | |
| stop(sprintf( | |
| "invalid %s index value(s): [%s]; valid range: [0, %d]", | |
| label, | |
| paste(unique(idx[bad]), collapse = ", "), | |
| length(names_vec) - 1L | |
| )) | |
| } | |
| idx |
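The same NA issue can also be handled on the Rust side of the bridge. The following is a minimal sketch, not the PR's code: `R_NA_INTEGER` and `index_from_r` are hypothetical names, and the only fact relied on is that R encodes `NA_integer_` as the minimum 32-bit integer. It separates "missing" from "invalid" so the caller can decide whether a missing index matters for a given row.

```rust
/// R encodes `NA_integer_` as the minimum 32-bit integer.
/// (Hypothetical constant name, introduced for this sketch.)
const R_NA_INTEGER: i32 = i32::MIN;

/// Hypothetical helper: classify a raw index received from R.
///
/// * `Ok(None)` when the value is R's NA sentinel,
/// * `Ok(Some(i))` when it is a valid zero-based index below `upper`,
/// * `Err(..)` for any other value (negative or out of range).
fn index_from_r(raw: i32, upper: usize, label: &str) -> Result<Option<usize>, String> {
    if raw == R_NA_INTEGER {
        return Ok(None);
    }
    match usize::try_from(raw) {
        Ok(idx) if idx < upper => Ok(Some(idx)),
        _ => Err(format!(
            "invalid {label} index {raw}; valid range: [0, {}]",
            upper.saturating_sub(1)
        )),
    }
}
```

With this shape, a caller can treat `Ok(None)` as an error only when the event kind actually needs the index.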
| let cmt = i32_to_usize(cmts[i].0, "cmt", i)?; | ||
| let outeq = i32_to_usize(outeqs[i].0, "outeq", i)?; | ||
| builder = match evid { | ||
| // EVID=1 is a dose; pharmsol distinguishes bolus from infusion | ||
| // by whether `dur` is zero (matches NONMEM semantics). | ||
| 1 => { | ||
| if dur > 0.0 { | ||
| builder.infusion(t, amt, cmt, dur) | ||
| } else { | ||
| builder.bolus(t, amt, cmt) | ||
| } | ||
| } | ||
| 0 => builder.observation(t, 0.0, outeq), |
The Rust bridge validates/converts both cmt and outeq for every row before branching on evid. This will error if either column contains the NA sentinel (or any negative value) even when that field is unused for the event type (e.g., outeq for dose rows, cmt for observation rows). A concrete fix is to move the i32_to_usize calls inside the match evid arms so only the relevant index is validated per event kind, and treat the NA sentinel explicitly if you want to accept NA for unused fields.
| let cmt = i32_to_usize(cmts[i].0, "cmt", i)?; | |
| let outeq = i32_to_usize(outeqs[i].0, "outeq", i)?; | |
| builder = match evid { | |
| // EVID=1 is a dose; pharmsol distinguishes bolus from infusion | |
| // by whether `dur` is zero (matches NONMEM semantics). | |
| 1 => { | |
| if dur > 0.0 { | |
| builder.infusion(t, amt, cmt, dur) | |
| } else { | |
| builder.bolus(t, amt, cmt) | |
| } | |
| } | |
| 0 => builder.observation(t, 0.0, outeq), | |
| builder = match evid { | |
| // EVID=1 is a dose; pharmsol distinguishes bolus from infusion | |
| // by whether `dur` is zero (matches NONMEM semantics). | |
| 1 => { | |
| let cmt = i32_to_usize(cmts[i].0, "cmt", i)?; | |
| if dur > 0.0 { | |
| builder.infusion(t, amt, cmt, dur) | |
| } else { | |
| builder.bolus(t, amt, cmt) | |
| } | |
| } | |
| 0 => { | |
| let outeq = i32_to_usize(outeqs[i].0, "outeq", i)?; | |
| builder.observation(t, 0.0, outeq) | |
| } |
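As a self-contained illustration of the suggestion above, the sketch below validates only the index that each event kind uses. The `Event` enum, `row_to_event`, and this stand-in `i32_to_usize` are assumptions made for the example; they are not pharmsol's builder API, and the handling of other `evid` values is likewise assumed.

```rust
/// Simplified stand-in for the events the builder would create (assumption).
#[derive(Debug, PartialEq)]
enum Event {
    Bolus { time: f64, amt: f64, cmt: usize },
    Infusion { time: f64, amt: f64, cmt: usize, dur: f64 },
    Observation { time: f64, outeq: usize },
}

/// Stand-in for the bridge's conversion helper: rejects negative values,
/// which includes R's NA sentinel (`i32::MIN`).
fn i32_to_usize(value: i32, label: &str, row: usize) -> Result<usize, String> {
    usize::try_from(value)
        .map_err(|_| format!("row {row}: {label} must be non-negative, got {value}"))
}

/// Convert one input row, validating `cmt` only for doses and `outeq`
/// only for observations.
fn row_to_event(
    row: usize,
    evid: i32,
    time: f64,
    amt: f64,
    dur: f64,
    cmt: i32,
    outeq: i32,
) -> Result<Event, String> {
    match evid {
        // Dose: a positive duration means infusion, otherwise bolus.
        1 => {
            let cmt = i32_to_usize(cmt, "cmt", row)?;
            if dur > 0.0 {
                Ok(Event::Infusion { time, amt, cmt, dur })
            } else {
                Ok(Event::Bolus { time, amt, cmt })
            }
        }
        // Observation: only the output index is relevant.
        0 => {
            let outeq = i32_to_usize(outeq, "outeq", row)?;
            Ok(Event::Observation { time, outeq })
        }
        // Assumption: anything else is rejected in this sketch.
        other => Err(format!("row {row}: unsupported evid {other}")),
    }
}
```

A dose row whose `outeq` is NA now converts cleanly, while a dose row whose `cmt` is NA still fails with a message naming the row and field.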
| let _ = &model_name; | ||
| event_callback("log".into(), output); |
model_name is captured but not used (only let _ = &model_name;). Either remove the parameter to avoid misleading API surface, or incorporate it into the callback payload (e.g., include it in the event kind or prepend it to the message) so concurrent builds can be attributed correctly.
| let _ = &model_name; | |
| event_callback("log".into(), output); | |
| let message = format!("[{model_name}] {output}"); | |
| event_callback("log".into(), message); |
| if flavor.emits_wasm_allocators() { | ||
| writeln!(source, "#[no_mangle]").unwrap(); | ||
| writeln!( | ||
| source, | ||
| "pub extern \"C\" fn {ALLOC_F64_BUFFER_SYMBOL}(len: usize) -> *mut f64 {{" | ||
| ) | ||
| .unwrap(); | ||
| writeln!(source, " if len == 0 {{").unwrap(); | ||
| writeln!(source, " return core::ptr::null_mut();").unwrap(); | ||
| writeln!(source, " }}").unwrap(); | ||
| writeln!( | ||
| source, | ||
| " let mut buffer = Vec::<f64>::with_capacity(len);" | ||
| ) | ||
| .unwrap(); | ||
| writeln!(source, " let ptr = buffer.as_mut_ptr();").unwrap(); | ||
| writeln!(source, " core::mem::forget(buffer);").unwrap(); | ||
| writeln!(source, " ptr").unwrap(); | ||
| writeln!(source, "}}").unwrap(); | ||
| writeln!(source).unwrap(); | ||
| writeln!(source, "#[no_mangle]").unwrap(); | ||
| writeln!( | ||
| source, | ||
| "pub unsafe extern \"C\" fn {FREE_F64_BUFFER_SYMBOL}(ptr: *mut f64, len: usize) {{" | ||
| ) | ||
| .unwrap(); | ||
| writeln!(source, " if ptr.is_null() || len == 0 {{").unwrap(); | ||
| writeln!(source, " return;").unwrap(); | ||
| writeln!(source, " }}").unwrap(); | ||
| writeln!( | ||
| source, | ||
| " drop(Vec::<f64>::from_raw_parts(ptr, len, len));" | ||
| ) | ||
| .unwrap(); | ||
| writeln!(source, "}}").unwrap(); |
The generated WASM allocator ABI relies on a strict contract: the caller must later free the pointer with the exact same len used for allocation, and the pointer is assumed to come from this allocator. This is a sharp edge for host integrations (and easy to misuse when passing buffers across FFI). Add explicit doc comments (in the emitted source and/or the host-facing docs) describing the required call pattern and invariants (including that the allocation uses capacity and the len argument must match on free).
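One way the requested documentation could look is sketched below. This is not the emitter's output: the function names are hypothetical, the `#[no_mangle]` export attribute is omitted so the snippet compiles on its own, and the allocation deliberately uses a boxed slice instead of the emitted `Vec::with_capacity` plus `mem::forget`, so that the freed layout is exactly `len` elements and the buffer starts zeroed.

```rust
/// Allocates a zero-initialised buffer of `len` `f64` values and hands
/// ownership to the caller.
///
/// Contract for hosts:
/// * `len == 0` returns a null pointer and allocates nothing.
/// * The returned pointer must be released exactly once with
///   `free_f64_buffer`, passing the same `len`.
pub extern "C" fn alloc_f64_buffer(len: usize) -> *mut f64 {
    if len == 0 {
        return core::ptr::null_mut();
    }
    // A boxed slice has no spare capacity, so its layout is exactly `len` items.
    let boxed: Box<[f64]> = vec![0.0_f64; len].into_boxed_slice();
    Box::into_raw(boxed) as *mut f64
}

/// Releases a buffer obtained from `alloc_f64_buffer`.
///
/// # Safety
/// * `ptr` must come from `alloc_f64_buffer` and must not have been freed.
/// * `len` must equal the `len` passed at allocation time.
/// * Null pointers and `len == 0` are accepted and ignored.
pub unsafe extern "C" fn free_f64_buffer(ptr: *mut f64, len: usize) {
    if ptr.is_null() || len == 0 {
        return;
    }
    unsafe {
        // Rebuild the boxed slice with the original length and drop it.
        let slice = core::ptr::slice_from_raw_parts_mut(ptr, len);
        drop(Box::from_raw(slice));
    }
}
```

A host would call the allocator, write inputs or read outputs through the pointer, and then free with the identical length; freeing with a different length or a foreign pointer is undefined behaviour under this contract.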
| Branch | feat/jit-cranelift-wasm |
| Testbed | mhovd-pgx |
| Benchmark | Result, ns (Δ%) | Baseline, ns | Upper boundary, ns (limit %) |
|---|---|---|---|
| Conditional dose modification | 1,167.10 (-3.74%) | 1,212.49 | 1,262.56 (92.44%) |
| Create large dataset (100 subjects) | 54,563.00 (-0.53%) | 54,854.75 | 57,269.30 (95.27%) |
| Data expand complex (1h intervals) | 27,839.00 (-0.67%) | 28,025.78 | 30,294.28 (91.90%) |
| Data expand simple (1h intervals) | 469.94 (-4.45%) | 491.80 | 521.43 (90.13%) |
| Data expand with additional time | 37,970.00 (-2.32%) | 38,873.19 | 42,165.24 (90.05%) |
| Filter exclude subjects | 30,752.00 (-0.10%) | 30,781.97 | 31,516.25 (97.58%) |
| Filter include subjects | 7,795.10 (-1.91%) | 7,946.52 | 8,384.87 (92.97%) |
| Modify all bolus doses | 1,139.80 (-3.17%) | 1,177.07 | 1,222.12 (93.26%) |
| Modify all infusion doses | 1,156.40 (-4.36%) | 1,209.08 | 1,256.16 (92.06%) |
| SubjectBuilder multi-occasion | 265.91 (+0.41%) | 264.83 | 275.27 (96.60%) |
| SubjectBuilder simple | 103.76 (-0.67%) | 104.45 | 109.36 (94.88%) |
| SubjectBuilder with covariates | 279.90 (+0.91%) | 277.37 | 294.54 (95.03%) |
| nca_auc_cmax_metrics | 571.58 (-2.72%) | 587.57 | 616.97 (92.64%) |
| nca_population/10 | 47,132.00 (+0.06%) | 47,106.00 | 49,825.90 (94.59%) |
| nca_population/100 | 125,600.00 (+2.53%) | 122,500.59 | 127,988.34 (98.13%) |
| nca_population/500 | 370,920.00 (-8.39%) | 404,888.24 | 421,574.08 (87.98%) |
| nca_single_subject | 994.79 (-1.78%) | 1,012.79 | 1,050.98 (94.65%) |
| one_compartment | 28,562.00 (+21.76%) | 23,458.39 | 29,788.80 (95.88%) |
| one_compartment_covariates | 43,495.00 (+39.23%) | 31,240.61 | 45,480.89 (95.63%) |
| readme 20 | 336,110.00 (+1.81%) | 330,122.22 | 352,219.95 (95.43%) |
| two_compartment | 39,146.00 (+47.58%) | 26,525.53 | 40,919.79 (95.67%) |
| ### Added | ||
| - Add Proposal 2 runtime benchmark matrix across ODE, analytical, and SDE models. | ||
| - Add Proposal 2 release-readiness summary and compatibility/performance baseline notes. | ||
| ### Fixed | ||
| - Zero the reusable WASM guest output buffer before sparse kernel calls so diffusion outputs cannot reuse stale values across invocations. |
It is not necessary to populate the changelog for PRs.
| argmin-math = "0.5.1" | ||
| tracing = "0.1.41" | ||
| moka = { version = "0.12.14", features = ["sync"] } | ||
| wasmtime = { version = "28.0.1", optional = true } |
The latest version is 44.0.0, but we use 28.0.1?
| cranelift = { version = "0.115", optional = true } | ||
| cranelift-jit = { version = "0.115", optional = true } | ||
| cranelift-module = { version = "0.115", optional = true } | ||
| cranelift-native = { version = "0.115", optional = true } |
Should these also be updated?
| @@ -0,0 +1,342 @@ | |||
| use std::env; | |||
I assume this file is no longer needed?
| fn subject_for_indices(route_index: usize, output_index: usize) -> Subject { | ||
| let mut builder = Subject::builder(MODEL_NAME).infusion(0.0, 500.0, route_index, 0.5); | ||
| for time in OBSERVATION_TIMES { | ||
| builder = builder.missing_observation(time, output_index); | ||
| } | ||
| builder.build() | ||
| } | ||
|
|
||
| pub fn legacy_subject() -> Subject { | ||
| subject_for_indices(0, 0) | ||
| } |
legacy_subject seems unnecessary here?
Maybe rename it to subject_with_index.
| //! Backend-neutral frontend crate for the pharmsol Proposal 2 DSL. | ||
| //! | ||
| //! Slice 2 moves the parsing frontend here on top of the shared frontend data | ||
| //! modules already extracted in Slice 1: | ||
| //! | ||
| //! - AST and model syntax types | ||
| //! - diagnostic and report types | ||
| //! - typed semantic IR | ||
| //! - lexical analysis | ||
| //! - canonical parse entrypoints | ||
| //! - authoring desugaring used by the parser | ||
| //! - semantic analysis and diagnostics | ||
| //! | ||
| //! Execution lowering now also lives here, while `pharmsol::dsl` continues to | ||
| //! re-export the stable runtime-facing surface during the migration. |
| @@ -0,0 +1,22 @@ | |||
| model recommended_style { | |||
| @@ -0,0 +1,13 @@ | |||
| model = recommended_style | |||
| pub(crate) fn process_events( | ||
| &self, | ||
| reorder: Option<(&Fa, &Lag, &[f64], &Covariates)>, | ||
| ignore: bool, | ||
| _ignore: bool, | ||
| ) -> Vec<Event> { | ||
| let mut occ = self.clone(); | ||
| occ.add_lagtime(reorder); | ||
| occ.add_bioavailability(reorder); | ||
|
|
If we ignore the ignore argument, we should remove it from the function signature.
| @@ -0,0 +1,132 @@ | |||
| model one_cmt_oral_iv { | |||