chore: Refactor benchmarks #258
Merged
Pull request overview
This PR refactors the Criterion benchmark suite to focus on more representative end-to-end workloads (predictions, NCA, and population likelihood), while removing older micro/DSL-matrix benchmarks and tightening benchmark target configuration in Cargo.toml.
Changes:
- Disable Cargo’s automatic benchmark discovery and explicitly define the benchmark targets that should be built/run.
- Add new benchmarks for prediction generation and population likelihood (matrix + batch).
- Simplify and consolidate the NCA benchmark to a fixed-size population run, and remove older benchmark files.
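As a rough illustration of the first change, the Cargo.toml side might look like the sketch below. The target names `predictions`, `nca`, and `likelihood` come from the per-file summary of this PR. The `harness = false` lines are an assumption based on the usual Criterion setup, not something this page shows.

```toml
[package]
# ... existing package fields ...
# Stop Cargo from treating every file under benches/ as a benchmark target.
autobenches = false

# Only the targets listed here are built and run by `cargo bench`.
[[bench]]
name = "predictions"
harness = false  # assumption: Criterion supplies its own main function

[[bench]]
name = "nca"
harness = false

[[bench]]
name = "likelihood"
harness = false
```

With a layout like this, `cargo bench --bench likelihood` runs a single target.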
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| Cargo.toml | Disables autobenches and replaces the previous bench list with explicit predictions, nca, and likelihood targets. |
| benches/predictions.rs | Adds a new benchmark covering analytical vs ODE prediction generation with cold/hot cache variants. |
| benches/nca.rs | Refactors NCA benchmarks into a single grouped population benchmark with throughput + flat sampling. |
| benches/likelihood.rs | Adds a new benchmark for log_likelihood_matrix and log_likelihood_batch population workloads. |
| benches/runtime_matrix.rs | Removes the previous DSL runtime compile/predict benchmark matrix. |
| benches/performance.rs | Removes the previous “readme”-style performance benchmark. |
| benches/ode.rs | Removes the previous handwritten ODE benchmarks. |
| benches/likelihood_matrix.rs | Removes the prior likelihood benchmark suite (breakdown + matrix/batch variants). |
| benches/data.rs | Removes the previous dataset/builder operation benchmarks. |
Siel approved these changes on May 14, 2026
Comment on lines +1824 to +1828:

    let key = (
        subject.hash(),
        crate::simulator::equation::spphash(support_point),
        error_models.hash(),
    );
Comment on lines +240 to +245:

    model
        .estimate_predictions(
            black_box(&subject),
            black_box(&theta),
        )
        .unwrap(),
Comment on lines +1833 to +1839:

        let predictions = model.estimate_predictions(subject, support_point)?;
        let log_lik = predictions.log_likelihood(error_models)?;
        cache.insert(key, log_lik);
        Ok(log_lik)
    } else {
        let predictions = model.estimate_predictions(subject, support_point)?;
        predictions.log_likelihood(error_models)
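
Both branches in the excerpt above compute the predictions and the log-likelihood in the same way; only the cache insertion differs. A minimal, standard-library-only sketch of that pattern follows. Every name in it (`Key`, `hash_of`, `support_point_hash`, `cached_log_likelihood`) is hypothetical, the closure stands in for the `estimate_predictions` plus `log_likelihood` calls, and the lookup-on-hit step is an assumption, since the excerpt does not show it. This is not the crate's actual API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical cache key: (subject hash, support-point hash, error-model hash).
type Key = (u64, u64, u64);

/// Hash any `Hash` value to a `u64` with the standard library's default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hash a support point through the bit patterns of its parameters,
/// because `f64` does not implement `Hash`.
fn support_point_hash(support_point: &[f64]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for p in support_point {
        p.to_bits().hash(&mut hasher);
    }
    hasher.finish()
}

/// Return the cached log-likelihood for `key` if present. Otherwise run
/// `compute` once, store the result when a cache was supplied, and return it.
/// `compute` is a stand-in for the expensive prediction + likelihood step.
fn cached_log_likelihood<E>(
    cache: Option<&mut HashMap<Key, f64>>,
    key: Key,
    compute: impl FnOnce() -> Result<f64, E>,
) -> Result<f64, E> {
    match cache {
        Some(cache) => {
            // Assumed lookup step: a hit skips the computation entirely.
            if let Some(&hit) = cache.get(&key) {
                return Ok(hit);
            }
            let log_lik = compute()?;
            cache.insert(key, log_lik);
            Ok(log_lik)
        }
        // No cache supplied: compute and return without storing anything.
        None => compute(),
    }
}
```

Passing the computation as a closure keeps it in one place, so the cached and uncached paths cannot drift apart. Whether that trade-off suits the real code depends on details this page does not show.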

    let predictions = NativeSdeModel::estimate_predictions(self, subject, support_point)?;
    let likelihood = match bound_error_models.as_ref() {
        Some(error_models) => Some(predictions.log_likelihood(error_models)?.exp()),
This was referenced May 14, 2026
Closed
Merged
Merged
We need more meaningful benchmarks that actually test what we want to evaluate.