perf: monomorphic dispatch for LogUp constraints #593
Store concrete copies of LookupBatchedTermConstraint and LookupAccumulatedConstraint alongside the boxed versions. Override compute_transition_prover on AirWithBuses to call the concrete types directly, enabling the compiler to inline fingerprint computation and eliminate vtable overhead for the expensive extension-field constraints. Base (domain) constraints still use dyn dispatch since they are heterogeneous user-defined types.
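The storage pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the trait method `evaluate`, the `alpha` field, the `u64` toy fingerprint, and the helpers `evaluate_dyn` / `evaluate_direct` are all assumptions made for the example; only the type and field names echo the PR text.

```rust
// Hypothetical stand-ins: the names mirror the PR text, but the bodies and
// signatures are invented for illustration and are not the real stark API.
trait TransitionConstraintEvaluator {
    fn evaluate(&self, row: &[u64], out: &mut Vec<u64>);
}

#[derive(Clone)]
struct LookupBatchedTermConstraint {
    alpha: u64, // assumed challenge value for the toy fingerprint
}

impl TransitionConstraintEvaluator for LookupBatchedTermConstraint {
    #[inline]
    fn evaluate(&self, row: &[u64], out: &mut Vec<u64>) {
        // Toy "fingerprint": Horner-style combination with wrapping arithmetic.
        let fp = row
            .iter()
            .fold(0u64, |acc, x| acc.wrapping_mul(self.alpha).wrapping_add(*x));
        out.push(fp);
    }
}

struct AirWithBuses {
    // Boxed versions, kept for code paths that need one uniform list.
    constraints: Vec<Box<dyn TransitionConstraintEvaluator>>,
    // Concrete copies of the same LogUp constraints for the prover hot path.
    logup_term_constraints_direct: Vec<LookupBatchedTermConstraint>,
}

impl AirWithBuses {
    fn new(terms: Vec<LookupBatchedTermConstraint>) -> Self {
        let constraints = terms
            .iter()
            .cloned()
            .map(|c| Box::new(c) as Box<dyn TransitionConstraintEvaluator>)
            .collect();
        Self {
            constraints,
            logup_term_constraints_direct: terms,
        }
    }

    // Dynamic dispatch: each call goes through the trait object's vtable.
    fn evaluate_dyn(&self, row: &[u64]) -> Vec<u64> {
        let mut out = Vec::new();
        for c in &self.constraints {
            c.evaluate(row, &mut out);
        }
        out
    }

    // Static dispatch: the concrete type is known at compile time,
    // so the call is a candidate for inlining.
    fn evaluate_direct(&self, row: &[u64]) -> Vec<u64> {
        let mut out = Vec::new();
        for c in &self.logup_term_constraints_direct {
            c.evaluate(row, &mut out);
        }
        out
    }
}
```

Both paths compute the same values; the difference is only in how the call is resolved, which is why the concrete copies can sit next to the boxed ones without changing results.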
Codex Code Review
No actionable issues found in the PR diff. I reviewed the added monomorphic LogUp prover dispatch and the cloned direct constraint construction in Verification note: I attempted |
| // Phase 2: LogUp term constraints — monomorphic dispatch (concrete type) | ||
| for c in &self.logup_term_constraints_direct { | ||
| c.evaluate_verifier(evaluation_context, ext_evals); | ||
| } | ||
| // Phase 3: LogUp accumulated constraint — monomorphic dispatch (concrete type) | ||
| if let Some(c) = &self.logup_accumulated_direct { | ||
| c.evaluate_verifier(evaluation_context, ext_evals); | ||
| } |
Medium — evaluate_verifier called where evaluate_prover is expected
This calls evaluate_verifier directly on the concrete types, which works today only because the default evaluate_prover implementation for extension constraints is just a thin wrapper that delegates to evaluate_verifier. But this is fragile:
- It's semantically inconsistent: compute_transition_prover calls evaluate_verifier on Phase 2/3 while Phase 1 correctly calls evaluate_prover.
- If a future contributor adds a real prover-specific fast path by overriding evaluate_prover on either concrete type (e.g., skipping some verifier-only checks), it will be silently bypassed here.
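The static/monomorphic dispatch benefit is preserved either way since these are concrete types. Prefer the evaluate_prover form (second block below) over the current evaluate_verifier form (first block below):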
| // Phase 2: LogUp term constraints — monomorphic dispatch (concrete type) | |
| for c in &self.logup_term_constraints_direct { | |
| c.evaluate_verifier(evaluation_context, ext_evals); | |
| } | |
| // Phase 3: LogUp accumulated constraint — monomorphic dispatch (concrete type) | |
| if let Some(c) = &self.logup_accumulated_direct { | |
| c.evaluate_verifier(evaluation_context, ext_evals); | |
| } | |
| // Phase 2: LogUp term constraints — monomorphic dispatch (concrete type) | |
| for c in &self.logup_term_constraints_direct { | |
| c.evaluate_prover(evaluation_context, base_evals, ext_evals); | |
| } | |
| // Phase 3: LogUp accumulated constraint — monomorphic dispatch (concrete type) | |
| if let Some(c) = &self.logup_accumulated_direct { | |
| c.evaluate_prover(evaluation_context, base_evals, ext_evals); | |
| } |
The compiler will inline evaluate_prover → evaluate_verifier in one pass, so there's no performance difference, but the call chain is now honest.
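To make the fragility concrete, here is a small sketch of how calling evaluate_verifier directly can skip a prover override. The trait `ExtConstraint`, the two structs, and the simplified signatures are hypothetical; the pushed values are just tags marking which path ran, not real constraint evaluations.

```rust
// Hypothetical trait shape inferred from the review text; the real
// signatures in the stark crate may differ.
trait ExtConstraint {
    fn evaluate_verifier(&self, ext_evals: &mut Vec<u64>);

    // Default prover path: a thin wrapper that delegates to the verifier path.
    fn evaluate_prover(&self, _base_evals: &mut Vec<u64>, ext_evals: &mut Vec<u64>) {
        self.evaluate_verifier(ext_evals);
    }
}

// Relies on the default evaluate_prover.
struct DefaultTerm;

impl ExtConstraint for DefaultTerm {
    fn evaluate_verifier(&self, ext_evals: &mut Vec<u64>) {
        ext_evals.push(1); // tag 1: verifier path ran
    }
}

// Overrides evaluate_prover with a (pretend) prover-specific fast path.
struct FastPathTerm;

impl ExtConstraint for FastPathTerm {
    fn evaluate_verifier(&self, ext_evals: &mut Vec<u64>) {
        ext_evals.push(1); // tag 1: verifier path ran
    }

    fn evaluate_prover(&self, _base_evals: &mut Vec<u64>, ext_evals: &mut Vec<u64>) {
        ext_evals.push(2); // tag 2: prover fast path ran
    }
}

// Mirrors the current Phase 2/3 code: calls the verifier method directly.
fn run_via_verifier<C: ExtConstraint>(c: &C) -> Vec<u64> {
    let mut ext = Vec::new();
    c.evaluate_verifier(&mut ext);
    ext
}

// Mirrors the suggested code: calls the prover method, still statically dispatched.
fn run_via_prover<C: ExtConstraint>(c: &C) -> Vec<u64> {
    let mut base = Vec::new();
    let mut ext = Vec::new();
    c.evaluate_prover(&mut base, &mut ext);
    ext
}
```

With `DefaultTerm` the two helpers behave identically, which matches today's situation. With `FastPathTerm`, `run_via_verifier` never reaches the override, which is the silent bypass the comment warns about. Both helpers are generic over a concrete type, so neither involves a vtable.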
Review: perf: monomorphic dispatch for LogUp constraints
The optimization is sound and well-documented. The 8.5% wall-clock improvement is credible for this kind of devirtualization. One semantic issue worth fixing, one low-level maintenance note.
Medium:
|
|
/bench 5 |
|
/bench 5 |
Benchmark: fib_iterative_8M (median of 7)
Table parallelism: auto (cores / 3)
Commit: 45152db · Baseline: cached · Runner: self-hosted bench |
|
/bench 7 1 |
Summary
- Store concrete copies of LookupBatchedTermConstraint and LookupAccumulatedConstraint on AirWithBuses alongside the boxed dyn TransitionConstraintEvaluator versions
- Override compute_transition_prover to dispatch LogUp constraints via concrete types, enabling the compiler to inline fingerprint computation
Benchmark results
fib_iterative_4M, PARALLEL_TABLES=1, 5 samples:
Verification: baseline verifier accepts the proof.
Why it works
The prover's constraint evaluation hot loop calls evaluate_prover() through a dyn TransitionConstraintEvaluator vtable for every constraint at every LDE domain point. For LogUp constraints (which do expensive extension-field fingerprint computation), this prevents the compiler from inlining across the call boundary. Storing concrete types and calling them directly lets LLVM inline the fingerprint logic and optimize across constraint boundaries.
Test plan
- cargo test --release -p stark (124/124 pass)
- cargo test --release -p lambda-vm-prover (347/347 pass, 3 pre-existing failures unrelated)
- /bench on CI runner