perf(verifier-guest): Cache preprocessed-table commitments#601
perf(verifier-guest): Cache preprocessed-table commitments#601nicole-graus wants to merge 8 commits into
Conversation
Codex Code ReviewFindings
|
| .copied() | ||
| .unwrap_or_else(|| { | ||
| page::precomputed_commitment_cached(config, proof_options) | ||
| }); |
There was a problem hiding this comment.
Medium – Silent fallback on pages length mismatch
When vkey.pages is shorter than page_configs (e.g. serialized against an older ELF with fewer pages), the loop silently falls back to recomputing the missing slots instead of surfacing an error. This is benign today because Fiat-Shamir would catch a genuine mismatch, but it can mask a caller bug (wrong vkey supplied) that you'd want to detect early.
Consider returning an error, or at minimum logging a warning, when i >= vk.pages.len() for a non-private-input page:
vkey.and_then(|vk| {
if i >= vk.pages.len() {
log::warn!("vkey.pages shorter than page_configs (index {i}); recomputing");
}
vk.pages.get(i)
})Or tighten the contract in verify_with_options_with_vkey to validate vkey.pages.len() == page_configs.len() before entering VmAirs::new_with_vkey.
| /// Keccak256 fingerprint of the postcard-serialized vkey. Stable as long | ||
| /// as the field layout (and [`VKEY_VERSION`]) does not change. | ||
| pub fn compute_digest(&self) -> [u8; 32] { | ||
| let bytes = postcard::to_allocvec(self) |
There was a problem hiding this comment.
Low – compute_digest panics on serialization failure
postcard::to_allocvec returns a Result. Using .expect here panics on any serialization error (e.g. if the struct ever gains a field type that postcard can't serialize). Since compute_digest is meant to be called in security-sensitive contexts (binding to a VmProof), consider returning Result<[u8; 32], postcard::Error> so callers can handle failures gracefully:
| let bytes = postcard::to_allocvec(self) | |
| pub fn compute_digest(&self) -> Result<[u8; 32], postcard::Error> { | |
| let bytes = postcard::to_allocvec(self)?; | |
| let mut hasher = Keccak256::new(); | |
| hasher.update(&bytes); | |
| Ok(hasher.finalize().into()) | |
| } |
There was a problem hiding this comment.
Not addressed. Postcard over [u8; 32] and Vec<[u8; 32]> has no failure mode in practice.
| #[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)] | ||
| pub struct VmVerifyingKey { | ||
| /// Layout version. See [`VKEY_VERSION`]. | ||
| pub version: u32, |
There was a problem hiding this comment.
Low – version is pub and mutable with no enforcement today
The field is public and Clone/Deserialize allow callers to produce a VmVerifyingKey where version != VKEY_VERSION with no warning. test_vkey_version_mismatch_today_accepts intentionally pins the "advisory-only" behavior, which is fine as a stopgap.
For the upcoming vk_digest PR: consider making version private and only settable via from_elf_and_options, so the version is structurally correct by construction and any migration has to go through the constructor. That removes the risk of a future compute_digest check silently passing on a vkey with a mutated version.
Review: Cache preprocessed-table commitments (
|
|
/bench |
Benchmark — fib_iterative_8M (median of 3)Table parallelism: auto (cores / 3)
Commit: 26603cd · Baseline: cached · Runner: self-hosted bench |
…co/lambda_vm into cache-verifier-precomp-tables
Codex Code ReviewFindings
No other concrete issues found in the PR diff. |
| /// same `elf_bytes` they intend to verify against. A future PR will bind | ||
| /// `vkey.compute_digest()` into [`VmProof`] so this trust assumption is | ||
| /// enforced cryptographically. | ||
| pub fn verify_with_options_with_vkey( |
There was a problem hiding this comment.
[High] elf_bytes no longer guards program identity when vkey is provided
The docstring acknowledges this, but it's worth spelling out the exploit path clearly: a caller who obtains a valid VmProof for program A and presents it with vkey_A + elf_bytes_B will receive Ok(true) even though the proof has nothing to do with program B. The elf_bytes parameter implies it is the source of truth for what program is being verified, but when a vkey is supplied it is effectively a no-op for the security-critical commitments (decode, register, pages).
Two suggestions to reduce the chance of misuse before the vk_digest PR lands:
- Rename the parameter to something like
elf_bytes_for_table_countsor add a#[doc(alias)]/ inline comment at the call site that makes clear the elf is only used forpage_configsreconstruction, not for authenticating the proof. - In debug builds, assert that
vkey.decodematchesdecode::commitment_from_elf(parsed_elf, proof_options)as an early-exit sanity check. This costs nothing in release and would have caught the class of mistake described above during development.
#[cfg(debug_assertions)]
if let Some(vk) = vkey {
let expected = decode::commitment_from_elf(&program, proof_options)
.expect("decode commitment");
assert_eq!(vk.decode, expected,
"vkey was derived from a different ELF than elf_bytes");
}| if let Some(vk) = vkey | ||
| && vk.pages.len() != page_configs.len() | ||
| { | ||
| log::warn!( |
There was a problem hiding this comment.
[Medium] Silent fallback on page count mismatch masks caller bugs
A log::warn is easy to miss in production (especially inside the recursion guest where logging may be suppressed). The comment even says "likely a caller bug", yet execution continues — using the vkey for some page slots and recomputing others. This creates a hybrid state that is correct but invisible to the caller.
If a length mismatch genuinely indicates a programming error, it should surface as one:
if let Some(vk) = vkey
&& vk.pages.len() != page_configs.len()
{
panic!(
"vkey.pages length ({}) does not match page_configs length ({})",
vk.pages.len(),
page_configs.len()
);
}Or, if you need this to stay infallible, at minimum document which slots get which treatment so callers can reason about it. The current fallback is safe (Fiat-Shamir catches any mismatch), but silently mixing sources makes it hard to trust the vkey path is actually being exercised.
| /// (e.g. by hashing only non-private-input slots, or by asserting these | ||
| /// slots equal `[0u8; 32]` before hashing) so a malicious supplier cannot | ||
| /// produce two functionally-equivalent vkeys with different digests. | ||
| const PRIVATE_INPUT_PAGE_PLACEHOLDER: Commitment = [0u8; 32]; |
There was a problem hiding this comment.
[Medium] Zero placeholder in pages makes compute_digest() unstable across page configurations
Two vkeys for the same program can produce different digests if they were built with different numbers of private-input pages — the all-zero placeholder slots are included in the postcard bytes that get hashed. So vkey_A.compute_digest() != vkey_B.compute_digest() even when both vkeys would produce identical verification outcomes.
The NOTE acknowledges this will need fixing before vk_digest is bound into VmProof. Worth tracking as a hard requirement: if compute_digest() is used for caching, equality checks, or any binding before that fix lands, it will silently produce wrong results.
One concrete option is to exclude private-input slots from the digest entirely:
pub fn compute_digest(&self) -> [u8; 32] {
// Hash only the fields that are semantically meaningful:
// private-input page slots are always zero placeholders and
// must not affect the digest.
// TODO: replace with a proper canonical form once vk_digest
// is bound into VmProof.
let mut hasher = Keccak256::new();
hasher.update(self.version.to_le_bytes());
hasher.update(self.bitwise);
hasher.update(self.decode);
hasher.update(self.register);
hasher.update(self.keccak_rc);
for page in &self.pages {
if page != &PRIVATE_INPUT_PAGE_PLACEHOLDER {
hasher.update(page);
}
}
hasher.finalize().into()
}(This still has the ordering issue if a non-private page happens to be all zeros, but that's a Merkle root — astronomically unlikely in practice.)
| /// Keccak256 fingerprint of the postcard-serialized vkey. Stable as long | ||
| /// as the field layout (and [`VKEY_VERSION`]) does not change. | ||
| pub fn compute_digest(&self) -> [u8; 32] { | ||
| let bytes = postcard::to_allocvec(self) |
There was a problem hiding this comment.
[Low] compute_digest panics on serialization failure
postcard::to_allocvec returns a Result and the expect will panic if it ever fails. For a struct containing only u32, [u8; 32], and Vec<[u8; 32]> postcard serialization should not fail in practice, but propagating the error is cleaner and avoids any surprise in guest environments where the allocator may behave differently:
pub fn compute_digest(&self) -> Result<[u8; 32], postcard::Error> {
let bytes = postcard::to_allocvec(self)?;
let mut hasher = Keccak256::new();
hasher.update(&bytes);
Ok(hasher.finalize().into())
}There was a problem hiding this comment.
Not addressed. postcard::to_allocvec over [u8; 32] and Vec<[u8; 32]> has no failure mode in practice.
Review: Cache preprocessed-table commitmentsGood concept and well-documented PR. The 609× cycle reduction is compelling and the Fiat-Shamir argument for why a tampered vkey gets caught is correct. The mismatch tests are a good addition. Four issues to address before merging:
Issue #1 is the most consequential. The Issue #2: the comment at line 431 says "likely a caller bug" — if that's true, a panic (or at minimum an Issue #3 affects the future |
diegokingston
left a comment
There was a problem hiding this comment.
Review — caching preprocessed-table commitments
Clean, thoroughly tested, honestly documented PR with a large, real win (609× in-VM cycles for the recursion guest). CI is green.
Security model
The key shift: with a vkey, the verifier no longer recomputes the preprocessed-table commitments from the ELF — it reads them from a caller-supplied input, so for those tables elf_bytes stops binding the proof to the program. The PR is admirably upfront about this (the IMPORTANT doc block, the deferred vk_digest plan, and the tests that pin the current advisory behavior).
Per the intended architecture this is sound: the vkey is consumed inside the recursion guest as a public input, and the outer / "real" verifier authenticates the vkey after recursion — so it is a checked value, not a blindly-trusted one. Two things worth making explicit so that intent is enforced, not just documented:
vk_digestis the binding and isn't in this PR. Untilvkey.compute_digest()is bound intoVmProof(the deferred follow-up), nothing in-band ties a vkey to a proof.verify_with_options_with_vkey(.., Some(vkey))ispub; called outside the recursion-guest flow with a valid vkey for the wrong program it returnsOk(true)— exactly the case Fiat-Shamir does not catch (it only catches tampering, which the sixtest_vkey_*_mismatch_rejectstests cover well). Recommendvk_digestland as the immediate next PR, and theSome(vkey)path stay confined to the recursion guest until then.verify_with_optionsis unchanged (None→ recompute everything), so existing callers keep the full security model. Good.
Code quality
- The
VmAirs::new→new_with_vkeysplit (oldnew= thinNonewrapper) preserves the API cleanly; thevkey.map(..).unwrap_or_else(recompute)pattern is consistent across all five tables. - Test coverage is excellent — equivalence, per-table tamper rejection ×5, page mismatch, version-advisory, empty-pages fallback, and
test_vkey_fields_match_air_commitmentsdirectly asserting each cached commitment equals whatVmAirs::newbuilds. PRIVATE_INPUT_PAGE_PLACEHOLDERalready documents the future digest-malleability concern — good forward awareness.
Minor
- New dep
postcardpulls a transitive tree (heapless,embedded-io×2,cobs,atomic-polyfill,critical-section,hash32,spin). Justified — the no_std recursion guest needs to deserialize the vkey in-VM, whichbincode(already a workspace dep) can't do as cleanly. Worth one line in the PR description stating that rationale. VKEY_VERSION = 3for a brand-new struct is slightly odd (bumped during development).- Bot comments: page-length fallback → addressed via the added
log::warn!;compute_digest.expect→ fine to leave (postcard over[u8;32]/Vec<[u8;32]>has no practical failure mode);versionprivate → correctly deferred to thevk_digestPR.
Verdict
Mergeable. The only non-code item is staging: commit to vk_digest next, and keep the Some(vkey) path inside the outer-authenticated recursion flow until that binding exists.
Description
Adds
VmVerifyingKey(prover/src/vkey.rs). Caches all 5 preprocessed-table Merkle commitments (bitwise, decode, register, keccak_rc, per-page) so the verifier can read them as input instead of recomputing the FFT + Merkle pipeline. New entryverify_with_options_with_vkey(..., Some(vkey))reads the cached value. Plainverify_with_optionsstays as a thin wrapper passingNoneto preserve the existing API.This optimization targets the recursion guest specifically, i.e. the verifier compiled as a RISC-V program. The 609× cycle reduction below is in-VM RISC-V cycles, not host wall-clock.
Security
The vkey is a precomputed source for values that used to be recomputed. They are fed into the verifier's transcript in exactly the same place. A tampered vkey is caught by Fiat-Shamir (different transcript → different challenges → openings fail). The
test_vkey_*_mismatch_rejectstests assert this for each cached table.Cycle reduction
Measured on the recursion guest (the verifier compiled as a RISC-V program verifying an empty inner proof, blowup=2, 1 query):