perf(verifier-guest): Cache preprocessed-table commitments #601

Open: nicole-graus wants to merge 8 commits into main from cache-verifier-precomp-tables


@nicole-graus
Collaborator

Description

Adds VmVerifyingKey (prover/src/vkey.rs), which caches all five preprocessed-table Merkle commitments (bitwise, decode, register, keccak_rc, per-page) so the verifier can read them as input instead of recomputing the FFT + Merkle pipeline. The new entry point verify_with_options_with_vkey(..., Some(vkey)) reads the cached values. Plain verify_with_options stays as a thin wrapper that passes None, preserving the existing API.

This optimization targets the recursion guest specifically, i.e. the verifier compiled as a RISC-V program. The 609× cycle reduction below is in-VM RISC-V cycles, not host wall-clock.
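A minimal sketch of the wrapper pattern described above, assuming simplified stand-in types. The names `recompute_bitwise_commitment`, `verify_with_vkey`, and `verify` are hypothetical and only illustrate the shape of "cached value if present, otherwise recompute"; they are not the crate's actual signatures.

```rust
// Hypothetical stand-ins; the real types live in the prover crate.
type Commitment = [u8; 32];

struct VmVerifyingKey {
    bitwise: Commitment,
}

/// Stand-in for the expensive FFT + Merkle pipeline.
fn recompute_bitwise_commitment() -> Commitment {
    [7u8; 32]
}

/// Uses the cached commitment when a vkey is supplied, otherwise recomputes.
fn bitwise_commitment(vkey: Option<&VmVerifyingKey>) -> Commitment {
    vkey.map(|vk| vk.bitwise)
        .unwrap_or_else(recompute_bitwise_commitment)
}

/// New entry point: the caller may pass a cached vkey.
fn verify_with_vkey(expected: Commitment, vkey: Option<&VmVerifyingKey>) -> bool {
    bitwise_commitment(vkey) == expected
}

/// Existing entry point, kept as a thin wrapper that passes `None`.
fn verify(expected: Commitment) -> bool {
    verify_with_vkey(expected, None)
}
```

Callers that never pass a vkey keep going through `verify`, so their behavior does not change.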

Security

The vkey is a precomputed source for values that used to be recomputed. They are fed into the verifier's transcript in exactly the same place. A tampered vkey is caught by Fiat-Shamir (different transcript → different challenges → openings fail). The test_vkey_*_mismatch_rejects tests assert this for each cached table.
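The data flow behind this argument can be sketched with a toy transcript. This is an illustration only: `DefaultHasher` is not cryptographic and merely stands in for the real transcript hash, and `Transcript` and `challenge_for` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Commitment = [u8; 32];

/// Toy Fiat-Shamir transcript. DefaultHasher is NOT cryptographic; it only
/// stands in for the real transcript hash to show the data flow.
struct Transcript {
    hasher: DefaultHasher,
}

impl Transcript {
    fn new() -> Self {
        Transcript { hasher: DefaultHasher::new() }
    }

    /// Absorb a preprocessed-table commitment into the transcript.
    fn absorb(&mut self, commitment: &Commitment) {
        self.hasher.write(commitment);
    }

    /// Derive a challenge from everything absorbed so far.
    fn challenge(&self) -> u64 {
        self.hasher.finish()
    }
}

/// The challenge depends only on the absorbed bytes, regardless of whether
/// the commitment was recomputed or read from a cached vkey.
fn challenge_for(commitment: &Commitment) -> u64 {
    let mut t = Transcript::new();
    t.absorb(commitment);
    t.challenge()
}
```

Flipping a single bit of the commitment changes the derived challenge, which is why openings produced against the honest commitment stop verifying.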

Cycle reduction

Measured on the recursion guest (the verifier compiled as a RISC-V program verifying an empty inner proof, blowup=2, 1 query):

| Caching | Cycles | vs original |
|---|---|---|
| No cache | 40.5 B | |
| bitwise | 2.85 B | 14× |
| + page | 68.5 M | 591× |
| + decode + register + keccak_rc | 67 M | 609× |
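As a quick arithmetic check, the ratios can be recomputed from the rounded cycle counts in the table. The helper names are hypothetical. With the rounded 67 M figure the last row comes out near 604×, so the reported 609× presumably reflects the unrounded cycle count.

```rust
/// Speedup of a configuration relative to the uncached baseline.
fn speedup(baseline_cycles: f64, cycles: f64) -> f64 {
    baseline_cycles / cycles
}

/// Ratios computed from the rounded cycle counts in the table.
fn table_ratios() -> [f64; 3] {
    let baseline = 40.5e9;
    [
        speedup(baseline, 2.85e9), // bitwise only
        speedup(baseline, 68.5e6), // + page
        speedup(baseline, 67.0e6), // + decode + register + keccak_rc
    ]
}
```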

@github-actions

Codex Code Review

Findings

  • High - verify_with_options_with_vkey no longer authenticates elf_bytes when vkey is supplied.
    In prover/src/lib.rs, the verifier still accepts elf_bytes, but VmAirs::new_with_vkey replaces the ELF-derived decode, register, and page commitments with caller-supplied vkey fields at prover/src/lib.rs and prover/src/lib.rs. That means a proof produced for program A can be verified against program B’s elf_bytes if the caller also supplies program A’s vkey and the page layout is compatible. Fiat-Shamir only proves consistency with the supplied commitments; it does not prove those commitments came from the elf_bytes argument. This is a core soundness/API issue for any caller treating this function like verify_with_options plus caching. The vkey must be trusted and bound to the expected program through an external expected digest/key, or this API should recompute/validate the vkey against elf_bytes before using it.

  • Low - version is documented as validation-relevant but is never enforced.
    prover/src/vkey.rs says an older layout “stops validating,” but verify_with_options_with_vkey never reads version, and the new test explicitly shows a mutated version still accepts. This can mislead integrators into thinking serialized vkeys are version-checked. Either enforce the version before using the vkey or change the docs to say it is only included in compute_digest() and currently advisory.
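One way to address both findings is to check the vkey before using it. The sketch below assumes a simplified `VmVerifyingKey`, a hypothetical `check_vkey` helper, and a caller-pinned expected digest. `DefaultHasher` is a non-cryptographic stand-in for Keccak256, and the version value 3 is the one mentioned elsewhere in this thread.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Commitment = [u8; 32];

// Value mentioned in this review thread; treat it as illustrative.
const VKEY_VERSION: u32 = 3;

struct VmVerifyingKey {
    version: u32,
    decode: Commitment,
    register: Commitment,
    pages: Vec<Commitment>,
}

#[derive(Debug, PartialEq)]
enum VkeyError {
    UnsupportedVersion(u32),
    DigestMismatch,
}

/// Toy digest: DefaultHasher stands in for Keccak256 over a canonical encoding.
fn toy_digest(vk: &VmVerifyingKey) -> u64 {
    let mut h = DefaultHasher::new();
    h.write_u32(vk.version);
    h.write(&vk.decode);
    h.write(&vk.register);
    h.write_usize(vk.pages.len());
    for page in &vk.pages {
        h.write(page);
    }
    h.finish()
}

/// Reject the vkey unless its version is current and its digest matches the
/// value the caller pinned for the expected program.
fn check_vkey(vk: &VmVerifyingKey, expected_digest: u64) -> Result<(), VkeyError> {
    if vk.version != VKEY_VERSION {
        return Err(VkeyError::UnsupportedVersion(vk.version));
    }
    if toy_digest(vk) != expected_digest {
        return Err(VkeyError::DigestMismatch);
    }
    Ok(())
}
```

With a check like this, a vkey for program A is rejected when the caller has pinned the digest for program B, before any proof verification runs.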

Comment thread prover/src/lib.rs
    .copied()
    .unwrap_or_else(|| {
        page::precomputed_commitment_cached(config, proof_options)
    });

Medium – Silent fallback on pages length mismatch

When vkey.pages is shorter than page_configs (e.g. serialized against an older ELF with fewer pages), the loop silently falls back to recomputing the missing slots instead of surfacing an error. This is benign today because Fiat-Shamir would catch a genuine mismatch, but it can mask a caller bug (wrong vkey supplied) that you'd want to detect early.

Consider returning an error, or at minimum logging a warning, when i >= vk.pages.len() for a non-private-input page:

vkey.and_then(|vk| {
    if i >= vk.pages.len() {
        log::warn!("vkey.pages shorter than page_configs (index {i}); recomputing");
    }
    vk.pages.get(i)
})

Or tighten the contract in verify_with_options_with_vkey to validate vkey.pages.len() == page_configs.len() before entering VmAirs::new_with_vkey.
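The stricter contract could look roughly like the following. This is a sketch under assumptions: `VkeyError`, `check_page_count`, and the `expected_pages` parameter (standing in for `page_configs.len()`) are hypothetical names, not part of the PR.

```rust
type Commitment = [u8; 32];

#[derive(Debug, PartialEq)]
enum VkeyError {
    PageCountMismatch { vkey_pages: usize, expected: usize },
}

/// Fails fast when the vkey does not cover exactly the pages the ELF needs.
/// `expected_pages` stands in for `page_configs.len()`.
fn check_page_count(
    vkey_pages: &[Commitment],
    expected_pages: usize,
) -> Result<(), VkeyError> {
    if vkey_pages.len() != expected_pages {
        return Err(VkeyError::PageCountMismatch {
            vkey_pages: vkey_pages.len(),
            expected: expected_pages,
        });
    }
    Ok(())
}
```

Returning an error instead of logging keeps the failure visible in environments where log output is dropped.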

Collaborator Author

Fixed in 5d8f91a

Comment thread prover/src/vkey.rs
/// Keccak256 fingerprint of the postcard-serialized vkey. Stable as long
/// as the field layout (and [`VKEY_VERSION`]) does not change.
pub fn compute_digest(&self) -> [u8; 32] {
    let bytes = postcard::to_allocvec(self)

Low – compute_digest panics on serialization failure

postcard::to_allocvec returns a Result. Using .expect here panics on any serialization error (e.g. if the struct ever gains a field type that postcard can't serialize). Since compute_digest is meant to be called in security-sensitive contexts (binding to a VmProof), consider returning Result<[u8; 32], postcard::Error> so callers can handle failures gracefully:

Suggested change
-     let bytes = postcard::to_allocvec(self)
+ pub fn compute_digest(&self) -> Result<[u8; 32], postcard::Error> {
+     let bytes = postcard::to_allocvec(self)?;
+     let mut hasher = Keccak256::new();
+     hasher.update(&bytes);
+     Ok(hasher.finalize().into())
+ }

Collaborator Author

Not addressed. Postcard over [u8; 32] and Vec<[u8; 32]> has no failure mode in practice.

Comment thread prover/src/vkey.rs
#[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
pub struct VmVerifyingKey {
    /// Layout version. See [`VKEY_VERSION`].
    pub version: u32,

Low – version is pub and mutable with no enforcement today

The field is public and Clone/Deserialize allow callers to produce a VmVerifyingKey where version != VKEY_VERSION with no warning. test_vkey_version_mismatch_today_accepts intentionally pins the "advisory-only" behavior, which is fine as a stopgap.

For the upcoming vk_digest PR: consider making version private and only settable via from_elf_and_options, so the version is structurally correct by construction and any migration has to go through the constructor. That removes the risk of a future compute_digest check silently passing on a vkey with a mutated version.
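A sketch of the "correct by construction" idea, assuming a reduced struct. `from_commitments` is a hypothetical stand-in for `from_elf_and_options`, and the version value is the one mentioned elsewhere in this thread.

```rust
mod vkey {
    pub type Commitment = [u8; 32];

    // Value mentioned in this review thread; treat it as illustrative.
    pub const VKEY_VERSION: u32 = 3;

    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct VmVerifyingKey {
        // Private: code outside this module cannot set or mutate it.
        version: u32,
        pub bitwise: Commitment,
        pub pages: Vec<Commitment>,
    }

    impl VmVerifyingKey {
        /// Hypothetical constructor standing in for `from_elf_and_options`.
        pub fn from_commitments(bitwise: Commitment, pages: Vec<Commitment>) -> Self {
            VmVerifyingKey { version: VKEY_VERSION, bitwise, pages }
        }

        /// Read-only access to the layout version.
        pub fn version(&self) -> u32 {
            self.version
        }
    }
}
```

Note that a derived `serde::Deserialize` would still populate the private field from untrusted bytes, so deserialization would likely need its own version check.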

Collaborator Author

Fixed in 5d8f91a

@claude

claude Bot commented May 21, 2026

Review: Cache preprocessed-table commitments (VmVerifyingKey)

Overview

Adds VmVerifyingKey to cache the five preprocessed-table Merkle commitments (bitwise, decode, register, keccak_rc, per-page), avoiding the FFT + Merkle recomputation on every verifier call. The existing verify_with_options API is preserved as a thin wrapper — clean, non-breaking change. The 609× cycle reduction claim is plausible given it skips the heaviest preprocessing steps.


Security

Medium – Silent page-length mismatch (inline: prover/src/lib.rs:447)
When vkey.pages is shorter than page_configs, the loop silently recomputes the missing slots without any error or warning. This won't produce a wrong result (Fiat-Shamir protects), but it silently masks a caller error (wrong vkey provided). Worth at minimum a log warning, ideally a length check in verify_with_options_with_vkey before entering VmAirs::new_with_vkey.

Low – version unenforced (acknowledged) (inline: prover/src/vkey.rs:62)
The field is public and mutable. This is pinned as intentional in test_vkey_version_mismatch_today_accepts, so it's not a bug — but making version private and constructor-only would eliminate the class of "mutated but structurally valid vkey" before the vk_digest PR lands.

Acknowledged / deferred — vkey substitution without vk_digest
Until compute_digest() is embedded in VmProof, a caller can supply any vkey for a given ELF and the verifier won't fast-reject it; rejection only comes from Fiat-Shamir failing downstream. The PR describes this correctly. The deferred vk_digest PR should be a prerequisite for any context where the vkey arrives over a trust boundary.


Bugs

Low – compute_digest panics (inline: prover/src/vkey.rs:124)
postcard::to_allocvec returns Result; calling .expect panics on any serialization error. Should return Result<[u8; 32], _> since this is intended for security-sensitive use.


Tests

Good coverage: roundtrip, equivalence, and per-table tamper tests. The test_vkey_fields_match_air_commitments test that directly compares vkey commitments against AIR-constructed values is the right approach and will catch any future divergence between from_elf_and_options and VmAirs::new.

One gap: no test for vkey derived from one ProofOptions but verified against a different one (e.g., different blowup). This would fail via Fiat-Shamir, but an explicit test would document the expected behavior and catch any future short-circuit that bypasses the STARK check.


Overall

The implementation is clean, well-documented, and appropriately scoped. The main ask before merge is either a length validation (or at least a log warning) for the vkey.pages length mismatch, and making compute_digest return Result instead of panicking.

@nicole-graus changed the title from "opt(verifier-guest): Cache preprocessed-table commitments" to "perf(verifier-guest): Cache preprocessed-table commitments" on May 21, 2026
@nicole-graus
Collaborator Author

/bench

@github-actions

Benchmark — fib_iterative_8M (median of 3)

Table parallelism: auto (cores / 3)

| Metric | main | PR | Δ |
|---|---|---|---|
| Peak heap | 52516 MB | 52111 MB | -405 MB (-0.8%) ⚪ |
| Prove time | 25.536s | 25.616s | +0.080s (+0.3%) ⚪ |

✅ No significant change.

✅ Low variance (time: 0.7%, heap: 0.1%)

Commit: 26603cd · Baseline: cached · Runner: self-hosted bench

@nicole-graus marked this pull request as ready for review on May 21, 2026, 20:19
@github-actions

Codex Code Review

Findings

  • High - verify_with_options_with_vkey can verify a proof against the wrong program. When vkey is Some, the verifier takes decode, register, and page commitments from the supplied vkey instead of deriving them from elf_bytes, and the new docs explicitly state that a proof for program A can return Ok(true) when called with elf_bytes for program B. That breaks the existing verifier contract that elf_bytes authenticates the program being proven. This should not ship as a public verification API until the vkey digest is bound into VmProof and checked, or the API is made explicitly “verify against this trusted vkey” without accepting misleading elf_bytes. See prover/src/lib.rs and the vkey override path at prover/src/lib.rs.

No other concrete issues found in the PR diff.

Comment thread prover/src/lib.rs
/// same `elf_bytes` they intend to verify against. A future PR will bind
/// `vkey.compute_digest()` into [`VmProof`] so this trust assumption is
/// enforced cryptographically.
pub fn verify_with_options_with_vkey(

[High] elf_bytes no longer guards program identity when vkey is provided

The docstring acknowledges this, but it's worth spelling out the exploit path clearly: a caller who obtains a valid VmProof for program A and presents it with vkey_A + elf_bytes_B will receive Ok(true) even though the proof has nothing to do with program B. The elf_bytes parameter implies it is the source of truth for what program is being verified, but when a vkey is supplied it is effectively a no-op for the security-critical commitments (decode, register, pages).

Two suggestions to reduce the chance of misuse before the vk_digest PR lands:

  1. Rename the parameter to something like elf_bytes_for_table_counts or add a #[doc(alias)] / inline comment at the call site that makes clear the elf is only used for page_configs reconstruction, not for authenticating the proof.
  2. In debug builds, assert that vkey.decode matches decode::commitment_from_elf(parsed_elf, proof_options) as an early-exit sanity check. This costs nothing in release and would have caught the class of mistake described above during development.
#[cfg(debug_assertions)]
if let Some(vk) = vkey {
    let expected = decode::commitment_from_elf(&program, proof_options)
        .expect("decode commitment");
    assert_eq!(vk.decode, expected,
        "vkey was derived from a different ELF than elf_bytes");
}

Collaborator Author

Fixed in 5d8f91a

Comment thread prover/src/lib.rs
if let Some(vk) = vkey
    && vk.pages.len() != page_configs.len()
{
    log::warn!(

[Medium] Silent fallback on page count mismatch masks caller bugs

A log::warn is easy to miss in production (especially inside the recursion guest where logging may be suppressed). The comment even says "likely a caller bug", yet execution continues — using the vkey for some page slots and recomputing others. This creates a hybrid state that is correct but invisible to the caller.

If a length mismatch genuinely indicates a programming error, it should surface as one:

if let Some(vk) = vkey
    && vk.pages.len() != page_configs.len()
{
    panic!(
        "vkey.pages length ({}) does not match page_configs length ({})",
        vk.pages.len(),
        page_configs.len()
    );
}

Or, if you need this to stay infallible, at minimum document which slots get which treatment so callers can reason about it. The current fallback is safe (Fiat-Shamir catches any mismatch), but silently mixing sources makes it hard to trust the vkey path is actually being exercised.

Collaborator Author

Fixed in 5d8f91a

Comment thread prover/src/vkey.rs
/// (e.g. by hashing only non-private-input slots, or by asserting these
/// slots equal `[0u8; 32]` before hashing) so a malicious supplier cannot
/// produce two functionally-equivalent vkeys with different digests.
const PRIVATE_INPUT_PAGE_PLACEHOLDER: Commitment = [0u8; 32];

[Medium] Zero placeholder in pages makes compute_digest() unstable across page configurations

Two vkeys for the same program can produce different digests if they were built with different numbers of private-input pages — the all-zero placeholder slots are included in the postcard bytes that get hashed. So vkey_A.compute_digest() != vkey_B.compute_digest() even when both vkeys would produce identical verification outcomes.

The NOTE acknowledges this will need fixing before vk_digest is bound into VmProof. Worth tracking as a hard requirement: if compute_digest() is used for caching, equality checks, or any binding before that fix lands, it will silently produce wrong results.

One concrete option is to exclude private-input slots from the digest entirely:

pub fn compute_digest(&self) -> [u8; 32] {
    // Hash only the fields that are semantically meaningful:
    // private-input page slots are always zero placeholders and
    // must not affect the digest.
    // TODO: replace with a proper canonical form once vk_digest
    // is bound into VmProof.
    let mut hasher = Keccak256::new();
    hasher.update(self.version.to_le_bytes());
    hasher.update(self.bitwise);
    hasher.update(self.decode);
    hasher.update(self.register);
    hasher.update(self.keccak_rc);
    for page in &self.pages {
        if page != &PRIVATE_INPUT_PAGE_PLACEHOLDER {
            hasher.update(page);
        }
    }
    hasher.finalize().into()
}

(This still has the ordering issue if a non-private page happens to be all zeros, but that's a Merkle root — astronomically unlikely in practice.)
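If the ordering concern matters, one possible variant hashes each non-placeholder page together with its index. This is only a sketch: `pages_digest` is a hypothetical helper and `DefaultHasher` is a non-cryptographic stand-in for Keccak256.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Commitment = [u8; 32];

const PRIVATE_INPUT_PAGE_PLACEHOLDER: Commitment = [0u8; 32];

/// Toy digest over page commitments. DefaultHasher is a non-cryptographic
/// stand-in for Keccak256. Each non-placeholder page is hashed together with
/// its index, so skipping placeholders cannot shift a page into another slot.
fn pages_digest(pages: &[Commitment]) -> u64 {
    let mut h = DefaultHasher::new();
    for (index, page) in pages.iter().enumerate() {
        if *page != PRIVATE_INPUT_PAGE_PLACEHOLDER {
            h.write_u64(index as u64);
            h.write(page);
        }
    }
    h.finish()
}
```

Under this scheme trailing placeholder slots do not affect the digest, while moving a real commitment to a different page index does.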

Collaborator Author

Addressed in 5d8f91a

Comment thread prover/src/vkey.rs
/// Keccak256 fingerprint of the postcard-serialized vkey. Stable as long
/// as the field layout (and [`VKEY_VERSION`]) does not change.
pub fn compute_digest(&self) -> [u8; 32] {
    let bytes = postcard::to_allocvec(self)

[Low] compute_digest panics on serialization failure

postcard::to_allocvec returns a Result and the expect will panic if it ever fails. For a struct containing only u32, [u8; 32], and Vec<[u8; 32]> postcard serialization should not fail in practice, but propagating the error is cleaner and avoids any surprise in guest environments where the allocator may behave differently:

pub fn compute_digest(&self) -> Result<[u8; 32], postcard::Error> {
    let bytes = postcard::to_allocvec(self)?;
    let mut hasher = Keccak256::new();
    hasher.update(&bytes);
    Ok(hasher.finalize().into())
}

Collaborator Author

Not addressed. postcard::to_allocvec over [u8; 32] and Vec<[u8; 32]> has no failure mode in practice.

@claude

claude Bot commented May 21, 2026

Review: Cache preprocessed-table commitments

Good concept and well-documented PR. The 609× cycle reduction is compelling and the Fiat-Shamir argument for why a tampered vkey gets caught is correct. The mismatch tests are a good addition.

Four issues to address before merging:

| # | Severity | Location | Summary |
|---|---|---|---|
| 1 | High | lib.rs:784 | elf_bytes no longer guards program identity when vkey is supplied; exploit path exists before vk_digest PR lands |
| 2 | Medium | lib.rs:429 | Silent fallback on page count mismatch should be a hard error, not a warn |
| 3 | Medium | vkey.rs:58 | compute_digest() includes zero placeholders for private-input pages, making it unstable across page configurations |
| 4 | Low | vkey.rs:127 | compute_digest panics on postcard failure; should return Result |

Issue #1 is the most consequential. The verify_with_options_with_vkey API accepts elf_bytes which implies it is the anchor for what program is being verified, but when a vkey is provided the decode/register/page commitments come from the vkey — not the elf. A caller passing (proof_A, elf_B, vkey_A) gets Ok(true). Until vk_digest is bound into VmProof, a debug-build assertion comparing vkey.decode against decode::commitment_from_elf would catch this class of mistake during development at zero release cost.

Issue #2: the comment at line 431 says "likely a caller bug" — if that's true, a panic (or at minimum an Err) is the right response, not a warn. The current fallback silently mixes vkey-sourced and recomputed commitments, which makes it hard to verify the vkey path is actually being exercised.

Issue #3 affects the future vk_digest binding: two functionally identical vkeys built with different numbers of private-input pages produce different digests today. See inline comment for a straightforward fix.

Collaborator

@diegokingston left a comment

Review — caching preprocessed-table commitments

Clean, thoroughly tested, honestly documented PR with a large, real win (609× in-VM cycles for the recursion guest). CI is green.

Security model

The key shift: with a vkey, the verifier no longer recomputes the preprocessed-table commitments from the ELF — it reads them from a caller-supplied input, so for those tables elf_bytes stops binding the proof to the program. The PR is admirably upfront about this (the IMPORTANT doc block, the deferred vk_digest plan, and the tests that pin the current advisory behavior).

Per the intended architecture this is sound: the vkey is consumed inside the recursion guest as a public input, and the outer / "real" verifier authenticates the vkey after recursion — so it is a checked value, not a blindly-trusted one. Two things worth making explicit so that intent is enforced, not just documented:

  1. vk_digest is the binding and isn't in this PR. Until vkey.compute_digest() is bound into VmProof (the deferred follow-up), nothing in-band ties a vkey to a proof. verify_with_options_with_vkey(.., Some(vkey)) is pub; called outside the recursion-guest flow with a valid vkey for the wrong program it returns Ok(true) — exactly the case Fiat-Shamir does not catch (it only catches tampering, which the six test_vkey_*_mismatch_rejects tests cover well). Recommend vk_digest land as the immediate next PR, and the Some(vkey) path stay confined to the recursion guest until then.
  2. verify_with_options is unchanged (None → recompute everything), so existing callers keep the full security model. Good.
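The binding from item 1 might take a shape like the sketch below. Everything here is an assumption: the real `VmProof` has no `vk_digest` field yet, `vkey_matches_proof` is a hypothetical helper, and `DefaultHasher` is a non-cryptographic stand-in for `compute_digest()`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

type Commitment = [u8; 32];

struct VmVerifyingKey {
    decode: Commitment,
}

/// Hypothetical proof shape: the real `VmProof` has no `vk_digest` field yet.
struct VmProof {
    vk_digest: u64,
}

/// Toy stand-in for `compute_digest()`; DefaultHasher is not cryptographic.
fn toy_digest(vk: &VmVerifyingKey) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(&vk.decode);
    h.finish()
}

/// Fast-reject a vkey that the proof was not produced against.
fn vkey_matches_proof(proof: &VmProof, vk: &VmVerifyingKey) -> bool {
    proof.vk_digest == toy_digest(vk)
}
```

This only ties a vkey to a proof. The digest itself would still have to be compared against a value the outer verifier trusts for the intended program.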

Code quality

  • The VmAirs::new → new_with_vkey split (old new = thin None wrapper) preserves the API cleanly; the vkey.map(..).unwrap_or_else(recompute) pattern is consistent across all five tables.
  • Test coverage is excellent — equivalence, per-table tamper rejection ×5, page mismatch, version-advisory, empty-pages fallback, and test_vkey_fields_match_air_commitments directly asserting each cached commitment equals what VmAirs::new builds.
  • PRIVATE_INPUT_PAGE_PLACEHOLDER already documents the future digest-malleability concern — good forward awareness.

Minor

  • New dep postcard pulls a transitive tree (heapless, embedded-io×2, cobs, atomic-polyfill, critical-section, hash32, spin). Justified — the no_std recursion guest needs to deserialize the vkey in-VM, which bincode (already a workspace dep) can't do as cleanly. Worth one line in the PR description stating that rationale.
  • VKEY_VERSION = 3 for a brand-new struct is slightly odd (bumped during development).
  • Bot comments: page-length fallback → addressed via the added log::warn!; compute_digest .expect → fine to leave (postcard over [u8;32]/Vec<[u8;32]> has no practical failure mode); version private → correctly deferred to the vk_digest PR.

Verdict

Mergeable. The only non-code item is staging: commit to vk_digest next, and keep the Some(vkey) path inside the outer-authenticated recursion flow until that binding exists.
