Add preprocessed columns for PAGE/REGISTER tables and introduce VmProof API #321

Merged
MauroToscano merged 14 commits into main from precompute_init_memory
Feb 12, 2026
@ColoCarletti
Collaborator

  • Preprocessed OFFSET+INIT columns: PAGE and REGISTER tables now commit to their OFFSET and INIT columns
    via Merkle roots, allowing the verifier to independently recompute and verify initial memory/register
    state from the ELF. Follows the existing DECODE preprocessed pattern (FFT → LDE → Merkle tree).
    Zero-init pages share a cached commitment via OnceLock.
  • Fixed page_configs_from_elf: Was creating zero_init for ALL pages including ELF data pages, so the
    verifier couldn't reconstruct correct init data. Now extracts actual ELF byte data per page.
  • VmProof API: Replaces raw MultiProof return type with a VmProof bundle containing proof, stack_size,
    and proof_options. The verifier extracts these from the proof instead of relying on hardcoded constants.
    stack_size is a hint (preprocessed commitments enforce correctness).
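
As an illustration of the second bullet, here is a minimal sketch of splitting one ELF segment into per-page init data. It is not the PR's `page_configs_from_elf`: the function name `pages_from_segment`, the `BTreeMap` return type, and the 4096-byte `PAGE_SIZE` are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Assumed page size for this sketch (4 KiB).
pub const PAGE_SIZE: usize = 4096;

/// Hypothetical helper: copies a segment's bytes into zero-filled pages,
/// keyed by page base address. Bytes not covered by the segment stay 0.
/// Assumes `vaddr + data.len()` does not overflow u64.
pub fn pages_from_segment(vaddr: u64, data: &[u8]) -> BTreeMap<u64, Vec<u8>> {
    let mut pages = BTreeMap::new();
    for (i, byte) in data.iter().enumerate() {
        let addr = vaddr + i as u64;
        // Round the address down to the start of its page.
        let base = addr - addr % PAGE_SIZE as u64;
        let offset = (addr - base) as usize;
        let page = pages
            .entry(base)
            .or_insert_with(|| vec![0u8; PAGE_SIZE]);
        page[offset] = *byte;
    }
    pages
}
```

A segment that straddles a page boundary produces two pages, each padded with zeros to the full page size, so a verifier running the same routine on the same ELF would rebuild identical init data.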

@github-actions

Kimi Code Review

Security Vulnerabilities

  1. High: The use of OnceLock for caching commitments in page.rs and register.rs can lead to a potential race condition if multiple threads are accessing these locks simultaneously. This is because OnceLock is not thread-safe and can cause data races when used in a multi-threaded context. To mitigate this, consider using thread-safe alternatives like std::sync::Mutex or std::sync::RwLock for caching commitments.
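
For reference, the Rust standard library documents `std::sync::OnceLock` as a thread-safe cell whose initializer runs at most once, even under concurrent `get_or_init` calls. The sketch below checks that behaviour with a placeholder commitment; `compute_zero_init_commitment` and the 32-byte array are stand-ins, not the code in page.rs or register.rs.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Counts how many times the initializer actually runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Process-wide cache for a stand-in "zero-init commitment".
static ZERO_INIT_COMMITMENT: OnceLock<[u8; 32]> = OnceLock::new();

/// Hypothetical placeholder for the real FFT -> LDE -> Merkle pipeline.
fn compute_zero_init_commitment() -> [u8; 32] {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    [0u8; 32]
}

/// Returns the cached value, computing it on first use.
pub fn zero_init_commitment() -> &'static [u8; 32] {
    ZERO_INIT_COMMITMENT.get_or_init(compute_zero_init_commitment)
}

/// Spawns `n` threads that all request the commitment, then reports
/// how many times the initializer ran in total.
pub fn race_threads(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| *zero_init_commitment()))
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    INIT_CALLS.load(Ordering::SeqCst)
}
```

Under this sketch the hazard with `OnceLock` is not a data race but a stale value: whatever the first caller computed is what every later caller receives.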

Potential Bugs

  1. Medium: In lib.rs, the changes to create_register_air and create_page_air involve calling .with_preprocessed with a commitment. However, there is no check to ensure that the commitment is valid or non-zero before using it. This could lead to incorrect behavior if the commitment is invalid or zero. It is recommended to add checks to ensure the validity of the commitment before using it.

Performance Issues

  1. Low: The introduction of precomputed_commitment_cached and compute_precomputed_commitment in page.rs and register.rs involves additional computations for polynomial evaluation and Merkle tree building. While these are necessary for correctness, they could introduce performance overhead. It is important to benchmark and profile these changes to ensure they do not significantly impact the overall performance.

Simplicity

  1. Medium: The changes introduce additional complexity with the use of OnceLock for caching and the computation of precomputed commitments. While these are necessary for correctness, they add complexity to the code. It is important to balance correctness with simplicity and readability. Consider documenting the rationale behind these changes to help maintainers understand the trade-offs.

Overall, the changes introduce necessary functionality for precomputed commitments and caching, but they also bring potential security vulnerabilities and performance concerns. It is important to address these issues and ensure the changes are well-documented and tested.


Automated review by Kimi (Moonshot AI)

@github-actions

Codex Code Review

Findings (ordered by severity)

  • High (security/correctness): Commitments are cached without including ProofOptions. register::preprocessed_commitment and page::precomputed_commitment_cached reuse a single Commitment even if blowup_factor or coset_offset change, which will produce incorrect commitments for subsequent proofs and can cause verification failure or mismatched public data. Fix by keying the cache by the options (or remove caching). Files: prover/src/tables/register.rs, prover/src/tables/page.rs.
  • Low (stability/DoS): compute_precomputed_commitment uses assert!/expect on input-dependent data (page size, FFT/LDE, Merkle build). If any of these are user-controlled or can be corrupted, this becomes a hard panic. Consider returning a Result and handling errors in the caller. Files: prover/src/tables/page.rs, prover/src/tables/register.rs.
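
A minimal sketch of the `Result`-returning shape suggested in the second finding, limited to input validation. The error enum, the function name `validate_page_inputs`, and the power-of-two requirement on the page size are assumptions for illustration; the real `compute_precomputed_commitment` may check different things.

```rust
use std::fmt;

/// Hypothetical error type for commitment preprocessing.
#[derive(Debug, PartialEq, Eq)]
pub enum CommitmentError {
    PageSizeNotPowerOfTwo(usize),
    InitLengthMismatch { expected: usize, actual: usize },
}

impl fmt::Display for CommitmentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::PageSizeNotPowerOfTwo(size) => {
                write!(f, "page size {size} is not a power of two")
            }
            Self::InitLengthMismatch { expected, actual } => {
                write!(f, "expected {expected} init bytes, got {actual}")
            }
        }
    }
}

impl std::error::Error for CommitmentError {}

/// Checks inputs up front and reports problems as errors instead of panicking.
pub fn validate_page_inputs(
    page_size: usize,
    init_values: &[u8],
) -> Result<(), CommitmentError> {
    // Assumption: the FFT step needs a power-of-two domain.
    if !page_size.is_power_of_two() {
        return Err(CommitmentError::PageSizeNotPowerOfTwo(page_size));
    }
    if init_values.len() != page_size {
        return Err(CommitmentError::InitLengthMismatch {
            expected: page_size,
            actual: init_values.len(),
        });
    }
    Ok(())
}
```

A caller can then propagate the error with `?` and decide whether to reject the proof or abort.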

No other security, bug, or significant performance issues found in this diff.

@claude

claude Bot commented Feb 11, 2026

Security & Code Review Findings

I've reviewed PR #321. Here are the issues found, categorized by severity:


🔴 [Medium] Security: Proof Options Trust Issue

Location: prover/src/lib.rs - prove_with_stack() and verify() functions

Issue: The verifier extracts proof_options from the VmProof bundle and uses them directly without validation. A malicious prover could:

  1. Use weak proof options (e.g., low blowup factor, compromised security parameters)
  2. Package these weak options in the proof
  3. The verifier blindly accepts and uses these weak options

Impact: Proofs could be generated with intentionally weakened security parameters.

Recommendation:

  • Enforce minimum security requirements on proof_options in the verifier
  • OR take proof_options as a verifier parameter and validate they match
  • OR clearly document that the verifier trusts proof_options in the bundle
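
The first recommendation could look roughly like the sketch below: a verifier-side policy check that runs before the bundled options are used. `SecurityPolicy`, `OptionsError`, and the field `fri_number_of_queries` are hypothetical; only `blowup_factor` is named in this review, and the real `ProofOptions` layout is not shown in the PR thread.

```rust
/// Hypothetical stand-in for the prover's options type.
#[derive(Debug, Clone, PartialEq)]
pub struct ProofOptions {
    pub blowup_factor: u8,
    pub fri_number_of_queries: usize,
}

/// Minimum parameters the verifier is willing to accept (assumed policy).
pub struct SecurityPolicy {
    pub min_blowup_factor: u8,
    pub min_fri_queries: usize,
}

#[derive(Debug, PartialEq)]
pub enum OptionsError {
    BlowupTooLow { got: u8, min: u8 },
    TooFewQueries { got: usize, min: usize },
}

/// Rejects options weaker than the verifier's own policy.
pub fn check_proof_options(
    opts: &ProofOptions,
    policy: &SecurityPolicy,
) -> Result<(), OptionsError> {
    if opts.blowup_factor < policy.min_blowup_factor {
        return Err(OptionsError::BlowupTooLow {
            got: opts.blowup_factor,
            min: policy.min_blowup_factor,
        });
    }
    if opts.fri_number_of_queries < policy.min_fri_queries {
        return Err(OptionsError::TooFewQueries {
            got: opts.fri_number_of_queries,
            min: policy.min_fri_queries,
        });
    }
    Ok(())
}
```

The policy lives on the verifier side, so a prover cannot lower it by editing the bundle.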

🟡 [Medium] Security: ELF Data Extraction Change

Location: prover/src/tables/trace_builder.rs:1176-1179 - page_configs_from_elf()

Issue: This PR changes how ELF page data is extracted:

  • Old: Created zero_init for ALL pages
  • New: Extracts actual byte values from ELF segments

The docstring (lines 1155-1157) says "page_configs should be embedded in proof public inputs" but they're NOT. The verifier reconstructs them from the ELF alone. The stack_size is documented as a "hint" (VmProof docstring) but is trusted.

Potential Impact: If a malicious prover uses different initial page values during proving, the verifier might reconstruct different configs.

Status: PR description acknowledges this: "stack_size is a hint (preprocessed commitments enforce correctness)". This relies on the AIR constraints properly binding to initial data through preprocessed commitments. This should be verified by reviewing the constraint system.


🟠 [Low] Bug Risk: Cache Invalidation

Locations:

  • prover/src/tables/page.rs:271-277 - precomputed_commitment_cached()
  • prover/src/tables/register.rs:215 - preprocessed_commitment()

Issue: Both use OnceLock caching but don't key by ProofOptions. If called with different options (different blowup factor, coset offset):

  • First call with options A sets the cache
  • Second call with options B reuses cached value from A → wrong commitment

Current Risk: Low in practice if the same ProofOptions instance is used throughout, but fragile.

Recommendation: Either:

  1. Use HashMap<OptionsHash, Commitment> to key by options
  2. Document that all calls must use identical ProofOptions
  3. Consider removing caching if it's not a performance bottleneck
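
Option 1 might be sketched as follows, assuming the relevant options reduce to a small hashable key. `OptionsKey`, `commitment_for`, and the placeholder `compute_commitment` are illustrative names, and the 32-byte array stands in for the real `Commitment` type.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Hypothetical key: the option fields that influence the commitment.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct OptionsKey {
    pub blowup_factor: u8,
    pub coset_offset: u64,
}

/// Stand-in for the real Merkle-root type.
pub type Commitment = [u8; 32];

static CACHE: OnceLock<Mutex<HashMap<OptionsKey, Commitment>>> = OnceLock::new();

/// Placeholder for the FFT -> LDE -> Merkle computation; it only mixes
/// the key into the output so different options give different results.
fn compute_commitment(key: OptionsKey) -> Commitment {
    let mut out = [0u8; 32];
    out[0] = key.blowup_factor;
    out[1..9].copy_from_slice(&key.coset_offset.to_le_bytes());
    out
}

/// Returns the commitment for `key`, computing it once per distinct key.
pub fn commitment_for(key: OptionsKey) -> Commitment {
    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut map = cache.lock().expect("commitment cache poisoned");
    let commitment = *map.entry(key).or_insert_with(|| compute_commitment(key));
    commitment
}
```

The lock is held while the commitment is computed, which keeps the sketch simple but serializes concurrent first-time callers.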

🔵 [Low] Potential Bounds Check

Location: prover/src/tables/page.rs:228-234 - compute_precomputed_commitment()

Issue: If config.init_values exists but has length < page_size, accessing init_vals[i] will panic.

Recommendation: Add bounds check or validate init_values.len() == page_size when constructing PageConfig.


🔵 [Low] Documentation Mismatch

Location: prover/src/lib.rs:6-11 - module docstring example

Issue: Example code hasn't been updated to reflect the API change:

let proof = lambda_vm_prover::prove(&elf_bytes).unwrap();

Should be:

let vm_proof = lambda_vm_prover::prove(&elf_bytes).unwrap();

The variable name proof is misleading since it's now a VmProof bundle, not a raw MultiProof.


Summary

Critical issues: None
High priority: 0
Medium priority: 2 (proof options trust, ELF data extraction trust model)
Low priority: 3 (caching, bounds check, docs)

The main concerns are around the trust model: the verifier trusts both proof_options and stack_size from the proof bundle. This is acceptable IF the preprocessed commitments properly enforce correctness through the constraint system. I recommend verifying this through constraint review or additional testing.

@claude

claude Bot commented Feb 11, 2026

Additional Finding - Bounds Check is Actually Safe

Re: my earlier inline comment about bounds checking at page.rs:231:

After reviewing the code more thoroughly, I found that PageConfig::with_data() at lines 128-136 properly handles this:

pub fn with_data(page_base: u64, page_size: usize, data: Vec<u8>) -> Self {
    assert!(data.len() <= page_size, "Data exceeds page size");
    let mut init_values = data;
    init_values.resize(page_size, 0); // Pad with zeros ✅
    ...
}

The resize(page_size, 0) ensures init_values.len() == page_size, so the access at line 231 in compute_precomputed_commitment() is always safe. The bounds check concern I raised is not an issue. Good defensive programming! 👍

@claude

claude Bot commented Feb 11, 2026

✅ Review Complete

I've completed the security and code review for PR #321.

Summary of Findings

Total issues found: 6

  • 🔴 Medium severity: 2 (security trust model concerns)
  • 🟠 Low severity: 4 (caching, error handling, performance, docs)
  • Critical/High: None found

Key Takeaways

  1. Security Model: The PR introduces a trust model where the verifier uses proof_options and stack_size from the proof bundle. This is acceptable IF the preprocessed commitments properly bind to these parameters through the constraint system. Recommend adding validation tests for this.

  2. Correctness Fix: The ELF data extraction change (line 1179 in trace_builder.rs) is a critical correctness fix - the old code incorrectly created zero_init for ALL pages including ELF data pages.

  3. Code Quality: Overall good defensive programming with proper assertions and validations. The bounds check concern I initially raised turned out to be properly handled by resize() in PageConfig::with_data().

  4. Performance: The code is straightforward with no significant performance issues. The page data cloning is acceptable for typical program sizes.

Recommendations

  1. Add validation tests ensuring a prover cannot create valid proofs with mismatched stack_size or manipulated proof_options
  2. Consider adding minimum security requirements validation in the verifier
  3. Address the caching issue with ProofOptions (either key by options hash or document the requirement)
  4. Update the example code in the module docstring

No Critical Issues Found ✅

This PR is ready to merge from a security/correctness perspective, pending team discussion on the trust model design decisions.

@github-actions

github-actions Bot commented Feb 11, 2026

Benchmark Results for unmodified programs 🚀

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| base binary_search | 52.3 ± 0.8 | 51.4 | 53.3 | 1.00 |
| head binary_search | 58.0 ± 9.3 | 51.6 | 81.3 | 1.11 ± 0.18 |
| base bitwise_ops | 53.9 ± 2.8 | 51.3 | 61.2 | 1.01 ± 0.06 |
| head bitwise_ops | 53.1 ± 1.1 | 51.2 | 53.9 | 1.00 |
| base fibonacci_26 | 57.2 ± 0.8 | 56.0 | 58.1 | 1.00 |
| head fibonacci_26 | 58.9 ± 3.1 | 55.9 | 66.4 | 1.03 ± 0.06 |
| base hashmap | 112.3 ± 1.1 | 111.1 | 114.5 | 1.00 |
| head hashmap | 112.9 ± 1.0 | 111.5 | 114.0 | 1.00 ± 0.01 |
| base keccak | 116.3 ± 2.8 | 113.9 | 121.6 | 1.00 ± 0.03 |
| head keccak | 115.7 ± 2.0 | 112.7 | 120.4 | 1.00 |
| base matrix_multiply | 64.2 ± 0.9 | 63.1 | 66.0 | 1.00 |
| head matrix_multiply | 64.8 ± 2.0 | 62.6 | 69.2 | 1.01 ± 0.03 |
| base modular_exp | 53.8 ± 1.1 | 52.0 | 56.3 | 1.01 ± 0.02 |
| head modular_exp | 53.5 ± 0.6 | 52.4 | 54.1 | 1.00 |
| base quicksort | 56.7 ± 1.5 | 55.2 | 59.1 | 1.00 |
| head quicksort | 56.9 ± 0.7 | 55.5 | 57.9 | 1.00 ± 0.03 |
| base sieve | 60.7 ± 0.8 | 59.4 | 62.0 | 1.00 |
| head sieve | 60.9 ± 0.7 | 59.3 | 61.7 | 1.00 ± 0.02 |
| base sum_array | 70.8 ± 1.1 | 68.7 | 72.3 | 1.02 ± 0.02 |
| head sum_array | 69.7 ± 1.1 | 68.1 | 71.4 | 1.00 |
| base syscall_commit | 87.8 ± 4.8 | 84.3 | 96.7 | 1.02 ± 0.06 |
| head syscall_commit | 85.9 ± 1.4 | 82.9 | 88.0 | 1.00 |

@ColoCarletti
Collaborator Author

/claude

@claude

claude Bot commented Feb 11, 2026

Review Summary

I've reviewed this PR focusing on security, bugs, performance, and code simplicity. Overall, the implementation is solid with good attention to cryptographic correctness. Below are the findings:

Critical Issues

1. [CRITICAL] OnceLock Cache Invalidation Risk (prover/src/tables/page.rs:213, prover/src/tables/register.rs:160)

The static caches use OnceLock but have a critical invariant documented in comments: All callers within a process must use identical ProofOptions. This is not enforced and could lead to soundness bugs.

Problem: If different ProofOptions (e.g., different blowup factors or coset offsets) are used across multiple prove/verify calls in the same process, the cache will return commitments computed with the first options, not the current ones. This breaks the cryptographic binding.

Attack scenario:

  • Prover uses weak options (low blowup factor) → cache stores weak commitment
  • Verifier uses strong options → cache returns weak commitment → verifier accepts weak proof

Recommended fix: Either:

  1. Make the cache key include a hash of ProofOptions
  2. Remove caching entirely (simplicity over micro-optimization)
  3. Add runtime validation that compares current options with cached options and panics/errors on mismatch
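
Fix 3 could be sketched by caching the options alongside the commitment and rejecting any later call whose options differ. The names `OptionsKey`, `OptionsMismatch`, and `checked_commitment` are hypothetical, and the commitment computation is a placeholder.

```rust
use std::sync::OnceLock;

/// Hypothetical subset of the options that affect the commitment.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OptionsKey {
    pub blowup_factor: u8,
    pub coset_offset: u64,
}

/// Returned when a caller's options differ from the cached ones.
#[derive(Debug, PartialEq, Eq)]
pub struct OptionsMismatch {
    pub cached: OptionsKey,
    pub requested: OptionsKey,
}

/// The cache remembers which options produced the stored commitment.
static CACHED: OnceLock<(OptionsKey, [u8; 32])> = OnceLock::new();

/// Placeholder for the real commitment computation.
fn placeholder_commitment(opts: OptionsKey) -> [u8; 32] {
    let mut out = [0u8; 32];
    out[0] = opts.blowup_factor;
    out
}

/// Returns the cached commitment only if it was built with the same options.
pub fn checked_commitment(opts: OptionsKey) -> Result<[u8; 32], OptionsMismatch> {
    let (cached_opts, commitment) =
        CACHED.get_or_init(|| (opts, placeholder_commitment(opts)));
    if *cached_opts != opts {
        return Err(OptionsMismatch {
            cached: *cached_opts,
            requested: opts,
        });
    }
    Ok(*commitment)
}
```

This keeps the single-slot cache but turns a silent wrong commitment into an explicit error the caller has to handle.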

This is marked CRITICAL because it directly affects soundness of the zero-knowledge proof system.


High Issues

2. [HIGH] Stack Size Not Validated in Verifier (prover/src/lib.rs:276)

The verifier accepts vm_proof.stack_size as a hint from the prover without bounds checking. While the PR description claims preprocessed commitments enforce correctness, there's a window for issues:

Problem: A malicious prover could provide an arbitrarily large stack_size (e.g., u64::MAX), causing the verifier to:

  • Allocate excessive memory in page_configs_from_elf
  • Potentially trigger integer overflows in page calculations
  • DoS the verifier

Impact: The verifier could crash or hang before getting to validate the preprocessed commitments.

Recommended fix: Add validation for reasonable stack size bounds
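
A minimal sketch of such a bound, assuming `stack_size` is a `u64` byte count and pages are 4 KiB. The 64 MiB cap in `MAX_STACK_SIZE` and the names `StackSizeError` and `stack_page_count` are invented for the example; the project would pick its own limit.

```rust
/// Assumed page size in bytes.
pub const PAGE_SIZE: u64 = 4096;

/// Hypothetical upper bound on the stack size the verifier accepts (64 MiB).
pub const MAX_STACK_SIZE: u64 = 64 * 1024 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum StackSizeError {
    TooLarge { got: u64, max: u64 },
}

/// Validates the prover-supplied stack size before any allocation and
/// returns how many pages it spans (rounded up).
pub fn stack_page_count(stack_size: u64) -> Result<u64, StackSizeError> {
    if stack_size > MAX_STACK_SIZE {
        return Err(StackSizeError::TooLarge {
            got: stack_size,
            max: MAX_STACK_SIZE,
        });
    }
    // Safe from overflow: stack_size is at most MAX_STACK_SIZE here.
    Ok((stack_size + PAGE_SIZE - 1) / PAGE_SIZE)
}
```

Running this check before `page_configs_from_elf` would reject a value such as `u64::MAX` without allocating anything.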


Medium Issues

3. [MEDIUM] Byte Extraction Endianness Assumption (prover/src/tables/trace_builder.rs:1171)

The byte extraction code assumes little-endian byte ordering. RISC-V is little-endian, so this is likely correct, but it's not explicitly documented.

Recommended fix: Add a comment clarifying the endianness assumption or use explicit methods like to_le_bytes() for clarity.
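
To illustrate the `to_le_bytes()` suggestion, the sketch below converts 32-bit words to page bytes with the byte order spelled out. Whether the extraction code in trace_builder.rs works on `u32` words is an assumption, and `words_to_le_bytes` is a hypothetical helper name.

```rust
/// Flattens 32-bit words into bytes, least significant byte first,
/// matching RISC-V's little-endian memory layout.
pub fn words_to_le_bytes(words: &[u32]) -> Vec<u8> {
    words.iter().flat_map(|word| word.to_le_bytes()).collect()
}

/// Reads one word back from four little-endian bytes.
pub fn word_from_le_bytes(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}
```

Using the explicit conversion makes the byte order visible at the call site instead of leaving it implied by shifts and masks.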


4. [MEDIUM] Unused proof_options in VmProof (prover/src/lib.rs:48-60)

The VmProof struct stores proof_options but the comment says they are metadata only and NOT trusted by the verifier. The verifier now correctly ignores them (after the most recent changes).

Issue: Storing untrusted data in the proof increases serialization size and could confuse API users about what's actually validated.

Recommended fix: Either remove proof_options from VmProof entirely, or clearly document why they're included (e.g., for proof reproducibility / debugging).


Low Issues

5. [LOW] Clone of PageConfig init_data (prover/src/tables/trace_builder.rs:1198)

For 4KB pages, this clones 4KB vectors. Not a major issue since this runs once per page, but could be optimized.


Positive Observations

  • Good separation of proving/verification concerns with the new VmProof API
  • Preprocessed columns implementation follows existing DECODE pattern correctly
  • Proper use of expect with descriptive messages for crypto operations that shouldn't fail
  • Test coverage updated comprehensively
  • Comments explain the security model (even if implementation has gaps)

Summary

Must fix before merge:

Should fix:

Optional:

Contributor

@MauroToscano MauroToscano left a comment

API for prove and verify needs to be simplified, a comment on what part of the table is precomputed is needed, and stack_top should be taken from the VM

@MauroToscano MauroToscano added this pull request to the merge queue Feb 12, 2026
Merged via the queue into main with commit ce9f998 Feb 12, 2026
9 checks passed
@MauroToscano MauroToscano deleted the precompute_init_memory branch February 12, 2026 14:41