Add preprocessed columns for PAGE/REGISTER tables and introduce VmProof API by ColoCarletti · Pull Request #321 · yetanotherco/lambda_vm

ColoCarletti · 2026-02-11T15:44:14Z

Preprocessed OFFSET+INIT columns: PAGE and REGISTER tables now commit to their OFFSET and INIT columns
via Merkle roots, allowing the verifier to independently recompute and verify initial memory/register
state from the ELF. Follows the existing DECODE preprocessed pattern (FFT → LDE → Merkle tree).
Zero-init pages share a cached commitment via OnceLock.
Fixed page_configs_from_elf: it previously created zero_init for ALL pages, including ELF data pages, so the
verifier couldn't reconstruct the correct init data. It now extracts the actual ELF byte data per page.
VmProof API: Replaces raw MultiProof return type with a VmProof bundle containing proof, stack_size,
and proof_options. The verifier extracts these from the proof instead of relying on hardcoded constants.
stack_size is a hint (preprocessed commitments enforce correctness).

github-actions · 2026-02-11T15:44:37Z

Kimi Code Review

Security Vulnerabilities

High: The use of OnceLock for caching commitments in page.rs and register.rs can lead to a potential race condition if multiple threads are accessing these locks simultaneously. This is because OnceLock is not thread-safe and can cause data races when used in a multi-threaded context. To mitigate this, consider using thread-safe alternatives like std::sync::Mutex or std::sync::RwLock for caching commitments.
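
For context on the claim above: the Rust standard library documents std::sync::OnceLock as a thread-safe cell that can be written to only once, and get_or_init guarantees that only one initializer runs when several threads race. The sketch below illustrates that behavior. The Commitment alias, the static names, and the placeholder value are hypothetical and stand in for the real commitment code, which is not shown in this thread.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

// Hypothetical stand-in for the cached commitment type.
pub type Commitment = [u8; 32];

static ZERO_PAGE_COMMITMENT: OnceLock<Commitment> = OnceLock::new();
// Counts how many times the initializer closure actually runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Returns the cached value, computing it at most once even under contention.
pub fn zero_page_commitment() -> &'static Commitment {
    ZERO_PAGE_COMMITMENT.get_or_init(|| {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        [0u8; 32] // placeholder for the expensive FFT -> LDE -> Merkle computation
    })
}

/// Calls the accessor from several threads and reports how often the
/// initializer ran in total.
pub fn init_calls_after_concurrent_access(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(|| *zero_page_commitment()))
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    INIT_CALLS.load(Ordering::SeqCst)
}
```

Swapping in a Mutex or RwLock would not be needed for thread safety alone; the separate question of whether one cached value is valid for every ProofOptions is a different concern.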

Potential Bugs

Medium: In lib.rs, the changes to create_register_air and create_page_air involve calling .with_preprocessed with a commitment. However, there is no check to ensure that the commitment is valid or non-zero before using it. This could lead to incorrect behavior if the commitment is invalid or zero. It is recommended to add checks to ensure the validity of the commitment before using it.

Performance Issues

Low: The introduction of precomputed_commitment_cached and compute_precomputed_commitment in page.rs and register.rs involves additional computations for polynomial evaluation and Merkle tree building. While these are necessary for correctness, they could introduce performance overhead. It is important to benchmark and profile these changes to ensure they do not significantly impact the overall performance.

Simplicity

Medium: The changes introduce additional complexity with the use of OnceLock for caching and the computation of precomputed commitments. While these are necessary for correctness, they add complexity to the code. It is important to balance correctness with simplicity and readability. Consider documenting the rationale behind these changes to help maintainers understand the trade-offs.

Overall, the changes introduce necessary functionality for precomputed commitments and caching, but they also bring potential security vulnerabilities and performance concerns. It is important to address these issues and ensure the changes are well-documented and tested.

Automated review by Kimi (Moonshot AI)

github-actions · 2026-02-11T15:45:13Z

Codex Code Review

Findings (ordered by severity)

High (security/correctness): Commitments are cached without including ProofOptions. register::preprocessed_commitment and page::precomputed_commitment_cached reuse a single Commitment even if blowup_factor or coset_offset change, which will produce incorrect commitments for subsequent proofs and can cause verification failure or mismatched public data. Fix by keying the cache by the options (or remove caching). Files: prover/src/tables/register.rs, prover/src/tables/page.rs.
Low (stability/DoS): compute_precomputed_commitment uses assert!/expect on input-dependent data (page size, FFT/LDE, Merkle build). If any of these are user-controlled or can be corrupted, this becomes a hard panic. Consider returning a Result and handling errors in the caller. Files: prover/src/tables/page.rs, prover/src/tables/register.rs.

No other security, bug, or significant performance issues found in this diff.
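
One way the Low finding above could be approached is to validate inputs up front and return a Result instead of panicking. This is a minimal sketch only: CommitmentError and validate_page_inputs are hypothetical names, and the power-of-two requirement is an assumption about what the FFT step needs, not something confirmed in this thread.

```rust
/// Hypothetical error type for input validation before committing.
#[derive(Debug, PartialEq, Eq)]
pub enum CommitmentError {
    PageSizeNotPowerOfTwo(usize),
    InitLengthMismatch { expected: usize, got: usize },
}

/// Checks the inputs that would otherwise hit assert!/expect further down.
/// Assumption: the FFT requires a power-of-two page size.
pub fn validate_page_inputs(
    page_size: usize,
    init_values: Option<&[u8]>,
) -> Result<(), CommitmentError> {
    if !page_size.is_power_of_two() {
        return Err(CommitmentError::PageSizeNotPowerOfTwo(page_size));
    }
    if let Some(values) = init_values {
        if values.len() != page_size {
            return Err(CommitmentError::InitLengthMismatch {
                expected: page_size,
                got: values.len(),
            });
        }
    }
    Ok(())
}
```

A caller could then propagate the error with `?` and decide whether a malformed page is a verification failure or a bug.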

claude · 2026-02-11T15:46:52Z

Security & Code Review Findings

I've reviewed PR #321. Here are the issues found, categorized by severity:

🔴 [Medium] Security: Proof Options Trust Issue

Location: prover/src/lib.rs - prove_with_stack() and verify() functions

Issue: The verifier extracts proof_options from the VmProof bundle and uses them directly without validation. A malicious prover could:

Use weak proof options (e.g., low blowup factor, compromised security parameters)
Package these weak options in the proof
The verifier blindly accepts and uses these weak options

Impact: Proofs could be generated with intentionally weakened security parameters.

Recommendation:

Enforce minimum security requirements on proof_options in the verifier
OR take proof_options as a verifier parameter and validate they match
OR clearly document that the verifier trusts proof_options in the bundle
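
The first recommendation could look roughly like the sketch below. The ProofOptions struct here is a hypothetical two-field stand-in (the real definition is not shown in this thread), num_queries is an assumed field name, and the floor values are illustrative rather than a security recommendation.

```rust
/// Hypothetical subset of proof options; field names are assumptions,
/// not the actual ProofOptions definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProofOptions {
    pub blowup_factor: u8,
    pub num_queries: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum OptionsError {
    BlowupTooLow { got: u8, min: u8 },
    TooFewQueries { got: usize, min: usize },
}

/// Policy floor chosen by the verifier, never read from the proof bundle.
/// The numbers are placeholders for illustration.
pub const MIN_OPTIONS: ProofOptions = ProofOptions {
    blowup_factor: 4,
    num_queries: 30,
};

/// Rejects options claimed by the prover that fall below the verifier's floor.
pub fn check_min_security(claimed: &ProofOptions) -> Result<(), OptionsError> {
    if claimed.blowup_factor < MIN_OPTIONS.blowup_factor {
        return Err(OptionsError::BlowupTooLow {
            got: claimed.blowup_factor,
            min: MIN_OPTIONS.blowup_factor,
        });
    }
    if claimed.num_queries < MIN_OPTIONS.num_queries {
        return Err(OptionsError::TooFewQueries {
            got: claimed.num_queries,
            min: MIN_OPTIONS.num_queries,
        });
    }
    Ok(())
}
```

The verifier would run this check before using any value taken from the bundle, so weak parameters are rejected before verification work starts.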

🟡 [Medium] Security: ELF Data Extraction Change

Location: prover/src/tables/trace_builder.rs:1176-1179 - page_configs_from_elf()

Issue: This PR changes how ELF page data is extracted:

Old: Created zero_init for ALL pages
New: Extracts actual byte values from ELF segments

The docstring (lines 1155-1157) says "page_configs should be embedded in proof public inputs" but they're NOT. The verifier reconstructs them from the ELF alone. The stack_size is documented as a "hint" (VmProof docstring) but is trusted.

Potential Impact: If a malicious prover uses different initial page values during proving, the verifier might reconstruct different configs.

Status: PR description acknowledges this: "stack_size is a hint (preprocessed commitments enforce correctness)". This relies on the AIR constraints properly binding to initial data through preprocessed commitments. This should be verified by reviewing the constraint system.

🟠 [Low] Bug Risk: Cache Invalidation

Locations:

prover/src/tables/page.rs:271-277 - precomputed_commitment_cached()
prover/src/tables/register.rs:215 - preprocessed_commitment()

Issue: Both use OnceLock caching but don't key by ProofOptions. If called with different options (different blowup factor, coset offset):

First call with options A sets the cache
Second call with options B reuses cached value from A → wrong commitment

Current Risk: Low in practice if the same ProofOptions instance is used throughout, but fragile.

Recommendation: Either:

Use HashMap<OptionsHash, Commitment> to key by options
Document that all calls must use identical ProofOptions
Consider removing caching if it's not a performance bottleneck
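
The first option could be sketched as follows, assuming the cache key only needs the two parameters named above. OptionsKey, the Commitment alias, and compute_commitment are hypothetical placeholders; the real computation (FFT, LDE, Merkle tree) is not reproduced here.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Hypothetical cache key holding the option fields that affect the commitment.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct OptionsKey {
    pub blowup_factor: u8,
    pub coset_offset: u64,
}

// Hypothetical stand-in for the real commitment type.
pub type Commitment = [u8; 32];

static CACHE: OnceLock<Mutex<HashMap<OptionsKey, Commitment>>> = OnceLock::new();

/// Placeholder for the real computation; it only mixes the key into the
/// output so that different options give different results.
fn compute_commitment(key: OptionsKey) -> Commitment {
    let mut out = [0u8; 32];
    out[0] = key.blowup_factor;
    out[1..9].copy_from_slice(&key.coset_offset.to_le_bytes());
    out
}

/// Returns the commitment for these options, computing it on first use.
pub fn commitment_for(key: OptionsKey) -> Commitment {
    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut guard = cache.lock().expect("commitment cache poisoned");
    *guard.entry(key).or_insert_with(|| compute_commitment(key))
}
```

This version holds the lock while computing, which keeps the code simple at the cost of serializing first-time computations for different options.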

🔵 [Low] Potential Bounds Check

Location: prover/src/tables/page.rs:228-234 - compute_precomputed_commitment()

Issue: If config.init_values exists but has length < page_size, accessing init_vals[i] will panic.

Recommendation: Add bounds check or validate init_values.len() == page_size when constructing PageConfig.

🔵 [Low] Documentation Mismatch

Location: prover/src/lib.rs:6-11 - module docstring example

Issue: Example code hasn't been updated to reflect the API change:

let proof = lambda_vm_prover::prove(&elf_bytes).unwrap();

Should be:

let vm_proof = lambda_vm_prover::prove(&elf_bytes).unwrap();

The variable name proof is misleading since it's now a VmProof bundle, not a raw MultiProof.

Summary

Critical issues: None
High priority: 0
Medium priority: 2 (proof options trust, ELF data extraction trust model)
Low priority: 3 (caching, bounds check, docs)

The main concerns are around the trust model: the verifier trusts both proof_options and stack_size from the proof bundle. This is acceptable IF the preprocessed commitments properly enforce correctness through the constraint system. I recommend verifying this through constraint review or additional testing.

claude · 2026-02-11T15:47:51Z

Additional Finding - Bounds Check is Actually Safe

Re: my earlier inline comment about bounds checking at page.rs:231:

After reviewing the code more thoroughly, I found that PageConfig::with_data() at lines 128-136 properly handles this:

pub fn with_data(page_base: u64, page_size: usize, data: Vec<u8>) -> Self {
    assert!(data.len() <= page_size, "Data exceeds page size");
    let mut init_values = data;
    init_values.resize(page_size, 0); // Pad with zeros ✅
    ...
}

The resize(page_size, 0) ensures init_values.len() == page_size, so the access at line 231 in compute_precomputed_commitment() is always safe. The bounds check concern I raised is not an issue. Good defensive programming! 👍

claude · 2026-02-11T15:49:02Z

✅ Review Complete

I've completed the security and code review for PR #321.

Summary of Findings

Total issues found: 6

🔴 Medium severity: 2 (security trust model concerns)
🟠 Low severity: 4 (caching, error handling, performance, docs)
✅ Critical/High: None found

Key Takeaways

Security Model: The PR introduces a trust model where the verifier uses proof_options and stack_size from the proof bundle. This is acceptable IF the preprocessed commitments properly bind to these parameters through the constraint system. Recommend adding validation tests for this.
Correctness Fix: The ELF data extraction change (line 1179 in trace_builder.rs) is a critical correctness fix - the old code incorrectly created zero_init for ALL pages including ELF data pages.
Code Quality: Overall good defensive programming with proper assertions and validations. The bounds check concern I initially raised turned out to be properly handled by resize() in PageConfig::with_data().
Performance: The code is straightforward with no significant performance issues. The page data cloning is acceptable for typical program sizes.

Recommendations

Add validation tests ensuring a prover cannot create valid proofs with mismatched stack_size or manipulated proof_options
Consider adding minimum security requirements validation in the verifier
Address the caching issue with ProofOptions (either key by options hash or document the requirement)
Update the example code in the module docstring

No Critical Issues Found ✅

This PR is ready to merge from a security/correctness perspective, pending team discussion on the trust model design decisions.

github-actions · 2026-02-11T15:51:14Z

Benchmark Results for unmodified programs 🚀

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base binary_search`	52.3 ± 0.8	51.4	53.3	1.00
`head binary_search`	58.0 ± 9.3	51.6	81.3	1.11 ± 0.18

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base bitwise_ops`	53.9 ± 2.8	51.3	61.2	1.01 ± 0.06
`head bitwise_ops`	53.1 ± 1.1	51.2	53.9	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base fibonacci_26`	57.2 ± 0.8	56.0	58.1	1.00
`head fibonacci_26`	58.9 ± 3.1	55.9	66.4	1.03 ± 0.06

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base hashmap`	112.3 ± 1.1	111.1	114.5	1.00
`head hashmap`	112.9 ± 1.0	111.5	114.0	1.00 ± 0.01

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base keccak`	116.3 ± 2.8	113.9	121.6	1.00 ± 0.03
`head keccak`	115.7 ± 2.0	112.7	120.4	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base matrix_multiply`	64.2 ± 0.9	63.1	66.0	1.00
`head matrix_multiply`	64.8 ± 2.0	62.6	69.2	1.01 ± 0.03

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base modular_exp`	53.8 ± 1.1	52.0	56.3	1.01 ± 0.02
`head modular_exp`	53.5 ± 0.6	52.4	54.1	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base quicksort`	56.7 ± 1.5	55.2	59.1	1.00
`head quicksort`	56.9 ± 0.7	55.5	57.9	1.00 ± 0.03

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base sieve`	60.7 ± 0.8	59.4	62.0	1.00
`head sieve`	60.9 ± 0.7	59.3	61.7	1.00 ± 0.02

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base sum_array`	70.8 ± 1.1	68.7	72.3	1.02 ± 0.02
`head sum_array`	69.7 ± 1.1	68.1	71.4	1.00

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`base syscall_commit`	87.8 ± 4.8	84.3	96.7	1.02 ± 0.06
`head syscall_commit`	85.9 ± 1.4	82.9	88.0	1.00

ColoCarletti · 2026-02-11T16:00:29Z

/claude

claude · 2026-02-11T16:03:32Z

Review Summary

I've reviewed this PR focusing on security, bugs, performance, and code simplicity. Overall, the implementation is solid with good attention to cryptographic correctness. Below are the findings:

Critical Issues

1. [CRITICAL] OnceLock Cache Invalidation Risk (prover/src/tables/page.rs:213, prover/src/tables/register.rs:160)

The static caches use OnceLock but have a critical invariant documented in comments: All callers within a process must use identical ProofOptions. This is not enforced and could lead to soundness bugs.

Problem: If different ProofOptions (e.g., different blowup factors or coset offsets) are used across multiple prove/verify calls in the same process, the cache will return commitments computed with the first options, not the current ones. This breaks the cryptographic binding.

Attack scenario:

Prover uses weak options (low blowup factor) → cache stores weak commitment
Verifier uses strong options → cache returns weak commitment → verifier accepts weak proof

Recommended fix: Either:

Make the cache key include a hash of ProofOptions
Remove caching entirely (simplicity over micro-optimization)
Add runtime validation that compares current options with cached options and panics/errors on mismatch

This is marked CRITICAL because it directly affects soundness of the zero-knowledge proof system.

High Issues

2. [HIGH] Stack Size Not Validated in Verifier (prover/src/lib.rs:276)

The verifier accepts vm_proof.stack_size as a hint from the prover without bounds checking. While the PR description claims preprocessed commitments enforce correctness, there's a window for issues:

Problem: A malicious prover could provide an arbitrarily large stack_size (e.g., u64::MAX), causing the verifier to:

Allocate excessive memory in page_configs_from_elf
Potentially trigger integer overflows in page calculations
DoS the verifier

Impact: The verifier could crash or hang before getting to validate the preprocessed commitments.

Recommended fix: Add validation for reasonable stack size bounds
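
A bounds check of this kind might look like the sketch below. PAGE_SIZE, MAX_STACK_SIZE, and validated_stack_pages are hypothetical names, and the 64 MiB cap is an arbitrary illustrative limit, not a value from this PR.

```rust
/// Assumed page size in bytes.
pub const PAGE_SIZE: u64 = 4096;
/// Hypothetical upper bound on the stack; the real limit is a design decision.
pub const MAX_STACK_SIZE: u64 = 64 * 1024 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum StackSizeError {
    TooLarge { got: u64, max: u64 },
}

/// Validates the prover-supplied hint and returns the number of stack pages.
/// The bound is checked first, so the rounding arithmetic below cannot overflow.
pub fn validated_stack_pages(stack_size: u64) -> Result<u64, StackSizeError> {
    if stack_size > MAX_STACK_SIZE {
        return Err(StackSizeError::TooLarge {
            got: stack_size,
            max: MAX_STACK_SIZE,
        });
    }
    // Round up to whole pages; safe because stack_size <= MAX_STACK_SIZE.
    Ok((stack_size + PAGE_SIZE - 1) / PAGE_SIZE)
}
```

Running this before page_configs_from_elf would reject an oversized hint before any allocation happens.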

Medium Issues

3. [MEDIUM] Byte Extraction Endianness Assumption (prover/src/tables/trace_builder.rs:1171)

The byte extraction code assumes little-endian byte ordering. RISC-V is little-endian, so this is likely correct, but it's not explicitly documented.

Recommended fix: Add a comment clarifying the endianness assumption or use explicit methods like to_le_bytes() for clarity.
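
As an illustration of the explicit form, assuming the data is held as 32-bit words (which this thread does not confirm), a helper named words_to_le_bytes (hypothetical) could make the byte order visible at the call site:

```rust
/// Splits 32-bit words into bytes, least-significant byte first,
/// matching RISC-V's little-endian memory layout.
pub fn words_to_le_bytes(words: &[u32]) -> Vec<u8> {
    words.iter().flat_map(|word| word.to_le_bytes()).collect()
}
```

With this, the word 0x11223344 becomes the byte sequence 0x44, 0x33, 0x22, 0x11.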

4. [MEDIUM] Unused proof_options in VmProof (prover/src/lib.rs:48-60)

The VmProof struct stores proof_options but the comment says they are metadata only and NOT trusted by the verifier. The verifier now correctly ignores them (after the most recent changes).

Issue: Storing untrusted data in the proof increases serialization size and could confuse API users about what's actually validated.

Recommended fix: Either remove proof_options from VmProof entirely, or clearly document why they're included (e.g., for proof reproducibility / debugging).

Low Issues

5. [LOW] Clone of PageConfig init_data (prover/src/tables/trace_builder.rs:1198)

For 4KB pages, this clones 4KB vectors. Not a major issue since this runs once per page, but could be optimized.

Positive Observations

Good separation of proving/verification concerns with the new VmProof API
Preprocessed columns implementation follows existing DECODE pattern correctly
Proper use of expect with descriptive messages for crypto operations that shouldn't fail
Test coverage updated comprehensively
Comments explain the security model (even if implementation has gaps)

Summary

Must fix before merge:

Issue #1 (OnceLock cache) - Critical soundness issue
Issue #2 (stack_size validation) - High security issue

Should fix:

Issue #3 (endianness clarity)
Issue #4 (unused proof_options)

Optional:

Issue #5 (clone optimization)

MauroToscano

The API for prove and verify needs to be simplified, a comment explaining which part of the table is precomputed is needed, and stack_top should be taken from the VM.

make memory precomputed

7cfb44a

change verify api

4022c2b