Add preprocessed columns for PAGE/REGISTER tables and introduce VmProof API#321
Conversation
Kimi Code ReviewSecurity Vulnerabilities
Potential Bugs
Performance Issues
Simplicity
Overall, the changes introduce necessary functionality for precomputed commitments and caching, but they also bring potential security vulnerabilities and performance concerns. It is important to address these issues and ensure the changes are well-documented and tested. Automated review by Kimi (Moonshot AI) |
Codex Code ReviewFindings (ordered by severity)
No other security, bug, or significant performance issues found in this diff. |
Security & Code Review FindingsI've reviewed PR #321. Here are the issues found, categorized by severity: 🔴 [Medium] Security: Proof Options Trust IssueLocation: Issue: The verifier extracts
Impact: Proofs could be generated with intentionally weakened security parameters. Recommendation:
🟡 [Medium] Security: ELF Data Extraction ChangeLocation: Issue: This PR changes how ELF page data is extracted:
The docstring (line 1155-1157) says "page_configs should be embedded in proof public inputs" but they're NOT. The verifier reconstructs them from the ELF alone. The Potential Impact: If a malicious prover uses different initial page values during proving, the verifier might reconstruct different configs. Status: PR description acknowledges this: "stack_size is a hint (preprocessed commitments enforce correctness)". This relies on the AIR constraints properly binding to initial data through preprocessed commitments. This should be verified by reviewing the constraint system. 🟠 [Low] Bug Risk: Cache InvalidationLocations:
Issue: Both use
Current Risk: Low in practice if the same Recommendation: Either:
🔵 [Low] Potential Bounds CheckLocation: Issue: If Recommendation: Add bounds check or validate 🔵 [Low] Documentation MismatchLocation: Issue: Example code hasn't been updated to reflect the API change: let proof = lambda_vm_prover::prove(&elf_bytes).unwrap();Should be: let vm_proof = lambda_vm_prover::prove(&elf_bytes).unwrap();The variable name SummaryCritical issues: None The main concerns are around the trust model: the verifier trusts both |
Additional Finding - Bounds Check is Actually SafeRe: my earlier inline comment about bounds checking at After reviewing the code more thoroughly, I found that pub fn with_data(page_base: u64, page_size: usize, data: Vec<u8>) -> Self {
assert!(data.len() <= page_size, "Data exceeds page size");
let mut init_values = data;
init_values.resize(page_size, 0); // Pad with zeros ✅
...
}The |
✅ Review CompleteI've completed the security and code review for PR #321. Summary of FindingsTotal issues found: 6
Key Takeaways
Recommendations
No Critical Issues Found ✅This PR is ready to merge from a security/correctness perspective, pending team discussion on the trust model design decisions. |
|
Benchmark Results for unmodified programs 🚀
|
|
/claude |
Review SummaryI've reviewed this PR focusing on security, bugs, performance, and code simplicity. Overall, the implementation is solid with good attention to cryptographic correctness. Below are the findings: Critical Issues1. [CRITICAL] OnceLock Cache Invalidation Risk (prover/src/tables/page.rs:213, prover/src/tables/register.rs:160) The static caches use OnceLock but have a critical invariant documented in comments: All callers within a process must use identical ProofOptions. This is not enforced and could lead to soundness bugs. Problem: If different ProofOptions (e.g., different blowup factors or coset offsets) are used across multiple prove/verify calls in the same process, the cache will return commitments computed with the first options, not the current ones. This breaks the cryptographic binding. Attack scenario:
Recommended fix: Either:
This is marked CRITICAL because it directly affects soundness of the zero-knowledge proof system. High Issues2. [HIGH] Stack Size Not Validated in Verifier (prover/src/lib.rs:276) The verifier accepts vm_proof.stack_size as a hint from the prover without bounds checking. While the PR description claims preprocessed commitments enforce correctness, there's a window for issues: Problem: A malicious prover could provide an arbitrarily large stack_size (e.g., u64::MAX), causing the verifier to:
Impact: The verifier could crash or hang before getting to validate the preprocessed commitments. Recommended fix: Add validation for reasonable stack size bounds Medium Issues3. [MEDIUM] Byte Extraction Endianness Assumption (prover/src/tables/trace_builder.rs:1171) The byte extraction code assumes little-endian byte ordering. RISC-V is little-endian, so this is likely correct, but it's not explicitly documented. Recommended fix: Add a comment clarifying the endianness assumption or use explicit methods like to_le_bytes() for clarity. 4. [MEDIUM] Unused proof_options in VmProof (prover/src/lib.rs:48-60) The VmProof struct stores proof_options but the comment says they are metadata only and NOT trusted by the verifier. The verifier now correctly ignores them (after the most recent changes). Issue: Storing untrusted data in the proof increases serialization size and could confuse API users about what's actually validated. Recommended fix: Either remove proof_options from VmProof entirely, or clearly document why they're included (e.g., for proof reproducibility / debugging). Low Issues5. [LOW] Clone of PageConfig init_data (prover/src/tables/trace_builder.rs:1198) For 4KB pages, this clones 4KB vectors. Not a major issue since this runs once per page, but could be optimized. Positive Observations
SummaryMust fix before merge:
Should fix:
Optional:
|
MauroToscano
left a comment
There was a problem hiding this comment.
API for prove and verify needs to be simplified, a comment on what part of the table is precomputed is needed, and stack_top should be taken from the VM
via Merkle roots, allowing the verifier to independently recompute and verify initial memory/register
state from the ELF. Follows the existing DECODE preprocessed pattern (FFT → LDE → Merkle tree).
Zero-init pages share a cached commitment via OnceLock.
verifier couldn't reconstruct correct init data. Now extracts actual ELF byte data per page.
and proof_options. The verifier extracts these from the proof instead of relying on hardcoded constants.
stack_size is a hint (preprocessed commitments enforce correctness).