Skip to content

Test/ef io edge cases#589

Open
MauroToscano wants to merge 12 commits into
mainfrom
test/ef-io-edge-cases
Open

Test/ef io edge cases#589
MauroToscano wants to merge 12 commits into
mainfrom
test/ef-io-edge-cases

Conversation

@MauroToscano
Copy link
Copy Markdown
Contributor

No description provided.

jotabulacios and others added 12 commits May 7, 2026 15:34
   for len in load_bytes instead of
  an implicit cast.
Guest programs:
- ef_io_no_input: read_input with no private input (buf_size==0), then
  write_output with hardcoded "ok"
- ef_io_zero_write: write_output(ptr, 0) followed by write_output(ptr, 5)
- ef_io_multi_write: 10 single-byte write_output calls concatenating to
  "abcdefghij"

Unit tests (memory.rs):
- Zero-length commit after data is a no-op
- Zero-length commit at exact 1 MB cap boundary succeeds
- Failed commit (cap exceeded) does not modify the output buffer

Integration tests (rust.rs):
- test_ef_io_no_input
- test_ef_io_zero_write
- test_ef_io_multi_write

Prover E2E test:
- test_prove_ef_io_no_input: full prove+verify with no private input
@github-actions
Copy link
Copy Markdown

Codex Code Review

No issues found in the PR diff.

I attempted a lightweight cargo check -p executor -p lambda-vm-syscalls -p prover, but the environment’s Rust toolchain tried to write under /home/runner/.rustup and failed due a read-only filesystem, so I could not verify compilation locally.

@claude
Copy link
Copy Markdown

claude Bot commented May 18, 2026

PR Review: EF IO Edge Cases

High

load_half panics on unaligned access instead of returning Err (executor/src/vm/memory.rs:126)
All other memory access functions in this PR now return Result<_, MemoryError>. load_half still calls unimplemented!() on an unaligned address, which aborts the host process. A malformed guest can trigger this. Should return Err(MemoryError::UnalignedAccess) to match the new error-handling posture.


Medium

MAX_PUBLIC_OUTPUT_TOTAL_SIZE silently increased 1024x (executor/src/vm/memory.rs:44)
Old per-call limit was 1 KB; new cumulative limit is 1 MB. Public output directly affects proof size/cost in a zkVM. The 1 MB value appears arbitrary — needs justification either against the EF zkVM IO spec or against COMMIT AIR constraints.

read_input sets buf_ptr to a non-null value even when len == 0 (syscalls/src/ef_io.rs:39)
The EF spec says buf_ptr is unspecified when buf_size == 0. Callers that check buf_ptr == null to detect "no input" will be misled. Should set *buf_ptr = core::ptr::null() when len == 0.


Low

debug_assert_eq! strips a proof-correctness invariant in release builds (prover/src/lib.rs:654)
The check that the executor view matches trace-reconstructed output is a correctness invariant for generated proofs. debug_assert_eq! is elided in release builds, where a silent divergence would produce incorrect proofs with no runtime signal. Use assert_eq! unless there is a documented performance reason.

SyscallNumbers lacks #[repr(usize)] (syscalls/src/syscalls.rs:12)
The enum is cast with as usize in inline asm. Without #[repr(usize)], the discriminant representation is implementation-defined. Add #[repr(usize)] to make the ABI contract explicit.

write_output inline asm does not declare a0 as clobbered (syscalls/src/ef_io.rs:61)
ecall writes its return value into a0. If Rust has a0 live after the asm block, the compiler may miscompile. Add lateout("a0") _ to the asm constraints (same issue exists in syscalls.rs for the other ecalls).

Comment thread executor/src/vm/memory.rs
@@ -117,7 +129,11 @@ impl Memory {
);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[High] load_half panics on unaligned access while all other memory functions in this PR return Err. A malformed guest can abort the host process.

Suggested change
}
return Err(MemoryError::UnalignedAccess);

(Also add UnalignedAccess to the MemoryError enum if not already present.)

Comment thread executor/src/vm/memory.rs
/// Total cap on public output bytes across all `commit_public_output` calls.
/// The COMMIT AIR concatenates calls via the running `x254` index, so this
/// is enforced as a running-total budget rather than a per-call limit.
pub const MAX_PUBLIC_OUTPUT_TOTAL_SIZE: u64 = 1024 * 1024;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Medium] This is a 1024x increase from the old 1 KB per-call limit and a semantic shift to a cumulative limit. Public output size directly affects proof cost. What is this value based on — the EF zkVM IO spec, COMMIT AIR constraints, or something else? Needs a comment or PR description justification.

Comment thread syscalls/src/ef_io.rs
unsafe {
let len_ptr = PRIVATE_INPUT_START as *const u32;
let len = core::ptr::read_volatile(len_ptr) as usize;
*buf_ptr = (PRIVATE_INPUT_START + 4) as *const u8;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Medium] The EF IO spec says buf_ptr is unspecified when buf_size == 0. This always writes a non-null pointer, which misleads callers that test buf_ptr == null to detect "no input" — a common pattern in portable zkVM code.

Suggested change
*buf_ptr = (PRIVATE_INPUT_START + 4) as *const u8;
*buf_ptr = if len == 0 {
core::ptr::null()
} else {
(PRIVATE_INPUT_START + 4) as *const u8
};

Comment thread prover/src/lib.rs
.filter(|c| c.is_private_input)
.count();

debug_assert_eq!(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Low] debug_assert_eq! is compiled out in release builds. If executor view and trace-reconstructed output silently diverge in production, incorrect proofs are generated with no runtime signal. Use assert_eq! here — or document why eliding this check in release is safe.

Comment thread syscalls/src/syscalls.rs

#[cfg(target_arch = "riscv64")]
enum SyscallNumbers {
pub enum SyscallNumbers {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Low] The enum is cast as usize in inline asm. Without #[repr(usize)] the discriminant representation is implementation-defined. Add the repr to make the ABI contract explicit.

Suggested change
pub enum SyscallNumbers {
#[repr(usize)]
pub enum SyscallNumbers {

Comment thread syscalls/src/ef_io.rs
in("a2") size,
in("a7") SyscallNumbers::Commit as usize,
);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Low] ecall writes its return value into a0. Without a clobber declaration, the compiler may keep a live value in a0 across this asm block and get it silently overwritten. Add lateout("a0") _ to the constraints. (The same issue exists in the ecall blocks in syscalls.rs.)

@github-actions
Copy link
Copy Markdown

Benchmark Results for modified programs 🚀

Command Mean [ms] Min [ms] Max [ms] Relative
head hashmap 162.4 ± 29.8 128.8 199.4 1.00
Command Mean [ms] Min [ms] Max [ms] Relative
head keccak 127.9 ± 2.9 125.4 133.1 1.00
Command Mean [ms] Min [ms] Max [ms] Relative
head syscall_commit 91.9 ± 0.7 91.2 93.5 1.00

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants