feat(prover): Memory-mapped private input for guest programs #512

Merged
MauroToscano merged 29 commits into main from feat/private-input-v2
Apr 24, 2026

@ColoCarletti
Collaborator

Summary

Replace the GetPrivateInputs ecall with a memory-mapped approach: private input is pre-loaded at address 0xFF000000 as part of the initial memory image. The guest reads it via normal
RISC-V loads — no ecall, no invisible writes, no bus imbalance.

Problem

The old GetPrivateInputs ecall (syscall #4) copied input bytes into a guest buffer inside the executor without emitting any Log entries. The prover never saw those writes, so when
the guest later read the buffer, the MEMW table had old_value = actual_byte but the PAGE table had init = 0 (zero-init page). This caused a Memory bus imbalance and made verification
fail for any program reading private input.

How it works now

  1. Host side: Executor::new(elf, private_input) pre-loads input at 0xFF000000 with a 4-byte LE length prefix followed by the raw data bytes. This happens before execution starts.

  2. Prover side: MemoryState::add_private_input() mirrors the same layout at timestamp 0. A shared helper build_init_page_data(elf, private_input) produces the PAGE init values
    for both the prover (trace generation) and verifier (page config reconstruction), ensuring they agree exactly.

  3. Guest side: get_private_input() does a read_volatile of the length at 0xFF000000, then slice::from_raw_parts for the data bytes. No ecall — just normal RISC-V loads that
    the prover tracks through the standard MEMW → PAGE chain.

  4. Proof artifact: VmProof includes private_input: Vec<u8> so the verifier can reconstruct the matching PAGE preprocessed commitments at 0xFF000000.
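
The memory layout described in steps 1 to 3 can be sketched on the host as follows. This is a minimal illustration, not the code from this PR: `PRIVATE_INPUT_BASE`, `data_start`, `encode_private_input` and `decode_private_input` are hypothetical names. Only the 0xFF000000 base address and the 4-byte little-endian length prefix are taken from the description above.

```rust
/// Base address of the private-input region, as stated in the description.
/// The constant name is made up for this sketch.
const PRIVATE_INPUT_BASE: u32 = 0xFF00_0000;

/// Address of the first data byte, right after the 4-byte length prefix.
fn data_start() -> u32 {
    PRIVATE_INPUT_BASE + 4
}

/// Build the byte image placed at PRIVATE_INPUT_BASE:
/// a 4-byte little-endian length prefix followed by the raw input bytes.
fn encode_private_input(input: &[u8]) -> Vec<u8> {
    let len = u32::try_from(input.len()).expect("input does not fit in a u32 length");
    let mut image = Vec::with_capacity(4 + input.len());
    image.extend_from_slice(&len.to_le_bytes());
    image.extend_from_slice(input);
    image
}

/// Host-side model of the guest read: parse the length prefix, then return
/// the data bytes. Returns None if the image is shorter than the prefix claims.
fn decode_private_input(image: &[u8]) -> Option<&[u8]> {
    let prefix: [u8; 4] = image.get(..4)?.try_into().ok()?;
    let len = u32::from_le_bytes(prefix) as usize;
    let end = 4usize.checked_add(len)?;
    image.get(4..end)
}
```

In the guest the same two reads happen through ordinary loads from the mapped region rather than through a slice, which is why the prover can track them like any other memory access.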

API

// Prove with private input
let proof = prove_with_input(&elf_bytes, input_bytes)?;

// Prove without (backwards compatible)
let proof = prove(&elf_bytes)?;

// Verify (uses proof.private_input to reconstruct PAGE commitments)
let ok = verify(&proof, &elf_bytes)?;

Changes

- executor/src/vm/memory.rs — Make PRIVATE_INPUT_START_INDEX and MAX_PRIVATE_INPUT_SIZE public. Remove unused load_private_inputs().
- executor/src/vm/instruction/execution.rs — Remove GetPrivateInputs ecall handler and enum variant.
- syscalls/src/syscalls.rs — Rewrite get_private_input() as read_volatile from 0xFF000000. Remove GetPrivateInputs from guest-side enum.
- prover/src/lib.rs — New prove_with_input() wrapper. Add private_input: Vec<u8> to VmProof. Thread through prove_with_options and verify_with_options.
- prover/src/tables/trace_builder.rs — Add MemoryState::add_private_input(). Extract build_init_page_data() shared helper. Thread private_input through from_elf_and_logs,
generate_page_tables, collect_bitwise_from_page, and page_configs_from_elf_and_runtime.
- bin/cli/src/main.rs — Pass vec![] for new private_input parameter.

@ColoCarletti ColoCarletti marked this pull request as draft April 20, 2026 14:26
@github-actions

Codex Code Review

  1. High - Sensitive data disclosure (private_input is now part of proof payload)
    prover/src/lib.rs:141, prover/src/lib.rs:591, prover/src/lib.rs:623
    VmProof now includes raw private_input and verifier consumes it directly. If “private input” is expected to remain confidential, this PR declassifies it to anyone holding the proof.
    Fix: keep raw private input out of VmProof; include only a commitment (hash/Merkle root) and verify against that.

  2. Medium - Unchecked length can cause UB/DoS in guest syscall helper
    syscalls/src/syscalls.rs:90, syscalls/src/syscalls.rs:93
    get_private_input() trusts a memory-loaded length and immediately calls from_raw_parts(data_ptr, len) with no bounds check. If the length is corrupted or tampered with in guest memory, this can create invalid slices or huge allocations (to_vec()), leading to crashes or panics.
    Fix: validate len against a hard cap before constructing a slice, and return SyscallError on overflow.

  3. Medium - Verifier-side DoS risk from unbounded vm_proof.private_input
    prover/src/lib.rs:623, prover/src/tables/trace_builder.rs:1377
    Verification does not enforce a maximum private-input size before reconstructing page init data, and build_init_page_data copies all bytes again into all_bytes. A maliciously large proof input can force large extra allocations/work during verify.
    Fix: enforce private_input.len() <= MAX_PRIVATE_INPUT_SIZE early in verify_with_options and avoid the extra all_bytes copy path.
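
The length checks suggested in items 2 and 3 could look roughly like the sketch below. It is an illustration under assumptions: the value of `MAX_PRIVATE_INPUT_SIZE` here is a placeholder (the real constant lives in executor/src/vm/memory.rs), and `InputError`, `checked_len` and `check_proof_input` are hypothetical names that do not appear in the diff.

```rust
/// Placeholder cap for this sketch only; not the executor's real value.
const MAX_PRIVATE_INPUT_SIZE: usize = 8 * 1024 * 1024;

/// Hypothetical error type for an out-of-range length.
#[derive(Debug, PartialEq, Eq)]
enum InputError {
    TooLarge { len: usize, max: usize },
}

/// Guest side: validate the length prefix before it is used to build a slice.
fn checked_len(raw_len: u32) -> Result<usize, InputError> {
    let len = raw_len as usize;
    if len > MAX_PRIVATE_INPUT_SIZE {
        return Err(InputError::TooLarge { len, max: MAX_PRIVATE_INPUT_SIZE });
    }
    Ok(len)
}

/// Verifier side: reject an oversized input before any page data is rebuilt.
fn check_proof_input(private_input: &[u8]) -> Result<(), InputError> {
    let len = private_input.len();
    if len > MAX_PRIVATE_INPUT_SIZE {
        return Err(InputError::TooLarge { len, max: MAX_PRIVATE_INPUT_SIZE });
    }
    Ok(())
}
```

Running the check before `from_raw_parts` (guest) or before page reconstruction (verifier) means a bad length is rejected before it can drive a slice or an allocation.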

@github-actions

github-actions Bot commented Apr 20, 2026

Benchmark Results for modified programs 🚀

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| head hashmap | 150.0 ± 6.9 | 142.7 | 161.7 | 1.00 |
| head keccak | 137.3 ± 2.6 | 133.2 | 141.3 | 1.00 |
| head syscall_commit | 93.7 ± 0.9 | 92.4 | 94.9 | 1.00 |

@github-actions

github-actions Bot commented Apr 20, 2026

Benchmark Results for unmodified programs 🚀

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| base binary_search | 59.4 ± 1.0 | 58.5 | 61.6 | 1.00 |
| head binary_search | 61.5 ± 7.0 | 58.5 | 81.3 | 1.03 ± 0.12 |
| base bitwise_ops | 59.3 ± 1.2 | 58.2 | 61.6 | 1.00 |
| head bitwise_ops | 59.6 ± 1.4 | 58.1 | 61.3 | 1.00 ± 0.03 |
| base fibonacci_26 | 63.6 ± 1.5 | 62.5 | 67.5 | 1.00 |
| head fibonacci_26 | 63.8 ± 1.1 | 62.8 | 66.4 | 1.00 ± 0.03 |
| base matrix_multiply | 63.5 ± 0.5 | 62.7 | 64.4 | 1.00 |
| head matrix_multiply | 63.5 ± 1.1 | 61.9 | 65.2 | 1.00 ± 0.02 |
| base modular_exp | 59.2 ± 0.8 | 58.6 | 61.2 | 1.00 |
| head modular_exp | 59.4 ± 0.7 | 58.4 | 60.5 | 1.00 ± 0.02 |
| base quicksort | 62.3 ± 0.7 | 61.1 | 63.1 | 1.00 |
| head quicksort | 87.7 ± 12.0 | 65.4 | 101.9 | 1.41 ± 0.19 |
| base sieve | 64.8 ± 0.6 | 63.9 | 65.7 | 1.00 |
| head sieve | 64.8 ± 0.5 | 64.0 | 65.3 | 1.00 ± 0.01 |
| base sum_array | 74.2 ± 0.6 | 73.5 | 75.7 | 1.01 ± 0.01 |
| head sum_array | 73.3 ± 0.5 | 72.9 | 74.2 | 1.00 |

@ColoCarletti ColoCarletti marked this pull request as ready for review April 22, 2026 14:54
@ColoCarletti
Collaborator Author

/bench

@github-actions

Codex Code Review

  1. High | Memory safety / security
    syscalls/src/syscalls.rs:90, syscalls/src/syscalls.rs:93
    get_private_input() reads len from guest memory and immediately does from_raw_parts(data_ptr, len) with no bounds check. Because this is unsafe, a corrupted/spoofed length can create an invalid slice (UB) and read outside the mapped private-input region.
    Action: validate len before slice creation (at least len <= MAX_PRIVATE_INPUT_SIZE, and ideally ensure it stays within the mapped region).

  2. Medium | Functional regression in verification path
    bin/cli/src/main.rs:140, bin/cli/src/main.rs:460, prover/src/lib.rs:662
    Proof verification now depends on private inputs (verify_with_inputs / verify_with_options(..., private_inputs)), but CLI Verify has no --private-input argument and always passes &[]. So proofs produced with non-empty private input cannot be verified via CLI even when valid.
    Action: add --private-input to Verify and pass bytes into verify_with_options; keep &[] default only when omitted.

No other concrete security/bug/perf issues stood out in this diff.

@github-actions

Benchmark — fib_iterative_8M (median of 3)

Table parallelism: 32 (auto = cores / 3)

| Metric | main | PR | Δ |
|---|---|---|---|
| Peak heap | 68024 MB | 66606 MB | -1418 MB (-2.1%) ⚪ |
| Prove time | 33.694s | 33.744s | +0.050s (+0.1%) ⚪ |

✅ No significant change.

✅ Low variance (time: 0.3%, heap: 3.3%)

Commit: 7adb2d9 · Baseline: built from main · Runner: self-hosted bench

@claude

claude Bot commented Apr 22, 2026

PR Review: Memory-mapped private input

Overall the architecture is sound — replacing the ecall with a memory-mapped region at 0xFF000000 is a clean approach. A few issues to address, one of them a functional regression.


[High] CLI verify silently ignores private input — verifying private-input proofs always fails

The Verify subcommand has no --private-input flag, so cmd_verify always calls verify_with_options(..., &[]). Any proof produced with prove --private-input <file> will fail verification at the CLI, because the verifier reconstructs PAGE table commitments against an empty input.

Fix: Add --private-input: Option<PathBuf> to the Verify subcommand struct and thread it through to verify_with_options.


[Medium] WrongPrivateInputSize is a dead error variant — get_private_input is infallible

get_private_input() returns Result<Vec<u8>, SyscallError> but can never produce Err: both read_volatile and from_raw_parts are infallible in this context, and WrongPrivateInputSize is never constructed anywhere. The signature should either be -> Vec<u8>, or the dead variant should be removed with a comment noting the function is currently infallible.


[Medium] Missing soundness test: verifier must reject a proof when the wrong private input is supplied

The new tests prove with input A and verify with input A (happy path). There is no test proving with input A and verifying with input B — which is the core soundness guarantee of this scheme. Without it, a regression could let a prover cheat by using mismatched inputs.


[Low] build_init_page_data is called twice per proof

Once via collect_bitwise_from_page (line 1387) and once via generate_page_tables (line 1547). For large inputs (~6.7 MB), this allocates and iterates over the full input twice. The result should be computed once in build_traces and passed into both callees. (Performance note, not a correctness issue.)

Comment thread bin/cli/src/main.rs Outdated
Comment thread syscalls/src/syscalls.rs
Comment thread prover/src/tables/trace_builder.rs
Comment thread prover/src/tests/prove_elfs_tests.rs
@ColoCarletti ColoCarletti marked this pull request as draft April 22, 2026 19:43
@ColoCarletti ColoCarletti marked this pull request as ready for review April 22, 2026 21:09
@github-actions

Codex Code Review

  1. High – Unsafe memory read with unbounded length from guest-controlled memory
  • File: syscalls/src/syscalls.rs
  • get_private_input() reads len from memory, then does from_raw_parts(data_ptr, len) with no bounds check.
  • This is inside an unsafe block and can violate Rust’s safety preconditions if len is corrupted (e.g. overwritten in guest memory), leading to UB/panic/OOM behavior.
  • Action: bound-check len against a hard max (the same private-input max used by executor) before building the slice, and return an error or halt on invalid length.
  2. Medium – Verifier DoS vector via unbounded num_private_input_pages
  • File: prover/src/tables/trace_builder.rs
  • page_configs_from_elf_and_runtime() trusts num_private_input_pages from proof input and loops 0..num_private_input_pages without validation.
  • In verification (verify_with_options), this can force huge allocations/work from untrusted proof data before any proof-count sanity check.
  • Action: validate num_private_input_pages against a strict upper bound derived from MAX_PRIVATE_INPUT_SIZE and page size before constructing configs.

No other clear security/correctness/perf issues stood out in this diff.

Comment thread prover/src/tables/trace_builder.rs
Comment thread syscalls/src/syscalls.rs
Comment thread prover/src/tables/page.rs
Comment thread prover/src/lib.rs
@claude

claude Bot commented Apr 22, 2026

Code Review — feat(prover): Memory-mapped private input

The core design (pre-loading private input at 0xFF000000 before execution, using normal RISC-V loads instead of an ecall, and committing init values as main trace for ZK) is sound and the bus-balance fix it addresses is real. The implementation is clean overall. A few issues worth addressing before merge:


High

Attacker-controlled allocation in verifier (prover/src/tables/trace_builder.rs:2247)
num_private_input_pages is read from the untrusted VmProof and used directly in an unbounded loop. A crafted proof with num_private_input_pages = usize::MAX will OOM or spin the verifier. Needs an upper-bound check against MAX_PRIVATE_INPUT_SIZE / DEFAULT_PAGE_SIZE before the loop. See inline comment.
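
A bound of the kind described here might be derived as in the sketch below. It is illustrative only. The 256 KB page size is the figure quoted in this review, the `MAX_PRIVATE_INPUT_SIZE` value is a placeholder, and counting the 4-byte length prefix toward the region size is an assumption. `max_private_input_pages` and `validate_page_count` are hypothetical helper names.

```rust
/// Page size quoted in the review (256 KB); the real constant lives in
/// prover/src/tables/page.rs and may differ.
const DEFAULT_PAGE_SIZE: usize = 256 * 1024;
/// Placeholder cap for this sketch only; not the executor's real value.
const MAX_PRIVATE_INPUT_SIZE: usize = 8 * 1024 * 1024;
/// Assumption: the 4-byte length prefix shares the region with the data.
const LENGTH_PREFIX_BYTES: usize = 4;

/// Largest page count a well-formed proof could legitimately claim.
fn max_private_input_pages() -> usize {
    (MAX_PRIVATE_INPUT_SIZE + LENGTH_PREFIX_BYTES).div_ceil(DEFAULT_PAGE_SIZE)
}

/// Reject an untrusted page count before looping over `0..num_pages`.
fn validate_page_count(num_pages: usize) -> Result<usize, String> {
    let max = max_private_input_pages();
    if num_pages > max {
        return Err(format!(
            "num_private_input_pages = {num_pages} exceeds maximum {max}"
        ));
    }
    Ok(num_pages)
}
```

With these placeholder values the cap works out to 33 pages, so a crafted `usize::MAX` is rejected in constant time instead of driving the loop.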


Medium

No bounds check on len before slice::from_raw_parts (syscalls/src/syscalls.rs:95)
The length read from 0xFF000000 is used directly in slice::from_raw_parts without validating against MAX_PRIVATE_INPUT_SIZE. In the VM this reads simulated memory, so there's no host-side UB, but a corrupted/overflowed length silently produces a slice that reads bytes well beyond the intended private input region. A hard bounds check is preferable to a debug assert. See inline comment.

Global page size bump from 4 KB → 256 KB (prover/src/tables/page.rs:51)
Every program now pays 64× more PAGE rows regardless of whether private input is used. For programs that touch a handful of 4 KB pages the trace size blows up significantly. Consider keeping the small page size for non-private-input pages, or at least document why 256 KB was chosen and what the measured proving-time impact is. See inline comment.


Low

PR description is stale (prover/src/lib.rs:144)
The description says VmProof will include private_input: Vec<u8>; the actual field is num_private_input_pages: usize. Worth updating so reviewers understand the design intent (fully ZK — verifier never reconstructs init values, only learns the page count from the proof).


Not issues (noted for clarity)

  • Removing SyscallError and changing get_private_input() to return Vec<u8> directly is a breaking API change, but it's clearly intentional here.
  • The private_input_bytes helper is called once in add_private_input and once in build_init_page_data. The duplicate allocation is negligible compared to proof overhead.
  • MemoryState::add_private_input correctly inserts cells at timestamp 0, consistent with ELF init cells in from_elf.

diegokingston and others added 7 commits April 23, 2026 17:17
Verify the verifier rejects proofs with tampered num_private_input_pages
(zeroed, inflated, exceeds max) and tampered public_output when private
input is present. Also confirm the proof struct does not carry raw
private input bytes.
@MauroToscano MauroToscano force-pushed the feat/private-input-v2 branch from 22a81c9 to b3cdaa5 Compare April 24, 2026 18:50
@MauroToscano MauroToscano enabled auto-merge April 24, 2026 19:05
@MauroToscano MauroToscano added this pull request to the merge queue Apr 24, 2026
Merged via the queue into main with commit 62bfbb9 Apr 24, 2026
16 checks passed
@MauroToscano MauroToscano deleted the feat/private-input-v2 branch April 24, 2026 19:35
3 participants