feat(prover): slim MEMW table and add MEMW_A aligned fast path by diegokingston · Pull Request #424 · yetanotherco/lambda_vm

diegokingston · 2026-03-13T19:30:13Z

Implement the new MEMW spec (PR #398) with two major changes:

1. MEMW slimdown (70→49 cols, 57→26 interactions):
   - Replace address_add[7] DWordHL (28 cols) with add_limb_overflow[7] Bit (7 cols), making address_add virtual via BusValue::linear
   - Remove 28 IS_HALFWORD interactions and 3 LT overflow checks
   - Replace AddConstraint pairs with IsBitConstraint::unconditional
2. MEMW_A aligned fast path (30 cols, 22 interactions):
   - New table for aligned accesses where all bytes share the same old_timestamp (covers all register ops + most memory ops)
   - Address decomposed into high/mid/low parts
   - AND_BYTE interaction for alignment check
   - Single old_timestamp instead of per-byte

Operations are routed at trace build time: aligned ops with uniform timestamps go to MEMW_A, the rest go to the slimmed MEMW.
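The routing rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `MemwOp` and its fields are hypothetical stand-ins for the real `MemwOperation`, whose layout is not shown in this thread, and `route` is an assumed helper name.

```rust
/// Hypothetical, simplified stand-in for the PR's `MemwOperation`.
#[derive(Debug, Clone, PartialEq)]
pub struct MemwOp {
    pub base_address: u64,
    /// Access width in bytes; assumed to be 1, 2, 4 or 8.
    pub width: u8,
    /// One previous timestamp per accessed byte.
    pub old_timestamps: Vec<u64>,
}

/// An op takes the fast path when its base address is a multiple of its
/// width and every accessed byte carries the same old timestamp.
pub fn is_aligned_op(op: &MemwOp) -> bool {
    let width = op.width as u64;
    debug_assert!(width.is_power_of_two());
    let aligned = (op.base_address & (width - 1)) == 0;
    let uniform = op.old_timestamps.windows(2).all(|w| w[0] == w[1]);
    aligned && uniform
}

/// Split ops into (MEMW_A candidates, ops that stay in the slimmed MEMW).
pub fn route(ops: Vec<MemwOp>) -> (Vec<MemwOp>, Vec<MemwOp>) {
    ops.into_iter().partition(is_aligned_op)
}
```

Under these assumptions, an 8-byte access at 0x1000 with identical timestamps lands in the first vector, while a 2-byte access whose two bytes were last written at different times stays in the second even though it is aligned.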

diegokingston · 2026-03-13T19:30:22Z

/bench

github-actions · 2026-03-13T19:32:45Z

Codex Code Review

Critical — Unconstrained carry bits let prover forge byte addresses in MEMW
- In MEMW bus lookups, byte addresses 1..7 are now derived from add_limb_overflow[i] (lo = base_lo + i - 2^32*overflow, hi = base_hi + overflow) but constraints only enforce each overflow column is a bit.
- There is no constraint tying add_limb_overflow[i] to the actual carry of base_address_0 + i, so a malicious prover can choose wrong carry bits and redirect memory bus tokens to incorrect addresses while still satisfying constraints.
- References: memw.rs:333, memw.rs:409, memw.rs:995
- Action: add algebraic constraints that enforce exact carry semantics for each i (or restore explicit address_add columns with ADD constraints).
High — Overflow safety checks (R1–R3) were removed without equivalent enforcement
- The previous overflow LT checks (base_address < base_address + offset) were removed from both trace construction and MEMW LT bus interactions; now only timestamp LT checks remain.
- This drops explicit non-overflow guarantees for multi-byte unaligned accesses near address boundaries and changes soundness assumptions.
- References: trace_builder.rs:696, trace_builder.rs:748, memw.rs:754
- Action: restore R1–R3 checks (or add equivalent constraints proving no illegal wraparound).

Assumption

This review assumes the memory/address model still requires strict carry correctness and overflow safety as in prior MEMW behavior. If spec intentionally changed, that needs explicit proof-level justification plus tests covering boundary addresses.
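To make the Critical finding concrete, here is a small sketch of the derivation quoted in the review (lo = base_lo + i - 2^32*overflow, hi = base_hi + overflow). Plain integers stand in for field elements, and all function names are hypothetical. It only shows what value a wrong bit produces; whether such a token can ever be matched on the other side of the bus is not something this sketch decides.

```rust
/// Derive the (lo, hi) address of byte `i` from the base address and a
/// claimed overflow bit, following the formula quoted in the review.
/// Signed 64-bit integers are used so out-of-range results stay visible.
pub fn derived_address(base_lo: u32, base_hi: u32, i: u32, overflow: bool) -> (i64, i64) {
    let ov = overflow as i64;
    let lo = base_lo as i64 + i as i64 - (ov << 32);
    let hi = base_hi as i64 + ov;
    (lo, hi)
}

/// The carry an honest prover would supply for base_lo + i.
pub fn true_carry(base_lo: u32, i: u32) -> bool {
    base_lo.checked_add(i).is_none()
}

/// A derived low word is a canonical 32-bit value only inside [0, 2^32).
pub fn is_canonical(lo: i64) -> bool {
    (0..(1i64 << 32)).contains(&lo)
}
```

With base_lo = 0xFFFFFFFE and i = 3 the honest bit is 1 and the low word wraps to 1. Claiming 0 instead yields a low word above 2^32, and claiming 1 when no carry occurs yields a negative one, so in both cases the wrong bit shows up as a non-canonical low word.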

claude · 2026-03-13T19:39:31Z

+    ]);
+
+    // CM18/19: byte 1 with w2
+    // For aligned accesses, adding 1 to the low byte never overflows to hi word


Medium — Overflow safety invariant depends on three independent range checks

The claim "alignment guarantees base_address_lo + width-1 < 2^32" holds, but only when all three lookup-enforced range checks are satisfied simultaneously:

IS_HALFWORD(MID) → MID ∈ [0, 2^16)

IS_BYTE(LOW[1]) → LOW[1] ∈ [0, 256)

AND_BYTE(LOW[0], mask) → LOW[0] ∈ [0, 256) (implicit byte range) + alignment

With these, base_addr_lo ≤ 0xFFFFFFF8 for width=8 aligned (max LOW[0]=248), so +7 ≤ 0xFFFFFFFF. The property is correct, but there's no single algebraic constraint encoding it — it emerges from three separate bus interactions. If any of those range checks were ever relaxed or skipped (e.g., due to future refactoring of the Bitwise table), this no-overflow guarantee would silently break.

Consider adding a comment that explicitly names all three preconditions this relies on, to make the invariant auditable in isolation.
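The dependency on the three range checks can be illustrated with a short sketch. The decomposition base_lo = 2^16*MID + 2^8*LOW[1] + LOW[0] is taken from this thread; the function names are hypothetical, and ordinary integers stand in for field elements.

```rust
/// Split the low address word into (MID, LOW[1], LOW[0]).
pub fn decompose(base_lo: u32) -> (u32, u32, u32) {
    (base_lo >> 16, (base_lo >> 8) & 0xFF, base_lo & 0xFF)
}

/// Recombine the parts without assuming they are in range.
pub fn recompose(mid: u32, low1: u32, low0: u32) -> u64 {
    ((mid as u64) << 16) + ((low1 as u64) << 8) + low0 as u64
}

/// The three preconditions named in the review: MID is a halfword,
/// LOW[1] is a byte, and LOW[0] is a byte aligned to `width`.
pub fn preconditions_hold(mid: u32, low1: u32, low0: u32, width: u32) -> bool {
    mid < (1 << 16) && low1 < 256 && low0 < 256 && (low0 & (width - 1)) == 0
}

/// True when the last accessed byte still lies inside the 32-bit low word.
pub fn last_byte_fits(mid: u32, low1: u32, low0: u32, width: u32) -> bool {
    recompose(mid, low1, low0) + (width as u64 - 1) < (1u64 << 32)
}
```

At the boundary case (MID = 0xFFFF, LOW[1] = 0xFF, LOW[0] = 0xF8, width 8) all preconditions hold and the last byte sits exactly at 0xFFFFFFFF. Letting LOW[0] reach 0x100, or MID reach 0x10000, keeps the alignment test satisfied but pushes the last byte past 2^32.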

claude · 2026-03-13T19:39:32Z

+}
+
+/// Creates all constraints for the MEMW_A table (2 total).
+pub fn constraints() -> Vec<Box<dyn TransitionConstraint<GoldilocksField, GoldilocksExtension>>> {


Low — No IS_BIT constraints for WRITE2, WRITE4, WRITE8, IS_REGISTER

MEMW_A has 2 algebraic constraints vs MEMW's 9 (which includes 7 IsBitConstraint::unconditional for ADD_LIMB_OVERFLOW). The write flags and IS_REGISTER here are only constrained to be binary via bus interactions (CO24/CO25 must match what CPU sends), not by local polynomial constraints.

This is consistent with the original MEMW design (write flags weren't IS_BIT-constrained there either), and the bus-level enforcement is sound. But it's worth noting that the AND_BYTE alignment mask write2*1 + write4*3 + write8*7 behaves correctly only if the flags are mutually exclusive and binary — a property that's enforced globally (via the bus) but not locally. Future auditors reading this table in isolation may miss this dependency.
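A brief sketch of why the mask needs binary, mutually exclusive flags. The mask formula is the one quoted above; the function names are hypothetical and integers again stand in for field elements.

```rust
/// Alignment mask as written in the review: write2*1 + write4*3 + write8*7.
pub fn alignment_mask(write2: u32, write4: u32, write8: u32) -> u32 {
    write2 * 1 + write4 * 3 + write8 * 7
}

/// Model of the AND_BYTE alignment check: the masked low byte must be zero.
pub fn passes_alignment(low0: u32, mask: u32) -> bool {
    (low0 & mask) == 0
}
```

With exactly one flag set the mask is width - 1 (1, 3 or 7), and with none set it is 0 for single-byte writes. If write2 and write4 were both 1, the mask would be 4, and a low byte of 3 would pass the check although it is aligned to neither width.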

claude · 2026-03-13T19:39:40Z

+/// An operation is aligned if:
+/// 1. For width > 1: base_address is aligned to width (low bits are zero)
+/// 2. All accessed bytes share the same old_timestamp
+fn is_aligned_op(op: &MemwOperation) -> bool {


Low — Routing logic is duplicated in two build paths

is_aligned_op + the partition call appear identically in both Traces::from_logs and the parallel build path. If the routing criteria ever change (e.g., a new alignment condition is added), both copies need to stay in sync — there's no compile-time guarantee they will.

Also note: the routing check here tests alignment on the full 32-bit low word (low & (width-1)), while MEMW_A's constraint tests only LOW[0] = addr & 0xFF. These are equivalent since masks ≤ 7 < 256, but they're expressed differently in two separate places. If width ever grows beyond 256, the two diverge.
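The equivalence of the two alignment tests can be checked mechanically. This sketch uses hypothetical function names and models only the two expressions mentioned in the comment, not the actual router or constraint code.

```rust
/// Router-style test: mask the full 32-bit low word.
pub fn router_check(base_lo: u32, width: u32) -> bool {
    (base_lo & (width - 1)) == 0
}

/// Constraint-style test: mask only LOW[0] = base_lo & 0xFF.
pub fn constraint_check(base_lo: u32, width: u32) -> bool {
    ((base_lo & 0xFF) & (width - 1)) == 0
}

/// First address below `limit` on which the two tests disagree, if any.
pub fn first_divergence(width: u32, limit: u32) -> Option<u32> {
    (0..limit).find(|&a| router_check(a, width) != constraint_check(a, width))
}
```

For widths 1, 2, 4 and 8 no disagreement exists, because the mask never reaches past bit 2. For a hypothetical width of 512 the two tests first disagree at address 256, where LOW[0] is zero but the full word is not a multiple of 512.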

claude · 2026-03-13T19:40:27Z

No critical or high issues found. The column reduction (70->49 for MEMW, new 30-col MEMW_A), replacement of LT overflow checks with bit-column arithmetic, and aligned routing are all correct.

Security analysis:

Removed LT overflow checks (R1-R3): Replaced by add_limb_overflow[i] bit columns with IsBitConstraint::unconditional. Sound: wrong overflow bits produce incorrect memory bus tokens not matched by PAGE/REGISTER, so bus imbalance rejects the proof.

MEMW_A address overflow invariant: base_addr_lo + i < 2^32 for aligned accesses is mathematically correct (max 8-byte aligned base_addr_lo = 0xFFFFFFF8, +7 = 0xFFFFFFFF). But this relies on IS_HALFWORD + IS_BYTE + AND_BYTE range checks all holding simultaneously. See inline comment at memw_aligned.rs:322.

Alignment mask write2*1 + write4*3 + write8*7 is correct when flags are mutually exclusive. Exclusivity is enforced via CO24/CO25 bus interactions (must match CPU fingerprint), not local polynomial constraints.

Issues (inline comments posted):

Medium | memw_aligned.rs:322 | No-overflow invariant relies on 3 separate lookup interactions with no single algebraic proof
Low | memw_aligned.rs:803 | WRITE2/4/8 and IS_REGISTER lack local IsBit constraints; binary-ness relies solely on bus enforcement
Low | trace_builder.rs:766 | Routing logic duplicated in two build paths; router checks alignment on full 32-bit low word, constraint checks only LOW[0] byte

github-actions · 2026-03-13T19:46:32Z

Benchmark — fib_iterative_2M (median of 3)

Table parallelism: 32 (auto = cores / 3)

| Metric | main | PR | Δ |
|---|---|---|---|
| Peak heap | 122496 MB | 174522 MB | +52026 MB (+42.5%) 🔴 |
| Prove time | 46.765s | 33.763s | -13.002s (-27.8%) 🟢 |

⚠️ Regression detected — heap or time increased by more than 5%.

⚠️ Prove time spread: 11.3% (32.284s / 33.763s / 36.104s)
Consider re-running /bench

Memory Growth

Measured with TABLE_PARALLELISM=1 (sequential) · best of 2 samples per point

| Program | Steps | main (MB) | PR (MB) | Δ |
|---|---|---|---|---|
| fib_iterative_250k | 250k | 8761 | 7418 | -1343 MB (-15.3%) |
| fib_iterative_500k | 500k | 13956 | 14265 | +309 MB (+2.2%) |
| fib_iterative_1M | 1M | 20918 | 20720 | -198 MB (-0.9%) |
| fib_iterative_2M | 2M | 32812 | 29994 | -2818 MB (-8.6%) |

Growth rate: 12145 MB / 1M steps (main: 13379, Δ: -9.2%)
Fit: R² = 0.9563 (main: 0.9907)

✅ No significant change in memory scaling.
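The benchmark bot does not say how it computes the growth rate. As a sketch under that caveat, an ordinary least-squares line through the four points of the Memory Growth table (steps in millions, heap in MB) gives slopes that round to the reported 12145 and 13379 MB per 1M steps, and an R² for the PR column that rounds to the reported 0.9563. The function and constant names are hypothetical.

```rust
/// Ordinary least-squares fit y = slope * x + intercept.
/// Returns (slope, intercept, r_squared).
pub fn linear_fit(points: &[(f64, f64)]) -> (f64, f64, f64) {
    let n = points.len() as f64;
    let mean_x = points.iter().map(|p| p.0).sum::<f64>() / n;
    let mean_y = points.iter().map(|p| p.1).sum::<f64>() / n;
    let sxx: f64 = points.iter().map(|p| (p.0 - mean_x).powi(2)).sum();
    let syy: f64 = points.iter().map(|p| (p.1 - mean_y).powi(2)).sum();
    let sxy: f64 = points.iter().map(|p| (p.0 - mean_x) * (p.1 - mean_y)).sum();
    let slope = sxy / sxx;
    let intercept = mean_y - slope * mean_x;
    // For a simple linear fit, R^2 is the squared correlation coefficient.
    let r_squared = sxy * sxy / (sxx * syy);
    (slope, intercept, r_squared)
}

/// (steps in millions, heap in MB) copied from the Memory Growth table.
pub const PR_POINTS: [(f64, f64); 4] =
    [(0.25, 7418.0), (0.5, 14265.0), (1.0, 20720.0), (2.0, 29994.0)];
pub const MAIN_POINTS: [(f64, f64); 4] =
    [(0.25, 8761.0), (0.5, 13956.0), (1.0, 20918.0), (2.0, 32812.0)];
```

Calling `linear_fit(&PR_POINTS)` and `linear_fit(&MAIN_POINTS)` and comparing the two slopes gives the relative change in growth rate.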

Commit: 66ee8c6 · Baseline: cached · Runner: self-hosted bench

Use Vec::with_capacity for known interaction counts (26 and 22) to avoid reallocations. Remove unnecessary clone on last mu_sum use.
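A rough sketch of the two micro-optimizations named in this commit message. The `Interaction` type and `build_interactions` are hypothetical placeholders, since the real bus interaction type is not shown in this thread; only the counts 26 and 22 come from the message.

```rust
/// Interaction counts quoted in the commit message.
pub const MEMW_INTERACTIONS: usize = 26;
pub const MEMW_A_INTERACTIONS: usize = 22;

/// Hypothetical placeholder for a bus interaction.
#[derive(Debug, Clone, PartialEq)]
pub struct Interaction {
    pub id: usize,
    pub mu_sum: Vec<u64>,
}

/// Build `count` interactions that each own a copy of `mu_sum`.
pub fn build_interactions(count: usize, mu_sum: Vec<u64>) -> Vec<Interaction> {
    // The final length is known, so reserve it once instead of growing.
    let mut out = Vec::with_capacity(count);
    // Every use except the last needs its own clone.
    for id in 0..count.saturating_sub(1) {
        out.push(Interaction { id, mu_sum: mu_sum.clone() });
    }
    // The last use can take ownership, which saves one clone.
    if count > 0 {
        out.push(Interaction { id: count - 1, mu_sum });
    }
    out
}
```

`Vec::with_capacity(n)` guarantees room for at least `n` elements, so the pushes above never reallocate.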

diegokingston · 2026-03-13T20:14:58Z

/bench

diegokingston · 2026-03-13T20:35:54Z

/bench 10

- Remove IS_HALFWORD[mid] and IS_BYTE[low[1]] bus interactions from MEMW_A: these are spec assumptions (MEMW_A-A2, MEMW_A-A3) satisfied by the CPU's IS_BYTE range checks on its byte decomposition plus Memw bus propagation. MID and LOW[1] only appear inside the linear combination 2^16*MID + 2^8*LOW[1] + LOW[0], which the bus constrains to equal the CPU's base_address_0, so individual range checks are redundant. Saves 2 bus interactions per MEMW_A row.
- Remove corresponding IS_HALFWORD and IS_BYTE bitwise lookups from collect_bitwise_from_memw_aligned in trace_builder.
- Update stale LT comment labels in memw.rs: C7-C10 → MEMW-C4 through MEMW-C7 to match current spec numbering.

diegokingston · 2026-03-16T23:34:06Z

/bench

gabrielbosio · 2026-03-20T18:23:16Z

Closing, prove time improved but heap regressed too much.

claude Bot reviewed Mar 13, 2026

perf: pre-allocate bus interaction vectors in MEMW and MEMW_A

7265429

gabrielbosio closed this Mar 20, 2026

diegokingston deleted the feat/new-memw branch May 20, 2026 12:51

diegokingston wants to merge 3 commits into main from feat/new-memw
