Skip to content

feat(prover): slim MEMW table and add MEMW_A aligned fast path#424

Closed
diegokingston wants to merge 3 commits into
mainfrom
feat/new-memw
Closed

feat(prover): slim MEMW table and add MEMW_A aligned fast path#424
diegokingston wants to merge 3 commits into
mainfrom
feat/new-memw

Conversation

@diegokingston
Copy link
Copy Markdown
Collaborator

Implement the new MEMW spec (PR #398) with two major changes:

  1. MEMW slimdown (70→49 cols, 57→26 interactions):

    • Replace address_add[7] DWordHL (28 cols) with add_limb_overflow[7] Bit (7 cols), making address_add virtual via BusValue::linear
    • Remove 28 IS_HALFWORD interactions and 3 LT overflow checks
    • Replace AddConstraint pairs with IsBitConstraint::unconditional
  2. MEMW_A aligned fast path (30 cols, 22 interactions):

    • New table for aligned accesses where all bytes share the same old_timestamp (covers all register ops + most memory ops)
    • Address decomposed into high/mid/low parts
    • AND_BYTE interaction for alignment check
    • Single old_timestamp instead of per-byte

Operations are routed at trace build time: aligned ops with uniform timestamps go to MEMW_A, the rest go to the slimmed MEMW.

Implement the new MEMW spec (PR #398) with two major changes:

1. MEMW slimdown (70→49 cols, 57→26 interactions):
   - Replace address_add[7] DWordHL (28 cols) with add_limb_overflow[7]
     Bit (7 cols), making address_add virtual via BusValue::linear
   - Remove 28 IS_HALFWORD interactions and 3 LT overflow checks
   - Replace AddConstraint pairs with IsBitConstraint::unconditional

2. MEMW_A aligned fast path (30 cols, 22 interactions):
   - New table for aligned accesses where all bytes share the same
     old_timestamp (covers all register ops + most memory ops)
   - Address decomposed into high/mid/low parts
   - AND_BYTE interaction for alignment check
   - Single old_timestamp instead of per-byte

Operations are routed at trace build time: aligned ops with uniform
timestamps go to MEMW_A, the rest go to the slimmed MEMW.
@diegokingston
Copy link
Copy Markdown
Collaborator Author

/bench

@github-actions
Copy link
Copy Markdown

Codex Code Review

  1. Critical — Unconstrained carry bits let prover forge byte addresses in MEMW

    • In MEMW bus lookups, byte addresses 1..7 are now derived from add_limb_overflow[i] (lo = base_lo + i - 2^32*overflow, hi = base_hi + overflow) but constraints only enforce each overflow column is a bit.
    • There is no constraint tying add_limb_overflow[i] to the actual carry of base_address_0 + i, so a malicious prover can choose wrong carry bits and redirect memory bus tokens to incorrect addresses while still satisfying constraints.
    • References: memw.rs:333, memw.rs:409, memw.rs:995
    • Action: add algebraic constraints that enforce exact carry semantics for each i (or restore explicit address_add columns with ADD constraints).
  2. High — Overflow safety checks (R1–R3) were removed without equivalent enforcement

    • The previous overflow LT checks (base_address < base_address + offset) were removed from both trace construction and MEMW LT bus interactions; now only timestamp LT checks remain.
    • This drops explicit non-overflow guarantees for multi-byte unaligned accesses near address boundaries and changes soundness assumptions.
    • References: trace_builder.rs:696, trace_builder.rs:748, memw.rs:754
    • Action: restore R1–R3 checks (or add equivalent constraints proving no illegal wraparound).

Assumption

  • This review assumes the memory/address model still requires strict carry correctness and overflow safety as in prior MEMW behavior. If spec intentionally changed, that needs explicit proof-level justification plus tests covering boundary addresses.

]);

// CM18/19: byte 1 with w2
// For aligned accesses, adding 1 to the low byte never overflows to hi word
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — Overflow safety invariant depends on three independent range checks

The claim "alignment guarantees base_address_lo + width-1 < 2^32" holds, but only when all three lookup-enforced range checks are satisfied simultaneously:

  • IS_HALFWORD(MID)MID ∈ [0, 2^16)
  • IS_BYTE(LOW[1])LOW[1] ∈ [0, 256)
  • AND_BYTE(LOW[0], mask)LOW[0] ∈ [0, 256) (implicit byte range) + alignment

With these, base_addr_lo ≤ 0xFFFFFFF8 for width=8 aligned (max LOW[0]=248), so +7 ≤ 0xFFFFFFFF. The property is correct, but there's no single algebraic constraint encoding it — it emerges from three separate bus interactions. If any of those range checks were ever relaxed or skipped (e.g., due to future refactoring of the Bitwise table), this no-overflow guarantee would silently break.

Consider adding a comment that explicitly names all three preconditions this relies on, to make the invariant auditable in isolation.

}

/// Creates all constraints for the MEMW_A table (2 total).
pub fn constraints() -> Vec<Box<dyn TransitionConstraint<GoldilocksField, GoldilocksExtension>>> {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — No IS_BIT constraints for WRITE2, WRITE4, WRITE8, IS_REGISTER

MEMW_A has 2 algebraic constraints vs MEMW's 9 (which includes 7 IsBitConstraint::unconditional for ADD_LIMB_OVERFLOW). The write flags and IS_REGISTER here are only constrained to be binary via bus interactions (CO24/CO25 must match what CPU sends), not by local polynomial constraints.

This is consistent with the original MEMW design (write flags weren't IS_BIT-constrained there either), and the bus-level enforcement is sound. But it's worth noting that the AND_BYTE alignment mask write2*1 + write4*3 + write8*7 behaves correctly only if the flags are mutually exclusive and binary — a property that's enforced globally (via the bus) but not locally. Future auditors reading this table in isolation may miss this dependency.

/// An operation is aligned if:
/// 1. For width > 1: base_address is aligned to width (low bits are zero)
/// 2. All accessed bytes share the same old_timestamp
fn is_aligned_op(op: &MemwOperation) -> bool {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — Routing logic is duplicated in two build paths

is_aligned_op + the partition call appear identically in both Traces::from_logs and the parallel build path. If the routing criteria ever change (e.g., a new alignment condition is added), both copies need to stay in sync — there's no compile-time guarantee they will.

Also note: the routing check here tests alignment on the full 32-bit low word (low & (width-1)), while MEMW_A's constraint tests only LOW[0] = addr & 0xFF. These are equivalent since masks ≤ 7 < 256, but they're expressed differently in two separate places. If width ever grows beyond 256, the two diverge.

@claude
Copy link
Copy Markdown

claude Bot commented Mar 13, 2026

No critical or high issues found. The column reduction (70->49 for MEMW, new 30-col MEMW_A), replacement of LT overflow checks with bit-column arithmetic, and aligned routing are all correct.

Security analysis:

Removed LT overflow checks (R1-R3): Replaced by add_limb_overflow[i] bit columns with IsBitConstraint::unconditional. Sound: wrong overflow bits produce incorrect memory bus tokens not matched by PAGE/REGISTER, so bus imbalance rejects the proof.

MEMW_A address overflow invariant: base_addr_lo + i < 2^32 for aligned accesses is mathematically correct (max 8-byte aligned base_addr_lo = 0xFFFFFFF8, +7 = 0xFFFFFFFF). But this relies on IS_HALFWORD + IS_BYTE + AND_BYTE range checks all holding simultaneously. See inline comment at memw_aligned.rs:322.

Alignment mask write21 + write43 + write8*7 is correct when flags are mutually exclusive. Exclusivity is enforced via CO24/CO25 bus interactions (must match CPU fingerprint), not local polynomial constraints.

Issues (inline comments posted):

  • Medium | memw_aligned.rs:322 | No-overflow invariant relies on 3 separate lookup interactions with no single algebraic proof
  • Low | memw_aligned.rs:803 | WRITE2/4/8 and IS_REGISTER lack local IsBit constraints; binary-ness relies solely on bus enforcement
  • Low | trace_builder.rs:766 | Routing logic duplicated in two build paths; router checks alignment on full 32-bit low word, constraint checks only LOW[0] byte

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 13, 2026

Benchmark — fib_iterative_2M (median of 3)

Table parallelism: 32 (auto = cores / 3)

Metric main PR Δ
Peak heap 122496 MB 174522 MB +52026 MB (+42.5%) 🔴
Prove time 46.765s 33.763s -13.002s (-27.8%) 🟢

⚠️ Regression detected — heap or time increased by more than 5%.

⚠️ Prove time spread: 11.3% (32.284s / 33.763s / 36.104s)
Consider re-running /bench

Memory Growth

Measured with TABLE_PARALLELISM=1 (sequential) · best of 2 samples per point

Program Steps main (MB) PR (MB) Δ
fib_iterative_250k 250k 8761 7418 -1343 MB (-15.3%)
fib_iterative_500k 500k 13956 14265 +309 MB (+2.2%)
fib_iterative_1M 1M 20918 20720 -198 MB (-0.9%)
fib_iterative_2M 2M 32812 29994 -2818 MB (-8.6%)

Growth rate: 12145 MB / 1M steps (main: 13379, Δ: -9.2%)
Fit: R² = 0.9563 (main: 0.9907)

✅ No significant change in memory scaling.

Commit: 66ee8c6 · Baseline: cached · Runner: self-hosted bench

Use Vec::with_capacity for known interaction counts (26 and 22)
to avoid reallocations. Remove unnecessary clone on last mu_sum use.
@diegokingston
Copy link
Copy Markdown
Collaborator Author

/bench

@diegokingston
Copy link
Copy Markdown
Collaborator Author

/bench 10

- Remove IS_HALFWORD[mid] and IS_BYTE[low[1]] bus interactions from
  MEMW_A: these are spec assumptions (MEMW_A-A2, MEMW_A-A3) satisfied
  by the CPU's IS_BYTE range checks on its byte decomposition plus
  Memw bus propagation. MID and LOW[1] only appear inside the linear
  combination 2^16*MID + 2^8*LOW[1] + LOW[0], which the bus constrains
  to equal the CPU's base_address_0, so individual range checks are
  redundant. Saves 2 bus interactions per MEMW_A row.
- Remove corresponding IS_HALFWORD and IS_BYTE bitwise lookups from
  collect_bitwise_from_memw_aligned in trace_builder.
- Update stale LT comment labels in memw.rs: C7-C10 → MEMW-C4
  through MEMW-C7 to match current spec numbering.
@diegokingston
Copy link
Copy Markdown
Collaborator Author

/bench

@gabrielbosio
Copy link
Copy Markdown
Collaborator

Closing, prove time improved but heap regressed too much.

@diegokingston diegokingston deleted the feat/new-memw branch May 20, 2026 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants