spec: ARE_BYTES #532

Merged
erik-3milabs merged 9 commits into spec/main from spec/are_bytes on Apr 28, 2026

Conversation

@erik-3milabs
Collaborator

@erik-3milabs erik-3milabs commented Apr 23, 2026

i) introduces the ARE_BYTES lookup,
ii) introduces the IS_BYTE<X, μ> template that maps to ARE_BYTES[0, X]/μ,
iii) removes the IS_BYTE lookup.

@erik-3milabs erik-3milabs self-assigned this Apr 23, 2026
@erik-3milabs erik-3milabs added the spec Updates and improvements to the spec document label Apr 23, 2026
@github-actions

Codex Code Review

  1. Medium – IS_BYTE was redefined in the spec without the corresponding runtime/prover change.
    In spec/src/signatures.toml, spec/src/is_byte.toml, and spec/src/bitwise.toml, this PR turns IS_BYTE from a 1-argument interaction into a template over a new 2-byte ARE_BYTES bus and adds a new BITWISE multiplicity column μ_ARE_BYTES. That is not just documentation churn; it changes the lookup contract and BITWISE column layout. As written, the spec no longer matches the current prover code path, which still models IS_BYTE as its own single-byte lookup and hard-codes the old BITWISE layout. This should either land with the prover/trace-builder update in the same PR, or stay as IS_BYTE until that implementation change is ready.

No direct security issues stood out in the diff beyond that compatibility/correctness concern.

Comment thread spec/src/bitwise.toml
@@ -101,6 +101,11 @@ name = "μ_IS_BYTE"
type = "BaseField"
desc = ""

Medium – Dead multiplicity variable

μ_IS_BYTE is no longer referenced in any constraint in this file. Since IS_BYTE is now a template (not an interaction), BITWISE no longer contributes to an IS_BYTE interaction, so this multiplicity column serves no purpose. It should be removed.

Suggested change
desc = ""

Comment thread spec/src/is_byte.toml
[[constraints.all]]
kind = "interaction"
tag = "ARE_BYTES"
input = [0, "X"]

Low – Constant 0 consumes a BITWISE row slot unnecessarily

IS_BYTE<X, μ> expands to ARE_BYTES[0, X], always using the first byte slot as the constant 0. This means every IS_BYTE invocation occupies a full BITWISE row while only constraining one variable.

This is by design (the template exists for chips with an odd number of byte range-checks), but it's worth noting in the spec that the "waste" is intentional and that implementors should prefer pairing directly into ARE_BYTES[X, Y] whenever possible, as is_byte.typ already encourages.

No code change required — just flagging for visibility.

Collaborator Author

that implementors should prefer pairing directly into ARE_BYTES[X, Y] whenever possible ...

This is exactly what is written in the IS_BYTE template description.

@claude

claude Bot commented Apr 23, 2026

Review: spec: ARE_BYTES

Overview

This PR refactors byte range-checking in the spec:

  • Introduces ARE_BYTES[X, Y] — a single BITWISE interaction that range-checks two bytes at once.
  • Replaces the old IS_BYTE lookup (interaction) with an IS_BYTE<X, μ> template that expands to ARE_BYTES[0, X], for chips that need an odd number of byte checks.
  • Updates all callers (CPU, BRANCH, PAGE, SHA256, KECCAK_RND, etc.) to use the new template signature.

The design intent is clear and the optimization rationale (halving interaction count for chips with even numbers of byte checks) is well-motivated.
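
To make the interaction-count claim concrete, here is a minimal Python sketch that is not taken from the PR. It assumes a hypothetical helper, `pair_byte_checks`, that greedily pairs a chip's byte range-checks into ARE_BYTES[X, Y] tuples and falls back to the IS_BYTE expansion ARE_BYTES[0, X] for an odd leftover. The column names are placeholders.

```python
from typing import List, Tuple, Union


def pair_byte_checks(values: List[str]) -> List[Tuple[Union[int, str], str]]:
    """Pair byte range-checks into ARE_BYTES[X, Y] interactions.

    Hypothetical helper for illustration only: `values` are column names.
    An odd leftover is emitted as (0, X), mirroring the template expansion
    IS_BYTE<X> -> ARE_BYTES[0, X].
    """
    pairs: List[Tuple[Union[int, str], str]] = []
    # Walk the list two entries at a time, stopping before a lone last entry.
    for i in range(0, len(values) - 1, 2):
        pairs.append((values[i], values[i + 1]))
    # A single unpaired value takes the constant 0 as its partner.
    if len(values) % 2 == 1:
        pairs.append((0, values[-1]))
    return pairs
```

Under this pairing, n byte checks cost ceil(n / 2) interactions, so only chips with an odd number of checks pay for one padded ARE_BYTES[0, X].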


Issues

Medium – μ_IS_BYTE is dead code in bitwise.toml

μ_IS_BYTE remains as a multiplicity variable but is no longer referenced in any constraint. Since IS_BYTE is now a template (not an interaction), BITWISE no longer contributes to an IS_BYTE interaction row. The column should be removed. See inline comment.

Low – "deprecates" vs "removes" the IS_BYTE lookup

The PR description says "deprecates the IS_BYTE lookup", but signatures.toml removes the interaction signature entirely — it is not deprecated, it is replaced. Callers that previously used IS_BYTE as an interaction would break. A clarifying note in the PR description or a comment in signatures.toml would help downstream implementors.

Low – No dedicated doc page for ARE_BYTES

ARE_BYTES is now a first-class interaction usable directly by chips, but it only has a one-line comment in signatures.toml. Other interactions of similar prominence (MEMW, BRANCH, etc.) have dedicated .typ doc pages. Without one, the interface is less discoverable for chips that could merge pairs of IS_BYTE calls directly into ARE_BYTES.


Correctness

No concerns. The expansion IS_BYTE<X, μ> → ARE_BYTES[0, X] is sound: 0 is trivially a valid byte, so the BITWISE table always has a row (0, X) for any X ∈ [0, 255], and the LogUp argument correctly enforces the range check on X. All callers reviewed (CPU, BRANCH, PAGE, SHA256*, KECCAK_RND) use the template correctly.
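
The argument above can be sketched in Python as a toy model. Two assumptions are mine rather than the spec's: the ARE_BYTES table is modelled as the set of all byte pairs, and a plain multiset comparison stands in for the LogUp argument, which in reality compares sums of field fractions. The function names are hypothetical.

```python
from collections import Counter

# Assumption: the table holds every pair (a, b) with a, b in [0, 255].
TABLE = {(a, b) for a in range(256) for b in range(256)}


def is_byte_via_are_bytes(x: int) -> bool:
    """Model IS_BYTE<X> as the lookup ARE_BYTES[0, X]."""
    return (0, x) in TABLE


def lookups_balance(lookups, multiplicities: Counter) -> bool:
    """Toy stand-in for LogUp: every looked-up tuple must be a table row,
    and the provider's multiplicities must equal the lookup counts."""
    counts = Counter(lookups)
    return all(row in TABLE for row in counts) and counts == multiplicities
```

In this model a value outside [0, 255] has no matching row, so no choice of multiplicities can balance a lookup of (0, X) for such an X.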

Collaborator

@RobinJadoul RobinJadoul left a comment

Let's make an issue to manually add the batching to the spec once we're done with the rework of constraint numbering?

Comment thread spec/src/is_byte.toml Outdated
Comment on lines +8 to +11
[[variables.multiplicity]]
name = "μ"
type = "BaseField"
desc = ""
Collaborator

This should probably be a cond instead?

Collaborator Author

During development I initially named it cond, but then switched to mu to highlight that the multiplicity should be the second parameter: because the interface is IS_BYTE<BaseField, BaseField>, it will accept X and cond being swapped without complaint.

Collaborator Author

Alternatively, we could stick with it being a multiplicity, and have this be the first template which has a multiplicity (?)

Collaborator

Why do we want to pass it as a parameter and not a template cond though?
It leading to an actual multiplicity seems no different from how NEG's cond leads to a multiplicity for ZERO to me. And just having IS_BYTE<x> vs IS_BYTE<x, 1> looks a lot more readable to me.

Collaborator Author

And just having IS_BYTE<x> vs IS_BYTE<x, 1> looks a lot more readable to me.

I completely agree with you.

It leading to an actual multiplicity seems no different from how NEG's cond leads to a multiplicity for ZERO to me.

There would be a difference: NEG's cond is assumed to be a Bit, whereas that would not be the case here. This is what was stopping me previously, but I agree that cond => IS_BYTE<X> is much easier on the eyes.

@erik-3milabs
Collaborator Author

Let's make an issue to manually add the batching to the spec once we're done with the rework of constraint numbering?

I don't think we should batch them in the spec:

  1. No added clarity: Whoever reads the spec and decides to build a VM out of it, will have to read the IS_BYTE template section anyway to see how to expand the template. There they'll read about the optimization right away.
  2. Readability-optimization balance: when batching the IS_BYTEs present in each chip, we'll inevitably (now or in the future) find cases where IS_BYTEs located in different constraint sections should be combined for optimal performance. At that point, we'll have to decide to
    i. move one of the constraints to the other section, harming readability, or
    ii. leave it be, suggesting to the reader that this is an exceptional case where batching should NOT be performed (thus harming performance)
  3. Induced ordering: batching inside the spec means hardcoding which columns should be combined. However, perhaps there might be implementation reasons to batch in a different manner. In that case, one would have to deviate from the spec to gain performance.
  4. Constraint blowup: while batching will improve conciseness in some places, others (especially iterated interactions) will become less readable. E.g., x∈[0, 3], y∈[0,7]: IS_BYTE<A[x][y]> would be morphed into x∈[0, 3], y∈[0,3]: ARE_BYTES<A[x][2y], A[x][2y+1]>, which I find harder to parse.
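
The index remapping in point 4 can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and A is treated as a 4×8 grid of cells identified by their (x, y) indices.

```python
def is_byte_indices(rows: int = 4, cols: int = 8):
    """x in [0, 3], y in [0, 7]: one IS_BYTE<A[x][y]> check per cell."""
    return [(x, y) for x in range(rows) for y in range(cols)]


def are_bytes_indices(rows: int = 4, cols: int = 8):
    """x in [0, 3], y in [0, 3]: one check per pair A[x][2y], A[x][2y+1]."""
    return [
        ((x, 2 * y), (x, 2 * y + 1))
        for x in range(rows)
        for y in range(cols // 2)
    ]
```

Both forms cover the same 32 cells; the batched form uses 16 interactions instead of 32, at the cost of the 2y and 2y+1 index arithmetic.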

@RobinJadoul
Collaborator

I don't disagree with those arguments, my main reason to do it despite those would be to keep the clear correspondence between spec and implementation stricter.

@erik-3milabs
Collaborator Author

my main reason to do it despite those would be to keep the clear correspondence between spec and implementation stricter.

Let us note that it would only keep open the possibility of a stricter correspondence between spec and implementation: increasing the level of detail in the spec does NOT necessarily mean that the devs will (eternally) ensure the implementation matches the spec. I would argue that sticking with IS_BYTEs rather than ARE_BYTES will actually lead to an overall stricter correspondence between spec and implementation.

But I'll bite: how do you propose we resolve the dilemma I spelled out in the second point?

@RobinJadoul
Collaborator

I still agree with those points, and if indeed we'd end up saying the spec should reflect the batching, we'd have to make sacrifices.
I'm not necessarily trying to argue this is what we should do, just trying to bring it up for consideration.

I think we can probably table it for now, and keep it implicit. If at any point in the future we have reason to reconsider (e.g. we notice issues, or start extracting code from the spec that we want to keep simple) we can deal with the readability then.

@erik-3milabs
Collaborator Author

It is now listed as issue #571

@erik-3milabs erik-3milabs merged commit 6872537 into spec/main Apr 28, 2026
2 checks passed
@erik-3milabs erik-3milabs deleted the spec/are_bytes branch April 28, 2026 15:22

Labels

spec Updates and improvements to the spec document


2 participants