Skip to content

perf: skip redundant bit-reverse pair in R4 deep-composition LDE#566

Merged
MauroToscano merged 5 commits into
mainfrom
drop-redundant-bitrev-r4-lde
Apr 28, 2026
Merged

perf: skip redundant bit-reverse pair in R4 deep-composition LDE#566
MauroToscano merged 5 commits into
mainfrom
drop-redundant-bitrev-r4-lde

Conversation

@nicole-graus
Copy link
Copy Markdown
Collaborator

Optimization extracted from PR #545.

The R4 deep-composition LDE ran evaluate_fft (ends with a natural-order permute) followed by
in_place_bit_reverse_permute to feed FRI's commit phase, which already expects bit-reversed
evaluations. The two permutes cancel out.

This PR adds Polynomial::evaluate_fft_bit_reversed (same as evaluate_fft but skips the final
permute) and switches the R4 LDE site to use it, dropping both cancelling permutes.

@github-actions
Copy link
Copy Markdown

Codex Code Review

No issues found in the PR diff.

I checked the changed FFT ordering path: evaluate_fft_bit_reversed matches the previous behavior of evaluate_fft(...) followed by in_place_bit_reverse_permute(...), and the FRI commit path already expects bit-reversed evaluations.

Verification note: I attempted targeted cargo test runs, but they could not start because rustup tried to write under /home/runner/.rustup/tmp, which is read-only in this environment.

Comment thread crypto/math/src/fft/polynomial.rs
@claude
Copy link
Copy Markdown

claude Bot commented Apr 27, 2026

Review: perf: skip redundant bit-reverse pair in R4 deep-composition LDE

The optimization is correct. The Bowers DIF FFT kernel leaves data in bit-reversed order; evaluate_fft_cpu immediately applies in_place_bit_reverse_permute to restore natural order. The old prover code then re-applied in_place_bit_reverse_permute before handing the vector to commit_phase_from_evaluations, so the two permutes cancelled. The new evaluate_fft_bit_reversed skips both, saving O(N) work per proof.

One real issue

Medium – No test for evaluate_fft_bit_reversed (see inline comment)

This is a new public cryptographic API and the prover's soundness depends on it returning exactly bit-reversed evaluations. A wrong ordering would silently produce invalid proofs without panicking. A single round-trip test would close this gap:

let mut expected = Polynomial::evaluate_fft::<F>(&poly, 1, Some(n)).unwrap();
in_place_bit_reverse_permute(&mut expected);
let got = Polynomial::evaluate_fft_bit_reversed::<F>(&poly, 1, Some(n)).unwrap();
assert_eq!(got, expected);

Minor observations (no action required)

  • The validation block is duplicated verbatim from evaluate_fft. Fine as-is; a shared helper would add complexity without benefit at this scale.
  • The inline comment in prover.rs explains the why clearly.

@nicole-graus
Copy link
Copy Markdown
Collaborator Author

/bench 10

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 27, 2026

Benchmark — fib_iterative_8M (median of 3)

Table parallelism: 1

Metric main PR Δ
Peak heap 49533 MB 49533 MB +0 MB (+0.0%) ⚪
Prove time 70.222s 68.544s -1.678s (-2.4%) ⚪

✅ No significant change.

✅ Low variance (time: 0.7%, heap: 0.0%)

Commit: bc40d33 · Baseline: built from main · Runner: self-hosted bench

@nicole-graus
Copy link
Copy Markdown
Collaborator Author

/bench k=1

@nicole-graus nicole-graus marked this pull request as ready for review April 27, 2026 20:35
@github-actions
Copy link
Copy Markdown

Codex Code Review

No issues found in the PR diff.

The new evaluate_fft_bit_reversed matches the existing evaluate_fft path minus the final bit-reversal, and the STARK prover call site already wants bit-reversed evaluations for FRI folding.

I attempted the targeted test:

cargo test -p math test_fft_bit_reversed_matches_evaluate_fft_then_permute

but it could not run because Rustup tried to write under /home/runner/.rustup/tmp and the filesystem there is read-only.

Comment thread crypto/math/src/fft/polynomial.rs Outdated
Comment thread crypto/math/src/tests/fft_tests.rs Outdated
Comment thread crypto/math/src/fft/polynomial.rs
@claude
Copy link
Copy Markdown

claude Bot commented Apr 27, 2026

Review: perf: skip redundant bit-reverse pair in R4 deep-composition LDE

The optimization is correct and the property-based test validates it well. Three issues worth addressing:

Medium

  • Missing power-of-two validation on len in evaluate_fft_bit_reversed. The existing evaluate_fft path catches non-power-of-two sizes via evaluate_fft_cpu's guard; this new function bypasses that and calls dispatch_fft directly. With a non-power-of-two blowup_factor (e.g. 3) the buffer size won't be a power of two and the FFT will silently produce garbage. (See inline comment.)

Low

  • Spurious prop_filter in the test — poly.coeff_len().is_power_of_two() is redundant since both functions pad via next_power_of_two(). It reduces proptest coverage for no benefit. (See inline comment.)
  • Code duplication — the new function re-implements the validation + twiddle-setup from evaluate_fft_cpu with only the final permute removed. A small private helper or a permute: bool flag on the internal path would keep this in one place. (See inline comment.)

@MauroToscano MauroToscano added this pull request to the merge queue Apr 28, 2026
Merged via the queue into main with commit 1998ac1 Apr 28, 2026
10 checks passed
@MauroToscano MauroToscano deleted the drop-redundant-bitrev-r4-lde branch April 28, 2026 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants