Merkle cache reads and skip R4 permute #547
Changes from all commits
a3f58fc
ffcb572
9555385
74587c6
b6f798b
8266846
1fe9a22
d7b0dd3
d2cb6ea
8c4b9c7
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -361,10 +361,11 @@ pub trait IsStarkProver< | |
| /// Builds a Merkle tree commitment from column-major LDE evaluations with | ||
| /// bit-reverse permutation, without cloning the full evaluation matrix. | ||
| /// | ||
| - /// For each row index `i`, we hash `col_0[br(i)] || col_1[br(i)] || ...` | ||
| - /// where `br(i)` is the bit-reversal of `i`. This produces the same Merkle | ||
| - /// tree as the old clone + bit-reverse + columns2rows + batch_commit flow, | ||
| - /// but avoids allocating the cloned and transposed matrices entirely. | ||
| + /// Hashes `col_0[k] || col_1[k] || ...` for k = 0..num_rows (sequential column | ||
| + /// reads, cache-friendly), then permutes the hash vector in bit-reversed order | ||
| + /// so leaves[i] = hash(col_0[br(i)] || col_1[br(i)] || ...). Same Merkle tree | ||
| + /// as reading at br(row_idx) inside the hashing loop, but the scattered column | ||
| + /// access is replaced by a single small bit-reverse pass over 32-byte digests. | ||
| fn commit_columns_bit_reversed<E>( | ||
| columns: &[Vec<FieldElement<E>>], | ||
| ) -> Option<(BatchedMerkleTree<E>, Commitment)> | ||
|
|
@@ -392,21 +393,20 @@ pub trait IsStarkProver< | |
| #[cfg(not(feature = "parallel"))] | ||
| let iter = 0..num_rows; | ||
|
|
||
| // One allocation per row (was one per field element): write all columns | ||
| // into a single buffer, then hash once. | ||
| - let hashed_leaves: Vec<Commitment> = iter | ||
| - .map(|row_idx| { | ||
| - let br_idx = reverse_index(row_idx, num_rows as u64); | ||
| + let mut hashed_leaves: Vec<Commitment> = iter | ||
| + .map(|k| { | ||
| let total_bytes = num_cols * byte_len; | ||
| let mut buf = vec![0u8; total_bytes]; | ||
| for col_idx in 0..num_cols { | ||
| - columns[col_idx][br_idx] | ||
| + columns[col_idx][k] | ||
| .write_bytes_be(&mut buf[col_idx * byte_len..(col_idx + 1) * byte_len]); | ||
| } | ||
| BatchedMerkleTreeBackend::<E>::hash_bytes(&buf) | ||
| }) | ||
| .collect(); | ||
|
|
||
| + in_place_bit_reverse_permute(&mut hashed_leaves); | ||
|
Low – Sequential permute after parallel hashing: when the `parallel` feature is enabled, the leaves are hashed in parallel, but `in_place_bit_reverse_permute` then runs sequentially over the hash vector. The bit-reverse permutation is a small, branch-heavy scatter/gather that parallelises poorly at typical domain sizes, so this is only worth addressing if profiling shows it as a hot spot. Just noting it so the trade-off is explicit. |
||
|
|
||
| let tree = BatchedMerkleTree::<E>::build_from_hashed_leaves(hashed_leaves)?; | ||
| let root = tree.root; | ||
|
gabrielbosio marked this conversation as resolved.
|
||
| Some((tree, root)) | ||
|
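To make the equivalence claimed in the updated doc comment concrete, here is a minimal, self-contained sketch. It uses `u64` values in place of field elements and `DefaultHasher` in place of the Merkle backend hash. The functions `reverse_index` and `in_place_bit_reverse_permute` below are local stand-ins written for this example, not the library's implementations. It assumes a non-empty column slice, equal-length columns, and a power-of-two row count.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Reverses the low log2(n) bits of `i`. Assumes `n` is a power of two.
fn reverse_index(i: usize, n: usize) -> usize {
    if n <= 1 {
        return i; // avoids shifting by the full bit width
    }
    i.reverse_bits() >> (usize::BITS - n.trailing_zeros())
}

/// Swaps each element with its bit-reversed partner exactly once.
fn in_place_bit_reverse_permute<T>(v: &mut [T]) {
    let n = v.len();
    for i in 0..n {
        let j = reverse_index(i, n);
        if i < j {
            v.swap(i, j);
        }
    }
}

/// Stand-in row hash (NOT cryptographic): hashes entry `k` of every column.
fn hash_row(columns: &[Vec<u64>], k: usize) -> u64 {
    let mut h = DefaultHasher::new();
    for col in columns {
        col[k].hash(&mut h);
    }
    h.finish()
}

/// Old flow: read every column at br(i) inside the hashing loop.
fn leaves_scattered(columns: &[Vec<u64>]) -> Vec<u64> {
    let n = columns[0].len();
    (0..n)
        .map(|i| hash_row(columns, reverse_index(i, n)))
        .collect()
}

/// New flow: hash rows in natural order, then permute the digests.
fn leaves_sequential(columns: &[Vec<u64>]) -> Vec<u64> {
    let n = columns[0].len();
    let mut leaves: Vec<u64> = (0..n).map(|k| hash_row(columns, k)).collect();
    in_place_bit_reverse_permute(&mut leaves);
    leaves
}
```

For any input meeting those assumptions, `leaves_scattered(&columns)` and `leaves_sequential(&columns)` return the same vector. The second variant reads the columns sequentially and confines the scattered access to one pass over the digests.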
|
@@ -1081,9 +1081,12 @@ pub trait IsStarkProver< | |
| let t_sub = Instant::now(); | ||
| let deep_poly = | ||
| Polynomial::interpolate_fft::<Field>(&deep_evals).expect("iFFT should succeed"); | ||
| - let mut lde_evals = Polynomial::evaluate_fft::<Field>(&deep_poly, 1, Some(domain_size)) | ||
| - .expect("FFT should succeed"); | ||
| - in_place_bit_reverse_permute(&mut lde_evals); | ||
| + // FRI commit_phase consumes bit-reversed evaluations natively. Request them | ||
| + // directly from evaluate_fft_bit_reversed to avoid a pair of redundant permutes | ||
| + // (evaluate_fft's internal natural-order permute + an external re-bit-reverse). | ||
| + let lde_evals = | ||
| + Polynomial::evaluate_fft_bit_reversed::<Field>(&deep_poly, 1, Some(domain_size)) | ||
| + .expect("FFT should succeed"); | ||
| #[cfg(feature = "instruments")] | ||
| let r4_fft_dur = t_sub.elapsed(); | ||
|
|
||
|
|
||
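The saving in this hunk relies on the bit-reverse permutation being its own inverse: a natural-order permute inside the FFT followed by an external bit-reverse cancels out. A small sketch of that property, using a local stand-in permute (not the library function) and assuming a power-of-two length:

```rust
/// Reverses the low log2(n) bits of `i`. Assumes `n` is a power of two.
fn reverse_index(i: usize, n: usize) -> usize {
    if n <= 1 {
        return i; // avoids shifting by the full bit width
    }
    i.reverse_bits() >> (usize::BITS - n.trailing_zeros())
}

/// Stand-in for an in-place bit-reverse permutation.
fn bit_reverse_permute<T>(v: &mut [T]) {
    let n = v.len();
    for i in 0..n {
        let j = reverse_index(i, n);
        if i < j {
            v.swap(i, j);
        }
    }
}

/// Applies the permutation twice; the result equals the input.
fn permute_twice(mut v: Vec<u32>) -> Vec<u32> {
    bit_reverse_permute(&mut v);
    bit_reverse_permute(&mut v);
    v
}
```

Because the two passes cancel, producing the bit-reversed output directly saves two full passes over the evaluation vector.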
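Low – `blowup_factor = 0` produces a confusing error
When `blowup_factor = 0`, `len = 0`. On a 64-bit target `0usize.trailing_zeros()` returns `64`, so the first guard fires and returns `DomainSizeError(64)`, a misleading message for a caller that passed a zero blowup. `evaluate_fft` has the same behaviour, so this isn't a regression, but since `evaluate_fft_bit_reversed` is a new public API it's a good place to add an early guard: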
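The reviewer's suggested change is not included in this capture. As an illustration only, a guard of this kind might look like the sketch below. The error enum, the `ZeroBlowupFactor` variant, the `checked_domain_order` helper, and the `max_order` parameter are all hypothetical names for this example, not the library's API. It assumes `blowup_factor` is otherwise a power of two and that the domain length is the padded coefficient count times the blowup.

```rust
/// Hypothetical error type for this sketch; not the library's actual enum.
#[derive(Debug, PartialEq)]
enum FftSketchError {
    ZeroBlowupFactor,
    DomainSizeError(usize),
}

/// Hypothetical helper: returns log2 of the evaluation domain length,
/// rejecting a zero blowup before the domain-size check can misreport it.
fn checked_domain_order(
    num_coeffs: usize,
    blowup_factor: usize,
    max_order: usize,
) -> Result<usize, FftSketchError> {
    // Early guard: without it, len == 0 and trailing_zeros() == usize::BITS,
    // which surfaces as a domain-size error unrelated to the real cause.
    if blowup_factor == 0 {
        return Err(FftSketchError::ZeroBlowupFactor);
    }
    let len = num_coeffs.next_power_of_two() * blowup_factor;
    let order = len.trailing_zeros() as usize;
    if order > max_order {
        return Err(FftSketchError::DomainSizeError(order));
    }
    Ok(order)
}
```

With the guard in place, a zero blowup is reported as its own error, while valid inputs such as 8 coefficients with a blowup of 4 still yield the expected order.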
blowup_factor = 0,len = 0. On a 64-bit target0usize.trailing_zeros()returns64, so the first guard fires and returnsDomainSizeError(64)— a misleading message for a caller that passed a zero blowup.evaluate_ffthas the same behaviour, so this isn't a regression, but sinceevaluate_fft_bit_reversedis a new public API it's a good place to add an early guard: