perf: skip Rayon on small Merkle tree levels#550
Conversation
|
/bench 10 |
|
|
||
| iter.map(|leaf| Self::hash_data(leaf)).collect() | ||
| { | ||
| if unhashed_leaves.len() >= 1024 { |
There was a problem hiding this comment.
Low — Magic constant duplicated across files
1024 is also hardcoded in utils.rs:86. If the threshold is ever tuned, it must be changed in both places. Define a shared constant (e.g. in utils.rs or a new consts.rs) and reference it here:
| if unhashed_leaves.len() >= 1024 { | |
| if unhashed_leaves.len() >= PARALLEL_THRESHOLD { |
| .into_par_iter() | ||
| .zip(children_iter.par_chunks_exact(2)); | ||
| { | ||
| if new_level_length >= 1024 { |
There was a problem hiding this comment.
Low — Same magic constant as in traits.rs
Same threshold value should come from one named constant so the two sites stay in sync:
| if new_level_length >= 1024 { | |
| if new_level_length >= PARALLEL_THRESHOLD { |
| } else { | ||
| new_level_iter | ||
| .iter_mut() | ||
| .zip(children_iter.chunks_exact(2)) | ||
| .for_each(|(new_parent, children)| { | ||
| *new_parent = B::hash_new_parent(&children[0], &children[1]); | ||
| }); | ||
| } | ||
| } | ||
| #[cfg(not(feature = "parallel"))] | ||
| let parent_and_children_zipped_iter = | ||
| new_level_iter.iter_mut().zip(children_iter.chunks_exact(2)); | ||
|
|
||
| parent_and_children_zipped_iter.for_each(|(new_parent, children)| { | ||
| *new_parent = B::hash_new_parent(&children[0], &children[1]); | ||
| }); | ||
| { | ||
| new_level_iter | ||
| .iter_mut() | ||
| .zip(children_iter.chunks_exact(2)) | ||
| .for_each(|(new_parent, children)| { | ||
| *new_parent = B::hash_new_parent(&children[0], &children[1]); | ||
| }); | ||
| } |
There was a problem hiding this comment.
Low — Sequential logic duplicated
The else branch (lines 93–100) and the #[cfg(not(feature = "parallel"))] block (lines 102–110) are byte-for-byte identical. If the sequential logic ever changes it must be updated twice. Extract into a small inline helper or macro, e.g.:
fn build_level_sequential<B: IsMerkleTreeBackend>(
new_level_iter: &mut [B::Node],
children_iter: &[B::Node],
) where B::Node: Clone {
new_level_iter
.iter_mut()
.zip(children_iter.chunks_exact(2))
.for_each(|(p, c)| *p = B::hash_new_parent(&c[0], &c[1]));
}Then both sites call build_level_sequential::<B>(new_level_iter, children_iter);.
Review: perf: skip Rayon on small Merkle tree levelsNo security issues, no correctness bugs. The logic is sound — the early-return pattern in Three low-severity issues flagged inline:
Nothing blocking. Issue 3 is the most worth addressing before merge. |
Codex Code ReviewNo issues found in the reviewed diff. The changes in traits.rs and utils.rs look like a straightforward performance guard around Rayon usage and do not appear to change Merkle-tree correctness, memory safety, or security properties. Validation gap: I could not run |
Benchmark — fib_iterative_8M (median of 10)Table parallelism: 32 (auto = cores / 3)
Commit: c0fc4b4 · Baseline: built from main · Runner: self-hosted bench |
Adds a 1024-node threshold in
hash_leavesand the per-level loop inbuild: levels below the threshold fall back to sequential iteration, avoiding Rayon task-scheduling overhead on small upper-level FRI trees.Optimization extracted from PR #518.