-
Notifications
You must be signed in to change notification settings - Fork 0
perf: skip Rayon on small Merkle tree levels #550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -78,17 +78,36 @@ where | |||||
| let (new_level_iter, children_iter) = | ||||||
| nodes[new_level_begin_index..level_end_index + 1].split_at_mut(new_level_length); | ||||||
|
|
||||||
| // Skip Rayon for small levels: the scheduling overhead exceeds | ||||||
| // computation for levels with fewer than 1024 nodes. This avoids | ||||||
| // hundreds of unnecessary task spawns from small FRI layer trees. | ||||||
| #[cfg(feature = "parallel")] | ||||||
| let parent_and_children_zipped_iter = new_level_iter | ||||||
| .into_par_iter() | ||||||
| .zip(children_iter.par_chunks_exact(2)); | ||||||
| { | ||||||
| if new_level_length >= 1024 { | ||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Low — Same magic constant as in Same threshold value should come from one named constant so the two sites stay in sync:
Suggested change
|
||||||
| new_level_iter | ||||||
| .into_par_iter() | ||||||
| .zip(children_iter.par_chunks_exact(2)) | ||||||
| .for_each(|(new_parent, children)| { | ||||||
| *new_parent = B::hash_new_parent(&children[0], &children[1]); | ||||||
| }); | ||||||
| } else { | ||||||
| new_level_iter | ||||||
| .iter_mut() | ||||||
| .zip(children_iter.chunks_exact(2)) | ||||||
| .for_each(|(new_parent, children)| { | ||||||
| *new_parent = B::hash_new_parent(&children[0], &children[1]); | ||||||
| }); | ||||||
| } | ||||||
| } | ||||||
| #[cfg(not(feature = "parallel"))] | ||||||
| let parent_and_children_zipped_iter = | ||||||
| new_level_iter.iter_mut().zip(children_iter.chunks_exact(2)); | ||||||
|
|
||||||
| parent_and_children_zipped_iter.for_each(|(new_parent, children)| { | ||||||
| *new_parent = B::hash_new_parent(&children[0], &children[1]); | ||||||
| }); | ||||||
| { | ||||||
| new_level_iter | ||||||
| .iter_mut() | ||||||
| .zip(children_iter.chunks_exact(2)) | ||||||
| .for_each(|(new_parent, children)| { | ||||||
| *new_parent = B::hash_new_parent(&children[0], &children[1]); | ||||||
| }); | ||||||
| } | ||||||
|
Comment on lines
+93
to
+110
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Low — Sequential logic duplicated The fn build_level_sequential<B: IsMerkleTreeBackend>(
new_level_iter: &mut [B::Node],
children_iter: &[B::Node],
) where B::Node: Clone {
new_level_iter
.iter_mut()
.zip(children_iter.chunks_exact(2))
.for_each(|(p, c)| *p = B::hash_new_parent(&c[0], &c[1]));
}Then both sites call |
||||||
|
|
||||||
| level_end_index = level_begin_index - 1; | ||||||
| level_begin_index = new_level_begin_index; | ||||||
|
|
||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Low — Magic constant duplicated across files
1024is also hardcoded inutils.rs:86. If the threshold is ever tuned, it must be changed in both places. Define a shared constant (e.g. inutils.rsor a newconsts.rs) and reference it here: