perf: Sequential reads in commit_columns_bit_reversed #560
Codex Code Review
No issues found in the PR diff. The ordering rewrite in
Verification note: I attempted
        })
        .collect();

    in_place_bit_reverse_permute(&mut hashed_leaves);
Medium – No test validates the equivalence directly
The mathematical transformation is correct: sequential_hash[k] = hash(col_0[k] || ...), and after in_place_bit_reverse_permute, leaves[i] = hash(col_0[br(i)] || ...) — identical to the old scatter read. The prove-verify roundtrip tests cover this implicitly, but since this is a correctness-critical cryptographic primitive, a direct unit test comparing the Merkle root produced by both approaches (old scatter-read vs. new sequential + permute) would be valuable insurance against future regressions.
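A minimal sketch of what such a test could look like, under stated assumptions: `hash_row` is a toy, non-cryptographic stand-in for the real leaf hash, `reverse_index` and `in_place_bit_reverse_permute` are simplified re-implementations rather than the crate's own functions, and the columns are plain `Vec<u64>`. It compares the leaf vectors instead of the Merkle root, on the reasoning that identical leaves in identical order give an identical root.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Reverse the low `bits` bits of `i` (assumed stand-in for the crate's helper).
fn reverse_index(i: usize, bits: u32) -> usize {
    if bits == 0 { 0 } else { i.reverse_bits() >> (usize::BITS - bits) }
}

/// Simplified stand-in: swap every element with its bit-reversed partner.
fn in_place_bit_reverse_permute<T>(v: &mut [T]) {
    assert!(v.len().is_power_of_two());
    let bits = v.len().trailing_zeros();
    for i in 0..v.len() {
        let j = reverse_index(i, bits);
        if i < j {
            v.swap(i, j);
        }
    }
}

/// Toy leaf hash over one row of all columns (illustration only, not cryptographic).
fn hash_row(columns: &[Vec<u64>], row: usize) -> u64 {
    let mut h = DefaultHasher::new();
    for col in columns {
        col[row].hash(&mut h);
    }
    h.finish()
}

/// Old approach: leaf i hashes row br(i), read via scattered indexing.
fn leaves_scatter(columns: &[Vec<u64>]) -> Vec<u64> {
    let n = columns[0].len();
    let bits = n.trailing_zeros();
    (0..n).map(|i| hash_row(columns, reverse_index(i, bits))).collect()
}

/// New approach: hash rows in natural order, then permute the digests once.
fn leaves_sequential(columns: &[Vec<u64>]) -> Vec<u64> {
    let n = columns[0].len();
    let mut leaves: Vec<u64> = (0..n).map(|k| hash_row(columns, k)).collect();
    in_place_bit_reverse_permute(&mut leaves);
    leaves
}

#[test]
fn scatter_and_sequential_leaves_match() {
    // Three columns of 16 rows with distinct values per cell.
    let columns: Vec<Vec<u64>> = (0..3u64)
        .map(|c| (0..16u64).map(|r| r * 7 + c).collect())
        .collect();
    assert_eq!(leaves_scatter(&columns), leaves_sequential(&columns));
}
```

In the real codebase the same shape of test would presumably call the old and new commit paths and compare their roots directly.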
Low – in_place_bit_reverse_permute silently miscomputes for non-power-of-two lengths in release builds
The debug_assert! at lines 414–417 only fires in debug builds. in_place_bit_reverse_permute itself has no such guard: it calls size.trailing_zeros(), which equals log2(size) only when size is a power of two. In a release build with a non-power-of-two num_rows, the permutation would silently produce a wrong (but non-panicking) digest ordering. This is a pre-existing issue (the old reverse_index call had the same gap), but it is now applied to the committed digest array. Consider promoting the debug_assert! to an assert!, or adding a release-mode guard inside in_place_bit_reverse_permute.
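One possible shape for such a guard, as a sketch only: `in_place_bit_reverse_permute_checked` is a hypothetical name, and the body is a simplified re-implementation, not the crate's actual function. The point is that `assert!` is kept in release builds, whereas `debug_assert!` is compiled out.

```rust
/// Hypothetical guarded variant; the real function's signature and body may differ.
fn in_place_bit_reverse_permute_checked<T>(v: &mut [T]) {
    // assert! (unlike debug_assert!) is still checked in release builds.
    assert!(
        v.len().is_power_of_two(),
        "bit-reverse permute needs a power-of-two length, got {}",
        v.len()
    );
    // Safe now: for a power of two, trailing_zeros() is exactly log2(len).
    let bits = v.len().trailing_zeros();
    for i in 0..v.len() {
        let j = if bits == 0 {
            0
        } else {
            i.reverse_bits() >> (usize::BITS - bits)
        };
        // Swap each pair once; fixed points (i == j) are left alone.
        if i < j {
            v.swap(i, j);
        }
    }
}
```

For a length such as 12, `trailing_zeros()` returns 2 rather than anything resembling log2(12), which is why an unguarded version would reorder the digests incorrectly instead of failing.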
Review: perf: Sequential reads in commit_columns_bit_reversed
Summary
The optimization moves the bit-reversal work from a scattered column read inside the hashing loop (
Correctness ✓
The equivalence holds:
Parallel safety is also fine –
Issues
Medium – Missing unit test for the equivalence (see inline)
Low –
Overall
Clean, well-motivated change. The code is readable, the docstring explains the invariant clearly, and the diff is minimal. The two issues above are not blockers.
/bench 10
Benchmark: fib_iterative_8M (median of 3)
Table parallelism: 1
Commit: 4df366b · Baseline: built from main · Runner: self-hosted bench
/bench k=1
Optimization extracted from PR #545.
Replaces the scattered columns[col][br(row)] reads inside commit_columns_bit_reversed with sequential columns[col][k] reads, then applies a single in_place_bit_reverse_permute over the hashed-digest vector at the end. The scatter moves from the ~2 GB column-read side to the ~64 MB digest-output side, producing a byte-identical Merkle root.