Skip to content

perf(encoding): interleave fastest hash fill insertion#41

Merged
polaz merged 4 commits intomainfrom
perf/#14-interleaved-hash-fill-rebased
Mar 28, 2026
Merged

perf(encoding): interleave fastest hash fill insertion#41
polaz merged 4 commits intomainfrom
perf/#14-interleaved-hash-fill-rebased

Conversation

@polaz
Copy link
Copy Markdown
Member

@polaz polaz commented Mar 28, 2026

Summary

  • add an interleaved suffix-hash insertion path in MatchGenerator::add_suffixes_till() for fastest mode (FAST_HASH_FILL_STEP = 3)
  • keep tail-anchor/backfill behavior unchanged to preserve boundary matching semantics
  • ignore generated benchmark artifacts (benchmark-report.md, benchmark-results.json)

Benchmark (vs origin/main, STRUCTURED_ZSTD_BENCH_LARGE_BYTES=16777216)

Fastest pure-Rust compression improved on key scenarios:

  • decodecorpus-z000033: 13.613 ms -> 12.090 ms (-11.19%)
  • high-entropy-1m: 19.056 ms -> 17.542 ms (-7.95%)
  • large-log-stream: 14.466 ms -> 12.480 ms (-13.73%)
  • low-entropy-1m: 0.903 ms -> 0.759 ms (-15.95%)

Validation

  • cargo fmt -- --check
  • cargo clippy --all-targets -- -D warnings
  • cargo build
  • cargo nextest run --workspace
  • cargo bench --bench compare_ffi -p structured-zstd --no-run
  • STRUCTURED_ZSTD_BENCH_LARGE_BYTES=16777216 bash .github/scripts/run-benchmarks.sh (branch + baseline in temp worktree)

Refs #14

Summary by CodeRabbit

  • Refactor
    • Improved internal suffix matching generation for better performance and efficiency (no public API changes).
  • Bug Fixes
    • Resolved cases where internal entries could be incorrectly recorded, improving correctness of matching behavior.
  • Tests
    • Added unit tests to validate boundary behavior and optimized fast-path insertion logic.

polaz added 2 commits March 28, 2026 15:55
- add interleaved suffix insertion path for fill-step=3 in matcher

- keep tail-anchor backfill behavior unchanged
- add benchmark-report.md and benchmark-results.json to gitignore
Copilot AI review requested due to automatic review settings March 28, 2026 14:06
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6405b96f-3c5e-4ed6-9259-105a214be188

📥 Commits

Reviewing files that changed from the base of the PR and between 5f64b7c and d2e90a4.

📒 Files selected for processing (1)
  • zstd/src/encoding/match_generator.rs

📝 Walkthrough

Walkthrough

Refactors MatchGenerator::add_suffixes_till to use index/bounds arithmetic, adds insert_suffix_if_absent helper, and implements an interleaved fast-path that inserts multiple suffix positions per iteration when fill_step equals the fast constant.

Changes

Cohort / File(s) Summary
Match Generator Suffix Insertion Optimization
zstd/src/encoding/match_generator.rs
Rewrote add_suffixes_till to compute insert_limit via saturating subtraction and avoid slice.windows. Added insert_suffix_if_absent helper, an interleaved fast insertion path (add_suffixes_interleaved_fast) for FAST_HASH_FILL_STEP, and tests verifying no inserts when idx < MIN_MATCH_LEN and correct interleaved insert offsets.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hop through bytes and careful bounds,

inserting tails in interleaved rounds.
Fast lanes hum and loops stay steady,
keys checked once — the table's ready.
Hooray for matches, neat and speedy! 🥕🧡

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: adding interleaved insertion optimization to the fastest hash fill path in the encoding module's match generator.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch perf/#14-interleaved-hash-fill-rebased

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 28, 2026

Codecov Report

❌ Patch coverage is 95.94595% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
zstd/src/encoding/match_generator.rs 95.94% 3 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes the Fastest encoder’s hash-fill path in MatchGenerator::add_suffixes_till() by unrolling/interleaving suffix insertions (using FAST_HASH_FILL_STEP = 3) to reduce overhead and improve compression throughput, while keeping the existing tail-anchor/backfill behavior unchanged.

Changes:

  • Add a specialized “interleaved” insertion path for FAST_HASH_FILL_STEP plus shared helper (insert_suffix_if_absent).
  • Replace the prior windows(...).enumerate().step_by(...) insertion loop with explicit position-based insertion loops.
  • Update .gitignore to ignore benchmark artifacts (though the new entries currently duplicate existing ones).

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated no comments.

File Description
zstd/src/encoding/match_generator.rs Adds a fast-path, unrolled/interleaved suffix insertion loop for fastest hash-fill step while preserving tail-anchor behavior.
.gitignore Attempts to ignore benchmark artifacts, but introduces duplicate ignore patterns.

Copy link
Copy Markdown

@sw-release-bot sw-release-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'structured-zstd vs C FFI'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.

Benchmark suite Current: 5f64b7c Previous: f3310cb Ratio
compress/default/large-log-stream/matrix/c_ffi 4.345 ms 3.424 ms 1.27
decompress/fastest/high-entropy-1m/matrix/c_ffi 0.032 ms 0.027 ms 1.19
decompress/default/high-entropy-1m/matrix/c_ffi 0.032 ms 0.027 ms 1.19
compress-dict/fastest/small-10k-random/matrix/c_ffi_without_dict 0.007 ms 0.006 ms 1.17

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz

@polaz
Copy link
Copy Markdown
Member Author

polaz commented Mar 28, 2026

@coderabbitai re-review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 28, 2026

@polaz Sure, I'll re-review the PR right away!


🧠 Learnings used
Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Applies to **/*.rs : Code must pass `cargo clippy -p structured-zstd --features hash,std,dict_builder -- -D warnings` (excluding `rustc-dep-of-std` and `fuzz_exports` features)

Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Avoid suggesting architectural changes that diverge too far from upstream, as this is a fork of ruzstd (KillingSpark/zstd-rs)
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@polaz
Copy link
Copy Markdown
Member Author

polaz commented Mar 28, 2026

@coderabbitai re-review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 28, 2026

@polaz Sure, I'll re-review the PR now!


🧠 Learnings used
Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Applies to **/*.rs : Code must pass `cargo clippy -p structured-zstd --features hash,std,dict_builder -- -D warnings` (excluding `rustc-dep-of-std` and `fuzz_exports` features)

Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Avoid suggesting architectural changes that diverge too far from upstream, as this is a fork of ruzstd (KillingSpark/zstd-rs)

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 33
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T00:05:14.600Z
Learning: In structured-world/structured-zstd, the Clippy requirement is: `cargo clippy -p structured-zstd --features hash,std,dict_builder -- -D warnings`. The `rustc-dep-of-std` feature is excluded (internal feature for Rust stdlib builds only) and `fuzz_exports` is excluded (fuzzing-specific entry points validated separately from the regular lint gate). The `--all-features` flag should NOT be used.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.gitignore:
- Around line 13-14: The .gitignore has duplicate entries for
benchmark-results.json and benchmark-report.md; remove the redundant lines in
the .gitignore (the second occurrences shown in the diff) so each of these
filenames appears only once, keeping the original entries already declared
earlier and deleting the duplicate entries added in this change.

In `@zstd/src/encoding/match_generator.rs`:
- Around line 529-535: Add a debug assertion in insert_suffix_if_absent to
ensure the slice access is safe by checking the caller invariant (pos +
MIN_MATCH_LEN <= data.len()); specifically, add a debug_assert referencing
MIN_MATCH_LEN and pos at the top of the function so during development it will
panic if the invariant is violated while leaving release performance unchanged;
keep the rest of the logic (checking suffixes.contains_key and suffixes.insert)
unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 67cd0651-cac4-4845-b34b-b51443030f05

📥 Commits

Reviewing files that changed from the base of the PR and between f3310cb and 82de7e1.

📒 Files selected for processing (2)
  • .gitignore
  • zstd/src/encoding/match_generator.rs

Comment thread .gitignore Outdated
Comment thread zstd/src/encoding/match_generator.rs
- remove duplicate benchmark artifact entries in .gitignore

- add debug_assert for insert_suffix_if_absent slice bound invariant
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread zstd/src/encoding/match_generator.rs Outdated
Comment thread zstd/src/encoding/match_generator.rs
- add regression for idx < MIN_MATCH_LEN in add_suffixes_till

- add focused interleaved-position registration test

- fix insert_limit saturation to preserve original windows() behavior
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants