perf(encoding): interleave fastest hash fill insertion by polaz · Pull Request #41 · structured-world/structured-zstd

polaz · 2026-03-28T14:06:23Z

Summary

add an interleaved suffix-hash insertion path in MatchGenerator::add_suffixes_till() for fastest mode (FAST_HASH_FILL_STEP = 3)
keep tail-anchor/backfill behavior unchanged to preserve boundary matching semantics
ignore generated benchmark artifacts (benchmark-report.md, benchmark-results.json)

Benchmark (vs `origin/main`, `STRUCTURED_ZSTD_BENCH_LARGE_BYTES=16777216`)

Fastest pure-Rust compression improved on key scenarios:

decodecorpus-z000033: 13.613 ms -> 12.090 ms (-11.19%)
high-entropy-1m: 19.056 ms -> 17.542 ms (-7.95%)
large-log-stream: 14.466 ms -> 12.480 ms (-13.73%)
low-entropy-1m: 0.903 ms -> 0.759 ms (-15.95%)

Validation

cargo fmt -- --check
cargo clippy --all-targets -- -D warnings
cargo build
cargo nextest run --workspace
cargo bench --bench compare_ffi -p structured-zstd --no-run
STRUCTURED_ZSTD_BENCH_LARGE_BYTES=16777216 bash .github/scripts/run-benchmarks.sh (branch + baseline in temp worktree)

Refs #14

Summary by CodeRabbit

Refactor
- Improved internal suffix matching generation for better performance and efficiency (no public API changes).
Bug Fixes
- Resolved cases where internal entries could be incorrectly recorded, improving correctness of matching behavior.
Tests
- Added unit tests to validate boundary behavior and optimized fast-path insertion logic.

- add interleaved suffix insertion path for fill-step=3 in matcher - keep tail-anchor backfill behavior unchanged

- add benchmark-report.md and benchmark-results.json to gitignore

coderabbitai · 2026-03-28T14:06:30Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6405b96f-3c5e-4ed6-9259-105a214be188

📥 Commits

Reviewing files that changed from the base of the PR and between 5f64b7c and d2e90a4.

📒 Files selected for processing (1)

zstd/src/encoding/match_generator.rs

📝 Walkthrough

Walkthrough

Refactors MatchGenerator::add_suffixes_till to use index/bounds arithmetic, adds insert_suffix_if_absent helper, and implements an interleaved fast-path that inserts multiple suffix positions per iteration when fill_step equals the fast constant.

Changes

Cohort / File(s)	Summary
Match Generator Suffix Insertion Optimization `zstd/src/encoding/match_generator.rs`	Rewrote `add_suffixes_till` to compute `insert_limit` via saturating subtraction and avoid `slice.windows`. Added `insert_suffix_if_absent` helper, an interleaved fast insertion path (`add_suffixes_interleaved_fast`) for `FAST_HASH_FILL_STEP`, and tests verifying no inserts when `idx < MIN_MATCH_LEN` and correct interleaved insert offsets.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

perf(encoding): align fastest matcher with zstd fast path #39: Implements similar changes to MatchGenerator::add_suffixes_till with interleaved/stepped suffix insertion logic; likely overlaps in approach and tests.

Poem

🐰 I hop through bytes and careful bounds,

inserting tails in interleaved rounds.
Fast lanes hum and loops stay steady,
keys checked once — the table's ready.
Hooray for matches, neat and speedy! 🥕🧡

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly describes the main change: adding interleaved insertion optimization to the fastest hash fill path in the encoding module's match generator.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch perf/#14-interleaved-hash-fill-rebased

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-03-28T14:08:27Z

Codecov Report

❌ Patch coverage is 95.94595% with 3 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
zstd/src/encoding/match_generator.rs	95.94%	3 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copilot

Pull request overview

This PR optimizes the Fastest encoder’s hash-fill path in MatchGenerator::add_suffixes_till() by unrolling/interleaving suffix insertions (using FAST_HASH_FILL_STEP = 3) to reduce overhead and improve compression throughput, while keeping the existing tail-anchor/backfill behavior unchanged.

Changes:

Add a specialized “interleaved” insertion path for FAST_HASH_FILL_STEP plus shared helper (insert_suffix_if_absent).
Replace the prior windows(...).enumerate().step_by(...) insertion loop with explicit position-based insertion loops.
Update .gitignore to ignore benchmark artifacts (though the new entries currently duplicate existing ones).

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated no comments.

File	Description
`zstd/src/encoding/match_generator.rs`	Adds a fast-path, unrolled/interleaved suffix insertion loop for fastest hash-fill step while preserving tail-anchor behavior.
`.gitignore`	Attempts to ignore benchmark artifacts, but introduces duplicate ignore patterns.

sw-release-bot

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'structured-zstd vs C FFI'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.

Benchmark suite	Current: `5f64b7c`	Previous: `f3310cb`	Ratio
`compress/default/large-log-stream/matrix/c_ffi`	`4.345` ms	`3.424` ms	`1.27`
`decompress/fastest/high-entropy-1m/matrix/c_ffi`	`0.032` ms	`0.027` ms	`1.19`
`decompress/default/high-entropy-1m/matrix/c_ffi`	`0.032` ms	`0.027` ms	`1.19`
`compress-dict/fastest/small-10k-random/matrix/c_ffi_without_dict`	`0.007` ms	`0.006` ms	`1.17`

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz

polaz · 2026-03-28T14:45:34Z

@coderabbitai re-review

coderabbitai · 2026-03-28T14:45:47Z

@polaz Sure, I'll re-review the PR right away!

🧠 Learnings used

Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Applies to **/*.rs : Code must pass `cargo clippy -p structured-zstd --features hash,std,dict_builder -- -D warnings` (excluding `rustc-dep-of-std` and `fuzz_exports` features)

Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Avoid suggesting architectural changes that diverge too far from upstream, as this is a fork of ruzstd (KillingSpark/zstd-rs)

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

polaz · 2026-03-28T14:49:02Z

@coderabbitai re-review

coderabbitai · 2026-03-28T14:49:13Z

@polaz Sure, I'll re-review the PR now!

🧠 Learnings used

Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Applies to **/*.rs : Code must pass `cargo clippy -p structured-zstd --features hash,std,dict_builder -- -D warnings` (excluding `rustc-dep-of-std` and `fuzz_exports` features)

Learnt from: CR
Repo: structured-world/structured-zstd PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T15:15:32.318Z
Learning: Avoid suggesting architectural changes that diverge too far from upstream, as this is a fork of ruzstd (KillingSpark/zstd-rs)

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 33
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-26T00:05:14.600Z
Learning: In structured-world/structured-zstd, the Clippy requirement is: `cargo clippy -p structured-zstd --features hash,std,dict_builder -- -D warnings`. The `rustc-dep-of-std` feature is excluded (internal feature for Rust stdlib builds only) and `fuzz_exports` is excluded (fuzzing-specific entry points validated separately from the regular lint gate). The `--all-features` flag should NOT be used.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.gitignore:
- Around line 13-14: The .gitignore has duplicate entries for
benchmark-results.json and benchmark-report.md; remove the redundant lines in
the .gitignore (the second occurrences shown in the diff) so each of these
filenames appears only once, keeping the original entries already declared
earlier and deleting the duplicate entries added in this change.

In `@zstd/src/encoding/match_generator.rs`:
- Around line 529-535: Add a debug assertion in insert_suffix_if_absent to
ensure the slice access is safe by checking the caller invariant (pos +
MIN_MATCH_LEN <= data.len()); specifically, add a debug_assert referencing
MIN_MATCH_LEN and pos at the top of the function so during development it will
panic if the invariant is violated while leaving release performance unchanged;
keep the rest of the logic (checking suffixes.contains_key and suffixes.insert)
unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 67cd0651-cac4-4845-b34b-b51443030f05

📥 Commits

Reviewing files that changed from the base of the PR and between f3310cb and 82de7e1.

📒 Files selected for processing (2)

.gitignore
zstd/src/encoding/match_generator.rs

- remove duplicate benchmark artifact entries in .gitignore - add debug_assert for insert_suffix_if_absent slice bound invariant

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

- add regression for idx < MIN_MATCH_LEN in add_suffixes_till - add focused interleaved-position registration test - fix insert_limit saturation to preserve original windows() behavior

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

polaz added 2 commits March 28, 2026 15:55

perf(encoding): interleave fastest hash fill insertion

93fcfc1

- add interleaved suffix insertion path for fill-step=3 in matcher - keep tail-anchor backfill behavior unchanged

chore(repo): ignore generated benchmark reports

82de7e1

- add benchmark-report.md and benchmark-results.json to gitignore

Copilot AI review requested due to automatic review settings March 28, 2026 14:06

Copilot started reviewing on behalf of polaz March 28, 2026 14:07 View session

Copilot AI reviewed Mar 28, 2026

View reviewed changes

sw-release-bot Bot reviewed Mar 28, 2026

View reviewed changes

coderabbitai Bot reviewed Mar 28, 2026

View reviewed changes

Comment thread .gitignore Outdated

Comment thread zstd/src/encoding/match_generator.rs

fix(review): dedupe ignores and guard suffix bounds

5f64b7c

- remove duplicate benchmark artifact entries in .gitignore - add debug_assert for insert_suffix_if_absent slice bound invariant

polaz requested a review from Copilot March 28, 2026 14:59

Copilot started reviewing on behalf of polaz March 28, 2026 14:59 View session

Copilot AI reviewed Mar 28, 2026

View reviewed changes

Comment thread zstd/src/encoding/match_generator.rs Outdated

Comment thread zstd/src/encoding/match_generator.rs

test(matcher): cover interleaved fill boundaries

d2e90a4

- add regression for idx < MIN_MATCH_LEN in add_suffixes_till - add focused interleaved-position registration test - fix insert_limit saturation to preserve original windows() behavior

polaz requested a review from Copilot March 28, 2026 15:29

Copilot started reviewing on behalf of polaz March 28, 2026 15:29 View session

Copilot AI reviewed Mar 28, 2026

View reviewed changes

polaz merged commit 6fc1fc3 into main Mar 28, 2026
15 checks passed

polaz deleted the perf/#14-interleaved-hash-fill-rebased branch March 28, 2026 15:58

sw-release-bot Bot mentioned this pull request Mar 28, 2026

chore: release v0.0.4 #40

Merged

Conversation

polaz commented Mar 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Benchmark (vs origin/main, STRUCTURED_ZSTD_BENCH_LARGE_BYTES=16777216)

Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

codecov Bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

sw-release-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

⚠️ Performance Alert ⚠️

Uh oh!

polaz commented Mar 28, 2026

Uh oh!

coderabbitai Bot commented Mar 28, 2026

Uh oh!

polaz commented Mar 28, 2026

Uh oh!

coderabbitai Bot commented Mar 28, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

polaz commented Mar 28, 2026 •

edited by coderabbitai Bot

Loading

Benchmark (vs `origin/main`, `STRUCTURED_ZSTD_BENCH_LARGE_BYTES=16777216`)

coderabbitai Bot commented Mar 28, 2026 •

edited

Loading

codecov Bot commented Mar 28, 2026 •

edited

Loading

sw-release-bot Bot left a comment •

edited

Loading