perf(encoding): early incompressible fast-path + benchmark parity #99
Conversation
- add heuristic early no-compress path for fastest/default/level<=3
- add dfast lazy-skipping in skip_matching for incompressible blocks
- preserve boundary seeding for cross-block matching and add coverage
- keep roundtrip + cross-validation + benchmark compatibility
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID:
📒 Files selected for processing (3)
📝 Walkthrough

Compute a small-source hint; add a sampling-based incompressibility classifier and raw fast-path; thread `Option<bool>` incompressible hints through matchers; pass `CompressionLevel` into block encoding; gate dictionary priming on an attached dictionary; emit single-segment headers only when small-frame + no-dictionary conditions hold.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Stream as StreamingEncoder
    participant Frame as FrameCompressor
    participant Class as IncompressibleModule
    participant Matcher as MatchGenerator
    participant Block as compress_block_encoded

    Stream->>Frame: encode_block(data, source_size_hint, compression_level)
    Frame->>Class: sample -> block_looks_incompressible(block)
    alt allowed_by_level_and_window AND looks_incompressible
        Frame->>Matcher: skip_matching_with_hint(Some(true))
        Frame->>Block: compress_block_encoded(..., compression_level) -> BlockType::Raw + raw bytes
    else
        Frame->>Matcher: skip_matching_with_hint(None)
        Frame->>Block: compress_block_encoded(..., compression_level) -> BlockType::Compressed / RLE
    end
    Block-->>Frame: encoded block bytes + BlockType
    Frame->>Frame: compute single_segment (depends on small hint, dict-state, totals) and set window_size accordingly
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Codecov Report: ❌ Patch coverage is
Pull request overview
Adds a conservative “incompressible input” fast-path to the Rust encoder so high-entropy blocks can bypass expensive match-finding / entropy setup and emit Raw blocks quickly, improving throughput for Fastest/Default and low numeric levels.
Changes:
- Thread `CompressionLevel` into the shared compressed-block path (frame + streaming) and add a raw fast-path heuristic for likely-incompressible blocks.
- Add dfast matcher "lazy skipping" during `skip_matching()` for blocks classified as incompressible, while still densely seeding the tail for cross-block matching.
- Add regression tests covering raw fast-path on random payloads and ensuring compressible log-like inputs don't incorrectly fall back to Raw.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| zstd/src/encoding/streaming_encoder.rs | Passes the active CompressionLevel into the shared block compression routine for streaming mode. |
| zstd/src/encoding/match_generator.rs | Adds incompressible detection + sparse hash insertion in dfast skip_matching(), with dense tail seeding. |
| zstd/src/encoding/levels/fastest.rs | Implements the early Raw emission heuristic for eligible levels and wires it into compress_block_encoded. |
| zstd/src/encoding/frame_compressor.rs | Wires the level into the shared block path for one-shot/frame mode and adds targeted regression tests. |
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
zstd/src/encoding/match_generator.rs (1)
1643-1665: ⚠️ Potential issue | 🟠 Major: Keep dictionary priming on the dense seeding path.

`skip_matching()` is also used from `MatchGeneratorDriver::prime_with_dictionary()` on Lines 637-638. With this change, any ≥1 KiB dictionary chunk that looks high-entropy is only indexed every 8 bytes plus the tail, so first-block matches into most dictionary positions disappear. That regresses dictionary-backed compression even though no raw block was emitted. Please bypass the incompressible sparse path when seeding primed dictionary history.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/match_generator.rs` around lines 1643 - 1665, The current skip_matching() uses sparse insertion for blocks that look incompressible, which breaks priming from MatchGeneratorDriver::prime_with_dictionary(); modify skip_matching() (or add a new call path) so that when invoked for dictionary priming it always uses the dense seeding path: detect the priming call (e.g. add a boolean flag on MatchGenerator or a new method skip_matching_for_priming()) and bypass the insert_positions_with_step(..., INCOMPRESSIBLE_SKIP_STEP) branch, calling insert_positions(...) for the whole range and still performing the dense tail insertion (keep dense_tail/tail_start logic intact) so dictionary history is fully indexed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 657-679: This test only exercises CompressionLevel::Fastest; add a
second assertion exercising CompressionLevel::Default (and optionally one of the
Level(0..=3) variants) to ensure compressible input doesn't fall back to
BlockType::Raw for the default path: call
crate::encoding::compress_to_vec(data.as_slice(),
super::CompressionLevel::Default) (and/or super::CompressionLevel::Level(3)),
assert_ne!(first_block_type(&compressed_default), BlockType::Raw), assert
compressed_default.len() < data.len(), and round-trip decode via
zstd::stream::copy_decode to verify output equals data, mirroring the existing
Fastest checks.
In `@zstd/src/encoding/levels/fastest.rs`:
- Around line 86-146: The current should_emit_raw_fast_path only inspects the
block prefix (sample = &block[..sample_len]), causing misclassification for
mixed-entropy inputs; change the sampling to use a head+tail or strided probe
(e.g., take up to sample_len/2 from the start and sample_len/2 from the end, or
evenly spaced strides through the whole block) and feed those bytes into the
same histogram and 4-byte-quad repeat logic (update counts, sampled_quads,
repeats and distinct computation to reflect the combined sample), or perform a
quick second probe on a tail sample before returning true; keep existing guards
(max_symbol_guard, repeat_guard) but compute them from the effective sampled
length so the decision in should_emit_raw_fast_path reflects the whole-block
entropy rather than only the prefix.
---
Outside diff comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 1643-1665: The current skip_matching() uses sparse insertion for
blocks that look incompressible, which breaks priming from
MatchGeneratorDriver::prime_with_dictionary(); modify skip_matching() (or add a
new call path) so that when invoked for dictionary priming it always uses the
dense seeding path: detect the priming call (e.g. add a boolean flag on
MatchGenerator or a new method skip_matching_for_priming()) and bypass the
insert_positions_with_step(..., INCOMPRESSIBLE_SKIP_STEP) branch, calling
insert_positions(...) for the whole range and still performing the dense tail
insertion (keep dense_tail/tail_start logic intact) so dictionary history is
fully indexed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 840a716f-439b-4858-afe8-2ab9170aae67
📒 Files selected for processing (4)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/levels/fastest.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/streaming_encoder.rs
- factor incompressible heuristic into shared helper + constants
- switch heuristic sampling to head+tail to avoid prefix-only bias
- keep dictionary priming on dense dfast seeding path
- extend compressible negative-path tests for default and level(3)
@coderabbitai Addressed the outside-diff item on dictionary priming: MatchGeneratorDriver::prime_with_dictionary now uses a dedicated dense seeding path for dfast (skip_matching_for_dictionary_priming -> skip_matching_dense), so incompressible sparse stepping is not used for primed dictionary history.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/levels/fastest.rs`:
- Around line 49-58: The matcher is redundantly recomputing the "incompressible"
verdict inside state.matcher.skip_matching(); modify the call sites and matcher
API so the precomputed result from should_emit_raw_fast_path(compression_level,
&uncompressed_data) is threaded into the skip path (e.g., add a parameter or
setter to Matcher such as skip_matching(incompressible: bool) or
set_incompressible_then_skip()), then call state.matcher.commit_space(...)
followed by state.matcher.skip_matching(true) (or the equivalent setter + skip)
so skip_matching reuses the already-known incompressible verdict instead of
calling block_looks_incompressible(...) again. Ensure you update the Matcher
methods (skip_matching signature and any internal use of
block_looks_incompressible) and all callers accordingly (references:
should_emit_raw_fast_path, state.matcher.commit_space,
state.matcher.skip_matching).
In `@zstd/src/encoding/match_generator.rs`:
- Around line 1659-1675: The extra dense-tail reseed should only run when the
sparse insertion path was taken; change the logic so the tail re-insertion
(using dense_tail, tail_start, insert_positions) is executed only when
block_looks_incompressible(...) is true (i.e., the branch that called
insert_positions_with_step), leaving the full dense insert branch
(insert_positions(current_abs_start, current_abs_end)) untouched to avoid
redundant work and keep skip_matching() lean. Locate the block containing
block_looks_incompressible(current_abs_start, current_abs_end),
insert_positions_with_step, insert_positions, and the tail calculation
(dense_tail, tail_start) and move or guard the tail insert so it runs only for
the sparse branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5886a03d-eb16-47b6-a3be-579337a33129
📒 Files selected for processing (5)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/incompressible.rs, zstd/src/encoding/levels/fastest.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs
- avoid repeat sentinel collisions via explicit slot occupancy tracking
- thread precomputed incompressible hints through matcher skip API
- run dense tail reseed only for sparse dfast insertion path
- add regression test for sparse incompressible cross-block tail matching
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 471-481: Tests contain a duplicate generate_data implementation
inside source_size_hint_levels_remain_ffi_compatible_small_inputs_matrix; remove
that inner helper and call the module-level fn generate_data(seed: u64, len:
usize) -> Vec<u8> instead so all tests share the same generator and signature.
Locate the inner function in
source_size_hint_levels_remain_ffi_compatible_small_inputs_matrix, delete it,
and replace its usages with direct calls to generate_data(seed, len).
In `@zstd/src/encoding/match_generator.rs`:
- Around line 1682-1687: skip_matching_dense currently only seeds hashes for the
newly appended slice (using insert_positions with
current_abs_start..current_abs_end) which misses inserting the last 1–3
positions of the previous slice when a primed dictionary spans slices; update
skip_matching_dense so after computing current_abs_start/current_abs_end it also
computes the previous-slice tail absolute range (using
self.window.back().unwrap() and self.history_abs_start + self.window_size -
prev_len to get prev_abs_end) and calls insert_positions for that tail (e.g.
max(prev_abs_end - 3, prev_abs_start) .. current_abs_start) so the last 1–3
positions of the prior slice become hashable; keep insert_positions, window,
history_abs_start and window_size as the referenced symbols.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3082542c-8b6d-44ec-80dc-17241d715c77
📒 Files selected for processing (6)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/incompressible.rs, zstd/src/encoding/levels/fastest.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs, zstd/src/encoding/streaming_encoder.rs
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 617-643: Add a third test that mirrors the existing
fastest_random_block_uses_raw_fast_path and
default_random_block_uses_raw_fast_path but exercises CompressionLevel::Level(2)
to cover the unique Dfast tuning; call crate::encoding::compress_to_vec with
super::CompressionLevel::Level(2), assert first_block_type(&compressed) ==
BlockType::Raw, and round-trip via zstd::stream::copy_decode to verify decoded
== data, using the same generate_data seed/size pattern as the other tests to
ensure parity with the existing test cases.
In `@zstd/src/encoding/match_generator.rs`:
- Around line 4550-4556: The test is exercising the sparse-incompressible branch
indirectly by calling matcher.skip_matching(None); change this to explicitly
call matcher.skip_matching(Some(true)) so the explicit incompressible-hint path
is covered; locate the block where current_abs_start/current_abs_end are
computed and where matcher.block_looks_incompressible(...) is asserted, and
replace the skip_matching(None) invocation with skip_matching(Some(true)) to pin
the new API contract for matcher.skip_matching.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 34a11fa6-1a29-41ca-9ecd-bea546ffa7b7
📒 Files selected for processing (2)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/match_generator.rs
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 4599-4605: Remove the heuristic precondition assert so the test no
longer depends on classifier tuning: delete the assert!(
matcher.block_looks_incompressible(current_abs_start, current_abs_end), ... )
lines (which reference matcher.block_looks_incompressible,
current_abs_start/current_abs_end computed from matcher.history_abs_start,
matcher.window_size and first.len()), and keep the call to
matcher.skip_matching(Some(true)) so the test directly exercises the explicit
hint without relying on the heuristic.
- Around line 4382-4395: The tests incorrectly assert SuffixStore occupancy by
calling last.suffixes.get(...) (collision-prone); instead, replace these
assertions with observable-match checks: use the matcher to attempt a match for
the slices (e.g., the prefix at last.data[0..MIN_MATCH_LEN], the sparse start at
last.data[3..3+MIN_MATCH_LEN], and the tail at
tail_start..tail_start+MIN_MATCH_LEN) and assert the expected match/no-match
behavior and offsets; update assertions to call the public match-finding API on
matcher (or the module's match lookup function) rather than inspecting
last.suffixes.get directly so the test verifies real matching behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d163c8d1-2f35-4fd6-a88f-073fbc7a91e4
📒 Files selected for processing (4)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/levels/fastest.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs
Actionable comments posted: 1
♻️ Duplicate comments (2)
zstd/src/encoding/match_generator.rs (2)
4599-4605: ⚠️ Potential issue | 🟡 Minor: Remove the heuristic precondition from this explicit-hint regression.

This test already exercises `skip_matching(Some(true))` directly. Keeping `block_looks_incompressible(...)` in the setup ties it to classifier tuning and can create false failures without any change to the hinted path.

♻️ Minimal follow-up

```diff
- let current_abs_start = matcher.history_abs_start + matcher.window_size - first.len();
- let current_abs_end = current_abs_start + first.len();
- assert!(
-     matcher.block_looks_incompressible(current_abs_start, current_abs_end),
-     "test fixture must take the sparse incompressible branch"
- );
  matcher.skip_matching(Some(true));
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/match_generator.rs` around lines 4599 - 4605, Remove the heuristic precondition that asserts matcher.block_looks_incompressible(...) in the test setup; the test should directly exercise matcher.skip_matching(Some(true)) without relying on classifier tuning. Delete the computations of current_abs_start and current_abs_end (which are only used for the assertion) and remove the assert!(...) call that references matcher.block_looks_incompressible(current_abs_start, current_abs_end), leaving the matcher.skip_matching(Some(true)) invocation and any remaining setup intact so the test no longer depends on block_looks_incompressible.
4382-4395: ⚠️ Potential issue | 🟡 Minor: Don't assert sparse indexing through raw `SuffixStore::get()`.

`SuffixStore` is overwrite-on-collision, so `Some(0)`/`None` here does not reliably prove which starts were indexed. This regression can pass or fail based on unrelated bucket collisions instead of observable matcher behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/match_generator.rs` around lines 4382 - 4395, The assertions directly inspecting SuffixStore via last.suffixes.get(...) are brittle because SuffixStore is overwrite-on-collision; remove those direct get() assertions and instead verify observable matcher behavior using the public matcher API (e.g., call the match generation routine on data that exercises cross-block tail matches and sparse-prefix starts) to assert expected match results; specifically, stop asserting on last.suffixes.get(...) and replace with tests that confirm the matcher (matcher.window.last(), match generator functions) returns the expected matches for the dense tail case and does not rely on sparse-start indexing assumptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 1675-1701: In skip_matching, after computing
current_abs_start/current_abs_end and determining used_sparse, ensure the
previous-slice tail is backfilled on the normal Dfast path: when used_sparse is
false, compute dense_tail = DFAST_MIN_MATCH_LEN + 3, set tail_start =
current_abs_start.saturating_sub(dense_tail) and if tail_start <
current_abs_start call self.insert_positions(tail_start, current_abs_start)
before calling self.insert_positions(current_abs_start, current_abs_end) (i.e.,
seed the last 1–3 bytes of the prior slice so cross-boundary matches are
hashable for the following block).
---
Duplicate comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 4599-4605: Remove the heuristic precondition that asserts
matcher.block_looks_incompressible(...) in the test setup; the test should
directly exercise matcher.skip_matching(Some(true)) without relying on
classifier tuning. Delete the computations of current_abs_start and
current_abs_end (which are only used for the assertion) and remove the
assert!(...) call that references
matcher.block_looks_incompressible(current_abs_start, current_abs_end), leaving
the matcher.skip_matching(Some(true)) invocation and any remaining setup intact
so the test no longer depends on block_looks_incompressible.
- Around line 4382-4395: The assertions directly inspecting SuffixStore via
last.suffixes.get(...) are brittle because SuffixStore is
overwrite-on-collision; remove those direct get() assertions and instead verify
observable matcher behavior using the public matcher API (e.g., call the match
generation routine on data that exercises cross-block tail matches and
sparse-prefix starts) to assert expected match results; specifically, stop
asserting on last.suffixes.get(...) and replace with tests that confirm the
matcher (matcher.window.last(), match generator functions) returns the expected
matches for the dense tail case and does not rely on sparse-start indexing
assumptions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1cf996c8-c431-476d-a9b3-e6b247561bae
📒 Files selected for processing (4)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/levels/fastest.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/mod.rs
- enable and gate best-level raw early exits for incompressible data
- preserve large-window match advantage with window-aware guard
- propagate incompressible hints to HC skip path with sparse indexing
- add regression tests for best fast-path and cross-block tail behavior
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 779-805: The test
compressible_logs_do_not_fall_back_to_raw_fast_path currently checks block type
and round-trip decompression but doesn't verify the hinted-header path; update
the test (or assert_not_raw_for_level helper) so that for hinted variants
(super::CompressionLevel::Default and any explicit hinted Level like Level(0) or
Level(3) you want to cover) you also parse the compressed header and assert
single_segment_flag(&compressed) == false (use the existing single_segment_flag
symbol or equivalent header inspection helper), keeping the existing checks for
first_block_type and decompression; this ensures the hinted non-raw header path
is exercised and prevents regressions in the all_blocks_raw guard.
In `@zstd/src/encoding/levels/fastest.rs`:
- Around line 158-220: The tests currently only inspect the serialized header
bits in output, but compress_block_encoded(...) returns a BlockType that the
real codepath (FrameCompressor) consumes; capture the return value of
compress_block_encoded in both tests and assert it equals the expected enum
(e.g., BlockType::Rle for rle_branch_passes_compressible_hint_to_skip_matching
and BlockType::Raw for
raw_fast_path_emits_raw_block_and_passes_incompressible_hint) instead of—or in
addition to—inspecting output bytes, using the returned value from
compress_block_encoded to verify the correct branch was chosen.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a8c5e34b-d992-465c-8ddd-20b5f72f8174
📒 Files selected for processing (2)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/levels/fastest.rs
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
zstd/src/encoding/frame_compressor.rs (1)
647-661: ⚠️ Potential issue | 🟡 Minor: Pin the 511/512 single-segment boundary explicitly.

Line 323 introduces a new lower bound at 512 bytes, but the updated coverage starts at 513. That leaves the compat floor itself untested, so an off-by-one here would still pass. Please add explicit hinted 511-byte and 512-byte cases and assert that `single_segment_flag()` stays false / flips true at that boundary.

Also applies to: 699-741
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/frame_compressor.rs` around lines 647 - 661, In source_size_hint_levels_remain_ffi_compatible_small_inputs_matrix add explicit test cases for the 511 and 512 byte boundaries (i.e. include sizes 511 and 512 in the sizes test set used by that function and the similar block around lines 699-741), then assert that single_segment_flag() is false at 511 and true at 512 for the relevant compression levels; update the array of sizes and the per-size assertions so the single-segment boundary is explicitly tested rather than starting at 513.
♻️ Duplicate comments (1)
zstd/src/encoding/match_generator.rs (1)
1907-1918: ⚠️ Potential issue | 🟠 Major: Seed the last 4-byte-hashable Dfast starts here, not just 6-byte starts.

Line 1914 still stops at `DFAST_MIN_MATCH_LEN`, but `insert_position()` can already index starts with only 4 bytes of lookahead. That leaves the final two hashable starts of a literals-only block unseeded, so the next block can miss an immediate cross-boundary match that begins 4-5 bytes before the prior block ended. Because this helper is shared, the gap affects both `start_matching_general()` and `start_matching_fast_loop()`.

♻️ Proposed fix

```diff
  fn seed_remaining_hashable_starts(
      &mut self,
      current_abs_start: usize,
      current_len: usize,
      pos: usize,
  ) {
      let mut seed_pos = pos.min(current_len);
-     while seed_pos + DFAST_MIN_MATCH_LEN <= current_len {
+     while seed_pos + 4 <= current_len {
          self.insert_position(current_abs_start + seed_pos);
          seed_pos += 1;
      }
  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/match_generator.rs` around lines 1907 - 1918, The loop in seed_remaining_hashable_starts uses DFAST_MIN_MATCH_LEN which only preserves 6-byte starts; change the loop to seed the last 4-byte-hashable starts by replacing the termination check with the 4-byte threshold (use the existing 4-byte constant if present or literal 4) so the while becomes while seed_pos + 4 <= current_len { self.insert_position(current_abs_start + seed_pos); seed_pos += 1; } — this ensures insert_position(...) seeds the final two hashable starts for current_abs_start/current_len/pos and fixes cross-block matches used by start_matching_general() and start_matching_fast_loop().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 845-882: The test
hinted_small_compressible_frames_use_single_segment_across_levels only checks
header parity and roundtrip; update it to also assert the compressed frame does
not use raw blocks and still achieves size reduction: after compressing with
FrameCompressor (constructed via FrameCompressor::new and using
set_source_size_hint, set_source, set_drain, compress) and before decoding,
parse the frame (read_frame_header and any block iteration utilities) and add an
assertion that no BlockType::Raw blocks are present, and also assert that
compressed.len() < data.len() to ensure compression actually reduced size; keep
the existing single_segment_flag and roundtrip checks.
---
Outside diff comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 647-661: In
source_size_hint_levels_remain_ffi_compatible_small_inputs_matrix add explicit
test cases for the 511 and 512 byte boundaries (i.e. include sizes 511 and 512
in the sizes test set used by that function and the similar block around lines
699-741), then assert that single_segment_flag() is false at 511 and true at 512
for the relevant compression levels; update the array of sizes and the per-size
assertions so the single-segment boundary is explicitly tested rather than
starting at 513.
---
Duplicate comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 1907-1918: The loop in seed_remaining_hashable_starts uses
DFAST_MIN_MATCH_LEN which only preserves 6-byte starts; change the loop to seed
the last 4-byte-hashable starts by replacing the termination check with the
4-byte threshold (use the existing 4-byte constant if present or literal 4) so
the while becomes while seed_pos + 4 <= current_len {
self.insert_position(current_abs_start + seed_pos); seed_pos += 1; } — this
ensures insert_position(...) seeds the final two hashable starts for
current_abs_start/current_len/pos and fixes cross-block matches used by
start_matching_general() and start_matching_fast_loop().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 25f1af24-9b00-4cbd-89d8-444e8ea3262e
📒 Files selected for processing (4)
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/levels/fastest.rs, zstd/src/encoding/match_generator.rs, zstd/src/encoding/streaming_encoder.rs
Actionable comments posted: 1
♻️ Duplicate comments (1)
zstd/src/encoding/frame_compressor.rs (1)
843-883: ⚠️ Potential issue | 🟡 Minor: Add non-raw and size-reduction assertions to prevent silent regression.

This test verifies single-segment header parity but does not assert that the compressible fixture avoids `BlockType::Raw` or achieves size reduction. A future change could silently regress compression behavior while this test still passes.

🛡️ Proposed fix to harden the test

```diff
  let (frame_header, _) = read_frame_header(compressed.as_slice()).unwrap();
  assert!(
      frame_header.descriptor.single_segment_flag(),
      "hinted small compressible frame should use single-segment (level={level:?})"
  );
+ assert_ne!(
+     first_block_type(&compressed),
+     BlockType::Raw,
+     "compressible hinted frame should stay off the raw fast path (level={level:?})"
+ );
+ assert!(
+     compressed.len() < data.len(),
+     "compressible hinted frame should still shrink (level={level:?})"
+ );
  let mut decoded = Vec::new();
  zstd::stream::copy_decode(compressed.as_slice(), &mut decoded)
      .unwrap_or_else(|e| panic!("ffi decode failed (level={level:?}): {e}"));
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@zstd/src/encoding/frame_compressor.rs` around lines 843 - 883, The test hinted_small_compressible_frames_use_single_segment_across_levels should also assert that the compressed output actually achieved compression and did not emit raw blocks: after building compressed (from FrameCompressor with set_source_size_hint and set_source), parse the frame payload to inspect blocks (use existing frame/block-parsing helpers or the BlockType enum) and assert no BlockType::Raw is present for this fixture, and assert compressed.len() < data.len() to ensure size reduction; keep these assertions inside the same per-level loop so any regression that preserves single_segment_flag but emits raw data or grows the output will fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 247-261: The single-segment calculation in StreamingEncoder (the
single_segment variable computed from pledged_content_size and window_size)
intentionally omits the !use_dictionary_state guard present in FrameCompressor
because StreamingEncoder currently has no dictionary support; add a concise
inline comment next to the single_segment computation (or near the FrameHeader
construction) referencing FrameCompressor and noting that StreamingEncoder lacks
set_dictionary/use_dictionary_state and therefore deliberately allows
single-segment framing, and include a TODO that if
set_dictionary/use_dictionary_state or dictionary priming is later added to
StreamingEncoder the single_segment logic must be updated to include the
!use_dictionary_state guard to match FrameCompressor behavior.
---
Duplicate comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 843-883: The test
hinted_small_compressible_frames_use_single_segment_across_levels should also
assert that the compressed output actually achieved compression and did not emit
raw blocks: after building compressed (from FrameCompressor with
set_source_size_hint and set_source), parse the frame payload to inspect blocks
(use existing frame/block-parsing helpers or the BlockType enum) and assert no
BlockType::Raw is present for this fixture, and assert compressed.len() <
data.len() to ensure size reduction; keep these assertions inside the same
per-level loop so any regression that preserves single_segment_flag but emits
raw data or grows the output will fail.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5ca9f252-caaa-414a-9efa-75954c949ee2
📒 Files selected for processing (4)
- zstd/src/encoding/frame_compressor.rs
- zstd/src/encoding/levels/fastest.rs
- zstd/src/encoding/match_generator.rs
- zstd/src/encoding/streaming_encoder.rs
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'structured-zstd vs C FFI (x86_64-gnu)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.30.
| Benchmark suite | Current: b8dd1ab | Previous: 40b6e89 | Ratio |
|---|---|---|---|
| compress/default/small-4k-log-lines/matrix/pure_rust | 0.088 ms | 0.067 ms | 1.31 |
| compress/level4-row/small-4k-log-lines/matrix/pure_rust | 0.066 ms | 0.047 ms | 1.40 |
| compress/level4-row/low-entropy-1m/matrix/pure_rust | 9.058 ms | 6.449 ms | 1.40 |
| decompress/fastest/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/default/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/better/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/level4-row/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/best/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
@coderabbitai Re duplicate at zstd/src/encoding/frame_compressor.rs:843-883: fixed in cba456d. The test now asserts the first block is not Raw and verifies compressed.len() < data.len() for each hinted compressible level, in addition to single-segment and roundtrip checks.
All four properties the review comment requested are now covered.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/src/encoding/match_generator.rs`:
- Around line 1908-1918: In seed_remaining_hashable_starts change how seed_pos
is initialized so the final hashable starts at the block tail are always seeded
even when adaptive skip jumps to current_len: instead of seed_pos =
pos.min(current_len) compute seed_pos =
pos.min(current_len.saturating_sub(DFAST_SHORT_HASH_LOOKAHEAD)) (so when pos >=
current_len you still start at the last hashable start), then keep the existing
while loop that calls insert_position(current_abs_start + seed_pos) and
increments seed_pos; this ensures the tail hashable 4-byte starts aren't skipped
across block boundaries.
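The clamped initialization this prompt describes can be sketched as a standalone function. The constant name `DFAST_SHORT_HASH_LOOKAHEAD` and surrounding identifiers come from the review text; seeding is modeled as collecting offsets into a `Vec` rather than the real hash-table insert.

```rust
/// Lookahead assumed from the review text: a short hash covers 4 bytes,
/// so the last hashable start in a block of `current_len` bytes is at
/// `current_len - DFAST_SHORT_HASH_LOOKAHEAD`.
const DFAST_SHORT_HASH_LOOKAHEAD: usize = 4;

/// Return the absolute positions seeded for the block tail. Clamping with
/// `saturating_sub` means that even when an adaptive skip has jumped `pos`
/// to (or past) `current_len`, the final hashable start is still seeded.
fn tail_seed_positions(current_abs_start: usize, current_len: usize, pos: usize) -> Vec<usize> {
    let mut seeded = Vec::new();
    let mut seed_pos = pos.min(current_len.saturating_sub(DFAST_SHORT_HASH_LOOKAHEAD));
    while seed_pos + DFAST_SHORT_HASH_LOOKAHEAD <= current_len {
        seeded.push(current_abs_start + seed_pos);
        seed_pos += 1;
    }
    seeded
}

fn main() {
    // Adaptive skip overshot the block: pos == current_len. With a plain
    // `pos.min(current_len)` init the loop body would never run; with the
    // clamped init the last start (len - 4) is still seeded.
    assert_eq!(tail_seed_positions(1000, 16, 16), vec![1012]);
    // Normal case: seeding resumes at `pos` and walks to the last start.
    assert_eq!(tail_seed_positions(1000, 16, 10), vec![1010, 1011, 1012]);
}
```

The two assertions in `main` contrast the overshoot case against the normal resume case, which is exactly the boundary the prompt is guarding.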
In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 252-255: The single_segment gate currently checks both
pledged_content_size and the matcher's window_size, causing small pledged frames
to be treated differently depending on matcher internals; change the closure
that computes single_segment to only depend on pledged_content_size (i.e., keep
the 512..=(1<<14) range check and remove the "size <= window_size" condition) so
single_segment is true whenever a known frame size falls in that range
regardless of matcher window; this affects the variable single_segment computed
from pledged_content_size and ensures parity with FrameCompressor and
new_with_matcher behavior.
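The parity condition the prompt asks for reduces to a pure function of the pledged size alone. The `512..=(1 << 14)` bounds are taken from the review text; the function name is illustrative, not the encoder's actual API.

```rust
/// Decide single-segment framing from the pledged content size alone,
/// independent of the matcher's window size. Bounds from the review:
/// known sizes in [512, 16 KiB] get a single-segment frame header.
fn use_single_segment(pledged_content_size: Option<u64>) -> bool {
    matches!(pledged_content_size, Some(size) if (512..=(1u64 << 14)).contains(&size))
}

fn main() {
    assert!(use_single_segment(Some(512)));            // lower bound included
    assert!(use_single_segment(Some(1 << 14)));        // 16 KiB, upper bound included
    assert!(!use_single_segment(Some(511)));           // below range
    assert!(!use_single_segment(Some((1 << 14) + 1))); // above range
    assert!(!use_single_segment(None));                // unknown size: never single-segment
}
```

Because the decision no longer reads the matcher's window size, small pledged frames behave identically across matcher configurations, which is the parity with FrameCompressor and new_with_matcher that the prompt requires.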
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9cd60c69-7aa8-4cc8-98ca-6f3b5f944012
📒 Files selected for processing (3)
- zstd/src/encoding/frame_compressor.rs
- zstd/src/encoding/match_generator.rs
- zstd/src/encoding/streaming_encoder.rs
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'structured-zstd vs C FFI (x86_64-gnu)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.30.
| Benchmark suite | Current: 882dfd7 | Previous: 40b6e89 | Ratio |
|---|---|---|---|
| compress/level4-row/small-4k-log-lines/matrix/pure_rust | 0.068 ms | 0.047 ms | 1.45 |
| compress/level4-row/low-entropy-1m/matrix/pure_rust | 8.86 ms | 6.449 ms | 1.37 |
| decompress/fastest/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/default/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/better/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/level4-row/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
| decompress/best/low-entropy-1m/c_stream/matrix/c_ffi | 0.27 ms | 0.167 ms | 1.62 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
Summary
- `single_segment=true` for hinted one-shot payloads up to 16 KiB across levels
- payloads under 512 bytes stay on non-single-segment framing
- `skip_matching_with_hint(Some(false))`
- `BlockType` from `compress_block_encoded()`

Current behavior (HEAD)

- hinted one-shot payloads (`source_size_hint` <= 16 KiB) now use single-segment framing across levels when the known size falls in `[512 ..= 16 KiB]`

Validation

- `cargo fmt --all -- --check`
- `cargo clippy --all-targets --features hash,std,dict_builder -- -D warnings`
- `cargo nextest run --workspace --features hash,std,dict_builder`
- `STRUCTURED_ZSTD_EMIT_REPORT=1 cargo bench -p structured-zstd --bench compare_ffi --features hash,std,dict_builder -- 'compress/fastest/small-4k-log-lines/matrix'`
  - `REPORT_HDR scenario=small-4k-log-lines level=fastest encoder=rust ... single_segment=true`
  - `REPORT_HDR scenario=small-4k-log-lines level=fastest encoder=ffi ... single_segment=true`
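The raw fast path summarized above hinges on a cheap incompressibility check. The crate's actual classifier is not shown in this thread; the sketch below is one plausible sampling heuristic (assumed, not taken from the source): sample evenly spaced bytes and treat the block as incompressible when the sample spreads across most of the 256 byte-value buckets. The function name `block_looks_incompressible` echoes the PR walkthrough; the thresholds are illustrative.

```rust
/// Heuristic sketch: sample up to `SAMPLES` evenly spaced bytes and count
/// distinct byte values. High-entropy data (already-compressed, encrypted,
/// random) fills most of the 256 buckets; text and low-entropy data do not.
const SAMPLES: usize = 256;

fn block_looks_incompressible(block: &[u8]) -> bool {
    if block.len() < SAMPLES {
        return false; // too small to judge cheaply; let the matcher try
    }
    let step = block.len() / SAMPLES;
    let mut seen = [false; 256];
    let mut distinct = 0usize;
    for i in 0..SAMPLES {
        let b = block[i * step] as usize;
        if !seen[b] {
            seen[b] = true;
            distinct += 1;
        }
    }
    // 256 samples drawn uniformly from 256 values are expected to hit
    // ~162 distinct buckets; requiring more than half (128) leaves a wide
    // margin on both sides. Threshold is illustrative, not tuned.
    distinct >= 128
}

fn main() {
    // Pseudo-random bytes via a tiny LCG: should look incompressible.
    let mut x: u32 = 0x1234_5678;
    let random: Vec<u8> = (0..256)
        .map(|_| {
            x = x.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
            (x >> 24) as u8
        })
        .collect();
    assert!(block_looks_incompressible(&random));

    // Repetitive text: very few distinct sampled byte values.
    let text = b"abcabcabc".repeat(500);
    assert!(!block_looks_incompressible(&text));
}
```

A check like this runs in O(SAMPLES) regardless of block size, which is what makes skipping straight to a Raw block cheaper than letting the matcher discover the data is incompressible.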
Closes #97
Summary by CodeRabbit
Improvements
Tests
Bench
Chore