
perf(decoding): 4-stream interleaved Huffman decode and bulk table init#54

Merged
polaz merged 12 commits into main from
feat/#10-perf-huffman-decoder--4-stream-parallel-decoding-a
Apr 3, 2026

Conversation

@polaz
Member

@polaz polaz commented Apr 2, 2026

Summary

  • Interleave decode operations across all 4 Huffman streams to hide memory latency: 4 independent table lookups then 4 state advances per iteration, instead of sequential stream-by-stream processing
  • Replace entry-by-entry table filling with slice::fill() for bulk initialization (compiles to vectorized stores, matching C reference HUF_DEltX1_set4() pattern)
  • Add #[inline(always)] to decode_symbol, init_state, next_state for better codegen in the interleaved loop
  • Per-segment cursor bounds: all starts/ends clamped to [base, base+regen] and cursors bounded by segment ends, matching C reference opX < oendX guards — prevents OOB on corrupted inputs
  • Update dev-dependencies: criterion 0.5→0.8, rand 0.8→0.10, zstd 0.13.2→0.13.3, cli deps to latest patch versions
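
The bulk table initialization from the second bullet can be sketched as follows. `Entry` and `fill_entries` are simplified stand-ins for the real decode-table types in `huff0_decoder.rs`, not the actual API:

```rust
// Simplified stand-in for a Huffman decode-table entry; the real
// table in huff0_decoder.rs stores a symbol plus its bit length.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Entry {
    symbol: u8,
    num_bits: u8,
}

// Fill a contiguous range with one repeated entry. slice::fill
// compiles to bulk (often vectorized) stores, unlike a per-index
// assignment loop, matching the C HUF_DEltX1_set4 pattern.
fn fill_entries(table: &mut [Entry], start: usize, len: usize, symbol: u8, num_bits: u8) {
    table[start..start + len].fill(Entry { symbol, num_bits });
}

fn main() {
    let mut table = vec![Entry { symbol: 0, num_bits: 0 }; 16];
    fill_entries(&mut table, 4, 8, 42, 3);
    assert!(table[4..12].iter().all(|e| e.symbol == 42 && e.num_bits == 3));
    assert_eq!(table[3], Entry { symbol: 0, num_bits: 0 });
    assert_eq!(table[12], Entry { symbol: 0, num_bits: 0 });
    println!("ok");
}
```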

Technical Details

The C reference (huf_decompress.c) processes 4 Huffman streams with interleaved operations so that while one stream waits for a table lookup result, other streams can proceed. The previous Rust implementation decoded streams sequentially in a for loop.
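The interleaving can be modeled with plain byte slices standing in for the bit-readers and table lookups. `decode_interleaved` is an illustrative toy, not the real decoder, but the fast-path/drain shape matches the description above:

```rust
// Toy model of the interleaved loop: each "decode" here is just a
// byte read, but the shape matches the real code — 4 independent
// lookups, then 4 state advances, per fast-path iteration.
fn decode_interleaved(streams: [&[u8]; 4]) -> [Vec<u8>; 4] {
    let mut out: [Vec<u8>; 4] = Default::default();
    let mut pos = [0usize; 4];

    // Fast path: run while every stream still has input. The 4
    // reads carry no data dependency on each other, so the CPU's
    // out-of-order engine can overlap their latencies.
    while (0..4).all(|i| pos[i] < streams[i].len()) {
        let syms = [
            streams[0][pos[0]],
            streams[1][pos[1]],
            streams[2][pos[2]],
            streams[3][pos[3]],
        ];
        for i in 0..4 {
            out[i].push(syms[i]);
            pos[i] += 1;
        }
    }

    // Drain phase: finish each stream individually.
    for i in 0..4 {
        out[i].extend_from_slice(&streams[i][pos[i]..]);
    }
    out
}

fn main() {
    let streams = [&[1u8, 2, 3][..], &[4u8, 5][..], &[6u8, 7, 8, 9][..], &[10u8][..]];
    let out = decode_interleaved(streams);
    assert_eq!(out[0], vec![1, 2, 3]);
    assert_eq!(out[1], vec![4, 5]);
    assert_eq!(out[2], vec![6, 7, 8, 9]);
    assert_eq!(out[3], vec![10]);
    println!("ok");
}
```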

The new interleaved loop structure:

  1. Fast path: while all 4 streams have bits, decode one symbol from each (4 independent table lookups), then advance each state (4 independent bit reads) — gives the CPU's OoO engine 4x more independent work per iteration
  2. Drain phase: finish remaining symbols from each stream individually
  3. Per RFC 8878 §3.1.1.3.2, segment sizes are known ahead of time (first 3 streams produce ceil(regen_size/4) symbols each), so output is written directly into pre-allocated target slices via cursor offsets — zero extra allocations on the hot path
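
The segment bookkeeping from step 3 can be sketched like this. `segment_bounds` is a hypothetical helper; the real code computes the starts/ends inline in `literals_section_decoder.rs`:

```rust
// Per RFC 8878 §3.1.1.3.2 the first 3 streams each regenerate
// ceil(regen / 4) literals and the 4th gets the remainder. Every
// start/end is clamped to base + regen so corrupted headers with a
// tiny regen cannot produce out-of-bounds cursors.
fn segment_bounds(base: usize, regen: usize) -> ([usize; 4], [usize; 4]) {
    let seg = regen.div_ceil(4);
    let limit = base + regen;
    let starts = [
        base,
        (base + seg).min(limit),
        (base + 2 * seg).min(limit),
        (base + 3 * seg).min(limit),
    ];
    let ends = [starts[1], starts[2], starts[3], limit];
    (starts, ends)
}

fn main() {
    // Normal case: 10 literals split as 3 + 3 + 3 + 1.
    assert_eq!(segment_bounds(0, 10), ([0, 3, 6, 9], [3, 6, 9, 10]));
    // Degenerate case: regen < 4 leaves later segments empty
    // instead of pointing past the end of the output.
    assert_eq!(segment_bounds(0, 2), ([0, 1, 2, 2], [1, 2, 2, 2]));
    println!("ok");
}
```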

Known Limitations

Double-symbol (X2) decode variant and stream selection heuristic are not implemented yet — tracked separately for a future release.

Test Plan

  • 172 unit tests pass
  • 15 cross-validation tests (Rust ↔ C FFI) pass
  • 11 CLI tests pass
  • 8 doc-tests pass
  • decode_all benchmark runs (~5.0ms on test corpus)
  • compare_ffi benchmark compiles and runs with criterion 0.8

Closes #10

Summary by CodeRabbit

  • Chores

    • Updated CLI and library dependency versions and dev tooling for improved stability and security.
  • Refactor

    • Reworked decompression internals to use interleaved decoding, added bounds checks and stronger error detection, and applied targeted inlining to improve throughput and memory safety.
  • Tests

    • Adjusted benchmarks and test helpers to use the latest tooling and RNG APIs for more consistent, reliable measurements.

- Interleave decode operations across all 4 Huffman streams to hide
  memory latency: 4 independent table lookups then 4 state advances
  per iteration instead of sequential stream-by-stream processing
- Replace entry-by-entry table filling with slice::fill() for bulk
  initialization (compiles to vectorized stores)
- Add #[inline(always)] to decode_symbol, init_state, next_state for
  better codegen in the interleaved loop
- Update dev-dependencies: criterion 0.5→0.8, rand 0.8→0.10,
  zstd 0.13.2→0.13.3, and bump cli deps to latest patch versions

Closes #10
Copilot AI review requested due to automatic review settings April 2, 2026 23:23
@coderabbitai

coderabbitai Bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ed3f4ba6-f872-40f0-ae93-63d841457c3f

📥 Commits

Reviewing files that changed from the base of the PR and between dd23b7e and 8dea12a.

📒 Files selected for processing (1)
  • zstd/benches/support/mod.rs

📝 Walkthrough

Walkthrough

Adds an interleaved 4‑stream Huffman literals decoder with per‑stream bit-readers and guarded output bounds, applies slice-fill bulk initialization and inlining hints in the Huffman decoder, and updates Cargo dependency and bench/dev-dependency versions/imports.

Changes

Cohort / File(s) Summary
Dependency updates
cli/Cargo.toml, zstd/Cargo.toml
Bumped CLI crate deps (clap, tracing, indicatif, tracing-indicatif, console, tracing-subscriber) and dev-deps (criterion 0.5→0.8, rand 0.8.5→0.10 with feature changes, zstd dev-dep bumped).
4‑stream literals decoding
zstd/src/decoding/literals_section_decoder.rs
Rewrote the num_streams == 4 path to set up 4 HuffmanDecoder/BitReaderReversed pairs, compute per-stream start/end ranges, decode interleaved into a preallocated target, drain the streams, and validate/truncate on decoded-count or bitstream mismatches.
Huffman table & inlining
zstd/src/huff0/huff0_decoder.rs
Added #[inline(always)] to key HuffmanDecoder methods and replaced per-entry decode-table assignments with bulk slice.fill() initialization.
Benches & RNG imports
zstd/benches/compare_ffi.rs, zstd/benches/support/mod.rs
Switched black_box import to std::hint::black_box, trimmed criterion imports, replaced rand::RngCore with rand::Rng and updated random byte fill call.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant LiteralsDecoder as Literals\nSectionDecoder
    participant Huff1 as HuffmanDecoder\n(Stream1)
    participant Huff2 as HuffmanDecoder\n(Stream2)
    participant Huff3 as HuffmanDecoder\n(Stream3)
    participant Huff4 as HuffmanDecoder\n(Stream4)
    participant Buffers as Output\nBuffers

    Caller->>LiteralsDecoder: decode_literals(num_streams=4)
    LiteralsDecoder->>LiteralsDecoder: parse headers & slice into 4 substreams
    LiteralsDecoder->>Huff1: init_state(br1)
    LiteralsDecoder->>Huff2: init_state(br2)
    LiteralsDecoder->>Huff3: init_state(br3)
    LiteralsDecoder->>Huff4: init_state(br4)
    LiteralsDecoder->>Buffers: allocate target slice (base..base+regen)

    rect rgba(100,150,200,0.5)
    loop interleaved decode
        LiteralsDecoder->>Huff1: decode_symbol()
        Huff1->>Buffers: write symbol at cursor1
        LiteralsDecoder->>Huff2: decode_symbol()
        Huff2->>Buffers: write symbol at cursor2
        LiteralsDecoder->>Huff3: decode_symbol()
        Huff3->>Buffers: write symbol at cursor3
        LiteralsDecoder->>Huff4: decode_symbol()
        Huff4->>Buffers: write symbol at cursor4
        LiteralsDecoder->>Huff1: next_state(br1)
        LiteralsDecoder->>Huff2: next_state(br2)
        LiteralsDecoder->>Huff3: next_state(br3)
        LiteralsDecoder->>Huff4: next_state(br4)
    end
    end

    rect rgba(200,150,100,0.5)
    Note over LiteralsDecoder: per-stream drain & finalize
    LiteralsDecoder->>Huff1: drain remaining -> buf1
    LiteralsDecoder->>Huff2: drain remaining -> buf2
    LiteralsDecoder->>Huff3: drain remaining -> buf3
    LiteralsDecoder->>Huff4: drain remaining -> buf4
    end

    LiteralsDecoder->>LiteralsDecoder: validate total decoded count, truncate on mismatch
    LiteralsDecoder->>Caller: return decoded output or error

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I split the bits and hopped in four,
I read and wrote—one symbol, then more,
I filled the table in bulk and neat,
I guarded bounds with careful feet,
Hoppity Huffman — faster, sweet! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main performance optimizations: 4-stream interleaved Huffman decoding and bulk table initialization via slice::fill().
Linked Issues check ✅ Passed The PR implements 4-stream interleaved Huffman decode and bulk table initialization (criteria 1-2 from #10), with #[inline(always)] and per-segment bounds. Double-symbol (X2) variant and stream-selection heuristic remain unimplemented but are documented as known limitations.
Out of Scope Changes check ✅ Passed All changes align with #10 objectives: interleaved decoding, bulk table init, inline attributes for codegen, bounds enforcement, and dependency updates. No unrelated changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.




@codecov

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 92.47312% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| zstd/src/decoding/literals_section_decoder.rs | 92.04% | 7 Missing ⚠️ |



Copilot AI left a comment


Pull request overview

This PR optimizes zstd literal Huffman decoding by aligning the Rust implementation more closely with the C reference: it interleaves work across the 4 Huffman bitstreams to increase instruction-level parallelism and speeds up Huffman table construction via bulk slice initialization.

Changes:

  • Interleave 4-stream Huffman decoding in decompress_literals (decode 4 symbols, then advance 4 states per iteration).
  • Use slice::fill() to bulk-initialize ranges of identical Huffman decode table entries.
  • Update dev-dependency versions for benchmarks/tests (criterion/rand/zstd) and bump several CLI dependencies.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
zstd/src/huff0/huff0_decoder.rs Adds forced inlining on hot decoder methods; bulk-fills decode table entries via slice::fill()
zstd/src/decoding/literals_section_decoder.rs Reworks 4-stream literals decode into an interleaved loop + per-stream buffering then concatenation
zstd/Cargo.toml Updates dev-dependencies for benches/tests (criterion/rand/zstd)
cli/Cargo.toml Updates CLI dependency patch versions; updates CLI dev-dependencies

Comment thread zstd/Cargo.toml
Comment thread zstd/src/decoding/literals_section_decoder.rs Outdated
- rand 0.10: RngCore → Rng (fill_bytes moved to Rng trait)
- criterion 0.8: black_box → std::hint::black_box (deprecated)
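
The black_box half of that migration needs only the standard library. A minimal sketch (`sum` and the data are illustrative, not from the benches):

```rust
use std::hint::black_box;

fn sum(v: &[u64]) -> u64 {
    v.iter().sum()
}

fn main() {
    let data: Vec<u64> = (0..1000).collect();
    // black_box hides the value from the optimizer so the measured
    // call is neither constant-folded nor dead-code-eliminated;
    // criterion 0.8 deprecates its own re-export in favor of the
    // std version.
    let result = black_box(sum(black_box(&data)));
    assert_eq!(result, 499_500);
    println!("ok");
}
```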

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cli/Cargo.toml`:
- Line 19: The dependency entry clap = { version = "4.6.0", features =
["derive"] } raises the MSRV to 1.85, so either update CI/toolchain and docs to
require Rust 1.85+ or pin clap to a pre-4.6.x version that supports our current
MSRV; modify the Cargo.toml dependency accordingly and update CI configuration
(rust-toolchain / workflow matrix) and any README or contribution docs to state
the new minimum Rust version if you choose to keep clap 4.6.0.

In `@zstd/src/decoding/literals_section_decoder.rs`:
- Around line 128-181: The per-stream buffers (bufs) are unbounded and you must
validate each of the 4 substream decoded lengths against the known per-stream
target sizes from section.regenerated_size before concatenation; modify the loop
that drains remaining symbols (using brs, decoders, max_bits) to either decode
directly into pre-sized target slices or accumulate into per-stream buffers and
then assert buf.len() == expected_len for each stream (compute expected_len from
section.regenerated_size split into 4 parts), returning DecompressLiteralsError
if any mismatch, and only call target.extend_from_slice(buf) after each
per-stream length has been validated.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 18931dbb-5e05-481d-93dc-a84875467781

📥 Commits

Reviewing files that changed from the base of the PR and between d402315 and b751e26.

📒 Files selected for processing (4)
  • cli/Cargo.toml
  • zstd/Cargo.toml
  • zstd/src/decoding/literals_section_decoder.rs
  • zstd/src/huff0/huff0_decoder.rs

Comment thread cli/Cargo.toml
Comment thread zstd/src/decoding/literals_section_decoder.rs Outdated
…et slices

- Use RFC 8878 segment sizes to pre-allocate target and write directly
  via cursor offsets instead of per-stream Vec buffers
- Add decoded count validation with truncation for corrupted streams

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/benches/compare_ffi.rs`:
- Around line 14-15: Reorder the use statements to match rustfmt's canonical
ordering: place the std import(s) before external crate imports and sort items
within braces alphabetically; specifically move use std::hint::black_box; above
the criterion import and ensure the criterion::{criterion_group, criterion_main,
Criterion, SamplingMode, Throughput} list is alphabetically ordered (and apply
the same ordering rule to the other import on line 19 so the file is fully
rustfmt-compliant).

In `@zstd/benches/support/mod.rs`:
- Line 1: The `use rand::{rngs::SmallRng, Rng, SeedableRng};` import order is
failing rustfmt; fix by running `cargo fmt --all` (or manually reorder the items
inside the `use rand::{...}` to rustfmt's expected ordering), ensuring the
symbols SmallRng, Rng, and SeedableRng are arranged in the canonical sorted
order so `cargo fmt --check` passes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: babe9a42-9f9a-438c-8131-74f39cb2be35

📥 Commits

Reviewing files that changed from the base of the PR and between b751e26 and 7b34db0.

📒 Files selected for processing (2)
  • zstd/benches/compare_ffi.rs
  • zstd/benches/support/mod.rs

Comment thread zstd/benches/compare_ffi.rs Outdated
Comment thread zstd/benches/support/mod.rs Outdated

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread zstd/src/decoding/literals_section_decoder.rs Outdated
Comment thread zstd/src/decoding/literals_section_decoder.rs
- Clamp starts[3] via min(3*seg, regen) to prevent OOB when regen < 6
- Add per-stream end indices and bound cursor writes in both interleaved
  and drain loops, matching C reference opX < oendX guards
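
The guard described in these commit notes can be sketched as follows. `write_bounded` is a hypothetical helper mirroring the C `opX < oendX` check, not the actual code:

```rust
// Only store a symbol while the cursor is below its segment end,
// mirroring the C reference's `opX < oendX` guard. A stream that
// produces more symbols than its segment holds simply stops
// writing; the caller then reports the count mismatch as an error.
fn write_bounded(target: &mut [u8], cursor: &mut usize, end: usize, symbol: u8) -> bool {
    if *cursor < end {
        target[*cursor] = symbol;
        *cursor += 1;
        true
    } else {
        false
    }
}

fn main() {
    let mut target = [0u8; 4];
    let mut cursor = 2;
    assert!(write_bounded(&mut target, &mut cursor, 3, 9));
    assert_eq!(target, [0, 0, 9, 0]);
    // Segment full: the write is rejected instead of going OOB.
    assert!(!write_bounded(&mut target, &mut cursor, 3, 7));
    assert_eq!(cursor, 3);
    println!("ok");
}
```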

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/decoding/literals_section_decoder.rs`:
- Around line 137-144: The computed starts/ends for the four streams can exceed
the true window (base..base+regen) for small/corrupted regen and cause
out-of-bounds accesses in the drain loop; update the logic around starts, ends,
and cursors (the variables base, seg, starts, ends, cursors, regen, target) so
every start and end is clamped into the valid slice bounds (base..base+regen or
target.len()) — e.g. after computing starts and ends, replace each with
usize::min(value, base + regen) (or target.len()) so cursors[i] < ends[i] can
never index past the vector. Ensure cursors is initialized from the clamped
starts so all downstream indexing is safe.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e942235f-b03e-4f4a-94f3-356dacc45775

📥 Commits

Reviewing files that changed from the base of the PR and between b3887ef and d025ebe.

📒 Files selected for processing (1)
  • zstd/src/decoding/literals_section_decoder.rs

Comment thread zstd/src/decoding/literals_section_decoder.rs

@sw-release-bot sw-release-bot Bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'structured-zstd vs C FFI'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.

| Benchmark suite | Current: 7d378cd | Previous: d402315 | Ratio |
| --- | --- | --- | --- |
| compress/best/small-1k-random/matrix/c_ffi | 0.375 ms | 0.315 ms | 1.19 |
| compress/default/large-log-stream/matrix/c_ffi | 3.898 ms | 3.203 ms | 1.22 |
| decompress/default/high-entropy-1m/c_stream/matrix/c_ffi | 0.035 ms | 0.026 ms | 1.35 |
| decompress/better/high-entropy-1m/c_stream/matrix/c_ffi | 0.036 ms | 0.027 ms | 1.33 |
| decompress/best/high-entropy-1m/c_stream/matrix/c_ffi | 0.173 ms | 0.027 ms | 6.41 |
| compress-dict/better/small-4k-log-lines/matrix/c_ffi_with_dict | 0.004 ms | 0.003 ms | 1.33 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz

@polaz polaz requested a review from Copilot April 3, 2026 00:25
All starts (not just starts[3]) are clamped to base+regen via min(),
preventing OOB indexing when regen < 4 (e.g. corrupted frame headers).

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment thread zstd/src/decoding/literals_section_decoder.rs
Comment thread zstd/benches/support/mod.rs
Comment thread zstd/Cargo.toml
@polaz polaz requested a review from Copilot April 3, 2026 00:54
@polaz
Member Author

polaz commented Apr 3, 2026

@coderabbitai resume and re-review

@coderabbitai

coderabbitai Bot commented Apr 3, 2026

@polaz Sure! Resuming automatic reviews and triggering a full re-review of PR #54 now.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/structured-zstd PR: 48
File: zstd/src/encoding/match_generator.rs:1554-1559
Timestamp: 2026-04-02T12:03:04.619Z
Learning: In `zstd/src/encoding/match_generator.rs`, `HcMatchGenerator::insert_position` has an early-return guard at `abs_pos >= u32::MAX` that disables all further HC insertions for streams >4 GiB, degrading `CompressionLevel::Better` to repcodes-only. This is a known limitation tracked in issue `#51`; the proper fix (table rebasing) is deferred. A TODO comment and a limitation note on `CompressionLevel::Better` doc were added in PR `#48`. Do not re-flag this as a new bug.
✅ Actions performed

Reviews resumed.


Full review triggered.


Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/decoding/literals_section_decoder.rs`:
- Around line 198-203: In the short-decode branch of LiteralsSectionDecoder (the
code that calls short_decode and writes into target), detect a mismatch between
the number of bytes reported as decoded (decoded) and the expected target length
immediately and return the appropriate decode error (e.g.,
Err(DecodeError::Corrupted) or truncated/short-decode error) from that branch
instead of letting downstream length checks surface the problem; update the
branch that calls short_decode/short_decode_into (working with symbols target
and decoded) to return the short-decode error when decoded != target.len().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a12477d4-f279-4e12-9949-c52ef834ca3a

📥 Commits

Reviewing files that changed from the base of the PR and between d402315 and e597fb2.

📒 Files selected for processing (6)
  • cli/Cargo.toml
  • zstd/Cargo.toml
  • zstd/benches/compare_ffi.rs
  • zstd/benches/support/mod.rs
  • zstd/src/decoding/literals_section_decoder.rs
  • zstd/src/huff0/huff0_decoder.rs

Comment thread zstd/src/decoding/literals_section_decoder.rs Outdated
@polaz polaz requested a review from Copilot April 3, 2026 01:22

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread zstd/src/decoding/literals_section_decoder.rs

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/decoding/literals_section_decoder.rs`:
- Around line 133-145: The post-write length check wrongly compares target.len()
to section.regenerated_size without accounting for pre-existing bytes; move base
and regen to function scope so the final validation asserts target.len() == base
+ regen (not just regen), and update both places that build starts/ends (the
4-stream path that defines starts/ends and the single-stream path) to use this
base-relative check; reference the variables base, regen, starts, ends,
section.regenerated_size and ensure the error still returns
DecodedLiteralCountMismatch when the condition fails.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4024174f-abb0-47d9-aa7a-a35c06c66842

📥 Commits

Reviewing files that changed from the base of the PR and between a296eb7 and 5c3aae4.

📒 Files selected for processing (1)
  • zstd/src/decoding/literals_section_decoder.rs

Comment thread zstd/src/decoding/literals_section_decoder.rs Outdated
Move base/regen to function scope so the final length check compares
target.len() against base+regen (not just regen). On 4-stream mismatch,
truncate to base (clean boundary) instead of base+decoded (scattered
segments). Both error paths now report regen-only counts consistently.
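
The final length check described in this commit can be sketched as follows. `validate_regen` and the error string are illustrative; the real code returns `DecodedLiteralCountMismatch`:

```rust
// After decoding, the target must hold exactly base + regen bytes:
// the pre-existing prefix plus the regenerated literals. On a
// mismatch, truncate back to base so a reused Vec never exposes
// partial or zero-filled output, and report regen-relative counts.
fn validate_regen(target: &mut Vec<u8>, base: usize, regen: usize) -> Result<(), String> {
    if target.len() == base + regen {
        return Ok(());
    }
    let decoded = target.len().saturating_sub(base);
    target.truncate(base);
    Err(format!("decoded {decoded} literals, expected {regen}"))
}

fn main() {
    let mut ok = vec![1u8, 2, 3, 4, 5];
    assert!(validate_regen(&mut ok, 2, 3).is_ok());

    let mut bad = vec![1u8, 2, 3, 4]; // only 2 of 3 literals decoded
    assert!(validate_regen(&mut bad, 2, 3).is_err());
    assert_eq!(bad, vec![1, 2]); // truncated to the clean boundary
    println!("ok");
}
```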

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread zstd/src/decoding/literals_section_decoder.rs
Comment thread zstd/src/decoding/literals_section_decoder.rs
All error returns after target.resize() now truncate back to base,
preventing callers from observing partial/zero-filled output after
a failed decode when the Vec is reused across calls.

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread zstd/benches/support/mod.rs Outdated

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.



Development

Successfully merging this pull request may close these issues.

perf: Huffman decoder — 4-stream parallel decoding and table initialization
