feat(encoding): write frame content size in encoder output by polaz · Pull Request #60 · structured-world/structured-zstd

polaz · 2026-04-03T19:05:12Z

Summary

- FrameCompressor now automatically writes Frame Content Size (FCS) in the frame header by buffering compressed blocks and deferring the header until total input size is known
- StreamingEncoder gains set_pledged_content_size() API for callers who know input size upfront; finish() verifies the pledge matches bytes written
- Fix FCS descriptor flag mapping bug (8 => 3, was 3 => 8, which would panic on 8-byte FCS values)
- Add find_fcs_field_size() with correct +256 offset range for 2-byte FCS fields (256–65791)
- Always include window descriptor alongside FCS (no single_segment) to avoid C zstd incompatibility with compressed blocks whose encoder window exceeds the declared content size
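
As a rough illustration of the sizing rule and flag mapping listed above, here is a minimal sketch based on the Zstandard frame format. The function names and bodies are hypothetical stand-ins, not the code in zstd/src/encoding/util.rs or frame_header.rs. In particular, the 4-byte fallback for values below 256 when single_segment is off is an assumption drawn from the format, where a 1-byte field only exists in single-segment frames.

```rust
/// Hypothetical sketch: picks the FCS field width in bytes for `val`.
fn fcs_field_size_sketch(val: u64, single_segment: bool) -> usize {
    if single_segment && val <= 255 {
        1 // a 1-byte field exists only when Single_Segment_flag is set
    } else if (256..=65_791).contains(&val) {
        2 // stored as val - 256, so the usable range is shifted up by 256
    } else if val <= u64::from(u32::MAX) {
        4
    } else {
        8
    }
}

/// Maps a field width to the 2-bit Frame_Content_Size_flag (note 8 => 3).
fn fcs_flag_sketch(field_size: usize) -> u8 {
    match field_size {
        1 => 0,
        2 => 1,
        4 => 2,
        8 => 3,
        other => unreachable!("unexpected FCS field size {}", other),
    }
}

/// Serializes `val` little-endian into exactly `field_size` bytes.
fn serialize_fcs_sketch(val: u64, field_size: usize) -> Vec<u8> {
    match field_size {
        1 => vec![val as u8],
        2 => {
            // Guard against underflow of `val - 256`.
            debug_assert!((256..=65_791).contains(&val));
            ((val - 256) as u16).to_le_bytes().to_vec()
        }
        4 => (val as u32).to_le_bytes().to_vec(),
        8 => val.to_le_bytes().to_vec(),
        other => unreachable!("unexpected FCS field size {}", other),
    }
}
```

Under these assumptions, a 300-byte input takes the 2-byte path and is stored as 44 (300 - 256), while a 100-byte input without single_segment falls back to the 4-byte field.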

Technical Details

The C reference defaults ZSTD_c_contentSizeFlag to 1 (enabled) and writes FCS whenever the pledged source size is known. A known content size lets decoders pre-allocate output buffers, which can improve decompression speed.

FrameCompressor: blocks are accumulated in an intermediate buffer instead of being flushed to the drain per-block. After all input is consumed, the correct header (with FCS) is written first, then the buffered blocks. This trades O(compressed_size) memory for automatic FCS detection without requiring callers to pledge.
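
The buffering strategy can be sketched roughly as follows. This is a toy model, not the FrameCompressor code: `compress_buffered_sketch` is a hypothetical name, the "blocks" are raw copies of the input, and the "header" is just the total size as 8 little-endian bytes. Only the ordering (buffer blocks, then header, then blocks) reflects the description above.

```rust
use std::io::{self, Write};

/// Hypothetical sketch: buffer every block, then emit header followed by blocks.
fn compress_buffered_sketch<W: Write>(
    input: &[u8],
    block_size: usize,
    drain: &mut W,
) -> io::Result<u64> {
    assert!(block_size > 0, "block size must be non-zero");

    let mut all_blocks: Vec<u8> = Vec::new();
    let mut total_uncompressed: u64 = 0;

    for chunk in input.chunks(block_size) {
        // Placeholder for "block header + compressed payload".
        all_blocks.extend_from_slice(chunk);
        total_uncompressed += chunk.len() as u64;
    }

    // Only now is the total size known, so the header can carry it.
    let header = total_uncompressed.to_le_bytes(); // placeholder header
    drain.write_all(&header)?;
    drain.write_all(&all_blocks)?;
    Ok(total_uncompressed)
}
```

The memory cost is visible in `all_blocks`: nothing reaches the drain until the whole input has been consumed.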

StreamingEncoder: since the header must be written before data flows, FCS requires an explicit pledge via set_pledged_content_size(). Without a pledge, the header omits FCS (matching previous behavior).

No single_segment: the C encoder sets single_segment when windowSize >= pledgedSrcSize, which makes the decoder use FCS as window size. Our compressed blocks are encoded against the matcher's actual window (128KB+), so setting single_segment with a small FCS would give the decoder a window too small for the block format. We always include the window descriptor instead (costs 1 byte per frame).
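
To make the window-size concern concrete, the sketch below decodes a Window_Descriptor byte using the exponent/mantissa formula from the Zstandard format specification. Both function names are hypothetical, and the second function only restates the reasoning in the paragraph above; it is not a check that exists in this crate.

```rust
/// Hypothetical helper: decodes a Window_Descriptor byte into a window size in bytes.
fn window_size_from_descriptor(byte: u8) -> u64 {
    let exponent = u64::from(byte >> 3); // upper 5 bits
    let mantissa = u64::from(byte & 0b111); // lower 3 bits
    let window_base = 1u64 << (10 + exponent);
    let window_add = (window_base / 8) * mantissa;
    window_base + window_add
}

/// True when a single-segment frame (decoder window == content size) would
/// declare a smaller window than the one the blocks were encoded against.
fn single_segment_would_undersize(encoder_window: u64, content_size: u64) -> bool {
    content_size < encoder_window
}
```

A 128 KiB window corresponds to exponent 7 and mantissa 0, that is, descriptor byte 0x38, which is the single extra byte per frame mentioned above.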

Test Plan

Closes #16

Summary by CodeRabbit

New Features
- Non-streaming compressor now emits Frame Content Size (FCS) automatically; streaming encoder adds an API to pledge an upfront frame size and enforces it.
Bug Fixes
- Corrected FCS field sizing/serialization for external decoder compatibility.
- Improved empty-input handling and stricter pledge/size validation during streaming writes and finish.
Documentation
- README updated to mark FCS implemented and note streamer pledge requirement.
Tests
- Added tests for pledged-size behavior, round-trip decoding, compatibility, and edge cases.

- FrameCompressor now buffers compressed blocks and writes the frame header after all input is consumed, so the total uncompressed size is known and encoded as Frame_Content_Size (FCS)
- StreamingEncoder gains set_pledged_content_size() for callers who know the input size upfront; finish() verifies the pledge
- Fix FCS descriptor flag mapping bug (8 => 3, was 3 => 8)
- Add find_fcs_field_size() with correct +256 offset range for 2-byte FCS fields (256–65791 vs find_min_size's 256–65535)
- Always include window descriptor alongside FCS (no single_segment) to avoid C zstd incompatibility with compressed blocks whose encoder window exceeds the declared content size

Closes #16

coderabbitai · 2026-04-03T19:05:21Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f501afcf-4ddd-4e42-ae32-28218e90eaa9

📥 Commits

Reviewing files that changed from the base of the PR and between a734b37 and 3308ec2.

📒 Files selected for processing (2)

zstd/src/encoding/frame_compressor.rs
zstd/src/encoding/streaming_encoder.rs

📝 Walkthrough

Walkthrough

Buffers encoded blocks to compute and emit Frame Content Size (FCS) in headers; adds FCS sizing/serialization utilities; StreamingEncoder gains pledged-content-size API with write-time enforcement and finish-time validation; README updated and tests added for zstd compatibility.

Changes

Cohort / File(s)	Summary
Documentation `README.md`	Marked "Frame Content Size (FCS)" implemented in the compression feature checklist.
Frame Buffering & Emission `zstd/src/encoding/frame_compressor.rs`	Buffer encoded blocks into `all_blocks`, track `total_uncompressed`, defer FrameHeader serialization until after input is read, adjust empty-input handling, assert non-zero window, and write header then buffered blocks; added std-only compatibility test.
Streaming Encoder (pledge handling) `zstd/src/encoding/streaming_encoder.rs`	Added `pledged_content_size: Option<u64>` and `bytes_consumed`; `set_pledged_content_size()` API with pre-write validation; writes enforce pledge (may accept/truncate); `finish()` validates pledge equals consumed bytes; added related tests.
Frame Header FCS Serialization `zstd/src/encoding/frame_header.rs`	Reworked FCS encoding: use `find_fcs_field_size(... )` and `serialize_fcs(...)`, update descriptor mapping, replace `minify` helper with direct serializers and assertions.
FCS Sizing Utility & Tests `zstd/src/encoding/util.rs`	Added `pub fn find_fcs_field_size(val: u64, single_segment: bool) -> usize` implementing 1/2/4/8-byte selection with unit tests; minor doc tweak to `find_min_size`.

Sequence Diagram

sequenceDiagram
    participant App as Application
    participant SE as StreamingEncoder
    participant FC as FrameCompressor
    participant FH as FrameHeader
    participant W as Writer

    App->>SE: set_pledged_content_size(size)
    SE->>SE: store pledged_content_size

    App->>SE: write(data)
    SE->>SE: check pledge / accept or truncate write
    SE->>FC: compress_block(data)
    FC->>FC: append BlockHeader + payload to all_blocks (track total_uncompressed)
    FC-->>SE: return (buffered)

    App->>SE: finish()
    SE->>SE: if pledged -> validate bytes_consumed == pledged
    alt mismatch
        SE-->>App: Error(InvalidInput)
    else ok
        SE->>FH: compute field_size = find_fcs_field_size(total_uncompressed, single_segment)
        FH->>FH: serialize_fcs(total_uncompressed, field_size)
        FH->>W: write FrameHeader (header_buf)
        SE->>W: write buffered blocks (all_blocks)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat(encoding): add Better compression level (zstd level 7, lazy2 strategy) #48: Modifies FrameCompressor::compress and block emission logic; overlaps buffering and block-output behavior.
feat(encoding): add streaming write encoder #45: Overlaps StreamingEncoder pledge handling and FCS-related streaming write/bookkeeping changes.

Poem

🐰 I counted hops and every byte I ate,
I pledged a size and guarded every plate.
One, two, four or eight — the header keeps the score,
I buffered all my carrots and wrote them out in store. 🎉

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: writing frame content size in encoder output, which is the central objective of the PR.
Linked Issues check	✅ Passed	The PR implements all core coding requirements from issue `#16`: FCS encoding with correct field sizing (1/2/4/8 bytes), pledged-size validation in StreamingEncoder, automatic FCS inference in FrameCompressor, fixed FCS descriptor flag mapping, correct 2-byte field handling with +256 offset, and C zstd compatibility.
Out of Scope Changes check	✅ Passed	All changes are directly related to implementing FCS support: FrameCompressor now buffers blocks to compute total size, StreamingEncoder adds pledged-size tracking and validation, frame_header.rs corrects FCS encoding logic, util.rs adds FCS field-sizing logic, and README documents the feature. No unrelated changes detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Copilot

Pull request overview

This PR enhances the encoder to emit Zstandard Frame Content Size (FCS) in the frame header when the uncompressed size is known, improving decoder pre-allocation and cross-compatibility with the C zstd implementation.

Changes:

Add correct FCS sizing/serialization utilities and fix the FCS descriptor flag mapping (8-byte FCS flag value).
Update FrameCompressor to buffer compressed blocks so the header can be written after total input size is known (and always include the window descriptor).
Add StreamingEncoder::set_pledged_content_size() and enforce pledge verification in finish(), plus new tests and README capability update.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

File	Description
zstd/src/encoding/util.rs	Adds `find_fcs_field_size()` and unit tests for correct FCS field sizing rules.
zstd/src/encoding/frame_header.rs	Uses `find_fcs_field_size()` for descriptor + payload serialization and fixes the FCS flag mapping bug.
zstd/src/encoding/frame_compressor.rs	Buffers blocks to write the header with FCS after total input size is known; adds C-zstd compatibility test.
zstd/src/encoding/streaming_encoder.rs	Adds pledged content size API and verifies pledge at finish; adds streaming tests.
README.md	Documents support for Frame Content Size.

codecov · 2026-04-03T19:11:10Z

Codecov Report

❌ Patch coverage is 94.93088% with 11 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
zstd/src/encoding/frame_compressor.rs	91.93%	5 Missing ⚠️
zstd/src/encoding/frame_header.rs	76.92%	3 Missing ⚠️
zstd/src/encoding/streaming_encoder.rs	97.36%	3 Missing ⚠️

sw-release-bot

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'structured-zstd vs C FFI'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.

Benchmark suite	Current: `3308ec2`	Previous: `98c1be0`	Ratio
`compress/default/small-1k-random/matrix/pure_rust`	`6.012` ms	`4.311` ms	`1.39`
`compress/best/small-1k-random/matrix/pure_rust`	`0.312` ms	`0.243` ms	`1.28`
`compress/best/small-1k-random/matrix/c_ffi`	`0.837` ms	`0.378` ms	`2.21`
`compress/default/small-10k-random/matrix/pure_rust`	`7.707` ms	`6.017` ms	`1.28`
`compress/default/small-10k-random/matrix/c_ffi`	`0.028` ms	`0.022` ms	`1.27`
`compress/better/small-10k-random/matrix/c_ffi`	`0.113` ms	`0.096` ms	`1.18`
`compress/best/small-10k-random/matrix/pure_rust`	`0.863` ms	`0.622` ms	`1.39`
`compress/best/small-10k-random/matrix/c_ffi`	`0.787` ms	`0.425` ms	`1.85`
`compress/default/small-4k-log-lines/matrix/pure_rust`	`5.615` ms	`4.611` ms	`1.22`
`compress/best/small-4k-log-lines/matrix/c_ffi`	`0.714` ms	`0.303` ms	`2.36`
`compress/better/decodecorpus-z000033/matrix/pure_rust`	`83.996` ms	`56.997` ms	`1.47`
`compress/best/decodecorpus-z000033/matrix/pure_rust`	`100.077` ms	`66.517` ms	`1.50`
`compress/best/decodecorpus-z000033/matrix/c_ffi`	`23.067` ms	`19.446` ms	`1.19`
`compress/better/high-entropy-1m/matrix/pure_rust`	`125.318` ms	`66.849` ms	`1.87`
`compress/best/high-entropy-1m/matrix/pure_rust`	`145.539` ms	`56.907` ms	`2.56`
`compress/best/high-entropy-1m/matrix/c_ffi`	`1.789` ms	`0.92` ms	`1.94`
`compress/default/low-entropy-1m/matrix/pure_rust`	`11.966` ms	`9.692` ms	`1.23`
`compress/best/low-entropy-1m/matrix/pure_rust`	`5.569` ms	`4.59` ms	`1.21`
`compress/best/low-entropy-1m/matrix/c_ffi`	`1.72` ms	`1.193` ms	`1.44`
`compress/fastest/large-log-stream/matrix/c_ffi`	`2.763` ms	`2.282` ms	`1.21`
`decompress/best/high-entropy-1m/c_stream/matrix/c_ffi`	`0.168` ms	`0.026` ms	`6.46`
`compress-dict/better/small-10k-random/matrix/c_ffi_with_dict`	`0.122` ms	`0.103` ms	`1.18`

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz

…ions

- Fix find_fcs_field_size doc: only 1-byte encoding is gated on single_segment; 2-byte range (256–65791) works regardless
- Fix set_pledged_content_size doc: remove stale Single_Segment claim since encoder always includes the window descriptor
- Add debug_assert guards to serialize_fcs against underflow when field_size==2 and val<256, reject unexpected field_size values with unreachable!()

Splice the small header (max 14 bytes) into the front of all_blocks instead of writing header and blocks as two separate drain calls, eliminating the second full-size buffer allocation.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Line 50: Update the README checkbox/line about "Frame Content Size" to clearly
describe the two FCS code paths: state that FrameCompressor automatically learns
and exposes FCS by buffering the entire frame (trading memory for automatic
FCS), whereas StreamingEncoder does not provide FCS unless the caller calls
set_pledged_content_size() before the first write() (allowing streaming with no
buffering); mention the memory tradeoff and behavior so callers know which API
gives FCS automatically and which requires an explicit pledged size.

In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 218-223: The compress() path should reject matchers reporting
window_size() == 0 before attempting any reads: add the same guard used in
StreamingEncoder to check self.state.matcher.window_size() and return an error
(or the same failure variant) from compress() when it equals 0, preventing reads
into an empty block buffer and avoiding emitting an empty frame for non-empty
input; locate the check near the start of frame compressor logic (around the
existing let window_size = self.state.matcher.window_size() usage) and mirror
the error handling behavior used by StreamingEncoder.

In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 88-95: The method set_pledged_content_size currently checks
frame_started before verifying the encoder's sticky error state; call
ensure_open() at the start of set_pledged_content_size() and propagate any error
it returns before doing other checks, so that a prior I/O failure is honored;
then keep the existing frame_started check and only set pledged_content_size
when ensure_open() succeeds and frame has not started.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 98c1be0 and 978262f.

📒 Files selected for processing (5)

README.md
zstd/src/encoding/frame_compressor.rs
zstd/src/encoding/frame_header.rs
zstd/src/encoding/streaming_encoder.rs
zstd/src/encoding/util.rs

…e API

- Add assert for window_size == 0 in FrameCompressor::compress(), matching the existing guard in StreamingEncoder
- Call ensure_open() in set_pledged_content_size() to honor sticky error state before accepting a pledge
- Clarify README FCS line: auto in FrameCompressor, pledge-based in StreamingEncoder

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

… header

- Move pledge validation in finish() before ensure_frame_started() to avoid emitting a header with incorrect FCS on mismatch
- Reject writes exceeding pledged content size in write() rather than only detecting at finish()
- Rename bytes_written → bytes_consumed to clarify it tracks uncompressed input, not drain output
- Revert splice prepend to two write_all calls: splice is O(n) memmove vs two cheap sequential writes

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

The comment claimed validation happens "before emitting the frame header", but the header is already written on the first write() via ensure_frame_started(). Reword to accurately describe that validation happens before finalizing the frame.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 936-943: The test pledged_content_size_mismatch_returns_error uses
the literal "only 10 bytes" which is misleading because its length is 13; update
the test in StreamingEncoder::new / set_pledged_content_size / write_all /
finish to use a clearer literal or comment—e.g., replace the string with "short
string" or update the message/comment to state the actual byte length—so the
test name and data are not confusing to future maintainers.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between d2ddcf7 and e69754b.

📒 Files selected for processing (1)

zstd/src/encoding/streaming_encoder.rs

Replace "only 10 bytes" (actually 13 bytes) with "short payload" and add a clarifying comment.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 936-951: Add a new unit test that verifies writes which would
exceed a previously set pledge are rejected: create a StreamingEncoder via
StreamingEncoder::new(...), call set_pledged_content_size(5) on it, then call
write_all with a buffer longer than 5 and assert the call returns an Err whose
kind() is ErrorKind::InvalidInput; name the test
write_exceeding_pledge_returns_error to match existing tests and place it
alongside the other pledged-content-size tests.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between e69754b and db737b5.

📒 Files selected for processing (1)

zstd/src/encoding/streaming_encoder.rs

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

…ial writes

- Replace overflow-prone addition with checked_sub + truncation to honor Write partial-write semantics (accept up to remaining pledge)
- Add write_exceeding_pledge_returns_error test
- Update compress() docstring to document O(compressed_size) buffering
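
The pledge bookkeeping described in this commit can be sketched in isolation. `PledgeTracker` below is a hypothetical type, not part of StreamingEncoder; it models only the checked_sub + truncation step and the finish-time comparison, assuming errors are reported as `ErrorKind::InvalidInput` as the review comments state.

```rust
use std::io::{self, ErrorKind};

/// Hypothetical model of pledge tracking on uncompressed input bytes.
struct PledgeTracker {
    pledged: Option<u64>,
    consumed: u64,
}

impl PledgeTracker {
    fn new(pledged: Option<u64>) -> Self {
        Self { pledged, consumed: 0 }
    }

    /// Returns how many leading bytes of `buf` may be accepted (partial-write semantics).
    fn admit(&mut self, buf: &[u8]) -> io::Result<usize> {
        let allowed = match self.pledged {
            None => buf.len(),
            Some(pledged) => {
                // checked_sub avoids the overflow an addition-based check could hit.
                let remaining = pledged.checked_sub(self.consumed).ok_or_else(|| {
                    io::Error::new(ErrorKind::InvalidInput, "consumed exceeds pledge")
                })?;
                if remaining == 0 && !buf.is_empty() {
                    return Err(io::Error::new(
                        ErrorKind::InvalidInput,
                        "write exceeds pledged content size",
                    ));
                }
                // Truncate to the remaining pledge; the result never exceeds buf.len().
                (buf.len() as u64).min(remaining) as usize
            }
        };
        self.consumed += allowed as u64;
        Ok(allowed)
    }

    /// Fails when a pledge was made and does not match the bytes consumed.
    fn finish(&self) -> io::Result<()> {
        match self.pledged {
            Some(pledged) if pledged != self.consumed => Err(io::Error::new(
                ErrorKind::InvalidInput,
                "pledged content size does not match bytes consumed",
            )),
            _ => Ok(()),
        }
    }
}
```

With a pledge of 5, an 8-byte write is accepted as a partial write of 5 bytes, and only the following non-empty write is rejected. A caller using write_all would therefore see the error on the retry rather than on the first call.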

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

zstd/src/encoding/streaming_encoder.rs (1)

936-951: ⚠️ Potential issue | 🟡 Minor

Add a dedicated test for write-time pledge overflow rejection.

Coverage still lacks a direct test for the write() guard path (Line 405-410) where a write exceeding the pledge must fail with InvalidInput.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@zstd/src/encoding/streaming_encoder.rs` around lines 936 - 951, Add a test
that sets a pledged content size on a StreamingEncoder and then calls write (the
path guarded at the write() method) with a payload that would overflow the
pledge so the write itself returns an error; specifically, create a
StreamingEncoder via StreamingEncoder::new, call
set_pledged_content_size(n).unwrap(), then call write (or write_all) with more
than n bytes and assert the returned error is ErrorKind::InvalidInput (matching
the write-time guard in write()). Ensure the test targets the write-time pledge
overflow case rather than the finish-time check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 978-989: The test no_pledged_size_omits_fcs_from_header currently
asserts header.frame_content_size() == 0, which is ambiguous because it cannot
distinguish an omitted FCS from an FCS whose value is 0. Instead, after
reading the header via crate::decoding::frame::read_frame_header(...), assert
that the Frame Header Descriptor reports no FCS field (check the descriptor on
the returned header object rather than frame_content_size). Locate the header
type returned by read_frame_header and add an assertion on the descriptor's
Frame_Content_Size_flag / FCS field size to prove the FCS was omitted.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between e69754b and db737b5.

📒 Files selected for processing (1)

zstd/src/encoding/streaming_encoder.rs

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

… tests

- Move pledge exhaustion check before ensure_frame_started() so that misuse like pledge(0) + write(data) doesn't emit a partial header
- Assert descriptor.frame_content_size_bytes() in FCS tests to distinguish "FCS present with value 0" from "FCS omitted"
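
The difference between "FCS present with value 0" and "FCS omitted" lives in the Frame_Header_Descriptor byte, not in the decoded value. A minimal sketch of that decoding, following the bit layout in the Zstandard format specification, is shown below; `fcs_field_bytes_sketch` is a hypothetical free function, not the crate's frame_content_size_bytes() method.

```rust
/// Hypothetical sketch: FCS field width in bytes implied by a
/// Frame_Header_Descriptor byte. A result of 0 means the field is absent.
fn fcs_field_bytes_sketch(descriptor: u8) -> usize {
    let flag = descriptor >> 6; // bits 7-6: Frame_Content_Size_flag
    let single_segment = ((descriptor >> 5) & 1) == 1; // bit 5: Single_Segment_flag
    match flag {
        // Flag 0 means "1 byte" only in single-segment frames, otherwise "no FCS".
        0 => {
            if single_segment {
                1
            } else {
                0
            }
        }
        1 => 2,
        2 => 4,
        _ => 8,
    }
}
```

A test for the no-pledge path can then assert a width of 0, while a test for an empty pledged frame would expect a non-zero width together with a decoded value of 0.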

polaz · 2026-04-04T07:09:47Z

@coderabbitai re-review

coderabbitai · 2026-04-04T07:09:58Z

@polaz Sure, I'll re-review the PR now!

🧠 Learnings used

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-03T17:39:58.596Z
Learning: In `zstd/src/decoding/frame_decoder.rs`, `FrameDecoderState::reset()` intentionally does NOT reserve the frame-content-size (FCS) delta beyond `window_size`. Two cases cover the design space: (1) multi-segment frames where FCS >> window_size — the delta can reach ~99 MiB, and eagerly reserving it would penalize streaming callers; (2) single-segment frames where window_size == FCS — the delta is always 0 so there is nothing to reserve. Incremental growth during decoding is handled by the per-block `MAX_BLOCK_SIZE` (128 KiB) reserve in `execute_sequences`. Do not flag the absence of an FCS-delta reserve in `reset()` as a missing pre-allocation.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-03T16:36:03.209Z
Learning: In `zstd/src/decoding/frame_decoder.rs`, `FrameDecoderState::reset()` appears asymmetric to `new()` in its pre-allocation, but the asymmetry is intentional: `decoder_scratch.reset(window_size)` internally calls `DecodeBuffer::reset()` which already reserves `window_size`. Therefore `reset()` only needs to call `scratch.buffer.reserve(frame_content_size)` on top of that for the FCS delta, rather than explicitly reserving `window_size` again as `new()` does. Do not re-flag this as a bug or asymmetry.

✅ Actions performed

Full review triggered.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/encoding/frame_compressor.rs`:
- Around line 450-498: The test fcs_header_written_and_c_zstd_compatible
currently exercises inputs that only produce 4-byte FCS; add a 2-byte FCS case
(e.g. a 300‑byte buffer such as vec![0xCD; 300]) to the inputs array so the test
covers the 256..=65_791 / -256 encoding path; update the inputs variable in that
function (fcs_header_written_and_c_zstd_compatible) to include the new &[u8]
reference for the 300-byte vector so the round-trip and FCS presence assertions
exercise the 2-byte FCS encoding.

In `@zstd/src/encoding/streaming_encoder.rs`:
- Around line 967-973: Add a new assertion that exercises the partial-progress
write() path instead of only write_all(): create the StreamingEncoder via
StreamingEncoder::new and set_pledged_content_size(5), then call write(...) with
a buffer longer than 5 and assert it returns Ok(n) where n equals the remaining
pledge (e.g. 5), and then call write(...) again and assert it returns an Err
with ErrorKind::InvalidInput; reference the existing test
write_exceeding_pledge_returns_error, StreamingEncoder::new,
set_pledged_content_size, write_all and write to locate where to add this
additional case.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 98c1be0 and a734b37.

📒 Files selected for processing (5)

README.md
zstd/src/encoding/frame_compressor.rs
zstd/src/encoding/frame_header.rs
zstd/src/encoding/streaming_encoder.rs
zstd/src/encoding/util.rs

… test

- Add 300-byte input to FCS interop matrix to exercise the 256–65791 two-byte FCS encoding path with -256 offset
- Add write_straddling_pledge_reports_partial_progress test to verify write() returns Ok(n) at the pledge boundary instead of hard error

polaz · 2026-04-04T07:34:15Z

@coderabbitai re-review

coderabbitai · 2026-04-04T07:34:27Z

@polaz Sure, I'll re-review the PR now!

🧠 Learnings used

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 48
File: zstd/src/encoding/match_generator.rs:1554-1559
Timestamp: 2026-04-02T12:03:04.619Z
Learning: In `zstd/src/encoding/match_generator.rs`, `HcMatchGenerator::insert_position` has an early-return guard at `abs_pos >= u32::MAX` that disables all further HC insertions for streams >4 GiB, degrading `CompressionLevel::Better` to repcodes-only. This is a known limitation tracked in issue `#51`; the proper fix (table rebasing) is deferred. A TODO comment and a limitation note on `CompressionLevel::Better` doc were added in PR `#48`. Do not re-flag this as a new bug.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 44
File: zstd/src/encoding/frame_compressor.rs:0-0
Timestamp: 2026-03-28T22:55:01.577Z
Learning: In `zstd/src/encoding/frame_compressor.rs`, `FrameCompressor::set_dictionary` returns `Result<Option<Dictionary>, DictionaryDecodeError>` and enforces two fail-fast programmer-error contracts via `Err(...)` rather than panics: (1) `dictionary.id == 0` returns `DictionaryDecodeError::ZeroDictionaryId`; (2) any zero entry in `dictionary.offset_hist` returns `DictionaryDecodeError::ZeroRepeatOffsetInDictionary`. The fallible parsing paths (`Dictionary::decode_dict`, `Dictionary::from_raw_content`) apply the same rejections at input boundaries. Do not flag these as inconsistencies — the uniform `Result`-based contract is intentional.

✅ Actions performed

Full review triggered.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings April 3, 2026 19:05

Copilot started reviewing on behalf of polaz April 3, 2026 19:05

style: apply cargo fmt

a27ab10

Copilot AI reviewed Apr 3, 2026

Comment thread zstd/src/encoding/util.rs Outdated

Comment thread zstd/src/encoding/streaming_encoder.rs Outdated

Comment thread zstd/src/encoding/frame_header.rs Outdated

Comment thread zstd/src/encoding/frame_compressor.rs

sw-release-bot Bot reviewed Apr 3, 2026

polaz added 2 commits April 3, 2026 22:42

perf(encoding): prepend FCS header into block buffer to avoid extra copy

978262f

coderabbitai Bot reviewed Apr 3, 2026

Comment thread README.md Outdated

Comment thread zstd/src/encoding/frame_compressor.rs

Comment thread zstd/src/encoding/streaming_encoder.rs

polaz requested a review from Copilot April 3, 2026 20:14

Copilot started reviewing on behalf of polaz April 3, 2026 20:15

Copilot AI reviewed Apr 3, 2026

Comment thread zstd/src/encoding/streaming_encoder.rs

Comment thread zstd/src/encoding/streaming_encoder.rs

Comment thread zstd/src/encoding/streaming_encoder.rs Outdated

Comment thread zstd/src/encoding/frame_compressor.rs Outdated

polaz requested a review from Copilot April 3, 2026 21:25

Copilot started reviewing on behalf of polaz April 3, 2026 21:26

Copilot AI reviewed Apr 3, 2026

Comment thread zstd/src/encoding/streaming_encoder.rs Outdated

coderabbitai Bot reviewed Apr 3, 2026

Comment thread zstd/src/encoding/streaming_encoder.rs

test(encoding): fix misleading test literal in pledge mismatch test

db737b5

Replace "only 10 bytes" (actually 13 bytes) with "short payload" and add a clarifying comment.
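To illustrate why the old literal was misleading, here is a rough sketch of a pledge check. `PledgeTracker` is a hypothetical stand-in written for this example; it is not the crate's `StreamingEncoder`, and its method names and error type are assumptions. It only mirrors the behavior described in the PR summary, where `finish()` verifies that the pledged size matches the bytes written.

```rust
/// Hypothetical stand-in for an encoder that tracks a pledged input size.
struct PledgeTracker {
    pledged: Option<u64>,
    written: u64,
}

impl PledgeTracker {
    fn new(pledged: Option<u64>) -> Self {
        Self { pledged, written: 0 }
    }

    /// Count the bytes that would be fed to the encoder.
    fn write(&mut self, data: &[u8]) {
        self.written += data.len() as u64;
    }

    /// Succeed only if no pledge was made or the pledge matches exactly.
    fn finish(&self) -> Result<u64, String> {
        match self.pledged {
            Some(p) if p != self.written => {
                Err(format!("pledged {} bytes, wrote {}", p, self.written))
            }
            _ => Ok(self.written),
        }
    }
}
```

Both `b"only 10 bytes"` and `b"short payload"` are 13 bytes long, so a test that pledges a larger size fails the check with either literal; the rename only removes the suggestion that the payload is 10 bytes.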

polaz requested a review from Copilot April 4, 2026 06:09

Copilot started reviewing on behalf of polaz April 4, 2026 06:10

coderabbitai Bot reviewed Apr 4, 2026

Comment thread zstd/src/encoding/streaming_encoder.rs

Copilot AI reviewed Apr 4, 2026

Comment thread zstd/src/encoding/streaming_encoder.rs

Comment thread zstd/src/encoding/frame_compressor.rs

coderabbitai Bot reviewed Apr 4, 2026

Comment thread zstd/src/encoding/streaming_encoder.rs

polaz requested a review from Copilot April 4, 2026 06:30

Copilot started reviewing on behalf of polaz April 4, 2026 06:30

Copilot AI reviewed Apr 4, 2026

Comment thread zstd/src/encoding/streaming_encoder.rs Outdated

Comment thread zstd/src/encoding/frame_compressor.rs Outdated

Comment thread zstd/src/encoding/streaming_encoder.rs

polaz requested a review from Copilot April 4, 2026 07:09

Copilot started reviewing on behalf of polaz April 4, 2026 07:10

Copilot AI reviewed Apr 4, 2026

coderabbitai Bot reviewed Apr 4, 2026

Comment thread zstd/src/encoding/frame_compressor.rs

Comment thread zstd/src/encoding/streaming_encoder.rs

polaz requested a review from Copilot April 4, 2026 07:34

Copilot started reviewing on behalf of polaz April 4, 2026 07:35

Copilot AI reviewed Apr 4, 2026

Comment thread zstd/src/encoding/frame_compressor.rs Outdated

refactor(encoding): extract large test vec into named local binding

3308ec2

polaz requested a review from Copilot April 4, 2026 10:17

Copilot started reviewing on behalf of polaz April 4, 2026 10:18

Copilot AI reviewed Apr 4, 2026

polaz merged commit cd31f5e into main Apr 4, 2026
14 of 15 checks passed

polaz deleted the feat/#16-feat-frame-content-size-in-encoder-output branch April 4, 2026 10:34

sw-release-bot Bot mentioned this pull request Apr 3, 2026

chore: release v0.0.6 #57

Merged

Conversation

polaz commented Apr 3, 2026 • edited by coderabbitai Bot

Summary

Technical Details

Test Plan

Summary by CodeRabbit

coderabbitai Bot commented Apr 3, 2026 • edited

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Poem

Copilot AI left a comment

Pull request overview

Reviewed changes

codecov Bot commented Apr 3, 2026 • edited

Codecov Report

sw-release-bot Bot left a comment • edited

⚠️ Performance Alert ⚠️

coderabbitai Bot left a comment

Copilot AI left a comment

Pull request overview

Copilot AI left a comment

Pull request overview

coderabbitai Bot left a comment

coderabbitai Bot left a comment

Copilot AI left a comment

Pull request overview

coderabbitai Bot left a comment

Copilot AI left a comment

Pull request overview

polaz commented Apr 4, 2026
