📝 Walkthrough

Adds a scenario×level Criterion benchmark matrix with dictionary benches, a support module for benchmark scenarios, a CI script that parses enriched REPORT/REPORT_MEM/REPORT_DICT output into JSON and a multi-table Markdown report, a flamegraph helper script, documentation (BENCHMARKS.md, README), and .gitignore entries for generated artifacts.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant CI as CI
    participant Script as run-benchmarks.sh
    participant Bench as compare_ffi (cargo bench)
    participant Parser as parser (in script)
    participant Artifacts as Artifacts (JSON / MD)
    CI->>Script: start benchmark run
    Script->>Bench: invoke cargo bench (env: mktemp raw file)
    Bench-->>Script: stdout (Criterion timings + REPORT / REPORT_MEM / REPORT_DICT)
    Script->>Parser: parse raw output (unescape labels, extract sections)
    Parser->>Artifacts: write benchmark-results.json (timings)
    Parser->>Artifacts: write benchmark-report.md (ratios, memory, dict, timings tables)
    Script->>CI: exit status and artifact locations
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Pull request overview
Expands the project’s performance benchmarking infrastructure to better track structured-zstd (pure Rust) vs the C zstd reference across a broader scenario/level matrix, with CI-friendly reporting and local profiling helpers.
Changes:
- Reworks `compare_ffi` into a multi-scenario / multi-level Criterion matrix with `REPORT ...` metadata lines for post-processing.
- Adds benchmark scenario/level scaffolding (including optional Silesia fixtures via env var).
- Introduces reporting + documentation: CI/local benchmark runner output (JSON + Markdown), a flamegraph helper script, and benchmark workflow docs.
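To make the `REPORT ...` mechanism concrete, here is a minimal sketch of how a bench might emit such a line alongside Criterion's normal output; the field names and layout below are assumptions for illustration, not the actual format used by `compare_ffi`:

```rust
// Hypothetical sketch of REPORT metadata emission: the bench prints one
// machine-parseable line per scenario/level, and run-benchmarks.sh later
// extracts these lines from the raw cargo-bench output. Field layout is
// assumed, not taken from the real bench.
fn format_report_line(scenario: &str, level: i32, input_len: u64, compressed_len: u64) -> String {
    let ratio = input_len as f64 / compressed_len as f64;
    format!(
        "REPORT scenario={scenario} level={level} input={input_len} compressed={compressed_len} ratio={ratio:.3}"
    )
}

fn main() {
    // One line per matrix cell, printed to stdout next to Criterion timings.
    println!("{}", format_report_line("log_lines", 3, 1_048_576, 524_288));
}
```

Because the lines share a fixed prefix, the parsing side only needs a prefix match to separate them from Criterion's own output.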
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| zstd/benches/support/mod.rs | Adds reusable benchmark scenarios (random/entropy/large/log/corpus + optional Silesia) and level mapping. |
| zstd/benches/compare_ffi.rs | Converts benchmark into a matrix for compress/decompress with per-scenario configuration and report-line emission. |
| .github/scripts/run-benchmarks.sh | Runs the matrix and generates benchmark-results.json plus a human-readable markdown report. |
| scripts/bench-flamegraph.sh | Adds a helper to generate flamegraphs for a selected Criterion benchmark filter. |
| BENCHMARKS.md | Documents scenarios, level mapping, and common benchmark commands/outputs. |
| README.md | Links to the benchmark documentation and summarizes what the suite measures. |
| .gitignore | Ignores generated benchmark JSON/markdown outputs. |
Codecov Report: ✅ All modified and coverable lines are covered by tests.
- add scenario-based Criterion matrix against C zstd
- generate benchmark JSON and markdown reports for CI
- document benchmark workflows and add flamegraph helper

Refs #24
Force-pushed: b5733d5 to 3a419fa
- guard ratio reporting for zero-length inputs
- skip empty Silesia fixtures with BENCH_WARN
- use mktemp + trap for raw bench output parsing
- make flamegraph --root opt-in via BENCH_FLAMEGRAPH_USE_ROOT
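The mktemp + trap pattern from the commit message can be sketched as follows; the `REPORT` line and file contents here are placeholders standing in for real cargo-bench output:

```shell
# Sketch of the mktemp + trap pattern: capture raw bench output in a
# temp file and guarantee cleanup on every exit path (success, failure,
# or interrupt).
set -euo pipefail
RAW_OUTPUT="$(mktemp)"
trap 'rm -f "$RAW_OUTPUT"' EXIT

# Stand-in for: cargo bench ... | tee "$RAW_OUTPUT"
printf 'REPORT scenario=demo level=3\nother criterion line\n' > "$RAW_OUTPUT"

# Parse only the machine-readable REPORT lines out of the raw output.
grep -c '^REPORT ' "$RAW_OUTPUT"
```

The trap fires on EXIT rather than a specific signal, so the temp file is removed even when `set -e` aborts the script partway through.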
@coderabbitai re-review

✅ Actions performed: Full review triggered.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/bench-flamegraph.sh`:
- Around line 13-21: The script expands EXTRA_FLAMEGRAPH_ARGS directly in the
cargo flamegraph command which can trigger "unbound variable" on older Bash with
set -u; change the expansion to a safe guarded form such as
"${EXTRA_FLAMEGRAPH_ARGS[@]:-}" (or use the conditional expansion
"${EXTRA_FLAMEGRAPH_ARGS:+${EXTRA_FLAMEGRAPH_ARGS[@]}}") in the cargo flamegraph
invocation so the empty array won't cause an error; update the line that calls
cargo flamegraph (the command using EXTRA_FLAMEGRAPH_ARGS, and ensure
BENCH_FLAMEGRAPH_USE_ROOT/EXTRA_FLAMEGRAPH_ARGS logic remains unchanged).
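The two guarded expansions the comment mentions behave differently, which a small sketch makes visible (`count_args` is a hypothetical stand-in for the `cargo flamegraph` invocation):

```shell
# Under set -u, older Bash treats expanding an empty array as an unbound
# variable error, so the expansion must be guarded.
set -u
EXTRA_FLAMEGRAPH_ARGS=()

count_args() { printf '%d args\n' "$#"; }

# ":-" form: always safe, but passes one empty argument when the array
# is empty -- the wrapped command would see a stray "" flag.
count_args "${EXTRA_FLAMEGRAPH_ARGS[@]:-}"

# ":+" form: expands to nothing when the array is empty, and to the
# fully quoted array otherwise -- usually what the invocation wants.
count_args ${EXTRA_FLAMEGRAPH_ARGS[@]:+"${EXTRA_FLAMEGRAPH_ARGS[@]}"}

EXTRA_FLAMEGRAPH_ARGS=(--freq 997)
count_args ${EXTRA_FLAMEGRAPH_ARGS[@]:+"${EXTRA_FLAMEGRAPH_ARGS[@]}"}
```

The `:+` form is the safer drop-in for a command line, since it never injects an empty positional argument.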
In `@zstd/benches/compare_ffi.rs`:
- Around line 62-69: The benchmark currently allocates the target buffer inside
the iteration (vec![0u8; expected_len]) which skews pure decompression timing;
move the target buffer allocation outside the closure and reuse it inside the
"pure_rust" bench closure (create a mutable target Vec<u8> of size expected_len
before calling group.bench_function and then reset or zero its contents each
iteration before calling FrameDecoder::decode_all), keeping the decoder creation
inside the iteration if you want to measure decoder setup as well or moving that
out if you only want decode throughput; update the "pure_rust" bench closure
(where FrameDecoder::new() and target are used) to reference the pre-allocated
target to eliminate per-iteration allocation overhead.
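The suggested hoisting can be sketched without Criterion; `decode_into` below is a hypothetical stand-in for `FrameDecoder::decode_all`, and the plain loop stands in for the `b.iter` closure:

```rust
// Sketch of the fix: hoist the output buffer out of the measured loop so
// each iteration pays only for the decode, not a fresh allocation.
// decode_into is a stub standing in for FrameDecoder::decode_all.
fn decode_into(compressed: &[u8], target: &mut [u8]) -> usize {
    // Stand-in "decode": copy the payload into the reused buffer.
    let n = compressed.len();
    target[..n].copy_from_slice(compressed);
    n
}

fn bench_decompress(compressed: &[u8], expected_len: usize, iters: usize) -> usize {
    // Allocated once, outside the timed closure (the b.iter equivalent).
    let mut target = vec![0u8; expected_len];
    let mut written = 0;
    for _ in 0..iters {
        // Per-iteration work: decode only, no per-iteration allocation.
        written = decode_into(compressed, &mut target);
        assert_eq!(written, expected_len);
    }
    written
}

fn main() {
    let frame = vec![42u8; 1024];
    println!("{}", bench_decompress(&frame, 1024, 100));
}
```

Whether decoder construction stays inside the loop is a separate choice, as the comment notes; only the buffer allocation is clearly overhead.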
ℹ️ Review info: Configuration .coderabbit.yaml | Review profile: ASSERTIVE | Plan: Pro | Run ID: afb75d4e-c9e5-423b-9ea6-2d51baa2a266

📒 Files selected for processing (7): .github/scripts/run-benchmarks.sh, .gitignore, BENCHMARKS.md, README.md, scripts/bench-flamegraph.sh, zstd/benches/compare_ffi.rs, zstd/benches/support/mod.rs
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/benches/compare_ffi.rs`:
- Around line 62-77: The c_ffi benchmark allocates a new Vec each iteration via
zstd::decode_all, creating allocation asymmetry versus the pre-allocated buffer
used by FrameDecoder in the pure_rust bench; replace the c_ffi bench body to
decode into a pre-allocated buffer (mirror the pure_rust pattern) by creating a
target Vec<u8> of expected_len before b.iter and using zstd's streaming API
(e.g., zstd::stream::Decoder) with a Cursor<&mut [u8]> or equivalent
write-into-slice approach to reuse the buffer, or alternatively add a comment
clarifying that decode_all intentionally measures allocation+decode if you
prefer realistic usage; change the closure around group.bench_function("c_ffi",
...) and reference zstd::decode_all, zstd::stream::Decoder, and Cursor to
implement the chosen fix.
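The symmetric decode-into-a-reused-buffer pattern looks roughly like this; `Cursor` stands in for a `Read`-based decoder such as `zstd::stream::Decoder` (both implement `std::io::Read`), so only the buffer-reuse shape is illustrated, not real zstd decoding:

```rust
use std::io::{Cursor, Read};

// Sketch: fill a buffer allocated once outside the loop through any
// Read-based decoder, eliminating the per-iteration Vec that
// zstd::decode_all would otherwise create.
fn decode_with_reader<R: Read>(mut decoder: R, target: &mut [u8]) -> std::io::Result<()> {
    decoder.read_exact(target)
}

fn main() -> std::io::Result<()> {
    let frame = vec![7u8; 4096]; // stand-in for the decoded byte stream
    let mut target = vec![0u8; frame.len()]; // pre-allocated once, reused
    for _ in 0..10 {
        let decoder = Cursor::new(&frame); // per-iteration decoder setup
        decode_with_reader(decoder, &mut target)?;
    }
    println!("{}", target.len());
    Ok(())
}
```

The alternative the comment offers, keeping `decode_all` and documenting that allocation is intentionally measured, is equally valid; the point is that both sides of the comparison should make the same choice.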
ℹ️ Review info: Configuration .coderabbit.yaml | Review profile: ASSERTIVE | Plan: Pro | Run ID: 7da886e3-4fb4-4721-b5b0-00f3061549a4

📒 Files selected for processing (2): scripts/bench-flamegraph.sh, zstd/benches/compare_ffi.rs
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@zstd/benches/compare_ffi.rs`:
- Around line 63-68: The benchmark loop is zeroing the entire output buffer each
iteration (target.fill(0)) which contaminates the measured decode time; remove
the redundant target.fill(0) call inside the b.iter closure so the measured work
is just FrameDecoder::new() + decoder.decode_all(&ffi_compressed, &mut target).
Keep the pre-allocated let mut target = vec![0u8; expected_len]; and preserve
the assert_eq!(written, expected_len) after decoder.decode_all to ensure full
writes.
ℹ️ Review info: Configuration .coderabbit.yaml | Review profile: ASSERTIVE | Plan: Pro | Run ID: 094a9595-6f3b-41ae-99c5-0f1311999c4a

📒 Files selected for processing (1): zstd/benches/compare_ffi.rs
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 91-98: The table header labels the first column "Scenario" but the
loop is inserting row["label"], causing a mismatch and hiding stable scenario
IDs; update the for-loop that iterates over ratios (the block building lines
with f'| {row["label"]} ...') to use row["scenario"] for the first column
instead of row["label"] so the printed table matches the "Scenario" header and
exposes the stable scenario identifier.
ℹ️ Review info: Configuration .coderabbit.yaml | Review profile: ASSERTIVE | Plan: Pro | Run ID: a26751e4-7093-4815-9b51-944f156b7c9e

📒 Files selected for processing (2): .github/scripts/run-benchmarks.sh, BENCHMARKS.md
- include scenario id in ratio report markdown table
- reuse decoders and buffers in decompression benchmark loops
- keep throughput comparison focused on decode work

Refs #24
- compare metadata.len() against max size in u64 space
- derive Silesia scenario ids from full file names
- append stable numeric suffix on id collisions
@coderabbitai re-review

✅ Actions performed: Full review triggered.
@coderabbitai re-review

✅ Actions performed: Full review triggered.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 54-57: The markdown_table_escape function currently only handles
backslashes, pipes and newlines; update markdown_table_escape to trim
leading/trailing whitespace and additionally escape other Markdown-sensitive
characters (at minimum: backticks `, square brackets [, ], asterisks *,
underscores _, angle brackets < and > and percent signs %) before returning
(still convert newlines to <br>); locate and update the markdown_table_escape
function to apply these replacements so benchmark labels render safely in
tables.
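The actual `markdown_table_escape` lives in the shell script, but the replacement ordering the comment implies is easiest to see in a small sketch; this Rust version is illustration only, and the exact character set (the comment's `%`, for instance, needs no Markdown escape) is a judgment call:

```rust
// Illustration of the expanded escaping: trim first, escape backslash
// before inserting new backslashes, handle the other Markdown-sensitive
// characters, and convert newlines last so rows stay on one table line.
fn markdown_table_escape(label: &str) -> String {
    let mut s = label.trim().to_string();
    for (from, to) in [
        ("\\", "\\\\"), // must come first, or later escapes get doubled
        ("|", "\\|"),
        ("`", "\\`"),
        ("[", "\\["),
        ("]", "\\]"),
        ("*", "\\*"),
        ("_", "\\_"),
        ("<", "&lt;"),
        (">", "&gt;"),
    ] {
        s = s.replace(from, to);
    }
    s.replace('\n', "<br>")
}

fn main() {
    println!("{}", markdown_table_escape("  a|b_c  "));
}
```

The backslash-first ordering is the subtle part: doing it later would re-escape the backslashes introduced by the pipe and underscore rules.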
In `@zstd/benches/support/mod.rs`:
- Around line 318-331: The dedupe_scenario_id function currently uses an
unbounded loop that could hang if seen_ids fails to accept new entries; change
it to a bounded retry loop (e.g., set a MAX_SUFFIX constant like 1_000_000) and
iterate suffix from 2 up to MAX_SUFFIX, returning the first candidate inserted
into seen_ids; if no candidate is accepted by MAX_SUFFIX, handle the failure
deterministically (for example panic with a clear message or return a fallback
ID) so the function never loops indefinitely. Ensure you update the logic in
dedupe_scenario_id and reference/use seen_ids.insert(...) inside the bounded
loop.
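A bounded version of `dedupe_scenario_id` along the lines the comment asks for might look like this; the signature and the hyphen separator are assumptions, since the real function in `support/mod.rs` may differ:

```rust
use std::collections::HashSet;

// Bounded retry as suggested: try numeric suffixes up to MAX_SUFFIX,
// then fail deterministically instead of looping forever.
const MAX_SUFFIX: u32 = 1_000_000;

fn dedupe_scenario_id(base: &str, seen_ids: &mut HashSet<String>) -> String {
    if seen_ids.insert(base.to_string()) {
        return base.to_string();
    }
    for suffix in 2..=MAX_SUFFIX {
        let candidate = format!("{base}-{suffix}");
        if seen_ids.insert(candidate.clone()) {
            return candidate;
        }
    }
    // Deterministic failure once the suffix space is exhausted.
    panic!("exhausted scenario id suffixes for '{base}'");
}

fn main() {
    let mut seen = HashSet::new();
    println!("{}", dedupe_scenario_id("silesia-dickens", &mut seen));
    println!("{}", dedupe_scenario_id("silesia-dickens", &mut seen));
}
```

In practice the bound is unreachable for benchmark fixtures; its value is that a logic bug upstream now surfaces as a clear panic rather than a hung benchmark run.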
ℹ️ Review info: Configuration .coderabbit.yaml | Review profile: ASSERTIVE | Plan: Pro | Run ID: b74de2e7-1a6b-4c8b-aae4-3a522dac7cb4

📒 Files selected for processing (8): .github/scripts/run-benchmarks.sh, .gitignore, BENCHMARKS.md, README.md, scripts/bench-flamegraph.sh, zstd/Cargo.toml, zstd/benches/compare_ffi.rs, zstd/benches/support/mod.rs
- expand markdown table escaping for benchmark labels
- bound scenario id suffix search and panic deterministically on exhaustion
Summary

Validation

Notes

- Silesia fixture limits (defaults: STRUCTURED_ZSTD_SILESIA_MAX_FILES=12, STRUCTURED_ZSTD_SILESIA_MAX_FILE_BYTES=67108864) can be overridden via env.

Refs #24
Summary by CodeRabbit
Documentation
New Features
Chores