perf(bench): add rust/ffi delta benchmark artifacts #65
Conversation
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough

Parses Criterion benchmark names into structured fields, normalizes implementation identifiers, indexes ratio and timing data, computes per-(scenario + params) Rust/FFI ratio and speed deltas with 0.99/1.05 parity bands and statuses, and emits the delta artifacts (`benchmark-delta.json`, `benchmark-delta.md`).

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    participant Runner as "Benchmark Runner\n(Criterion logs)"
    participant Script as "run-benchmarks.sh\n(shell + Python)"
    participant RatioProc as "Ratio Parser\n(indexer)"
    participant TimingProc as "Timing Parser\n(indexer)"
    participant DeltaEngine as "Delta Synth\n(align & compute)"
    participant Storage as "Filesystem\n(bench artifacts)"
    Runner->>Script: emit raw benchmark/report lines
    Script->>RatioProc: parse ratio REPORT lines -> ratio_index
    Script->>TimingProc: parse timing/REPORT_MEM -> timing_rows, input_bytes
    RatioProc->>DeltaEngine: provide canonical ratio series
    TimingProc->>DeltaEngine: provide canonical timing/throughput series
    DeltaEngine->>DeltaEngine: normalize impl ids, align rust/ffi, compute deltas & statuses
    DeltaEngine->>Storage: write benchmark-delta.json and benchmark-delta.md
    Script->>Storage: update stderr logs with delta counts and file paths
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 5 passed
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Pull request overview
Adds Rust-vs-FFI delta reporting to the benchmark pipeline so results can be compared per canonical (scenario + params) key with threshold bands and status labels, and documents how to interpret those deltas.
Changes:
- Extend `.github/scripts/run-benchmarks.sh` to normalize benchmark names into canonical keys and compute ratio/speed deltas + status classifications.
- Emit new benchmark delta artifacts: `benchmark-delta.json` and `benchmark-delta.md`.
- Document the new artifacts and delta interpretation rules in `BENCHMARKS.md`.
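As an illustrative sketch of the canonical-key idea (the function name `canonical_key` comes from the review comments, but the exact key layout shown here is an assumption):

```python
def canonical_key(stage, scenario, level, source):
    # Hypothetical key layout: scenario first, then the remaining params.
    parts = [scenario, f"stage={stage}", f"level={level}"]
    if source is not None:
        parts.append(f"source={source}")
    return " + ".join(parts)

# Rust and FFI rows for the same workload collapse onto one key,
# so the two implementations can be aligned for delta computation:
key = canonical_key("compress", "text-1m", "3", None)
assert key == "text-1m + stage=compress + level=3"
```

With both implementations mapped to the same key, the delta engine only has to join two indexes instead of re-parsing benchmark names.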
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `.github/scripts/run-benchmarks.sh` | Parses benchmark names into canonical keys, computes Rust/FFI deltas and statuses, and writes new delta JSON/MD artifacts. |
| `benchmark-delta.json` | New generated delta artifact containing canonical rows with raw + delta metrics. |
| `benchmark-delta.md` | New generated markdown delta report (Ratio + Speed packs). |
| `BENCHMARKS.md` | Documents the new delta artifacts and how to interpret delta thresholds. |
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 294-365: The delta_rows currently appends every parsed key,
producing rows for non-comparable families (e.g., compress-dict) that never have
rust/ffi pairs; change the logic in the loop (the block that builds
speed_series, rust_timing/ffi_timing, ratio_pack and before calling
delta_rows.append) to skip/continue for keys that are not comparable: require
either both rust_timing and ffi_timing to be present (i.e., speed comparison
possible) or the ratio_pack to contain a non-null ratio/delta (e.g.,
ratio_pack["delta"] or ratio_pack["rust_ratio"] or ratio_pack["ffi_ratio"] is
not None); only call delta_rows.append when that comparability check passes
(leave classify_speed_delta, REF_LOW/REF_HIGH and the rest unchanged).
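The comparability check that comment describes could be sketched as follows (variable names are taken from the review text; treat this as a sketch, not the script's actual code):

```python
def is_comparable(rust_timing, ffi_timing, ratio_pack):
    # Speed comparison needs both sides present;
    # ratio comparison needs any non-null ratio metric.
    speed_possible = rust_timing is not None and ffi_timing is not None
    ratio_possible = any(
        ratio_pack.get(field) is not None
        for field in ("delta", "rust_ratio", "ffi_ratio")
    )
    return speed_possible or ratio_possible

empty_ratio = {"delta": None, "rust_ratio": None, "ffi_ratio": None}
# A compress-dict family with no FFI counterpart produces no delta row:
assert not is_comparable({"ns": 120.0}, None, empty_ratio)
# A paired timing row is kept even without ratio data:
assert is_comparable({"ns": 120.0}, {"ns": 130.0}, empty_ratio)
```

The append site would then `continue` whenever `is_comparable(...)` is false, leaving `classify_speed_delta` and the threshold constants untouched.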
- Around line 429-468: The table currently formats tiny ratios with
"{rust_ratio:.4f}" and "{ffi_ratio:.4f}", which collapses distinct very-small
values; update the formatting in the loops that append to delta_lines (the lines
that call delta_lines.append(f"| {key} | {rust_ratio:.4f} |") and the analogous
ffi_ratio append) to use a higher-precision/compact formatter (for example use
general format like "{rust_ratio:.6g}" and "{ffi_ratio:.6g}" or increase fixed
precision to ".6f") so very-small ratios remain distinguishable while keeping
output readable; keep the same loops over delta_rows and still skip None values
and still escape keys with markdown_table_escape.
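The collapse is easy to demonstrate with plain Python formatting (the sample values are illustrative):

```python
a, b = 0.000123456, 0.000098765

# Fixed four-decimal formatting makes the two distinct ratios identical:
assert f"{a:.4f}" == "0.0001"
assert f"{b:.4f}" == "0.0001"

# General format with six significant digits keeps them apart:
assert f"{a:.6g}" == "0.000123456"
assert f"{b:.6g}" == "9.8765e-05"
```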
In `@BENCHMARKS.md`:
- Around line 97-100: The parity band wording is ambiguous; update the
BENCHMARKS.md lines that currently say "0.99..1.05" to explicitly show the
inclusive bounds (e.g., "[0.99, 1.05]" or "0.99 ≤ delta ≤ 1.05") so the
documentation matches the implementation in classify_ratio_delta() and
classify_speed_delta(); change both the Ratio delta and Speed delta descriptions
and any other occurrences to the inclusive notation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 525035db-1734-4666-a16e-f88078990200
📒 Files selected for processing (4)
`.github/scripts/run-benchmarks.sh`, `BENCHMARKS.md`, `benchmark-delta.json`, `benchmark-delta.md`
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'structured-zstd vs C FFI'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.
| Benchmark suite | Current: f1ae198 | Previous: d8260fa | Ratio |
|---|---|---|---|
| `decompress/fastest/high-entropy-1m/c_stream/matrix/c_ffi` | 0.035 ms | 0.028 ms | 1.25 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
@coderabbitai full review
✅ Actions performed: Full review triggered.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 264-275: The ratio_index entries are missing structured metadata
(stage/level/source) so merges produce params.level="unknown"; update the code
that builds ratio_index (the loop over ratios that calls canonical_key and uses
classify_ratio_delta) to copy row["stage"], row["level"], and row["source"] (or
the equivalent keys present in each ratio row) into each ratio_index[key] entry
alongside rust_ratio, ffi_ratio, delta and status so downstream merging keeps
structured params intact; apply the same change to the other ratio-building
block referenced (the block around classify_ratio_delta in the 293-331 area) so
both places preserve the original ratio row metadata.
- Around line 107-119: The current fallback silently drops segments by setting
source=None; instead validate the parts shape and fail fast: after splitting
into parts, check len(parts) and only accept the known shapes (e.g., exactly 3
or 4 parts); if parts has an unexpected length raise an exception (ValueError)
that includes the raw name and parts so the caller can detect bad benchmark-name
shapes; keep the existing assignments for stage, level, scenario, source,
implementation when length is valid and do not silently set source=None for
unknown shapes.
In `@BENCHMARKS.md`:
- Around line 89-100: Add explicit status enum values and their threshold
semantics to BENCHMARKS.md for benchmark-delta.json consumers: document the
exact labels emitted (e.g., for ratio delta: "better", "parity", "worse" mapped
to rust_ratio/ffi_ratio < 0.99, 0.99..=1.05, > 1.05; and for speed delta:
"faster", "parity", "slower" mapped to rust_bytes_per_sec/ffi_bytes_per_sec >
1.05, 0.99..=1.05, < 0.99), mention that these labels are produced by
.github/scripts/run-benchmarks.sh and appear in the `status` field of
`benchmark-delta.json`, and show the mapping concisely so downstream consumers
can parse the enums without reverse-engineering the script.
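Assuming `delta = rust_bytes_per_sec / ffi_bytes_per_sec` and the label strings quoted above, the speed classification could be sketched as:

```python
def classify_speed_delta(delta):
    # Inclusive parity band [0.99, 1.05], per the documented thresholds.
    if delta is None:
        return "insufficient-data"
    if delta > 1.05:
        return "faster"   # Rust moves more bytes per second than FFI
    if delta < 0.99:
        return "slower"
    return "parity"

assert classify_speed_delta(1.10) == "faster"
assert classify_speed_delta(1.05) == "parity"   # boundary is inside the band
assert classify_speed_delta(0.99) == "parity"   # boundary is inside the band
assert classify_speed_delta(0.95) == "slower"
```

Documenting exactly this mapping in `BENCHMARKS.md` lets consumers of `benchmark-delta.json` parse the `status` field without reading the script.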
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 50aaf102-7787-4c05-b19e-df4ac39efa2e
📒 Files selected for processing (3)
`.github/scripts/run-benchmarks.sh`, `.gitignore`, `BENCHMARKS.md`
- Remove threshold/status classification from delta artifacts
- Keep direct same-run Rust vs FFI delta values only
- Update benchmark docs to reflect raw relative comparison
- Reintroduce faster/slower/parity labels from raw same-run deltas
- Keep rust/ffi delta values as source of truth
- Document that no calibration/pre-test coefficients are applied
Actionable comments posted: 1
♻️ Duplicate comments (1)
.github/scripts/run-benchmarks.sh (1)
104-116: ⚠️ Potential issue | 🟠 Major — Fail fast on unsupported benchmark-name shapes.

Line 104 currently coerces any unknown layout into `stage/level/scenario/impl`, which drops middle segments and can collapse distinct benchmarks onto the same canonical key. That silently corrupts `speed_index`/`key_meta` instead of surfacing parser drift.

🛠️ Safer fallback

```diff
-    # Fallback for future benchmark name variants.
-    stage = parts[0] if len(parts) > 0 else "unknown"
-    level = parts[1] if len(parts) > 1 else "unknown"
-    scenario = parts[2] if len(parts) > 2 else name
-    source = None
-    implementation = parts[-1]
-    return {
-        "stage": stage,
-        "level": level,
-        "scenario": scenario,
-        "source": source,
-        "implementation": implementation,
-    }
+    raise ValueError(f"Unsupported benchmark name format: {name} (parts={parts})")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/scripts/run-benchmarks.sh around lines 104 - 116, The parser currently maps arbitrary-length benchmark names into stage/level/scenario/implementation by collapsing or dropping segments (variables: parts, stage, level, scenario, implementation), which can silently conflate distinct benchmarks; change it to validate the shape of parts and fail fast: check that len(parts) is one of the expected shapes (e.g., exactly 4 or another allowed set) and if not, print a clear error including the offending name and exit non‑zero (or raise an exception) instead of returning a coerced dict; update the code paths that call this logic to handle the error where appropriate and keep the returned structure unchanged when the input is valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 104-116: The parser currently maps arbitrary-length benchmark
names into stage/level/scenario/implementation by collapsing or dropping
segments (variables: parts, stage, level, scenario, implementation), which can
silently conflate distinct benchmarks; change it to validate the shape of parts
and fail fast: check that len(parts) is one of the expected shapes (e.g.,
exactly 4 or another allowed set) and if not, print a clear error including the
offending name and exit non‑zero (or raise an exception) instead of returning a
coerced dict; update the code paths that call this logic to handle the error
where appropriate and keep the returned structure unchanged when the input is
valid.
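A sketch of the fail-fast parser this prompt asks for (the accepted segment counts and field order here are illustrative assumptions; the real script's shapes may differ):

```python
def parse_benchmark_name(name):
    # Accept only known shapes; anything else raises instead of
    # collapsing distinct benchmarks onto one canonical key.
    parts = name.split("/")
    if len(parts) == 4:
        stage, level, scenario, implementation = parts
        source = None
    elif len(parts) == 5:
        stage, level, scenario, source, implementation = parts
    else:
        raise ValueError(
            f"Unsupported benchmark name format: {name!r} (parts={parts})"
        )
    return {
        "stage": stage,
        "level": level,
        "scenario": scenario,
        "source": source,
        "implementation": implementation,
    }
```

A caller can then log the offending raw name and exit non-zero, so parser drift shows up in CI rather than as a corrupted `speed_index`.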
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4cc940d0-c043-461c-b9e3-d32b59d9fbd5
📒 Files selected for processing (3)
`.github/scripts/run-benchmarks.sh`, `BENCHMARKS.md`, `README.md`
Actionable comments posted: 1
♻️ Duplicate comments (3)
.github/scripts/run-benchmarks.sh (3)
264-275: ⚠️ Potential issue | 🟠 Major — Preserve structured metadata in `ratio_index` entries.

`ratio_index` only stores ratio metrics but omits `stage`/`level`/`source`. When a compression benchmark has no matching timing row (e.g., a hypothetical compress-only scenario), lines 296-300 fall back to `"unknown"` or string parsing, producing inconsistent structured params in `benchmark-delta.json` (e.g., `params.level = "unknown"` even though the ratio row has a valid level).

🧩 Proposed fix to retain ratio metadata

```diff
 ratio_index = {}
 for row in ratios:
     key = canonical_key("compress", row["scenario"], row["level"], None)
     ratio_delta = None
     if row["ffi_ratio"] > 0.0:
         ratio_delta = row["rust_ratio"] / row["ffi_ratio"]
     ratio_index[key] = {
+        "meta": {
+            "stage": "compress",
+            "scenario": row["scenario"],
+            "level": row["level"],
+            "source": None,
+        },
         "rust_ratio": row["rust_ratio"],
         "ffi_ratio": row["ffi_ratio"],
         "delta": ratio_delta,
         "status": classify_ratio_delta(ratio_delta),
     }
```

Then at lines 296-300, use the ratio metadata as a fallback source:

```diff
 for key in all_keys:
+    ratio_pack = ratio_index.get(
+        key,
+        {
+            "meta": None,
+            "rust_ratio": None,
+            "ffi_ratio": None,
+            "delta": None,
+            "status": "insufficient-data",
+        },
+    )
-    meta = key_meta.get(key)
+    meta = key_meta.get(key) or (ratio_pack.get("meta") if ratio_pack else None)
     stage = meta["stage"] if meta else "compress"
     scenario = meta["scenario"] if meta else key.split(" + ")[0]
     level = meta["level"] if meta else "unknown"
     source = meta["source"] if meta else None
-
-    ratio_pack = ratio_index.get(
-        key,
-        {
-            "rust_ratio": None,
-            "ffi_ratio": None,
-            "delta": None,
-            "status": "insufficient-data",
-        },
-    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/scripts/run-benchmarks.sh around lines 264 - 275, The ratio_index entries only store rust/ffi/delta and lose structured metadata (stage/level/source), causing fallbacks to yield "unknown" params; update the loop that builds ratio_index (the for row in ratios block using canonical_key("compress", ...)) to include the original row's metadata fields (e.g., "stage", "level", "source" or whatever keys exist on row) in each ratio_index value so later code that reads ratio_index can use these fields as reliable fallbacks; keep the existing keys ("rust_ratio","ffi_ratio","delta","status") and add the metadata fields to the same dict so classify_ratio_delta and canonical_key usage remains unchanged.
107-119: ⚠️ Potential issue | 🟠 Major — Fail fast on unsupported benchmark-name shapes.

The fallback silently drops segments between `scenario` and `implementation` by forcing `source=None`, which can collapse distinct benchmarks under one canonical key if a new family misses the three hardcoded patterns. This makes collisions undetectable until `speed_index` overwrites earlier rows or delta tables show unexpected aggregations.

🛠️ Proposed guard to reject unknown shapes

```diff
-    # Fallback for future benchmark name variants.
-    stage = parts[0] if len(parts) > 0 else "unknown"
-    level = parts[1] if len(parts) > 1 else "unknown"
-    scenario = parts[2] if len(parts) > 2 else name
-    source = None
-    implementation = parts[-1]
-    return {
-        "stage": stage,
-        "level": level,
-        "scenario": scenario,
-        "source": source,
-        "implementation": implementation,
-    }
+    raise ValueError(
+        f"Unsupported benchmark name format: {name} (parts={parts}). "
+        f"Add an explicit pattern or update the fallback."
+    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/scripts/run-benchmarks.sh around lines 107 - 119, The parsing currently masks unexpected benchmark-name shapes by setting source = None; instead validate the parts length and fail fast: if len(parts) == 3 set stage=parts[0], level=parts[1], scenario=parts[2], source=None, implementation=parts[-1]; if len(parts) == 4 set source=parts[3]; for any other len(parts) print a clear error mentioning the raw name and exit non-zero so unknown shapes are rejected rather than collapsed (use the existing variables parts, name, stage, level, scenario, source, implementation).
432-551: ⚠️ Potential issue | 🟠 Major — Add 0.99 / 1.05 reference band rows to delta tables.

The delta markdown omits the promised reference threshold visualization. PR objectives require "99%/105% reference lines" in `benchmark-delta.md`, and `BENCHMARKS.md` documents these bands, but the actual tables only show raw deltas and status labels without explicit threshold markers. This forces readers to mentally map each delta against 0.99/1.05 instead of seeing the bands inline.

📊 Suggested approach

After each delta table header, add two constant reference rows.

Ratio delta table:

| Key | Delta | Status |
| --- | ---: | --- |
| **Reference: better threshold** | **0.99** | — |
| **Reference: worse threshold** | **1.05** | — |
| scenario-1 + ... | 0.9823 | rust_better_smaller |

Speed delta table:

| Key | Delta | Status |
| --- | ---: | --- |
| **Reference: slower threshold** | **0.99** | — |
| **Reference: faster threshold** | **1.05** | — |
| scenario-1 + ... | 1.0732 | rust_faster |

This gives readers immediate visual context for each delta without requiring cross-reference to `BENCHMARKS.md`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/scripts/run-benchmarks.sh around lines 432 - 551, The delta tables currently omit the 0.99/1.05 reference rows; update the script that builds delta_lines to inject two constant reference rows immediately after the "### Rust/FFI ratio delta" header block and likewise after the "### Rust/FFI speed delta" header block. Concretely, after the delta_lines.extend that adds "| Key | Delta | Status |" for the ratio table, append two lines like "| **Reference: better threshold** | **0.99** | — |" and "| **Reference: worse threshold** | **1.05** | — |" before the for row in delta_rows loop that reads row["ratio"]["delta_rust_over_ffi"]; do the analogous insertion for the speed delta table (use labels "Reference: slower threshold" / "Reference: faster threshold" with 0.99 and 1.05) before the loop that reads row["speed"]["delta_rust_over_ffi"]. Ensure these lines are added to the delta_lines list so they appear in the generated benchmark-delta.md.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 370-371: The CI currently produces benchmark-delta.json and
benchmark-delta.md in .github/scripts/run-benchmarks.sh but never uploads them;
update the CI job that runs the "Run benchmarks" step in
.github/workflows/ci.yml to upload these two files (benchmark-delta.json and
benchmark-delta.md) using actions/upload-artifact (or alternatively post them as
a PR comment via the GitHub API) so reviewers can access per-scenario deltas;
ensure the upload step runs after the script completes and references the exact
filenames generated by run-benchmarks.sh.
---
Duplicate comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 264-275: The ratio_index entries only store rust/ffi/delta and
lose structured metadata (stage/level/source), causing fallbacks to yield
"unknown" params; update the loop that builds ratio_index (the for row in ratios
block using canonical_key("compress", ...)) to include the original row's
metadata fields (e.g., "stage", "level", "source" or whatever keys exist on row)
in each ratio_index value so later code that reads ratio_index can use these
fields as reliable fallbacks; keep the existing keys
("rust_ratio","ffi_ratio","delta","status") and add the metadata fields to the
same dict so classify_ratio_delta and canonical_key usage remains unchanged.
- Around line 107-119: The parsing currently masks unexpected benchmark-name
shapes by setting source = None; instead validate the parts length and fail
fast: if len(parts) == 3 set stage=parts[0], level=parts[1], scenario=parts[2],
source=None, implementation=parts[-1]; if len(parts) == 4 set source=parts[3];
for any other len(parts) print a clear error mentioning the raw name and exit
non-zero so unknown shapes are rejected rather than collapsed (use the existing
variables parts, name, stage, level, scenario, source, implementation).
- Around line 432-551: The delta tables currently omit the 0.99/1.05 reference
rows; update the script that builds delta_lines to inject two constant reference
rows immediately after the "### Rust/FFI ratio delta" header block and likewise
after the "### Rust/FFI speed delta" header block. Concretely, after the
delta_lines.extend that adds "| Key | Delta | Status |" for the ratio table,
append two lines like "| **Reference: better threshold** | **0.99** | — |" and
"| **Reference: worse threshold** | **1.05** | — |" before the for row in
delta_rows loop that reads row["ratio"]["delta_rust_over_ffi"]; do the analogous
insertion for the speed delta table (use labels "Reference: slower threshold" /
"Reference: faster threshold" with 0.99 and 1.05) before the loop that reads
row["speed"]["delta_rust_over_ffi"]. Ensure these lines are added to the
delta_lines list so they appear in the generated benchmark-delta.md.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f2850401-aeb5-4b18-8087-652cc82d060c
📒 Files selected for processing (2)
`.github/scripts/run-benchmarks.sh`, `BENCHMARKS.md`
- Fail fast on unsupported benchmark name shapes
- Preserve ratio metadata in delta row param fallback
- Use explicit None checks for speed delta math
- Clarify interpretation strings as delta semantics
- Upload benchmark-delta artifacts in CI benchmark job
♻️ Duplicate comments (1)
.github/scripts/run-benchmarks.sh (1)
472-489: ⚠️ Potential issue | 🟠 Major — Render explicit 0.99 / 1.05 reference guides in delta markdown.

The delta tables still show only value/status. The required parity band guide rails are not explicitly rendered in `benchmark-delta.md`, so readers can't see the promised thresholds directly in-table.

🛠️ Proposed patch

```diff
 delta_lines.extend(
     [
         "",
         "### Rust/FFI ratio delta",
         "",
+        f"Reference band: {DELTA_LOW:.2f}–{DELTA_HIGH:.2f} (near parity).",
+        "",
         "| Key | Delta | Status |",
         "| --- | ---: | --- |",
     ]
 )
@@
 delta_lines.extend(
     [
         "",
         "### Rust/FFI speed delta",
         "",
+        f"Reference band: {DELTA_LOW:.2f}–{DELTA_HIGH:.2f} (near parity).",
+        "",
         "| Key | Delta | Status |",
         "| --- | ---: | --- |",
     ]
 )
```

Also applies to: 530-547
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/scripts/run-benchmarks.sh around lines 472 - 489, The delta table generation (delta_lines, iterating delta_rows and using row["ratio"]["delta_rust_over_ffi"]) only outputs value and status; update the table to include the explicit parity reference columns (e.g., show the lower and upper parity thresholds like 0.99 and 1.05) so readers see the band guide rails in the markdown. Modify the header string built into delta_lines to add two new columns (e.g., "ParityLow" and "ParityHigh" or a single "Parity ref") and when appending rows include the literal reference values next to the formatted delta (use the same loop that references row["ratio"]["status"] and row["ratio"]["delta_rust_over_ffi"], or if thresholds are global constants, insert "0.99" and "1.05" accordingly); apply the same change to the other table generation block around the similar code at lines ~530-547.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 472-489: The delta table generation (delta_lines, iterating
delta_rows and using row["ratio"]["delta_rust_over_ffi"]) only outputs value and
status; update the table to include the explicit parity reference columns (e.g.,
show the lower and upper parity thresholds like 0.99 and 1.05) so readers see
the band guide rails in the markdown. Modify the header string built into
delta_lines to add two new columns (e.g., "ParityLow" and "ParityHigh" or a
single "Parity ref") and when appending rows include the literal reference
values next to the formatted delta (use the same loop that references
row["ratio"]["status"] and row["ratio"]["delta_rust_over_ffi"], or if thresholds
are global constants, insert "0.99" and "1.05" accordingly); apply the same
change to the other table generation block around the similar code at lines
~530-547.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5048940f-a560-4bdb-9ff4-7e56d0d5b096
📒 Files selected for processing (2)
`.github/scripts/run-benchmarks.sh`, `.github/workflows/ci.yml`
- Add explicit 0.99–1.05 guide text to ratio delta section
- Add explicit 0.99–1.05 guide text to speed delta section
@coderabbitai Fixed in commit 6ba8518. `benchmark-delta.md` now renders explicit reference guides in both delta sections.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/scripts/run-benchmarks.sh:
- Around line 348-363: The JSON output lacks the threshold/band constants used
to compute status, so include the reference band (DELTA_LOW and DELTA_HIGH) in
the exported payload (e.g., add "delta_low" and "delta_high" or a nested
"reference_band": {"low":..., "high":...}) alongside both the "ratio" and
"speed" objects so consumers can interpret "status" without hardcoding values;
update the code that builds the payload (where ratio_pack is used and where
speed_delta/classify_speed_delta(...) are set) to read the existing
DELTA_LOW/DELTA_HIGH constants and emit them into the benchmark-delta.json
output for both sections.
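A sketch of the suggested payload shape (the field name `reference_band` is one of the spellings the prompt proposes; everything else here is illustrative):

```python
import json

DELTA_LOW = 0.99
DELTA_HIGH = 1.05

def build_delta_payload(rows):
    # Embed the band once at the top level so consumers can interpret
    # every row's "status" without hardcoding the thresholds.
    return {
        "reference_band": {"low": DELTA_LOW, "high": DELTA_HIGH},
        "rows": rows,
    }

payload = build_delta_payload(
    [{"key": "text-1m + level=3", "speed": {"delta": 1.07, "status": "faster"}}]
)
decoded = json.loads(json.dumps(payload))
assert decoded["reference_band"] == {"low": 0.99, "high": 1.05}
```

A downstream consumer can then compare each row's delta against `reference_band` instead of duplicating the 0.99/1.05 constants.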
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 52442509-80e2-47c3-a7a0-c99817e7ce73
📒 Files selected for processing (1)
.github/scripts/run-benchmarks.sh
Summary

- Normalize canonical `(scenario + params)` keys in the benchmark parser and fail fast on unsupported benchmark-name shapes
- Preserve structured params (`stage`/`level`/`source`) when building delta rows and tighten numeric `None` handling in speed calculations
- Emit delta artifacts (`benchmark-delta.json`, `benchmark-delta.md`) and upload them in CI for PR visibility

Validation
Closes #62