
fix: add retry logic, per-chunk progress saving, and gzip support #138

Merged
mike1858 merged 3 commits into main from fix/fix-timeouts on Apr 5, 2026
Conversation

mike1858 (Member) commented Apr 5, 2026

Three interrelated fixes to make uploads robust against transient failures:

  1. Retry logic with exponential backoff (2s, 4s, 8s...):

    • Actually uses the existing retry_attempts config (was defined but unused)
    • Each chunk is retried independently on failure
    • Failed chunks don't lose progress from successful ones
  2. Per-chunk incremental progress saving:

    • Messages are sorted by date before chunking so older messages go first
    • After each successful chunk, last_date_uploaded is saved immediately
    • On the next upload run, only newer messages are re-sent
    • Previously: progress was only saved after ALL chunks completed
  3. Gzip compression via reqwest feature:

    • Enables automatic Accept-Encoding: gzip for API responses
    • Response decompression is automatic with the feature enabled

Also refactored upload_single_chunk into its own function with a ChunkContext struct to keep the signature clean and make the retry loop readable.
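Items 1 and 2 can be sketched together as a per-chunk retry loop that checkpoints after every success. This is a minimal illustration, not the actual src/upload.rs code: `Chunk`, `upload_chunks`, and the callback signatures are hypothetical stand-ins for the real HTTP call and config save.

```rust
use std::time::Duration;

// Hypothetical chunk payload; the real code uploads message stats.
struct Chunk {
    last_message_ms: i64, // timestamp of the newest message in this chunk
}

// Backoff schedule from the description: 2s, 4s, 8s, ...
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(2u64 << attempt)
}

// Upload each chunk independently, retrying up to `retry_attempts` times.
// `upload` and `save_progress` stand in for the HTTP POST and the
// last_date_uploaded config save.
fn upload_chunks(
    chunks: &[Chunk],
    retry_attempts: u32,
    mut upload: impl FnMut(&Chunk) -> Result<(), String>,
    mut save_progress: impl FnMut(i64),
) -> Result<(), String> {
    for chunk in chunks {
        let mut last_err = String::new();
        let mut ok = false;
        for attempt in 0..retry_attempts {
            match upload(chunk) {
                Ok(()) => {
                    // Checkpoint immediately: earlier chunks are never
                    // re-sent even if a later chunk exhausts its retries.
                    save_progress(chunk.last_message_ms + 1);
                    ok = true;
                    break;
                }
                Err(e) => {
                    last_err = e;
                    std::thread::sleep(backoff_delay(attempt));
                }
            }
        }
        if !ok {
            return Err(last_err);
        }
    }
    Ok(())
}
```

Because progress is saved inside the loop rather than after it, a failure on chunk N still leaves chunks 1..N-1 recorded, so the next run resumes from the failed chunk.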

Summary by CodeRabbit

  • New Features

    • New gpt-5.4-mini model available with pricing and alias support
    • Incremental progress tracking for message uploads
  • Bug Fixes

    • Message uploads retry per-chunk on failure with exponential backoff and resume
  • Tests

    • Added and updated tests covering retry behavior and progress updates
  • Chores

    • Expanded HTTP client feature set to include gzip alongside rustls

coderabbitai (Bot) commented Apr 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ac90b23f-8fa2-4396-862c-1df5c196cdc9

📥 Commits

Reviewing files that changed from the base of the PR and between 31f7ddf and 8763457.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • src/upload.rs

📝 Walkthrough

Adds gzip support to reqwest, registers a new canonical model gpt-5.4-mini with pricing and aliases, and refactors upload_message_stats to sort messages, upload in date-ordered chunks with per-chunk retries/exponential backoff, and checkpoint progress per chunk. Tests updated/added for retry behavior and model cost lookup.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Dependency Configuration: Cargo.toml | Enabled the gzip feature on the reqwest dependency in addition to rustls. |
| Model Registry: src/models.rs | Added a canonical gpt-5.4-mini entry with pricing (input_per_1m: 0.75, output_per_1m: 4.5, cached_input_per_1m: 0.075) and alias mappings; updated tests to import calculate_output_cost/get_model_info and assert alias resolution and cost calculations. |
| Upload Logic Refactor: src/upload.rs | upload_message_stats now sorts messages by date, creates date-ordered chunks, and uploads each chunk via a new upload_single_chunk helper with a per-chunk retry loop using exponential backoff; introduces ChunkContext and save_chunk_progress; incremental checkpointing replaces the previous single end-of-run config update. |
| Upload Tests: src/upload/tests.rs | Adjusted three tests to set config.upload.retry_attempts=1; added a test that simulates a 500-then-200 server sequence to verify retry behavior, request count, and that last_date_uploaded is updated after a successful retry. |
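The pricing figures listed above (0.75 / 4.5 / 0.075 USD per 1M tokens) imply a cost calculation along these lines. The function name and signature here are illustrative only; the real src/models.rs API (calculate_output_cost, get_model_info) is not reproduced:

```rust
// Pricing from the walkthrough for gpt-5.4-mini (USD per 1M tokens).
const INPUT_PER_1M: f64 = 0.75;
const OUTPUT_PER_1M: f64 = 4.5;
const CACHED_INPUT_PER_1M: f64 = 0.075;

// Hypothetical helper: cached input tokens are billed at the discounted
// rate, the remainder of input at the full rate.
fn calculate_cost(input: u64, cached_input: u64, output: u64) -> f64 {
    let uncached = input.saturating_sub(cached_input) as f64;
    uncached / 1e6 * INPUT_PER_1M
        + cached_input as f64 / 1e6 * CACHED_INPUT_PER_1M
        + output as f64 / 1e6 * OUTPUT_PER_1M
}
```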

Sequence Diagram(s)

sequenceDiagram
    participant Client as upload_message_stats()
    participant Chunker as chunk_iterator
    participant Retry as retry_loop
    participant Upload as upload_single_chunk()
    participant HTTP as HTTP_Server
    participant Config as config.last_date_uploaded

    Client->>Chunker: sort messages by date and create chunks
    loop for each chunk
        Chunker->>Retry: start retry attempts (max_retries)
        loop while retries_left > 0
            Retry->>Upload: upload chunk
            Upload->>HTTP: POST chunk
            alt success (200)
                HTTP-->>Upload: success
                Upload-->>Retry: return success
                Retry->>Config: save_chunk_progress(last_uploaded_timestamp + 1ms)
                Retry->>Chunker: proceed to next chunk
            else failure (non-200 / error)
                HTTP-->>Upload: error
                Upload-->>Retry: return error
                Retry->>Retry: wait exponential_backoff
                Retry->>Retry: decrement retries_left
            end
        end
        alt all retries exhausted
            Retry->>Config: save_chunk_progress(for completed chunks)
            Retry-->>Client: return final error
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A rabbit hops through chunks and tries, 🐇
When servers grumble, it retries, ↗️
Checkpoint crumbs left neat and small,
New model seeds spring up for all,
Gzip crunches, progress saved—hooray! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes all three main changes in the pull request: retry logic, per-chunk progress saving, and gzip support. It is clear, specific, and directly relates to the primary objectives of the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00%, which is sufficient. The required threshold is 80.00%. |



coderabbitai (Bot) left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/upload.rs`:
- Around line 195-199: The code currently sets the checkpoint using
chrono::Utc::now() via config.set_last_date_uploaded(date) which can skip
messages created during the upload; instead set the checkpoint to the timestamp
of the last successfully uploaded message plus 1ms. Modify the final-save logic
in upload.rs to use the tracked last_uploaded_timestamp (or compute it from the
batch of uploaded messages) and call
config.set_last_date_uploaded(last_uploaded_timestamp + 1) before
config.save(true) rather than using Utc::now(); ensure the variable used matches
the existing identifier that records the last uploaded message timestamp.
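The suggested fix can be sketched as a small pure helper. The name, the millisecond unit, and the surrounding call sites are assumptions drawn from the comment above, not the actual upload.rs code:

```rust
// Derive the checkpoint from the last successfully uploaded message
// rather than from Utc::now(), so messages created while the upload
// was running are not skipped on the next run.
// Timestamps are assumed to be milliseconds since the epoch.
fn next_checkpoint(uploaded_timestamps_ms: &[i64]) -> Option<i64> {
    // +1ms: the last uploaded message is not re-sent, while anything
    // newer (including messages created mid-upload) is picked up later.
    uploaded_timestamps_ms.iter().max().map(|t| t + 1)
}
```

The caller would then pass the result to something like config.set_last_date_uploaded before saving, instead of the wall-clock time.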

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f1a954a0-d9e1-4831-aed1-7807998f7788

📥 Commits

Reviewing files that changed from the base of the PR and between a9f7e9f and 31f7ddf.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • Cargo.toml
  • src/models.rs
  • src/upload.rs
  • src/upload/tests.rs

Comment thread on src/upload.rs (Outdated)
mike1858 added 2 commits April 5, 2026 08:19
The final save after the upload loop was overwriting the correct
message-anchored checkpoint (set by save_chunk_progress) with
Utc::now(), which could skip messages created during the upload
window. save_chunk_progress already persists the checkpoint after
every chunk including the last one, so the final save was both
redundant and harmful.
@mike1858 mike1858 merged commit b663305 into main Apr 5, 2026
6 checks passed
@mike1858 mike1858 deleted the fix/fix-timeouts branch April 5, 2026 14:27
