
fix: add retry logic, per-chunk progress saving, and gzip support #138

Merged
mike1858 merged 3 commits into main from fix/fix-timeouts on Apr 5, 2026
Conversation

mike1858 (Member) commented Apr 5, 2026

Three interrelated fixes to make uploads robust against transient failures:

  1. Retry logic with exponential backoff (2s, 4s, 8s...):

    • Actually uses the existing retry_attempts config (was defined but unused)
    • Each chunk is retried independently on failure
    • Failed chunks don't lose progress from successful ones
  2. Per-chunk incremental progress saving:

    • Messages are sorted by date before chunking so older messages go first
    • After each successful chunk, last_date_uploaded is saved immediately
    • On the next upload run, only newer messages are re-sent
    • Previously: progress was only saved after ALL chunks completed
  3. Gzip compression via reqwest feature:

    • Enables automatic Accept-Encoding: gzip for API responses
    • Response decompression is automatic with the feature enabled

Also refactored upload_single_chunk into its own function with a ChunkContext struct to keep the signature clean and make the retry loop readable.
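Items 1 and 2 can be sketched together as a per-chunk retry loop that checkpoints after every success. This is a minimal illustration, not the actual src/upload.rs code: `Chunk`, `upload_chunks`, and the callback signatures are hypothetical stand-ins for the real HTTP call and config save.

```rust
use std::time::Duration;

// Hypothetical chunk payload; the real code uploads message stats.
struct Chunk {
    last_message_ms: i64, // timestamp of the newest message in this chunk
}

// Backoff schedule from the description: 2s, 4s, 8s, ...
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(2u64 << attempt)
}

// Upload each chunk independently, retrying up to `retry_attempts` times.
// `upload` and `save_progress` stand in for the HTTP POST and the
// last_date_uploaded config save.
fn upload_chunks(
    chunks: &[Chunk],
    retry_attempts: u32,
    mut upload: impl FnMut(&Chunk) -> Result<(), String>,
    mut save_progress: impl FnMut(i64),
) -> Result<(), String> {
    for chunk in chunks {
        let mut last_err = String::new();
        let mut ok = false;
        for attempt in 0..retry_attempts {
            match upload(chunk) {
                Ok(()) => {
                    // Checkpoint immediately: earlier chunks are never
                    // re-sent even if a later chunk exhausts its retries.
                    save_progress(chunk.last_message_ms + 1);
                    ok = true;
                    break;
                }
                Err(e) => {
                    last_err = e;
                    std::thread::sleep(backoff_delay(attempt));
                }
            }
        }
        if !ok {
            return Err(last_err);
        }
    }
    Ok(())
}
```

Because progress is saved inside the loop rather than after it, a failure on chunk N still leaves chunks 1..N-1 recorded, so the next run resumes from the failed chunk.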

Summary by CodeRabbit

  • New Features

    • New gpt-5.4-mini model available with pricing and alias support
    • Incremental progress tracking for message uploads
  • Bug Fixes

    • Message uploads retry per-chunk on failure with exponential backoff and resume
  • Tests

    • Added and updated tests covering retry behavior and progress updates
  • Chores

    • Expanded HTTP client feature set to include gzip alongside rustls

coderabbitai (Bot) commented Apr 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ac90b23f-8fa2-4396-862c-1df5c196cdc9

📥 Commits

Reviewing files that changed from the base of the PR and between 31f7ddf and 8763457.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • src/upload.rs

📝 Walkthrough

Adds gzip support to reqwest, registers a new canonical model gpt-5.4-mini with pricing and aliases, and refactors upload_message_stats to sort messages, upload in date-ordered chunks with per-chunk retries/exponential backoff, and checkpoint progress per chunk. Tests updated/added for retry behavior and model cost lookup.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Dependency Configuration: Cargo.toml | Enabled the gzip feature on the reqwest dependency in addition to rustls. |
| Model Registry: src/models.rs | Added a canonical gpt-5.4-mini entry with pricing (input_per_1m: 0.75, output_per_1m: 4.5, cached_input_per_1m: 0.075) and alias mappings; updated tests to import calculate_output_cost/get_model_info and assert alias resolution and cost calculations. |
| Upload Logic Refactor: src/upload.rs | upload_message_stats now sorts messages by date, creates date-ordered chunks, and uploads each chunk via a new upload_single_chunk helper with a per-chunk retry loop using exponential backoff; introduces ChunkContext and save_chunk_progress; incremental checkpointing replaces the previous single end-of-run config update. |
| Upload Tests: src/upload/tests.rs | Adjusted three tests to set config.upload.retry_attempts=1; added a test that simulates a 500-then-200 server sequence to verify retry behavior, request count, and that last_date_uploaded is updated after a successful retry. |
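The pricing figures listed above (0.75 / 4.5 / 0.075 USD per 1M tokens) imply a cost calculation along these lines. The function name and signature here are illustrative only; the real src/models.rs API (calculate_output_cost, get_model_info) is not reproduced:

```rust
// Pricing from the walkthrough for gpt-5.4-mini (USD per 1M tokens).
const INPUT_PER_1M: f64 = 0.75;
const OUTPUT_PER_1M: f64 = 4.5;
const CACHED_INPUT_PER_1M: f64 = 0.075;

// Hypothetical helper: cached input tokens are billed at the discounted
// rate, the remainder of input at the full rate.
fn calculate_cost(input: u64, cached_input: u64, output: u64) -> f64 {
    let uncached = input.saturating_sub(cached_input) as f64;
    uncached / 1e6 * INPUT_PER_1M
        + cached_input as f64 / 1e6 * CACHED_INPUT_PER_1M
        + output as f64 / 1e6 * OUTPUT_PER_1M
}
```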

Sequence Diagram(s)

sequenceDiagram
    participant Client as upload_message_stats()
    participant Chunker as chunk_iterator
    participant Retry as retry_loop
    participant Upload as upload_single_chunk()
    participant HTTP as HTTP_Server
    participant Config as config.last_date_uploaded

    Client->>Chunker: sort messages by date and create chunks
    loop for each chunk
        Chunker->>Retry: start retry attempts (max_retries)
        loop while retries_left > 0
            Retry->>Upload: upload chunk
            Upload->>HTTP: POST chunk
            alt success (200)
                HTTP-->>Upload: success
                Upload-->>Retry: return success
                Retry->>Config: save_chunk_progress(last_uploaded_timestamp + 1ms)
                Retry->>Chunker: proceed to next chunk
            else failure (non-200 / error)
                HTTP-->>Upload: error
                Upload-->>Retry: return error
                Retry->>Retry: wait exponential_backoff
                Retry->>Retry: decrement retries_left
            end
        end
        alt all retries exhausted
            Retry->>Config: save_chunk_progress(for completed chunks)
            Retry-->>Client: return final error
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A rabbit hops through chunks and tries, 🐇
When servers grumble, it retries, ↗️
Checkpoint crumbs left neat and small,
New model seeds spring up for all,
Gzip crunches, progress saved—hooray! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes all three main changes in the pull request: retry logic, per-chunk progress saving, and gzip support. It is clear, specific, and directly relates to the primary objectives of the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00%, which is sufficient. The required threshold is 80.00%. |



coderabbitai (Bot) left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/upload.rs`:
- Around line 195-199: The code currently sets the checkpoint using
chrono::Utc::now() via config.set_last_date_uploaded(date) which can skip
messages created during the upload; instead set the checkpoint to the timestamp
of the last successfully uploaded message plus 1ms. Modify the final-save logic
in upload.rs to use the tracked last_uploaded_timestamp (or compute it from the
batch of uploaded messages) and call
config.set_last_date_uploaded(last_uploaded_timestamp + 1) before
config.save(true) rather than using Utc::now(); ensure the variable used matches
the existing identifier that records the last uploaded message timestamp.
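The suggested fix can be sketched as a small pure helper. The name, the millisecond unit, and the surrounding call sites are assumptions drawn from the comment above, not the actual upload.rs code:

```rust
// Derive the checkpoint from the last successfully uploaded message
// rather than from Utc::now(), so messages created while the upload
// was running are not skipped on the next run.
// Timestamps are assumed to be milliseconds since the epoch.
fn next_checkpoint(uploaded_timestamps_ms: &[i64]) -> Option<i64> {
    // +1ms: the last uploaded message is not re-sent, while anything
    // newer (including messages created mid-upload) is picked up later.
    uploaded_timestamps_ms.iter().max().map(|t| t + 1)
}
```

The caller would then pass the result to something like config.set_last_date_uploaded before saving, instead of the wall-clock time.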

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f1a954a0-d9e1-4831-aed1-7807998f7788

📥 Commits

Reviewing files that changed from the base of the PR and between a9f7e9f and 31f7ddf.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • Cargo.toml
  • src/models.rs
  • src/upload.rs
  • src/upload/tests.rs

Comment thread on src/upload.rs (Outdated)
mike1858 added 2 commits April 5, 2026 08:19
The final save after the upload loop was overwriting the correct
message-anchored checkpoint (set by save_chunk_progress) with
Utc::now(), which could skip messages created during the upload
window. save_chunk_progress already persists the checkpoint after
every chunk including the last one, so the final save was both
redundant and harmful.
@mike1858 mike1858 merged commit b663305 into main Apr 5, 2026
6 checks passed
@mike1858 mike1858 deleted the fix/fix-timeouts branch April 5, 2026 14:27
