fix(codex): Treat Codex capacity errors as retryable #2346

Merged
luispater merged 1 commit into router-for-me:dev from pjpjq:codex/fix-codex-capacity-retry on Mar 28, 2026

Conversation

@pjpjq
Contributor

@pjpjq pjpjq commented Mar 25, 2026

Summary

  • map Codex "Selected model is at capacity" failures into retryable 429 status errors
  • keep existing reset-based retry-after handling for real usage limit responses
  • add a regression test covering capacity errors returned as 400s

Testing

  • go test ./internal/runtime/executor -run 'TestParseCodexRetryAfter|TestNewCodexStatusErrTreatsCapacityAsRetryableRateLimit' -count=1
  • go test ./internal/runtime/executor -count=1

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the robustness of Codex API interactions by intelligently handling capacity-related errors. Previously, these errors might not have been automatically retried, leading to failed operations. By reclassifying "model is at capacity" messages as retryable 429 status codes, the system can now automatically reattempt requests, improving reliability and user experience when the Codex model experiences temporary load.

Highlights

  • Reclassified Codex capacity errors: The system now treats specific "model is at capacity" messages from the Codex API as http.StatusTooManyRequests (429), making them retryable.
  • Preserved existing retry logic: The existing Retry-After header parsing for actual usage limit responses remains unchanged, ensuring correct handling of explicit rate limits.
  • Added regression test: A new test ensures that capacity errors, even when initially returned with a 400 status, are correctly identified and remapped to a 429 for retry purposes.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces logic to identify Codex model capacity errors and treat them as retryable http.StatusTooManyRequests (429) errors, even if the original status code was different. A new function isCodexModelCapacityError was added to detect these specific error messages, and newCodexStatusErr was updated to use this. A new test case was also added to verify this behavior. The review suggests simplifying the strings.Contains checks in isCodexModelCapacityError for better robustness and conciseness.

Comment on lines +713 to +716
		if strings.Contains(lower, "selected model is at capacity") ||
			strings.Contains(lower, "model is at capacity. please try a different model") {
			return true
		}
medium

The two strings.Contains checks for capacity errors can be simplified. Both of the error messages you're checking for contain the substring "model is at capacity". You can combine them into a single, more robust check. This will make the code simpler and more resilient to small variations in the error message from the API.

		if strings.Contains(lower, "model is at capacity") {
			return true
		}
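
To illustrate the suggestion, here is a small sketch showing that the single substring covers both phrases from the original check. The helper name `matchesCapacity` and the sample messages are hypothetical; the real API wording may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesCapacity is a hypothetical helper applying the single substring
// check suggested in the review comment.
func matchesCapacity(msg string) bool {
	return strings.Contains(strings.ToLower(msg), "model is at capacity")
}

func main() {
	// Sample messages are assumptions derived from the two substrings in the diff.
	samples := []string{
		"Selected model is at capacity",
		"Model is at capacity. Please try a different model",
		"Invalid request",
	}
	for _, s := range samples {
		fmt.Printf("%q -> %v\n", s, matchesCapacity(s))
	}
}
```

The trade-off is that a shorter substring also matches more unrelated text, which is the false-positive concern a status guard or negative test would address.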

@pjpjq pjpjq changed the title from "Treat Codex capacity errors as retryable" to "fix(codex): Treat Codex capacity errors as retryable" on Mar 25, 2026
Collaborator

@luispater luispater left a comment

Summary:
This PR is focused and aligned with its stated intent: treat Codex “model at capacity” failures as retryable by mapping them to 429, while preserving existing retry-after parsing for true usage-limit responses.

Key findings:

  • Blocking: none.
  • Non-blocking: capacity detection is text-based and status-agnostic; adding a status guard (e.g., only remap on 400/429) would reduce false-positive risk.

Test plan:

  • Verified the PR adds a regression test:
    • TestNewCodexStatusErrTreatsCapacityAsRetryableRateLimit
  • Existing retry-after behavior remains covered by TestParseCodexRetryAfter.

Follow-ups:

  • Optional hardening: add a negative test ensuring non-capacity errors containing similar text are not remapped unintentionally.

This is an automated Codex review result and still requires manual verification by a human reviewer.

@luispater luispater changed the base branch from main to dev March 28, 2026 13:00
@luispater luispater merged commit 2741e7b into router-for-me:dev Mar 28, 2026
2 checks passed