feat(provider): automatic model fallback on transient errors #20105

Open
ESRE-dev wants to merge 1 commit into anomalyco:dev from ESRE-dev:pr/provider-fallback

@ESRE-dev ESRE-dev commented Mar 30, 2026

Issue for this PR

Closes #20100

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Adds configurable provider/model fallback to improve resilience when the primary provider is temporarily degraded.

This PR introduces:

  1. Provider fallback module

    • Adds fallback resolution and middleware that retries once on a configured fallback model when the error is transient.

  2. LLM pipeline integration

    • Integrates fallback resolution into the session LLM stream path.
    • If a fallback target is configured and available, the fallback middleware is applied during model invocation.

  3. Transient error classification improvements

    • Improves how provider errors are classified when deciding whether to fall back.
    • Includes Copilot-specific handling for transient 403 responses and clearer re-authentication guidance.

  4. Config support

    • Adds a fallback mapping (primary -> fallback) to the config so users control failover behavior.
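
For readers skimming the thread, the resolve-then-retry-once flow described above could look roughly like the sketch below. This is an illustration only, not the code in this PR: `ModelRef`, `FallbackConfig`, `resolveFallback`, and `withFallback` are hypothetical names, and the `"providerID/modelID"` key format is an assumption based on the mapping description.

```typescript
// Hypothetical shapes; the PR's actual config schema and module API are not shown in this thread.
type ModelRef = { providerID: string; modelID: string }

// Assumed mapping: "providerID/modelID" of the primary -> "providerID/modelID" of the fallback.
type FallbackConfig = Record<string, string>

// Split at the first slash so model IDs that themselves contain "/" stay intact.
function parseRef(key: string): ModelRef | undefined {
  const slash = key.indexOf("/")
  if (slash <= 0 || slash === key.length - 1) return undefined
  return { providerID: key.slice(0, slash), modelID: key.slice(slash + 1) }
}

// Return the configured fallback for a primary model, but only if it is currently available.
function resolveFallback(
  config: FallbackConfig,
  primary: ModelRef,
  isAvailable: (ref: ModelRef) => boolean,
): ModelRef | undefined {
  const target = config[`${primary.providerID}/${primary.modelID}`]
  if (!target) return undefined
  const ref = parseRef(target)
  return ref && isAvailable(ref) ? ref : undefined
}

// Invoke the primary; on a transient error, retry exactly once on the fallback.
// Non-transient errors, and errors from the fallback itself, propagate unchanged.
async function withFallback<T>(
  primary: ModelRef,
  fallback: ModelRef | undefined,
  invoke: (ref: ModelRef) => Promise<T>,
  isTransient: (err: unknown) => boolean,
): Promise<T> {
  try {
    return await invoke(primary)
  } catch (err) {
    if (!fallback || !isTransient(err)) throw err
    return invoke(fallback)
  }
}
```

In this sketch the availability check happens at resolution time, so a missing or misconfigured fallback leaves the original error path untouched.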

This is broader than #19394 (provider-specific retryability) and provides general cross-provider fallback behavior.

How did you verify your code works?

  • Added and ran packages/opencode/test/provider/fallback.test.ts.
  • Ran targeted session/provider tests to confirm fallback activation and non-fallback behavior.
  • Verified transient vs non-transient classification behavior in provider error handling.

Screenshots / recordings

Not a UI change.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

If you do not follow this template your PR will be automatically rejected.

@github-actions github-actions bot added and then removed the needs:compliance label ("This means the issue will auto-close after 2 hours.") Mar 30, 2026
@github-actions
Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

@github-actions
Contributor

The following comment was made by an LLM; it may be inaccurate:

Based on the search results, I found some potentially related PRs that deal with provider fallback and error handling:

Potentially Related PRs:

  1. #11739 - "feat: add runtime model fallback on retry exhaustion"

  2. #8669 - "fix(opencode): correct model fallback index tracking and config parsing"

  3. #20004 - "feat(opencode): OpenRouter model discovery/pruning at runtime"

These PRs have overlapping concerns with fallback behavior and provider error handling, though they may be from different time periods or addressing different aspects of the feature.

- Add ProviderFallback module for configurable model fallback
- Integrate fallback middleware into LLM stream pipeline
- shouldFallback() handles 429, 500, 502, 503, transient 403
- Copilot-specific: reauth guidance for 403, transient retryability
- Config: fallback mapping (providerID/modelID → fallback target)
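
As a rough illustration of the classification listed in the commit message, the status check might be sketched as follows. The `ProviderError` shape and the `transient403` flag are assumptions made for this example; the thread does not say how the PR decides that a 403 is transient, so the sketch leaves that decision to the caller.

```typescript
// Hypothetical error shape; the real provider error type is not shown in this thread.
interface ProviderError {
  status?: number
  // Assumption: an upstream, provider-specific check has already marked this 403 as transient.
  transient403?: boolean
}

// Status codes the commit message lists as fallback-eligible.
const TRANSIENT_STATUSES = new Set<number>([429, 500, 502, 503])

function shouldFallback(err: ProviderError): boolean {
  if (err.status === undefined) return false
  if (TRANSIENT_STATUSES.has(err.status)) return true
  // A 403 normally means an auth or permission problem, so it only qualifies when flagged.
  return err.status === 403 && err.transient403 === true
}
```

Treating an unflagged 403 as non-transient keeps genuine authorization failures on the re-authentication path instead of silently switching models.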
@ESRE-dev ESRE-dev force-pushed the pr/provider-fallback branch from a565674 to 46fe331 Compare April 1, 2026 17:50

Development

Successfully merging this pull request may close these issues.

[FEATURE]: Configurable provider/model fallback on transient errors