
feat: replay 9 upstream features from closed-not-merged PRs #693

Merged
KooshaPari merged 12 commits into main from upstream-feature-replay
Feb 27, 2026

Owner

@KooshaPari KooshaPari commented Feb 27, 2026

Summary

Replays 9 upstream features from closed-not-merged PRs that were never integrated into main:

Not included (needs further work)

Already in main (verified)

Test plan

  • Verify Go build passes
  • Run existing test suite
  • Verify import paths are all kooshapari/cliproxyapi-plusplus
  • Test sticky-round-robin with X-Session-Key header
  • Test antigravity model fallback when fetch fails

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features

  • Added sticky round-robin routing strategy for session-aware load balancing
  • Enabled automatic model backfilling across Antigravity providers
  • Session key header propagation support

Improvements

  • Enhanced tool parameter compatibility across Gemini, Claude, and OpenAI formats
  • Integrated web search annotations into translations for all target formats
  • Implemented request retry logic and diagnostic logging for upstream failures

Reliability

  • Added in-memory caching for Antigravity models with fallback mechanisms
  • Reduced refresh check interval to 5s with concurrent rate limiting (16 parallel)
  • Implemented alternative authentication fallback for transient errors


coderabbitai bot commented Feb 27, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04a260f and 02fd192.

📒 Files selected for processing (25)
  • .github/workflows/release.yaml
  • internal/api/handlers/management/config_basic.go
  • internal/config/config.go
  • internal/runtime/executor/antigravity_executor.go
  • internal/runtime/executor/antigravity_executor_models_cache_test.go
  • internal/runtime/executor/iflow_executor.go
  • internal/runtime/executor/iflow_executor_test.go
  • internal/translator/antigravity/gemini/antigravity_gemini_request.go
  • internal/translator/antigravity/gemini/antigravity_gemini_request_test.go
  • internal/translator/gemini/openai/responses/gemini_openai-responses_request.go
  • internal/translator/openai/claude/openai_claude_response.go
  • internal/translator/openai/claude/openai_claude_response_test.go
  • internal/translator/openai/gemini/openai_gemini_response.go
  • internal/translator/openai/openai/responses/openai_openai-responses_request.go
  • internal/translator/openai/openai/responses/openai_openai-responses_response.go
  • sdk/api/handlers/handlers.go
  • sdk/cliproxy/auth/conductor.go
  • sdk/cliproxy/auth/conductor_overrides_test.go
  • sdk/cliproxy/auth/selector.go
  • sdk/cliproxy/auth/selector_test.go
  • sdk/cliproxy/builder.go
  • sdk/cliproxy/service.go
  • sdk/cliproxy/service_antigravity_backfill_test.go
  • test/builtin_tools_translation_test.go
  • test/openai_websearch_translation_test.go

📝 Walkthrough


This PR introduces a sticky round-robin routing strategy with session-based stickiness, implements in-memory caching for Antigravity models with fallback behavior, enhances the iFlow executor with retry logic and request signing, adds auth fallback mechanisms with concurrency controls, and expands the translation pipelines to propagate web-search annotations and citations across Claude, Gemini, and OpenAI formats.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Build & Release | .github/workflows/release.yaml | Adds new GitHub Actions job to build Termux ARM64 binaries via Docker, computing version/commit/build-date metadata and creating archived artifacts for release distribution. |
| Routing Configuration | internal/config/config.go, internal/api/handlers/management/config_basic.go, sdk/cliproxy/builder.go, sdk/cliproxy/service.go | Extends routing strategy support to recognize "sticky-round-robin" (and aliases "stickyroundrobin", "srr") across configuration, handler normalization, and service/builder instantiation layers. |
| Sticky Round-Robin Selector | sdk/cliproxy/auth/selector.go, sdk/cliproxy/auth/selector_test.go | Introduces new StickyRoundRobinSelector with session-key based routing, sticky session-to-auth mappings with memory eviction, and improved error handling for unavailable auth scenarios; adds test validating service unavailable responses. |
| Auth Conductor & Fallback | sdk/cliproxy/auth/conductor.go, sdk/cliproxy/auth/conductor_overrides_test.go | Reduces refresh check interval from 30s to 5s, adds concurrency-limited refresh via semaphore (max 16 concurrent), introduces alternative auth detection for transient errors (408/500/502/503/504), standardizes "no auth available" error responses, and adds tests validating cooldown behavior with/without alternatives. |
| Antigravity Model Caching | internal/runtime/executor/antigravity_executor.go, internal/runtime/executor/antigravity_executor_models_cache_test.go | Implements in-memory cache for antigravity models with thread-safe load/store accessors, deep cloning on retrieval, fallback on all error paths, and comprehensive tests validating cache semantics and immutability. |
| IFlow Executor Refactoring | internal/runtime/executor/iflow_executor.go, internal/runtime/executor/iflow_executor_test.go | Major refactoring centralizing HTTP execution, adding idempotency/conversation metadata keys, implementing 406 retry with optional request signing, capturing request diagnostics (session/conversation IDs), parsing business-status errors, synthesizing SSE chunks from non-streaming responses, and logging rejected request diagnostics; includes extensive test coverage for retry, diagnostics, streaming fallback, and error propagation. |
| Gemini API Compatibility | internal/translator/antigravity/gemini/antigravity_gemini_request.go, internal/translator/antigravity/gemini/antigravity_gemini_request_test.go | Handles function_declarations by renaming parametersJsonSchema to parameters when parameters is absent, preserving API compatibility; adds tests validating both parameters and parametersJsonSchema paths. |
| Gemini-OpenAI Response Handling | internal/translator/gemini/openai/responses/gemini_openai-responses_request.go | Adds control-character detection helper and conditional JSON embedding for function call outputs to prevent sjson.SetRaw corruption when outputs contain literal control characters. |
| Claude Response Translation | internal/translator/openai/claude/openai_claude_response.go, internal/translator/openai/claude/openai_claude_response_test.go | Introduces canonical tool name mapping and annotation accumulation, adds ToolCallAccumulator.Started tracking to emit content_block_start only once per tool call, normalizes tool names in both streaming/non-streaming paths, propagates web-search annotations as citations; includes tests validating tool name canonicalization and content-block emission. |
| Gemini Response Translation | internal/translator/openai/gemini/openai_gemini_response.go | Adds annotation accumulation from OpenAI responses and attaches as groundingMetadata with citations array in both streaming and non-streaming paths. |
| OpenAI Request/Response Translation | internal/translator/openai/openai/responses/openai_openai-responses_request.go, internal/translator/openai/openai/responses/openai_openai-responses_response.go | Changes tool pass-through behavior to preserve builtin tools (previously filtered); accumulates web-search annotations into AnnotationsRaw and propagates to response content parts, output items, and text messages. |
| HTTP Request Metadata | sdk/api/handlers/handlers.go | Forwards X-Session-Key header into execution metadata under "session_key" key for request tracing and session-based routing. |
| Antigravity Model Backfilling | sdk/cliproxy/service.go, sdk/cliproxy/service_antigravity_backfill_test.go | Adds backfill mechanism to propagate primary models from source antigravity auth to other clients lacking models, applying auth_kind derivation, model exclusions, and aliases; includes tests validating model registration and exclusion handling. |
| Translation & Tool Tests | test/builtin_tools_translation_test.go, test/openai_websearch_translation_test.go | Renames and updates tool translation tests; adds comprehensive web-search translation suite covering OpenAI↔Claude/Gemini/Responses formats with annotation/citation/groundingMetadata propagation in both streaming and non-streaming paths. |
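
To make the Routing Configuration row concrete, here is a minimal sketch of how the listed aliases could be folded onto the canonical strategy name. The function name normalizeStickyStrategy is hypothetical, and the trimming and lowercasing are assumptions; the PR's actual normalization code may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeStickyStrategy maps the aliases named in the walkthrough onto the
// canonical "sticky-round-robin" value. Trimming whitespace and lowercasing
// the input are assumptions of this sketch, not confirmed PR behavior.
func normalizeStickyStrategy(raw string) (string, bool) {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "sticky-round-robin", "stickyroundrobin", "srr":
		return "sticky-round-robin", true
	default:
		return "", false
	}
}

func main() {
	for _, raw := range []string{"srr", "StickyRoundRobin", "something-else"} {
		name, ok := normalizeStickyStrategy(raw)
		fmt.Printf("%q -> %q (recognized: %v)\n", raw, name, ok)
	}
}
```

Unrecognized values return false so that a caller can fall through to whatever strategies it already supports.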

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐰 A sticky hop through session lanes,
Where models cache and auth remains,
Retry with grace, when signatures sign,
Web-search citations shine and align,
Round-robin routes no longer plain! 🌟


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: the Go language version (go1.25) used to build golangci-lint is lower than the targeted Go version (1.26.0)
The command is terminated due to an error: can't load config: the Go language version (go1.25) used to build golangci-lint is lower than the targeted Go version (1.26.0)



KooshaPari and others added 10 commits February 27, 2026 04:20
…rs in function output

Cherry-pick of upstream PR router-for-me#1672. Adds containsLiteralControlChars guard
to prevent sjson.SetRaw from corrupting the JSON tree when function outputs
contain literal control characters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
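
The guard described in this commit can be sketched roughly as follows. Only the name containsLiteralControlChars comes from the commit message; its body and the embedOutput helper are assumptions. The sketch treats every byte below 0x20 as a control character (including newlines in pretty-printed JSON), which may be broader than the real check, and it uses encoding/json for escaping rather than sjson.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// containsLiteralControlChars reports whether s holds any raw byte below 0x20.
// Hypothetical re-implementation; the real helper may be narrower.
func containsLiteralControlChars(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] < 0x20 {
			return true
		}
	}
	return false
}

// embedOutput returns a fragment that is safe to splice into a JSON document:
// clean, valid JSON is passed through raw, anything else is escaped as a
// JSON string.
func embedOutput(output string) string {
	if json.Valid([]byte(output)) && !containsLiteralControlChars(output) {
		return output
	}
	quoted, _ := json.Marshal(output) // marshaling a string does not fail
	return string(quoted)
}

func main() {
	fmt.Println(embedOutput(`{"ok":true}`))
	fmt.Println(embedOutput("line one\nline two"))
}
```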
Cherry-pick of upstream PR router-for-me#1686. Reduces refresh check interval to 5s
and adds refreshMaxConcurrency=16 constant (semaphore already in main).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
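
For readers unfamiliar with the pattern, a buffered channel is one common way to cap concurrent refreshes. The constant names below follow the commit message; refreshAll and its peak counter are hypothetical and exist only to make the limit observable. This is not the conductor's actual code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

const (
	refreshCheckInterval  = 5 * time.Second // value stated in the commit message
	refreshMaxConcurrency = 16              // value stated in the commit message
)

// refreshAll calls refresh once per id with at most limit calls in flight,
// and returns the highest concurrency it observed.
func refreshAll(ids []string, limit int, refresh func(id string)) int64 {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var inFlight, peak int64
	for _, id := range ids {
		wg.Add(1)
		sem <- struct{}{} // blocks once limit goroutines hold a slot
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot after the counter drops
			n := atomic.AddInt64(&inFlight, 1)
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			refresh(id)
			atomic.AddInt64(&inFlight, -1)
		}(id)
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	ids := make([]string, 100)
	peak := refreshAll(ids, refreshMaxConcurrency, func(string) {
		time.Sleep(time.Millisecond) // stand-in for a token refresh
	})
	fmt.Printf("check every %s, peak concurrency %d (limit %d)\n",
		refreshCheckInterval, peak, refreshMaxConcurrency)
}
```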
Cherry-pick of upstream PR router-for-me#1648. Renames parametersJsonSchema to
parameters for Gemini API compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
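
As an illustration of the rename, here is a minimal sketch operating on a decoded function declaration. The helper name normalizeDeclaration is hypothetical, and leaving both keys untouched when parameters already exists is an assumption drawn from the walkthrough's "when parameters is absent" wording.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeDeclaration renames parametersJsonSchema to parameters, but only
// when parameters is absent. If parameters is already set, nothing changes.
func normalizeDeclaration(decl map[string]any) {
	if _, ok := decl["parameters"]; ok {
		return
	}
	if schema, ok := decl["parametersJsonSchema"]; ok {
		decl["parameters"] = schema
		delete(decl, "parametersJsonSchema")
	}
}

func main() {
	raw := `{"name":"get_weather","parametersJsonSchema":{"type":"object"}}`
	var decl map[string]any
	if err := json.Unmarshal([]byte(raw), &decl); err != nil {
		panic(err)
	}
	normalizeDeclaration(decl)
	out, _ := json.Marshal(decl)
	fmt.Println(string(out))
}
```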
Cherry-pick of upstream PR router-for-me#1233. Adds build-termux job that
builds inside a Termux container for aarch64 support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…iders

Cherry-pick of upstream PR router-for-me#1579. Fixes duplicate/empty tool_use blocks
in OpenAI->Claude streaming translation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
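
The "emit the start event only once" idea behind this fix can be sketched with a per-call Started flag, as the walkthrough describes for ToolCallAccumulator. The types and event strings below are simplified stand-ins, not the translator's real structures.

```go
package main

import "fmt"

// toolCall accumulates one streamed tool call. Started records whether the
// start event has already been emitted.
type toolCall struct {
	Name    string
	Args    string
	Started bool
}

// streamState is a hypothetical stand-in for the translator's streaming state.
type streamState struct {
	calls  map[int]*toolCall
	events []string
}

func newStreamState() *streamState {
	return &streamState{calls: map[int]*toolCall{}}
}

// onToolDelta merges one streamed fragment and emits content_block_start at
// most once per tool call, and only after a name is known.
func (s *streamState) onToolDelta(index int, name, argsFragment string) {
	call, ok := s.calls[index]
	if !ok {
		call = &toolCall{}
		s.calls[index] = call
	}
	if call.Name == "" {
		call.Name = name
	}
	call.Args += argsFragment
	if !call.Started && call.Name != "" {
		call.Started = true
		s.events = append(s.events, "content_block_start:"+call.Name)
	}
}

func main() {
	s := newStreamState()
	s.onToolDelta(0, "search", `{"q":`)
	s.onToolDelta(0, "search", `"go"}`)
	fmt.Println(s.events, s.calls[0].Args)
}
```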
…ormats

Cherry-pick of upstream PR router-for-me#1539. Adds url_citation/annotation passthrough
from OpenAI web search to Gemini and Claude response formats.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cherry-pick of upstream PR router-for-me#1673. Adds StickyRoundRobinSelector that
routes requests with the same X-Session-Key to consistent auth credentials.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Follow-up for sticky-round-robin (upstream PR router-for-me#1673). Uses partial
eviction (evict half) instead of full map reset for better stickiness.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
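
A rough sketch of session-sticky selection with "evict half" cleanup, under the assumption that the selector keeps a bounded session-to-auth map. All names here (stickySelector, pick, evictHalf, maxKeys) are hypothetical, and eviction order simply follows Go's map iteration order; the real StickyRoundRobinSelector may choose victims differently.

```go
package main

import (
	"fmt"
	"sync"
)

// stickySelector pins a session key to an auth ID and falls back to plain
// round-robin when no session key is given.
type stickySelector struct {
	mu       sync.Mutex
	next     int
	sessions map[string]string // session key -> auth ID
	maxKeys  int
}

func newStickySelector(maxKeys int) *stickySelector {
	return &stickySelector{sessions: map[string]string{}, maxKeys: maxKeys}
}

func contains(list []string, v string) bool {
	for _, item := range list {
		if item == v {
			return true
		}
	}
	return false
}

// pick returns the pinned auth for sessionKey if it is still available,
// otherwise the next auth in rotation (which is then pinned).
func (s *stickySelector) pick(sessionKey string, auths []string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(auths) == 0 {
		return ""
	}
	if sessionKey != "" {
		if id, ok := s.sessions[sessionKey]; ok && contains(auths, id) {
			return id
		}
	}
	id := auths[s.next%len(auths)]
	s.next++
	if sessionKey != "" {
		if len(s.sessions) >= s.maxKeys {
			s.evictHalf()
		}
		s.sessions[sessionKey] = id
	}
	return id
}

// evictHalf drops roughly half of the mappings instead of resetting the map,
// so the surviving sessions keep their stickiness.
func (s *stickySelector) evictHalf() {
	target := len(s.sessions) / 2
	for k := range s.sessions {
		if len(s.sessions) <= target {
			break
		}
		delete(s.sessions, k)
	}
}

func main() {
	s := newStickySelector(1024)
	auths := []string{"auth-a", "auth-b", "auth-c"}
	fmt.Println(s.pick("session-1", auths), s.pick("", auths), s.pick("session-1", auths))
}
```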
Cherry-pick of upstream PR router-for-me#1699. Caches successful model fetches and
falls back to cached list when fetches fail, preventing empty model lists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cherry-pick of upstream PR router-for-me#1699 (part 2). Ensures cached model metadata
is deep-copied to prevent mutation across concurrent requests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
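
The caching behavior from these two commits (store successful fetches, ignore empty lists, hand out deep copies, fall back on failure) might look roughly like the sketch below. The modelInfo type and all function names are simplified stand-ins for illustration, and falling back on an empty result as well as on an error is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errFetchFailed = errors.New("fetch failed")

// modelInfo is a simplified stand-in for the real model metadata type.
type modelInfo struct {
	ID      string
	Aliases []string
}

type modelCache struct {
	mu     sync.RWMutex
	models []modelInfo
}

// cloneModels deep-copies the slice and each nested slice so callers cannot
// mutate cached state.
func cloneModels(in []modelInfo) []modelInfo {
	out := make([]modelInfo, len(in))
	for i, m := range in {
		out[i] = modelInfo{ID: m.ID, Aliases: append([]string(nil), m.Aliases...)}
	}
	return out
}

// store keeps a copy of models; empty lists never overwrite the cache.
func (c *modelCache) store(models []modelInfo) {
	if len(models) == 0 {
		return
	}
	c.mu.Lock()
	c.models = cloneModels(models)
	c.mu.Unlock()
}

// load returns a deep copy of the cached list.
func (c *modelCache) load() []modelInfo {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return cloneModels(c.models)
}

// fetchWithFallback returns fresh models when fetch succeeds and the cached
// list when it fails or comes back empty.
func (c *modelCache) fetchWithFallback(fetch func() ([]modelInfo, error)) []modelInfo {
	models, err := fetch()
	if err != nil || len(models) == 0 {
		return c.load()
	}
	c.store(models)
	return models
}

func main() {
	c := &modelCache{}
	c.fetchWithFallback(func() ([]modelInfo, error) {
		return []modelInfo{{ID: "model-a", Aliases: []string{"a"}}}, nil
	})
	got := c.fetchWithFallback(func() ([]modelInfo, error) { return nil, errFetchFailed })
	fmt.Println(len(got), got[0].ID)
}
```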
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates several previously unmerged upstream features, enhancing the API's robustness, performance, and compatibility across various LLM providers. Key improvements include better handling of JSON data integrity, optimized resource management for auto-refresh, standardized API parameter naming, and more intelligent routing and error recovery mechanisms. These changes collectively improve the stability and feature parity of the system.

Highlights

  • JSON Tree Corruption Prevention: Implemented a fix to prevent JSON tree corruption when literal control characters are present in function output, specifically in OpenAI to Gemini response translation.
  • Auto-Refresh Concurrency Limiting: Introduced a semaphore to limit the concurrency of auto-refresh operations to 16, along with reducing the refresh interval to 5 seconds, to prevent refresh storms.
  • Gemini API Schema Parameter Naming Correction: Corrected the Gemini API schema parameter naming, normalizing 'parametersJsonSchema' to 'parameters' for consistency.
  • Sticky Round-Robin Routing Strategy: Added a new 'sticky-round-robin' routing strategy that routes requests with the same 'X-Session-Key' header to the same backend credential, falling back to standard round-robin if no session key is present.
  • Duplicate/Empty Tool Use Blocks Fix: Addressed issues with duplicate or empty tool_use blocks in OpenAI to Claude streaming translation, ensuring tool calls are emitted correctly and only once.
  • OpenAI Web Search Annotations Passthrough: Enabled the passthrough of OpenAI web search annotations (url_citation) to all supported output formats (Claude, Gemini, and OpenAI Responses).
  • Antigravity Model List Backfill: Implemented caching for the primary Antigravity model list and a mechanism to backfill empty Antigravity authentications with this cached fallback list when fetching models fails.
  • iFlow 406 Retry and Stream Fallback Hardening: Improved the iFlow executor to retry requests without a signature if an initial request returns a 406 Not Acceptable status. Also added fallback logic for streaming requests that receive non-SSE JSON responses.
  • Auth Availability and Cooldown Logic: Refined the authentication cooldown logic such that transient errors (e.g., 500s) will not trigger a cooldown if no alternative authentication is available for the requested model.
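
The cooldown rule in the last highlight can be sketched as a small decision function. The transient status list (408/500/502/503/504) comes from the CodeRabbit walkthrough; the function names and the treatment of every other error status as "cool down" are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// isTransientStatus matches the transient codes listed in the walkthrough.
func isTransientStatus(code int) bool {
	switch code {
	case http.StatusRequestTimeout, // 408
		http.StatusInternalServerError, // 500
		http.StatusBadGateway,          // 502
		http.StatusServiceUnavailable,  // 503
		http.StatusGatewayTimeout:      // 504
		return true
	}
	return false
}

// shouldCooldown skips the cooldown for transient errors when no alternative
// auth can serve the model, so the only usable credential is not benched.
func shouldCooldown(status int, hasAlternative bool) bool {
	if status < 400 {
		return false
	}
	if isTransientStatus(status) && !hasAlternative {
		return false
	}
	return true
}

func main() {
	fmt.Println(shouldCooldown(http.StatusBadGateway, false)) // no alternative
	fmt.Println(shouldCooldown(http.StatusBadGateway, true))  // alternative exists
}
```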
Changelog
  • internal/api/handlers/management/config_basic.go
    • Updated import paths to the new repository location
    • Added 'sticky-round-robin' as a recognized routing strategy
  • internal/config/config.go
    • Updated import paths to the new repository location
    • Updated RoutingConfig documentation to include 'sticky-round-robin' strategy
  • internal/runtime/executor/antigravity_executor.go
    • Updated import paths to the new repository location
    • Introduced antigravityPrimaryModelsCache for caching fetched model lists
    • Added helper functions (cloneAntigravityModels, storeAntigravityPrimaryModels, loadAntigravityPrimaryModels, fallbackAntigravityPrimaryModels) for cache management
    • Modified FetchAntigravityModels to use the cached model list as a fallback on token errors, request creation errors, network errors, or unexpected HTTP statuses
  • internal/runtime/executor/antigravity_executor_models_cache_test.go
    • Added new test file for Antigravity primary models cache functionality
    • Included tests for ensuring empty model lists do not overwrite the cache and that loaded models are clones to prevent mutation
  • internal/runtime/executor/iflow_executor.go
    • Updated import paths to the new repository location
    • Added iflowRequestAttempt struct to snapshot request details for logging
    • Refactored request execution into executeChatCompletionsRequest to handle 406 Not Acceptable retries without signature
    • Introduced iflowRequestIDs to extract session and conversation IDs from metadata or payload
    • Implemented synthesizeOpenAIStreamingChunksFromNonStream to convert non-SSE JSON responses into streaming chunks for fallback
    • Added parseIFlowBusinessStatusError, parseNumericStatus, normalizeIFlowBusinessStatus for robust business error parsing
    • Added parseOpenAIStreamNetworkErrorWithoutContent to detect network errors in streaming responses without content
  • internal/runtime/executor/iflow_executor_test.go
    • Added tests for executeChatCompletionsRequest to verify retry behavior on 406 status codes
    • Included tests for diagnostic logging on 403 Forbidden responses and absence of diagnostics on 500 errors
    • Added tests for streaming behavior, including emitting message_stop and falling back when upstream returns non-SSE JSON
  • internal/translator/antigravity/gemini/antigravity_gemini_request.go
    • Updated import paths to the new repository location
    • Corrected Gemini API schema parameter naming from parametersJsonSchema to parameters within tool declarations
  • internal/translator/antigravity/gemini/antigravity_gemini_request_test.go
    • Added tests to ensure Gemini tool parameters keys are preserved or normalized correctly
  • internal/translator/gemini/openai/responses/gemini_openai-responses_request.go
    • Added containsLiteralControlChars function to check for literal control characters in JSON output
    • Modified function output handling to prevent JSON tree corruption by safely escaping values with literal control characters
  • internal/translator/openai/claude/openai_claude_response.go
    • Updated import paths to the new repository location
    • Added AnnotationsRaw field to ConvertOpenAIResponseToAnthropicParams to accumulate web search annotations
    • Added CanonicalToolNameByLower to ConvertOpenAIResponseToAnthropicParams for tool name canonicalization
    • Modified streaming and non-streaming translation to handle OpenAI web search annotations as Claude citations
    • Improved tool call handling in streaming to emit tool_use start only once and canonicalize tool names
  • internal/translator/openai/claude/openai_claude_response_test.go
    • Added new test file for Claude response translation, specifically for tool name canonicalization and single emission of tool start events
  • internal/translator/openai/gemini/openai_gemini_response.go
    • Added AnnotationsRaw field to ConvertOpenAIResponseToGeminiParams to accumulate web search annotations
    • Modified streaming and non-streaming translation to handle OpenAI web search annotations as Gemini grounding metadata
  • internal/translator/openai/openai/responses/openai_openai-responses_request.go
    • Modified ConvertOpenAIResponsesRequestToOpenAIChatCompletions to pass through built-in tools (e.g., web_search) instead of ignoring them
  • internal/translator/openai/openai/responses/openai_openai-responses_response.go
    • Added AnnotationsRaw field to oaiToResponsesState to accumulate web search annotations
    • Modified streaming and non-streaming translation to populate OpenAI Responses with web search annotations
  • sdk/api/handlers/handlers.go
    • Updated import paths to the new repository location
    • Added logic to forward the X-Session-Key header as metadata for sticky routing
  • sdk/cliproxy/auth/conductor.go
    • Updated import paths to the new repository location
    • Reduced refreshCheckInterval to 5 seconds and added refreshMaxConcurrency (16) to limit concurrent refresh operations
    • Introduced refreshSemaphore to enforce concurrency limits on auth refreshes
    • Modified MarkResult to prevent cooldowns for transient errors if no alternative auth is available for the model
    • Added hasAlternativeAuthForModelLocked to check for available alternative auths
    • Introduced noAuthAvailableError for consistent error reporting when no auth is available
  • sdk/cliproxy/auth/conductor_overrides_test.go
    • Added tests to verify MarkResult behavior regarding cooldowns when alternative auths are present or absent
  • sdk/cliproxy/auth/selector.go
    • Added StickyRoundRobinSelector to provide session-key based routing
    • Updated getAvailableAuths to return a StatusServiceUnavailable error when no auths are available
  • sdk/cliproxy/auth/selector_test.go
    • Added test case for SelectorPick_AllUnavailableReturnsServiceUnavailable
  • sdk/cliproxy/builder.go
    • Added sticky-round-robin as a selectable routing strategy in the service builder
  • sdk/cliproxy/service.go
    • Added sticky-round-robin to the list of normalized routing strategies
    • Implemented backfillAntigravityModels to populate model lists for Antigravity auths using a primary source if their own list is empty
  • sdk/cliproxy/service_antigravity_backfill_test.go
    • Added new test file for Antigravity model backfilling functionality, including tests for registering missing auths and respecting excluded models
  • test/builtin_tools_translation_test.go
    • Updated import paths to the new repository location
    • Modified TestOpenAIResponsesToOpenAI_PassesBuiltinTools to assert that built-in tools are now passed through in OpenAI Chat Completions translation
  • test/openai_websearch_translation_test.go
    • Added new test file to cover translation of OpenAI web search annotations to Claude, Gemini, and OpenAI Responses formats for both streaming and non-streaming scenarios
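
As a small illustration of the sdk/api/handlers/handlers.go entry, the header forwarding might look like the sketch below. The header name and the "session_key" metadata key come from the summaries above; the function name, the trimming, and omitting the key when the header is empty are assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// sessionMetadata copies X-Session-Key into request metadata under
// "session_key". It accepts the header map's underlying type so it can be
// called with an http.Header or a plain map.
func sessionMetadata(headers map[string][]string) map[string]any {
	meta := map[string]any{}
	if key := strings.TrimSpace(http.Header(headers).Get("X-Session-Key")); key != "" {
		meta["session_key"] = key
	}
	return meta
}

func main() {
	h := http.Header{}
	h.Set("X-Session-Key", "conversation-42")
	fmt.Println(sessionMetadata(h))
}
```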
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/release.yaml
Activity
  • The pull request was generated using Claude Code, indicating an automated process for integrating these features.

KooshaPari and others added 2 commits February 27, 2026 04:21
Cherry-pick of upstream PR router-for-me#1650. Improves iflow executor with 406 retry
handling, stream stability fixes, and better auth availability checks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Follow-up for upstream PR router-for-me#1650. Addresses review feedback on iflow
executor body read handling and session ID extraction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
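
The 406 handling described for the iFlow executor (retry once without a signature) can be sketched as follows. The send callback and the function name are hypothetical; the real executeChatCompletionsRequest builds and sends full HTTP requests and does considerably more.

```go
package main

import (
	"fmt"
	"net/http"
)

// executeWithSignatureRetry sends a signed request first and retries exactly
// once unsigned if the upstream answers 406 Not Acceptable. It returns the
// final status and the number of attempts made.
func executeWithSignatureRetry(send func(signed bool) (int, error)) (status, attempts int, err error) {
	status, err = send(true)
	attempts = 1
	if err != nil || status != http.StatusNotAcceptable {
		return status, attempts, err
	}
	status, err = send(false) // second and last attempt, without signature
	attempts = 2
	return status, attempts, err
}

func main() {
	status, attempts, err := executeWithSignatureRetry(func(signed bool) (int, error) {
		if signed {
			return http.StatusNotAcceptable, nil // upstream rejects the signed form
		}
		return http.StatusOK, nil
	})
	fmt.Println(status, attempts, err)
}
```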
@KooshaPari KooshaPari force-pushed the upstream-feature-replay branch from 5741ea1 to 02fd192 Compare February 27, 2026 11:21
@KooshaPari KooshaPari merged commit 8268d68 into main Feb 27, 2026
7 of 10 checks passed

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request integrates a significant number of features from previously unmerged pull requests, enhancing functionality across various components. Key improvements include the introduction of a 'sticky-round-robin' routing strategy, caching for antigravity models to improve resilience, and more robust error handling and retry logic in the iFlow executor. The translation layers have also been updated to handle annotations and improve compatibility. While the changes are substantial and largely beneficial, I've identified a few critical bugs in the openai_claude_response.go translator that will cause compilation errors due to variable re-declarations and type mismatches. These issues need to be addressed to ensure the stability of the new features.

@KooshaPari KooshaPari deleted the upstream-feature-replay branch February 27, 2026 11:40
@KooshaPari KooshaPari restored the upstream-feature-replay branch February 27, 2026 11:56
@KooshaPari KooshaPari deleted the upstream-feature-replay branch February 28, 2026 23:20