migrate #1673: feat: add sticky-round-robin routing strategy #475
KooshaPari wants to merge 2850 commits into main
Merge recover-all branch into main
The convertChatCompletionsStreamChunkToCompletions function was including usage information in all stream chunks, but should drop usage when a chunk has a finish_reason (terminal chunk). Only preserve usage for usage-only chunks (empty choices array).
Fixes TestConvertChatCompletionsStreamChunkToCompletions_DropsUsageOnTerminalFinishChunk by tracking hasFinishReason flag and conditionally including usage based on:
1. NOT being a terminal finish chunk, OR
2. Being a usage-only chunk (no choices)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
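The usage rule described in this commit can be sketched roughly as follows. This is a minimal illustration only: `Chunk`, `Choice`, `Usage`, and `convertChunk` are hypothetical simplified stand-ins, not the repository's actual types or the real converter.

```go
package main

// Usage, Choice, and Chunk are hypothetical, simplified shapes used only
// for this sketch; the real stream chunk types are not shown in the PR.
type Usage struct {
	PromptTokens     int
	CompletionTokens int
}

type Choice struct {
	Text         string
	FinishReason string // empty while the stream is still producing tokens
}

type Chunk struct {
	Choices []Choice
	Usage   *Usage
}

// convertChunk copies a chunk and keeps usage only when the chunk is not a
// terminal finish chunk, or when it is a usage-only chunk (no choices).
func convertChunk(in Chunk) Chunk {
	out := Chunk{Choices: append([]Choice(nil), in.Choices...)}

	hasFinishReason := false
	for _, c := range in.Choices {
		if c.FinishReason != "" {
			hasFinishReason = true
			break
		}
	}
	usageOnly := len(in.Choices) == 0

	if in.Usage != nil && (!hasFinishReason || usageOnly) {
		u := *in.Usage // copy so the output does not alias the input
		out.Usage = &u
	}
	return out
}
```

With this shape, a chunk carrying `FinishReason: "stop"` comes out with `Usage == nil`, while a chunk with an empty `Choices` slice keeps its usage.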
- Removes ~23K LOC of duplicate executor code
- Server builds successfully
- Add api/openapi.yaml with core endpoints
- Add .github/workflows/generate-sdks.yaml for Python/TypeScript SDK generation
- Enables SDK generation from OpenAPI spec
- Add cliproxy/client.py - Python client for API
- Add cliproxy/__init__.py - SDK init
- Generated from OpenAPI spec
Root cause: Multiple config type aliases (sdk/config.SDKConfig vs pkg/llmproxy/config.SDKConfig vs internal/config.SDKConfig) were treated as different types by Go despite aliasing to the same underlying type. Similarly, ErrorMessage types in different packages were duplicated.
Changes:
1. Fixed sdk/config/config.go to import from internal/config instead of pkg/llmproxy/config, establishing correct import hierarchy
2. Updated all util functions (SetProxy, NewAnthropicHttpClient) to import from internal/config for canonical type identity
3. Made pkg/llmproxy/config re-export sdk/config types as aliases
4. Made pkg/llmproxy/interfaces/ErrorMessage an alias to internal version
5. Made pkg/llmproxy/access/config_access/provider.go accept sdk/config.SDKConfig
6. Added necessary type aliases and methods to pkg/llmproxy/config.go
Result: All config and interface types now have unified identity throughout the codebase. Type mismatches in SetProxy, NewAnthropicHttpClient, configaccess.Register, and interfaces.ErrorMessage are resolved.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
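For context on the type-identity issue: in Go, an alias declaration (`type A = B`) names the same type as its target, whereas a type definition (`type A B`) or a separately declared struct creates a distinct type, even when the fields match. The sketch below shows the difference in a single file; `Config`, `AliasConfig`, `DefinedConfig`, and `setProxy` are hypothetical names for illustration, not the repository's declarations.

```go
package main

// Config stands in for a canonical config type (hypothetical).
type Config struct {
	ProxyURL string
}

// AliasConfig is an alias declaration: it denotes exactly the same type
// as Config.
type AliasConfig = Config

// DefinedConfig is a type definition: same underlying struct, but a
// distinct type as far as the compiler is concerned.
type DefinedConfig Config

// setProxy is a hypothetical helper that accepts only the canonical type.
func setProxy(cfg *Config) string {
	return cfg.ProxyURL
}

func viaAlias(cfg *AliasConfig) string {
	return setProxy(cfg) // compiles: *AliasConfig is *Config
}

func viaDefined(cfg *DefinedConfig) string {
	// setProxy(cfg) would fail to compile here, because *DefinedConfig
	// is not *Config. An explicit conversion is required instead.
	return setProxy((*Config)(cfg))
}
```

Routing every package through one canonical declaration and re-exporting it with `=` aliases, as the change list describes, avoids the need for conversions like the one in `viaDefined`.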
- Remove duplicate type definitions in kiro_websearch_handler.go (McpRequest, McpParams, etc already in kiro_websearch.go)
- Define SDKConfig as struct in pkg/llmproxy/config instead of alias to avoid circular import
- Add Wave Batch 7 (CPB-0910..CPB-0920) to troubleshooting.md
- Clean up merge conflict markers in troubleshooting.md
This commit fixes the following test failures:
1. pkg/llmproxy/registry [setup failed]
- Fixed syntax error in registry_coverage_test.go (missing comma in assertion)
- Removed unused time import
2. pkg/llmproxy/api::TestServer_StartupSmokeEndpoints_UserAgentVariants
- Fixed test expectations to accept different response formats from different handlers
- OpenAI handler returns {object: "list", data: [...]}
- Claude handler returns {data: [...], has_more: false, first_id: "...", last_id: "..."}
- Tests now check for data field presence instead of rigid format expectations
3. pkg/llmproxy/auth/copilot::TestDeviceFlowClient_PollForToken
- Test was already passing; no changes needed
4. pkg/llmproxy/config::TestSanitizeOAuthModelAlias_AllowsSameAliasForDifferentNames
- Fixed deduplication logic to dedupe by (name, alias) pair instead of alias only
- Allows same alias to map to different models within a channel
- Example: both model-a and model-b can use shared-alias
5. pkg/llmproxy/config::TestSanitizeOAuthModelAlias_InjectsDefaultKiroWhenEmpty
- Expanded defaultGitHubCopilotAliases() to include both Opus and Sonnet models
- Updated test expectations to verify both aliases are present
Root causes:
- Syntax errors in test files
- Incorrect test expectations for handler response formats
- Deduplication logic considering only alias field, not name+alias pair
- Missing default model aliases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- API: handle different response formats from OpenAI vs Claude handlers
- Config: fix OAuth model alias deduplication to key by (name,alias) pair
- Config: expand default GitHub Copilot aliases to include Sonnet model
- Config: update test expectations for new default aliases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
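The deduplication change (keying by the (name, alias) pair rather than by alias alone) can be sketched as below. `ModelAlias`, `aliasKey`, and `dedupeAliases` are hypothetical names, and exact string matching is an assumption; the real sanitizer may normalize case or whitespace.

```go
package main

// ModelAlias is a hypothetical, simplified alias entry.
type ModelAlias struct {
	Name  string
	Alias string
}

// aliasKey is the composite map key: an entry is a duplicate only if both
// the model name and the alias match an earlier entry.
type aliasKey struct {
	name  string
	alias string
}

// dedupeAliases keeps the first occurrence of each (name, alias) pair and
// preserves input order. Exact string comparison is assumed here.
func dedupeAliases(in []ModelAlias) []ModelAlias {
	seen := make(map[aliasKey]struct{}, len(in))
	out := make([]ModelAlias, 0, len(in))
	for _, e := range in {
		k := aliasKey{name: e.Name, alias: e.Alias}
		if _, dup := seen[k]; dup {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, e)
	}
	return out
}
```

Under this keying, `model-a` and `model-b` can both map to `shared-alias`, while a repeated `model-a`/`shared-alias` entry is still dropped.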
- Add SDK types to pkg/llmproxy/config
- Remove sdk_config.go (was causing type conflicts)
- Add air.toml for hot reload development
- Build succeeds with unified types
fix: layer 2 — unify internal and pkg/llmproxy config types
fix: layer 1 — add code scanning suppressions for acceptable patterns
Adds StickyRoundRobinSelector that routes requests with the same X-Session-Key header to the same auth credential. This prevents multi-turn conversations from jumping between upstream accounts. When no session key is present, falls back to standard round-robin. Strategy can be configured as "sticky-round-robin" or "srr".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
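A rough sketch of the sticky selection idea in this commit follows. It is not the PR's StickyRoundRobinSelector: `stickySketch`, `Pick`, and the use of plain string auth IDs are assumptions made for illustration, as is the choice to re-pick when a bound credential is no longer in the available list.

```go
package main

import "sync"

// stickySketch is a hypothetical, simplified selector: it remembers which
// auth ID served each session key and otherwise rotates round-robin.
type stickySketch struct {
	mu       sync.Mutex
	sessions map[string]string // session key -> auth ID
	next     int               // round-robin cursor
}

func newStickySketch() *stickySketch {
	return &stickySketch{sessions: make(map[string]string)}
}

// Pick returns an auth ID from auths. sessionKey is the X-Session-Key
// header value, or "" when the header is absent.
func (s *stickySketch) Pick(sessionKey string, auths []string) (string, bool) {
	if len(auths) == 0 {
		return "", false
	}
	s.mu.Lock()
	defer s.mu.Unlock()

	if sessionKey != "" {
		if id, ok := s.sessions[sessionKey]; ok {
			for _, a := range auths {
				if a == id {
					return id, true // same session, same credential
				}
			}
			// Bound credential is gone; fall through and re-pick
			// (an assumption of this sketch).
		}
	}

	id := auths[s.next%len(auths)]
	s.next++
	if sessionKey != "" {
		s.sessions[sessionKey] = id
	}
	return id, true
}
```

Repeated calls with the same session key keep returning the credential chosen on the first call, while calls with an empty key advance the cursor and bind nothing.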
- Move session_key constant to shared SessionKeyMetadataKey in executor/types.go alongside other metadata keys
- Align default maxKeys to 4096, consistent with RoundRobinSelector
- Replace wipe-all eviction with partial eviction (remove half) to avoid losing all session stickiness simultaneously
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
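The partial-eviction idea can be illustrated with the sketch below. `evictHalf` and `remember` are hypothetical helpers, and the victim selection (whatever Go's unspecified map iteration order yields first) is an assumption; the PR does not say which half is removed.

```go
package main

// defaultMaxKeys mirrors the 4096 default mentioned in the commit message.
const defaultMaxKeys = 4096

// evictHalf deletes roughly half of the session bindings. Go's map
// iteration order is unspecified, so the victims are effectively
// arbitrary in this sketch. Deleting during range is safe in Go.
func evictHalf(sessions map[string]string) {
	target := len(sessions) / 2
	removed := 0
	for k := range sessions {
		if removed >= target {
			break
		}
		delete(sessions, k)
		removed++
	}
}

// remember stores a session binding, evicting half of the existing
// entries first when a new key would exceed maxKeys.
func remember(sessions map[string]string, maxKeys int, key, authID string) {
	if _, exists := sessions[key]; !exists && len(sessions) >= maxKeys {
		evictHalf(sessions)
	}
	sessions[key] = authID
}
```

Compared with clearing the whole map, about half of the active sessions keep their binding when the cap is hit, so fewer conversations switch credentials at once.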
When sticky-round-robin receives a request without an X-Session-Key header, it now uses fill-first behavior (always pick the first available auth) instead of round-robin. This keeps non-session traffic concentrated on one account.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
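Fill-first differs from round-robin in that it keeps no cursor: every request without a session key goes to the first credential that is currently usable. A minimal sketch, with `authEntry` and `fillFirst` as hypothetical names and a plain `Available` flag standing in for whatever availability check the selector really performs:

```go
package main

// authEntry is a hypothetical, simplified credential record.
type authEntry struct {
	ID        string
	Available bool
}

// fillFirst returns the first available credential in list order. It is
// stateless, so consecutive calls return the same ID until that
// credential stops being available.
func fillFirst(auths []authEntry) (string, bool) {
	for _, a := range auths {
		if a.Available {
			return a.ID, true
		}
	}
	return "", false
}
```

Because the result depends only on list order and availability, non-session traffic stays on one account until that account becomes unavailable.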
…migrate-1673-feat-sticky-round-robin
Superseded by layered CI-fix PR flow. Continuing in ci/fix branch PR #498.
…1673-feat-sticky-round-robin
ci: fix workflow required-check names for feature/koosh-migrate-1673-feat-sticky-round-robin
Migrated from router-for-me#1673 via PR migration lane.