fix(backend): implement weighted load balancing in selectModel by pescn · Pull Request #49 · EM-GeekLab/NexusGate

pescn · 2026-01-19T21:05:19Z

Summary

- Fixed the selectModel function in api-helpers.ts, which was ignoring the weight configuration
- Previously, the function carried a `TODO: implement weighted load balancing` comment but only returned the first candidate
- It now implements a proper weighted random selection algorithm, matching the implementation in utils/model.ts
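
The diff itself is not reproduced on this page, so the following is only a minimal sketch of weighted random selection under stated assumptions. The `Candidate` shape, the `selectWeighted` name, and the injectable `rng` parameter are hypothetical and are not taken from api-helpers.ts or utils/model.ts.

```typescript
// Hypothetical candidate shape; the real type in api-helpers.ts may differ.
interface Candidate {
  provider: string;
  weight: number;
}

// Pick one candidate with probability proportional to its weight.
// `rng` is injectable (an assumption of this sketch) so tests can be deterministic.
function selectWeighted<T extends { weight: number }>(
  candidates: readonly T[],
  rng: () => number = Math.random,
): T | null {
  if (candidates.length === 0) return null;
  // A single candidate needs no weight calculation.
  if (candidates.length === 1) return candidates[0] ?? null;

  const totalWeight = candidates.reduce((sum, c) => sum + c.weight, 0);
  // Draw a point in [0, totalWeight) and walk the candidates until it is used up.
  let threshold = rng() * totalWeight;
  for (const candidate of candidates) {
    threshold -= candidate.weight;
    if (threshold < 0) return candidate;
  }
  // Defensive fallback (for example when every weight is 0): first candidate.
  return candidates[0] ?? null;
}

const providers: Candidate[] = [
  { provider: "provider-A", weight: 1 },
  { provider: "provider-B", weight: 2 },
  { provider: "provider-C", weight: 3 },
];
const chosen = selectWeighted(providers);
```

With weights 1, 2 and 3, a draw in [0, 1) lands on provider-A, [1, 3) on provider-B, and [3, 6) on provider-C.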

Problem

When multiple providers offered the same model with different weights (e.g., provider-A weight=1, provider-B weight=2, provider-C weight=3), all requests were routed to provider-A (100%) instead of being distributed according to weights (~16.7%, ~33.3%, ~50.0%).
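
The expected percentages follow from dividing each weight by the sum of all weights (1 + 2 + 3 = 6). A small sketch of that arithmetic, using a hypothetical `expectedShares` helper that is not part of the PR:

```typescript
// Expected traffic share per provider: weight / sum of all weights.
// `expectedShares` is an illustrative helper, not a function from the repository.
function expectedShares(weights: Record<string, number>): Record<string, number> {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  const shares: Record<string, number> = {};
  for (const [name, weight] of Object.entries(weights)) {
    // Guard against a zero total to avoid dividing by zero.
    shares[name] = total > 0 ? weight / total : 0;
  }
  return shares;
}

// provider-A: 1/6, provider-B: 2/6, provider-C: 3/6
const shares = expectedShares({ "provider-A": 1, "provider-B": 2, "provider-C": 3 });
```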

Test Results

Scenario	provider-A (w=1)	provider-B (w=2)	provider-C (w=3)
Expected	~16.7%	~33.3%	~50.0%
Before fix	100% ❌	0%	0%
After fix	16.9% ✅	32.4% ✅	50.7% ✅
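
The simulation script mentioned in the test plan is not shown here. As a rough illustration of how such a distribution check could be done, the sketch below draws repeatedly from a weighted picker and reports the observed shares. The seeded generator (mulberry32), the sample size, and all function names are assumptions of this sketch, not details from the PR.

```typescript
// Small seeded PRNG (mulberry32) so the simulation is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Return the index of the chosen weight, with probability weight / total.
function pickIndex(weights: readonly number[], rng: () => number): number {
  const total = weights.reduce((sum, w) => sum + w, 0);
  let threshold = rng() * total;
  for (let i = 0; i < weights.length; i++) {
    threshold -= weights[i] ?? 0;
    if (threshold < 0) return i;
  }
  return 0; // defensive fallback
}

// Run `draws` selections and return the observed share of each index.
function simulate(weights: readonly number[], draws: number, seed: number): number[] {
  const rng = mulberry32(seed);
  const counts: number[] = weights.map(() => 0);
  for (let n = 0; n < draws; n++) {
    const i = pickIndex(weights, rng);
    counts[i] = (counts[i] ?? 0) + 1;
  }
  return counts.map((c) => c / draws);
}

// Observed shares for weights 1, 2, 3; they should approach 1/6, 1/3 and 1/2.
const observed = simulate([1, 2, 3], 60000, 42);
```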

Affected Endpoints

/v1/chat/completions
/v1/messages
/v1/responses

Note: /v1/embeddings was already using the correct weighted selection from utils/model.ts.

Test plan

Verified with simulation test script (scripts/test-selectmodel-comparison.ts)
Type check passes (bun run check)
Lint passes (bun run lint)

🤖 Generated with Claude Code

Summary by CodeRabbit

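Improvements

- Upgraded the model selection mechanism to use a weighted random load-balancing algorithm, distributing requests more intelligently across the available models and improving system stability and reliability.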


The selectModel function in api-helpers.ts had a TODO comment but was only returning the first candidate, ignoring the weight configuration.

This fix implements proper weighted random selection algorithm that distributes requests across multiple providers based on their configured weights.

The same algorithm was already implemented in utils/model.ts (used by embeddings endpoint) but not in api-helpers.ts (used by chat completions, messages, and responses endpoints).

Affected endpoints:
- /v1/chat/completions
- /v1/messages
- /v1/responses

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

gemini-code-assist · 2026-01-19T21:05:33Z

Summary of Changes

Hello @pescn, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a crucial fix to the selectModel function, which was previously failing to apply weighted load balancing for model selection. The change ensures that API requests are now distributed across available models based on their configured weights, preventing disproportionate routing to a single model and improving the overall efficiency and fairness of resource utilization for several key endpoints.

Highlights

Weighted Load Balancing Implementation: The selectModel function in api-helpers.ts has been updated to correctly implement weighted load balancing, addressing a previous TODO comment and ensuring model selection respects configured weights.
Corrected Model Distribution: This fix resolves an issue where all requests were routed to the first candidate model regardless of weight, now distributing traffic according to the specified weights for /v1/chat/completions, /v1/messages, and /v1/responses endpoints.
Alignment with Existing Logic: The newly implemented weighted random selection algorithm in api-helpers.ts now matches the logic already present in utils/model.ts, ensuring consistency across the codebase.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

coderabbitai · 2026-01-19T21:05:40Z

📝 Walkthrough

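Overview

Implemented weighted random load balancing in the selectModel function, adding handling for a single candidate, weighted random selection among multiple candidates based on model.weight, logging of the selection, and a fallback mechanism. The documentation comments were updated accordingly.

Changes

Category / File	Summary of changes
Weighted load balancing `backend/src/utils/api-helpers.ts`	Implements weighted random selection logic in `selectModel`: direct return for a single candidate, weight-based random selection among multiple candidates, logging of selection events, and a fallback to the first candidate

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 Weights dance among the numbers,
load balancing breaks into a smile,
a random walk finds the chosen path,
logs record every selection,
a steady fallback guards the way!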

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
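Title check	✅ Passed	The PR title clearly and accurately describes the main change, implementing weighted load balancing in the selectModel function, and is fully related to the code changes.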
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae4895b and 4003963.

📒 Files selected for processing (1)

backend/src/utils/api-helpers.ts

🔇 Additional comments (2)

backend/src/utils/api-helpers.ts (2)

109-113: LGTM!

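The optimized handling of a single candidate is correct. Returning early avoids an unnecessary weight calculation, and the ! non-null assertion is safe after the length === 1 check.

134-135: The fallback handling is reasonable.

As defensive programming, providing a fallback on a code path that should in theory never be reached is good practice. Using ?? null keeps the return type consistent.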


gemini-code-assist

Code Review

This pull request correctly implements weighted load balancing in the selectModel function within api-helpers.ts, resolving an issue where model weights were ignored. The implementation now aligns with the existing logic in utils/model.ts, ensuring consistent behavior. The change is well-described and tested. My review includes one suggestion to improve code clarity.

gemini-code-assist · 2026-01-19T21:06:45Z

+    // oxlint-disable-next-line no-unnecessary-type-assertion
+    return candidates[0]!; // TypeScript needs assertion here


The non-null assertion ! appears to be unnecessary here. The if (candidates.length === 1) check on line 110 guarantees that candidates[0] is defined. The oxlint-disable-next-line comment also suggests that a linter agrees the assertion is not needed. You can simplify the code by removing the assertion and the related comments.

Suggested change

-    // oxlint-disable-next-line no-unnecessary-type-assertion
-    return candidates[0]!; // TypeScript needs assertion here
+    return candidates[0];
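
Whether the assertion can simply be dropped depends on compiler settings that this thread does not state. With `noUncheckedIndexedAccess` enabled, for example, `candidates[0]` on a plain array is typed as `T | undefined` even after a length check, so removing the `!` alone may not type-check. One possible way to avoid both the assertion and the lint suppression is sketched below; `onlyCandidate` is a hypothetical name, not code from the PR.

```typescript
// Return the sole element of a one-element array, or null otherwise.
// Destructuring plus an explicit undefined check lets TypeScript narrow the type
// without a non-null assertion, regardless of the noUncheckedIndexedAccess setting.
function onlyCandidate<T>(candidates: readonly T[]): T | null {
  const [first] = candidates;
  if (candidates.length === 1 && first !== undefined) return first;
  return null;
}

const single = onlyCandidate(["provider-A"]);
```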

…quests

Implement automatic failover capability that switches to alternate providers when a request fails with retriable errors (5xx, 429, network timeouts).

Key features:
- Failover service with configurable retry behavior
- Exponential backoff with jitter for retries
- Same-provider retries for transient errors (429)
- Cross-provider failover for persistent failures
- Weighted random selection for failover candidates
- Detailed error logging with all attempted providers

Also includes the weighted load balancing fix from PR #49:
- Implement proper weighted random selection in selectModel()
- Add filterCandidates() helper for failover use

Affected endpoints:
- /v1/chat/completions
- /v1/messages
- /v1/responses

Configuration defaults:
- maxProviderAttempts: 3
- sameProviderRetries: 1
- retriableStatusCodes: [429, 500, 502, 503, 504]
- timeoutMs: 120000 (2 minutes)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
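
The failover service code is not included on this page. As a hedged illustration of how the listed defaults and "exponential backoff with jitter" might fit together, here is a minimal sketch. The field names and default values come from the commit message; the `FailoverConfig` interface, the helper names, and the `baseMs` and `capMs` values are assumptions.

```typescript
// Field names and defaults are taken from the commit message above.
interface FailoverConfig {
  maxProviderAttempts: number;
  sameProviderRetries: number;
  retriableStatusCodes: number[];
  timeoutMs: number;
}

const defaultFailoverConfig: FailoverConfig = {
  maxProviderAttempts: 3,
  sameProviderRetries: 1,
  retriableStatusCodes: [429, 500, 502, 503, 504],
  timeoutMs: 120000, // 2 minutes
};

// Hypothetical helper: is this HTTP status worth retrying?
function isRetriableStatus(
  status: number,
  config: FailoverConfig = defaultFailoverConfig,
): boolean {
  return config.retriableStatusCodes.includes(status);
}

// "Full jitter" exponential backoff: a uniform delay in [0, min(capMs, baseMs * 2^attempt)).
// baseMs and capMs are illustrative values; the commit message does not specify them.
function backoffDelayMs(
  attempt: number,
  baseMs = 200,
  capMs = 5000,
  rng: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(rng() * ceiling);
}
```

Randomizing the delay spreads retries from many concurrent requests over time, so they do not all hit a recovering provider at the same instant.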

…quests (#50)

* feat(backend): add automatic failover and retry for multi-provider requests

Implement automatic failover capability that switches to alternate providers when a request fails with retriable errors (5xx, 429, network timeouts).

Key features:
- Failover service with configurable retry behavior
- Exponential backoff with jitter for retries
- Same-provider retries for transient errors (429)
- Cross-provider failover for persistent failures
- Weighted random selection for failover candidates
- Detailed error logging with all attempted providers

Also includes the weighted load balancing fix from PR #49:
- Implement proper weighted random selection in selectModel()
- Add filterCandidates() helper for failover use

Affected endpoints:
- /v1/chat/completions
- /v1/messages
- /v1/responses

Configuration defaults:
- maxProviderAttempts: 3
- sameProviderRetries: 1
- retriableStatusCodes: [429, 500, 502, 503, 504]
- timeoutMs: 120000 (2 minutes)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address PR review comments for failover implementation

- Change modelId fallback from ?? 0 to undefined to avoid database integrity issues
- Add AbortError handling in isRetriableNetworkError for timeout scenarios
- Fix SSE error format in responses.ts to match OpenAI Responses API spec
- Add error capture to catch blocks for better logging
- Optimize selectMultipleCandidates by calculating totalWeight once

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor: extract shared failover error handling to reduce code duplication

- Add processFailoverError helper function in api-helpers.ts
- Refactor completions.ts, messages.ts, responses.ts to use shared helper
- Reduces ~120 lines of duplicated error handling code
- Maintains same behavior: forward non-retriable errors, return 502 on exhaustion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address code review feedback for failover service

- Use more specific error message patterns in isRetriableNetworkError to avoid false positives (e.g., "network error" instead of "network")
- Simplify provider iteration loop using for...of with entries() since candidates are already unique and ordered from selectMultipleCandidates
- Remove redundant triedProviders Set

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: address additional code review feedback

- Fix SSE error type in messages.ts to use "server_error" per Anthropic spec
- Remove unused "internal_error" case from FailoverErrorResult type

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
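
The commit messages describe failover candidates as unique, ordered, and chosen by weight, with the total weight calculated once. The actual selectMultipleCandidates code is not shown, so the following is only one plausible reading: weighted sampling without replacement. The `orderByWeight` name, its signature, and the injectable `rng` are hypothetical.

```typescript
// Produce up to `limit` distinct candidates in weighted random order.
// The total weight is computed once and decremented as candidates are removed.
function orderByWeight<T extends { weight: number }>(
  candidates: readonly T[],
  limit: number,
  rng: () => number = Math.random,
): T[] {
  const pool = [...candidates];
  const ordered: T[] = [];
  let remainingWeight = pool.reduce((sum, c) => sum + c.weight, 0);

  while (pool.length > 0 && ordered.length < limit) {
    let threshold = rng() * remainingWeight;
    let index = pool.length - 1; // fallback for floating-point edge cases
    for (let i = 0; i < pool.length; i++) {
      threshold -= pool[i]?.weight ?? 0;
      if (threshold < 0) {
        index = i;
        break;
      }
    }
    const [picked] = pool.splice(index, 1);
    if (picked === undefined) break;
    ordered.push(picked);
    remainingWeight -= picked.weight;
  }
  return ordered;
}

const failoverOrder = orderByWeight(
  [
    { provider: "provider-A", weight: 1 },
    { provider: "provider-B", weight: 2 },
    { provider: "provider-C", weight: 3 },
  ],
  3,
);
```

Because each pick is removed from the pool, a caller can iterate the result in order without tracking which providers were already tried.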


pescn mentioned this pull request Jan 19, 2026

feat(backend): add automatic failover and retry for multi-provider requests #50

Merged

4 tasks

pescn merged commit d1c02d9 into main Jan 19, 2026
2 checks passed

pescn deleted the fix/weighted-load-balancing branch January 24, 2026 22:54

pescn restored the fix/weighted-load-balancing branch January 24, 2026 22:57

pescn deleted the fix/weighted-load-balancing branch January 24, 2026 23:02

pescn merged 1 commit into main from fix/weighted-load-balancing
