feat(qwen): add rate limiting and quota error handling#1676
luispater merged 1 commit into router-for-me:dev
piexian commented on Feb 22, 2026
- Add 60 requests/minute rate limiting per credential using sliding window
- Detect insufficient_quota errors and set cooldown until next day (Beijing time)
- Map quota errors (HTTP 403/429) to 429 with retryAfter for conductor integration
- Cache Beijing timezone at package level to avoid repeated syscalls
- Add redactAuthID function to protect credentials in logs
- Extract wrapQwenError helper to consolidate error handling
Summary of Changes (Gemini Code Assist)

This pull request significantly enhances the robustness and reliability of the Qwen API executor by introducing comprehensive rate limiting and intelligent quota error handling. It ensures that API calls respect Qwen's usage policies, preventing excessive requests and gracefully managing daily quota exhaustion by applying appropriate cooldown periods. These changes improve the system's resilience and provide better feedback for upstream services like conductor.
Code Review
This pull request introduces rate limiting and quota error handling for Qwen API calls. It includes a sliding window rate limiter, detection of insufficient_quota errors to set a cooldown until the next day (Beijing time), and mapping of Qwen quota errors to HTTP 429 with retryAfter. Additionally, it caches the Beijing timezone, redacts authentication IDs for logging, and consolidates error handling into a helper function. The changes are well-structured and address important aspects of API interaction, such as robustness and logging.
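
The PR does not show how `redactAuthID` shortens identifiers, so the following is only one plausible shape for such a helper. The masking rule (keep four characters at each end, fully mask short values) is an assumption chosen for illustration.

```go
package main

// redactAuthID is a hypothetical sketch of a log-safe formatter for
// credential identifiers. Assumption: keeping 4 characters at each end
// is enough to tell credentials apart without exposing them.
func redactAuthID(id string) string {
	const keep = 4
	if id == "" {
		return ""
	}
	runes := []rune(id) // work on runes so multi-byte characters are not split
	if len(runes) <= 2*keep {
		// Too short to show any part safely.
		return "***"
	}
	return string(runes[:keep]) + "***" + string(runes[len(runes)-keep:])
}
```

Short identifiers are masked entirely because showing eight of nine characters would defeat the purpose of redaction.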
    "io"
    "net/http"
    "strings"
    "sync"

    if len(validTimestamps) == 0 {
        delete(qwenRateLimiter.requests, authID)
    }

The comment "Delete empty entries, otherwise update with pruned slice" is slightly misleading. The `if len(validTimestamps) == 0` block deletes the entry; when validTimestamps is not empty, the `qwenRateLimiter.requests[authID] = validTimestamps` line (line 117) updates the entry with the pruned slice. Consider rephrasing the comment for clarity.

Suggested change:

    // Delete entry if no valid timestamps remain, otherwise update with pruned slice
    if len(validTimestamps) == 0 {
        delete(qwenRateLimiter.requests, authID)
    }

    if retryAfter < time.Second {
        retryAfter = time.Second
    }

    msg := strings.ToLower(gjson.GetBytes(body, "error.message").String())
    if strings.Contains(msg, "insufficient_quota") || strings.Contains(msg, "quota exceeded") ||
        strings.Contains(msg, "free allocated quota exceeded") {
        return true
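
To show how the message check above could feed the 403/429 to 429 mapping described in the PR, here is a self-contained sketch. It is not the PR's implementation: it uses the standard library's `encoding/json` instead of gjson so it runs without third-party modules, and `qwenErrorBody`, `isQuotaError`, and `normalizeStatus` are hypothetical names. The response shape (`error.message`) is taken from the snippet; everything else is an assumption.

```go
package main

import (
	"encoding/json"
	"net/http"
	"strings"
)

// qwenErrorBody mirrors only the field this sketch reads.
// The full upstream error schema is not shown in the PR.
type qwenErrorBody struct {
	Error struct {
		Message string `json:"message"`
	} `json:"error"`
}

// isQuotaError reports whether a 403 or 429 response looks like quota
// exhaustion, based on substrings of the lower-cased error message.
func isQuotaError(status int, body []byte) bool {
	if status != http.StatusForbidden && status != http.StatusTooManyRequests {
		return false
	}
	var parsed qwenErrorBody
	if err := json.Unmarshal(body, &parsed); err != nil {
		return false // unparseable bodies are not treated as quota errors
	}
	msg := strings.ToLower(parsed.Error.Message)
	// "free allocated quota exceeded" already contains "quota exceeded",
	// so two substring checks cover all three phrases from the diff.
	return strings.Contains(msg, "insufficient_quota") ||
		strings.Contains(msg, "quota exceeded")
}

// normalizeStatus maps quota errors to 429 so that a caller can handle
// them uniformly as "retry later"; other statuses pass through unchanged.
func normalizeStatus(status int, body []byte) int {
	if isQuotaError(status, body) {
		return http.StatusTooManyRequests
	}
	return status
}
```

A caller would pair the normalized 429 with a retry-after duration, for instance the time remaining until the quota is expected to reset.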

    var authLabel, authType, authValue string
    if auth != nil {

The authID variable is now declared earlier for the checkQwenRateLimit call and has been removed from the `var authLabel, authType, authValue string` declaration. This is fine, but make sure authID is still passed correctly to recordAPIRequest (line 270) and that no other code relies on it being declared in this var statement.
…ndling-clean feat(qwen): add rate limiting and quota error handling