feat: Add per-API-key rate limiting (RPM/TPM) #45
Conversation
Add rpmLimit (requests per minute, default 50) and tpmLimit (tokens per minute, default 50000) fields to the api_keys table for per-key rate limiting configuration. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement rate limiting with token bucket algorithm:
- Support 3x burst capacity for both RPM and TPM
- Redis-based counters with automatic expiry
- Pre-flight checks for RPM/TPM limits
- Post-flight token consumption for accurate TPM tracking
- Fail-open strategy on Redis errors

Files:
- apiKeyRateLimit.ts: Core rate limiting logic
- apiKeyRateLimitPlugin.ts: Elysia plugin for request interception
- apiKeyPlugin.ts: Store API key record for downstream use

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
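The token-bucket behavior this commit describes (per-minute refill, a 3x burst cap, tokens floored at zero) can be sketched as a pure function. This is an illustrative sketch only: the names, constants, and in-memory state below are assumptions, while the PR keeps this state in Redis.

```typescript
// Illustrative token-bucket math; the real implementation lives in Redis.
const WINDOW_SECONDS = 60;
const BURST_MULTIPLIER = 3;

interface Bucket {
  tokens: number;       // tokens currently available
  lastRefillMs: number; // timestamp of the last refill
}

// Refill proportionally to elapsed time, capped at burst capacity.
function refill(bucket: Bucket, limitPerMinute: number, nowMs: number): Bucket {
  const capacity = limitPerMinute * BURST_MULTIPLIER;
  const refillRate = limitPerMinute / (WINDOW_SECONDS * 1000); // tokens per ms
  const elapsed = Math.max(0, nowMs - bucket.lastRefillMs);
  const tokens = Math.min(capacity, bucket.tokens + elapsed * refillRate);
  return { tokens, lastRefillMs: nowMs };
}

// Consume `cost` tokens if available; otherwise reject without going negative.
function tryConsume(
  bucket: Bucket,
  limitPerMinute: number,
  cost: number,
  nowMs: number,
): { allowed: boolean; bucket: Bucket } {
  const b = refill(bucket, limitPerMinute, nowMs);
  const allowed = b.tokens >= cost;
  if (allowed) b.tokens -= cost;
  return { allowed, bucket: b };
}
```

Under this model a key with rpmLimit = 50 can burst up to 150 requests, then recovers at 50 tokens per minute.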
Integrate rate limiting plugin into v1 API endpoints:
- completions.ts: OpenAI Chat Completions API
- messages.ts: Anthropic Messages API
- embeddings.ts: OpenAI Embeddings API
- responses.ts: OpenAI Responses API

Token consumption is tracked post-flight after request completion, ensuring accurate TPM accounting for streaming responses.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add admin API endpoints for managing API key rate limits:
- POST /admin/apiKey: Support rpmLimit/tpmLimit in creation
- PUT /admin/apiKey/:key/ratelimit: Update rate limit config
- GET /admin/apiKey/:key/usage: Get current usage statistics

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add UI for configuring API key rate limits:
- RateLimitDialog: Form for editing RPM/TPM limits with live usage stats
- Progress component: Radix UI progress bar for usage visualization
- Row action menu: Add "Configure Rate Limits" option

Dependencies:
- @radix-ui/react-progress: Progress bar component

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Display real-time rate limit usage in the Applications table:
- RateLimitCell: Progress bar with remaining/total display
- Auto-refresh every 10 seconds
- Smart number formatting (K/M suffixes)

i18n updates:
- en-US: Add rate limit related translations
- zh-CN: "请求并发限制" for RPM, "Token并发限制" for TPM

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Backend: Cap remaining value at base limit for clearer UI display
- Frontend: Use remaining from API directly, ensure non-negative display
- Fix progress bar percentage calculation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Display format changed from remaining/total to used/total
- Shows actual usage which better represents burst scenarios
- Highlight in orange when usage exceeds base limit (burst usage)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Introduces Redis-based per-API-key rate limiting (RPM and TPM): adds limit fields to the database, a backend rate-limit plugin and utility library, extended routes and admin endpoints, new frontend components for displaying and configuring limits, plus localization entries and a progress-bar dependency.

Changes
Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ApiServer as API Server
    participant RateLimitPlugin as apiKeyRateLimitPlugin
    participant Handler as Business handler
    participant Redis
    participant TokenConsume as consumeTokens

    Client->>ApiServer: Request with API key
    ApiServer->>RateLimitPlugin: Read apiKeyRecord and run rate-limit checks
    RateLimitPlugin->>Redis: checkRpmLimit (consume 1 request token)
    Redis-->>RateLimitPlugin: RPM allowed/remaining
    RateLimitPlugin->>Redis: checkTpmLimit (check TPM availability)
    Redis-->>RateLimitPlugin: TPM allowed/remaining
    RateLimitPlugin->>ApiServer: Return 429 if rejected, otherwise continue
    ApiServer->>Handler: Handle request (generate response, count tokens)
    Handler->>TokenConsume: Call consumeTokens to update TPM (post-flight)
    TokenConsume->>Redis: Atomically consume TPM tokens
    Redis-->>TokenConsume: Consumption confirmed
    Handler-->>Client: Return final response
    Client->>ApiServer: Or GET /apiKey/:key/usage
    ApiServer->>Redis: Call getRateLimitStatus
    Redis-->>ApiServer: Current RPM/TPM status
    ApiServer-->>Client: Usage/limit data
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
📜 Recent review details — Configuration used: Organization UI · Review profile: CHILL · Plan: Pro · 📒 Files selected for processing (5)
🔇 Additional comments (9)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@backend/src/api/v1/completions.ts`:
- Around lines 175-180: The non-streaming completion path consumes tokens without validating the token counts — completion.promptTokens and completion.completionTokens can be -1. Update the block that calls consumeTokens(apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens) to first verify both promptTokens and completionTokens are >= 0 (or compute totalTokens only when each is valid), mirroring the validation used in the streaming path, and only call consumeTokens when totalTokens is a non-negative number and apiKeyRecord exists.
In `@backend/src/api/v1/messages.ts`:
- Around lines 240-244: The non-stream branch consumes tokens without validating prompt/completion token counts, risking negative consumption when completion.promptTokens or completion.completionTokens default to -1. Update the block that calls consumeTokens (with apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens) to first check completion.promptTokens > 0 && completion.completionTokens > 0 (or equivalently compute inputTokens and outputTokens and ensure both > 0), and only call consumeTokens when that check passes, matching the streaming-path guard.
In `@backend/src/api/v1/responses.ts`:
- Around lines 251-256: The non-streaming path may call consumeTokens with negative values because completion.promptTokens and completion.completionTokens default to -1. Add the same guard used in the streaming path: only call consumeTokens when completion.promptTokens > 0 && completion.completionTokens > 0, passing apiKeyRecord.id and apiKeyRecord.tpmLimit only in that case, to avoid consuming negative tokens.
🧹 Nitpick comments (11)
frontend/src/pages/api-keys/rate-limit-cell.tsx (1)
12-20: Consider extracting formatNumber into a shared utility module. The formatNumber here produces compact output (e.g., 1.2K, 1.0M), which differs from the existing formatNumber in @/lib/utils.ts. Consider moving it to utils.ts and renaming it formatCompactNumber so other places that need compact number display can reuse it.

♻️ Suggested refactor

Add to frontend/src/lib/utils.ts:

```ts
export function formatCompactNumber(num: number): string {
  if (num >= 1000000) {
    return `${(num / 1000000).toFixed(1)}M`
  }
  if (num >= 1000) {
    return `${(num / 1000).toFixed(0)}K`
  }
  return num.toString()
}
```

Then import it in this file:

```diff
-function formatNumber(num: number): string {
-  if (num >= 1000000) {
-    return `${(num / 1000000).toFixed(1)}M`
-  }
-  if (num >= 1000) {
-    return `${(num / 1000).toFixed(0)}K`
-  }
-  return num.toString()
-}
+import { formatCompactNumber } from '@/lib/utils'
```

frontend/src/i18n/locales/zh-CN.json (1)
57-58: Column header terminology. The current translation uses "并发限制" (concurrency limit), while the rate-limit dialog uses "速率限制" (rate limit). Consider unifying the terminology for consistency:

♻️ Optional terminology unification

```diff
-  "pages.api-keys.columns.RPM": "请求并发限制",
-  "pages.api-keys.columns.TPM": "Token并发限制",
+  "pages.api-keys.columns.RPM": "RPM",
+  "pages.api-keys.columns.TPM": "TPM",
```

Alternatively, match the English version and use the abbreviations RPM/TPM directly, since these are industry-standard terms.
backend/drizzle/0008_fresh_wiccan.sql (1)
1-2: The migration script is correct; optionally consider range constraints. The migration is a safe additive operation with reasonable defaults. The frontend enforces valid ranges of RPM: 1-10000 and TPM: 1-10000000.

If these ranges should also be enforced at the database level, consider adding CHECK constraints:

♻️ Optional: database-level range constraints

```sql
ALTER TABLE "api_keys" ADD COLUMN "rpm_limit" integer DEFAULT 50 NOT NULL CHECK (rpm_limit >= 1 AND rpm_limit <= 10000);
ALTER TABLE "api_keys" ADD COLUMN "tpm_limit" integer DEFAULT 50000 NOT NULL CHECK (tpm_limit >= 1 AND tpm_limit <= 10000000);
```

This can serve as an extra layer of protection beyond application-level validation.
backend/src/api/v1/embeddings.ts (1)
218-226: Post-flight token consumption is implemented well. For embedding requests, tokens are correctly consumed only after the response completes. The condition store.apiKeyRecord && embeddingRecord.inputTokens > 0 ensures counting happens only for valid requests.

Still, consider adding error handling around consumeTokens; even though the underlying implementation is fail-open, explicit handling improves robustness:

♻️ Optional: add explicit error handling

```diff
 // Consume tokens for TPM rate limiting (post-flight)
 // For embeddings, we only have input tokens
 if (store.apiKeyRecord && embeddingRecord.inputTokens > 0) {
-  await consumeTokens(
+  await consumeTokens(
     store.apiKeyRecord.id,
     store.apiKeyRecord.tpmLimit,
     embeddingRecord.inputTokens,
-  );
+  ).catch((err) => {
+    logger.warn("Failed to consume tokens for rate limiting", err);
+  });
 }
```

backend/src/plugins/apiKeyRateLimitPlugin.ts (2)
41-53: The Retry-After header value could be computed dynamically. Retry-After is currently fixed at 60 seconds. For RPM (requests per minute) this is reasonable, but for TPM (tokens per minute) the actual recovery time may be shorter or longer, depending on the consumption rate. For more precision, the actual wait could be derived from the token bucket's refill rate, but the current implementation is acceptable as an MVP.
Also applies to: 61-71
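As a sketch of the dynamic computation suggested above: given how many tokens the request is short by and the per-minute limit, a more accurate Retry-After could be estimated as below. This is a hypothetical helper, not part of the PR.

```typescript
// Hypothetical helper: estimate seconds until enough tokens have refilled.
// `tokensShort` = tokens requested minus tokens currently available.
function retryAfterSeconds(tokensShort: number, limitPerMinute: number): number {
  if (tokensShort <= 0) return 0; // already enough tokens available
  const refillPerSecond = limitPerMinute / 60; // bucket refills at limit per minute
  return Math.ceil(tokensShort / refillPerSecond); // round up: wait at least long enough
}
```

For RPM this degenerates to small waits (one request refills quickly), while for TPM a large request against a small tpmLimit would yield a proportionally longer wait.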
29-33: The skip logic needs documentation. When apiKeyRecord is absent, rate limiting is silently skipped. The comment explains why, but consider emitting a DEBUG log in this case to make configuration problems easier to diagnose.

♻️ Optional: add a DEBUG log

```diff
 const apiKeyRecord = store.apiKeyRecord;
 if (!apiKeyRecord) {
   // No API key record means checkApiKey macro wasn't applied or failed
   // Skip rate limiting in this case
+  // Consider: logger.debug("Skipping rate limit - no API key record in store");
   return;
 }
```

backend/src/api/admin/apiKey.ts (1)
63-90: The rate-limit update endpoint is implemented reasonably. The PUT /apiKey/:key/ratelimit endpoint follows RESTful conventions, uses updateApiKey, and sets the updatedAt timestamp correctly.

One suggestion: consider whether rpmLimit and tpmLimit need upper-bound validation to prevent misconfigured, excessively high values:

♻️ Optional: add upper-bound validation

```diff
 body: t.Object({
-  rpmLimit: t.Number({ minimum: 1 }),
-  tpmLimit: t.Number({ minimum: 1 }),
+  rpmLimit: t.Number({ minimum: 1, maximum: 10000 }),
+  tpmLimit: t.Number({ minimum: 1, maximum: 10000000 }),
 }),
```

backend/src/utils/apiKey.ts (1)
9-24: lastSeen is updated even for invalid API keys. The current implementation calls updateApiKey to update lastSeen before validation, so even revoked or expired keys get their lastSeen timestamp refreshed on every validation attempt. This is probably not intended, since usually only valid keys should have their last-use time tracked.

Consider moving the lastSeen update to after validation succeeds:

♻️ Suggested fix

```diff
 export async function validateApiKey(key: string): Promise<ApiKey | null> {
-  const r = await updateApiKey({
-    key,
-    lastSeen: new Date(),
-  });
+  // First, fetch the key without updating
+  const r = await updateApiKey({ key });
   if (
     r !== null &&
     !r.revoked &&
     (r.expiresAt === null || r.expiresAt > new Date())
   ) {
+    // Only update lastSeen for valid keys
+    await updateApiKey({
+      key,
+      lastSeen: new Date(),
+    });
     return r;
   }
   return null;
 }
```

frontend/src/pages/api-keys/rate-limit-dialog.tsx (2)
159-165: The input's onChange handler can produce NaN. When the user clears the input, parseInt(e.target.value) returns NaN, and the || 1 fallback snaps it to 1. This is unfriendly for users who want to clear the field before typing a new value.

Consider a more robust handler:

♻️ Suggested improvement

```diff
-onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
+onChange={(e) => {
+  const value = e.target.value === '' ? '' : parseInt(e.target.value, 10);
+  field.onChange(Number.isNaN(value) ? field.value : value);
+}}
```

Also applies to: 179-185
196-198: Consider a loading label for the submit button. While mutation.isPending is true, the button is disabled but its label does not change. Consider adding visual feedback for the loading state:

♻️ Suggested improvement

```diff
-<Button type="submit" disabled={mutation.isPending}>
-  {t('pages.api-keys.rate-limit.Save')}
-</Button>
+<Button type="submit" disabled={mutation.isPending}>
+  {mutation.isPending ? t('common.Saving') : t('pages.api-keys.rate-limit.Save')}
+</Button>
```

backend/src/utils/apiKeyRateLimit.ts (1)
150-168: Type issue with consumeTokens's apiKeyId parameter. consumeTokens accepts apiKeyId: number, but call sites (e.g., responses.ts line 254) pass apiKeyRecord?.id. If apiKeyRecord is null, apiKeyRecord?.id is undefined, while the signature expects number.

Although the call sites already check that apiKeyRecord exists, consider strengthening type safety:

♻️ Suggested fix

```diff
 export async function consumeTokens(
-  apiKeyId: number,
+  apiKeyId: number | undefined,
   tpmLimit: number,
   tokensUsed: number,
 ): Promise<void> {
+  if (apiKeyId === undefined) {
+    return;
+  }
   const { tpmTokensKey, tpmTimestampKey } = getKeys(apiKeyId);
```

Alternatively, keep the current signature but make sure every call site has a proper null check.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)

- bun.lock is excluded by !**/*.lock
📒 Files selected for processing (21)

- backend/drizzle/0008_fresh_wiccan.sql
- backend/drizzle/meta/0008_snapshot.json
- backend/drizzle/meta/_journal.json
- backend/src/api/admin/apiKey.ts
- backend/src/api/v1/completions.ts
- backend/src/api/v1/embeddings.ts
- backend/src/api/v1/messages.ts
- backend/src/api/v1/responses.ts
- backend/src/db/schema.ts
- backend/src/plugins/apiKeyPlugin.ts
- backend/src/plugins/apiKeyRateLimitPlugin.ts
- backend/src/utils/apiKey.ts
- backend/src/utils/apiKeyRateLimit.ts
- frontend/package.json
- frontend/src/components/ui/progress.tsx
- frontend/src/i18n/locales/en-US.json
- frontend/src/i18n/locales/zh-CN.json
- frontend/src/pages/api-keys/columns.tsx
- frontend/src/pages/api-keys/rate-limit-cell.tsx
- frontend/src/pages/api-keys/rate-limit-dialog.tsx
- frontend/src/pages/api-keys/row-action-button.tsx
🧰 Additional context used

🧬 Code graph analysis (12)

- backend/src/plugins/apiKeyRateLimitPlugin.ts (3)
  - backend/src/plugins/apiKeyPlugin.ts: apiKeyPlugin (9-55)
  - backend/src/utils/apiKeyRateLimit.ts: checkRpmLimit (95-117), checkTpmLimit (124-144)
  - backend/src/utils/redisClient.ts: set (71-85)
- frontend/src/components/ui/progress.tsx (1)
  - frontend/src/lib/utils.ts: cn (4-6)
- frontend/src/pages/api-keys/columns.tsx (1)
  - frontend/src/pages/api-keys/rate-limit-cell.tsx: RateLimitCell (22-57)
- frontend/src/pages/api-keys/row-action-button.tsx (3)
  - frontend/src/pages/upstreams/row-action-button.tsx: RowActionButton (31-107)
  - frontend/src/components/ui/dropdown-menu.tsx: DropdownMenuItem (208-208)
  - frontend/src/pages/api-keys/rate-limit-dialog.tsx: RateLimitDialog (39-205)
- backend/src/api/admin/apiKey.ts (2)
  - backend/src/db/index.ts: updateApiKey (69-78), findApiKey (59-67)
  - backend/src/utils/apiKeyRateLimit.ts: getRateLimitStatus (174-214)
- frontend/src/pages/api-keys/rate-limit-cell.tsx (4)
  - frontend/src/lib/utils.ts: formatNumber (28-30)
  - frontend/src/lib/api.ts: api (21-21)
  - frontend/src/components/ui/skeleton.tsx: Skeleton (7-7)
  - frontend/src/components/ui/progress.tsx: Progress (23-23)
- backend/src/plugins/apiKeyPlugin.ts (2)
  - backend/src/db/index.ts: ApiKey (29-29)
  - backend/src/utils/apiKey.ts: validateApiKey (9-24)
- backend/src/utils/apiKeyRateLimit.ts (2)
  - backend/src/utils/redisClient.ts: redisClient (111-111)
  - backend/src/plugins/apiKeyRateLimitPlugin.ts: consumeTokens (6-6)
- backend/src/api/v1/messages.ts (4)
  - backend/src/plugins/apiKeyPlugin.ts: ApiKey (7-7)
  - backend/src/db/index.ts: ApiKey (29-29)
  - backend/src/plugins/apiKeyRateLimitPlugin.ts: apiKeyRateLimitPlugin (21-80)
  - backend/src/plugins/rateLimitPlugin.ts: rateLimitPlugin (12-64)
- backend/src/api/v1/completions.ts (3)
  - backend/src/plugins/apiKeyPlugin.ts: ApiKey (7-7)
  - backend/src/db/index.ts: ApiKey (29-29)
  - backend/src/plugins/apiKeyRateLimitPlugin.ts: apiKeyRateLimitPlugin (21-80)
- backend/src/api/v1/responses.ts (4)
  - backend/src/plugins/apiKeyPlugin.ts: ApiKey (7-7)
  - backend/src/db/index.ts: ApiKey (29-29)
  - backend/src/plugins/apiKeyRateLimitPlugin.ts: apiKeyRateLimitPlugin (21-80)
  - backend/src/plugins/rateLimitPlugin.ts: rateLimitPlugin (12-64)
- backend/src/utils/apiKey.ts (2)
  - backend/src/plugins/apiKeyPlugin.ts: ApiKey (7-7)
  - backend/src/db/index.ts: ApiKey (29-29), updateApiKey (69-78)
🔇 Additional comments (36)
frontend/package.json (1)
24-24: LGTM! The new @radix-ui/react-progress dependency matches the version of the other Radix UI components in the project and supports the new Progress component.

frontend/src/components/ui/progress.tsx (1)

1-23: LGTM! The Progress component is implemented correctly and follows the standard shadcn/ui pattern:

- Correctly uses forwardRef and passes the ref through
- The translateX calculation is sound and already handles value being undefined
- Styles are composed with the cn utility, allowing custom className overrides

frontend/src/pages/api-keys/rate-limit-cell.tsx (1)
22-56: LGTM! The component is well implemented:

- The useQuery configuration is sensible; a 10-second refresh interval suits real-time usage display
- staleTime: 5000 avoids repeated requests within a short window
- The limit === 0 edge case is handled correctly (line 45)
- The over-limit (burst usage) state has clear visual feedback
frontend/src/pages/api-keys/columns.tsx (1)
37-50: LGTM! The new RPM and TPM columns are implemented correctly:

- Using the same apiKey as part of the queryKey lets the two RateLimitCell components share cached data and avoid duplicate requests
- The existing column-definition pattern is followed
- i18n keys are used consistently
backend/drizzle/meta/0008_snapshot.json (1)
73-86: LGTM! The database schema change is correct:

- The rpm_limit default of 50 and tpm_limit default of 50000 are reasonable initial limits
- Both fields are NOT NULL, ensuring data integrity
- As a Drizzle ORM auto-generated snapshot file, it correctly reflects the migration
frontend/src/i18n/locales/en-US.json (1)
56-85: LGTM! The internationalized strings are complete and clear, and the key names follow the existing conventions. The descriptions include the valid input ranges (RPM: 1-10000, TPM: 1-10000000), matching the backend validation logic.
frontend/src/i18n/locales/zh-CN.json (1)
74-86: LGTM! The rate-limit dialog translations are complete and accurate, mirroring the structure of the English version.
backend/drizzle/meta/_journal.json (1)
60-67: LGTM! The migration journal entry is well formed, the index order is sequential, and the tag matches the SQL migration filename.
backend/src/db/schema.ts (1)
27-29: LGTM! The schema definition matches the SQL migration exactly, field names follow camelCase conventions, and the comments clearly describe each field's purpose.
frontend/src/pages/api-keys/row-action-button.tsx (3)
35-36: State management is implemented correctly. The new rateLimitDialogOpen state pairs correctly with the RateLimitDialog component, following React's controlled-component pattern.

107-110: The menu item is well integrated. The "Configure Rate Limits" item is sensibly placed after the view-requests action and before the destructive action (revoke), in line with UX best practices. The GaugeIcon choice is also fitting.

143-145: The dialog is integrated correctly. Placing RateLimitDialog outside the AlertDialog is the right call, avoiding potential nested-dialog problems. Props are passed completely and type-safely.

backend/src/api/v1/embeddings.ts (2)
66-66: The rate-limit plugin is integrated correctly. The middleware chain order is right: apiKeyPlugin → apiKeyRateLimitPlugin → rateLimitPlugin, ensuring the API key is validated before rate limits are checked.

238-244: The route configuration is complete. apiKeyRateLimit: true correctly enables per-request rate-limit checks, and together with checkApiKey: true forms a complete authentication and throttling flow.

backend/src/plugins/apiKeyRateLimitPlugin.ts (2)
11-19: The error response format follows the OpenAI convention. The error structure produced by createRateLimitError matches the OpenAI API's rate_limit_error format, which helps client compatibility.

73-78: Response headers on success are complete. Setting the X-RateLimit-* headers on successful requests is good practice, letting clients proactively monitor quota usage and avoid unnecessary 429 errors.

backend/src/plugins/apiKeyPlugin.ts (3)
4-7: Re-exporting the type helps downstream use. Re-exporting the ApiKey type lets modules that consume this plugin import it directly from @/plugins/apiKeyPlugin, avoiding an extra import path.

31-31: The state type is defined correctly. Typing apiKeyRecord as ApiKey | null accurately reflects the pre- and post-validation states and works well with TypeScript type guards.

34-46: Switching the macro from beforeHandle to resolve is the right choice. Using resolve rather than beforeHandle allows asynchronous validation and storing the result for downstream middleware. The flow is clear: validation failure returns 401; on success the full API key record is stored in store.apiKeyRecord.

backend/src/api/admin/apiKey.ts (2)
42-43: Defaults match the database schema. rpmLimit defaults to 50 and tpmLimit to 50000, matching the database defaults mentioned in the PR description, so creation behavior stays consistent.

92-119: The usage query endpoint is complete. GET /apiKey/:key/usage correctly:

- Validates that the API key exists
- Fetches real-time usage state from Redis
- Returns structured limit and usage data

This provides the data the frontend needs for real-time usage display.
这为前端的实时使用量显示提供了必要的数据支持。
backend/src/utils/apiKey.ts (2)
26-35: 废弃标记使用正确
checkApiKey函数的废弃处理方式合理,保持了向后兼容性,同时引导用户使用新的validateApiKey函数。
41-45: LGTM!
generateApiKey函数逻辑正确,使用crypto.randomBytes(16)生成 32 字符的十六进制密钥,格式为sk-xxx。backend/src/api/v1/responses.ts (3)
418-423: The streaming path's token consumption logic is correct. It checks inputTokens > 0 && outputTokens > 0 before consuming tokens, avoiding consumption of invalid data.

462-469: The plugin registration order is correct. The apiKeyPlugin → apiKeyRateLimitPlugin → rateLimitPlugin order ensures API key validation completes before the rate-limit check, and store.apiKeyRecord is passed correctly to the rate-limit plugin.

589-596: The route configuration is complete. The checkApiKey: true and apiKeyRateLimit: true settings ensure both API key validation and rate limiting are enabled.

frontend/src/pages/api-keys/rate-limit-dialog.tsx (2)
63-71: The useEffect dependency array could be optimized. Adding the form object to the dependency array is the ESLint-recommended practice, but the form.reset method's reference is stable. If unnecessary resets become a problem, consider using a stable reference to form.reset. The current implementation is acceptable, but please confirm there is no unexpected reset behavior in practice.

93-106: The percentage calculation logic is robust. The getRpmPercentage and getTpmPercentage functions correctly handle division-by-zero and empty-data cases, and use Math.min(100, ...) to keep the percentage from exceeding 100%.

backend/src/api/v1/completions.ts (2)
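The guarded percentage logic described in the comment above amounts to something like the following sketch. The standalone function is hypothetical; the actual dialog helpers read these values from the usage query's data.

```typescript
// Clamped usage percentage with a divide-by-zero guard, in the spirit of the
// dialog's getRpmPercentage/getTpmPercentage helpers described above.
function usagePercentage(used: number, limit: number): number {
  if (limit <= 0) return 0; // guard: no limit data yet, or limit is zero
  return Math.min(100, (used / limit) * 100); // cap burst usage at 100%
}
```

Clamping matters here because burst capacity allows `used` to exceed the base limit, which would otherwise overflow a progress bar.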
342-347: The streaming path's token consumption logic is correct, checking inputTokens > 0 && outputTokens > 0 before consuming tokens.

387-395: The plugin integration is correct, with the same plugin registration order and configuration as responses.ts.

backend/src/utils/apiKeyRateLimit.ts (3)
101-117: Fail-open strategy in the RPM check. checkRpmLimit returns { allowed: true, remaining: rpmLimit } on Redis errors — a "fail-open" strategy. This is reasonable for availability, but make sure it matches the business requirements; in more security-sensitive scenarios a "fail-closed" strategy may be needed. Consider documenting this behavior explicitly.
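The fail-open pattern noted above can be illustrated with a small wrapper. The names here are assumptions for illustration; in the PR this behavior lives inside checkRpmLimit itself.

```typescript
type RateLimitResult = { allowed: boolean; remaining: number };

// Fail-open wrapper: if the backing store (e.g. Redis) errors, allow the
// request and report the full limit as remaining rather than blocking traffic.
async function checkLimitFailOpen(
  check: () => Promise<RateLimitResult>,
  limit: number,
): Promise<RateLimitResult> {
  try {
    return await check();
  } catch {
    return { allowed: true, remaining: limit };
  }
}
```

A fail-closed variant would instead return `{ allowed: false, remaining: 0 }` in the catch branch, trading availability for strict enforcement.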
190-206: The status calculation logic is clear. getRateLimitStatus computes usage and remaining correctly:

- usage = capacity - current tokens
- remaining is capped at the base limit (not burst capacity) for easier UI display

The comment "remaining is capped at limit (not burst capacity) for UI simplicity" explains this well.
6-10: The constant definitions are sensible:

- WINDOW_SECONDS = 60: a 1-minute window
- BURST_MULTIPLIER = 3: 3x burst capacity
- KEY_EXPIRY = 120: expiry at twice the window length, ensuring keys don't expire mid-window

backend/src/api/v1/messages.ts (3)
15-16: LGTM! The imports are correct and follow the same pattern as the other API endpoints (e.g., completions.ts, responses.ts).

402-406: LGTM! The guard condition inputTokens > 0 && outputTokens > 0 is good, ensuring tokens are consumed only when valid counts exist. Per the PR description, consumeTokens is fail-open on Redis errors, so no extra try-catch wrapper is needed.

451-452: LGTM! The rate-limit plugin is integrated correctly:

- The plugin chain order is right (apiKeyPlugin → apiKeyRateLimitPlugin → rateLimitPlugin)
- The API key record is passed to downstream handlers via store.apiKeyRecord
- apiKeyRateLimit: true is used together with checkApiKey: true, ensuring apiKeyRecord is populated before the rate check

The implementation pattern is consistent with the other endpoints such as completions.ts and responses.ts. Also applies to: 456-456, 558-558, 568-568, 576-576
```ts
// Consume tokens for TPM rate limiting (post-flight)
if (apiKeyRecord) {
  const totalTokens = completion.promptTokens + completion.completionTokens;
  await consumeTokens(apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens);
}
```
The non-streaming path is missing a token validity check (same issue as in responses.ts)

As in responses.ts, the non-streaming path does not check whether the token values are valid before consuming them. completion.promptTokens and completion.completionTokens are initialized to -1.

Consider adding the same check as the streaming path (line 343):

🐛 Suggested fix
```diff
 // Consume tokens for TPM rate limiting (post-flight)
-if (apiKeyRecord) {
+if (apiKeyRecord && completion.promptTokens > 0 && completion.completionTokens > 0) {
   const totalTokens = completion.promptTokens + completion.completionTokens;
   await consumeTokens(apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens);
 }
```
```ts
}

// Store API key record for rate limiting and other uses
store.apiKeyRecord = apiKeyRecord;
```
? What is this trying to do? Storing the ApiKey in a global store clearly has a concurrency problem.
Code Review
This pull request introduces a robust per-API-key rate limiting feature, including both RPM (Requests Per Minute) and TPM (Tokens Per Minute). The implementation spans the backend with database schema changes, a Redis-based token bucket algorithm, and new admin API endpoints, as well as the frontend with a new UI for configuration and monitoring. The changes are well-structured and the feature is comprehensive. I've identified a few areas for improvement in the frontend form handling to enhance the user experience. Overall, this is a great addition to the platform.
```ts
const rateLimitSchema = z.object({
  rpmLimit: z.number().min(1).max(10000),
  tpmLimit: z.number().min(1).max(10000000),
})
```
To improve the user experience and simplify the form logic, I suggest using z.coerce.number() for the rate limit fields. This avoids the issue where clearing an input field causes it to default to 1, which can be frustrating for users. This change should be made in conjunction with removing the custom onChange handlers on the Input components.
```diff
 const rateLimitSchema = z.object({
-  rpmLimit: z.number().min(1).max(10000),
-  tpmLimit: z.number().min(1).max(10000000),
+  rpmLimit: z.coerce.number().min(1).max(10000),
+  tpmLimit: z.coerce.number().min(1).max(10000000),
 })
```
```tsx
<Input
  type="number"
  min={1}
  max={10000}
  {...field}
  onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
/>
```
This custom onChange handler can cause a poor user experience. When a user clears the input, it immediately defaults to 1, preventing them from easily typing a new number. After updating the Zod schema to use z.coerce.number(), this custom handler is no longer necessary and should be removed to improve usability.
```diff
 <Input
   type="number"
   min={1}
   max={10000}
   {...field}
-  onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
 />
```
```tsx
<Input
  type="number"
  min={1}
  max={10000000}
  {...field}
  onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
/>
```
This custom onChange handler can cause a poor user experience. When a user clears the input, it immediately defaults to 1, preventing them from easily typing a new number. After updating the Zod schema to use z.coerce.number(), this custom handler is no longer necessary and should be removed to improve usability.
```diff
 <Input
   type="number"
   min={1}
   max={10000000}
   {...field}
-  onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
 />
```
…imiting

- Fix global store concurrency bug: Replace Elysia `.state()` with `.derive()` + `resolve` macro to ensure apiKeyRecord is request-scoped, preventing rate limits from being applied to wrong keys
- Fix Redis race conditions: Implement atomic Lua scripts for token bucket operations to prevent concurrent requests from causing incorrect token counts
- Add token validity checks in non-streaming paths to prevent consuming invalid token counts
- Add `eval` method to RedisClient for Lua script execution
- Add zero-value limit protection to prevent division by zero errors

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add explicit null case handling in StopReason switches
- Add missing tool_result case in content block converters
- Add missing image and tool_result cases in Anthropic adapter
- Add content_block_stop case in OpenAI chat stream handler
- Refactor type assertion in openai-chat.ts to avoid lint warning

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@gemini-code-assist review /gemini review
Code Review
This pull request introduces a robust per-API-key rate limiting feature, including both RPM (Requests Per Minute) and TPM (Tokens Per Minute) limits. The implementation is well-designed, utilizing a token bucket algorithm with Redis and Lua scripts for atomic operations, which is excellent for preventing race conditions. The changes span the backend database schema, API endpoints, and new plugins, as well as a comprehensive frontend UI for configuration and monitoring. My review includes a couple of suggestions to improve the user experience on the frontend and to clarify a comment in the backend logic for better maintainability. Overall, this is a solid and well-executed feature.
```lua
-- Consume tokens (allow going negative for tracking)
tokens = math.max(0, tokens - consume)
```
The comment on line 238 states that the script allows the token count to go negative for tracking purposes. However, the implementation on line 239 with math.max(0, tokens - consume) explicitly prevents the token count from becoming negative. This discrepancy can be misleading for future developers. The comment should be updated to accurately describe the code's behavior.
```diff
--- Consume tokens (allow going negative for tracking)
+-- Consume tokens (flooring at 0)
 tokens = math.max(0, tokens - consume)
```
```tsx
min={1}
max={10000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
```
The || 1 part of this onChange handler causes a poor user experience. When a user clears the input to type a new number, it immediately defaults to 1. Removing this will allow the input to be temporarily empty, and validation will correctly catch it if the user submits an invalid value.
```diff
-onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
+onChange={(e) => field.onChange(parseInt(e.target.value))}
```
```tsx
min={1}
max={10000000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
```
Similar to the RPM limit input, the || 1 part of this onChange handler causes a poor user experience. When a user clears the input, it defaults to 1. Removing this fallback will improve the editing experience and allow validation to handle empty/invalid states correctly on submission.
```diff
-onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
+onChange={(e) => field.onChange(parseInt(e.target.value))}
```
@coderabbitai review
✅ Actions performed: Review triggered.
Summary
Features
Backend
- rpm_limit (default 50) and tpm_limit (default 50000) fields added to the api_keys table
- PUT /admin/apiKey/:key/ratelimit - Update rate limits
- GET /admin/apiKey/:key/usage - Get current usage statistics

Frontend
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
Release notes