
feat: Add per-API-key rate limiting (RPM/TPM)#45

Merged
pescn merged 10 commits into main from feat/api-key-rate-limit-v2 on Jan 15, 2026

Conversation

@pescn
Contributor

@pescn pescn commented Jan 14, 2026

Summary

  • Add RPM (Requests Per Minute) and TPM (Tokens Per Minute) rate limiting for each API key
  • Implement token bucket algorithm with 3x burst capacity
  • Add admin API endpoints for rate limit configuration
  • Add frontend UI for configuring and monitoring rate limits

Features

Backend

  • Database: Add rpm_limit (default 50) and tpm_limit (default 50000) fields to api_keys table
  • Rate Limiting Service: Token bucket algorithm with Redis-based counters
    • 3x burst capacity support
    • Pre-flight RPM/TPM checks
    • Post-flight token consumption (accurate for streaming responses)
    • Fail-open on Redis errors
  • API Endpoints: Rate limiting applied to all v1 endpoints (completions, messages, embeddings, responses)
  • Admin API:
    • PUT /admin/apiKey/:key/ratelimit - Update rate limits
    • GET /admin/apiKey/:key/usage - Get current usage statistics
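The pre-flight RPM check described above can be sketched as a token bucket with 3x burst capacity. This in-memory version is illustrative only: the actual implementation keeps its counters in Redis (apiKeyRateLimit.ts), and the helper names and structure here are assumptions, not the project's API.

```typescript
// In-memory sketch of the RPM token bucket with 3x burst capacity.
// The real implementation stores these counters in Redis; names here
// are illustrative only.
const WINDOW_MS = 60_000;    // 1-minute window
const BURST_MULTIPLIER = 3;  // burst capacity = 3x the base limit

interface Bucket {
  tokens: number;
  lastRefill: number;
}

const buckets = new Map<number, Bucket>();

// Returns true when the request is allowed, consuming one request token.
function checkRpmLimit(apiKeyId: number, rpmLimit: number, now: number): boolean {
  const capacity = rpmLimit * BURST_MULTIPLIER;
  let bucket = buckets.get(apiKeyId);
  if (!bucket) {
    bucket = { tokens: capacity, lastRefill: now };
    buckets.set(apiKeyId, bucket);
  }
  // Refill continuously at rpmLimit tokens per minute, capped at capacity.
  const elapsed = now - bucket.lastRefill;
  bucket.tokens = Math.min(capacity, bucket.tokens + (elapsed / WINDOW_MS) * rpmLimit);
  bucket.lastRefill = now;
  if (bucket.tokens >= 1) {
    bucket.tokens -= 1;
    return true;
  }
  return false;
}
```

With `rpmLimit = 2` the burst capacity is 6, so six immediate requests pass and the seventh is rejected until the bucket refills.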

Frontend

  • Rate Limit Dialog: Configure RPM/TPM limits with live usage statistics
  • Usage Display: Progress bars in Applications table showing real-time usage
  • i18n: Added translations for rate limit features (en-US, zh-CN)

Test plan

  • Create new API key and verify default rate limits (RPM: 50, TPM: 50000)
  • Update rate limits via admin API and verify changes persist
  • Send multiple requests rapidly to test RPM limiting (expect 429 after limit)
  • Verify streaming responses correctly track token consumption
  • Test frontend rate limit dialog opens and displays current values
  • Verify progress bars update in real-time (10-second refresh)

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features
    • Added configurable per-API-key rate limits (RPM/TPM), defaulting to 50/50000; they can be viewed, edited, and monitored in real time from the admin UI.
    • The admin UI gains RPM/TPM columns, usage progress bars, and a "Configure Limits" dialog with instant save and success/failure toasts (with localized text).
    • The public API supports querying/updating a single key's limits and usage data; over-limit requests receive a rate-limit error with a retry hint and rate-limit headers.


pescn and others added 8 commits January 15, 2026 00:33
Add rpmLimit (requests per minute, default 50) and tpmLimit (tokens per
minute, default 50000) fields to the api_keys table for per-key rate
limiting configuration.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement rate limiting with token bucket algorithm:
- Support 3x burst capacity for both RPM and TPM
- Redis-based counters with automatic expiry
- Pre-flight checks for RPM/TPM limits
- Post-flight token consumption for accurate TPM tracking
- Fail-open strategy on Redis errors

Files:
- apiKeyRateLimit.ts: Core rate limiting logic
- apiKeyRateLimitPlugin.ts: Elysia plugin for request interception
- apiKeyPlugin.ts: Store API key record for downstream use

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Integrate rate limiting plugin into v1 API endpoints:
- completions.ts: OpenAI Chat Completions API
- messages.ts: Anthropic Messages API
- embeddings.ts: OpenAI Embeddings API
- responses.ts: OpenAI Response API

Token consumption is tracked post-flight after request completion,
ensuring accurate TPM accounting for streaming responses.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add admin API endpoints for managing API key rate limits:
- POST /admin/apiKey: Support rpmLimit/tpmLimit in creation
- PUT /admin/apiKey/:key/ratelimit: Update rate limit config
- GET /admin/apiKey/:key/usage: Get current usage statistics

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add UI for configuring API key rate limits:
- RateLimitDialog: Form for editing RPM/TPM limits with live usage stats
- Progress component: Radix UI progress bar for usage visualization
- Row action menu: Add "Configure Rate Limits" option

Dependencies:
- @radix-ui/react-progress: Progress bar component

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Display real-time rate limit usage in the Applications table:
- RateLimitCell: Progress bar with remaining/total display
- Auto-refresh every 10 seconds
- Smart number formatting (K/M suffixes)

i18n updates:
- en-US: Add rate limit related translations
- zh-CN: "请求并发限制" for RPM, "Token并发限制" for TPM

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Backend: Cap remaining value at base limit for clearer UI display
- Frontend: Use remaining from API directly, ensure non-negative display
- Fix progress bar percentage calculation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Display format changed from remaining/total to used/total
- Shows actual usage which better represents burst scenarios
- Highlight in orange when usage exceeds base limit (burst usage)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jan 15, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Introduces Redis-based per-API-key rate limiting (RPM and TPM): adds limit fields to the database, a rate limiting plugin and utility library on the backend, extended routes and admin endpoints, plus new frontend components for displaying/configuring limits, localization entries, and a progress bar dependency.

Changes

Cohort / File(s) | Summary of changes
Database migration and snapshots
backend/drizzle/0008_fresh_wiccan.sql, backend/drizzle/meta/0008_snapshot.json, backend/drizzle/meta/_journal.json, backend/src/db/schema.ts
Adds rpm_limit and tpm_limit (defaults 50 / 50000) to the api_keys table; includes the migration SQL, schema snapshot, and journal entry
Rate limiting core utilities
backend/src/utils/apiKeyRateLimit.ts, backend/src/utils/apiKey.ts, backend/src/utils/redisClient.ts
Adds a Redis-backed token bucket implementation (checkRpmLimit, checkTpmLimit, consumeTokens, getRateLimitStatus), validateApiKey (returns the full ApiKey), and an eval method on the Redis client
Plugin and auth changes
backend/src/plugins/apiKeyRateLimitPlugin.ts, backend/src/plugins/apiKeyPlugin.ts
Adds apiKeyRateLimitPlugin and integrates it as middleware/macro; switches key verification to a validate flow deriving apiKeyRecord on the request context; exports the ApiKey type and re-exports consumeTokens
Admin endpoint extensions
backend/src/api/admin/apiKey.ts
Includes rpmLimit/tpmLimit when creating an API key; adds PUT /apiKey/:key/ratelimit (update limits) and GET /apiKey/:key/usage (query usage)
Public API integration
backend/src/api/v1/completions.ts, backend/src/api/v1/embeddings.ts, backend/src/api/v1/messages.ts, backend/src/api/v1/responses.ts
Injects apiKeyRateLimitPlugin into these routes, extends handlers to receive apiKeyRecord, and performs post-response TPM consumption (consumeTokens); route options gain apiKeyRateLimit: true
Frontend components and pages
frontend/src/components/ui/progress.tsx, frontend/src/pages/api-keys/rate-limit-cell.tsx, frontend/src/pages/api-keys/rate-limit-dialog.tsx
Adds the Progress component, RateLimitCell (live RPM/TPM usage display), and RateLimitDialog (configuration dialog with form validation and update logic)
Table columns and row actions
frontend/src/pages/api-keys/columns.tsx, frontend/src/pages/api-keys/row-action-button.tsx
Adds RPM/TPM columns to the API Keys list and a "Configure Rate Limits" row action that opens the dialog
Localization and dependencies
frontend/src/i18n/locales/en-US.json, frontend/src/i18n/locales/zh-CN.json, frontend/package.json
Adds RPM/TPM localization entries (en-US and zh-CN) and the @radix-ui/react-progress frontend dependency

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ApiServer as API Server
    participant RateLimitPlugin as apiKeyRateLimitPlugin
    participant Handler as Business Handler
    participant Redis
    participant TokenConsume as consumeTokens

    Client->>ApiServer: Request with API key
    ApiServer->>RateLimitPlugin: Read apiKeyRecord and run rate limit checks
    RateLimitPlugin->>Redis: checkRpmLimit (consumes 1 request token)
    Redis-->>RateLimitPlugin: RPM allowed / remaining
    RateLimitPlugin->>Redis: checkTpmLimit (checks TPM availability)
    Redis-->>RateLimitPlugin: TPM allowed / remaining
    RateLimitPlugin->>ApiServer: Return 429 if rejected, otherwise continue
    ApiServer->>Handler: Run business logic (generate response, count tokens)
    Handler->>TokenConsume: Call consumeTokens to update TPM (post-flight)
    TokenConsume->>Redis: Atomically consume TPM tokens
    Redis-->>TokenConsume: Consumption confirmed
    Handler-->>Client: Return final response
    Client->>ApiServer: Or request GET /apiKey/:key/usage
    ApiServer->>Redis: Call getRateLimitStatus
    Redis-->>ApiServer: Current RPM/TPM status
    ApiServer-->>Client: Return usage/limit data

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 A rate-limit rabbit on Redis ground,
Tokens in buckets, quiet yet bound,
RPM and TPM measured minute by minute,
Frontend gauges show all within it,
Limits in force to guard the door.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and accurately summarizes the PR's main change: adding per-API-key rate limiting (RPM/TPM).
Docstring Coverage ✅ Passed Docstring coverage is 86.36% which is sufficient. The required threshold is 80.00%.




📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ccce3fd and 6c6cf1e.

📒 Files selected for processing (5)
  • backend/src/adapters/response/anthropic.ts
  • backend/src/adapters/response/openai-chat.ts
  • backend/src/adapters/response/openai-response.ts
  • backend/src/adapters/upstream/anthropic.ts
  • backend/src/adapters/upstream/openai.ts
🔇 Additional comments (9)
backend/src/adapters/upstream/openai.ts (1)

239-242: LGTM!

Explicitly handling the null case is a good improvement: it makes the intent clearer and is consistent with the null handling in the other adapters in this PR.

backend/src/adapters/response/anthropic.ts (2)

64-66: LGTM!

Explicitly handling the null case and removing the default branch relies on TypeScript's exhaustiveness checking to ensure every StopReason value is covered. This is a stricter and safer implementation.


94-97: LGTM!

Explicitly handling the tool_result type is the right approach. The comment clearly explains why: tool results are never included in Anthropic assistant responses.

backend/src/adapters/upstream/anthropic.ts (2)

260-263: LGTM!

Explicitly handles the null case, consistent with the changes in the other adapters. Keeping the default branch preserves forward compatibility.


287-290: LGTM!

Explicitly ignoring the image and tool_result block types is reasonable. These block types have no representation in the internal format, and the convertResponse function (lines 301-304) correctly filters out the null results.

backend/src/adapters/response/openai-response.ts (1)

71-72: LGTM!

null 映射到 "completed" 对于 Response API 是合理的语义。移除 default 分支后,switch 语句依赖 TypeScript 穷尽性检查确保所有 StopReason 值都被覆盖。

backend/src/adapters/response/openai-chat.ts (3)

110-112: LGTM!

Explicitly handles the null case, consistent with the other adapters.


274-286: LGTM!

Using the explicit type extension OpenAIChatDelta & { reasoning_content?: string } to handle thinking content is a clean approach: it avoids type assertions while staying compatible with models that emit a reasoning_content field (such as DeepSeek).


325-328: LGTM!

The OpenAI Chat format indeed has no explicit block stop event, so returning an empty string is the correct behavior. The comment clearly explains why.



Comment @coderabbitai help to get the list of available commands and usage tips.

@koitococo
Contributor

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jan 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@backend/src/api/v1/completions.ts`:
- Around line 175-180: The non-streaming completion path consumes tokens without
validating token counts—completion.promptTokens and completion.completionTokens
can be -1—so update the block that calls consumeTokens(apiKeyRecord.id,
apiKeyRecord.tpmLimit, totalTokens) to first verify both promptTokens and
completionTokens are >= 0 (or compute totalTokens only when each is valid),
mirroring the check used in the streaming path (the same validation around
completion.promptTokens/completion.completionTokens used before consumeTokens);
only call consumeTokens when totalTokens is a non-negative number and
apiKeyRecord exists.

In `@backend/src/api/v1/messages.ts`:
- Around line 240-244: The non-stream branch consumes tokens without validating
prompt/completion token counts, risking negative consumption when
completion.promptTokens or completion.completionTokens default to -1; update the
block that calls consumeTokens (using apiKeyRecord.id, apiKeyRecord.tpmLimit,
totalTokens) to first check that completion.promptTokens > 0 &&
completion.completionTokens > 0 (or equivalently compute inputTokens and
outputTokens and ensure both > 0) and only call consumeTokens when that check
passes, matching the streaming-path guard.

In `@backend/src/api/v1/responses.ts`:
- Around line 251-256: The non-streaming path may call consumeTokens with
negative values because completion.promptTokens and completion.completionTokens
default to -1; add the same guard used in the streaming path to only call
consumeTokens when completion.promptTokens > 0 && completion.completionTokens >
0 (use those exact symbols) and pass apiKeyRecord.id and apiKeyRecord.tpmLimit
to consumeTokens only in that case to avoid consuming negative tokens.
🧹 Nitpick comments (11)
frontend/src/pages/api-keys/rate-limit-cell.tsx (1)

12-20: Consider extracting formatNumber into a shared utility module.

The formatNumber here is used for compact display (e.g. 1.2K, 1.0M) and differs from the existing formatNumber in @/lib/utils.ts. Consider moving this function to utils.ts and renaming it formatCompactNumber so it can be reused anywhere compact number display is needed.

♻️ Suggested refactor

Add to frontend/src/lib/utils.ts:

export function formatCompactNumber(num: number): string {
  if (num >= 1000000) {
    return `${(num / 1000000).toFixed(1)}M`
  }
  if (num >= 1000) {
    return `${(num / 1000).toFixed(0)}K`
  }
  return num.toString()
}

Then import and use it in this file:

-function formatNumber(num: number): string {
-  if (num >= 1000000) {
-    return `${(num / 1000000).toFixed(1)}M`
-  }
-  if (num >= 1000) {
-    return `${(num / 1000).toFixed(0)}K`
-  }
-  return num.toString()
-}
+import { formatCompactNumber } from '@/lib/utils'
frontend/src/i18n/locales/zh-CN.json (1)

57-58: Column header terminology suggestion

The current translation uses "并发限制" (concurrency limit), while the rate limit dialog uses "速率限制" (rate limit). Consider unifying the terminology for consistency:

♻️ Optional terminology unification
-  "pages.api-keys.columns.RPM": "请求并发限制",
-  "pages.api-keys.columns.TPM": "Token并发限制",
+  "pages.api-keys.columns.RPM": "RPM",
+  "pages.api-keys.columns.TPM": "TPM",

Or match the English version and use the abbreviations RPM/TPM directly, since these are industry-standard terms.

backend/drizzle/0008_fresh_wiccan.sql (1)

1-2: The migration script is correct; optionally consider range constraints

The migration is a safe additive change with reasonable defaults. The frontend shows valid ranges of RPM: 1-10000 and TPM: 1-10000000.

To enforce these ranges at the database layer, consider adding CHECK constraints:

♻️ Optional: add database-level range constraints
ALTER TABLE "api_keys" ADD COLUMN "rpm_limit" integer DEFAULT 50 NOT NULL CHECK (rpm_limit >= 1 AND rpm_limit <= 10000);
ALTER TABLE "api_keys" ADD COLUMN "tpm_limit" integer DEFAULT 50000 NOT NULL CHECK (tpm_limit >= 1 AND tpm_limit <= 10000000);

This would serve as an extra layer of protection beyond application-level validation.

backend/src/api/v1/embeddings.ts (1)

218-226: Post-flight token consumption is implemented well

For embedding requests, tokens are correctly consumed only after the response completes. The store.apiKeyRecord && embeddingRecord.inputTokens > 0 check ensures counting happens only for valid requests.

Still, consider adding error handling around consumeTokens; even though the underlying implementation is fail-open, handling errors explicitly improves robustness:

♻️ Optional: add explicit error handling
       // Consume tokens for TPM rate limiting (post-flight)
       // For embeddings, we only have input tokens
       if (store.apiKeyRecord && embeddingRecord.inputTokens > 0) {
-        await consumeTokens(
+        await consumeTokens(
           store.apiKeyRecord.id,
           store.apiKeyRecord.tpmLimit,
           embeddingRecord.inputTokens,
-        );
+        ).catch((err) => {
+          logger.warn("Failed to consume tokens for rate limiting", err);
+        });
       }
backend/src/plugins/apiKeyRateLimitPlugin.ts (2)

41-53: The Retry-After header value could be computed dynamically

Retry-After is currently fixed at 60 seconds. That is reasonable for RPM (requests per minute), but for TPM (tokens per minute) the actual recovery time may be shorter or longer depending on the consumption rate.

For more precision, the actual wait time could be derived from the token bucket's refill rate, but the current implementation is acceptable as an MVP.

Also applies to: 61-71


29-33: The skip logic deserves documentation

When apiKeyRecord is absent, rate limiting is silently skipped. The comment explains why, but consider logging this case at DEBUG level to make configuration issues easier to diagnose.

♻️ Optional: add a DEBUG log
         const apiKeyRecord = store.apiKeyRecord;
         if (!apiKeyRecord) {
           // No API key record means checkApiKey macro wasn't applied or failed
           // Skip rate limiting in this case
+          // Consider: logger.debug("Skipping rate limit - no API key record in store");
           return;
         }
backend/src/api/admin/apiKey.ts (1)

63-90: 速率限制更新端点实现合理

PUT /apiKey/:key/ratelimit 端点设计符合 RESTful 规范。使用 updateApiKey 并正确设置 updatedAt 时间戳。

一点建议:考虑是否需要对 rpmLimittpmLimit 设置上限验证,防止误配置过高的值:

♻️ 可选:添加上限验证
       body: t.Object({
-        rpmLimit: t.Number({ minimum: 1 }),
-        tpmLimit: t.Number({ minimum: 1 }),
+        rpmLimit: t.Number({ minimum: 1, maximum: 10000 }),
+        tpmLimit: t.Number({ minimum: 1, maximum: 10000000 }),
       }),
backend/src/utils/apiKey.ts (1)

9-24: lastSeen is updated even for invalid API keys

The current implementation calls updateApiKey to refresh lastSeen before validating, so even a revoked or expired key has its lastSeen timestamp updated on every validation attempt. This is probably unintended, since usually only valid keys' last-use times should be tracked.

Consider moving the lastSeen update until after successful validation:

♻️ Suggested fix
 export async function validateApiKey(key: string): Promise<ApiKey | null> {
-  const r = await updateApiKey({
-    key,
-    lastSeen: new Date(),
-  });
+  // First, fetch the key without updating
+  const r = await updateApiKey({ key });
 
   if (
     r !== null &&
     !r.revoked &&
     (r.expiresAt === null || r.expiresAt > new Date())
   ) {
+    // Only update lastSeen for valid keys
+    await updateApiKey({
+      key,
+      lastSeen: new Date(),
+    });
     return r;
   }
 
   return null;
 }
frontend/src/pages/api-keys/rate-limit-dialog.tsx (2)

159-165: The input's onChange handler can produce NaN

When the user clears the input, parseInt(e.target.value) returns NaN, and || 1 then falls back to 1. This hurts usability, since users often want to clear the field before typing a new value.

A more robust approach:

♻️ Suggested improvement
-  onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
+  onChange={(e) => {
+    const value = e.target.value === '' ? '' : parseInt(e.target.value, 10);
+    field.onChange(Number.isNaN(value) ? field.value : value);
+  }}

Also applies to: 179-185


196-198: Consider a loading label for the submit button

While mutation.isPending is true, the button is disabled but its label is unchanged. Consider adding visual feedback for the loading state:

♻️ Suggested improvement
-  <Button type="submit" disabled={mutation.isPending}>
-    {t('pages.api-keys.rate-limit.Save')}
-  </Button>
+  <Button type="submit" disabled={mutation.isPending}>
+    {mutation.isPending ? t('common.Saving') : t('pages.api-keys.rate-limit.Save')}
+  </Button>
backend/src/utils/apiKeyRateLimit.ts (1)

150-168: Type issue with the apiKeyId parameter of consumeTokens

consumeTokens takes apiKeyId: number, but call sites (such as responses.ts line 254) pass apiKeyRecord?.id. If apiKeyRecord is null, apiKeyRecord?.id is undefined, while the signature expects a number.

Although the call sites already check that apiKeyRecord exists, consider strengthening type safety:

♻️ Suggested fix
 export async function consumeTokens(
-  apiKeyId: number,
+  apiKeyId: number | undefined,
   tpmLimit: number,
   tokensUsed: number,
 ): Promise<void> {
+  if (apiKeyId === undefined) {
+    return;
+  }
   const { tpmTokensKey, tpmTimestampKey } = getKeys(apiKeyId);

Alternatively, keep the current signature but ensure every call site performs a proper null check.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f7cf5b9 and 7d27234.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (21)
  • backend/drizzle/0008_fresh_wiccan.sql
  • backend/drizzle/meta/0008_snapshot.json
  • backend/drizzle/meta/_journal.json
  • backend/src/api/admin/apiKey.ts
  • backend/src/api/v1/completions.ts
  • backend/src/api/v1/embeddings.ts
  • backend/src/api/v1/messages.ts
  • backend/src/api/v1/responses.ts
  • backend/src/db/schema.ts
  • backend/src/plugins/apiKeyPlugin.ts
  • backend/src/plugins/apiKeyRateLimitPlugin.ts
  • backend/src/utils/apiKey.ts
  • backend/src/utils/apiKeyRateLimit.ts
  • frontend/package.json
  • frontend/src/components/ui/progress.tsx
  • frontend/src/i18n/locales/en-US.json
  • frontend/src/i18n/locales/zh-CN.json
  • frontend/src/pages/api-keys/columns.tsx
  • frontend/src/pages/api-keys/rate-limit-cell.tsx
  • frontend/src/pages/api-keys/rate-limit-dialog.tsx
  • frontend/src/pages/api-keys/row-action-button.tsx
🧰 Additional context used
🧬 Code graph analysis (12)
backend/src/plugins/apiKeyRateLimitPlugin.ts (3)
backend/src/plugins/apiKeyPlugin.ts (1)
  • apiKeyPlugin (9-55)
backend/src/utils/apiKeyRateLimit.ts (2)
  • checkRpmLimit (95-117)
  • checkTpmLimit (124-144)
backend/src/utils/redisClient.ts (1)
  • set (71-85)
frontend/src/components/ui/progress.tsx (1)
frontend/src/lib/utils.ts (1)
  • cn (4-6)
frontend/src/pages/api-keys/columns.tsx (1)
frontend/src/pages/api-keys/rate-limit-cell.tsx (1)
  • RateLimitCell (22-57)
frontend/src/pages/api-keys/row-action-button.tsx (3)
frontend/src/pages/upstreams/row-action-button.tsx (1)
  • RowActionButton (31-107)
frontend/src/components/ui/dropdown-menu.tsx (1)
  • DropdownMenuItem (208-208)
frontend/src/pages/api-keys/rate-limit-dialog.tsx (1)
  • RateLimitDialog (39-205)
backend/src/api/admin/apiKey.ts (2)
backend/src/db/index.ts (2)
  • updateApiKey (69-78)
  • findApiKey (59-67)
backend/src/utils/apiKeyRateLimit.ts (1)
  • getRateLimitStatus (174-214)
frontend/src/pages/api-keys/rate-limit-cell.tsx (4)
frontend/src/lib/utils.ts (1)
  • formatNumber (28-30)
frontend/src/lib/api.ts (1)
  • api (21-21)
frontend/src/components/ui/skeleton.tsx (1)
  • Skeleton (7-7)
frontend/src/components/ui/progress.tsx (1)
  • Progress (23-23)
backend/src/plugins/apiKeyPlugin.ts (2)
backend/src/db/index.ts (1)
  • ApiKey (29-29)
backend/src/utils/apiKey.ts (1)
  • validateApiKey (9-24)
backend/src/utils/apiKeyRateLimit.ts (2)
backend/src/utils/redisClient.ts (1)
  • redisClient (111-111)
backend/src/plugins/apiKeyRateLimitPlugin.ts (1)
  • consumeTokens (6-6)
backend/src/api/v1/messages.ts (4)
backend/src/plugins/apiKeyPlugin.ts (1)
  • ApiKey (7-7)
backend/src/db/index.ts (1)
  • ApiKey (29-29)
backend/src/plugins/apiKeyRateLimitPlugin.ts (1)
  • apiKeyRateLimitPlugin (21-80)
backend/src/plugins/rateLimitPlugin.ts (1)
  • rateLimitPlugin (12-64)
backend/src/api/v1/completions.ts (3)
backend/src/plugins/apiKeyPlugin.ts (1)
  • ApiKey (7-7)
backend/src/db/index.ts (1)
  • ApiKey (29-29)
backend/src/plugins/apiKeyRateLimitPlugin.ts (1)
  • apiKeyRateLimitPlugin (21-80)
backend/src/api/v1/responses.ts (4)
backend/src/plugins/apiKeyPlugin.ts (1)
  • ApiKey (7-7)
backend/src/db/index.ts (1)
  • ApiKey (29-29)
backend/src/plugins/apiKeyRateLimitPlugin.ts (1)
  • apiKeyRateLimitPlugin (21-80)
backend/src/plugins/rateLimitPlugin.ts (1)
  • rateLimitPlugin (12-64)
backend/src/utils/apiKey.ts (2)
backend/src/plugins/apiKeyPlugin.ts (1)
  • ApiKey (7-7)
backend/src/db/index.ts (2)
  • ApiKey (29-29)
  • updateApiKey (69-78)
🔇 Additional comments (36)
frontend/package.json (1)

24-24: LGTM!

The new @radix-ui/react-progress dependency version is consistent with the project's other Radix UI components and supports the new Progress component.

frontend/src/components/ui/progress.tsx (1)

1-23: LGTM!

The Progress component is implemented correctly, following the standard shadcn/ui pattern:

  • Correctly uses forwardRef and passes the ref through
  • The translateX calculation is sound and handles the case where value is undefined
  • Styles are composed with the cn utility, allowing custom className overrides
frontend/src/pages/api-keys/rate-limit-cell.tsx (1)

22-56: LGTM!

The component is well implemented:

  • The useQuery configuration is sensible; a 10-second refresh interval suits live usage display
  • staleTime: 5000 avoids duplicate requests within short intervals
  • The limit === 0 edge case is handled correctly (line 45)
  • Visual feedback for the over-limit (burst usage) state is clear
frontend/src/pages/api-keys/columns.tsx (1)

37-50: LGTM!

The new RPM and TPM columns are implemented correctly:

  • Sharing the same apiKey as part of the queryKey lets the two RateLimitCell components share cached data and avoid duplicate requests
  • Follows the existing column definition pattern
  • i18n keys are used consistently for internationalization
backend/drizzle/meta/0008_snapshot.json (1)

73-86: LGTM!

The database schema change is correct:

  • The rpm_limit default of 50 and tpm_limit default of 50000 are reasonable initial limits
  • Both fields are NOT NULL, ensuring data integrity
  • As an auto-generated Drizzle ORM snapshot, it correctly reflects the migration
frontend/src/i18n/locales/en-US.json (1)

56-85: LGTM!

The localization text is complete and clear, and key names follow existing naming conventions. The descriptions include the valid input ranges (RPM: 1-10000, TPM: 1-10000000), matching the backend validation logic.

frontend/src/i18n/locales/zh-CN.json (1)

74-86: LGTM!

The rate limit dialog translations are complete and accurate, matching the structure of the English version.

backend/drizzle/meta/_journal.json (1)

60-67: LGTM!

The migration journal entry is well-formed, its index is sequential, and the tag matches the SQL migration filename.

backend/src/db/schema.ts (1)

27-29: LGTM!

The schema definition matches the SQL migration exactly, field names follow camelCase conventions, and comments clearly explain each field's purpose.

frontend/src/pages/api-keys/row-action-button.tsx (3)

35-36: State management is implemented correctly

The new rateLimitDialogOpen state pairs correctly with the RateLimitDialog component, following React's controlled-component pattern.


107-110: The menu item integrates well

The rate limit configuration menu item is sensibly placed after the view-requests action and before the destructive revoke action, matching UX best practices. GaugeIcon is also a fitting icon choice.


143-145: Dialog integration is correct

Placing RateLimitDialog outside the AlertDialog is correct and avoids potential nested-dialog issues. Props are passed completely and type-safely.

backend/src/api/v1/embeddings.ts (2)

66-66: The rate limiting plugin is integrated correctly

The middleware chain order is correct: apiKeyPlugin → apiKeyRateLimitPlugin → rateLimitPlugin. This ensures the API key is validated before rate limits are checked.


238-244: Route configuration is complete

apiKeyRateLimit: true correctly enables per-request rate limit checks and, together with checkApiKey: true, ensures a complete authentication and throttling flow.

backend/src/plugins/apiKeyRateLimitPlugin.ts (2)

11-19: The error response format follows the OpenAI convention

The error structure produced by createRateLimitError matches the OpenAI API's rate_limit_error format, which helps client compatibility.


73-78: Response headers for successful requests are complete

Setting X-RateLimit-* headers on successful requests is good practice, letting clients proactively monitor quota usage and avoid unnecessary 429 errors.
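A minimal sketch of what a 429 payload in this rate_limit_error shape might look like. The field names follow the OpenAI-style error convention the review refers to; the actual createRateLimitError in apiKeyRateLimitPlugin.ts may differ, and the message text here is purely illustrative.

```typescript
// Illustrative 429 body in an OpenAI-style rate_limit_error shape.
// Not the project's actual createRateLimitError implementation.
function createRateLimitError(kind: "requests" | "tokens", limit: number) {
  return {
    error: {
      message: `Rate limit exceeded: ${limit} ${kind} per minute. Please retry later.`,
      type: "rate_limit_error",
      code: "rate_limit_exceeded",
    },
  };
}
```

A gateway would serialize this body alongside a Retry-After header and the X-RateLimit-* headers mentioned above.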

backend/src/plugins/apiKeyPlugin.ts (3)

4-7: The type re-export aids downstream use

Re-exporting the ApiKey type lets modules consuming this plugin import it directly from @/plugins/apiKeyPlugin, avoiding extra import paths.


31-31: The state type is defined correctly

Typing apiKeyRecord as ApiKey | null correctly reflects the state before and after validation and enables TypeScript type-guard checks.


34-46: Switching the macro from beforeHandle to resolve is the right choice

Using resolve rather than beforeHandle allows async validation and stores the result for later middleware. The flow is clear: failed validation returns 401; on success the full API key record is placed in store.apiKeyRecord.

backend/src/api/admin/apiKey.ts (2)

42-43: Defaults match the database schema

rpmLimit defaults to 50 and tpmLimit to 50000, matching the database defaults mentioned in the PR objectives and keeping creation behavior consistent.


92-119: The usage query endpoint is complete

The GET /apiKey/:key/usage endpoint correctly:

  1. Verifies the API key exists
  2. Fetches live usage state from Redis
  3. Returns structured limit and usage data

This provides the data the frontend needs for real-time usage display.

backend/src/utils/apiKey.ts (2)

26-35: The deprecation marker is used correctly

The deprecation of checkApiKey is handled sensibly, preserving backward compatibility while steering callers toward the new validateApiKey function.


41-45: LGTM!

The generateApiKey logic is correct, using crypto.randomBytes(16) to produce a 32-character hex key in the sk-xxx format.

backend/src/api/v1/responses.ts (3)

418-423: Token consumption in the streaming path is correct

inputTokens > 0 && outputTokens > 0 is correctly checked before consuming tokens, avoiding consumption based on invalid data.


462-469: Plugin registration order is correct

The apiKeyPlugin → apiKeyRateLimitPlugin → rateLimitPlugin order ensures API key validation completes before the rate limit check, so store.apiKeyRecord is correctly passed to the rate limiting plugin.


589-596: Route configuration is complete

The checkApiKey: true and apiKeyRateLimit: true settings ensure both API key validation and rate limiting are enabled.

frontend/src/pages/api-keys/rate-limit-dialog.tsx (2)

63-71: The useEffect dependency array could be tuned

Including the form object in the dependency array follows ESLint guidance, but form.reset has a stable reference. If unwanted resets occur, consider depending on form.reset directly.

The current implementation is acceptable, but confirm there is no unexpected reset behavior in practice.


93-106: The percentage calculation is robust

getRpmPercentage and getTpmPercentage correctly handle division by zero and missing data, and Math.min(100, ...) caps percentages at 100%.

backend/src/api/v1/completions.ts (2)

342-347: Token consumption in the streaming path is correct

inputTokens > 0 && outputTokens > 0 is correctly checked before tokens are consumed.


387-395: Plugin integration is correct

Plugin registration order and configuration match responses.ts.

backend/src/utils/apiKeyRateLimit.ts (3)

101-117: Fail-open strategy in the RPM check

checkRpmLimit returns { allowed: true, remaining: rpmLimit } on Redis errors, a fail-open strategy. This is reasonable for availability, but confirm it matches the business requirements.

In scenarios with stricter security requirements, a fail-closed strategy may be preferable. Consider documenting this behavior explicitly.


190-206: The status calculation logic is clear

getRateLimitStatus correctly computes usage and remaining:

  • usage = capacity - current tokens
  • remaining is capped at the base limit (rather than the burst capacity) for easier UI display

The comment "remaining is capped at limit (not burst capacity) for UI simplicity" makes the logic clear.
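This used/remaining arithmetic can be expressed as a small pure function. The version below is illustrative only: the real getRateLimitStatus reads its token counts from Redis, and the function name and shape here are assumptions.

```typescript
// Illustrative version of the status math: usage is derived from the
// bucket's current token count against the 3x burst capacity, while
// remaining is capped at the base limit for simpler UI display.
const BURST_MULTIPLIER = 3;

function rateLimitStatus(limit: number, currentTokens: number) {
  const capacity = limit * BURST_MULTIPLIER;
  const used = Math.max(0, capacity - currentTokens);
  const remaining = Math.min(limit, Math.max(0, currentTokens));
  return { used, remaining };
}
```

For example, a full bucket (150 tokens at a limit of 50) reports 0 used and 50 remaining, never exposing the burst headroom to the UI.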


6-10: The constants are reasonable

  • WINDOW_SECONDS = 60: a 1-minute window
  • BURST_MULTIPLIER = 3: 3x burst capacity
  • KEY_EXPIRY = 120: twice the window length, ensuring keys do not expire mid-window
backend/src/api/v1/messages.ts (3)

15-16: LGTM!

Imports are correct and follow the same pattern as the other API endpoints (such as completions.ts and responses.ts).


402-406: LGTM!

The inputTokens > 0 && outputTokens > 0 guard is good, ensuring tokens are consumed only when valid counts are available. Per the PR description, consumeTokens is fail-open on Redis errors, so no extra try-catch wrapper is needed.


451-452: LGTM!

The rate limiting plugin is integrated correctly:

  • The plugin chain order is right (apiKeyPlugin → apiKeyRateLimitPlugin → rateLimitPlugin)
  • The API key record is passed to downstream handlers via store.apiKeyRecord
  • apiKeyRateLimit: true is used together with checkApiKey: true, ensuring apiKeyRecord is populated before the rate check

Consistent with the implementation pattern of other endpoints such as completions.ts and responses.ts.

Also applies to: 456-456, 558-558, 568-568, 576-576


Comment on lines +175 to +180
// Consume tokens for TPM rate limiting (post-flight)
if (apiKeyRecord) {
const totalTokens = completion.promptTokens + completion.completionTokens;
await consumeTokens(apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens);
}


⚠️ Potential issue | 🟡 Minor

The non-streaming path lacks a token validity check (same issue as in responses.ts)

As in responses.ts, the non-streaming path does not verify that token values are valid before consuming them; completion.promptTokens and completion.completionTokens are initialized to -1.

Consider adding the same check used in the streaming path (line 343):

🐛 Suggested fix
   // Consume tokens for TPM rate limiting (post-flight)
-  if (apiKeyRecord) {
+  if (apiKeyRecord && completion.promptTokens > 0 && completion.completionTokens > 0) {
     const totalTokens = completion.promptTokens + completion.completionTokens;
     await consumeTokens(apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens);
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Consume tokens for TPM rate limiting (post-flight)
if (apiKeyRecord) {
const totalTokens = completion.promptTokens + completion.completionTokens;
await consumeTokens(apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens);
}
// Consume tokens for TPM rate limiting (post-flight)
if (apiKeyRecord && completion.promptTokens > 0 && completion.completionTokens > 0) {
const totalTokens = completion.promptTokens + completion.completionTokens;
await consumeTokens(apiKeyRecord.id, apiKeyRecord.tpmLimit, totalTokens);
}
🤖 Prompt for AI Agents
In `@backend/src/api/v1/completions.ts` around lines 175 - 180, The non-streaming
completion path consumes tokens without validating token
counts—completion.promptTokens and completion.completionTokens can be -1—so
update the block that calls consumeTokens(apiKeyRecord.id,
apiKeyRecord.tpmLimit, totalTokens) to first verify both promptTokens and
completionTokens are >= 0 (or compute totalTokens only when each is valid),
mirroring the check used in the streaming path (the same validation around
completion.promptTokens/completion.completionTokens used before consumeTokens);
only call consumeTokens when totalTokens is a non-negative number and
apiKeyRecord exists.

Comment thread backend/src/api/v1/messages.ts
Comment thread backend/src/api/v1/responses.ts
Comment thread backend/src/utils/apiKeyRateLimit.ts Outdated
Comment thread backend/src/plugins/apiKeyPlugin.ts Outdated
}

// Store API key record for rate limiting and other uses
store.apiKeyRecord = apiKeyRecord;
Contributor


? What is this trying to do? Storing the ApiKey globally obviously has a concurrency problem.

https://elysiajs.com/patterns/extends-context.html#state


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a robust per-API-key rate limiting feature, including both RPM (Requests Per Minute) and TPM (Tokens Per Minute). The implementation spans the backend with database schema changes, a Redis-based token bucket algorithm, and new admin API endpoints, as well as the frontend with a new UI for configuration and monitoring. The changes are well-structured and the feature is comprehensive. I've identified a few areas for improvement in the frontend form handling to enhance the user experience. Overall, this is a great addition to the platform.

Comment on lines +26 to +29
const rateLimitSchema = z.object({
rpmLimit: z.number().min(1).max(10000),
tpmLimit: z.number().min(1).max(10000000),
})

medium

To improve the user experience and simplify the form logic, I suggest using z.coerce.number() for the rate limit fields. This avoids the issue where clearing an input field causes it to default to 1, which can be frustrating for users. This change should be made in conjunction with removing the custom onChange handlers on the Input components.

Suggested change
const rateLimitSchema = z.object({
rpmLimit: z.number().min(1).max(10000),
tpmLimit: z.number().min(1).max(10000000),
})
const rateLimitSchema = z.object({
rpmLimit: z.coerce.number().min(1).max(10000),
tpmLimit: z.coerce.number().min(1).max(10000000),
})

Comment on lines +159 to +165
<Input
type="number"
min={1}
max={10000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
/>

medium

This custom onChange handler can cause a poor user experience. When a user clears the input, it immediately defaults to 1, preventing them from easily typing a new number. After updating the Zod schema to use z.coerce.number(), this custom handler is no longer necessary and should be removed to improve usability.

Suggested change
<Input
type="number"
min={1}
max={10000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
/>
<Input
type="number"
min={1}
max={10000}
{...field}
/>

Comment on lines +179 to +185
<Input
type="number"
min={1}
max={10000000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
/>

medium

This custom onChange handler can cause a poor user experience. When a user clears the input, it immediately defaults to 1, preventing them from easily typing a new number. After updating the Zod schema to use z.coerce.number(), this custom handler is no longer necessary and should be removed to improve usability.

Suggested change
<Input
type="number"
min={1}
max={10000000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
/>
<Input
type="number"
min={1}
max={10000000}
{...field}
/>

pescn and others added 2 commits January 15, 2026 21:04
…imiting

- Fix global store concurrency bug: Replace Elysia `.state()` with `.derive()` + `resolve` macro
  to ensure apiKeyRecord is request-scoped, preventing rate limits from being applied to wrong keys

- Fix Redis race conditions: Implement atomic Lua scripts for token bucket operations
  to prevent concurrent requests from causing incorrect token counts

- Add token validity checks in non-streaming paths to prevent consuming invalid token counts

- Add `eval` method to RedisClient for Lua script execution

- Add zero-value limit protection to prevent division by zero errors

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add explicit null case handling in StopReason switches
- Add missing tool_result case in content block converters
- Add missing image and tool_result cases in Anthropic adapter
- Add content_block_stop case in OpenAI chat stream handler
- Refactor type assertion in openai-chat.ts to avoid lint warning

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
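The refill-and-consume flow described in the first commit above can be modeled in-process — a hypothetical TypeScript translation of the semantics (the PR's real version is an atomic Lua script evaluated in Redis; identifiers and the pre-flight-only shape here are illustrative, not the project's code):

```typescript
// In-memory model of the token bucket (the real implementation runs as an
// atomic Lua script in Redis; names here are illustrative).

interface Bucket {
  tokens: number       // currently available tokens
  lastRefillMs: number // timestamp of the last refill
}

const BURST_FACTOR = 3 // capacity is 3x the per-minute limit

// Refill proportionally to elapsed time (capped at burst capacity), then try
// to consume. Returns whether the request is allowed.
function tryConsume(bucket: Bucket, limitPerMinute: number, consume: number, nowMs: number): boolean {
  if (limitPerMinute <= 0) return false // zero-value protection: deny instead of dividing by zero
  const capacity = limitPerMinute * BURST_FACTOR
  const elapsedMs = Math.max(0, nowMs - bucket.lastRefillMs)
  bucket.tokens = Math.min(capacity, bucket.tokens + (elapsedMs / 60_000) * limitPerMinute)
  bucket.lastRefillMs = nowMs
  if (bucket.tokens < consume) return false
  bucket.tokens -= consume
  return true
}

// Assume a fresh key starts with a full bucket: RPM 50 allows a burst of 150.
const bucket: Bucket = { tokens: 50 * BURST_FACTOR, lastRefillMs: 0 }
let allowed = 0
for (let i = 0; i < 200; i++) {
  if (tryConsume(bucket, 50, 1, 0)) allowed++
}
console.log(allowed) // 150 — the 3x burst, since no time elapses between requests
```

Post-flight TPM accounting differs in that the real token count is only known after the response (especially for streaming), which appears to be why the consume script floors the balance at zero rather than rejecting after the fact.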
@pescn
Contributor Author

pescn commented Jan 15, 2026

@gemini-code-assist review

/gemini review


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a robust per-API-key rate limiting feature, including both RPM (Requests Per Minute) and TPM (Tokens Per Minute) limits. The implementation is well-designed, utilizing a token bucket algorithm with Redis and Lua scripts for atomic operations, which is excellent for preventing race conditions. The changes span the backend database schema, API endpoints, and new plugins, as well as a comprehensive frontend UI for configuration and monitoring. My review includes a couple of suggestions to improve the user experience on the frontend and to clarify a comment in the backend logic for better maintainability. Overall, this is a solid and well-executed feature.

Comment on lines +238 to +239
-- Consume tokens (allow going negative for tracking)
tokens = math.max(0, tokens - consume)

medium

The comment on line 238 states that the script allows the token count to go negative for tracking purposes. However, the implementation on line 239 with math.max(0, tokens - consume) explicitly prevents the token count from becoming negative. This discrepancy can be misleading for future developers. The comment should be updated to accurately describe the code's behavior.

Suggested change
-- Consume tokens (allow going negative for tracking)
tokens = math.max(0, tokens - consume)
-- Consume tokens (flooring at 0)
tokens = math.max(0, tokens - consume)

min={1}
max={10000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}

medium

The || 1 part of this onChange handler causes a poor user experience. When a user clears the input to type a new number, it immediately defaults to 1. Removing this will allow the input to be temporarily empty, and validation will correctly catch it if the user submits an invalid value.

Suggested change
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
onChange={(e) => field.onChange(parseInt(e.target.value))}
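The `|| 1` behavior called out here is easy to reproduce in isolation — clearing the field yields an empty string, `parseInt('')` is `NaN`, and `NaN` is falsy, so the fallback fires on every keystroke that empties the input (a standalone sketch, not the component's code):

```typescript
// What happens as the user clears the field:
const cleared = parseInt('') || 1
console.log(cleared) // 1 — the value snaps back before the user can type a new number

// Without the fallback, the in-flight value is NaN and validation rejects it
// on submit instead of rewriting it mid-keystroke.
const inFlight = parseInt('')
console.log(Number.isNaN(inFlight)) // true
```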

min={1}
max={10000000}
{...field}
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}

medium

Similar to the RPM limit input, the || 1 part of this onChange handler causes a poor user experience. When a user clears the input, it defaults to 1. Removing this fallback will improve the editing experience and allow validation to handle empty/invalid states correctly on submission.

Suggested change
onChange={(e) => field.onChange(parseInt(e.target.value) || 1)}
onChange={(e) => field.onChange(parseInt(e.target.value))}

@pescn
Contributor Author

pescn commented Jan 15, 2026

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jan 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

