feat(dashboard): add optional Grafana dashboard embedding on overview page#65
Conversation
… page Add an admin settings API and allow users to configure a Grafana dashboard URL. When configured, a toggle appears on the overview page to switch between built-in Recharts charts and an embedded Grafana iframe. Includes a pre-built Grafana dashboard JSON for easy import. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary of ChangesHello @pescn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the dashboard's flexibility by allowing administrators to integrate external Grafana dashboards directly into the overview page. It provides a new backend API for storing configuration values, a frontend mechanism to fetch and display the Grafana dashboard via an iframe, and a user interface toggle to switch between the native charts and the embedded Grafana view. This enables users to leverage their existing Grafana setups for monitoring alongside the application's built-in analytics. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces an excellent new feature for embedding a Grafana dashboard on the overview page. The implementation is solid, covering both backend API additions for settings and frontend components for rendering and toggling the view. I've identified a few areas for improvement, mainly around robustness and security, such as adding more specific validation for API inputs, improving error handling in the data fetching hook, and making the Grafana embedding component more resilient to invalid URLs. Addressing these points will make the feature more stable and secure.
📝 WalkthroughWalkthrough新增 /admin/settings 与 /admin/dashboards 路由组并挂载;后端增加 Grafana 仪表板 schema、存取、迁移与 env 覆盖逻辑;Prometheus 指标加入成本相关 counters;前端新增 Grafana Hook、嵌入组件与视图切换,更新路由查询 schema、本地化词条及监控部署配置。 Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(245,245,255,0.5)
participant 用户 as User
participant 前端 as Frontend
participant 后端API as API
participant 存储 as DB
participant Grafana as Grafana
end
用户->>前端: 访问概览页面
前端->>后端API: GET /admin/dashboards
后端API->>存储: 读取 DASHBOARDS_KEY / legacy key 或 env
存储-->>后端API: 返回 dashboards 或 空
alt env override active
后端API-->>前端: 返回 { dashboards: from env, envOverride: true }
else
后端API-->>前端: 返回 { dashboards, envOverride: false }
end
前端->>前端: 渲染 ViewModeToggle (dashboards)
用户->>前端: 切换 view -> grafana
前端->>前端: 渲染 GrafanaEmbed(url)
GrafanaEmbed->>GrafanaEmbed: 确保 URL 包含 ?kiosk=
GrafanaEmbed->>Grafana: iframe 请求 URL
Grafana-->>GrafanaEmbed: 返回嵌入内容
GrafanaEmbed-->>用户: 显示仪表盘
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
frontend/src/routes/_dashboard/route.tsx (1)
27-64: Grafana 配置被移除时可能“卡死”在 grafana 视图,导致桌面端无法切回或调整时间范围。当 URL 仍带
view=grafana但grafanaUrl为空时,Toggle 隐藏且 TimeRangeSelect 被隐藏,用户无法恢复。建议引入effectiveViewMode,在未配置 Grafana 时强制回退到builtin。🩹 建议改动
const grafanaUrl = useGrafanaDashboardUrl() const timeRange = range as TimeRange - const viewMode = view as ViewMode + const viewMode = view as ViewMode + const effectiveViewMode: ViewMode = grafanaUrl ? viewMode : 'builtin' @@ - {grafanaUrl && <ViewModeToggle value={viewMode} onChange={handleViewModeChange} />} - {viewMode !== 'grafana' && <TimeRangeSelect value={timeRange} onChange={handleTimeRangeChange} />} + {grafanaUrl && <ViewModeToggle value={effectiveViewMode} onChange={handleViewModeChange} />} + {effectiveViewMode !== 'grafana' && ( + <TimeRangeSelect value={timeRange} onChange={handleTimeRangeChange} /> + )}
🤖 Fix all issues with AI agents
In `@frontend/src/hooks/use-settings.ts`:
- Around line 7-20: The hook currently casts the settings value to string
without validation, which can let non-string values through and break Grafana
embedding; update grafanaDashboardUrlQueryOptions (and/or the
useGrafanaDashboardUrl return logic) to validate the runtime type: after
fetching data from api.admin.settings[GRAFANA_DASHBOARD_URL_KEY].get(), check
typeof value === 'string' and that value.trim() !== '' before returning it,
otherwise return undefined (or null normalized to undefined) so
useGrafanaDashboardUrl only ever returns a non-empty string or undefined.
In `@frontend/src/pages/overview/grafana-embed.tsx`:
- Around line 9-12: The code constructs embedUrl with new URL(url) which will
throw for invalid/undefined input; wrap the creation in a try/catch (around the
new URL(url) call in grafana-embed.tsx) or validate the input string first, and
if it fails return a safe fallback (e.g., null/placeholder component) or log the
error and skip setting kiosk; ensure any subsequent use of embedUrl is guarded
so an invalid URL does not cause a runtime crash.
In `@grafana/nexusgate-overview.json`:
- Around line 152-158: The Success Rate PromQL expression uses division by
sum(nexusgate_completions_total) and can divide by zero when there are no
completions; update the query that uses the expr value to guard against zero by
applying a safe denominator (e.g., clamp_min on sum(nexusgate_completions_total)
or using an or vector(0) fallback) so the metric
(sum(nexusgate_completions_total{status="completed"}) /
sum(nexusgate_completions_total)) cannot produce NaN or no-data; change the expr
for the target named "A" to use one of these patterns to prevent divide-by-zero.
🧹 Nitpick comments (2)
frontend/src/pages/overview/grafana-embed.tsx (1)
20-26: iframe 的 title 属性建议使用 i18n。为保持一致性和可访问性,建议将 iframe 的
title属性也使用国际化翻译,与CardTitle保持一致。♻️ 建议的修改
<iframe src={embedUrl.toString()} className="h-[800px] w-full rounded border-0" sandbox="allow-scripts allow-same-origin allow-popups allow-forms" loading="lazy" - title="Grafana Dashboard" + title={t('pages.overview.grafana.title')} />frontend/src/pages/overview/view-mode-toggle.tsx (1)
18-33: 建议添加无障碍属性以提升可访问性。切换按钮缺少
aria-pressed属性,屏幕阅读器用户无法感知当前选中状态。♿ 建议的修改
<Button variant="ghost" size="xs" className={cn(value === 'builtin' && 'bg-accent')} onClick={() => onChange('builtin')} + aria-pressed={value === 'builtin'} > {t('pages.overview.viewMode.builtin')} </Button> <Button variant="ghost" size="xs" className={cn(value === 'grafana' && 'bg-accent')} onClick={() => onChange('grafana')} + aria-pressed={value === 'grafana'} > {t('pages.overview.viewMode.grafana')} </Button>
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
backend/src/api/admin/index.tsbackend/src/api/admin/settings.tsfrontend/src/hooks/use-settings.tsfrontend/src/i18n/locales/en-US.jsonfrontend/src/i18n/locales/zh-CN.jsonfrontend/src/pages/overview/grafana-embed.tsxfrontend/src/pages/overview/index.tsxfrontend/src/pages/overview/view-mode-toggle.tsxfrontend/src/routeTree.gen.tsfrontend/src/routes/_dashboard/index.tsxfrontend/src/routes/_dashboard/route.tsxgrafana/nexusgate-overview.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-24T18:23:42.635Z
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 59
File: backend/src/api/v1/responses.ts:566-577
Timestamp: 2026-01-24T18:23:42.635Z
Learning: When using Elysia with the apiKeyPlugin, if a route option has checkApiKey: true, the apiKeyRecord parameter is guaranteed to be non-null. Do not add explicit null checks or non-null assertions for apiKeyRecord in such routes; rely on this contract and avoid unnecessary guards to improve readability. If there is any doubt about the guarantee in a specific place, add a runtime assertion at the boundary where the plugin is configured, but otherwise omit null checks for apiKeyRecord in backend code under backend/.
Applied to files:
backend/src/api/admin/settings.tsbackend/src/api/admin/index.ts
🧬 Code graph analysis (5)
frontend/src/hooks/use-settings.ts (1)
frontend/src/lib/api.ts (1)
api(21-21)
frontend/src/pages/overview/view-mode-toggle.tsx (2)
frontend/src/components/ui/button.tsx (1)
Button(50-50)frontend/src/lib/utils.ts (1)
cn(4-6)
backend/src/api/admin/settings.ts (1)
backend/src/db/index.ts (2)
getAllSettings(583-590)upsertSetting(613-625)
backend/src/api/admin/index.ts (1)
backend/src/api/admin/settings.ts (1)
adminSettings(4-58)
frontend/src/routes/_dashboard/route.tsx (5)
frontend/src/hooks/use-settings.ts (1)
useGrafanaDashboardUrl(18-21)frontend/src/pages/overview/use-overview-stats.ts (1)
TimeRange(6-6)frontend/src/pages/overview/view-mode-toggle.tsx (2)
ViewMode(6-6)ViewModeToggle(13-36)frontend/src/components/app/app-header.tsx (1)
AppHeaderPart(23-29)frontend/src/pages/overview/time-range-select.tsx (1)
TimeRangeSelect(22-39)
🔇 Additional comments (9)
frontend/src/i18n/locales/en-US.json (1)
397-400: LGTM!新增的国际化键命名规范,与现有的
pages.overview.*命名模式保持一致,翻译内容清晰准确。frontend/src/i18n/locales/zh-CN.json (1)
398-401: LGTM!中文翻译准确,"内置"、"Grafana" 和 "Grafana 仪表盘" 的用词恰当,与文件中现有的术语风格一致。
grafana/nexusgate-overview.json (1)
1-367: Dashboard 整体结构良好。Grafana 仪表盘配置完善,包含关键指标(请求数、延迟百分位、Token 使用量、成功率等),面板布局合理,为 NexusGate 提供了良好的可观测性支持。
frontend/src/pages/overview/view-mode-toggle.tsx (1)
6-11: LGTM!
ViewMode类型定义清晰,组件接口设计简洁,导出类型方便其他组件复用。backend/src/api/admin/index.ts (1)
10-10: LGTM!
adminSettings模块正确集成到管理路由中,受checkAdminApiKey守卫保护,与现有的管理模块挂载模式一致。Also applies to: 33-33
frontend/src/routes/_dashboard/index.tsx (1)
8-10: 视图参数纳入搜索校验,设计合理。frontend/src/pages/overview/index.tsx (1)
62-111: Grafana/内置视图切换渲染逻辑清晰,移动端控制也处理到位。frontend/src/routeTree.gen.ts (1)
7-312: 自动生成文件,无需人工评审。backend/src/api/admin/settings.ts (1)
4-57: 不存在认证绕过风险。adminSettings已通过.guard({ checkAdminApiKey: true })保护,所有设置相关的 GET、PUT、DELETE 操作均需要有效的管理员 API 密钥。
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Add support for multiple Grafana dashboards with environment variable override. When GRAFANA_DASHBOARDS env var is set, it takes precedence over database settings. Backend changes: - Add GRAFANA_DASHBOARDS env var schema in config.ts - Create dashboard utility (backend/src/utils/dashboards.ts) with env override detection and legacy migration - Add /api/admin/dashboards endpoints (GET/PUT/DELETE) - Return 409 Conflict when trying to modify while env override active Frontend changes: - Update use-settings hook with useGrafanaDashboards() - Convert ViewModeToggle to multi-option selector - Support dynamic dashboard list from API - Change URL param from enum to string for dashboard IDs Also updates quick-start.sh to auto-configure GRAFANA_DASHBOARDS when monitoring is enabled. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@frontend/src/routes/_dashboard/route.tsx`:
- Around line 62-66: The mobile users can get stuck in Grafana view because
ViewModeToggle and TimeRangeSelect are hidden on mobile; update the route logic
in route.tsx to detect mobile (use existing viewMode and handleViewModeChange)
and either (A) render a minimal ViewModeToggle on mobile so users can switch
back to 'builtin', or (B) enforce built-in on mobile by checking the query param
(?view=...) during route load and redirecting/resetting viewMode to 'builtin'
when on a mobile device; locate the viewMode, handleViewModeChange,
ViewModeToggle and TimeRangeSelect usage in route.tsx and implement one of these
fixes in that router/component initialization path.
In `@scripts/quick-start.sh`:
- Around line 408-410: The GRAFANA_DASHBOARD_EMBED_URL is hardcoding "localhost"
which breaks remote access; update the script to build the URL from a
configurable host (e.g., use an environment variable like GRAFANA_HOST or
GRAFANA_DASHBOARD_HOST with a fallback to "localhost" when unset) instead of
always using "localhost", and add a short note in the startup/config summary
instructing users to set that env var (or manually edit GRAFANA_DASHBOARDS in
.env) to their public IP or domain for remote iframe embedding to work;
reference the GRAFANA_DASHBOARD_EMBED_URL variable and the .env
GRAFANA_DASHBOARDS setting when making these changes.
🧹 Nitpick comments (5)
frontend/src/pages/overview/view-mode-toggle.tsx (1)
16-37: 建议添加无障碍属性。切换组件缺少无障碍属性,屏幕阅读器用户可能难以理解其用途。
♻️ 建议的改进
return ( - <div className="inline-flex items-center rounded-md border p-0.5"> + <div className="inline-flex items-center rounded-md border p-0.5" role="group" aria-label={t('pages.overview.viewMode.label')}> <Button variant="ghost" size="xs" className={cn(value === 'builtin' && 'bg-accent')} onClick={() => onChange('builtin')} + aria-pressed={value === 'builtin'} >backend/src/api/admin/dashboards.ts (2)
10-16: 存在重复的 schema 定义。
dashboardSchema(TypeBox) 与grafanaDashboardsSchema(Zod) 定义了相同的结构。虽然 TypeBox 用于 Elysia 请求验证、Zod 用于额外校验是可行的,但维护两套 schema 可能导致不一致。可考虑统一使用 Zod schema 并通过
@elysiajs/typebox转换,或仅依赖 Elysia 内置验证。当前实现可接受,但需注意两者同步更新。
52-60: 基于字符串匹配的错误检测较脆弱。当前通过
error.message.includes("GRAFANA_DASHBOARDS")识别环境变量覆盖错误。如果dashboards.ts中的错误消息变更,此检测会静默失效。建议使用自定义错误类或在
setGrafanaDashboards/clearGrafanaDashboards中抛出带有code属性的错误对象,以实现更可靠的错误识别。♻️ 推荐的改进方案
在
dashboards.ts中定义自定义错误类:export class EnvOverrideError extends Error { readonly code = 'ENV_OVERRIDE_ACTIVE'; constructor() { super('Cannot modify dashboards when GRAFANA_DASHBOARDS environment variable is set'); } }然后在此处使用:
- if ( - error instanceof Error && - error.message.includes("GRAFANA_DASHBOARDS") - ) { + if (error instanceof EnvOverrideError) {frontend/src/hooks/use-settings.ts (1)
18-24: 建议对 API 响应数据进行运行时类型校验。当前直接将
data断言为DashboardsResponse(第 23 行),若后端返回格式异常(如字段缺失或类型不匹配),可能导致运行时错误或undefined访问。建议添加基本的结构验证:
♻️ 推荐的改进方案
const dashboardsQueryOptions = queryOptions({ queryKey: ['dashboards'], queryFn: async (): Promise<DashboardsResponse> => { const { data, error } = await api.admin.dashboards.get() if (error) { return { dashboards: [], envOverride: false } } - return data as DashboardsResponse + // Basic runtime validation + const response = data as DashboardsResponse + if (!Array.isArray(response?.dashboards)) { + return { dashboards: [], envOverride: false } + } + return response }, staleTime: 5 * 60 * 1000, // 5 minutes retry: false, })backend/src/utils/dashboards.ts (1)
38-53: 迁移逻辑存在轻微的竞态条件,但影响有限。并发请求可能同时触发旧格式迁移:两个请求都读取到旧值后分别执行
upsertSetting和deleteSetting。由于upsertSetting是幂等的,结果数据一致,不会造成数据损坏。建议添加迁移日志以便观察:
♻️ 可选的改进方案
if (legacySetting?.value && typeof legacySetting.value === "string") { + console.log(`[dashboards] Migrating legacy grafana_dashboard_url to new format`); const migrated: GrafanaDashboard[] = [ { id: "grafana", label: "Grafana", url: legacySetting.value, }, ]; // Migrate to new format await upsertSetting({ key: DASHBOARDS_KEY, value: migrated }); // Remove legacy key await deleteSetting(LEGACY_KEY); return migrated; }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
backend/src/api/admin/dashboards.tsbackend/src/api/admin/index.tsbackend/src/utils/config.tsbackend/src/utils/dashboards.tsfrontend/src/hooks/use-settings.tsfrontend/src/pages/overview/index.tsxfrontend/src/pages/overview/view-mode-toggle.tsxfrontend/src/routes/_dashboard/index.tsxfrontend/src/routes/_dashboard/route.tsxscripts/quick-start.sh
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-24T18:23:42.635Z
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 59
File: backend/src/api/v1/responses.ts:566-577
Timestamp: 2026-01-24T18:23:42.635Z
Learning: When using Elysia with the apiKeyPlugin, if a route option has checkApiKey: true, the apiKeyRecord parameter is guaranteed to be non-null. Do not add explicit null checks or non-null assertions for apiKeyRecord in such routes; rely on this contract and avoid unnecessary guards to improve readability. If there is any doubt about the guarantee in a specific place, add a runtime assertion at the boundary where the plugin is configured, but otherwise omit null checks for apiKeyRecord in backend code under backend/.
Applied to files:
backend/src/api/admin/index.tsbackend/src/utils/config.tsbackend/src/api/admin/dashboards.tsbackend/src/utils/dashboards.ts
📚 Learning: 2026-01-24T22:33:16.471Z
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 59
File: backend/src/utils/streaming-context.ts:132-146
Timestamp: 2026-01-24T22:33:16.471Z
Learning: In NexusGate, for ReqId deduplication (specifically in streaming-context.ts and reqIdHandler.ts), when reqIdContext is present, finalizeReqId should use updateCompletion instead of addCompletions. This causes ReqId requests to not write srv_logs entries; errors are captured in the completion record's status (e.g., failed/aborted). This avoids duplicate srv_logs entries during retry scenarios and keeps the audit trail cleaner. Treat the completion record as the source of truth for ReqId request outcomes.
Applied to files:
backend/src/utils/config.tsbackend/src/utils/dashboards.ts
🧬 Code graph analysis (6)
backend/src/api/admin/index.ts (2)
backend/src/api/admin/settings.ts (1)
adminSettings(4-58)backend/src/api/admin/dashboards.ts (1)
adminDashboards(18-100)
backend/src/utils/config.ts (1)
frontend/src/hooks/use-settings.ts (1)
GrafanaDashboard(5-9)
backend/src/api/admin/dashboards.ts (2)
backend/src/utils/dashboards.ts (4)
getGrafanaDashboards(23-56)isEnvOverrideActive(15-17)setGrafanaDashboards(62-72)clearGrafanaDashboards(78-86)backend/src/utils/config.ts (1)
grafanaDashboardsSchema(161-161)
frontend/src/hooks/use-settings.ts (2)
backend/src/utils/config.ts (1)
GrafanaDashboard(162-162)frontend/src/lib/api.ts (1)
api(21-21)
backend/src/utils/dashboards.ts (2)
backend/src/utils/config.ts (3)
GRAFANA_DASHBOARDS(164-167)GrafanaDashboard(162-162)grafanaDashboardsSchema(161-161)backend/src/db/index.ts (1)
upsertSetting(613-625)
frontend/src/pages/overview/index.tsx (3)
frontend/src/hooks/use-settings.ts (1)
useGrafanaDashboards(29-31)frontend/src/pages/overview/use-overview-stats.ts (1)
TimeRange(6-6)frontend/src/pages/overview/grafana-embed.tsx (1)
GrafanaEmbed(5-30)
🔇 Additional comments (14)
scripts/quick-start.sh (1)
426-428: LGTM!JSON 配置格式正确,与后端
grafanaDashboardsSchema定义的结构一致。kiosk参数是 Grafana 嵌入的最佳实践,可隐藏导航栏。frontend/src/routes/_dashboard/route.tsx (1)
29-33: LGTM!状态管理逻辑正确。使用
replace: true避免了历史记录污染,空数组默认值确保了dashboards在加载期间不会导致渲染错误。frontend/src/routes/_dashboard/index.tsx (1)
8-11: LGTM!Schema 扩展正确,使用
.catch('builtin')确保向后兼容性。对于动态的仪表盘 ID,使用z.string()而非枚举是合理的设计。backend/src/utils/config.ts (1)
154-167: LGTM!Schema 定义清晰,
z.string().url()提供了 URL 格式验证。类型导出与前端GrafanaDashboard接口保持一致。使用zObject包装optional()允许环境变量未设置时返回undefined。backend/src/api/admin/index.ts (2)
6-14: LGTM!新模块正确导入并按字母顺序排列。
34-35: LGTM!
adminSettings和adminDashboards正确注册在 admin guard 内部,所有端点都受到checkAdminApiKey保护。路由路径/settings和/dashboards不会冲突。backend/src/api/admin/dashboards.ts (1)
20-35: LGTM!GET 端点实现清晰,正确返回仪表盘列表和环境变量覆盖状态。
frontend/src/hooks/use-settings.ts (1)
33-37: LGTM!向后兼容的辅助函数实现简洁,使用可选链安全地处理各种边界情况。
frontend/src/pages/overview/index.tsx (3)
62-69: LGTM!仪表盘选择逻辑实现正确:
- 从路由参数获取
view,通过id匹配查找对应仪表盘showGrafana布尔值安全地基于selectedDashboard存在性派生- 数据流清晰,状态管理合理
71-77: LGTM!
handleTimeRangeChange正确使用更新函数合并搜索状态,确保在更改时间范围时保留view参数。
111-113: TypeScript 类型收窄可能不完整。
showGrafana基于!!selectedDashboard派生,但 TypeScript 无法通过中间布尔变量进行类型收窄。如果 TS 编译器未报错,当前代码可正常工作;否则可考虑内联条件检查。请确认项目构建时此处无类型错误。如有问题,可改为:
- {showGrafana ? ( - <GrafanaEmbed url={selectedDashboard.url} /> + {selectedDashboard ? ( + <GrafanaEmbed url={selectedDashboard.url} />backend/src/utils/dashboards.ts (3)
15-17: LGTM!
isEnvOverrideActive函数逻辑正确:区分undefined(未设置)和空数组(显式设置为空)的情况。
62-72: LGTM!
setGrafanaDashboards实现正确:先检查环境变量覆盖状态,再执行数据库写入。错误处理清晰。
78-86: LGTM!
clearGrafanaDashboards实现与setGrafanaDashboards保持一致的模式,逻辑正确。
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Address code review feedback from CodeRabbit and Gemini: - Add error handling for invalid URLs in GrafanaEmbed component with graceful fallback UI when URL parsing fails - Add accessibility attributes (aria-pressed, role, aria-label) to ViewModeToggle for screen reader support - Add runtime validation in useGrafanaDashboards hook to handle unexpected API response formats - Fix divide-by-zero in Grafana dashboard PromQL queries for Success Rate metric using clamp_min() - Add missing i18n translations for new UI strings - Add comment in quick-start.sh noting localhost limitation for remote access Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add Grafana environment variables to enable iframe embedding: - GF_AUTH_ANONYMOUS_ENABLED=true - Allow unauthenticated access - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer - Read-only access for anonymous users - GF_SECURITY_ALLOW_EMBEDDING=true - Allow Grafana to be embedded in iframes Without these settings, the embedded Grafana dashboard would show a login page instead of the dashboard content. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a great new feature for embedding Grafana dashboards on the overview page. The implementation is comprehensive, covering backend APIs, frontend components, configuration management with backward compatibility, and even a pre-built Grafana dashboard. The code is generally well-written and includes good practices like iframe sandboxing and fixing potential division-by-zero errors in dashboard queries. I've added a couple of suggestions to improve code clarity and robustness in the new backend API.
TanStack Router auto-generated file updated after route changes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@docker-compose.monitoring.yaml`:
- Around line 30-34: The docker-compose config currently hardcodes
GF_AUTH_ANONYMOUS_ENABLED=true, GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer and
GF_SECURITY_ALLOW_EMBEDDING=true which opens security risks; change these
entries so they are configurable and default to false (or remove the hardcoded
true values) and only enable them when explicit env variables are provided
(e.g., via an external .env or compose variable substitution), ensuring
GF_AUTH_ANONYMOUS_ENABLED and GF_SECURITY_ALLOW_EMBEDDING default to "false" and
GF_AUTH_ANONYMOUS_ORG_ROLE is omitted or set only when anonymous access is
enabled; update any relevant docs or comments near these vars so future
maintainers know to explicitly opt-in before turning them on.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docker-compose.monitoring.yaml
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 65
File: frontend/src/routes/_dashboard/route.tsx:62-66
Timestamp: 2026-01-28T12:49:25.269Z
Learning: In the NexusGate frontend (frontend/src/routes/_dashboard/route.tsx), the ViewModeToggle component is intentionally hidden on mobile devices because Grafana dashboards don't render well on small screens. The primary use case for Grafana embedding is desktop monitoring dashboards. Mobile users can still access Grafana view via URL parameter if needed, but it's not encouraged.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
…dation - Add custom EnvOverrideError class for type-safe error handling - Remove redundant Zod validation (Elysia TypeBox already validates) - Use instanceof check instead of string matching for errors Addresses PR review feedback about fragile string-based error detection. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comment explaining that anonymous auth and embedding settings are for local development convenience. Recommend using Grafana's Public Dashboard feature for production deployments. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/src/utils/dashboards.ts`:
- Around line 28-39: isEnvOverrideActive currently returns false when
GRAFANA_DASHBOARDS is an explicitly empty array, causing the code to fall back
to DB; change the logic so that any defined GRAFANA_DASHBOARDS value counts as
an override (check !== undefined or typeof !== 'undefined'), and in
getGrafanaDashboards when that override is active return the environment value
with a safe fallback to an empty array (use GRAFANA_DASHBOARDS ?? []) to honor
an explicit []; update the functions isEnvOverrideActive and
getGrafanaDashboards to use these checks and return behavior, referencing the
GRAFANA_DASHBOARDS symbol.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
backend/src/api/admin/dashboards.tsbackend/src/utils/dashboards.tsdocker-compose.monitoring.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/src/api/admin/dashboards.ts
- docker-compose.monitoring.yaml
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 65
File: frontend/src/routes/_dashboard/route.tsx:62-66
Timestamp: 2026-01-28T12:49:25.269Z
Learning: In the NexusGate frontend (frontend/src/routes/_dashboard/route.tsx), the ViewModeToggle component is intentionally hidden on mobile devices because Grafana dashboards don't render well on small screens. The primary use case for Grafana embedding is desktop monitoring dashboards. Mobile users can still access Grafana view via URL parameter if needed, but it's not encouraged.
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 65
File: docker-compose.monitoring.yaml:30-34
Timestamp: 2026-01-28T13:12:52.161Z
Learning: The docker-compose.monitoring.yaml file in NexusGate is intended for local development and quick-start scenarios where convenience is prioritized. For production deployments, users should use the Helm chart or custom configuration with appropriate security settings. The file intentionally enables GF_AUTH_ANONYMOUS_ENABLED and GF_SECURITY_ALLOW_EMBEDDING for local iframe embedding testing, but production deployments should use Grafana's Public Dashboard feature, auth proxy, or direct Grafana access instead.
📚 Learning: 2026-01-24T18:23:42.635Z
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 59
File: backend/src/api/v1/responses.ts:566-577
Timestamp: 2026-01-24T18:23:42.635Z
Learning: When using Elysia with the apiKeyPlugin, if a route option has checkApiKey: true, the apiKeyRecord parameter is guaranteed to be non-null. Do not add explicit null checks or non-null assertions for apiKeyRecord in such routes; rely on this contract and avoid unnecessary guards to improve readability. If there is any doubt about the guarantee in a specific place, add a runtime assertion at the boundary where the plugin is configured, but otherwise omit null checks for apiKeyRecord in backend code under backend/.
Applied to files:
backend/src/utils/dashboards.ts
📚 Learning: 2026-01-24T22:33:16.471Z
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 59
File: backend/src/utils/streaming-context.ts:132-146
Timestamp: 2026-01-24T22:33:16.471Z
Learning: In NexusGate, for ReqId deduplication (specifically in streaming-context.ts and reqIdHandler.ts), when reqIdContext is present, finalizeReqId should use updateCompletion instead of addCompletions. This causes ReqId requests to not write srv_logs entries; errors are captured in the completion record's status (e.g., failed/aborted). This avoids duplicate srv_logs entries during retry scenarios and keeps the audit trail cleaner. Treat the completion record as the source of truth for ReqId request outcomes.
Applied to files:
backend/src/utils/dashboards.ts
🧬 Code graph analysis (1)
backend/src/utils/dashboards.ts (3)
backend/src/utils/config.ts (3)
GRAFANA_DASHBOARDS(164-167)GrafanaDashboard(162-162)grafanaDashboardsSchema(161-161)frontend/src/hooks/use-settings.ts (1)
GrafanaDashboard(5-9)backend/src/db/index.ts (1)
upsertSetting(613-625)
🔇 Additional comments (4)
backend/src/utils/dashboards.ts (4)
14-21: EnvOverrideError 设计清晰
错误码与信息明确,便于上层区分“环境覆盖”场景。
42-66: 迁移与回退逻辑清晰
新格式优先、兼容旧键并迁移到新结构的流程可读性很好。
75-83: 写入前覆盖校验合理
覆盖开启时禁止写入的处理符合预期。
89-95: 清理逻辑符合预期
覆盖开启时阻止清理、否则删除新键,行为一致。
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
When GRAFANA_DASHBOARDS is set to [], it should disable dashboards without falling back to DB. Previously required length > 0. Now isEnvOverrideActive() returns true if env var is defined at all, allowing ops to explicitly disable dashboards via GRAFANA_DASHBOARDS=[]. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1. Auto-select first Grafana dashboard when dashboards are configured - Hide built-in option by default when Grafana dashboards exist - Auto-redirect from 'builtin' to first dashboard on load 2. Improve iframe height - Use min-h-[calc(100vh-280px)] for responsive height - Iframe now fills available viewport space 3. Sync theme with NexusGate - Pass &theme=light or &theme=dark to Grafana URL - Uses resolvedTheme from next-themes Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@frontend/src/pages/overview/index.tsx`:
- Around line 66-80: The current auto-select useEffect only triggers when view
=== 'builtin', causing inconsistencies if view is missing/invalid; update the
effect in the component that references dashboards, selectedDashboard, view,
useEffect and navigate to also handle a missing/invalid view by checking if
dashboards.length > 0 and (view === 'builtin' || !selectedDashboard) and then
call navigate to set search.view to dashboards[0].id (keep replace: true). Also
include selectedDashboard in the effect dependencies to ensure the fallback runs
when selection becomes undefined.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
frontend/src/pages/overview/grafana-embed.tsxfrontend/src/pages/overview/index.tsxfrontend/src/pages/overview/view-mode-toggle.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
- frontend/src/pages/overview/grafana-embed.tsx
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 65
File: frontend/src/routes/_dashboard/route.tsx:62-66
Timestamp: 2026-01-28T12:49:25.269Z
Learning: In the NexusGate frontend (frontend/src/routes/_dashboard/route.tsx), the ViewModeToggle component is intentionally hidden on mobile devices because Grafana dashboards don't render well on small screens. The primary use case for Grafana embedding is desktop monitoring dashboards. Mobile users can still access Grafana view via URL parameter if needed, but it's not encouraged.
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 65
File: docker-compose.monitoring.yaml:30-34
Timestamp: 2026-01-28T13:12:52.161Z
Learning: The docker-compose.monitoring.yaml file in NexusGate is intended for local development and quick-start scenarios where convenience is prioritized. For production deployments, users should use the Helm chart or custom configuration with appropriate security settings. The file intentionally enables GF_AUTH_ANONYMOUS_ENABLED and GF_SECURITY_ALLOW_EMBEDDING for local iframe embedding testing, but production deployments should use Grafana's Public Dashboard feature, auth proxy, or direct Grafana access instead.
📚 Learning: 2026-01-28T12:49:25.269Z
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 65
File: frontend/src/routes/_dashboard/route.tsx:62-66
Timestamp: 2026-01-28T12:49:25.269Z
Learning: In the NexusGate frontend (frontend/src/routes/_dashboard/route.tsx), the ViewModeToggle component is intentionally hidden on mobile devices because Grafana dashboards don't render well on small screens. The primary use case for Grafana embedding is desktop monitoring dashboards. Mobile users can still access Grafana view via URL parameter if needed, but it's not encouraged.
Applied to files:
frontend/src/pages/overview/index.tsxfrontend/src/pages/overview/view-mode-toggle.tsx
🔇 Additional comments (4)
frontend/src/pages/overview/view-mode-toggle.tsx (1)
14-50: 实现清晰,切换逻辑一致。默认隐藏内置选项、仅在有必要时展示切换按钮的行为与目标一致,ARIA 标注也到位。
frontend/src/pages/overview/index.tsx (3)
82-88: 合并 search 的方式合理。保留既有查询参数并只更新
range,能避免切换时间范围时丢失view。
108-113: 移动端仅在内置模式显示时间范围选择器,逻辑清晰。
122-174: Grafana 与内置图表分支渲染结构清楚。SummaryCards 在两种模式下保持可见,符合需求。
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| const dashboards = dashboardsData?.dashboards ?? [] | ||
| const selectedDashboard = dashboards.find((d) => d.id === view) | ||
| const timeRange = range as TimeRange | ||
| const showGrafana = !!selectedDashboard | ||
|
|
||
| // Auto-select first Grafana dashboard when dashboards are configured | ||
| useEffect(() => { | ||
| if (dashboards.length > 0 && view === 'builtin') { | ||
| navigate({ | ||
| to: '/', | ||
| search: (prev) => ({ ...prev, view: dashboards[0].id }), | ||
| replace: true, | ||
| }) | ||
| } | ||
| }, [dashboards, view, navigate]) |
There was a problem hiding this comment.
处理无效/缺失 view 时的自动回落。
当 dashboards 存在但 view 为空或不匹配任何 dashboard 时,selectedDashboard 为 undefined,showGrafana 为 false,但内置选项默认隐藏,容易出现“展示内置图表却没有内置按钮”的不一致。建议在自动选择逻辑中同时处理无效/缺失的 view。
🛠️ 建议修复
- useEffect(() => {
- if (dashboards.length > 0 && view === 'builtin') {
- navigate({
- to: '/',
- search: (prev) => ({ ...prev, view: dashboards[0].id }),
- replace: true,
- })
- }
- }, [dashboards, view, navigate])
+ useEffect(() => {
+ if (dashboards.length === 0) return
+ const isBuiltin = view === 'builtin' || !view
+ const isInvalid = !isBuiltin && !selectedDashboard
+ if (isBuiltin || isInvalid) {
+ navigate({
+ to: '/',
+ search: (prev) => ({ ...prev, view: dashboards[0].id }),
+ replace: true,
+ })
+ }
+ }, [dashboards, view, selectedDashboard, navigate])🤖 Prompt for AI Agents
In `@frontend/src/pages/overview/index.tsx` around lines 66 - 80, The current
auto-select useEffect only triggers when view === 'builtin', causing
inconsistencies if view is missing/invalid; update the effect in the component
that references dashboards, selectedDashboard, view, useEffect and navigate to
also handle a missing/invalid view by checking if dashboards.length > 0 and
(view === 'builtin' || !selectedDashboard) and then call navigate to set
search.view to dashboards[0].id (keep replace: true). Also include
selectedDashboard in the effect dependencies to ensure the fallback runs when
selection becomes undefined.
Add cost metrics to Prometheus and enhance Grafana dashboard with advanced analytics panels for Issue #53: Backend changes: - Add getCompletionCostMetrics() DB query joining completions with models table to calculate costs based on pricing - Add 3 new Prometheus counters: nexusgate_cost_prompt_usd_total, nexusgate_cost_completion_usd_total, nexusgate_cost_total_usd_total with labels: model, provider, api_key_comment Grafana dashboard enhancements: - Add P75/P90 quantiles to latency panels (Duration & TTFT) - Add "Cost Analysis" row: Cost Over Time, Cost by Provider/API Key pie charts, Cost Breakdown table - Add "Trend Analysis" row: Request/Cost comparison with previous period, Change % stats - Add "Model Comparison" row: P95 Latency, Error Rate, Cost per 1K Tokens bar gauges, Model Performance Matrix table Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
/gemini review |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/src/db/index.ts`:
- Around line 1553-1561: The total_cost_usd aggregation currently requires both
m.input_price and m.output_price to be non-null which makes it inconsistent with
prompt_cost_usd and completion_cost_usd; update the SUM/CASE expression that
defines total_cost_usd so it independently adds the prompt and completion
contributions (use separate CASE terms that check m.input_price != NULL with
c.prompt_tokens and m.output_price != NULL with c.completion_tokens) rather than
gating the entire sum on both prices being present, ensuring total_cost_usd
equals prompt_cost_usd + completion_cost_usd.
🧹 Nitpick comments (4)
grafana/provisioning/dashboards/json/nexusgate-dashboard.json (4)
1367-1376: 建议:Error Rate 查询缺少除零保护此处的错误率计算
sum(...{status="failed"}) / sum(...)在某个模型没有请求时可能产生 NaN。建议与 Line 214 的修复保持一致,添加clamp_min保护。♻️ 建议的修复
- "expr": "sum(rate(nexusgate_completions_total{status=\"failed\"}[$__rate_interval])) by (model) / sum(rate(nexusgate_completions_total[$__rate_interval])) by (model)", + "expr": "sum(rate(nexusgate_completions_total{status=\"failed\"}[$__rate_interval])) by (model) / clamp_min(sum(rate(nexusgate_completions_total[$__rate_interval])) by (model), 0.001)",
1455-1464: 建议:Cache Hit Rate 查询缺少除零保护与上述问题类似,缓存命中率的计算在没有请求时可能产生 NaN。
♻️ 建议的修复
- "expr": "sum(rate(nexusgate_completions_total{status=\"cache_hit\"}[$__rate_interval])) / sum(rate(nexusgate_completions_total[$__rate_interval]))", + "expr": "sum(rate(nexusgate_completions_total{status=\"cache_hit\"}[$__rate_interval])) / clamp_min(sum(rate(nexusgate_completions_total[$__rate_interval])), 0.001)",
3217-3226: 建议:Model Comparison 中的 Error Rate 同样缺少除零保护此处与 Line 1373 的问题相同,建议添加
clamp_min保护以保持一致性。♻️ 建议的修复
- "expr": "sum(rate(nexusgate_completions_total{status=\"failed\"}[$__rate_interval])) by (model) / sum(rate(nexusgate_completions_total[$__rate_interval])) by (model)", + "expr": "sum(rate(nexusgate_completions_total{status=\"failed\"}[$__rate_interval])) by (model) / clamp_min(sum(rate(nexusgate_completions_total[$__rate_interval])) by (model), 0.001)",
3556-3566: 建议:Model Performance Matrix 表格中的 ErrorRate 查询也需要除零保护为保持仪表盘整体一致性,此处的错误率查询也应添加
clamp_min保护。♻️ 建议的修复
- "expr": "sum(rate(nexusgate_completions_total{status=\"failed\"}[$__rate_interval])) by (model) / sum(rate(nexusgate_completions_total[$__rate_interval])) by (model)", + "expr": "sum(rate(nexusgate_completions_total{status=\"failed\"}[$__rate_interval])) by (model) / clamp_min(sum(rate(nexusgate_completions_total[$__rate_interval])) by (model), 0.001)",
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
backend/src/db/index.tsbackend/src/services/prometheus.tsgrafana/provisioning/dashboards/json/nexusgate-dashboard.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-24T18:23:42.635Z
Learnt from: pescn
Repo: EM-GeekLab/NexusGate PR: 59
File: backend/src/api/v1/responses.ts:566-577
Timestamp: 2026-01-24T18:23:42.635Z
Learning: When using Elysia with the apiKeyPlugin, if a route option has checkApiKey: true, the apiKeyRecord parameter is guaranteed to be non-null. Do not add explicit null checks or non-null assertions for apiKeyRecord in such routes; rely on this contract and avoid unnecessary guards to improve readability. If there is any doubt about the guarantee in a specific place, add a runtime assertion at the boundary where the plugin is configured, but otherwise omit null checks for apiKeyRecord in backend code under backend/.
Applied to files:
backend/src/services/prometheus.tsbackend/src/db/index.ts
🧬 Code graph analysis (1)
backend/src/services/prometheus.ts (1)
backend/src/db/index.ts (1)
getCompletionCostMetrics(1534-1577)
🔇 Additional comments (9)
grafana/provisioning/dashboards/json/nexusgate-dashboard.json (6)
214-214: 除零保护修复,LGTM!使用
clamp_min(sum(increase(nexusgate_completions_total[$__range])), 1)保护除数,当没有数据时避免显示 NaN 或 Inf,是一个良好的实践。
743-780: 新增 p75/p90 百分位延迟指标,LGTM!新增的 p75 和 p90 百分位数能更全面地展示延迟分布,对于监控尾部延迟很有帮助。refId 的更新 (B→D, C→E) 也保持了正确的顺序。
2166-2265: 新增 Cost Analysis 区域,整体设计良好!成本分析面板配置合理:
- 使用
currencyUSD单位格式化货币显示- 采用堆叠模式 (
"mode": "normal") 展示各模型成本累计- 图例配置显示 sum 值便于快速查看总成本
2871-2880: 除零保护使用了 clamp_min(... , 1),LGTM!Request Change % 计算正确使用了
clamp_min(..., 1)避免除零错误。
2948-2957: clamp_min 使用 0.001 作为最小值是合理的对于成本变化百分比,使用
clamp_min(..., 0.001)而非1是合理的,因为成本金额可能很小(小于 $1),使用较小的最小值可以得到更准确的百分比计算。
3080-3092: 新增 Model Comparison 区域,设计合理!Model Comparison 部分提供了很好的多维度对比视图:
- P95 延迟、错误率、每千 Token 成本、请求数的横向对比
- 使用 bargauge 可视化便于快速比较
- Model Performance Matrix 表格汇总了关键指标
这对于模型选型和性能优化非常有帮助。
backend/src/db/index.ts (1)
1534-1577: 函数整体实现良好新增的
getCompletionCostMetrics函数:
- 正确使用 LEFT JOIN 获取模型定价和 provider 信息
- 使用 COALESCE 处理 NULL 值
- 类型断言合理,与文件中其他类似函数保持一致
- 按 model、provider、api_key_comment 分组适合 Prometheus 标签维度
backend/src/services/prometheus.ts (2)
10-10: 导入和数据获取结构良好
- 新增导入与现有导入组织方式一致
- 使用
Promise.all并行获取 cost metrics,保持性能最优Also applies to: 176-186
305-361: Cost metrics 实现遵循现有模式实现亮点:
- 仅当
cost > 0时才添加指标,避免无定价模型产生噪声数据- 标签维度 (model, provider, api_key_comment) 与 Grafana dashboard 需求匹配
- 使用
formatCounter与其他计数器保持一致代码结构清晰,与文件中其他指标处理逻辑风格统一。
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Code Review
This is an excellent pull request that adds a significant and well-implemented feature for Grafana dashboard embedding. The backend changes, including the new settings and dashboards APIs, are robust and include a thoughtful migration path for legacy configurations. The frontend implementation provides a great user experience with features like view mode toggling and theme synchronization. The inclusion of a pre-built Grafana dashboard and the fix for the potential division-by-zero error in the success rate calculation are also highly appreciated. I have one suggestion to further improve the robustness of the frontend data fetching logic.
| // Runtime validation to handle unexpected API responses | ||
| const response = data as DashboardsResponse | ||
| if (!Array.isArray(response?.dashboards)) { | ||
| return { dashboards: [], envOverride: false } | ||
| } | ||
| return response |
There was a problem hiding this comment.
The runtime validation here is a good step towards defensive programming. However, it only checks if response.dashboards is an array, but doesn't validate the shape of the objects within the array. To make this more robust against unexpected API responses, I suggest using a schema validation library like Zod, which is already in use elsewhere in the frontend.
This ensures that the entire data structure matches expectations, preventing potential runtime errors if the API contract changes or returns malformed data. You'll need to add import { z } from 'zod' at the top of the file.
const { z } = await import('zod')
const GrafanaDashboardSchema = z.object({
id: z.string(),
label: z.string(),
url: z.string().url(),
})
const DashboardsResponseSchema = z.object({
dashboards: z.array(GrafanaDashboardSchema),
envOverride: z.boolean(),
})
const parsed = DashboardsResponseSchema.safeParse(data)
if (!parsed.success) {
console.error('Failed to parse dashboards response:', parsed.error)
return { dashboards: [], envOverride: false }
}
return parsed.dataReferences
- When parsing data from external API responses, prioritize robust, verbose type guarding over concise functional approaches (like
filterwith type predicates) to defensively handle unexpected data shapes and prevent runtime errors.
- Fix total_cost_usd calculation to independently sum prompt and completion costs (previously required both prices to be non-null) - Add clamp_min(0.001) protection to 4 division queries in Grafana dashboard to prevent NaN when no requests exist: - Error Rate by Model panel - Cache Hit Rate panel - Model Comparison Error Rate bar gauge - Model Performance Matrix ErrorRate column Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Thanks @coderabbitai for the thorough review! Fixed in commit c2f31b3:
|
|
Tip For best results, initiate chat on the files or code changes.
✅ SQL query fix - Independently summing prompt and completion costs ensures ✅ Divide-by-zero protection - Adding Great attention to detail in fixing all four instances! 🎉 |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@grafana/provisioning/dashboards/json/nexusgate-dashboard.json`:
- Around line 2526-2558: The joinByField transformation currently uses "model"
as the join key (id: "joinByField", options.byField: "model") which can drop
rows when a model has multiple provider or api_key_comment combinations; update
the transformations to either use a composite key (concatenate
model+provider+api_key_comment as the join field) or replace the joinByField
step with a seriesToRows transformation so each provider/api_key_comment pair is
preserved before the organize transformation (id: "organize") and renames.
🧹 Nitpick comments (1)
grafana/provisioning/dashboards/json/nexusgate-dashboard.json (1)
1743-1754: 建议:API Key Rate Limit Usage 面板也添加除零保护。此处的除法查询未使用
clamp_min保护:100 * nexusgate_api_key_rpm_usage / nexusgate_api_key_rpm_limit 100 * nexusgate_api_key_tpm_usage / nexusgate_api_key_tpm_limit当
limit为 0 或指标缺失时会产生 NaN。为保持一致性,建议添加保护:♻️ 建议修改
- "expr": "100 * nexusgate_api_key_rpm_usage / nexusgate_api_key_rpm_limit", + "expr": "100 * nexusgate_api_key_rpm_usage / clamp_min(nexusgate_api_key_rpm_limit, 1)",- "expr": "100 * nexusgate_api_key_tpm_usage / nexusgate_api_key_tpm_limit", + "expr": "100 * nexusgate_api_key_tpm_usage / clamp_min(nexusgate_api_key_tpm_limit, 1)",
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/src/db/index.tsgrafana/provisioning/dashboards/json/nexusgate-dashboard.json
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/src/db/index.ts
🔇 Additional comments (6)
grafana/provisioning/dashboards/json/nexusgate-dashboard.json (6)
214-214: 除零保护实现正确。使用
clamp_min(sum(increase(nexusgate_completions_total[$__range])), 1)正确地防止了在没有请求时除以零的情况。对于计数类指标使用下限1是合理的选择。
743-780: 新增 p75/p90 分位数指标,增强延迟监控粒度。新增的百分位数查询结构与现有的 p50/p95/p99 保持一致,
refId命名有序(A→E),面板标题也已正确更新。
1373-1373: 速率类指标的除零保护实现正确。对于
rate()计算的指标使用clamp_min(0.001)是合理的,因为速率值可能非常小但不为零。这与 PR commit 中描述的修复一致。Also applies to: 1461-1461
2871-2880: 趋势对比查询的除零保护一致性良好。
Request Change %使用clamp_min(..., 1)保护计数类指标Cost Change %使用clamp_min(..., 0.001)保护金额类指标保护阈值的选择逻辑清晰:计数至少为 1,成本金额可以很小但正数。
Also applies to: 2948-2957
3223-3223: Model Comparison 部分的除零保护实现正确。
- Error Rate (bargauge):
clamp_min(0.001)用于速率- Cost per 1K Tokens:
clamp_min(1)用于 token 计数- Model Performance Matrix:
clamp_min(0.001)用于速率Also applies to: 3283-3283, 3562-3562
2165-2178: 新增的 Cost Analysis、Trend Analysis 和 Model Comparison 部分提供了全面的成本和模型性能监控。整体结构清晰,面板配置合理:
- Cost Analysis:成本趋势、供应商/API Key 分布、详细成本明细表
- Trend Analysis:同比趋势对比,使用
offset $__range实现周期比较- Model Comparison:多维度模型对比(延迟、错误率、成本效率)
这些新增内容与 PR 目标一致,为 Grafana 仪表板嵌入功能提供了丰富的监控视图。
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| "transformations": [ | ||
| { | ||
| "id": "joinByField", | ||
| "options": { | ||
| "byField": "model", | ||
| "mode": "outer" | ||
| } | ||
| }, | ||
| { | ||
| "id": "organize", | ||
| "options": { | ||
| "excludeByName": { | ||
| "Time": true, | ||
| "Time 1": true, | ||
| "Time 2": true, | ||
| "Time 3": true, | ||
| "api_key_comment 2": true, | ||
| "api_key_comment 3": true, | ||
| "provider 2": true, | ||
| "provider 3": true | ||
| }, | ||
| "indexByName": {}, | ||
| "renameByName": { | ||
| "Value #Completion": "Completion Cost", | ||
| "Value #Prompt": "Prompt Cost", | ||
| "Value #Total": "Total Cost", | ||
| "api_key_comment 1": "API Key", | ||
| "model": "Model", | ||
| "provider 1": "Provider" | ||
| } | ||
| } | ||
| } | ||
| ], |
There was a problem hiding this comment.
Cost Breakdown 表格的 join 转换可能导致数据丢失。
当同一 model 存在多个 provider 或 api_key_comment 组合时,joinByField 使用 model 作为连接字段会导致数据行被覆盖或丢失。建议改用复合键(如 model + provider + api_key_comment)或使用 seriesToRows 转换。
不过,如果当前业务场景中每个 model 只会关联一个 provider 和 api_key,则此实现可以正常工作。
🤖 Prompt for AI Agents
In `@grafana/provisioning/dashboards/json/nexusgate-dashboard.json` around lines
2526 - 2558, The joinByField transformation currently uses "model" as the join
key (id: "joinByField", options.byField: "model") which can drop rows when a
model has multiple provider or api_key_comment combinations; update the
transformations to either use a composite key (concatenate
model+provider+api_key_comment as the join field) or replace the joinByField
step with a seriesToRows transformation so each provider/api_key_comment pair is
preserved before the organize transformation (id: "organize") and renames.
…irect Wrap dashboards array derivation with useMemo to prevent unnecessary useEffect re-runs caused by new array references on each render. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Retain Grafana conditional wrapper and ViewModeToggle import from feature branch while incorporating main's formatting changes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
/api/admin/settings/:key) for storing configuration valuesgrafana/nexusgate-overview.json) for easy importTest plan
🤖 Generated with Claude Code
Summary by CodeRabbit
新功能
Bug 修复
杂项
✏️ Tip: You can customize this high-level summary in your review settings.