doc: update metrics doc regarding frontend staged gauges#8459
Conversation
WalkthroughDocumentation updated to reflect new frontend metrics. Replaced deprecated Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
docs/kubernetes/autoscaling.md (1)
351-365: Make queue-depth queries resilient to future stage additions.Docs define queue depth as
preprocess + route + dispatch, but this rule sums all stage series. Consider filtering explicitly (and mirroring the same change in Line 511 KEDA query) to avoid semantic drift if new stages are added later.Suggested doc diff
- metricsQuery: | - sum(<<.Series>>{<<.LabelMatchers>>}) by (namespace, dynamo_namespace) + metricsQuery: | + sum(<<.Series>>{<<.LabelMatchers>>,stage=~"preprocess|route|dispatch"}) by (namespace, dynamo_namespace)- sum(dynamo_frontend_stage_requests{dynamo_namespace="default-sglang-agg"}) + sum(dynamo_frontend_stage_requests{dynamo_namespace="default-sglang-agg",stage=~"preprocess|route|dispatch"})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/kubernetes/autoscaling.md` around lines 351 - 365, The current prometheus-adapter rule for dynamo_queued_requests sums all dynamo_frontend_stage_requests series which risks semantic drift if new stages are added; update the metricsQuery for the rule named "dynamo_queued_requests" to explicitly sum only the preprocess, route, and dispatch stages (e.g., sum of those three label-filtered series) instead of using a wildcard of <<.Series>>; also apply the same explicit-stage filtering change to the corresponding KEDA query that references frontend stage requests so both rules remain consistent.docs/observability/metrics.md (1)
176-180: Clarify aggregation scope in derived PromQL examples.Line 176 says these are “per frontend pod,” but the shown
sum(...)expressions aggregate across all matched series unless scoped/grouped. Consider rewording or addingby(...)examples so operators don’t misread query scope.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/observability/metrics.md` around lines 176 - 180, The PromQL examples claim "per frontend pod" but use bare sum(...) which aggregates across all series; update the docs around dynamo_frontend_stage_requests and dynamo_frontend_active_requests to either (a) clarify that the shown sum() examples produce cluster-wide totals, or (b) provide explicit per-pod aggregation variants such as using sum by(pod)(...) or sum without reduction but grouped appropriately; reference the three derived operators (the two expressions using sum(dynamo_frontend_stage_requests) and sum(dynamo_frontend_active_requests) - sum(dynamo_frontend_stage_requests), and the Router saturation sum(dynamo_frontend_stage_requests{stage="route"})) and add a short note showing the per-pod and cluster-wide forms so readers aren’t misled about scope.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@docs/kubernetes/autoscaling.md`:
- Around line 351-365: The current prometheus-adapter rule for
dynamo_queued_requests sums all dynamo_frontend_stage_requests series which
risks semantic drift if new stages are added; update the metricsQuery for the
rule named "dynamo_queued_requests" to explicitly sum only the preprocess,
route, and dispatch stages (e.g., sum of those three label-filtered series)
instead of using a wildcard of <<.Series>>; also apply the same explicit-stage
filtering change to the corresponding KEDA query that references frontend stage
requests so both rules remain consistent.
In `@docs/observability/metrics.md`:
- Around line 176-180: The PromQL examples claim "per frontend pod" but use bare
sum(...) which aggregates across all series; update the docs around
dynamo_frontend_stage_requests and dynamo_frontend_active_requests to either (a)
clarify that the shown sum() examples produce cluster-wide totals, or (b)
provide explicit per-pod aggregation variants such as using sum by(pod)(...) or
sum without reduction but grouped appropriately; reference the three derived
operators (the two expressions using sum(dynamo_frontend_stage_requests) and
sum(dynamo_frontend_active_requests) - sum(dynamo_frontend_stage_requests), and
the Router saturation sum(dynamo_frontend_stage_requests{stage="route"})) and
add a short note showing the per-pod and cluster-wide forms so readers aren’t
misled about scope.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c658035a-fead-43c4-8a2f-56a426fd75dc
📒 Files selected for processing (2)
docs/kubernetes/autoscaling.mddocs/observability/metrics.md
Overview:
update metrics doc regarding frontend staged gauges
Details:
Follow-up docs for PR #8162 (staged frontend gauges).
Update the documentation for the new staged frontend gauges, and added deprecation notes for the ones to remove
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
Documentation
dynamo_frontend_active_requestsfor request lifetime tracking anddynamo_frontend_stage_requestswith granular stage/phase labels