doc: update metrics doc regarding frontend staged gauges #8459

Merged: jh-nv merged 4 commits into main from jihao/frontend_doc on Apr 22, 2026

@jh-nv jh-nv commented Apr 21, 2026

Overview:

update metrics doc regarding frontend staged gauges

Details:

Follow-up docs for PR #8162 (staged frontend gauges).

Update the documentation for the new staged frontend gauges, and add deprecation notes for the metrics that are slated for removal.

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

Documentation

  • Introduced improved frontend metrics: dynamo_frontend_active_requests for request lifetime tracking and dynamo_frontend_stage_requests with granular stage/phase labels
  • Updated autoscaling configuration guides and monitoring query examples with new metrics
  • Marked legacy metrics as deprecated with clear migration guidance
  • Enhanced troubleshooting documentation and metrics reference with examples and derived signal definitions

@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Apr 21, 2026

coderabbitai Bot commented Apr 21, 2026

Walkthrough

Documentation updated to reflect new frontend metrics. Replaced deprecated dynamo_frontend_queued_requests and dynamo_frontend_inflight_requests with dynamo_frontend_stage_requests{stage,phase} and dynamo_frontend_active_requests. Prometheus Adapter and KEDA configuration examples revised accordingly.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation Updates: docs/kubernetes/autoscaling.md, docs/observability/metrics.md | Replaced deprecated frontend metrics with new metrics. Updated autoscaling examples to use dynamo_frontend_stage_requests and dynamo_frontend_active_requests. Added detailed label semantics documentation and deprecated metric guidance. Added PromQL-style derived signal formulas and cross-references between documentation sections. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5 passed

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: updating metrics documentation for frontend staged gauges, which aligns with the PR's core objective of documenting new staged metrics. |
| Description check | ✅ Passed | The description covers the overview, details about the follow-up documentation, and references the related PR #8162, but the 'Where should the reviewer start' section is empty and 'Related Issues' references a placeholder issue number. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
docs/kubernetes/autoscaling.md (1)

351-365: Make queue-depth queries resilient to future stage additions.

The docs define queue depth as preprocess + route + dispatch, but this rule sums every stage series. Consider filtering on those three stages explicitly (and mirroring the change in the KEDA query at line 511) to avoid semantic drift if new stages are added later.

Suggested doc diff
-  metricsQuery: |
-    sum(<<.Series>>{<<.LabelMatchers>>}) by (namespace, dynamo_namespace)
+  metricsQuery: |
+    sum(<<.Series>>{<<.LabelMatchers>>,stage=~"preprocess|route|dispatch"}) by (namespace, dynamo_namespace)
-        sum(dynamo_frontend_stage_requests{dynamo_namespace="default-sglang-agg"})
+        sum(dynamo_frontend_stage_requests{dynamo_namespace="default-sglang-agg",stage=~"preprocess|route|dispatch"})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/kubernetes/autoscaling.md` around lines 351 - 365, The current
prometheus-adapter rule for dynamo_queued_requests sums all
dynamo_frontend_stage_requests series which risks semantic drift if new stages
are added; update the metricsQuery for the rule named "dynamo_queued_requests"
to explicitly sum only the preprocess, route, and dispatch stages (e.g., sum of
those three label-filtered series) instead of using a wildcard of <<.Series>>;
also apply the same explicit-stage filtering change to the corresponding KEDA
query that references frontend stage requests so both rules remain consistent.
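
To make the drift concern concrete, here is a minimal Python sketch that emulates the two aggregations over in-memory samples. It is an illustration only, not the Prometheus Adapter or KEDA implementation: the sample values, the `sum_stage_requests` helper, and the `hypothetical_new_stage` label value are all assumptions made up for this example. Only the three stage names come from the comment above.

```python
# Stages that the docs count toward queue depth (taken from the review comment).
QUEUE_STAGES = frozenset({"preprocess", "route", "dispatch"})


def sum_stage_requests(samples, stages=None):
    """Sum gauge values; if `stages` is given, keep only samples whose
    `stage` label is in that set (like stage=~"a|b|c" in a label matcher)."""
    return sum(
        s["value"]
        for s in samples
        if stages is None or s["labels"]["stage"] in stages
    )


# Made-up samples standing in for dynamo_frontend_stage_requests series.
samples = [
    {"labels": {"stage": "preprocess"}, "value": 3.0},
    {"labels": {"stage": "route"}, "value": 2.0},
    {"labels": {"stage": "dispatch"}, "value": 1.0},
]

# Hypothetical stage added in some future release (not a real stage name).
with_new_stage = samples + [
    {"labels": {"stage": "hypothetical_new_stage"}, "value": 4.0}
]

unfiltered_before = sum_stage_requests(samples)
unfiltered_after = sum_stage_requests(with_new_stage)
filtered_after = sum_stage_requests(with_new_stage, QUEUE_STAGES)
```

In this sketch the unfiltered total changes once the extra stage appears, while the explicitly filtered total keeps meaning preprocess + route + dispatch, which is the behavior the suggested `stage=~"preprocess|route|dispatch"` matcher is meant to preserve.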
docs/observability/metrics.md (1)

176-180: Clarify aggregation scope in derived PromQL examples.

Line 176 says these are “per frontend pod,” but the shown sum(...) expressions aggregate across all matched series unless scoped/grouped. Consider rewording or adding by(...) examples so operators don’t misread query scope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/observability/metrics.md` around lines 176 - 180, The PromQL examples
claim "per frontend pod" but use bare sum(...) which aggregates across all
series; update the docs around dynamo_frontend_stage_requests and
dynamo_frontend_active_requests to either (a) clarify that the shown sum()
examples produce cluster-wide totals, or (b) provide explicit per-pod
aggregation variants such as using sum by(pod)(...) or sum without reduction but
grouped appropriately; reference the three derived operators (the two
expressions using sum(dynamo_frontend_stage_requests) and
sum(dynamo_frontend_active_requests) - sum(dynamo_frontend_stage_requests), and
the Router saturation sum(dynamo_frontend_stage_requests{stage="route"})) and
add a short note showing the per-pod and cluster-wide forms so readers aren’t
misled about scope.
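
The scope difference can be shown with a small Python sketch that emulates a bare `sum(...)` versus `sum by(pod)(...)`. This is an illustration under assumptions: the `sum_by` helper, the sample values, and the pod names `frontend-0` and `frontend-1` are invented here, and whether a `pod` label is present on the real series depends on the scrape configuration.

```python
from collections import defaultdict


def sum_by(samples, label=None):
    """Emulate PromQL aggregation: a bare sum(...) when `label` is None,
    otherwise sum by(<label>)(...), returning one total per label value."""
    if label is None:
        return sum(s["value"] for s in samples)
    totals = defaultdict(float)
    for s in samples:
        totals[s["labels"][label]] += s["value"]
    return dict(totals)


# Made-up samples standing in for dynamo_frontend_stage_requests series.
stage_samples = [
    {"labels": {"pod": "frontend-0", "stage": "route"}, "value": 2.0},
    {"labels": {"pod": "frontend-0", "stage": "dispatch"}, "value": 1.0},
    {"labels": {"pod": "frontend-1", "stage": "route"}, "value": 5.0},
]

# Bare sum: a single number across every matched series.
total_all_series = sum_by(stage_samples)

# Grouped sum: one number per frontend pod.
per_pod = sum_by(stage_samples, "pod")
```

The bare sum collapses both pods into one value, so a reader expecting a "per frontend pod" number would only get it from the grouped form.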

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c658035a-fead-43c4-8a2f-56a426fd75dc

📥 Commits

Reviewing files that changed from the base of the PR and between ddd19a6 and b63fddd.

📒 Files selected for processing (2)
  • docs/kubernetes/autoscaling.md
  • docs/observability/metrics.md

github-actions Bot commented Apr 21, 2026

Comment thread docs/kubernetes/autoscaling.md Outdated
@jh-nv jh-nv merged commit 5135c32 into main Apr 22, 2026
63 of 64 checks passed
@jh-nv jh-nv deleted the jihao/frontend_doc branch April 22, 2026 18:51

Labels

documentation Improvements or additions to documentation size/M

2 participants