
feat: add staged frontend gauges #8162

Merged
jh-nv merged 12 commits into main from jihao/frontend_metrics_DIS-1702
Apr 20, 2026

Conversation

@jh-nv

@jh-nv jh-nv commented Apr 14, 2026

Overview:

Phase one of the #8054 implementation. Adds new gauges with clear stage boundaries that are continuous and do not overlap.

Details:

  • Add dynamo_frontend_active_requests{model} gauge — same semantics as inflight_requests (HTTP
    entry to response complete) with a clearer name, emitted alongside the existing gauge
  • Add dynamo_frontend_stage_requests{stage, phase} gauge — per-stage inflight count for
    preprocess, route, and dispatch stages, with phase label (prefill/decode/aggregated) for
    disaggregated mode visibility
  • Add StageGuard RAII type that increments on creation and decrements on drop, used at each
    instrumentation point
  • Add unit tests for StageGuard inc/dec behavior and active_requests tracking through
    InflightGuard
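
The increment-on-creation, decrement-on-drop behavior described for StageGuard can be sketched as follows. This is a minimal illustration, not the PR's code: `StageGauge` is a hypothetical stand-in built on standard-library atomics so the example has no dependencies, whereas the PR uses a Prometheus gauge with `stage` and `phase` labels.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a labeled integer gauge keyed by (stage, phase).
#[derive(Default)]
pub struct StageGauge {
    cells: Mutex<HashMap<(String, String), Arc<AtomicI64>>>,
}

impl StageGauge {
    /// Returns the cell for one (stage, phase) pair, creating it at 0 on first use.
    fn cell(&self, stage: &str, phase: &str) -> Arc<AtomicI64> {
        let mut cells = self.cells.lock().unwrap();
        cells
            .entry((stage.to_string(), phase.to_string()))
            .or_default()
            .clone()
    }

    /// Current in-flight count for a label pair.
    pub fn get(&self, stage: &str, phase: &str) -> i64 {
        self.cell(stage, phase).load(Ordering::SeqCst)
    }
}

/// RAII guard: increments its gauge cell on creation and decrements it on drop.
pub struct StageGuard {
    cell: Arc<AtomicI64>,
}

impl StageGuard {
    pub fn new(gauge: &StageGauge, stage: &str, phase: &str) -> Self {
        let cell = gauge.cell(stage, phase);
        cell.fetch_add(1, Ordering::SeqCst);
        StageGuard { cell }
    }
}

impl Drop for StageGuard {
    fn drop(&mut self) {
        // Runs on every exit path (return, `?`, panic unwind), so the
        // count cannot leak when a stage exits early.
        self.cell.fetch_sub(1, Ordering::SeqCst);
    }
}
```

Binding the guard to a local such as `let _guard = StageGuard::new(&gauge, "route", "prefill");` ties the count to that scope. Each label pair has its own cell, which is the label independence the unit tests check.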

Test plan

  • Unit tests: StageGuard inc/dec, label independence, active_requests tracks InflightGuard
  • Container test (vllm, disagg + kv router): all 7 gauge labels appear in /metrics,
    active_requests=3 observed in-flight, all return to 0 after completion
  • Container test (sglang, disagg + kv router): same verification, all gauges present
  • Container test (trtllm, disagg + kv router): same verification, all gauges present
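
The "all return to 0 after completion" check in the container tests could be automated along these lines. This is a rough sketch under stated assumptions: the `/metrics` body has already been fetched into a string, samples carry no trailing timestamps, and `gauge_samples` and `all_zero` are hypothetical helper names that do not come from the PR.

```rust
/// Collects (label set, value) pairs for one metric from Prometheus text
/// exposition output. Assumes samples have no trailing timestamp.
pub fn gauge_samples(text: &str, metric: &str) -> Vec<(String, f64)> {
    let mut out = Vec::new();
    for line in text.lines() {
        let line = line.trim();
        // Skip blank lines and # HELP / # TYPE comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let Some(rest) = line.strip_prefix(metric) else { continue };
        // Require an exact name match, not just a shared prefix.
        if !(rest.starts_with('{') || rest.starts_with(' ')) {
            continue;
        }
        // The value is the last space-separated token on the line.
        let Some((labels, value)) = rest.rsplit_once(' ') else { continue };
        if let Ok(v) = value.parse::<f64>() {
            out.push((labels.trim().to_string(), v));
        }
    }
    out
}

/// True when the metric is present and every sample reads 0.
pub fn all_zero(text: &str, metric: &str) -> bool {
    let samples = gauge_samples(text, metric);
    !samples.is_empty() && samples.iter().all(|(_, v)| *v == 0.0)
}
```

Calling `all_zero(body, "dynamo_frontend_stage_requests")` once the test requests have finished would cover the final step of each container test. Requiring at least one sample also catches the case where the gauge is missing from the output entirely.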

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added per-stage request performance metrics tracking across pipeline phases for improved visibility into request lifecycle
    • Added active request count monitoring for enhanced system observability
    • Added tracking for model migration failures due to sequence length limit exceeded
  • Refactor

    • Streamlined metric definitions by consolidating manual transport-layer metric declarations


@jh-nv jh-nv requested review from a team as code owners April 14, 2026 16:47
@github-actions github-actions Bot added feat frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` router Relates to routing, KV-aware routing, etc. labels Apr 14, 2026
@coderabbitai

coderabbitai Bot commented Apr 14, 2026

Walkthrough

The PR introduces new Prometheus metrics for frontend pipeline instrumentation. It adds a STAGE_REQUESTS gauge with stage/phase labels to track per-stage inflight requests, implements a StageGuard RAII type for automatic metric increment/decrement, introduces ACTIVE_REQUESTS gauge for frontend request counts, and instruments preprocessor, push router, and HTTP service components with these metrics.
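
One way to picture the active-requests gauge moving in step with the existing inflight gauge is a single guard that updates both. This is a simplified sketch: the `FrontendGauges` struct and the shape of `InflightGuard` shown here are assumptions for illustration, plain atomics replace the Prometheus gauges, and the per-model label is omitted.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-ins for the two frontend gauges.
#[derive(Default, Clone)]
pub struct FrontendGauges {
    pub inflight_requests: Arc<AtomicI64>,
    pub active_requests: Arc<AtomicI64>,
}

/// Illustrative guard that keeps both gauges in lockstep for one request.
pub struct InflightGuard {
    gauges: FrontendGauges,
}

impl InflightGuard {
    /// Called at HTTP entry: both gauges go up together.
    pub fn new(gauges: &FrontendGauges) -> Self {
        gauges.inflight_requests.fetch_add(1, Ordering::SeqCst);
        gauges.active_requests.fetch_add(1, Ordering::SeqCst);
        InflightGuard { gauges: gauges.clone() }
    }
}

impl Drop for InflightGuard {
    /// Runs when the response completes: both gauges come down together.
    fn drop(&mut self) {
        self.gauges.inflight_requests.fetch_sub(1, Ordering::SeqCst);
        self.gauges.active_requests.fetch_sub(1, Ordering::SeqCst);
    }
}
```

Because both updates happen inside the same guard, the two gauges agree at every guard boundary, which is the property a mirroring test would assert.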

Changes

| Cohort / File(s) | Summary |
|---|---|
| Metric Name Constants: lib/runtime/src/metrics/prometheus_names.rs, lib/bindings/python/src/dynamo/prometheus_names.py | Added ACTIVE_REQUESTS and STAGE_REQUESTS constant definitions. Python bindings also removed manually-maintained transport.tcp and transport.nats sub-classes and their associated metric constants. |
| Frontend Performance Metrics: lib/runtime/src/metrics/frontend_perf.rs | Added STAGE_REQUESTS IntGaugeVec metric with stage and phase labels, introduced StageGuard RAII type for automatic gauge increment/decrement, updated metric registration paths, and added unit tests for guard behavior. |
| HTTP Service Metrics: lib/llm/src/http/service/metrics.rs | Added active_requests_gauge IntGaugeVec field to Metrics struct, updated inflight gauge accounting to increment/decrement both gauges, and added unit test validating gauge mirroring behavior. |
| Frontend Instrumentation: lib/llm/src/preprocessor.rs, lib/llm/src/kv_router/push_router.rs | Added StageGuard instrumentation for preprocess, route, and dispatch pipeline stages. Extended RequestGuard with first_response_received flag and dispatch_guard field to track dispatch-stage lifecycle until first backend response arrives. |
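
The dispatch-stage lifecycle described for RequestGuard (hold the dispatch guard until the first backend response arrives) might look roughly like this. It is a sketch under assumptions: the field names `first_response_received` and `dispatch_guard` come from the summary above, but the `on_response` method, the constructor, and the single-atomic `StageGuard` are hypothetical simplifications.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

/// Simplified single-gauge StageGuard: +1 on creation, -1 on drop.
pub struct StageGuard(Arc<AtomicI64>);

impl StageGuard {
    pub fn new(gauge: &Arc<AtomicI64>) -> Self {
        gauge.fetch_add(1, Ordering::SeqCst);
        StageGuard(gauge.clone())
    }
}

impl Drop for StageGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Per-request guard that holds the dispatch stage open until the first response.
pub struct RequestGuard {
    first_response_received: bool,
    dispatch_guard: Option<StageGuard>,
}

impl RequestGuard {
    /// Entering dispatch: the dispatch gauge is incremented here.
    pub fn new(dispatch_gauge: &Arc<AtomicI64>) -> Self {
        RequestGuard {
            first_response_received: false,
            dispatch_guard: Some(StageGuard::new(dispatch_gauge)),
        }
    }

    /// Hypothetical hook called for every response item from the backend.
    pub fn on_response(&mut self) {
        if !self.first_response_received {
            self.first_response_received = true;
            // Assigning None drops the StageGuard, ending the dispatch stage.
            self.dispatch_guard = None;
        }
    }
}
```

Holding the guard in an `Option` means the gauge is decremented exactly once: on the first response, or when the whole `RequestGuard` is dropped if no response ever arrives.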

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: add staged frontend gauges' directly and clearly describes the main change: introducing new gauge metrics with stage boundaries. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | The pull request description follows the required template structure with complete Overview, Details, and Related Issues sections that comprehensively explain the changes. |


coderabbitai[bot]

This comment was marked as resolved.

Comment thread lib/llm/src/http/service/metrics.rs
Comment thread lib/runtime/src/metrics/frontend_perf.rs Outdated
Comment thread lib/bindings/python/src/dynamo/prometheus_names.py
Comment thread lib/llm/src/kv_router/push_router.rs
Comment thread lib/llm/src/kv_router/push_router.rs Outdated
Comment thread lib/llm/src/http/service/metrics.rs
Comment thread lib/runtime/src/metrics/frontend_perf.rs

@keivenchang keivenchang left a comment


looks good — stage gauges are clean, tests cover the RAII lifecycle well. minor nits inline but nothing blocking.

@dmitry-tokarev-nv

@jh-nv The CI was stuck on this PR as there was an issue on main. It is now resolved. To get the fix please pull main and resolve conflicts.

devin-ai-integration[bot]

This comment was marked as resolved.


@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 new potential issue.


Comment thread lib/runtime/src/pipeline/network/egress/push_router.rs
@jh-nv jh-nv merged commit 0c98d9a into main Apr 20, 2026
91 checks passed
@jh-nv jh-nv deleted the jihao/frontend_metrics_DIS-1702 branch April 20, 2026 16:21
Labels

feat frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` router Relates to routing, KV-aware routing, etc. size/L

3 participants