feat(monitoring): add fairness query param to pending-request-counts #1042

Open: pjb157 wants to merge 2 commits into main from feat/fairness-pending-counts

Conversation

pjb157 (Contributor) commented May 3, 2026

Summary

Adds an opt-in ?fairness=true query parameter to GET /admin/api/v1/monitoring/pending-request-counts.

When set, each user's contribution to a (model, completion_window) bucket is capped at the bucket's average pending count, so a single dominant user cannot inflate the reported depth beyond what a per-user fair scheduler will actually drain. Default behaviour (flag absent) is unchanged.
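The opt-in semantics (flag absent means the default raw counts; only an explicit `fairness=true` switches modes) can be sketched with a minimal, std-only query check. This helper is purely illustrative; the actual handler presumably uses a typed query extractor rather than string parsing:

```rust
// Illustrative sketch of the opt-in flag semantics, not the handler's
// real extractor: only an explicit `fairness=true` enables fair mode.
fn fairness_requested(query: &str) -> bool {
    query
        .split('&')
        .filter_map(|kv| kv.split_once('='))
        .any(|(key, value)| key == "fairness" && value == "true")
}

fn main() {
    assert!(fairness_requested("fairness=true"));
    // Flag absent, or set to anything else: default behaviour.
    assert!(!fairness_requested(""));
    assert!(!fairness_requested("model=gpt&fairness=false"));
    println!("ok");
}
```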

Policy

For each (model, window) bucket:

  • p_u = pending count for user u (only users with p_u > 0)
  • U = number of such users
  • T = sum(p_u)
  • fair_share = max(1, ceil(T / U))
  • effective(u) = min(p_u, fair_share)
  • effective_count = sum(effective(u))

Reduces to T when demand is balanced; caps a dominant user's contribution at the bucket average otherwise.
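The policy above can be expressed directly in a few lines of Rust. This is an in-memory sketch of the formula, not the SQL that fusillade's `get_effective_pending_request_counts_by_model_and_window` actually runs:

```rust
/// Fair-share-capped effective depth for one (model, window) bucket.
/// Mirrors the policy: cap each user's pending count at
/// fair_share = max(1, ceil(T / U)), then sum the capped counts.
fn effective_count(pending_per_user: &[u64]) -> u64 {
    // Only users with p_u > 0 participate.
    let counts: Vec<u64> = pending_per_user.iter().copied().filter(|&p| p > 0).collect();
    if counts.is_empty() {
        return 0;
    }
    let total: u64 = counts.iter().sum();
    let users = counts.len() as u64;
    let fair_share = ((total + users - 1) / users).max(1); // ceil(T / U)
    counts.iter().map(|&p| p.min(fair_share)).sum()
}

fn main() {
    // Balanced demand: reduces to T.
    assert_eq!(effective_count(&[3, 3, 3]), 9);
    // Dominant user capped at the bucket average: ceil(106 / 3) = 36,
    // so the effective depth is 36 + 3 + 3 = 42, not 106.
    assert_eq!(effective_count(&[100, 3, 3]), 42);
    println!("ok");
}
```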

Dependencies

Depends on the upstream fusillade trait method get_effective_pending_request_counts_by_model_and_window:
👉 doublewordai/fusillade#251

CI will fail until that PR ships and dwctl/Cargo.toml is bumped to the new fusillade release. The new code path was verified locally against the fusillade branch via a `[patch.crates-io]` override.
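For reference, a `[patch.crates-io]` override of this shape allows building against the unreleased branch. The git URL and branch name below are illustrative placeholders, not taken from this PR:

```toml
# Hypothetical local override while doublewordai/fusillade#251 is unreleased.
[patch.crates-io]
fusillade = { git = "https://github.com/doublewordai/fusillade", branch = "feat/effective-pending-counts" }
```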

Test plan

  • New handler test test_pending_request_counts_fairness_caps_dominant_user passes locally
  • Existing queue::tests (4 tests including the new one) all pass
  • just lint rust passes (with local fusillade patch)
  • Re-run CI once fusillade is released and bumped here

pjb157 added 2 commits May 3, 2026 21:42
Adds an opt-in '?fairness=true' query parameter that switches the
endpoint to a per-user fair-share-capped depth signal. When set, each
user's contribution to a (model, window) bucket is capped at the
bucket's average pending count, so a single dominant user cannot
inflate the reported depth beyond what a fair scheduler will drain.

Default behaviour (flag absent) is unchanged.

Depends on fusillade trait method
get_effective_pending_request_counts_by_model_and_window
(doublewordai/fusillade#251).

Describes the per-user fair-share cap formula, its properties, and the
contract with claim_requests in fusillade — both call sites must move
together when the policy changes.
Copilot AI review requested due to automatic review settings May 3, 2026 23:56
Copilot AI left a comment

Pull request overview

This PR adds an opt-in fairness mode to the monitoring endpoint that reports pending request depth, so downstream consumers can request a per-user-capped view of queue depth instead of the existing raw counts. In the dwctl codebase, this extends the backend monitoring surface and updates the accompanying SCOUTER documentation for operators.

Changes:

  • Add a fairness query parameter to GET /admin/api/v1/monitoring/pending-request-counts.
  • Route fair-mode requests to a new fusillade storage method and include a new handler test for dominant-user capping.
  • Expand docs/SCOUTER.md with the fairness policy and its intended contract with fusillade scheduling.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

Reviewed files:

  • dwctl/src/api/handlers/queue.rs: adds query parsing, fair/non-fair branching in the monitoring handler, tracing, and a new integration-style test.
  • docs/SCOUTER.md: documents the new fairness-aware mode, policy details, and maintenance guidance tied to fusillade behavior.

Comment thread docs/SCOUTER.md
Comment on lines +24 to +25
- `GET /admin/api/v1/monitoring/pending-request-counts` — raw pending counts (default).
- `GET /admin/api/v1/monitoring/pending-request-counts?fairness=true` — counts with each user's contribution capped at the bucket's average.
Comment on lines +84 to +87

```rust
let counts = if q.fairness {
    state
        .request_manager
        .get_effective_pending_request_counts_by_model_and_window(&windows, &states, &model_filter, &service_tier_filter, false)
```

Comment on lines +84 to +98

```rust
let counts = if q.fairness {
    state
        .request_manager
        .get_effective_pending_request_counts_by_model_and_window(&windows, &states, &model_filter, &service_tier_filter, false)
        .await
} else {
    state
        .request_manager
        .get_pending_request_counts_by_model_and_window(&windows, &states, &model_filter, &service_tier_filter, false)
        .await
}
.map_err(|e| Error::Internal {
    operation: format!("get pending request counts: {}", e),
})?;
```

Comment thread docs/SCOUTER.md
- Caps a single dominant user's contribution at the bucket average so the
reported depth does not over-state demand a per-user fair scheduler will
throttle.
- Stateless wrt in-flight counts — pure function of pending rows.
Comment thread docs/SCOUTER.md
Comment on lines +56 to +57
1. Update the SQL in both `claim_requests` and
`get_effective_pending_request_counts_by_model_and_window`.
