Skip to content

feat: monitoring demand endpoint decoupled from completion windows#987

Open
fergusfinn wants to merge 4 commits into
mainfrom
feat/demand-by-lookahead-endpoint
Open

feat: monitoring demand endpoint decoupled from completion windows#987
fergusfinn wants to merge 4 commits into
mainfrom
feat/demand-by-lookahead-endpoint

Conversation

@fergusfinn
Copy link
Copy Markdown
Contributor

Summary

Adds `GET /admin/api/v1/monitoring/demand?window=<...>` — same per-model deadline-bucketed counts that the existing `/pending-request-counts` returns, but the caller picks the window shape via query parameter instead of implicitly inheriting `config.batches.allowed_completion_windows`.

Each `window=` entry is either:

  • `` — shorthand for `0s:`, matches the legacy "due within N" behaviour. Example: `window=1h,24h`.
  • `:` — disjoint range. Example: `window=0s:1h,1h:24h`.

Both bounds are offsets from `now`. Shapes can be mixed in a single call. The raw input string is echoed back as the response key, so scouter-style callers that send `window=1h,24h` still see `"1h"` / `"24h"` keys.

Why

Scouter reads this endpoint to plan replica allocation and only cares about "demand in the next N hours". The old endpoint picked its buckets from whatever completion-window SLAs the batch API exposed to users — e.g. if product adds a `6h` SLA tier for a customer, scouter's buckets would silently shift under it. This PR treats that coupling as the implementation detail it is and lets planning consumers name their own windows.

Dependencies

  • fusillade#217 — extends `Storage::get_pending_request_counts_by_model_and_completion_window` to take start/end per window and renames it. Pinned via `[patch.crates-io]` in this PR's `Cargo.toml`; remove once fusillade publishes.
  • scouter PR to follow — switch to `/demand?window=1h,24h`.

Kept untouched

`GET /admin/api/v1/monitoring/pending-request-counts` still returns the same shape and still pulls windows from `allowed_completion_windows` — no scouter-side deployment coordination needed. We can remove it once no caller uses it.

Test plan

  • Old `/pending-request-counts` tests still pass (renamed internal call to new fusillade signature with `start=0`).
  • New `/demand` tests cover: permission, missing param, empty param, arbitrary window shapes.
  • `just lint rust` passes.
  • `just test rust --lib queue` passes (10 tests).

Post-merge

  1. Release fusillade#217 to crates.io.
  2. Bump `fusillade` dep in `dwctl/Cargo.toml`, remove the `[patch.crates-io]` block in workspace `Cargo.toml`.
  3. Release control-layer, roll to production.
  4. Ship the scouter PR (it references `/demand?window=1h,24h` — identical semantics to today).
  5. (later) Remove the old `/pending-request-counts` endpoint once nothing reads it.

Adds \`GET /admin/api/v1/monitoring/demand?window=<...>\` — returns the
same per-model deadline-bucketed counts that scouter and friends need
for planning, but lets the caller choose the window shape instead of
implicitly taking whatever is in \`config.batches.allowed_completion_windows\`.

Each \`window=\` entry is either \`<end>\` (shorthand for \`0s:<end>\`,
matching the legacy "due within N" behaviour) or \`<start>:<end>\` for
a disjoint range. Both bounds are offsets from \`now\`. Caller can mix
shapes in one request and the raw string is echoed back as the
response key.

Example:
  GET /admin/api/v1/monitoring/demand?window=1h,24h
  → cumulative: due within 1h, due within 24h
  GET /admin/api/v1/monitoring/demand?window=0s:1h,1h:24h
  → disjoint: due 0-1h, due 1-24h

Follows the trait rename in fusillade#217
(\`get_pending_request_counts_by_model_and_completion_window\` →
\`get_pending_request_counts_by_model_and_window\`, windows now carry
both start and end).

Old \`/pending-request-counts\` endpoint is kept as-is so consumers can
migrate at their own pace; it now feeds the new signature with
\`start=0\` per window for behavioural parity.

Pins fusillade via \`[patch.crates-io]\` temporarily; will drop once
fusillade releases.
Copilot AI review requested due to automatic review settings April 14, 2026 11:41
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Apr 14, 2026

Deploying control-layer with  Cloudflare Pages  Cloudflare Pages

Latest commit: 81926ed
Status: ✅  Deploy successful!
Preview URL: https://1e5e0f13.control-layer.pages.dev
Branch Preview URL: https://feat-demand-by-lookahead-end.control-layer.pages.dev

View logs

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e37416144

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread Cargo.toml Outdated
# TODO(demand-endpoint): remove once fusillade releases the
# `get_pending_request_counts_by_model_and_window` signature (doublewordai/fusillade#217).
[patch.crates-io]
fusillade = { path = "../fusillade" }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Badge Remove local fusillade path override before merging

Adding [patch.crates-io] fusillade = { path = "../fusillade" } makes this repository unbuildable unless a sibling ../fusillade checkout is manually present. In this repo’s CI (.github/workflows/ci.yaml), jobs only do actions/checkout of this repository, so dependency resolution fails (e.g., cargo metadata --format-version 1 errors reading ../fusillade/Cargo.toml) and backend/build pipelines are blocked.

Useful? React with 👍 / 👎.

Comment thread dwctl/src/api/handlers/queue.rs Outdated
return None;
}
let (start_secs, end_secs) = match trimmed.split_once(':') {
Some((start, end)) => (parse_window_to_seconds(start), parse_window_to_seconds(end)),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Parse 0s demand ranges as zero, not 24h

parse_demand_window reuses parse_window_to_seconds for the range start, but that helper treats non-positive values as invalid and converts them to 24h. As a result, documented inputs like window=0s:1h are parsed as (86400, 3600), yielding an inverted/empty bucket instead of “now to 1h,” so the new disjoint-window API returns incorrect demand counts for valid caller input.

Useful? React with 👍 / 👎.

) -> Result<Json<PendingCountsByModelAndWindow>, Error> {
let windows: Vec<(String, i64, i64)> = params.window.split(',').filter_map(parse_demand_window).collect();

if windows.is_empty() {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject malformed demand windows instead of coercing

get_demand only returns 400 when no windows are parsed, but malformed entries are still parsed because parse_window_to_seconds silently defaults invalid tokens to 24h. That means requests like window=bad or window=1h:bad return 200 with misleading bucket calculations instead of a client error, which contradicts the endpoint contract and can hide caller mistakes.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new admin monitoring endpoint to query per-model pending/claimed/processing request demand using caller-defined deadline windows, decoupling monitoring buckets from batch completion-window configuration.

Changes:

  • Introduces GET /admin/api/v1/monitoring/demand?window=... and exposes it via router + OpenAPI.
  • Updates internal fusillade calls to use the new (label, start, end) window tuple signature.
  • Temporarily patches the fusillade dependency to a local path to consume an unreleased API change.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
dwctl/src/openapi/admin.rs Registers the new demand handler in the admin OpenAPI surface.
dwctl/src/lib.rs Routes /monitoring/demand to the new handler.
dwctl/src/api/handlers/queue.rs Implements get_demand, query parsing, and adds endpoint tests; updates legacy endpoint to new fusillade signature.
dwctl/src/api/handlers/batches.rs Updates window tuple shapes and fusillade callsites for capacity reservation logic + tests.
Cargo.toml Adds a workspace [patch.crates-io] override for fusillade.
Cargo.lock Reflects the path override by removing registry source/checksum for fusillade.

Comment thread dwctl/src/api/handlers/queue.rs Outdated
Comment on lines +40 to +49
fn parse_demand_window(raw: &str) -> Option<(String, i64, i64)> {
let trimmed = raw.trim();
if trimmed.is_empty() {
return None;
}
let (start_secs, end_secs) = match trimmed.split_once(':') {
Some((start, end)) => (parse_window_to_seconds(start), parse_window_to_seconds(end)),
None => (0, parse_window_to_seconds(trimmed)),
};
Some((trimmed.to_string(), start_secs, end_secs))
response.assert_status_ok();
let counts: HashMap<String, HashMap<String, i64>> = response.json();
assert_eq!(counts.len(), 0, "no pending requests exist in a clean database");
}
Comment thread Cargo.toml Outdated
# TODO(demand-endpoint): remove once fusillade releases the
# `get_pending_request_counts_by_model_and_window` signature (doublewordai/fusillade#217).
[patch.crates-io]
fusillade = { path = "../fusillade" }
Follow-up to fusillade#217 review feedback — the trait method now takes
Option<i64> for the lower bound to preserve the old "includes overdue"
semantic when start is None, and uses a half-open interval on the upper
bound so adjacent disjoint windows don't double-count.

control-layer changes:

- \`/pending-request-counts\`: windows now pass \`None\` for start, so
  overdue requests are still counted exactly like before the rename.
- \`/demand\`: shorthand \`window=N\` maps to \`(label, None, N_secs)\` —
  behavioural parity with the legacy endpoint for scouter and friends.
  Explicit \`window=S:E\` maps to \`(label, Some(S_secs), E_secs)\` for
  strict half-open bounds.
- \`reserve_capacity_for_batch\`: same signature change. The reservation
  path wants to include overdue work too.
fusillade 16.0.0 is now on crates.io with the
\`get_pending_request_counts_by_model_and_window\` signature this PR
uses (doublewordai/fusillade#217). Drop the \`[patch.crates-io]\` block
that pointed at the local checkout and pin the released version.

Cargo.lock also picks up a few windows-sys 0.60 → 0.61 transitive bumps
as a side-effect of re-resolution; unrelated to this PR.
Addresses copilot feedback on #987:

The original \`parse_demand_window\` reused \`parse_window_to_seconds\`
from the batch-API path, which is forgiving by design — it defaults
zero/negative/malformed input to 24h. That meant:

- \`window=0s:1h\` parsed \`0s\` as 86400 (the 24h default for non-positive
  values), flipping the intended "strict from now" semantic.
- \`window=foo\` silently succeeded with 24h buckets instead of being
  rejected.

Fix: a local \`parse_demand_duration\` that strictly parses \`<int><unit>\`
(h/m/s, non-negative integer) and returns \`None\` on anything malformed.
\`parse_demand_window\` propagates the error up as a string, and the
handler maps it to \`400 Bad Request\`.

Tests: added regressions for \`window=0s:1h\` (must succeed) and
\`window=foo\` / \`1x\` / \`1h,bad\` / \`-1h:1h\` (must 400).
@fergusfinn
Copy link
Copy Markdown
Contributor Author

Review feedback addressed in the follow-up commit:

  • P0 (local fusillade patch): already removed in the previous commit — dwctl/Cargo.toml now pins fusillade = "16.0.0" (on crates.io) and the [patch.crates-io] block is gone.
  • P1 (0s coerced to 24h): introduced a local strict parser parse_demand_duration that returns None on malformed/negative input and accepts 0. Demand's parse_demand_window now propagates parse failures, and the handler maps them to 400. window=0s:1h now parses 0s as zero seconds.
  • P2 (malformed input silently accepted): window=foo, window=1x, window=1h,bad, window=-1h:1h all 400 now. Lenient parse_window_to_seconds (batch-API forgiving-on-purpose) is untouched.
  • Test coverage: added test_demand_accepts_zero_start and test_demand_rejects_malformed_window.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants