feat: monitoring demand endpoint decoupled from completion windows by fergusfinn · Pull Request #987 · doublewordai/control-layer

fergusfinn · 2026-04-14T11:41:16Z

Summary

Adds `GET /admin/api/v1/monitoring/demand?window=<...>` — same per-model deadline-bucketed counts that the existing `/pending-request-counts` returns, but the caller picks the window shape via query parameter instead of implicitly inheriting `config.batches.allowed_completion_windows`.

Each `window=` entry is either:

`<end>`: shorthand for `0s:<end>`, matches the legacy "due within N" behaviour. Example: `window=1h,24h`.
`<start>:<end>`: disjoint range. Example: `window=0s:1h,1h:24h`.

Both bounds are offsets from `now`. Shapes can be mixed in a single call. The raw input string is echoed back as the response key, so scouter-style callers that send `window=1h,24h` still see `"1h"` / `"24h"` keys.
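To make the difference between the cumulative and disjoint shapes concrete, here is a minimal sketch that buckets deadline offsets (seconds from `now`) into labelled windows. `Window` and `count_demand` are hypothetical names for illustration only, not part of dwctl or fusillade, and the inclusive-start / exclusive-end boundary handling is an assumption rather than something this description specifies.

```rust
use std::collections::HashMap;

/// Hypothetical window: the raw label echoed back to the caller, plus
/// start/end offsets from `now` in seconds.
pub struct Window {
    pub label: String,
    pub start_secs: i64,
    pub end_secs: i64,
}

/// Count how many deadline offsets fall into each window, keyed by the raw label.
/// Assumption: a deadline belongs to a window when start <= offset < end.
pub fn count_demand(deadline_offsets: &[i64], windows: &[Window]) -> HashMap<String, i64> {
    let mut counts = HashMap::new();
    for w in windows {
        let n = deadline_offsets
            .iter()
            .filter(|&&d| d >= w.start_secs && d < w.end_secs)
            .count() as i64;
        counts.insert(w.label.clone(), n);
    }
    counts
}
```

With deadlines at 10 minutes, 2 hours and roughly 14 hours from now, `window=1h,24h` would count the 10-minute request in both buckets, while `window=0s:1h,1h:24h` would count each request exactly once.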

Why

Scouter reads this endpoint to plan replica allocation and only cares about "demand in the next N hours". The old endpoint picked its buckets from whatever completion-window SLAs the batch API exposed to users — e.g. if product adds a `6h` SLA tier for a customer, scouter's buckets would silently shift under it. This PR treats that coupling as the implementation detail it is and lets planning consumers name their own windows.

Dependencies

fusillade#217 — extends `Storage::get_pending_request_counts_by_model_and_completion_window` to take start/end per window and renames it. Pinned via `[patch.crates-io]` in this PR's `Cargo.toml`; remove once fusillade publishes.
scouter PR to follow — switch to `/demand?window=1h,24h`.

Kept untouched

`GET /admin/api/v1/monitoring/pending-request-counts` still returns the same shape and still pulls windows from `allowed_completion_windows` — no scouter-side deployment coordination needed. We can remove it once no caller uses it.

Test plan

Old `/pending-request-counts` tests still pass (renamed internal call to new fusillade signature with `start=0`).
New `/demand` tests cover: permission, missing param, empty param, arbitrary window shapes.
`just lint rust` passes.
`just test rust --lib queue` passes (10 tests).

Post-merge

Release fusillade#217 to crates.io.
Bump `fusillade` dep in `dwctl/Cargo.toml`, remove the `[patch.crates-io]` block in workspace `Cargo.toml`.
Release control-layer, roll to production.
Ship the scouter PR (it references `/demand?window=1h,24h` — identical semantics to today).
(later) Remove the old `/pending-request-counts` endpoint once nothing reads it.

Adds `GET /admin/api/v1/monitoring/demand?window=<...>`, which returns the same per-model deadline-bucketed counts that scouter and friends need for planning, but lets the caller choose the window shape instead of implicitly taking whatever is in `config.batches.allowed_completion_windows`.

Each `window=` entry is either `<end>` (shorthand for `0s:<end>`, matching the legacy "due within N" behaviour) or `<start>:<end>` for a disjoint range. Both bounds are offsets from `now`. The caller can mix shapes in one request, and the raw string is echoed back as the response key.

Example:

GET /admin/api/v1/monitoring/demand?window=1h,24h → cumulative: due within 1h, due within 24h
GET /admin/api/v1/monitoring/demand?window=0s:1h,1h:24h → disjoint: due 0-1h, due 1-24h

Follows the trait rename in fusillade#217 (`get_pending_request_counts_by_model_and_completion_window` → `get_pending_request_counts_by_model_and_window`, windows now carry both start and end). The old `/pending-request-counts` endpoint is kept as-is so consumers can migrate at their own pace; it now feeds the new signature with `start=0` per window for behavioural parity.

Pins fusillade via `[patch.crates-io]` temporarily; will drop once fusillade releases.

cloudflare-workers-and-pages · 2026-04-14T11:41:54Z

Deploying control-layer with Cloudflare Pages

Latest commit:	`81926ed`
Status:	✅ Deploy successful!
Preview URL:	https://1e5e0f13.control-layer.pages.dev
Branch Preview URL:	https://feat-demand-by-lookahead-end.control-layer.pages.dev


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e37416144


chatgpt-codex-connector · 2026-04-14T11:45:10Z

+# TODO(demand-endpoint): remove once fusillade releases the
+# `get_pending_request_counts_by_model_and_window` signature (doublewordai/fusillade#217).
+[patch.crates-io]
+fusillade = { path = "../fusillade" }


Remove local fusillade path override before merging

Adding `[patch.crates-io] fusillade = { path = "../fusillade" }` makes this repository unbuildable unless a sibling `../fusillade` checkout is manually present. In this repo's CI (`.github/workflows/ci.yaml`), jobs only run `actions/checkout` for this repository, so dependency resolution fails (e.g., `cargo metadata --format-version 1` errors reading `../fusillade/Cargo.toml`) and the backend/build pipelines are blocked.


chatgpt-codex-connector · 2026-04-14T11:45:10Z

+        return None;
+    }
+    let (start_secs, end_secs) = match trimmed.split_once(':') {
+        Some((start, end)) => (parse_window_to_seconds(start), parse_window_to_seconds(end)),


Parse 0s demand ranges as zero, not 24h

`parse_demand_window` reuses `parse_window_to_seconds` for the range start, but that helper treats non-positive values as invalid and converts them to 24h. As a result, documented inputs like `window=0s:1h` are parsed as `(86400, 3600)`, yielding an inverted/empty bucket instead of "now to 1h", so the new disjoint-window API returns incorrect demand counts for valid caller input.
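The failure mode can be reproduced with a small sketch. The real `parse_window_to_seconds` is not shown in this thread, so `lenient_window_to_seconds` below is a hypothetical stand-in that only assumes the behaviour described in this comment: malformed or non-positive input falls back to 24h. `lenient_demand_window` mirrors the shape of the reviewed hunk.

```rust
/// Hypothetical stand-in for the lenient batch-API helper (not the real code):
/// anything malformed or non-positive falls back to 24h (86400 seconds).
pub fn lenient_window_to_seconds(raw: &str) -> i64 {
    const DEFAULT_SECS: i64 = 24 * 3600;
    let raw = raw.trim();
    // Need at least one digit plus a unit; ASCII-only keeps split_at on a char boundary.
    if raw.len() < 2 || !raw.is_ascii() {
        return DEFAULT_SECS;
    }
    let (digits, unit) = raw.split_at(raw.len() - 1);
    let multiplier = match unit {
        "h" => 3600,
        "m" => 60,
        "s" => 1,
        _ => return DEFAULT_SECS,
    };
    match digits.parse::<i64>() {
        Ok(n) if n > 0 => n.saturating_mul(multiplier),
        _ => DEFAULT_SECS, // zero, negative, or unparseable: silently coerced
    }
}

/// Same structure as the reviewed hunk: both bounds go through the lenient helper.
pub fn lenient_demand_window(raw: &str) -> (i64, i64) {
    match raw.split_once(':') {
        Some((start, end)) => (
            lenient_window_to_seconds(start),
            lenient_window_to_seconds(end),
        ),
        None => (0, lenient_window_to_seconds(raw)),
    }
}
```

Under these assumptions `lenient_demand_window("0s:1h")` yields a start of 86400 and an end of 3600, so the start lies after the end and no deadline can match.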


chatgpt-codex-connector · 2026-04-14T11:45:10Z

+) -> Result<Json<PendingCountsByModelAndWindow>, Error> {
+    let windows: Vec<(String, i64, i64)> = params.window.split(',').filter_map(parse_demand_window).collect();
+
+    if windows.is_empty() {


Reject malformed demand windows instead of coercing

`get_demand` only returns 400 when no windows are parsed, but malformed entries are still parsed because `parse_window_to_seconds` silently defaults invalid tokens to 24h. That means requests like `window=bad` or `window=1h:bad` return 200 with misleading bucket calculations instead of a client error, which contradicts the endpoint contract and can hide caller mistakes.


Copilot

Pull request overview

Adds a new admin monitoring endpoint to query per-model pending/claimed/processing request demand using caller-defined deadline windows, decoupling monitoring buckets from batch completion-window configuration.

Changes:

Introduces GET /admin/api/v1/monitoring/demand?window=... and exposes it via router + OpenAPI.
Updates internal fusillade calls to use the new (label, start, end) window tuple signature.
Temporarily patches the fusillade dependency to a local path to consume an unreleased API change.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| dwctl/src/openapi/admin.rs | Registers the new demand handler in the admin OpenAPI surface. |
| dwctl/src/lib.rs | Routes `/monitoring/demand` to the new handler. |
| dwctl/src/api/handlers/queue.rs | Implements `get_demand`, query parsing, and adds endpoint tests; updates legacy endpoint to new fusillade signature. |
| dwctl/src/api/handlers/batches.rs | Updates window tuple shapes and fusillade callsites for capacity reservation logic + tests. |
| Cargo.toml | Adds a workspace `[patch.crates-io]` override for `fusillade`. |
| Cargo.lock | Reflects the path override by removing registry source/checksum for `fusillade`. |

+fn parse_demand_window(raw: &str) -> Option<(String, i64, i64)> {
+    let trimmed = raw.trim();
+    if trimmed.is_empty() {
+        return None;
+    }
+    let (start_secs, end_secs) = match trimmed.split_once(':') {
+        Some((start, end)) => (parse_window_to_seconds(start), parse_window_to_seconds(end)),
+        None => (0, parse_window_to_seconds(trimmed)),
+    };
+    Some((trimmed.to_string(), start_secs, end_secs))


+        response.assert_status_ok();
+        let counts: HashMap<String, HashMap<String, i64>> = response.json();
+        assert_eq!(counts.len(), 0, "no pending requests exist in a clean database");
+    }


+# TODO(demand-endpoint): remove once fusillade releases the
+# `get_pending_request_counts_by_model_and_window` signature (doublewordai/fusillade#217).
+[patch.crates-io]
+fusillade = { path = "../fusillade" }


Follow-up to fusillade#217 review feedback: the trait method now takes `Option<i64>` for the lower bound to preserve the old "includes overdue" semantic when start is `None`, and uses a half-open interval on the upper bound so adjacent disjoint windows don't double-count.

control-layer changes:

- `/pending-request-counts`: windows now pass `None` for start, so overdue requests are still counted exactly like before the rename.
- `/demand`: shorthand `window=N` maps to `(label, None, N_secs)`, giving behavioural parity with the legacy endpoint for scouter and friends. Explicit `window=S:E` maps to `(label, Some(S_secs), E_secs)` for strict half-open bounds.
- `reserve_capacity_for_batch`: same signature change. The reservation path wants to include overdue work too.
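The bound semantics described in this commit can be sketched as a plain predicate. This is an illustration only: the actual check lives in fusillade's storage layer and is not shown in this thread. `in_window` is a hypothetical name, and treating the lower bound as inclusive is an assumption (the message only says the upper bound is half-open).

```rust
/// Hypothetical predicate: does a deadline, expressed as an offset from `now`
/// in seconds (negative means overdue), fall inside the window?
pub fn in_window(deadline_offset_secs: i64, start_secs: Option<i64>, end_secs: i64) -> bool {
    let above_start = match start_secs {
        // Assumed inclusive lower bound for explicit `S:E` windows.
        Some(start) => deadline_offset_secs >= start,
        // No lower bound: overdue requests are still counted.
        None => true,
    };
    // Exclusive upper bound, so a deadline exactly at `end` belongs to the next window.
    above_start && deadline_offset_secs < end_secs
}
```

Under this reading, a request one minute overdue is counted by the shorthand `window=1h` but not by `window=0s:1h`, and a request due in exactly one hour lands in `1h:24h` only.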

fusillade 16.0.0 is now on crates.io with the `get_pending_request_counts_by_model_and_window` signature this PR uses (doublewordai/fusillade#217). Drop the `[patch.crates-io]` block that pointed at the local checkout and pin the released version. Cargo.lock also picks up a few windows-sys 0.60 → 0.61 transitive bumps as a side-effect of re-resolution; unrelated to this PR.

Addresses copilot feedback on #987:

The original `parse_demand_window` reused `parse_window_to_seconds` from the batch-API path, which is forgiving by design: it defaults zero/negative/malformed input to 24h. That meant:

- `window=0s:1h` parsed `0s` as 86400 (the 24h default for non-positive values), flipping the intended "strict from now" semantic.
- `window=foo` silently succeeded with 24h buckets instead of being rejected.

Fix: a local `parse_demand_duration` that strictly parses `<int><unit>` (h/m/s, non-negative integer) and returns `None` on anything malformed. `parse_demand_window` propagates the error up as a string, and the handler maps it to `400 Bad Request`.

Tests: added regressions for `window=0s:1h` (must succeed) and `window=foo` / `1x` / `1h,bad` / `-1h:1h` (must 400).
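A strict parser along these lines might look like the sketch below. The function names follow the commit message, but the bodies are not shown in this thread, so everything here is an assumption: the error text, the hypothetical `parse_demand_windows` helper for the comma-separated parameter, and the `(label, Option<i64>, i64)` tuple shape (borrowed from the earlier follow-up commit). It does not check that start is less than end, since the thread does not say whether the real code does.

```rust
/// Sketch only, not the PR's actual code: strictly parse `<int><unit>`
/// (unit h/m/s, non-negative integer) into seconds.
pub fn parse_demand_duration(raw: &str) -> Option<i64> {
    let raw = raw.trim();
    let unit = raw.chars().last()?; // empty input -> None
    let multiplier: i64 = match unit {
        'h' => 3600,
        'm' => 60,
        's' => 1,
        _ => return None,
    };
    // The unit is a single ASCII byte, so this slice ends on a char boundary.
    let digits = &raw[..raw.len() - 1];
    // Digits only: rejects "", "-1", "+1", "1.5" and similar.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    digits.parse::<i64>().ok()?.checked_mul(multiplier)
}

/// Hypothetical error message; the real wording is not shown in the thread.
fn invalid(raw: &str) -> String {
    format!("invalid window: {raw:?}")
}

/// Parse one `window=` entry into (label, optional start, end).
pub fn parse_demand_window(raw: &str) -> Result<(String, Option<i64>, i64), String> {
    let trimmed = raw.trim();
    match trimmed.split_once(':') {
        Some((start, end)) => {
            let s = parse_demand_duration(start).ok_or_else(|| invalid(trimmed))?;
            let e = parse_demand_duration(end).ok_or_else(|| invalid(trimmed))?;
            Ok((trimmed.to_string(), Some(s), e))
        }
        None => {
            let e = parse_demand_duration(trimmed).ok_or_else(|| invalid(trimmed))?;
            Ok((trimmed.to_string(), None, e))
        }
    }
}

/// Hypothetical helper: fail the whole parameter if any entry is malformed,
/// which a handler could then map to 400 Bad Request.
pub fn parse_demand_windows(param: &str) -> Result<Vec<(String, Option<i64>, i64)>, String> {
    param.split(',').map(parse_demand_window).collect()
}
```

Collecting into `Result<Vec<_>, _>` stops at the first malformed entry, so `window=1h,bad` is rejected as a whole instead of returning a partial set of buckets.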

fergusfinn · 2026-04-14T12:54:11Z

Review feedback addressed in the follow-up commit:

P0 (local fusillade patch): already removed in the previous commit; `dwctl/Cargo.toml` now pins `fusillade = "16.0.0"` (on crates.io) and the `[patch.crates-io]` block is gone.
P1 (`0s` coerced to 24h): introduced a local strict parser `parse_demand_duration` that returns `None` on malformed/negative input and accepts 0. Demand's `parse_demand_window` now propagates parse failures, and the handler maps them to 400. `window=0s:1h` now parses `0s` as zero seconds.
P2 (malformed input silently accepted): `window=foo`, `window=1x`, `window=1h,bad`, `window=-1h:1h` all 400 now. Lenient `parse_window_to_seconds` (batch-API forgiving-on-purpose) is untouched.
Test coverage: added `test_demand_accepts_zero_start` and `test_demand_rejects_malformed_window`.

Copilot AI review requested due to automatic review settings April 14, 2026 11:41

Copilot started reviewing on behalf of fergusfinn April 14, 2026 11:41

chatgpt-codex-connector Bot reviewed Apr 14, 2026

Copilot AI reviewed Apr 14, 2026

fergusfinn added 3 commits April 14, 2026 12:59

fergusfinn wants to merge 4 commits into main from feat/demand-by-lookahead-endpoint
