feat(abuse): implement cost reporting for spend-based heuristics#22
feat(abuse): implement cost reporting for spend-based heuristics#22
Conversation
Code Review SummaryStatus: 1 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
Files Reviewed (4 files)
|
210f9a1 to
86a2067
Compare
| @@ -205,26 +205,15 @@ export async function POST(request: NextRequest): Promise<NextResponseType<unkno | |||
| ); | |||
| console.debug(`Routing request to ${provider.id}`); | |||
|
|
|||
| // Fire-and-forget abuse classification as early as possible | |||
| void classifyAbuse(request, requestBodyParsed, { | |||
| // Start abuse classification early (non-blocking) - we'll await it before creating usage context | |||
There was a problem hiding this comment.
SUGGESTION: Misleading comment about when abuse classification is awaited
classifyPromise is awaited after the upstream response (via Promise.race([...])), not before creating usageContext. Updating this comment avoids confusion for future maintainers.
| // Start abuse classification early (non-blocking) - we'll await it before creating usage context | |
| // Start abuse classification early (non-blocking) - we'll await it later to attach request_id to usageContext |
| // Await abuse classification (with timeout) to get request_id for cost tracking correlation | ||
| const classifyResult = await Promise.race([ | ||
| classifyPromise, | ||
| new Promise<null>(resolve => setTimeout(() => resolve(null), 2000)), |
There was a problem hiding this comment.
WARNING: setTimeout in Promise.race isn't cleared when classifyPromise resolves first
If classification returns quickly, the setTimeout is still scheduled and will fire later. In serverless/edge runtimes this can keep extra work queued and may extend the invocation unnecessarily. Consider storing the timeout id and clearTimeout when classifyPromise settles (or use an abortable timeout helper).
There was a problem hiding this comment.
Ah this is true. Let me fix this
| @@ -392,6 +381,25 @@ export async function POST(request: NextRequest): Promise<NextResponseType<unkno | |||
|
|
|||
| const clonedReponse = response.clone(); // reading from body is side-effectful | |||
|
|
|||
| // Await abuse classification (with timeout) to get request_id for cost tracking correlation | |||
| const classifyResult = await Promise.race([ | |||
There was a problem hiding this comment.
WARNING: Awaiting abuse classification adds up to 2s latency on the critical path
await Promise.race([...]) runs before accountForMicrodollarUsage(...) and before returning the proxied response, so each request can incur up to ~2s extra tail latency. Consider moving the timeout-await into the fire-and-forget usage accounting path (e.g., have accountForMicrodollarUsage await classifyPromise internally), so client response timing isn't affected while still capturing request_id when available.
There was a problem hiding this comment.
@jrf0110 i guess this is true? could we fire this after so that we can get request_id eventually (if ever) and then call countAndStoreUsage ?
There was a problem hiding this comment.
The p99's for the requests to the abuse service are way less than 2s, so in practice, the classify response is going to be resolved already. I didn't want to have to pass in the promise into accountForMicrodollarUsage to resolve it there as it seemed like that would be some mega-mixed concerns programming.
There was a problem hiding this comment.
pandemicsyn
left a comment
There was a problem hiding this comment.
One inline where kilo flagged something. But pre-approving because im not super confident.
Add functionality to report actual request costs back to the abuse service. This enables the abuse service to track real-time spend and trigger heuristics like free tier exhaustion.
Remove the ABUSE_SERVICE_SECRET and its associated X-Service-Secret header. Authentication is now managed via Cloudflare Access headers.
Executed via three sub-agent runs: 1. Browse (Flow 2) — sub-agent verified browse-direct matches upstream count 2. Full lifecycle (Flows 3, 4, 6, 7) — sub-agent verified post -> claim -> done -> accept on item w-870be07fbc through PRs #8-11. Stamp s-35e8a923... issued with author=jrf0110, subject=jfawcett. 3. Branches (Flows 5, 8, 9) — sub-agent verified unclaim (item w-68aa4ab1dd, PR #14), reject (w-d2cf6acf6a, PR #18), and close (w-89e6720ca4, PR #22) on fresh items. Only Flow 10 (disconnect) remains — it is a pure gastown operation and doesn't touch upstream DoltHub state, so lower priority. Total: 19 PRs merged, 4 rigs in play (jfawcett contributor, jrf0110 maintainer), full lifecycle graph exercised.

Add functionality to report actual request costs back to the abuse service. This enables the abuse service to track real-time spend and trigger heuristics like free tier exhaustion.