client/resource_group: add RC paging pre-charge with PredictedReadBytes hint#10611
client/resource_group: add RC paging pre-charge with PredictedReadBytes hint#10611YuhaoZhang00 wants to merge 18 commits into
Conversation
Introduce an optional predictedReadBytesProvider interface on RequestInfo. When a caller (e.g. TiDB maintaining a per-logical-scan EMA across paging RPCs) supplies a non-zero PredictedReadBytes, BeforeKVRequest pre-charges ReadBytesCost * predictedReadBytes RRU so that concurrent workers are throttled at BeforeKVRequest rather than all hitting AfterKVRequest settlement at once. AfterKVRequest then subtracts the same basis so the net (pre-charge + settlement) equals baseCost + actualCost. Without a hint the request is not pre-charged and is billed in AfterKVRequest by actual read bytes only - this keeps the RU-billing pre-charge decoupled from the protocol-level paging byte cap. The hint is added as an optional interface (not a method on RequestInfo) so existing RequestInfo implementations compile unchanged; they simply skip pre-charge. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
…ment When a request carries a PredictedReadBytes hint, BeforeKVRequest consumes tokens up-front as a pre-charge. If the actual read bytes come back smaller than the estimate, the delta represents tokens that were reserved but never consumed. Previously AfterKVRequest computed a negative delta but called RemoveTokens unconditionally, which further debited the limiter instead of giving tokens back. This commit adds Limiter.RefundTokens as the inverse of RemoveTokens and wires the response-side paths (onResponseImpl, onResponseWaitImpl) to call it whenever the settlement delta is negative, so over-estimated pre-charges are released back to the group's token bucket. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Add per-resource-group Prometheus metrics so operators can observe how the
paging pre-charge path behaves in production and judge EMA prediction
accuracy:
- paging_precharge_total / paging_precharge_bytes_total: count and byte
volume of RPCs that arrived with a PredictedReadBytes hint > 0 and
were pre-charged at BeforeKVRequest.
- paging_actual_bytes_total: actual read bytes reported by pre-charged
RPCs, to compute an over/under-charge ratio against the pre-charge
volume.
- paging_prediction_residual_bytes: histogram of (actual - predicted)
bytes for pre-charged RPCs; shows EMA prediction accuracy directly.
- paging_nonprecharge_total / paging_nonprecharge_actual_bytes_total:
count and byte volume of read RPCs that implemented the predicted
hint interface but reported 0 (e.g. EMA cold-start) and therefore
ran without pre-charge. Paired with paging_precharge_total this
yields the cold/ready RPC split from the PD client side.
Labels are cached per resource group in groupMetricsCollection to keep
the hot path out of WithLabelValues. Only RequestInfo implementations
that opt into the PredictedReadBytes interface contribute to these
series; existing callers are unaffected.
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @YuhaoZhang00. Thanks for your PR. I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds optional predicted-read-bytes pre-charge for read requests, new Prometheus paging metrics, a Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant GC as GroupController
participant TL as TokenLimiter
participant M as Metrics
Client->>GC: onRequestWaitImpl(info)
Note over GC: if estimatedReadBytes(info) > 0
GC->>TL: acquireTokens(precharge RU)
TL-->>GC: success / error
GC->>M: observe pre-charge count & bytes
Client->>GC: onResponseImpl(req, resp)
GC->>GC: compute consumption delta (actual - precharge)
alt delta < 0 (excess precharge)
GC->>TL: RefundTokens(excess)
TL-->>GC: refunded
else delta > 0 (deficit)
GC->>TL: RemoveTokens(deficit)
TL-->>GC: removed / error
end
GC->>M: observe actual bytes, prediction residual, settlement RU
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
client/resource_group/controller/group_controller.go (1)
602-626:⚠️ Potential issue | 🟡 MinorPre-charge counter increments even when the request is then throttled.
observePagingPrechargeruns beforeacquireTokens. IfacquireTokensfails withErrClientResourceGroupThrottled,mu.consumptionis rolled back viasub(...), butPagingPrechargeCounter/PagingPrechargeBytesCounterare already incremented, andonResponseImpl/onResponseWaitImplwill never run for this request to balance it with aPagingActualBytesCounterentry. Dashboards comparing actual-vs-precharge bytes (the documented use of these metrics) will show a chronic deficit proportional to throttling rate.Consider moving the observation to after a successful acquire (or skipping it on the throttled-rollback path), or alternatively add a separate
PagingPrechargeThrottledCounterso the denominator stays honest.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/resource_group/controller/group_controller.go` around lines 602 - 626, The pre-charge metrics are recorded by observePagingPrecharge before calling acquireTokens, causing unbalanced precharge counts when acquireTokens fails and consumption is rolled back (sub on gc.mu.consumption); update the logic in group_controller.go so that observePagingPrecharge is only called after a successful acquireTokens (i.e., move the observePagingPrecharge call to after acquireTokens returns without error), or alternatively decrement or record a separate PagingPrechargeThrottledCounter in the error branch that handles errs.ErrClientResourceGroupThrottled (the branch that calls sub and returns the error) to keep PagingPrechargeCounter/PagingPrechargeBytesCounter consistent with actual completed requests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/resource_group/controller/group_controller.go`:
- Around line 602-604: The metric observations (calls to
gc.metrics.observePagingPrecharge / observePagingPrechargeBytes /
observePagingActualBytes / observePagingPredictionResidual) should only run for
read requests to match the pre-charge behavior in
BeforeKVRequest/AfterKVRequest; update the branches that call
estimatedReadBytes(info) (notably in onResponseImpl and onResponseWaitImpl and
the other occurrences mentioned) to first check that the request is a read
(e.g., !req.IsWrite()) before observing any paging metrics, so pre-charge and
metric recording are gated symmetrically with the RU accounting in model.go.
---
Outside diff comments:
In `@client/resource_group/controller/group_controller.go`:
- Around line 602-626: The pre-charge metrics are recorded by
observePagingPrecharge before calling acquireTokens, causing unbalanced
precharge counts when acquireTokens fails and consumption is rolled back (sub on
gc.mu.consumption); update the logic in group_controller.go so that
observePagingPrecharge is only called after a successful acquireTokens (i.e.,
move the observePagingPrecharge call to after acquireTokens returns without
error), or alternatively decrement or record a separate
PagingPrechargeThrottledCounter in the error branch that handles
errs.ErrClientResourceGroupThrottled (the branch that calls sub and returns the
error) to keep PagingPrechargeCounter/PagingPrechargeBytesCounter consistent
with actual completed requests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f857a943-c822-4ef7-939c-79712b2a1f70
📒 Files selected for processing (7)
client/resource_group/controller/group_controller.goclient/resource_group/controller/group_controller_test.goclient/resource_group/controller/limiter.goclient/resource_group/controller/limiter_test.goclient/resource_group/controller/metrics/metrics.goclient/resource_group/controller/model.goclient/resource_group/controller/testutil.go
Temporarily replace github.com/tikv/client-go/v2 and github.com/tikv/pd/client with the corresponding forks carrying the PredictedReadBytes hint and controller-side pre-charge logic so CI can build this PR before the two upstream PRs land: - github.com/YuhaoZhang00/client-go/v2 (tikv/client-go#1947) - github.com/YuhaoZhang00/pd/client (tikv/pd#10611) Will be reverted and replaced with a normal require bump once both PRs merge and tag new releases. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Temporarily replace github.com/tikv/client-go/v2 and github.com/tikv/pd/client with the corresponding forks carrying the PredictedReadBytes hint and controller-side pre-charge logic so CI can build this PR before the two upstream PRs land: - github.com/YuhaoZhang00/client-go/v2 (tikv/client-go#1947) - github.com/YuhaoZhang00/pd/client (tikv/pd#10611) Will be reverted and replaced with a normal require bump once both PRs merge and tag new releases. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
client/resource_group/controller/limiter.go (1)
328-340: LGTM on the refund primitive; consider symmetry note.Implementation mirrors
RemoveTokenswith the opposite sign and skipsmaybeNotify(refund can't make us low), which is the right call. Sincetokenscan temporarily exceedburst, the cap is applied implicitly on the nextgetTokens()call — that's fine but worth a one-line comment so future readers don't mistake it for a leak of the burst cap.📝 Optional doc tweak
// RefundTokens adds tokens back to the limiter. // -// No burst cap is applied here +// No burst cap is applied here; any overshoot is clamped to burst on the +// next getTokens() call. Does not call maybeNotify since adding tokens +// cannot put the limiter into the low-token state. func (lim *Limiter) RefundTokens(now time.Time, amount float64) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@client/resource_group/controller/limiter.go` around lines 328 - 340, The RefundTokens implementation mirrors RemoveTokens but omits maybeNotify and allows lim.tokens to temporarily exceed lim.burst because the cap is enforced later in getTokens; add a single-line comment inside RefundTokens (near the lim.tokens = tokens + amount line) stating that exceeding burst is intentional and will be capped by getTokens on the next access, and note that maybeNotify is intentionally not called for refunds to clarify intent for future readers (reference symbols: RefundTokens, RemoveTokens, maybeNotify, getTokens, burst).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@client/resource_group/controller/group_controller.go`:
- Around line 689-710: The change skips calling acquireTokens and recording
successfulRequestDuration when getRUValueFromConsumption(delta) returns exactly
0; to preserve prior metrics behavior, add a v == 0 branch after the existing
if/else-if so that when v == 0 you call
gc.metrics.successfulRequestDuration.Observe(0) (and leave waitDuration
unchanged), referencing the existing symbols getRUValueFromConsumption,
acquireTokens, and gc.metrics.successfulRequestDuration.Observe to locate where
to add the single-line observation.
---
Nitpick comments:
In `@client/resource_group/controller/limiter.go`:
- Around line 328-340: The RefundTokens implementation mirrors RemoveTokens but
omits maybeNotify and allows lim.tokens to temporarily exceed lim.burst because
the cap is enforced later in getTokens; add a single-line comment inside
RefundTokens (near the lim.tokens = tokens + amount line) stating that exceeding
burst is intentional and will be capped by getTokens on the next access, and
note that maybeNotify is intentionally not called for refunds to clarify intent
for future readers (reference symbols: RefundTokens, RemoveTokens, maybeNotify,
getTokens, burst).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4f54c1ea-7b65-4175-ba34-2b2aaf771bc9
📒 Files selected for processing (4)
client/resource_group/controller/group_controller.goclient/resource_group/controller/limiter.goclient/resource_group/controller/metrics/metrics.goclient/resource_group/controller/model.go
🚧 Files skipped from review as they are similar to previous changes (1)
- client/resource_group/controller/metrics/metrics.go
| // Paging settlement | ||
| if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 { | ||
| consumption.RRU -= float64(kc.ReadBytesCost) * float64(bytesForEst) | ||
| } |
There was a problem hiding this comment.
If I remember correctly, the logic here is only reached when the response is non-empty (meaning no error occurred).
If a request fails, shouldn't the previously deducted RU be refunded?
There was a problem hiding this comment.
The refund doesn't happen here. The actual refund lives in onResponse{Wait}Impl, where v < 0 -> RefundTokens. So on failure, the pre-charge is correctly returned via that path.
Added TestPagingPreChargeRefundOnFailedRead unit test to verify.
| if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 { | ||
| gc.metrics.observePagingActual(bytesForEst, resp.ReadBytes()) | ||
| } else if !req.IsWrite() { | ||
| if _, ok := req.(predictedReadBytesProvider); ok { |
There was a problem hiding this comment.
Looking at the code, all requests implement this interface, meaning they will all enter this branch under any path.
Given that, what is the purpose of this non-precharge metric?
The previous +/-4MB range clipped large first-page responses on cold copIterators (predicted=0 -> residual == actual, commonly several MB) and workload shifts where EMA sits above the current actual. Extending both ends by two factor-4 steps keeps the same near-zero resolution while making P95/P99 readable up to the TiKV paging cap. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
EMA is a TiDB-side implementation detail; PD's metric Help text should describe what the metric observes in terms of the hint contract. paging_nonprecharge_* also fires when hint is absent entirely (not just when the predictor produced 0), so reword to say so. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
a8777a9 to
1764549
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #10611 +/- ##
==========================================
+ Coverage 78.96% 79.13% +0.16%
==========================================
Files 532 535 +3
Lines 71883 73252 +1369
==========================================
+ Hits 56766 57971 +1205
- Misses 11093 11219 +126
- Partials 4024 4062 +38
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Mirror the existing bytes-dimension paging metrics in RU units: - paging_precharge_ru_total: RU pre-acquired at BeforeKVRequest for pre-charged paging read requests. - paging_settlement_ru_total: full RU finally consumed per pre-charged paging read request (base + CPU + ReadBytesCost * actual_bytes). - paging_settlement_ru_delta: histogram of signed per-RPC delta (settlement_ru - precharge_ru); negative bucket = RefundTokens flow, positive = RemoveTokens/acquireTokens flow. The histogram captures the per-RPC settlement magnitude and direction, which cannot be reconstructed from the two aggregate counters alone (sum and max(0,-v) don't commute). Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
209e7ce to
6efae1c
Compare
Order as count -> bytes -> RU, matching the conceptual grouping of sampling units. Add a one-line doc comment to each exported metric to satisfy revive lint. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
6efae1c to
ded66bf
Compare
Move the !IsWrite() guard into estimatedReadBytes so paging pre-charge, settlement, and metric observations stay symmetric even if a future write type implements the optional predictedReadBytesProvider. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Failed reads with a paging hint still go through AfterKVRequest (proven by the existing write !res.Succeed() payBackWriteCost branch), so the paging settlement subtracts ReadBytesCost*predicted and the resulting negative delta flows through RefundTokens. ReadBaseCost is intentionally not refunded, matching existing non-paging read failure behavior. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Drop the predictedReadBytesProvider optional interface and the type assertion guards at the paging metric observation sites. The hint is now part of the RequestInfo contract; client-go and tidb are upgraded simultaneously so no back-compat shim is needed. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
…ucceeds
Observing the precharge before acquireTokens lets throttled and
ctx-cancelled requests inflate PagingPrechargeCounter /
PagingPrechargeBytesCounter / PagingPrechargeRU with no matching
settlement sample on the response side (since OnResponse is never
called for those requests). Delay the observation until after
acquireTokens returns nil, matching the unconditional observation of
paging_actual metrics in onResponse{,Wait}Impl. Burstable mode still
observes, preserving symmetry with the burstable-agnostic settlement
side.
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
| if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 { | ||
| gc.metrics.observePagingActual(bytesForEst, resp.ReadBytes(), | ||
| getRUValueFromConsumption(count), getRUValueFromConsumption(delta)) | ||
| } else if !req.IsWrite() { |
There was a problem hiding this comment.
After PredictedReadBytes became part of every RequestInfo, this branch counts every read with a zero prediction, not only paging reads. Can we add an explicit paging/hint-present signal before recording paging_nonprecharge_*?
| } | ||
| _, tokens := lim.getTokens(now) | ||
| lim.last = now | ||
| lim.tokens = tokens + amount |
There was a problem hiding this comment.
Refunding tokens updates the bucket, but existing waiters still sleep on their original timer in WaitReservations. Can RefundTokens notify or wake waiters so over-estimated precharges release blocked requests promptly?
| runningKVRequestCounter: metrics.GroupRunningKVRequestCounter.WithLabelValues(name), | ||
| consumeTokenHistogram: metrics.TokenConsumedHistogram.WithLabelValues(name), | ||
|
|
||
| prechargeCounter: metrics.PagingPrechargeCounter.WithLabelValues(name), |
There was a problem hiding this comment.
These new per-resource-group metric series also need cleanup on group deletion/tombstone cleanup. Otherwise deleted groups can leave stale Paging* series behind.
After PredictedReadBytes was promoted to a required RequestInfo method, the paging_nonprecharge_* branch fired for every non-write RPC reaching the resource-control interceptor: point gets, batch gets, scans, and internal lookups all looked identical to a paging coprocessor request whose hint happened to be zero. The "paging cold-start" counter inflated by roughly the full read-RPC volume of the cluster, breaking the precharge-vs-nonprecharge comparison panel in the TiDB dashboard. Expose IsCop() on the RequestInfo interface so the nonprecharge branch can scope itself to coprocessor requests. The hint-present branch is unchanged because hint>0 already implies a coprocessor caller; only the zero-hint branch needs the explicit cmd-type signal, which client-go's implementation will derive from tikvrpc.Request.Type. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Sync the metric variable docs, Prometheus Help strings, and the observePagingNonprecharge function doc with the new IsCop()-gated scope: every paging_* metric describes coprocessor RPCs, and the nonprecharge branch explicitly calls out the IsCop() precondition and the EMA cold-start meaning. No behavior change — comments only. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
RefundTokens previously updated the bucket and returned without notifying anyone. acquireTokens retry loops select on reconfiguredCh to wake before the next retry-interval tick when fresh tokens arrive (this is how Reconfigure unblocks them); refunded tokens however stayed invisible until the timer fired, costing one retry interval of unnecessary throttling per over-estimated paging settlement. Mirror the Reconfigure close+recreate pattern at the end of RefundTokens so the new tokens become immediately visible to retry waiters. Behavior for in-flight WaitReservations sleepers is deliberately unchanged — reservations carry their pre-computed timeToAct and re-evaluating them on refund is a separate design question. Also expand the RefundTokens doc to document the intentional above-burst overshoot (clamped by getTokens on the next access) and the deliberate omission of maybeNotify. Adds TestRefundTokensWakesAcquireRetry covering both the wake path and the burstable-limiter no-op case. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
cleanUpResourceGroup previously only removed ResourceGroupStatusGauge when a group was tombstoned or went inactive. The nine new paging_* per-resource-group label series introduced earlier in this PR were left behind and lingered in Prometheus until process restart. Add deletePagingLabels on groupMetricsCollection mirroring the initMetrics ordering and call it from cleanUpResourceGroup alongside the existing ResourceGroupStatusGauge deletion. The helper sits next to initMetrics so adding a new paging metric there forces a paired deletion here. Older per-group series (SuccessfulRequestDuration, FailedRequestCounter, TokenConsumedHistogram, ...) have the same leak but predate this PR; they will be addressed in a separate change. Adds TestDeletePagingLabelsResetsSeries covering every paging counter to catch a forgotten DeleteLabelValues in the helper. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
…C failure OnResponseWait is only called on the client-go side when the underlying RPC produced a response. A transport-level failure (connection drop, timeout, context cancellation) leaves the predicted RU pre-charged in BeforeKVRequest permanently debited from the token bucket and from the per-group consumption, because the existing settlement path never runs. Add OnRequestCancel(ctx, name, info) to ResourceGroupKVInterceptor and implement it on ResourceGroupsController. The implementation recomputes the BeforeKVRequest delta with the same RequestInfo, subtracts it from gc.mu.consumption, and refunds the corresponding tokens to the limiter. The per-store snapshot maintained by onRequestWaitImpl is intentionally left in place — it is bookkeeping for penalty distribution and the existing OnRequestWait path already accepts similar imprecision on the rare failure case. Adds TestOnRequestCancelRefundsPreCharge covering the round-trip invariant (tokens and consumption both restored to pre-OnRequestWait values after cancel). Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
interceptedClient.SendRequest and SendRequestAsync today call OnResponseWait only when the underlying RPC returns a non-nil response. That made sense in the legacy billing model (only base RRU / WRU at risk on transport failure) but became a real RU leak after the PD-side paging pre-charge: the predicted RU debited in OnRequestWait stayed permanently charged whenever the RPC failed before producing a response (connection drop, timeout, context cancellation). When resp == nil, invoke the new ResourceGroupKVInterceptor.OnRequestCancel so PD refunds the predicted RU to the limiter and rolls back the consumption row it added in OnRequestWait. Cancel and settlement are mutually exclusive on a given request — the if/else structure makes that invariant explicit at every call site. Pin the PD client replace to the matching tikv/pd#10611 commit; the replace will be removed once that PR is merged and tagged. Adds TestSendRequest{Cancels,DoesNotCancel}{,Async}* covering the three combinations (sync failure / sync success / async failure) through a recordingInterceptor mock so the wiring is verified end-to-end without touching a real PD. Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
|
@YuhaoZhang00: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| calc.BeforeKVRequest(count, req) | ||
| } | ||
| if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 { | ||
| gc.metrics.observePagingActual(bytesForEst, resp.ReadBytes(), |
There was a problem hiding this comment.
Can we move this paging settlement metric observation until after the positive-delta acquireTokens path succeeds? In onResponseWaitImpl, acquireTokens below can still return an error and the function returns before updating gc.mu.consumption, storeCounter, or globalCounter. In that case paging_actual_bytes_total / paging_settlement_ru_total would record a request that was not actually settled, which makes the “finally consumed” metrics drift from controller accounting. The OnResponse path is fine because it cannot fail after observation; this risk is specific to OnResponseWait.
What problem does this PR solve?
Issue Number: close #10612
Under RC (resource control), paging coprocessor requests are today billed only at response settlement. A misbehaving resource group can therefore burst large reads before its token bucket reacts, because no RU is reserved up front. This PR introduces a pre-charge mechanism driven by a caller-supplied read-bytes prediction so the token bucket is pressured before the KV RPC is sent, while keeping the final bill accurate at settlement.
What is changed and how does it work?
Commits inside
client/resource_group/controller:Pre-charge on request —
BeforeKVRequestnow readsPredictedReadBytesdirectly fromRequestInfo(a required method on the interface; writes always return 0 so pre-charge is gated to read RPCs only). When present and > 0,KVCalculatoraddsReadBytesCost * predictedtodelta.RRUjust like any other read cost — the existingacquireTokenspath then reserves that many tokens from the limiter. When the hint is zero (writes, or reads that opt out), behavior is identical to today.Refund on settlement —
AfterKVRequestnow settles the pre-charge by subtracting the predicted basis and addingReadBytesCost * actual. The resulting signed delta can be negative (we over-estimated), so a newLimiter.RefundTokensprimitive is introduced as the inverse ofRemoveTokens;onResponseImpl/onResponseWaitImplbranch to refund vs. consume based on the sign. Correctness: total RU billed for a pre-charged request is still exactlyReadBytesCost * actual.Observability — nine per-resource-group counters/histograms, grouped by sampling unit (count → bytes → RU):
Count:
paging_precharge_total— coprocessor RPCs that triggered pre-charge (PredictedReadByteshint > 0; self-gated to coprocessor reads because non-cop callers never set the hint)paging_nonprecharge_total— coprocessor RPCs that reached the controller without aPredictedReadByteshint (EMA cold-start); gated onRequestInfo.IsCop()to keep non-cop reads (CmdGet, CmdBatchGet, CmdScan, internal lookups) out of the metricBytes:
paging_precharge_bytes_total— sum of predicted hints used as the pre-charge basispaging_actual_bytes_total— actual bytes read by pre-charged RPCspaging_nonprecharge_actual_bytes_total— actual bytes read by coprocessor RPCs that reached the controller without a hint (same IsCop gating as above)paging_prediction_residual_bytes— distribution of signedactual − predictedfor pre-charged RPCs (predictor accuracy)RU:
paging_precharge_ru_total— RU pre-acquired atBeforeKVRequestfor pre-charged RPCs (read base +ReadBytesCost * predicted)paging_settlement_ru_total— total RU finally consumed by pre-charged RPCs (read base +CPUMsCost * kv_cpu_ms+ReadBytesCost * actual); equalsprecharge_ru + settlement_ru_deltapaging_settlement_ru_delta— distribution of per-RPC signedsettlement_ru − precharge_ru(negative = refund, positive = extra debit)Pre-charge is strictly opt-in: a client must pass a non-zero
PredictedReadBytesvalue on a read request; passing 0 (or any write request, which always returns 0) preserves the existing settlement-only path. The wire protocol is unchanged.Related PRs
Check List
Tests
TestPredictedReadBytesPreCharge/TestNoPreChargeWithoutPredictedReadBytes— Before-side pre-chargeTestPagingPreChargeTokenRefund/TestPagingPreChargeNoRefundWhenActualExceedsEstimate/TestPagingPreChargeZeroDelta/TestOnResponseImplPagingRefund— After-side refundTestRefundTokens— limiter primitiveSide effects
RequestInfo.PredictedReadBytes()method + refund path)Release note
Summary by CodeRabbit
New Features
Bug Fixes
Tests