client/resource_group: add RC paging pre-charge with PredictedReadBytes hint#10611

Open
YuhaoZhang00 wants to merge 18 commits into
tikv:masterfrom
YuhaoZhang00:feat/rc-paging-precharge

Conversation

@YuhaoZhang00
Contributor

@YuhaoZhang00 YuhaoZhang00 commented Apr 21, 2026

What problem does this PR solve?

Issue Number: close #10612

Under RC (resource control), paging coprocessor requests are currently billed only at response settlement. A misbehaving resource group can therefore burst large reads before its token bucket reacts, because no RU is reserved up front. This PR introduces a pre-charge mechanism driven by a caller-supplied read-bytes prediction, so the token bucket comes under pressure before the KV RPC is sent while the final bill stays accurate at settlement.

What is changed and how does it work?

Commits inside client/resource_group/controller:

  1. Pre-charge on request: BeforeKVRequest now reads PredictedReadBytes directly from RequestInfo (a required method on the interface; writes always return 0, so pre-charge is gated to read RPCs only). When the hint is > 0, KVCalculator adds ReadBytesCost * predicted to delta.RRU like any other read cost, and the existing acquireTokens path then reserves that many tokens from the limiter. When the hint is zero (writes, or reads that opt out), behavior is identical to today.

  2. Refund on settlement: AfterKVRequest now settles the pre-charge by subtracting the predicted basis and adding ReadBytesCost * actual. The resulting signed delta can be negative (the hint over-estimated), so a new Limiter.RefundTokens primitive is introduced as the inverse of RemoveTokens; onResponseImpl / onResponseWaitImpl branch to refund vs. consume based on the sign. Correctness: the read-bytes RU billed for a pre-charged request is still exactly ReadBytesCost * actual; the base and CPU costs are billed as before.

  3. Observability — nine per-resource-group counters/histograms, grouped by sampling unit (count → bytes → RU):

    Count:

    • paging_precharge_total — coprocessor RPCs that triggered pre-charge (PredictedReadBytes hint > 0; self-gated to coprocessor reads because non-cop callers never set the hint)
    • paging_nonprecharge_total — coprocessor RPCs that reached the controller without a PredictedReadBytes hint (EMA cold-start); gated on RequestInfo.IsCop() to keep non-cop reads (CmdGet, CmdBatchGet, CmdScan, internal lookups) out of the metric

    Bytes:

    • paging_precharge_bytes_total — sum of predicted hints used as the pre-charge basis
    • paging_actual_bytes_total — actual bytes read by pre-charged RPCs
    • paging_nonprecharge_actual_bytes_total — actual bytes read by coprocessor RPCs that reached the controller without a hint (same IsCop gating as above)
    • paging_prediction_residual_bytes — distribution of signed actual − predicted for pre-charged RPCs (predictor accuracy)

    RU:

    • paging_precharge_ru_total — RU pre-acquired at BeforeKVRequest for pre-charged RPCs (read base + ReadBytesCost * predicted)
    • paging_settlement_ru_total — total RU finally consumed by pre-charged RPCs (read base + CPUMsCost * kv_cpu_ms + ReadBytesCost * actual); equals precharge_ru + settlement_ru_delta
    • paging_settlement_ru_delta — distribution of per-RPC signed settlement_ru − precharge_ru (negative = refund, positive = extra debit)

Pre-charge is strictly opt-in: a client must pass a non-zero PredictedReadBytes value on a read request; passing 0 (or any write request, which always returns 0) preserves the existing settlement-only path. The wire protocol is unchanged.
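The pre-charge and settlement arithmetic described above can be sketched as follows. This is a minimal illustration, not the controller's code: `costConfig`, `beforeRequest`, `afterResponse`, and all cost values are hypothetical names and numbers chosen so the arithmetic is easy to check by hand.

```go
package main

import "fmt"

// costConfig holds hypothetical per-unit RU costs. The field names mirror the
// terms used in the PR description; the values used below are made up.
type costConfig struct {
	ReadBaseCost  float64 // RU per read request
	ReadBytesCost float64 // RU per byte read
	CPUMsCost     float64 // RU per millisecond of KV CPU
}

// beforeRequest returns the RU reserved before the RPC is sent.
// A zero hint reserves only the base cost (the settlement-only path).
func beforeRequest(c costConfig, predictedBytes uint64) float64 {
	return c.ReadBaseCost + c.ReadBytesCost*float64(predictedBytes)
}

// afterResponse returns the signed settlement delta: the costs known only
// after the response, minus the read-bytes basis that was pre-charged.
func afterResponse(c costConfig, predictedBytes, actualBytes uint64, kvCPUMs float64) float64 {
	delta := c.CPUMsCost*kvCPUMs + c.ReadBytesCost*float64(actualBytes)
	delta -= c.ReadBytesCost * float64(predictedBytes)
	return delta
}

func main() {
	c := costConfig{ReadBaseCost: 0.25, ReadBytesCost: 1.0 / 65536, CPUMsCost: 0.5}
	pre := beforeRequest(c, 4<<20)             // hint: 4 MiB
	delta := afterResponse(c, 4<<20, 1<<20, 2) // actual: 1 MiB, 2 ms of KV CPU
	fmt.Printf("precharge=%.2f delta=%.2f total=%.2f\n", pre, delta, pre+delta)
}
```

In this sketch the over-estimated hint produces a negative delta (a refund), and the sum of pre-charge and delta equals the base cost plus the CPU cost plus `ReadBytesCost * actual`, independent of the hint.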

Related PRs

Check List

Tests

  • Unit test
    • TestPredictedReadBytesPreCharge / TestNoPreChargeWithoutPredictedReadBytes — Before-side pre-charge
    • TestPagingPreChargeTokenRefund / TestPagingPreChargeNoRefundWhenActualExceedsEstimate / TestPagingPreChargeZeroDelta / TestOnResponseImplPagingRefund — After-side refund
    • TestRefundTokens — limiter primitive

Side effects

  • Increased code complexity (new required RequestInfo.PredictedReadBytes() method + refund path)

Release note

None.

Summary by CodeRabbit

  • New Features

    • Added Prometheus metrics to surface paging pre-charges, actual paging bytes, non-precharge outcomes, prediction residuals, and RU settlement; added a limiter refund operation to return excess tokens.
  • Bug Fixes

    • Ensure excess pre-charged tokens are refunded during response settlement and correct token accounting for over/under-estimates.
  • Tests

    • Added tests covering predicted-read pre-charges, settlement behavior, paging-token refunds, and limiter refunds.

Introduce an optional predictedReadBytesProvider interface on RequestInfo.
When a caller (e.g. TiDB maintaining a per-logical-scan EMA across paging
RPCs) supplies a non-zero PredictedReadBytes, BeforeKVRequest pre-charges
ReadBytesCost * predictedReadBytes RRU so that concurrent workers are
throttled at BeforeKVRequest rather than all hitting AfterKVRequest
settlement at once. AfterKVRequest then subtracts the same basis so the
net (pre-charge + settlement) equals baseCost + actualCost.

Without a hint the request is not pre-charged and is billed in
AfterKVRequest by actual read bytes only - this keeps the RU-billing
pre-charge decoupled from the protocol-level paging byte cap.

The hint is added as an optional interface (not a method on RequestInfo)
so existing RequestInfo implementations compile unchanged; they simply
skip pre-charge.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
…ment

When a request carries a PredictedReadBytes hint, BeforeKVRequest consumes
tokens up-front as a pre-charge. If the actual read bytes come back smaller
than the estimate, the delta represents tokens that were reserved but never
consumed. Previously AfterKVRequest computed a negative delta but called
RemoveTokens unconditionally, which further debited the limiter instead of
giving tokens back.

This commit adds Limiter.RefundTokens as the inverse of RemoveTokens and
wires the response-side paths (onResponseImpl, onResponseWaitImpl) to call
it whenever the settlement delta is negative, so over-estimated pre-charges
are released back to the group's token bucket.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
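The sign-based branch in this commit message can be illustrated with a toy limiter. This is a sketch under simplifying assumptions: `toyLimiter` and `settle` are hypothetical, and unlike the real `Limiter` methods (which take a `now time.Time` argument) the toy tracks only a balance, with no refill, burst cap, or waiters.

```go
package main

import "fmt"

// toyLimiter is a simplified stand-in for the controller's Limiter.
// It only tracks a token balance.
type toyLimiter struct {
	tokens float64
}

// RemoveTokens debits the balance.
func (l *toyLimiter) RemoveTokens(amount float64) { l.tokens -= amount }

// RefundTokens is the inverse of RemoveTokens: it credits the balance.
func (l *toyLimiter) RefundTokens(amount float64) { l.tokens += amount }

// settle applies a signed settlement delta v: a negative v means the
// pre-charge was too large and the excess goes back to the bucket; a
// positive v means more was consumed than reserved.
func settle(l *toyLimiter, v float64) {
	switch {
	case v < 0:
		l.RefundTokens(-v)
	case v > 0:
		l.RemoveTokens(v)
	}
}

func main() {
	l := &toyLimiter{tokens: 100}
	l.RemoveTokens(64) // pre-charge at request time
	settle(l, -48)     // response came back smaller than predicted
	fmt.Println(l.tokens)
}
```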
Add per-resource-group Prometheus metrics so operators can observe how the
paging pre-charge path behaves in production and judge EMA prediction
accuracy:

  - paging_precharge_total / paging_precharge_bytes_total: count and byte
    volume of RPCs that arrived with a PredictedReadBytes hint > 0 and
    were pre-charged at BeforeKVRequest.
  - paging_actual_bytes_total: actual read bytes reported by pre-charged
    RPCs, to compute an over/under-charge ratio against the pre-charge
    volume.
  - paging_prediction_residual_bytes: histogram of (actual - predicted)
    bytes for pre-charged RPCs; shows EMA prediction accuracy directly.
  - paging_nonprecharge_total / paging_nonprecharge_actual_bytes_total:
    count and byte volume of read RPCs that implemented the predicted
    hint interface but reported 0 (e.g. EMA cold-start) and therefore
    ran without pre-charge. Paired with paging_precharge_total this
    yields the cold/ready RPC split from the PD client side.

Labels are cached per resource group in groupMetricsCollection to keep
the hot path out of WithLabelValues. Only RequestInfo implementations
that opt into the PredictedReadBytes interface contribute to these
series; existing callers are unaffected.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
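As a rough illustration of how these series might be combined on a dashboard, the sketch below derives two ratios from a snapshot of counter values. The struct, the function names, and the exact ratio definitions are assumptions made for this example; the commit message does not prescribe a formula.

```go
package main

import "fmt"

// pagingSnapshot holds hypothetical point-in-time values of the counters
// named in the commit message for one resource group.
type pagingSnapshot struct {
	PrechargeTotal    float64 // paging_precharge_total
	NonprechargeTotal float64 // paging_nonprecharge_total
	PrechargeBytes    float64 // paging_precharge_bytes_total
	ActualBytes       float64 // paging_actual_bytes_total
}

// chargeRatio returns predicted bytes divided by actual bytes for
// pre-charged RPCs. Values above 1 mean the hints over-estimated on
// aggregate. The boolean is false when no actual bytes were recorded.
func chargeRatio(s pagingSnapshot) (float64, bool) {
	if s.ActualBytes == 0 {
		return 0, false
	}
	return s.PrechargeBytes / s.ActualBytes, true
}

// coldFraction returns the share of RPCs that ran without a pre-charge.
func coldFraction(s pagingSnapshot) (float64, bool) {
	total := s.PrechargeTotal + s.NonprechargeTotal
	if total == 0 {
		return 0, false
	}
	return s.NonprechargeTotal / total, true
}

func main() {
	s := pagingSnapshot{PrechargeTotal: 75, NonprechargeTotal: 25, PrechargeBytes: 300, ActualBytes: 200}
	ratio, _ := chargeRatio(s)
	cold, _ := coldFraction(s)
	fmt.Println(ratio, cold)
}
```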
@ti-chi-bot ti-chi-bot Bot added labels Apr 21, 2026: do-not-merge/needs-linked-issue; release-note-none (denotes a PR that doesn't merit a release note); dco-signoff: yes (indicates the PR's author has signed the DCO)
@ti-chi-bot
Contributor

ti-chi-bot Bot commented Apr 21, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign husharp for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added labels Apr 21, 2026: contribution (this PR is from a community contributor); needs-ok-to-test (indicates a PR created by contributors that needs an org member to send '/ok-to-test' to start testing)
@ti-chi-bot
Contributor

ti-chi-bot Bot commented Apr 21, 2026

Hi @YuhaoZhang00. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Apr 21, 2026
@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds optional predicted-read-bytes pre-charge for read requests, new Prometheus paging metrics, a Limiter.RefundTokens method, and updates request/response token accounting and tests to pre-charge, settle, and refund excess tokens.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Controller (core logic) | client/resource_group/controller/group_controller.go | Record paging pre-charge, actual, non-precharge metrics; compute per-request consumption earlier; branch token settlement to refund excess or remove deficit; adjust wait-path token acquisition and observation. |
| Metrics & Monitoring | client/resource_group/controller/metrics/metrics.go | Add Prometheus collectors for paging pre-charge counts/bytes, actual bytes, prediction residual histogram, non-precharge counters/bytes, and RU precharge/settlement/delta; register new metrics. |
| Model & Estimation | client/resource_group/controller/model.go | Introduce predictedReadBytesProvider + estimatedReadBytes; KVCalculator.BeforeKVRequest pre-charges RRU for predicted read bytes; AfterKVRequest subtracts the pre-charge during settlement. |
| Token Limiter | client/resource_group/controller/limiter.go | Add func (lim *Limiter) RefundTokens(now time.Time, amount float64) to replenish tokens (respects burst/unlimited modes; does not call maybeNotify). |
| Tests & Test Utilities | client/resource_group/controller/group_controller_test.go, client/resource_group/controller/limiter_test.go, client/resource_group/controller/testutil.go | Add tests validating pre-charge, settlement, and token refund semantics; add predictedReadBytes field and PredictedReadBytes() on TestRequestInfo; add TestRefundTokens. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant GC as GroupController
    participant TL as TokenLimiter
    participant M as Metrics

    Client->>GC: onRequestWaitImpl(info)
    Note over GC: if estimatedReadBytes(info) > 0
    GC->>TL: acquireTokens(precharge RU)
    TL-->>GC: success / error
    GC->>M: observe pre-charge count & bytes

    Client->>GC: onResponseImpl(req, resp)
    GC->>GC: compute consumption delta (actual - precharge)
    alt delta < 0 (excess precharge)
        GC->>TL: RefundTokens(excess)
        TL-->>GC: refunded
    else delta > 0 (deficit)
        GC->>TL: RemoveTokens(deficit)
        TL-->>GC: removed / error
    end
    GC->>M: observe actual bytes, prediction residual, settlement RU

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Suggested labels

size/L, type/development, ok-to-test

Suggested reviewers

  • JmPotato
  • rleungx
  • disksing
  • nolouch

Poem

🐰 I guessed some bytes and tucked them tight,
I pre-charged tokens in the silver night,
If fewer hopped home, I nudged them back,
If more arrived, I let the meter track,
Metrics hum softly — balance kept just right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and accurately summarizes the main change: adding RC paging pre-charge functionality with the PredictedReadBytes hint mechanism. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | PR description comprehensively addresses the problem, changes, implementation details, metrics, tests, and side effects with proper issue linkage. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
client/resource_group/controller/group_controller.go (1)

602-626: ⚠️ Potential issue | 🟡 Minor

Pre-charge counter increments even when the request is then throttled.

observePagingPrecharge runs before acquireTokens. If acquireTokens fails with ErrClientResourceGroupThrottled, mu.consumption is rolled back via sub(...), but PagingPrechargeCounter / PagingPrechargeBytesCounter are already incremented, and onResponseImpl/onResponseWaitImpl will never run for this request to balance it with a PagingActualBytesCounter entry. Dashboards comparing actual-vs-precharge bytes (the documented use of these metrics) will show a chronic deficit proportional to throttling rate.

Consider moving the observation to after a successful acquire (or skipping it on the throttled-rollback path), or alternatively add a separate PagingPrechargeThrottledCounter so the denominator stays honest.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/resource_group/controller/group_controller.go` around lines 602 - 626,
The pre-charge metrics are recorded by observePagingPrecharge before calling
acquireTokens, causing unbalanced precharge counts when acquireTokens fails and
consumption is rolled back (sub on gc.mu.consumption); update the logic in
group_controller.go so that observePagingPrecharge is only called after a
successful acquireTokens (i.e., move the observePagingPrecharge call to after
acquireTokens returns without error), or alternatively decrement or record a
separate PagingPrechargeThrottledCounter in the error branch that handles
errs.ErrClientResourceGroupThrottled (the branch that calls sub and returns the
error) to keep PagingPrechargeCounter/PagingPrechargeBytesCounter consistent
with actual completed requests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@client/resource_group/controller/group_controller.go`:
- Around line 602-604: The metric observations (calls to
gc.metrics.observePagingPrecharge / observePagingPrechargeBytes /
observePagingActualBytes / observePagingPredictionResidual) should only run for
read requests to match the pre-charge behavior in
BeforeKVRequest/AfterKVRequest; update the branches that call
estimatedReadBytes(info) (notably in onResponseImpl and onResponseWaitImpl and
the other occurrences mentioned) to first check that the request is a read
(e.g., !req.IsWrite()) before observing any paging metrics, so pre-charge and
metric recording are gated symmetrically with the RU accounting in model.go.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f857a943-c822-4ef7-939c-79712b2a1f70

📥 Commits

Reviewing files that changed from the base of the PR and between e430bd0 and 6ebc843.

📒 Files selected for processing (7)
  • client/resource_group/controller/group_controller.go
  • client/resource_group/controller/group_controller_test.go
  • client/resource_group/controller/limiter.go
  • client/resource_group/controller/limiter_test.go
  • client/resource_group/controller/metrics/metrics.go
  • client/resource_group/controller/model.go
  • client/resource_group/controller/testutil.go

Comment thread client/resource_group/controller/group_controller.go Outdated
YuhaoZhang00 added a commit to YuhaoZhang00/tidb that referenced this pull request Apr 22, 2026
Temporarily replace github.com/tikv/client-go/v2 and github.com/tikv/pd/client
with the corresponding forks carrying the PredictedReadBytes hint and
controller-side pre-charge logic so CI can build this PR before the two
upstream PRs land:

  - github.com/YuhaoZhang00/client-go/v2  (tikv/client-go#1947)
  - github.com/YuhaoZhang00/pd/client     (tikv/pd#10611)

Will be reverted and replaced with a normal require bump once both PRs
merge and tag new releases.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
client/resource_group/controller/limiter.go (1)

328-340: LGTM on the refund primitive; consider symmetry note.

Implementation mirrors RemoveTokens with the opposite sign and skips maybeNotify (refund can't make us low), which is the right call. Since tokens can temporarily exceed burst, the cap is applied implicitly on the next getTokens() call — that's fine but worth a one-line comment so future readers don't mistake it for a leak of the burst cap.

📝 Optional doc tweak
 // RefundTokens adds tokens back to the limiter.
 //
-// No burst cap is applied here
+// No burst cap is applied here; any overshoot is clamped to burst on the
+// next getTokens() call. Does not call maybeNotify since adding tokens
+// cannot put the limiter into the low-token state.
 func (lim *Limiter) RefundTokens(now time.Time, amount float64) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/resource_group/controller/limiter.go` around lines 328 - 340, The
RefundTokens implementation mirrors RemoveTokens but omits maybeNotify and
allows lim.tokens to temporarily exceed lim.burst because the cap is enforced
later in getTokens; add a single-line comment inside RefundTokens (near the
lim.tokens = tokens + amount line) stating that exceeding burst is intentional
and will be capped by getTokens on the next access, and note that maybeNotify is
intentionally not called for refunds to clarify intent for future readers
(reference symbols: RefundTokens, RemoveTokens, maybeNotify, getTokens, burst).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@client/resource_group/controller/group_controller.go`:
- Around line 689-710: The change skips calling acquireTokens and recording
successfulRequestDuration when getRUValueFromConsumption(delta) returns exactly
0; to preserve prior metrics behavior, add a v == 0 branch after the existing
if/else-if so that when v == 0 you call
gc.metrics.successfulRequestDuration.Observe(0) (and leave waitDuration
unchanged), referencing the existing symbols getRUValueFromConsumption,
acquireTokens, and gc.metrics.successfulRequestDuration.Observe to locate where
to add the single-line observation.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f54c1ea-7b65-4175-ba34-2b2aaf771bc9

📥 Commits

Reviewing files that changed from the base of the PR and between 6ebc843 and 6e95720.

📒 Files selected for processing (4)
  • client/resource_group/controller/group_controller.go
  • client/resource_group/controller/limiter.go
  • client/resource_group/controller/metrics/metrics.go
  • client/resource_group/controller/model.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • client/resource_group/controller/metrics/metrics.go

Comment thread client/resource_group/controller/group_controller.go
Comment on lines +164 to +167
	// Paging settlement
	if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 {
		consumption.RRU -= float64(kc.ReadBytesCost) * float64(bytesForEst)
	}
Member

If I remember correctly, the logic here is only reached when the response is non-empty (meaning no error occurred).

If a request fails, shouldn't the previously deducted RU be refunded?

Contributor Author

The refund doesn't happen here. The actual refund lives in onResponse{Wait}Impl, where v < 0 -> RefundTokens. So on failure, the pre-charge is correctly returned via that path.

Added TestPagingPreChargeRefundOnFailedRead unit test to verify.

if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 {
	gc.metrics.observePagingActual(bytesForEst, resp.ReadBytes())
} else if !req.IsWrite() {
	if _, ok := req.(predictedReadBytesProvider); ok {
Member

Looking at the code, all requests implement this interface, meaning they will all enter this branch under any path.

Given that, what is the purpose of this non-precharge metric?

The previous +/-4MB range clipped large first-page responses on cold
copIterators (predicted=0 -> residual == actual, commonly several MB)
and workload shifts where EMA sits above the current actual. Extending
both ends by two factor-4 steps keeps the same near-zero resolution
while making P95/P99 readable up to the TiKV paging cap.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
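A signed, symmetric bucket layout of the kind this commit describes can be generated as in the sketch below. The helper `signedExpBuckets` and the 1 KiB smallest bucket are assumptions for illustration; the commit message only states the factor of 4 and the previous ±4 MB range.

```go
package main

import "fmt"

// signedExpBuckets returns histogram upper bounds symmetric around zero, in
// ascending order: -start*factor^(n-1), ..., -start, 0, start, ...,
// start*factor^(n-1).
func signedExpBuckets(start, factor float64, n int) []float64 {
	pos := make([]float64, n)
	v := start
	for i := 0; i < n; i++ {
		pos[i] = v
		v *= factor
	}
	out := make([]float64, 0, 2*n+1)
	for i := n - 1; i >= 0; i-- {
		out = append(out, -pos[i])
	}
	out = append(out, 0)
	out = append(out, pos...)
	return out
}

func main() {
	const kib = 1024.0
	const mib = kib * kib
	old := signedExpBuckets(kib, 4, 7) // hypothetical previous layout, tops out at 4 MiB
	ext := signedExpBuckets(kib, 4, 9) // two more factor-4 steps on each side
	fmt.Println(old[len(old)-1]/mib, ext[len(ext)-1]/mib)
}
```

Because the extra buckets are appended only at the outer ends, the bounds near zero are identical in both layouts, which matches the stated goal of keeping the near-zero resolution unchanged.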
EMA is a TiDB-side implementation detail; PD's metric Help text should
describe what the metric observes in terms of the hint contract.
paging_nonprecharge_* also fires when hint is absent entirely (not just
when the predictor produced 0), so reword to say so.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@ti-chi-bot ti-chi-bot Bot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Apr 22, 2026
@YuhaoZhang00 YuhaoZhang00 force-pushed the feat/rc-paging-precharge branch from a8777a9 to 1764549 Compare April 22, 2026 07:26
@ti-chi-bot ti-chi-bot Bot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) Apr 22, 2026
@codecov

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 91.66667% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.13%. Comparing base (b21a183) to head (22b311d).
⚠️ Report is 15 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10611      +/-   ##
==========================================
+ Coverage   78.96%   79.13%   +0.16%     
==========================================
  Files         532      535       +3     
  Lines       71883    73252    +1369     
==========================================
+ Hits        56766    57971    +1205     
- Misses      11093    11219     +126     
- Partials     4024     4062      +38     
Flag Coverage Δ
unittests 79.13% <91.66%> (+0.16%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


Mirror the existing bytes-dimension paging metrics in RU units:

- paging_precharge_ru_total: RU pre-acquired at BeforeKVRequest for
  pre-charged paging read requests.
- paging_settlement_ru_total: full RU finally consumed per pre-charged
  paging read request (base + CPU + ReadBytesCost * actual_bytes).
- paging_settlement_ru_delta: histogram of signed per-RPC delta
  (settlement_ru - precharge_ru); negative bucket = RefundTokens flow,
  positive = RemoveTokens/acquireTokens flow.

The histogram captures the per-RPC settlement magnitude and direction,
which cannot be reconstructed from the two aggregate counters alone
(sum and max(0,-v) don't commute).

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
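The remark that sum and max(0, -v) do not commute can be checked with a small worked example. The helper names and the sample deltas below are hypothetical; the point is only that two workloads with the same aggregate delta can have different refund volumes.

```go
package main

import "fmt"

// sumDeltas returns the aggregate of per-RPC settlement deltas, which is what
// settlement_ru_total minus precharge_ru_total would give.
func sumDeltas(deltas []float64) float64 {
	var total float64
	for _, d := range deltas {
		total += d
	}
	return total
}

// refundVolume sums max(0, -delta) per RPC: the RU actually refunded.
func refundVolume(deltas []float64) float64 {
	var total float64
	for _, d := range deltas {
		if d < 0 {
			total -= d
		}
	}
	return total
}

func main() {
	a := []float64{-5, 5} // one over-estimate, one under-estimate
	b := []float64{0, 0}  // two perfect predictions
	fmt.Println(sumDeltas(a), sumDeltas(b))       // aggregates are identical
	fmt.Println(refundVolume(a), refundVolume(b)) // refund volumes differ
}
```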
@ti-chi-bot ti-chi-bot Bot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Apr 22, 2026
@YuhaoZhang00 YuhaoZhang00 force-pushed the feat/rc-paging-precharge branch from 209e7ce to 6efae1c Compare April 22, 2026 07:45
Order as count -> bytes -> RU, matching the conceptual grouping of
sampling units. Add a one-line doc comment to each exported metric to
satisfy revive lint.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@YuhaoZhang00 YuhaoZhang00 force-pushed the feat/rc-paging-precharge branch from 6efae1c to ded66bf Compare April 22, 2026 07:46
Move the !IsWrite() guard into estimatedReadBytes so paging pre-charge,
settlement, and metric observations stay symmetric even if a future
write type implements the optional predictedReadBytesProvider.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
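A helper with the guard folded in might look like the sketch below. It reflects the optional-interface design at this point in the PR's history; the trimmed `requestInfo` interface, the `uint64` return type, and the fake request types are assumptions for illustration.

```go
package main

import "fmt"

// requestInfo is a trimmed, hypothetical subset of RequestInfo.
type requestInfo interface {
	IsWrite() bool
}

// predictedReadBytesProvider is the optional hint interface.
type predictedReadBytesProvider interface {
	PredictedReadBytes() uint64
}

// estimatedReadBytes is the single gate used by pre-charge, settlement, and
// metrics: writes and requests without the hint interface always yield 0.
func estimatedReadBytes(req requestInfo) uint64 {
	if req.IsWrite() {
		return 0
	}
	if p, ok := req.(predictedReadBytesProvider); ok {
		return p.PredictedReadBytes()
	}
	return 0
}

// hintedReq implements both interfaces.
type hintedReq struct {
	write bool
	hint  uint64
}

func (r hintedReq) IsWrite() bool              { return r.write }
func (r hintedReq) PredictedReadBytes() uint64 { return r.hint }

// legacyReq implements only requestInfo.
type legacyReq struct{}

func (legacyReq) IsWrite() bool { return false }

func main() {
	fmt.Println(estimatedReadBytes(hintedReq{hint: 4096}))              // read with hint
	fmt.Println(estimatedReadBytes(hintedReq{write: true, hint: 4096})) // write with hint
	fmt.Println(estimatedReadBytes(legacyReq{}))                        // no hint interface
}
```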
Failed reads with a paging hint still go through AfterKVRequest (proven
by the existing write !res.Succeed() payBackWriteCost branch), so the
paging settlement subtracts ReadBytesCost*predicted and the resulting
negative delta flows through RefundTokens. ReadBaseCost is intentionally
not refunded, matching existing non-paging read failure behavior.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Drop the predictedReadBytesProvider optional interface and the type
assertion guards at the paging metric observation sites. The hint is now
part of the RequestInfo contract; client-go and tidb are upgraded
simultaneously so no back-compat shim is needed.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@YuhaoZhang00 YuhaoZhang00 requested a review from JmPotato April 22, 2026 10:24
…ucceeds

Observing the precharge before acquireTokens lets throttled and
ctx-cancelled requests inflate PagingPrechargeCounter /
PagingPrechargeBytesCounter / PagingPrechargeRU with no matching
settlement sample on the response side (since OnResponse is never
called for those requests). Delay the observation until after
acquireTokens returns nil, matching the unconditional observation of
paging_actual metrics in onResponse{,Wait}Impl. Burstable mode still
observes, preserving symmetry with the burstable-agnostic settlement
side.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
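The ordering change can be sketched with a toy group that fails fast when tokens run out. Everything here is a simplified assumption: the real acquireTokens can wait and retry, and `group`, `onRequest`, and `errThrottled` are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// errThrottled stands in for errs.ErrClientResourceGroupThrottled.
var errThrottled = errors.New("resource group throttled")

// group is a toy model holding a token balance and two pre-charge counters.
type group struct {
	tokens         float64
	prechargeCount int
	prechargeBytes uint64
}

// acquireTokens debits ru or fails immediately when the balance is too low.
func (g *group) acquireTokens(ru float64) error {
	if ru > g.tokens {
		return errThrottled
	}
	g.tokens -= ru
	return nil
}

// onRequest observes the pre-charge only after acquireTokens succeeds, so a
// throttled request leaves no sample without a matching settlement.
func (g *group) onRequest(ru float64, predicted uint64) error {
	if err := g.acquireTokens(ru); err != nil {
		return err
	}
	if predicted > 0 {
		g.prechargeCount++
		g.prechargeBytes += predicted
	}
	return nil
}

func main() {
	g := &group{tokens: 10}
	fmt.Println(g.onRequest(8, 100)) // succeeds and is observed
	fmt.Println(g.onRequest(8, 100)) // throttled and not observed
	fmt.Println(g.prechargeCount, g.prechargeBytes)
}
```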
@JmPotato JmPotato added the ok-to-test label (indicates a PR is ready to be tested) and removed the needs-ok-to-test label (indicates a PR created by contributors that needs an org member to send '/ok-to-test' to start testing) Apr 28, 2026
if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 {
	gc.metrics.observePagingActual(bytesForEst, resp.ReadBytes(),
		getRUValueFromConsumption(count), getRUValueFromConsumption(delta))
} else if !req.IsWrite() {
Member

After PredictedReadBytes became part of every RequestInfo, this branch counts every read with a zero prediction, not only paging reads. Can we add an explicit paging/hint-present signal before recording paging_nonprecharge_*?

}
_, tokens := lim.getTokens(now)
lim.last = now
lim.tokens = tokens + amount
Member

Refunding tokens updates the bucket, but existing waiters still sleep on their original timer in WaitReservations. Can RefundTokens notify or wake waiters so over-estimated precharges release blocked requests promptly?

runningKVRequestCounter: metrics.GroupRunningKVRequestCounter.WithLabelValues(name),
consumeTokenHistogram: metrics.TokenConsumedHistogram.WithLabelValues(name),

prechargeCounter: metrics.PagingPrechargeCounter.WithLabelValues(name),
Member

These new per-resource-group metric series also need cleanup on group deletion/tombstone cleanup. Otherwise deleted groups can leave stale Paging* series behind.

After PredictedReadBytes was promoted to a required RequestInfo method,
the paging_nonprecharge_* branch fired for every non-write RPC reaching
the resource-control interceptor: point gets, batch gets, scans, and
internal lookups all looked identical to a paging coprocessor request
whose hint happened to be zero. The "paging cold-start" counter
inflated by roughly the full read-RPC volume of the cluster, breaking
the precharge-vs-nonprecharge comparison panel in the TiDB dashboard.

Expose IsCop() on the RequestInfo interface so the nonprecharge branch
can scope itself to coprocessor requests. The hint-present branch is
unchanged because hint>0 already implies a coprocessor caller; only the
zero-hint branch needs the explicit cmd-type signal, which client-go's
implementation will derive from tikvrpc.Request.Type.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
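
The scoping rule above can be sketched as a small classifier. The method names `IsWrite`, `IsCop`, and `PredictedReadBytes` come from the PR; their exact signatures, the `requestInfo` subset, `pagingClass`, `classify`, and `fakeReq` are assumptions made for illustration.

```go
package main

// requestInfo is a hypothetical subset of the RequestInfo interface; the
// return type of PredictedReadBytes is assumed.
type requestInfo interface {
	IsWrite() bool
	IsCop() bool
	PredictedReadBytes() uint64
}

type pagingClass int

const (
	notPaging     pagingClass = iota // writes and non-coprocessor reads
	precharged                       // hint > 0, which implies a coprocessor caller
	nonPrecharged                    // coprocessor read with a zero hint
)

// classify decides which paging metric family, if any, a request feeds.
func classify(req requestInfo) pagingClass {
	switch {
	case req.PredictedReadBytes() > 0:
		return precharged
	case !req.IsWrite() && req.IsCop():
		return nonPrecharged
	default:
		return notPaging
	}
}

// fakeReq is a test double for requestInfo.
type fakeReq struct {
	write, cop bool
	predicted  uint64
}

func (r fakeReq) IsWrite() bool              { return r.write }
func (r fakeReq) IsCop() bool                { return r.cop }
func (r fakeReq) PredictedReadBytes() uint64 { return r.predicted }
```

A point get (`fakeReq{}`) now classifies as `notPaging`, whereas before the IsCop() gate it would have landed in the nonprecharge branch.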
Sync the metric variable docs, Prometheus Help strings, and the
observePagingNonprecharge function doc with the new IsCop()-gated
scope: every paging_* metric describes coprocessor RPCs, and the
nonprecharge branch explicitly calls out the IsCop() precondition
and the EMA cold-start meaning.

No behavior change — comments only.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
RefundTokens previously updated the bucket and returned without
notifying anyone. acquireTokens retry loops select on reconfiguredCh
to wake before the next retry-interval tick when fresh tokens arrive
(this is how Reconfigure unblocks them). Refunded tokens, however,
stayed invisible until the timer fired, costing one retry interval
of unnecessary throttling per over-estimated paging settlement.

Mirror the Reconfigure close+recreate pattern at the end of
RefundTokens so the new tokens become immediately visible to retry
waiters. Behavior for in-flight WaitReservations sleepers is
deliberately unchanged — reservations carry their pre-computed
timeToAct and re-evaluating them on refund is a separate design
question.

Also expand the RefundTokens doc to document the intentional
above-burst overshoot (clamped by getTokens on the next access) and
the deliberate omission of maybeNotify.

Adds TestRefundTokensWakesAcquireRetry covering both the wake path
and the burstable-limiter no-op case.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
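
The close-and-recreate wake-up described above can be sketched like this. The `limiter` type, its `changed` field, and the `changedCh` accessor are hypothetical; the real Limiter uses reconfiguredCh and also handles burst limits and burstable mode, which this sketch omits.

```go
package main

import "sync"

// limiter is a hypothetical, simplified token bucket.
type limiter struct {
	mu      sync.Mutex
	tokens  float64
	changed chan struct{} // closed and replaced whenever tokens are added
}

func newLimiter(tokens float64) *limiter {
	return &limiter{tokens: tokens, changed: make(chan struct{})}
}

// changedCh returns the current notification channel. A retry loop would
// select on it alongside its retry timer.
func (l *limiter) changedCh() <-chan struct{} {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.changed
}

// refund returns tokens to the bucket, then closes the old channel to wake
// every current waiter and installs a fresh one for future waiters.
func (l *limiter) refund(amount float64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.tokens += amount
	close(l.changed)
	l.changed = make(chan struct{})
}
```

Closing a channel wakes all goroutines blocked on it at once, which is why the pattern needs no per-waiter bookkeeping; the cost is that each waiter must fetch the channel again before its next wait.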
cleanUpResourceGroup previously only removed ResourceGroupStatusGauge
when a group was tombstoned or went inactive. The nine new paging_*
per-resource-group label series introduced earlier in this PR were
left behind and lingered in Prometheus until process restart.

Add deletePagingLabels on groupMetricsCollection mirroring the
initMetrics ordering and call it from cleanUpResourceGroup alongside
the existing ResourceGroupStatusGauge deletion. The helper sits next
to initMetrics so adding a new paging metric there forces a paired
deletion here.

Older per-group series (SuccessfulRequestDuration, FailedRequestCounter,
TokenConsumedHistogram, ...) have the same leak but predate this PR;
they will be addressed in a separate change.

Adds TestDeletePagingLabelsResetsSeries covering every paging counter
to catch a forgotten DeleteLabelValues in the helper.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
…C failure

OnResponseWait is only called on the client-go side when the underlying
RPC produced a response. A transport-level failure (connection drop,
timeout, context cancellation) leaves the predicted RU pre-charged in
BeforeKVRequest permanently debited from the token bucket and from the
per-group consumption, because the existing settlement path never runs.

Add OnRequestCancel(ctx, name, info) to ResourceGroupKVInterceptor and
implement it on ResourceGroupsController. The implementation recomputes
the BeforeKVRequest delta with the same RequestInfo, subtracts it from
gc.mu.consumption, and refunds the corresponding tokens to the limiter.
The per-store snapshot maintained by onRequestWaitImpl is intentionally
left in place — it is bookkeeping for penalty distribution and the
existing OnRequestWait path already accepts similar imprecision on the
rare failure case.

Adds TestOnRequestCancelRefundsPreCharge covering the round-trip
invariant (tokens and consumption both restored to pre-OnRequestWait
values after cancel).

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
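
The round-trip invariant that the new test checks can be sketched as follows. The `group` type, `requestCost`, and the cost values are hypothetical; the sketch ignores locking and the per-store snapshot, which the commit message says is intentionally left in place.

```go
package main

// group is a hypothetical, single-goroutine stand-in for the per-group
// controller state: limiter tokens plus accumulated RU consumption.
type group struct {
	tokens      float64
	consumption float64
}

// requestCost recomputes the request-side delta from the request alone.
// The charge and cancel paths both call it, so they agree by construction.
func requestCost(baseCost, bytesCost float64, predicted uint64) float64 {
	return baseCost + bytesCost*float64(predicted)
}

// onRequestWait debits the pre-charge before the RPC is sent.
func (g *group) onRequestWait(cost float64) {
	g.tokens -= cost
	g.consumption += cost
}

// onRequestCancel undoes onRequestWait when the RPC produced no response.
func (g *group) onRequestCancel(cost float64) {
	g.tokens += cost
	g.consumption -= cost
}
```

Because the cancel path recomputes the cost from the same request information instead of storing it, no extra per-request state has to survive between the two calls.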
YuhaoZhang00 added a commit to YuhaoZhang00/client-go that referenced this pull request May 18, 2026
interceptedClient.SendRequest and SendRequestAsync today call
OnResponseWait only when the underlying RPC returns a non-nil response.
That made sense in the legacy billing model (only base RRU / WRU at
risk on transport failure) but became a real RU leak after the PD-side
paging pre-charge: the predicted RU debited in OnRequestWait stayed
permanently charged whenever the RPC failed before producing a
response (connection drop, timeout, context cancellation).

When resp == nil, invoke the new ResourceGroupKVInterceptor.OnRequestCancel
so PD refunds the predicted RU to the limiter and rolls back the
consumption row it added in OnRequestWait. Cancel and settlement are
mutually exclusive on a given request — the if/else structure makes
that invariant explicit at every call site.

Pin the PD client replace to the matching tikv/pd#10611 commit; the
replace will be removed once that PR is merged and tagged.

Adds TestSendRequest{Cancels,DoesNotCancel}{,Async}* covering the
three combinations (sync failure / sync success / async failure)
through a recordingInterceptor mock so the wiring is verified
end-to-end without touching a real PD.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
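
The call-site wiring described above might be sketched as below. The `interceptor` interface is a hypothetical, argument-free subset of ResourceGroupKVInterceptor (the real methods take a context, the group name, and request or response info), and `recorder`, `response`, `sendRequest`, and `errTransport` are illustrative names.

```go
package main

import "errors"

// errTransport is a hypothetical transport-level failure.
var errTransport = errors.New("transport failure")

// interceptor is a simplified stand-in for ResourceGroupKVInterceptor.
type interceptor interface {
	OnRequestWait() error
	OnResponseWait()
	OnRequestCancel()
}

// recorder notes which hooks ran, in order.
type recorder struct{ calls []string }

func (r *recorder) OnRequestWait() error { r.calls = append(r.calls, "request"); return nil }
func (r *recorder) OnResponseWait()      { r.calls = append(r.calls, "response") }
func (r *recorder) OnRequestCancel()     { r.calls = append(r.calls, "cancel") }

// response is a placeholder for an RPC response.
type response struct{}

// sendRequest settles when a response exists and cancels otherwise, so
// exactly one of the two hooks runs for every pre-charged request.
func sendRequest(ic interceptor, rpc func() (*response, error)) (*response, error) {
	if err := ic.OnRequestWait(); err != nil {
		return nil, err
	}
	resp, err := rpc()
	if resp == nil {
		ic.OnRequestCancel()
	} else {
		ic.OnResponseWait()
	}
	return resp, err
}
```

Keying the branch on `resp == nil` instead of `err != nil` mirrors the commit message: settlement needs a response to read actual bytes from, so the absence of a response is what triggers the refund.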
@ti-chi-bot
Contributor

ti-chi-bot Bot commented May 18, 2026

@YuhaoZhang00: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-unit-test-next-gen-2 | 22b311d | link | true | /test pull-unit-test-next-gen-2 |
| pull-unit-test-next-gen-3 | 22b311d | link | true | /test pull-unit-test-next-gen-3 |
| pull-unit-test-next-gen-1 | 22b311d | link | true | /test pull-unit-test-next-gen-1 |

@YuhaoZhang00 YuhaoZhang00 requested a review from rleungx May 18, 2026 13:19
	calc.BeforeKVRequest(count, req)
}
if bytesForEst := estimatedReadBytes(req); bytesForEst > 0 {
	gc.metrics.observePagingActual(bytesForEst, resp.ReadBytes(),
Member

Can we move this paging settlement metric observation until after the positive-delta acquireTokens path succeeds? In onResponseWaitImpl, acquireTokens below can still return an error and the function returns before updating gc.mu.consumption, storeCounter, or globalCounter. In that case paging_actual_bytes_total / paging_settlement_ru_total would record a request that was not actually settled, which makes the “finally consumed” metrics drift from controller accounting. The OnResponse path is fine because it cannot fail after observation; this risk is specific to OnResponseWait.

Labels

contribution, dco-signoff: yes, ok-to-test, release-note-none, size/XXL

Development

Successfully merging this pull request may close these issues.

client/resource_group: add RC paging pre-charge controller consuming client-supplied read-bytes hint

3 participants