
resourcecontrol, tikvrpc: add PredictedReadBytes hint for RC paging pre-charge #1947

Open
YuhaoZhang00 wants to merge 7 commits into tikv:master from YuhaoZhang00:feat/rc-paging-precharge


@YuhaoZhang00 YuhaoZhang00 commented Apr 21, 2026

Issue Number: close #1953

What is changed and how it works?

Adds an optional caller-supplied read-bytes estimate that the resource-control layer forwards to PD's controller as a pre-charge basis for paging-style requests.

  • tikvrpc.Request: new PredictedReadBytes uint64 field. This is a client-go-internal hint, not carried in the wire proto — it is consumed before the RPC is sent and TiKV never observes it.
  • internal/resourcecontrol.RequestInfo: new predictedReadBytes field populated from tikvrpc.Request.PredictedReadBytes in MakeRequestInfo, exposed via a PredictedReadBytes() getter.

Together these let an upstream caller (TiDB cop iterator, driven by a per-logical-scan EMA) attach a read-bytes prediction to a paging RPC. PD's resource-group controller consumes this hint via the required PredictedReadBytes() method on its RequestInfo interface and pre-charges RU before the RPC goes out; see the matching PD PR. When the hint is zero (the default for any caller that doesn't set the field, including all writes), behavior is unchanged — the request is billed by actual read bytes at settlement.
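
To make the propagation contract concrete, here is a minimal sketch of the read-path-only copy described above. The `Request`, `RequestInfo`, and `CmdType` definitions are simplified stand-ins written for this illustration, not the actual client-go types, and the write check is reduced to a single command type.

```go
package main

import "fmt"

// CmdType and its constants are simplified stand-ins for tikvrpc's command
// types; only the read/write distinction matters in this sketch.
type CmdType int

const (
	CmdGet CmdType = iota
	CmdCop
	CmdPrewrite
)

// Request mirrors only the fields of tikvrpc.Request that this sketch needs.
type Request struct {
	Type               CmdType
	PredictedReadBytes uint64 // client-side hint, never sent on the wire
}

// RequestInfo mirrors the slice of resourcecontrol.RequestInfo discussed here.
type RequestInfo struct {
	isWrite            bool
	predictedReadBytes uint64
}

func (r *RequestInfo) IsWrite() bool              { return r.isWrite }
func (r *RequestInfo) PredictedReadBytes() uint64 { return r.predictedReadBytes }

// MakeRequestInfo copies the hint on the read path only. The write path
// leaves it at zero, so a hint set on a write request is ignored.
func MakeRequestInfo(req *Request) *RequestInfo {
	if req.Type == CmdPrewrite {
		return &RequestInfo{isWrite: true}
	}
	return &RequestInfo{predictedReadBytes: req.PredictedReadBytes}
}

func main() {
	cop := MakeRequestInfo(&Request{Type: CmdCop, PredictedReadBytes: 4096})
	cold := MakeRequestInfo(&Request{Type: CmdCop})
	write := MakeRequestInfo(&Request{Type: CmdPrewrite, PredictedReadBytes: 4096})
	fmt.Println(cop.PredictedReadBytes(), cold.PredictedReadBytes(), write.PredictedReadBytes())
}
```

A zero return from the getter is the "no prediction" signal in this sketch, which matches the default behavior described above for callers that never set the field.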

Related PRs

Check List

  • Unit test
None

Summary by CodeRabbit

  • New Features

    • Requests may include an optional predicted read-bytes hint; when >0 it is used for resource-control pre-charging, otherwise billing uses actual read bytes.
    • Read requests targeting coprocessor endpoints are explicitly identifiable for resource handling.
  • Bug Fixes

    • Transport-level failures now trigger a refund/cancel path so pre-charged resources are returned when a request fails before response.
  • Tests

    • Added tests for predicted-read-bytes propagation and for request wait/response/cancel behavior on sync/async transport success and failure.

Review Change Stack

Add a client-go-internal PredictedReadBytes field on tikvrpc.Request so
the caller (e.g. TiDB, maintaining a per-logical-scan EMA across paging
cop RPCs) can supply a learned estimate of how many bytes the request
will read. MakeRequestInfo propagates this into RequestInfo, and the new
RequestInfo.PredictedReadBytes() getter satisfies the optional
predictedReadBytesProvider interface that PD's resource_group/controller
checks via type assertion.

When the hint is > 0, PD uses it as the byte basis for RC paging
pre-charge. Zero means the caller has no prediction (e.g. cold start);
the request is not pre-charged and is billed at settlement time by
actual read bytes only.

The field is kept out of the proto because it is purely a client-side
estimate consumed before the RPC is sent - TiKV neither needs nor reads
it.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
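
For context on where the hint could come from, the following is a rough sketch of a per-scan exponential moving average that yields zero at cold start. The `scanEMA` type, its method names, and the smoothing factor are all hypothetical; the actual estimator lives on the TiDB side and is not shown in this PR.

```go
package main

import "fmt"

// scanEMA is a hypothetical per-logical-scan estimator. The name and the
// smoothing factor are illustrative and not taken from TiDB.
type scanEMA struct {
	alpha  float64 // weight of the newest observation, in (0, 1]
	value  float64
	primed bool
}

// Observe folds the actual read bytes of a finished page into the average.
func (e *scanEMA) Observe(actual uint64) {
	if !e.primed {
		e.value, e.primed = float64(actual), true
		return
	}
	e.value = e.alpha*float64(actual) + (1-e.alpha)*e.value
}

// Predict returns the hint for the next page, or 0 when no page has been
// observed yet (cold start), which means "do not pre-charge".
func (e *scanEMA) Predict() uint64 {
	if !e.primed {
		return 0
	}
	return uint64(e.value)
}

func main() {
	ema := &scanEMA{alpha: 0.5}
	fmt.Println(ema.Predict()) // cold start: 0
	ema.Observe(1000)
	fmt.Println(ema.Predict()) // 1000
	ema.Observe(3000)
	fmt.Println(ema.Predict()) // 0.5*3000 + 0.5*1000 = 2000
}
```

Under this assumption the caller would assign `Predict()` to `PredictedReadBytes` before each paging RPC and call `Observe` once the page's actual read bytes are known.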
Drop the verbose restatement of how the hint is produced and consumed;
keep only what a reader of this struct needs to know to set the field.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@ti-chi-bot ti-chi-bot Bot added the dco-signoff: yes and contribution labels Apr 21, 2026
coderabbitai Bot commented Apr 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR adds a caller-supplied PredictedReadBytes field to tikvrpc.Request, passes it into internal/resourcecontrol.RequestInfo for read-path requests, exposes it via RequestInfo.PredictedReadBytes() and IsCop(), ensures resource-control pre-charges are refunded on transport-level failures, and adds unit tests validating these behaviors.

Changes

Request Type Extension

| Layer / File(s) | Summary |
| --- | --- |
| Request contract field (tikvrpc/tikvrpc.go) | Added PredictedReadBytes uint64 to tikvrpc.Request as an optional caller-supplied read-byte estimate. |

Resource Control

| Layer / File(s) | Summary |
| --- | --- |
| RequestInfo fields and accessors (internal/resourcecontrol/resource_control.go) | Added predictedReadBytes uint64 and isCop bool to RequestInfo; added PredictedReadBytes() uint64 and IsCop() bool accessors and isCopRequest helper. |
| MakeRequestInfo wiring and tests (internal/resourcecontrol/resource_control.go, internal/resourcecontrol/resource_control_test.go) | Populate predictedReadBytes and isCop in MakeRequestInfo for the read-path; added TestMakeRequestInfoPredictedReadBytes to assert propagation and zero-default; retained TestMakeRequestInfoIsCop. |

Client Interceptor

| Layer / File(s) | Summary |
| --- | --- |
| Cancel pre-charge on transport failure (internal/client/client_interceptor.go, internal/client/client_interceptor_test.go) | When transport returns resp == nil, call OnRequestCancel to refund pre-charged RU for both sync and async paths; added recordingInterceptor test double and sync/async tests verifying cancel vs response settlement. |

Module pin

| Layer / File(s) | Summary |
| --- | --- |
| Temporary go.mod replace (go.mod) | Add temporary replace directive for github.com/tikv/pd/client pointing to a forked commit (with a revert comment). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Suggested labels

lgtm, approved

Suggested reviewers

  • nolouch
  • ekexium

Poem

🐰 I counted bytes in fields unseen,

A small hint carried, tidy and lean,
If zero sits, the meter waits,
Else pre-charge guards the paging gates,
Hopper hops — the bytes now seen.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a PredictedReadBytes hint to resourcecontrol and tikvrpc for RC paging pre-charge, which is the central focus of this PR. |
| Linked Issues check | ✅ Passed | All coding objectives from issue #1953 are met: PredictedReadBytes field added to tikvrpc.Request, predictedReadBytes integrated into RequestInfo with getter, cancellation handling added for failed RPCs, and comprehensive tests included. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the linked issue objectives: adding the hint transport, integrating it into resource control, handling cancellation for pre-charged requests, and pinning the PD client dependency as noted in the commit messages. |



@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
internal/resourcecontrol/resource_control_test.go (1)

52-74: Test coverage is adequate for the propagation contract.

Both the hint-present and hint-absent (zero-value default) paths are asserted, and IsWrite() == false is verified so the read-path branch in MakeRequestInfo is the one under test.

One optional addition worth considering: a case asserting that for a write request (e.g., CmdPrewrite) the hint is not propagated (i.e., PredictedReadBytes() returns 0 even when set on the tikvrpc.Request). This pins down the current write-path behavior so any future accidental propagation would be caught.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resourcecontrol/resource_control_test.go` around lines 52 - 74, Add
a test that verifies PredictedReadBytes is not propagated for write requests:
create a tikvrpc.Request with Type set to a write command (e.g.,
tikvrpc.CmdPrewrite, with Req set to &kvrpcpb.PrewriteRequest{}), Context with a
Peer, and PredictedReadBytes non-zero;
call MakeRequestInfo(req) and assert that the returned
RequestInfo.PredictedReadBytes() == 0 and IsWrite() == true to ensure write-path
does not carry the read hint. Ensure the new case mirrors the existing read-case
structure so it fails if MakeRequestInfo incorrectly propagates the hint for
writes.
internal/resourcecontrol/resource_control.go (1)

90-131: Hint is silently dropped on write path — confirm this is intentional.

predictedReadBytes is only assigned in the read branch (Line 104). In the write branch (Lines 123–130) the field is left zero even if the caller set req.PredictedReadBytes. Given the field is documented as a read-bytes estimate, this is almost certainly intentional — but since there's no guard at the call site, a caller mistakenly setting it on a write request gets silent data loss rather than a signal.

Two low-cost options if you want to make the contract more explicit:

  • Add a short comment in the write branch noting the hint is intentionally ignored for writes.
  • Or drop the field at the Request struct level into a helper that is only read on read paths (current design already effectively does this via the getter being consulted only by PD's read-path pre-charge — so a comment is probably enough).

No change strictly required; flagging for visibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resourcecontrol/resource_control.go` around lines 90 - 131,
MakeRequestInfo drops req.PredictedReadBytes on the write path (in the branch
building RequestInfo for write requests), causing the hint to be silently lost;
either preserve it by setting predictedReadBytes: req.PredictedReadBytes in the
RequestInfo returned by the write-path branch (the struct built at the end of
MakeRequestInfo) or, if the drop is intentional, add an explicit comment in
MakeRequestInfo (near the write-path return) stating that predictedReadBytes is
ignored for write requests to avoid silent surprises; reference symbols:
MakeRequestInfo, RequestInfo, predictedReadBytes, req.PredictedReadBytes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: de10caf0-abb3-4c5e-9151-3f6c02498470

📥 Commits

Reviewing files that changed from the base of the PR and between 2115057 and f0e16ed.

📒 Files selected for processing (3)
  • internal/resourcecontrol/resource_control.go
  • internal/resourcecontrol/resource_control_test.go
  • tikvrpc/tikvrpc.go

YuhaoZhang00 added a commit to YuhaoZhang00/tidb that referenced this pull request Apr 22, 2026
Temporarily replace github.com/tikv/client-go/v2 and github.com/tikv/pd/client
with the corresponding forks carrying the PredictedReadBytes hint and
controller-side pre-charge logic so CI can build this PR before the two
upstream PRs land:

  - github.com/YuhaoZhang00/client-go/v2  (tikv/client-go#1947)
  - github.com/YuhaoZhang00/pd/client     (tikv/pd#10611)

Will be reverted and replaced with a normal require bump once both PRs
merge and tag new releases.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
PD promoted PredictedReadBytes to a required method on RequestInfo, so
this is no longer an optional duck-typed satisfaction.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
bypass: bypass,
requestSize: uint64(req.GetSize()),
accessType: toPDAccessLocationType(req.AccessLocation),
predictedReadBytes: req.PredictedReadBytes,
Member


With PD pre-charge enabled, this hint needs a settlement path even when the RPC fails before returning a response. Today OnResponseWait is only called when resp != nil, so a transport error/timeout after pre-charge can leave the predicted read RU charged without refund. Can we add a no-response settlement/refund path or a test that documents this behavior?

Author

@YuhaoZhang00 YuhaoZhang00 May 18, 2026


Addressed by calling resourceControlInterceptor.OnRequestCancel(), which refunds the pre-charge when resp == nil.

Commit 54a00f4

PD's resource-control controller needs to gate paging_* accounting on
coprocessor requests only; non-cop reads (CmdGet, CmdBatchGet, CmdScan,
internal lookups) reach the same interceptor but never participate in
the EMA pre-charge path and must not appear in paging_nonprecharge_*.

Add an isCop field on RequestInfo derived from tikvrpc.Request.Type at
construction time, and expose it via IsCop() to satisfy PD's new
RequestInfo interface method.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
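
As an illustration of the gating described in this commit message, a helper deriving the flag from the request type might look like the sketch below. The `CmdType` constants are local stand-ins, and treating only `CmdCop` as a coprocessor request is an assumption; the real isCopRequest helper may recognize additional coprocessor command types.

```go
package main

import "fmt"

// CmdType and its constants are local stand-ins for tikvrpc's command types.
type CmdType int

const (
	CmdGet CmdType = iota
	CmdBatchGet
	CmdScan
	CmdCop
)

// isCopRequest reports whether the request type is a coprocessor read.
// Assumption: only CmdCop counts; the actual helper may cover more types.
func isCopRequest(t CmdType) bool {
	return t == CmdCop
}

func main() {
	for _, t := range []CmdType{CmdGet, CmdBatchGet, CmdScan, CmdCop} {
		fmt.Println(t, isCopRequest(t))
	}
}
```

Computing the flag once at construction time keeps the interceptor path free of repeated type switches, which is the design this commit message describes.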
@ti-chi-bot ti-chi-bot Bot added the size/L label and removed the size/M label May 18, 2026
interceptedClient.SendRequest and SendRequestAsync today call
OnResponseWait only when the underlying RPC returns a non-nil response.
That made sense in the legacy billing model (only base RRU / WRU at
risk on transport failure) but became a real RU leak after the PD-side
paging pre-charge: the predicted RU debited in OnRequestWait stayed
permanently charged whenever the RPC failed before producing a
response (connection drop, timeout, context cancellation).

When resp == nil, invoke the new ResourceGroupKVInterceptor.OnRequestCancel
so PD refunds the predicted RU to the limiter and rolls back the
consumption row it added in OnRequestWait. Cancel and settlement are
mutually exclusive on a given request — the if/else structure makes
that invariant explicit at every call site.

Pin the PD client replace to the matching tikv/pd#10611 commit; the
replace will be removed once that PR is merged and tagged.

Adds TestSendRequest{Cancels,DoesNotCancel}{,Async}* covering the
three combinations (sync failure / sync success / async failure)
through a recordingInterceptor mock so the wiring is verified
end-to-end without touching a real PD.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>

ti-chi-bot Bot commented May 18, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cfzjywxk for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added the size/XL label and removed the size/L label May 18, 2026

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
internal/client/client_interceptor_test.go (1)

216-237: ⚡ Quick win

Add async success-path assertion for settle-vs-cancel behavior.

Line 216 currently covers async transport failure only. Add a companion async success test to assert OnResponseWait runs and OnRequestCancel stays zero.

Proposed test
+func TestSendRequestAsyncDoesNotCancelOnSuccess(t *testing.T) {
+	rec := withRecordingInterceptor(t)
+	client := NewInterceptedClient(respondingClient{})
+
+	done := make(chan struct{})
+	var gotResp *tikvrpc.Response
+	var gotErr error
+	cb := async.NewCallback(nil, func(resp *tikvrpc.Response, err error) {
+		gotResp = resp
+		gotErr = err
+		close(done)
+	})
+
+	client.SendRequestAsync(context.Background(), "", newRGRequest(), cb)
+	<-done
+
+	assert.NotNil(t, gotResp)
+	assert.NoError(t, gotErr)
+	assert.Equal(t, 1, rec.waitCalls)
+	assert.Equal(t, 1, rec.respCalls)
+	assert.Equal(t, 0, rec.cancelCalls)
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/client/client_interceptor_test.go` around lines 216 - 237, Add a new
async-success test companion to
TestSendRequestAsyncCancelsPreChargeOnTransportFailure that verifies the settle
vs cancel behavior on a successful transport: create a recording interceptor
with withRecordingInterceptor(t), construct an intercepted client using a
passing client (e.g., successfulClient or a client that returns a valid
tikvrpc.Response), invoke client.SendRequestAsync(context.Background(), "",
newRGRequest(), cb) with an async callback that captures resp and err and closes
a done channel, wait on done, then assert resp is non-nil and err is nil and
assert rec.waitCalls == 1, rec.respCalls == 1 and rec.cancelCalls == 0 (with a
message like "OnRequestCancel must not run on the async path when resp is
non-nil"); place the new test alongside
TestSendRequestAsyncCancelsPreChargeOnTransportFailure and name it e.g.
TestSendRequestAsyncSettlesPreChargeOnTransportSuccess.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d7a03552-a257-480d-8f7b-28bf13e33018

📥 Commits

Reviewing files that changed from the base of the PR and between 7f49960 and 54a00f4.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go.mod
  • internal/client/client_interceptor.go
  • internal/client/client_interceptor_test.go

@YuhaoZhang00 YuhaoZhang00 requested a review from rleungx May 18, 2026 13:18
// Transport-level failure produced no response — the settlement
// path will not run. Cancel the pre-charge so the predicted RU
// debited in OnRequestWait is refunded to the limiter.
resourceControlInterceptor.OnRequestCancel(ctx, resourceGroupName, reqInfo)
Member


On the no-response path we refund the pre-charge in the PD controller, but RUDetails has already been updated with the OnRequestWait consumption above. Because OnRequestCancel does not return a refund delta and this branch does not apply a negative update, TiDB runtime RU stats will still include the predicted/base RU from the failed attempt. If the request is retried and later succeeds, the failed attempt can be counted as phantom RU in runtime stats even though controller tokens/consumption were rolled back. The async cancel path below has the same issue. Could we either have OnRequestCancel return the refund delta and update RUDetails, or keep the request-side consumption here and subtract it on cancel?
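
One way to picture the first option is the sketch below, in which the cancel hook returns the refunded amount and the caller applies it as a negative update to the runtime stats. Everything here is hypothetical: `controller`, `ruDetails`, and the method signatures are invented for illustration and do not reflect the current PD or client-go APIs.

```go
package main

import "fmt"

// ruDetails is a hypothetical stand-in for the runtime RU statistics.
type ruDetails struct{ rru float64 }

func (d *ruDetails) add(delta float64) { d.rru += delta }

// controller is a hypothetical stand-in for the PD-side accounting.
type controller struct{ charged float64 }

// onRequestWait pre-charges the predicted RU and returns the consumption.
func (c *controller) onRequestWait(predictedRU float64) float64 {
	c.charged += predictedRU
	return predictedRU
}

// onRequestCancel rolls back the pre-charge and returns the refund delta,
// so the caller can mirror the rollback in its own statistics.
func (c *controller) onRequestCancel(predictedRU float64) float64 {
	c.charged -= predictedRU
	return predictedRU
}

func main() {
	c, stats := &controller{}, &ruDetails{}

	// Failed attempt: pre-charge, then refund in both places.
	stats.add(c.onRequestWait(8))
	stats.add(-c.onRequestCancel(8))

	// Retry that succeeds: only this attempt remains counted.
	stats.add(c.onRequestWait(8))

	fmt.Println(c.charged, stats.rru)
}
```

With the refund mirrored into the stats, the controller balance and the runtime RU total stay equal after a failed attempt followed by a successful retry, which is the consistency the comment asks for.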


Labels

contribution, dco-signoff: yes, size/XL

Development

Successfully merging this pull request may close these issues.

resource control: add PredictedReadBytes hint transport for RC paging pre-charge

2 participants