*: stale reads compatible with prepare#25156
Conversation
05ebf02 to
5ba47c9
Compare
There was a problem hiding this comment.
What will happen in this situation?
create table t (id int)
insert into t values (1) ---- 1s
prepare s from select * from t as of timestamp Now() - INTERVAL 2 Second -----2s
execute s ; ----- 4s
alter table t add column age int; ----- 5s
insert into t values (2, 2) --------5s
execute s; ----- 8s
maybe we need to store the expression rather than SanpshotTS.
There was a problem hiding this comment.
I wanted to prepare the handle of infoschema in #25080, but that is not true anymore. The handle of infoschema is now added.
There was a problem hiding this comment.
I want to get the result (2, 2) in the time point 8s. but I guess you implement will always get the result 1.
There was a problem hiding this comment.
Yeah, the timestamp is fixed once preprocessed now.
Your expectation needs a re-preprocessing, which drops all the prepared efforts. It is not a simple reason of my implementation. If timestamp is changed, schema and snapshot is changed, the result from previous prepare statements is meaningless.
Its semantic may be intuitive, but it does not make much sense under our current framework of prepare statements.
A full support, I guess, will need to store the expr of as of into the plan. And the optimizer should consider infoschema and snapshot in their optimization process. Then it will be much easier to implement.
There was a problem hiding this comment.
I could provide a hack solution, if you really want it to be true.
There was a problem hiding this comment.
No, it must be re-preprocessed. Infoschema may change. Even if it is not changed, a check for stale infoschema is needed, which leads to one more tikv metadata access.
Otherwise, the result is inconsistent.
Agree with you.
Glad to hear that but it can not be true for now. I can only hack around. The solution will be tricky.
There was a problem hiding this comment.
Yes, I mean I can accept re-preprocessed, cannot accept the result is fixed on the point in mine example. BTW, Even if we have the infoschema cache, still we need to access TiKV to get the schema version for every request?
There was a problem hiding this comment.
No for the current solution. Because it is already ensured when prepare. If you want to ensure the consistency for NOW() - 2 seconds, however, yes, you need extra request. You don't know the schema version before 2 seconds. It is not prepared due to NOW(). If you don't check, it may be outdated.
FYI, check
Lines 103 to 107 in 5c95062
SchemaVersion.
There was a problem hiding this comment.
I have checked it, the implementation is different from what I understand. I think the schema version is guaranteed to increase, it could be optimized to reduce the extra request by check the schema version is not changed between the older request and the up-to-date schema version.
There was a problem hiding this comment.
Otherwise, it is easy to have a single point of bottleneck with the current implementation, because the meta information is always read from a certain TiKV node.
There was a problem hiding this comment.
While the change of infoschema for stale reads are expected, you may argue that preparedStmtExec does not have such logic, i.e. change infoschema for updateRead.
But it does. It will eventually invoke planner.Optimize, which does get the latest infoschema for preparedStmt.ForUpdateRead. It is located at https://github.com/pingcap/tidb/blob/master/planner/optimize.go#L240-L250.
There was a problem hiding this comment.
a.ExplicitStaleness is added as a parameter, found #25206 lost the point get case, just like me.
There was a problem hiding this comment.
Moved to handleAsOfAndReadTS to center the initialize of LastSnapshotTS and other parameters.
|
@JmPotato: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
|
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. DetailsReviewer can indicate their review by submitting an approval review. |
|
/merge |
|
This pull request has been accepted and is ready to merge. DetailsCommit hash: 703a67d |
|
/run-unit-test |
|
/run-integration-common-test |
| tk.MustExec(fmt.Sprintf(`start transaction read only as of timestamp '%s'`, time2.Format("2006-1-2 15:04:05.000"))) | ||
| c.Assert(tk.ExecToErr(staleSQL), NotNil) | ||
| tk.MustExec("commit") | ||
|
|
||
| // test prepared stale select in stale txn | ||
| tk.MustExec(fmt.Sprintf(`start transaction read only as of timestamp '%s'`, time2.Format("2006-1-2 15:04:05.000"))) | ||
| tk.MustQuery("execute s").Check(staleRows) | ||
| tk.MustExec("commit") |
There was a problem hiding this comment.
Is execute s expected to be executed successfully in staleness transaction while the staleSQL are not?
There was a problem hiding this comment.
Yes, it is expected. And it should be possible to throw an error in OptimizePreparedPlan.
There was a problem hiding this comment.
IMO this is not expected. We should avoid multiple snapshotTS in a transaction as users may think it's data inconsistency error during transaction.
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
|
cherry pick to release-5.1 in PR #25371 |
| // handleAsOfAndReadTS tries to handle as of closure, or possibly read_ts. | ||
| // If read_ts is not nil, it will be consumed. | ||
| // If as of is not nil, timestamp is used to get the history infoschema from the infocache. | ||
| func (p *preprocessor) handleAsOfAndReadTS(node *ast.AsOfClause) { |
There was a problem hiding this comment.
I think handleAsOfAndReadTS is much complex now. We need to add example to explain it and add unit test to ensured logic.
What problem does this PR solve?
Issue Number: close #24932
Problem Summary: Prepare is now compatible with stale reads.
What is changed and how it works?
What's Changed:
snapshotTS != 0is used to indicate that it is a stale read.CachedPrepareStmthas a new fieldSnapshotTSto store the result of preprocessing inPrepareExec.newExecutorBuilderhas a new parameter,snapshotTS. It is passed fromsession.ExecutePreparedStmt, alongcachedPlanExecorCompileExecutePreparedStmt. The one in session comes fromCachedPrepareStmt.IsCachedExecOkandOptimizePreparedPlan, ifsnapshotTS != 0.plannercore.Executeplan has a new fieldSnapshotTSto store the timestamp fromCachedPrepareStmt.executorExec.buildExecutewill assignsnapshotTSfromplannercore.Executeplan toexecutorBuilder.1-3 covered the prepared execution directly from session/mysql-protocol, and 4-5 covered the execution from plain SQL.
NEW CHANGES: to support dynamic timestamp like
NOW() - INTERVAL 1 SECOND:CachedPrepareStmtdo not store infoschema anymore. You need to recheck infoschema before execute to ensure consistency anyway.As Ofstatements evaluation is stored/prepared as a closure to support dynamic re-evaluation. For non-prepare statements, they can useLastSnapshotTSto avoid re-evaluation.Check List
Tests
Release note