*: refactor cost model formulas and constants by eurekaka · Pull Request #10581 · pingcap/tidb

eurekaka · 2019-05-23T07:56:41Z

What problem does this PR solve?

Our current cost model is too naive to pick out the physical plans we prefer in some scenarios, for example:

costs such as sorting in the index lookup operator, or the inner-side cost of the index join operator, are not reflected in cost computation at all;
some cost computations are wrong or inaccurate because they use the wrong input row count estimate (e.g., the cost computation of the TopN operator);

Besides, cost computation is not uniform across operators now: some operators consider memory cost, others do not; some operators consider operator parallelism, others do not.

What is changed and how it works?

This PR tries to

1. refine the cost model to catch up with the current executor implementations
1. and unify the dimensions we consider in cost computation across all operators, i.e., CPU cost, memory cost, network cost, scan cost, and operator parallelism.

Check List

Tests

Unit test: some UT results are updated
Integration test: some integration tests are updated

Code changes

Has exported function/method change
Has exported variable/fields change
Has interface methods change

Side effects

Possible performance regression

Related changes

Need to cherry-pick to the release branch: we may need this in release-3.0
Need to be included in the release note

zhouqiang-cl · 2019-05-23T08:15:23Z

/rebuild

codecov · 2019-05-29T13:54:06Z

Codecov Report

Merging #10581 into master will decrease coverage by 0.1794%.
The diff coverage is 96.2643%.

@@               Coverage Diff                @@
##             master     #10581        +/-   ##
================================================
- Coverage   81.4101%   81.2307%   -0.1795%     
================================================
  Files           426        426                
  Lines         92513      92028       -485     
================================================
- Hits          75315      74755       -560     
- Misses        11826      11904        +78     
+ Partials       5372       5369         -3

zhouqiang-cl · 2019-06-03T13:19:23Z

/bench

eurekaka · 2019-06-21T12:13:25Z

/rebuild

eurekaka · 2019-06-21T13:09:29Z

/run-all-tests

eurekaka · 2019-06-21T13:29:26Z

/run-common-test tidb-test=pr/840

eurekaka · 2019-06-21T13:36:43Z

/run-common-test tidb-test=pr/840

eurekaka · 2019-06-21T14:23:02Z

/run-all-tests tidb-test=pr/840

eurekaka · 2019-06-24T05:59:43Z

/run-all-tests tidb-test=pr/840

eurekaka · 2019-06-24T06:23:49Z

/run-all-tests tidb-test=pr/840

lzmhhh123

LGTM.

zz-jason · 2019-08-05T06:31:29Z

we can calculate (colHist.TotColSize == 0 && (colHist.NullCount != coll.Count)) once outside the for loop.

We need a valid colHist to make this check. If we move the check outside the for loop, the code gets pretty ugly.

zz-jason · 2019-08-05T12:32:09Z

how about replacing 1.0 with ts.stats.RowCount? That will be much clearer.

zz-jason · 2019-08-05T13:38:43Z

rCount may be incorrect when we can use an index scan on the inner-side table, in which case the scan range is decided by the correlated outer-side join key.

But we cannot know the selectivity of the outer key until execution.

zz-jason · 2019-08-05T13:51:09Z

should we consider avg row size for each inner row?

A row in memory would have a different size compared with its representation on disk or over the network. Currently, we use a very small default memoryFactor in order to choose the fastest plan that makes full use of resources. To make the cost model friendly to memory management, we do need to consider row size here. Can we leave this to a separate PR later?

eurekaka · 2019-08-07T06:26:21Z

/rebuild

zz-jason

LGTM

sre-bot · 2019-08-07T08:44:42Z

/run-all-tests

sre-bot · 2019-08-07T08:47:40Z

@eurekaka merge failed.

eurekaka · 2019-08-07T09:34:54Z

/run-all-tests tidb-test=pr/840

eurekaka added the type/enhancement and sig/planner labels May 23, 2019

zz-jason reviewed Jun 3, 2019

Comment thread cmd/explaintest/r/tpch.result Outdated

Comment thread executor/builder.go Outdated

eurekaka added the status/WIP label Jun 3, 2019

eurekaka force-pushed the cost_model branch 2 times, most recently from 5c4dfa9 to b97ea8e Compare June 17, 2019 09:56

eurekaka force-pushed the cost_model branch from 8994c27 to e1ea81a Compare June 21, 2019 06:34

eurekaka added the status/all tests passed label Jun 21, 2019

eurekaka marked this pull request as ready for review June 24, 2019 05:52

eurekaka removed the status/WIP label Jun 24, 2019

eurekaka requested review from alivxxx, winoros and zz-jason June 24, 2019 08:20

winoros reviewed Jun 26, 2019

Comment thread planner/core/task.go Outdated

Comment thread planner/core/task.go Outdated

alivxxx reviewed Jun 27, 2019

Comment thread statistics/table.go Outdated

eurekaka force-pushed the cost_model branch from 1d60670 to a4592c6 Compare July 2, 2019 10:38

eurekaka requested review from alivxxx and winoros July 2, 2019 10:41

eurekaka force-pushed the cost_model branch from a4592c6 to bc465b6 Compare July 18, 2019 08:17

eurekaka changed the title ~~planner: refactor cost model formulas and constants~~ *: refactor cost model formulas and constants Jul 18, 2019

lzmhhh123 reviewed Jul 24, 2019

Comment thread planner/core/task.go Outdated

Comment thread planner/core/task.go Outdated

eurekaka force-pushed the cost_model branch from 47b7990 to 346dea3 Compare July 25, 2019 07:47

lzmhhh123 reviewed Jul 26, 2019

lzmhhh123 added the status/LGT1 label Jul 26, 2019

alivxxx reviewed Jul 31, 2019

Comment thread planner/core/task.go Outdated

zz-jason reviewed Aug 1, 2019

Comment thread executor/index_lookup_join_test.go Outdated

Comment thread planner/core/cbo_test.go Outdated

Comment thread planner/core/exhaust_physical_plans.go Outdated

Comment thread planner/core/exhaust_physical_plans.go Outdated

eurekaka force-pushed the cost_model branch 2 times, most recently from f949384 to 0617428 Compare August 1, 2019 12:59

eurekaka requested review from alivxxx and zz-jason August 1, 2019 13:00

alivxxx removed their request for review August 2, 2019 06:25

qw4990 removed their request for review August 5, 2019 07:38

zz-jason reviewed Aug 5, 2019

eurekaka requested a review from zz-jason August 7, 2019 05:59

eurekaka force-pushed the cost_model branch from 0ad067e to 1c7f97e Compare August 7, 2019 06:29

zz-jason approved these changes Aug 7, 2019

zz-jason removed request for foreyes and winoros August 7, 2019 08:36

zz-jason added the status/can-merge and status/LGT2 labels and removed the status/LGT1 label Aug 7, 2019

Merge branch 'master' into cost_model

621b8e1

eurekaka merged commit fe03864 into pingcap:master Aug 7, 2019

eurekaka deleted the cost_model branch August 7, 2019 09:57

eurekaka mentioned this pull request Aug 15, 2019

planner: increase default concurrency factor of cost computing #11752

Merged

lzmhhh123 pushed a commit to lzmhhh123/tidb that referenced this pull request Jan 19, 2020

*: refactor cost model formulas and constants (pingcap#10581)

f8657bd
