
Conversation

@kangkaisen
Contributor

Currently, the cardinality, avgRowSize, and numNodes stats in OlapScanNode are missing, so broadcastCost and partitionCost are both wrong and Doris cannot automatically choose the best join strategy.

So we should make the statistical information in OlapScanNode more precise.
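
For context, here is a minimal sketch of how such scan stats could feed a broadcast-vs-partition decision. The stat names (cardinality, avgRowSize, numNodes) come from the comment above; the ScanStats class, chooseJoinStrategy method, and the two cost formulas are illustrative assumptions, not the actual Doris planner code.

```java
// Illustrative only: ScanStats and the cost formulas below are assumptions
// for this sketch, not the real OlapScanNode / planner implementation.
public class JoinStrategySketch {

    static final class ScanStats {
        final long cardinality;   // estimated row count
        final float avgRowSize;   // estimated bytes per row
        final int numNodes;       // nodes the scan is spread over
        ScanStats(long cardinality, float avgRowSize, int numNodes) {
            this.cardinality = cardinality;
            this.avgRowSize = avgRowSize;
            this.numNodes = numNodes;
        }
        double dataSize() {
            return cardinality * (double) avgRowSize;
        }
    }

    /**
     * Broadcast join ships the whole right side to every node holding the left
     * side; partition (shuffle) join re-hashes both sides once. If any stat is
     * unknown (e.g. cardinality is unset), the comparison is meaningless, which
     * is the problem this PR addresses.
     */
    static String chooseJoinStrategy(ScanStats left, ScanStats right) {
        double broadcastCost = right.dataSize() * left.numNodes;
        double partitionCost = left.dataSize() + right.dataSize();
        return broadcastCost < partitionCost ? "BROADCAST" : "PARTITIONED";
    }

    public static void main(String[] args) {
        ScanStats left = new ScanStats(100_000_000L, 64.0f, 3);
        ScanStats right = new ScanStats(50_000L, 32.0f, 3);
        System.out.println(chooseJoinStrategy(left, right)); // small right side -> BROADCAST
    }
}
```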

@kangkaisen
Contributor Author

Updated this commit.

Contributor

@imay imay left a comment

LGTM

@imay imay merged commit 1d8fc4b into apache:master Nov 7, 2018
hubgeter pushed a commit to hubgeter/doris that referenced this pull request Jan 15, 2025

## Proposed changes
Hadoop SnappyCodec source:

https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-nativetask/src/main/native/src/codec/SnappyCodec.cc
Example:
OriginData (the original data is divided into several large data blocks):
     large data block1 | large data block2 | large data block3 | ....
Each large data block is further divided into several small data blocks.
Suppose a large data block is divided into three small blocks:
large data block1: | small block1 | small block2 | small block3 |
CompressData: <A [B1 compress(small block1)] [B2 compress(small block2)] [B3 compress(small block3)]>

A : original (uncompressed) length of the current large data block.
sizeof(A) = 4 bytes.
A = length(small block1) + length(small block2) + length(small block3)
Bx : compressed length of small data block x.
sizeof(Bx) = 4 bytes.
Bx = length(compress(small blockx))
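
To make the framing concrete, here is a minimal sketch that walks one large-block frame as described above. It only parses the length fields and records where each compressed small block sits; the actual snappy decompression is left out. The 4-byte big-endian encoding of A and Bx, and the SnappyBlockFrameReader/Chunk names, are assumptions for this sketch, not taken from the Hadoop or Doris source.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of walking the <A [B1 data1] [B2 data2] ...> layout described above.
// Assumes 4-byte big-endian length fields; decompression of each small block
// is intentionally omitted.
public class SnappyBlockFrameReader {

    /** One compressed small block: where it starts in the frame and how long it is. */
    public static final class Chunk {
        final int offset;
        final int compressedLength;
        Chunk(int offset, int compressedLength) {
            this.offset = offset;
            this.compressedLength = compressedLength;
        }
    }

    /**
     * Parses one large-block frame: reads A (uncompressed length of the whole
     * large block), then repeatedly reads Bx followed by Bx bytes of compressed
     * data until the frame is exhausted.
     */
    public static List<Chunk> parseFrame(byte[] frame) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(frame));
        int uncompressedTotal = in.readInt();          // A: sum of the small blocks' original lengths
        List<Chunk> chunks = new ArrayList<>();
        int offset = 4;                                // first Bx starts right after A
        while (in.available() > 0) {
            int bx = in.readInt();                     // Bx: compressed length of small block x
            offset += 4;
            chunks.add(new Chunk(offset, bx));
            long skipped = in.skip(bx);                // jump over compress(small block x)
            if (skipped != bx) {
                throw new IOException("truncated frame");
            }
            offset += bx;
        }
        // A (uncompressedTotal) can be used by the caller to size the output buffer.
        return chunks;
    }
}
```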

Co-authored-by: Socrates <suxiaogang223@icloud.com>