@Jibing-Li
Contributor

backport: #41203

…sumption. (apache#41203)

For string columns, use xxhash_64 to convert each column value to an
integer, then calculate the NDV (number of distinct values) from the
integer hash values. This reduces the memory cost of sample analyze and
improves performance.
For example, for the l_comment column of the TPCH 100G lineitem table,
the memory cost of calculating its NDV is reduced from 22GB to 8GB.
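
A minimal sketch of the idea described above, not the actual Doris implementation: each string is mapped to a 64-bit integer, and distinct values are counted over the integers instead of the full strings. The helper names `hash64` and `ndv_by_hash` are hypothetical, and `hashlib.blake2b` with an 8-byte digest is used here only as a standard-library stand-in for xxhash_64.

```python
import hashlib


def hash64(value: str) -> int:
    """Map a string to a 64-bit integer.

    Assumption: blake2b (8-byte digest) stands in for xxhash_64,
    which is not available in the Python standard library.
    """
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


def ndv_by_hash(values) -> int:
    """Count distinct values by tracking 8-byte hashes instead of full strings."""
    seen = set()
    for v in values:
        seen.add(hash64(v))
    return len(seen)


# Hypothetical sample rows standing in for a long string column such as l_comment.
comments = [
    "carefully final deposits",
    "quickly ironic packages",
    "carefully final deposits",
]
print(ndv_by_hash(comments))
```

The set only ever holds fixed-size integers, so its footprint no longer grows with the length of the strings. The trade-off is that two different strings could hash to the same value and be counted once, which is very unlikely with a 64-bit hash but makes the result an estimate in principle.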
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@Jibing-Li Jibing-Li marked this pull request as ready for review September 27, 2024 04:00
@Jibing-Li
Contributor Author

run buildall

@Jibing-Li Jibing-Li merged commit 1baf0db into apache:branch-2.1 Sep 27, 2024
@Jibing-Li Jibing-Li deleted the reduceMem2.1 branch September 27, 2024 09:28
