Skip to content

Conversation

@suxiaogang223
Copy link
Contributor

Proposed changes

Make skip_page() in ColumnChunkReader more efficient. No more reading page headers if there are pagelocations in chunk.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@carlvinhust2012
Copy link
Contributor

can you add some test case in UT or Regression?

@suxiaogang223 suxiaogang223 changed the title Skip page with offset index [opt](paequet)Skip page with offset index Apr 2, 2024
@suxiaogang223 suxiaogang223 changed the title [opt](paequet)Skip page with offset index [opt](parquet)Skip page with offset index Apr 2, 2024
@suxiaogang223 suxiaogang223 force-pushed the skip_page_with_offset_index branch from 7362f16 to 7a6808a Compare April 7, 2024 13:52
@suxiaogang223 suxiaogang223 force-pushed the skip_page_with_offset_index branch from bc447bc to d609254 Compare April 15, 2024 01:41
@AshinGau
Copy link
Member

LGTM

AshinGau
AshinGau previously approved these changes Apr 23, 2024
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 23, 2024
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@suxiaogang223 suxiaogang223 force-pushed the skip_page_with_offset_index branch from ed38bbc to ee402f1 Compare April 23, 2024 08:26
@suxiaogang223
Copy link
Contributor Author

suxiaogang223 commented Apr 23, 2024

run buildall

@doris-robot
Copy link

ClickBench: Total hot run time: 31.32 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ee402f1cd43bbcdfe5d0d75a91904858994ed7ca, data reload: false

query1	0.04	0.04	0.02
query2	0.08	0.03	0.03
query3	0.24	0.05	0.05
query4	1.67	0.08	0.07
query5	0.51	0.52	0.51
query6	1.20	0.87	0.81
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.49	0.44	0.45
query10	0.51	0.52	0.51
query11	0.14	0.10	0.12
query12	0.14	0.10	0.10
query13	0.63	0.64	0.63
query14	0.98	0.90	1.05
query15	0.86	0.86	0.85
query16	0.37	0.37	0.38
query17	0.97	0.95	1.05
query18	0.23	0.24	0.24
query19	1.86	1.84	1.82
query20	0.01	0.01	0.01
query21	15.41	0.67	0.66
query22	3.75	7.80	2.06
query23	18.29	1.42	1.31
query24	1.57	0.32	0.27
query25	0.15	0.09	0.11
query26	0.27	0.18	0.17
query27	0.10	0.10	0.09
query28	13.48	1.02	1.01
query29	12.71	3.40	3.42
query30	0.26	0.08	0.06
query31	2.84	0.40	0.40
query32	3.25	0.50	0.49
query33	2.75	2.72	2.75
query34	17.39	4.51	4.56
query35	4.47	4.70	4.46
query36	0.66	0.47	0.46
query37	0.20	0.17	0.18
query38	0.20	0.19	0.18
query39	0.05	0.05	0.04
query40	0.19	0.16	0.15
query41	0.11	0.06	0.06
query42	0.06	0.06	0.06
query43	0.06	0.05	0.04
Total cold run time: 109.22 s
Total hot run time: 31.32 s

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 24, 2024
@suxiaogang223
Copy link
Contributor Author

run buildall

1 similar comment
@suxiaogang223
Copy link
Contributor Author

run buildall

@suxiaogang223 suxiaogang223 force-pushed the skip_page_with_offset_index branch from acadd36 to 0e2294f Compare April 24, 2024 06:57
@suxiaogang223
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 35.19% (8918/25343)
Line Coverage: 26.98% (73378/272002)
Region Coverage: 26.15% (37897/144921)
Branch Coverage: 22.98% (19295/83976)
Coverage Report: http://coverage.selectdb-in.cc/coverage/0e2294f29f924a107bb194241f0232e4e6d7c8dc_0e2294f29f924a107bb194241f0232e4e6d7c8dc/report/index.html

@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 24, 2024
Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morningman
Copy link
Contributor

Need to add some test methods

@morningman morningman merged commit 95a1be4 into apache:master Apr 26, 2024
yiguolei pushed a commit that referenced this pull request Apr 26, 2024
Make skip_page() in ColumnChunkReader more efficient. No more reading page headers if there are pagelocations in chunk.
@suxiaogang223 suxiaogang223 deleted the skip_page_with_offset_index branch April 26, 2024 08:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/2.1.3-merged reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants