[enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding by hubgeter · Pull Request #57208 · apache/doris

hubgeter · 2025-10-21T14:59:08Z

What problem does this PR solve?

Problem Summary:
When decoding RLE_DICTIONARY-encoded data, the parquet reader uses memcpy uniformly for every value. However, for fixed-width types such as INT32 and INT64, direct assignment is faster than memcpy.

In Parquet dictionary encoding, the values to copy are not stored contiguously, so each memcpy call copies only a few bytes. Looking at the implementation of memcpy, we can see that such small sizes are handled by __builtin_memcpy, whose implementation essentially behaves like a series of simple assignments. You can observe the corresponding assembly code here: https://godbolt.org/z/r9Ma1ozvd.
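
To make the difference concrete, here is a minimal sketch of the two copy strategies. It is not the code changed in this PR: the function names `decode_dict_memcpy` and `decode_dict_assign`, the parameter layout, and the use of `std::vector` are all assumptions made for illustration. The first function copies each dictionary entry with a width known only at run time; the second takes the value type as a template parameter, so each copy is a plain typed assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical baseline: every value is copied with memcpy, and the element
// width (type_length) is a run-time value.
inline void decode_dict_memcpy(const std::vector<uint8_t>& dict_bytes, size_t type_length,
                               const std::vector<uint32_t>& indices, uint8_t* out) {
    for (size_t i = 0; i < indices.size(); ++i) {
        // Each index selects one small, non-contiguous entry of the dictionary.
        assert((indices[i] + 1) * type_length <= dict_bytes.size());
        std::memcpy(out + i * type_length,
                    dict_bytes.data() + indices[i] * type_length, type_length);
    }
}

// Hypothetical typed path: T (for example int32_t or int64_t) is known at
// compile time, so each copy is a single load and store of a T.
template <typename T>
void decode_dict_assign(const std::vector<T>& dict,
                        const std::vector<uint32_t>& indices, T* out) {
    for (size_t i = 0; i < indices.size(); ++i) {
        assert(indices[i] < dict.size());
        out[i] = dict[indices[i]];
    }
}
```

Both functions produce the same output for the same dictionary and indices; only the way each small value is copied differs. Whether the typed version is measurably faster depends on the compiler and on how memcpy is implemented in the build, so it should be benchmarked rather than assumed.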

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

Thearas · 2025-10-21T14:59:14Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

hubgeter · 2025-10-21T14:59:27Z

run buildall

doris-robot · 2025-10-21T15:55:09Z

ClickBench: Total hot run time: 28.68 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7beb05d58e98a4f1f520775ef892cc1bb043a28e, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.05
query2	0.09	0.06	0.04
query3	0.25	0.09	0.09
query4	1.61	0.12	0.12
query5	0.29	0.26	0.25
query6	1.20	0.67	0.66
query7	0.03	0.02	0.03
query8	0.05	0.04	0.04
query9	0.62	0.53	0.53
query10	0.60	0.59	0.58
query11	0.17	0.12	0.12
query12	0.16	0.13	0.13
query13	0.64	0.64	0.61
query14	1.03	1.03	1.03
query15	0.89	0.86	0.86
query16	0.42	0.42	0.41
query17	1.06	1.11	1.09
query18	0.22	0.20	0.21
query19	1.96	1.92	1.89
query20	0.02	0.01	0.02
query21	15.42	0.19	0.13
query22	5.09	0.08	0.04
query23	15.64	0.27	0.11
query24	2.45	1.69	0.31
query25	0.09	0.06	0.07
query26	0.14	0.14	0.15
query27	0.06	0.06	0.06
query28	5.04	1.18	0.93
query29	12.58	4.13	3.39
query30	0.28	0.14	0.11
query31	2.83	0.63	0.40
query32	3.24	0.56	0.48
query33	3.19	3.10	3.17
query34	16.15	5.55	4.87
query35	4.92	4.93	4.93
query36	0.70	0.52	0.51
query37	0.10	0.07	0.07
query38	0.07	0.04	0.04
query39	0.04	0.04	0.03
query40	0.19	0.16	0.14
query41	0.09	0.03	0.03
query42	0.04	0.03	0.04
query43	0.05	0.04	0.04
Total cold run time: 99.77 s
Total hot run time: 28.68 s

hubgeter · 2025-10-22T07:19:07Z

run buildall

hubgeter · 2025-10-23T02:40:53Z

run buildall

doris-robot · 2025-10-23T03:23:10Z

TPC-DS: Total hot run time: 189993 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1ab29c9b2087d900de21f0abd487788275dad991, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	min hot run (ms)
query1	1083	425	407	407
query2	6560	1758	1736	1736
query3	6766	227	226	226
query4	26243	23469	23147	23147
query5	4425	671	482	482
query6	345	261	232	232
query7	4667	503	300	300
query8	310	286	272	272
query9	8756	2615	2585	2585
query10	515	368	285	285
query11	15701	15019	14839	14839
query12	200	122	115	115
query13	1676	567	439	439
query14	12631	9300	9360	9300
query15	213	195	180	180
query16	7694	684	490	490
query17	1616	793	613	613
query18	2736	444	344	344
query19	224	243	191	191
query20	145	133	145	133
query21	232	137	118	118
query22	4617	4635	4683	4635
query23	35451	33663	34100	33663
query24	8387	2485	2500	2485
query25	603	541	476	476
query26	1367	293	164	164
query27	2931	542	377	377
query28	4418	2234	2179	2179
query29	828	824	550	550
query30	304	233	210	210
query31	946	842	786	786
query32	85	78	73	73
query33	583	399	346	346
query34	900	913	548	548
query35	879	873	824	824
query36	1008	1082	960	960
query37	138	109	87	87
query38	3535	3612	3515	3515
query39	1492	1414	1425	1414
query40	218	124	121	121
query41	63	58	59	58
query42	123	114	151	114
query43	498	496	476	476
query44	1224	733	744	733
query45	183	181	177	177
query46	916	1021	633	633
query47	1764	1811	1713	1713
query48	403	411	319	319
query49	787	498	419	419
query50	660	707	412	412
query51	3908	3886	3937	3886
query52	108	107	101	101
query53	250	275	201	201
query54	599	600	527	527
query55	87	80	86	80
query56	336	335	312	312
query57	1158	1197	1155	1155
query58	294	281	287	281
query59	2539	2677	2563	2563
query60	368	368	361	361
query61	189	184	204	184
query62	798	723	680	680
query63	233	195	197	195
query64	4416	1162	867	867
query65	4024	3952	3980	3952
query66	1111	434	345	345
query67	15417	15206	14900	14900
query68	8254	885	595	595
query69	488	327	293	293
query70	1390	1333	1239	1239
query71	429	355	329	329
query72	5871	4863	4810	4810
query73	649	591	352	352
query74	8909	9094	8674	8674
query75	3379	3346	2928	2928
query76	3288	1204	741	741
query77	512	394	315	315
query78	9526	9672	8961	8961
query79	2127	811	636	636
query80	711	563	516	516
query81	516	270	227	227
query82	223	170	143	143
query83	280	265	263	263
query84	259	111	94	94
query85	867	537	422	422
query86	381	307	294	294
query87	3693	3746	3660	3660
query88	3272	2299	2320	2299
query89	390	325	313	313
query90	2029	223	225	223
query91	170	164	141	141
query92	89	79	69	69
query93	2218	976	644	644
query94	708	459	347	347
query95	420	330	329	329
query96	496	614	293	293
query97	2933	2979	2869	2869
query98	245	211	237	211
query99	1357	1390	1272	1272
Total cold run time: 279963 ms
Total hot run time: 189993 ms

doris-robot · 2025-10-23T03:28:17Z

ClickBench: Total hot run time: 27.59 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1ab29c9b2087d900de21f0abd487788275dad991, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.06	0.05
query2	0.10	0.06	0.06
query3	0.27	0.08	0.08
query4	1.60	0.12	0.12
query5	0.27	0.28	0.26
query6	1.18	0.66	0.66
query7	0.03	0.03	0.03
query8	0.06	0.04	0.04
query9	0.63	0.54	0.51
query10	0.59	0.58	0.58
query11	0.17	0.12	0.11
query12	0.15	0.12	0.12
query13	0.62	0.60	0.60
query14	1.05	1.01	1.01
query15	0.85	0.83	0.84
query16	0.40	0.40	0.39
query17	1.04	1.04	1.02
query18	0.22	0.21	0.20
query19	2.08	1.88	1.76
query20	0.01	0.01	0.02
query21	15.43	0.20	0.13
query22	4.92	0.07	0.05
query23	15.67	0.26	0.10
query24	2.48	0.51	0.38
query25	0.07	0.06	0.06
query26	0.14	0.13	0.14
query27	0.07	0.05	0.07
query28	4.41	1.18	0.93
query29	12.65	3.98	3.33
query30	0.29	0.15	0.11
query31	2.81	0.62	0.39
query32	3.26	0.55	0.48
query33	3.02	3.12	3.02
query34	15.76	5.20	4.56
query35	4.62	4.54	4.58
query36	0.67	0.52	0.51
query37	0.10	0.07	0.07
query38	0.06	0.04	0.04
query39	0.04	0.04	0.03
query40	0.18	0.14	0.14
query41	0.10	0.03	0.04
query42	0.04	0.04	0.04
query43	0.04	0.03	0.04
Total cold run time: 98.21 s
Total hot run time: 27.59 s

hello-stephen · 2025-10-23T03:55:07Z

BE UT Coverage Report

Increment line coverage 90.00% (27/30) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	52.64% (17933/34065)
Line Coverage	37.88% (162710/429497)
Region Coverage	32.30% (124156/384326)
Branch Coverage	33.70% (54372/161344)

github-actions · 2025-10-28T02:54:27Z

PR approved by at least one committer and no changes requested.

github-actions · 2025-10-28T02:54:29Z

PR approved by anyone and no changes requested.

hello-stephen · 2025-10-28T06:06:25Z

BE Regression && UT Coverage Report

Increment line coverage 100.00% (30/30) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	71.38% (23873/33447)
Line Coverage	57.78% (248338/429828)
Region Coverage	52.84% (205861/389589)
Branch Coverage	54.62% (88662/162329)

kaka11chen

LGTM

[enhancement](parquet)improve parquet fixedLengthDict decode performance

7beb05d

fix format

a87509a

hubgeter changed the title ~~[enhancement](parquet)improve parquet fixedLengthDict decode performance~~ [enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding Oct 22, 2025

fix ut

1ab29c9

morningman added dev/3.1.x dev/4.0.x labels Oct 28, 2025

morningman approved these changes Oct 28, 2025

github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 28, 2025

github-actions bot added the reviewed label Oct 28, 2025

kaka11chen approved these changes Oct 31, 2025

morningman merged commit 6fe6656 into apache:master Oct 31, 2025
29 of 31 checks passed

github-actions bot added the dev/3.1.x-conflict label Oct 31, 2025

github-actions bot mentioned this pull request Oct 31, 2025

branch-4.0: [enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding #57208 #57563

Merged

hubgeter mentioned this pull request Nov 3, 2025

branch-3.1:[enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding (#57208) #57614

Merged

16 tasks

morningman pushed a commit that referenced this pull request Nov 4, 2025

branch-3.1:[enhancement](parquet)Optimize the performance of parquet …

873d39e

…reader when decode RLE_DICTIONARY encoding (#57208) (#57614) bp #57208

morningman removed the dev/3.1.x label Nov 4, 2025

morningman added dev/3.1.3-merged and removed dev/3.1.x-conflict labels Nov 4, 2025

yiguolei pushed a commit that referenced this pull request Nov 10, 2025

branch-4.0: [enhancement](parquet)Optimize the performance of parquet…

2dbbc6c

… reader when decode RLE_DICTIONARY encoding #57208 (#57563) Cherry-picked from #57208 Co-authored-by: daidai <changyuwei@selectdb.com>

yiguolei added dev/4.0.2-merged and removed dev/4.0.x labels Nov 10, 2025

yiguolei mentioned this pull request Dec 2, 2025

4.0.2 Release Notes #58605

Open

morningman merged 3 commits into apache:master from hubgeter:improve_rle_encoding
