@airborne12
Member

cherry pick from #45833

… writer (apache#45833)

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Optimize memory usage when adding string values to the bloom filter index.
Storing a uint64 hash of each value instead of the string value itself is
expected to save a lot of memory, especially for long text.
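
A minimal sketch of the idea, not the actual Doris writer code: the class names, `add_value`, `payload_bytes`, and the FNV-1a `hash64` helper below are all hypothetical stand-ins chosen for illustration, since this description does not say which hash function or buffer type the writer uses. It only shows why buffering 8-byte hashes costs less than buffering copies of long strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical stand-in hash (64-bit FNV-1a). The real writer may use a different hash.
inline uint64_t hash64(std::string_view s) {
    uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Before (illustrative): buffers a full copy of every distinct value.
class StringBufferingWriter {
public:
    void add_value(std::string_view v) { _values.emplace(v); }
    std::size_t distinct() const { return _values.size(); }
    // Bytes of string payload held in the buffer (ignores container overhead).
    std::size_t payload_bytes() const {
        std::size_t n = 0;
        for (const auto& v : _values) n += v.size();
        return n;
    }

private:
    std::unordered_set<std::string> _values;
};

// After (illustrative): buffers only one uint64 hash per distinct value.
class HashBufferingWriter {
public:
    void add_value(std::string_view v) { _hashes.insert(hash64(v)); }
    std::size_t distinct() const { return _hashes.size(); }
    // Always 8 bytes per buffered entry, regardless of the string length.
    std::size_t payload_bytes() const { return _hashes.size() * sizeof(uint64_t); }

private:
    std::unordered_set<uint64_t> _hashes;
};

// Usage example: one 1000-byte text value and one 1-byte value.
inline void demo() {
    StringBufferingWriter by_string;
    HashBufferingWriter by_hash;
    const std::string long_text(1000, 'x');
    for (std::string_view v : {std::string_view(long_text), std::string_view("a"),
                               std::string_view(long_text)}) {
        by_string.add_value(v);
        by_hash.add_value(v);
    }
    assert(by_string.payload_bytes() == 1001);  // 1000 + 1 bytes of string data
    assert(by_hash.payload_bytes() == 16);      // 2 hashes * 8 bytes
}
```

The trade-off in this sketch is that two different strings with the same 64-bit hash become indistinguishable in the buffer. A bloom filter is already probabilistic and works on hashes of its inputs, so this is usually acceptable, but that is an assumption about the design rather than something stated in this PR.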
@airborne12
Member Author

run buildall

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@doris-robot

TPC-H: Total hot run time: 40825 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 36c4117a727b7dda35d46eeae4cb673e7f3cf4ce, data reload: false

------ Round 1 ----------------------------------
q1	17583	7947	7257	7257
q2	2051	181	174	174
q3	10602	1093	1218	1093
q4	10571	716	764	716
q5	7719	2834	2785	2785
q6	236	144	145	144
q7	980	622	601	601
q8	9354	1979	2038	1979
q9	6550	6389	6373	6373
q10	6975	2308	2338	2308
q11	465	265	267	265
q12	406	215	213	213
q13	17780	2976	3015	2976
q14	243	206	206	206
q15	571	515	530	515
q16	664	628	614	614
q17	982	602	555	555
q18	7412	6563	6778	6563
q19	1385	1059	1152	1059
q20	478	214	203	203
q21	3965	3255	3269	3255
q22	1130	1008	971	971
Total cold run time: 108102 ms
Total hot run time: 40825 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7235	7193	7191	7191
q2	328	234	228	228
q3	2964	2910	2968	2910
q4	2016	1817	1787	1787
q5	5698	5788	5742	5742
q6	223	147	148	147
q7	2253	1826	1826	1826
q8	3367	3596	3440	3440
q9	8900	8963	8907	8907
q10	3568	3545	3568	3545
q11	598	497	499	497
q12	853	645	598	598
q13	10493	3209	3194	3194
q14	301	286	260	260
q15	572	524	506	506
q16	733	665	677	665
q17	1858	1620	1597	1597
q18	8368	7736	7391	7391
q19	1681	1696	1517	1517
q20	2116	1914	1859	1859
q21	5460	5266	5412	5266
q22	1112	1038	1014	1014
Total cold run time: 70697 ms
Total hot run time: 60087 ms

@doris-robot

TPC-DS: Total hot run time: 197680 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 36c4117a727b7dda35d46eeae4cb673e7f3cf4ce, data reload: false

query1	1326	916	909	909
query2	6240	2050	1992	1992
query3	10824	4304	4245	4245
query4	65922	29185	23423	23423
query5	4925	455	454	454
query6	427	177	172	172
query7	5616	320	313	313
query8	311	233	233	233
query9	8928	2685	2677	2677
query10	467	269	254	254
query11	17228	15214	15688	15214
query12	162	102	100	100
query13	1519	449	429	429
query14	10306	7698	7459	7459
query15	207	179	188	179
query16	6748	503	494	494
query17	1062	590	589	589
query18	1924	359	338	338
query19	211	160	158	158
query20	128	112	111	111
query21	59	47	45	45
query22	4769	4513	4423	4423
query23	34778	34411	34421	34411
query24	6083	2966	2889	2889
query25	536	427	424	424
query26	677	167	163	163
query27	1922	311	311	311
query28	4292	2521	2515	2515
query29	706	465	464	464
query30	249	168	174	168
query31	997	803	839	803
query32	66	60	55	55
query33	474	275	289	275
query34	924	502	505	502
query35	813	763	755	755
query36	1060	964	971	964
query37	119	70	72	70
query38	4157	3969	4153	3969
query39	1510	1476	1444	1444
query40	139	82	86	82
query41	50	45	47	45
query42	114	102	98	98
query43	530	491	480	480
query44	1186	857	865	857
query45	187	171	170	170
query46	1138	755	731	731
query47	2069	1927	1931	1927
query48	475	376	423	376
query49	738	369	401	369
query50	834	439	426	426
query51	7422	7297	7246	7246
query52	98	87	84	84
query53	256	184	188	184
query54	552	449	445	445
query55	78	74	73	73
query56	259	227	243	227
query57	1218	1074	1139	1074
query58	214	205	197	197
query59	3083	3156	2853	2853
query60	267	247	252	247
query61	109	108	108	108
query62	764	666	650	650
query63	217	190	194	190
query64	1407	655	636	636
query65	3291	3182	3169	3169
query66	721	298	304	298
query67	15831	15625	15734	15625
query68	3867	571	580	571
query69	428	263	268	263
query70	1202	1129	1135	1129
query71	334	254	254	254
query72	6530	4054	3975	3975
query73	751	348	349	348
query74	10082	8924	9032	8924
query75	3388	2623	2639	2623
query76	1762	1000	1066	1000
query77	486	261	263	261
query78	10493	9669	9689	9669
query79	1227	595	599	595
query80	1016	425	421	421
query81	522	240	233	233
query82	213	112	114	112
query83	166	142	141	141
query84	286	81	82	81
query85	977	305	286	286
query86	401	296	278	278
query87	4420	4206	4401	4206
query88	3796	2394	2389	2389
query89	405	279	280	279
query90	1971	184	185	184
query91	183	145	142	142
query92	67	48	49	48
query93	1927	549	546	546
query94	866	303	294	294
query95	353	249	255	249
query96	608	279	276	276
query97	3362	3164	3176	3164
query98	222	215	189	189
query99	1582	1310	1290	1290
Total cold run time: 315812 ms
Total hot run time: 197680 ms

@doris-robot

ClickBench: Total hot run time: 33.67 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 36c4117a727b7dda35d46eeae4cb673e7f3cf4ce, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.02
query3	0.23	0.07	0.07
query4	1.63	0.10	0.10
query5	0.52	0.50	0.52
query6	1.13	0.73	0.73
query7	0.03	0.02	0.01
query8	0.04	0.04	0.03
query9	0.58	0.48	0.49
query10	0.54	0.54	0.55
query11	0.14	0.10	0.10
query12	0.13	0.11	0.11
query13	0.60	0.60	0.59
query14	2.92	2.91	2.91
query15	0.90	0.83	0.83
query16	0.39	0.41	0.37
query17	1.04	1.07	1.03
query18	0.25	0.22	0.22
query19	2.00	1.93	2.05
query20	0.01	0.01	0.01
query21	15.35	0.57	0.58
query22	2.35	2.39	2.65
query23	16.90	0.98	0.90
query24	3.02	1.03	1.72
query25	0.18	0.27	0.17
query26	0.41	0.15	0.14
query27	0.03	0.05	0.04
query28	9.87	1.11	1.08
query29	12.55	3.17	3.19
query30	0.24	0.05	0.06
query31	2.86	0.40	0.39
query32	3.26	0.46	0.47
query33	2.99	3.04	3.03
query34	16.78	4.52	4.45
query35	4.49	4.49	4.42
query36	0.67	0.49	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.03
query39	0.04	0.02	0.02
query40	0.18	0.12	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.02	0.03
Total cold run time: 105.63 s
Total hot run time: 33.67 s

@airborne12 airborne12 merged commit cac25be into apache:branch-3.0 Dec 27, 2024
20 of 21 checks passed
@airborne12 airborne12 deleted the pick_45833_to_origin_branch-3.0 branch December 27, 2024 01:23