@924060929
Contributor

924060929 commented Feb 7, 2025

What problem does this PR solve?

This PR speeds up partition pruning for huge InPredicates, for example this SQL:

select *
from tbl
where dt in ('2024-01-02 01:00:00',  ... , '2024-05-02 02:00:00') -- about 2k literals

In my test case, a query with 2k literals against a table with 20k hourly (range / list) partitions, this PR reduces the time from 12s to 160ms when binary search filtering is disabled, and from 6s to 160ms when it is enabled.

The changes:

  1. add SessionVariable.enable_binary_search_filtering_partitions, so that binary search filtering of partitions can be disabled
  2. cache the Set/RangeSet of InPredicate.options, instead of iterating over the options to evaluate them for every partition
  3. use the big RangeSet as the left side and the small RangeSet as the right side of the intersection. The order matters: TreeRangeSet is backed by a red-black tree, so every item of the right side has to be iterated, while the left side can be searched quickly
  4. skip tablet pruning if the number of InPredicate.options > Config.max_distribution_pruner_recursion_depth, because such a predicate tends to hit all buckets
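
A minimal Java sketch of the ideas behind changes 2 to 4, not the code from this PR. It uses only java.util: a TreeMap (also a red-black tree) stands in for Guava's TreeRangeSet, ranges are closed [lo, hi] pairs of long and are assumed to be disjoint, and every class and method name (InPredicatePruningSketch, pruneListPartitions, intersects, shouldPruneTablets) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative only: all names here are hypothetical, not Doris classes.
final class InPredicatePruningSketch {

    // Change 2: build the set of IN-list options once and reuse it for every
    // partition, instead of looping over all options for each partition.
    static List<Integer> pruneListPartitions(List<Long> options, List<Set<Long>> partitions) {
        Set<Long> cachedOptions = new HashSet<>(options); // built once
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < partitions.size(); i++) {
            for (Long value : partitions.get(i)) {
                if (cachedOptions.contains(value)) { // hash lookup, no scan of the options
                    selected.add(i);
                    break;
                }
            }
        }
        return selected;
    }

    // Change 3: keep the big side in a sorted tree (lower bound -> upper bound,
    // ranges assumed disjoint) and iterate only the small side.
    static boolean intersects(TreeMap<Long, Long> big, List<long[]> small) {
        for (long[] range : small) {
            // Last range of the big side that starts at or before range[1].
            Map.Entry<Long, Long> candidate = big.floorEntry(range[1]);
            if (candidate != null && candidate.getValue() >= range[0]) {
                return true;
            }
        }
        return false;
    }

    // Change 4: skip tablet pruning when the IN list is longer than the threshold.
    static boolean shouldPruneTablets(int optionCount, int maxRecursionDepth) {
        return optionCount <= maxRecursionDepth;
    }

    public static void main(String[] args) {
        List<Set<Long>> partitions = List.of(Set.of(1L, 2L), Set.of(3L, 4L), Set.of(42L));
        System.out.println(pruneListPartitions(List.of(3L, 42L), partitions));

        TreeMap<Long, Long> big = new TreeMap<>();
        big.put(0L, 9L);
        big.put(20L, 29L);
        big.put(40L, 49L);
        System.out.println(intersects(big, List.of(new long[] {30L, 35L})));
        System.out.println(intersects(big, List.of(new long[] {25L, 26L})));
    }
}
```

With n ranges on the big side and m on the small side, this does m tree lookups of O(log n) each. Swapping the sides would iterate all n ranges, which is why the order of the intersection is significant.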

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Feb 7, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@924060929
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31594 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 696321b5aca4861445aef43e1057a0c5f5314f62, data reload: false

------ Round 1 ----------------------------------
q1	17570	5287	5060	5060
q2	2047	323	185	185
q3	10368	1226	762	762
q4	10210	989	537	537
q5	7524	2397	2303	2303
q6	189	173	131	131
q7	902	748	588	588
q8	9286	1306	1083	1083
q9	4889	4771	4721	4721
q10	6805	2311	1882	1882
q11	486	268	256	256
q12	344	350	224	224
q13	17767	3669	3087	3087
q14	230	230	205	205
q15	533	470	462	462
q16	645	629	586	586
q17	567	872	339	339
q18	6653	6195	6222	6195
q19	1222	957	543	543
q20	304	329	186	186
q21	2782	2174	1955	1955
q22	374	341	304	304
Total cold run time: 101697 ms
Total hot run time: 31594 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5112	5104	5120	5104
q2	249	333	230	230
q3	2237	2721	2347	2347
q4	1503	1837	1385	1385
q5	4195	4140	4147	4140
q6	208	163	126	126
q7	1886	1822	1639	1639
q8	2599	2609	2493	2493
q9	7290	7175	7181	7175
q10	3000	3184	2748	2748
q11	582	527	496	496
q12	689	774	613	613
q13	3601	3855	3263	3263
q14	285	289	286	286
q15	502	473	460	460
q16	656	686	637	637
q17	1131	1616	1347	1347
q18	7588	7419	7226	7226
q19	772	765	816	765
q20	1993	2025	1888	1888
q21	5376	4877	4882	4877
q22	655	585	528	528
Total cold run time: 52109 ms
Total hot run time: 49773 ms

@doris-robot

TPC-DS: Total hot run time: 190664 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 696321b5aca4861445aef43e1057a0c5f5314f62, data reload: false

query1	1312	953	924	924
query2	6236	1876	1869	1869
query3	11018	4472	4563	4472
query4	53942	25334	23456	23456
query5	5056	564	481	481
query6	345	209	200	200
query7	4920	516	301	301
query8	311	255	246	246
query9	5707	2496	2486	2486
query10	423	331	258	258
query11	15165	15170	14910	14910
query12	160	111	109	109
query13	1051	521	406	406
query14	10260	6443	6782	6443
query15	218	209	195	195
query16	6682	635	485	485
query17	1079	720	586	586
query18	906	398	324	324
query19	198	199	174	174
query20	129	127	159	127
query21	205	125	101	101
query22	4259	4498	4460	4460
query23	33945	33265	33335	33265
query24	6338	2455	2471	2455
query25	474	455	395	395
query26	715	280	155	155
query27	2228	488	323	323
query28	3074	2440	2384	2384
query29	541	560	409	409
query30	218	196	156	156
query31	890	880	796	796
query32	78	61	62	61
query33	441	384	315	315
query34	1029	858	500	500
query35	798	841	762	762
query36	946	991	886	886
query37	127	115	81	81
query38	4270	4294	4168	4168
query39	1501	1473	1432	1432
query40	208	119	102	102
query41	51	50	48	48
query42	119	107	105	105
query43	508	551	498	498
query44	1339	821	792	792
query45	186	171	165	165
query46	885	1060	658	658
query47	1831	1817	1769	1769
query48	417	424	312	312
query49	719	531	414	414
query50	720	740	417	417
query51	4326	4361	4208	4208
query52	110	108	105	105
query53	238	267	201	201
query54	478	517	421	421
query55	84	81	80	80
query56	261	273	271	271
query57	1136	1158	1124	1124
query58	238	235	251	235
query59	2801	2785	2807	2785
query60	294	271	262	262
query61	131	118	119	118
query62	740	757	671	671
query63	234	192	193	192
query64	1918	1044	713	713
query65	3400	3152	3151	3151
query66	773	400	297	297
query67	16119	15660	15421	15421
query68	5990	784	513	513
query69	538	312	266	266
query70	1188	1099	1121	1099
query71	456	287	280	280
query72	5925	3738	3783	3738
query73	1279	750	347	347
query74	9038	9010	9136	9010
query75	3242	3156	2693	2693
query76	3923	1173	735	735
query77	566	384	281	281
query78	10058	10176	9443	9443
query79	2520	824	588	588
query80	609	528	470	470
query81	501	285	248	248
query82	469	155	124	124
query83	175	176	153	153
query84	292	101	82	82
query85	791	347	315	315
query86	385	312	292	292
query87	4708	4695	4484	4484
query88	3553	2157	2167	2157
query89	419	315	288	288
query90	1784	195	233	195
query91	132	135	110	110
query92	78	60	60	60
query93	1929	1040	581	581
query94	699	397	288	288
query95	356	270	260	260
query96	486	577	272	272
query97	2796	2853	2740	2740
query98	232	207	202	202
query99	1375	1408	1259	1259
Total cold run time: 293951 ms
Total hot run time: 190664 ms

@doris-robot

ClickBench: Total hot run time: 30.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 696321b5aca4861445aef43e1057a0c5f5314f62, data reload: false

query1	0.04	0.04	0.05
query2	0.07	0.04	0.03
query3	0.23	0.06	0.07
query4	1.61	0.10	0.10
query5	0.42	0.42	0.40
query6	1.16	0.65	0.66
query7	0.03	0.02	0.01
query8	0.04	0.04	0.03
query9	0.61	0.52	0.51
query10	0.58	0.57	0.57
query11	0.15	0.10	0.11
query12	0.15	0.11	0.11
query13	0.62	0.60	0.61
query14	2.79	2.81	2.72
query15	0.93	0.84	0.86
query16	0.40	0.36	0.36
query17	1.04	1.01	1.03
query18	0.21	0.19	0.20
query19	1.91	1.79	1.98
query20	0.01	0.01	0.01
query21	15.37	0.88	0.54
query22	0.75	1.17	0.65
query23	14.99	1.39	0.57
query24	7.19	0.91	0.68
query25	0.46	0.26	0.08
query26	0.63	0.17	0.14
query27	0.05	0.04	0.04
query28	9.40	0.88	0.43
query29	12.55	4.04	3.33
query30	0.24	0.09	0.06
query31	2.83	0.61	0.38
query32	3.22	0.54	0.48
query33	3.05	3.03	3.02
query34	15.79	5.12	4.53
query35	4.52	4.52	4.54
query36	0.66	0.49	0.47
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.17	0.14	0.14
query41	0.08	0.03	0.03
query42	0.03	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 105.19 s
Total hot run time: 30.39 s

@924060929
Contributor Author

run buildall

@924060929
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31839 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ac316e861057bf6c1eeb09ed4c51b5b484e7bf89, data reload: false

------ Round 1 ----------------------------------
q1	17564	5224	5273	5224
q2	2057	305	174	174
q3	10413	1289	754	754
q4	10214	1028	533	533
q5	7545	2404	2338	2338
q6	185	167	132	132
q7	917	739	594	594
q8	9300	1289	1116	1116
q9	4914	4655	4899	4655
q10	6822	2310	1893	1893
q11	483	294	268	268
q12	345	356	210	210
q13	17747	3661	3137	3137
q14	230	219	213	213
q15	514	466	453	453
q16	639	613	591	591
q17	581	883	343	343
q18	6899	6266	6196	6196
q19	1572	954	556	556
q20	300	321	195	195
q21	2795	2163	1956	1956
q22	357	329	308	308
Total cold run time: 102393 ms
Total hot run time: 31839 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5139	5087	5062	5062
q2	233	324	228	228
q3	2176	2704	2317	2317
q4	1422	1840	1358	1358
q5	4267	4193	4140	4140
q6	211	168	125	125
q7	1881	1828	1719	1719
q8	2590	2677	2596	2596
q9	7209	7209	7205	7205
q10	3020	3244	2788	2788
q11	593	545	501	501
q12	677	796	629	629
q13	3426	3838	3285	3285
q14	279	300	276	276
q15	520	478	449	449
q16	635	708	622	622
q17	1142	1582	1364	1364
q18	7637	7418	7344	7344
q19	782	818	952	818
q20	1971	2115	1860	1860
q21	5515	4961	4828	4828
q22	599	586	584	584
Total cold run time: 51924 ms
Total hot run time: 50098 ms

@doris-robot

TPC-DS: Total hot run time: 190775 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ac316e861057bf6c1eeb09ed4c51b5b484e7bf89, data reload: false

query1	1311	953	974	953
query2	6159	1922	1886	1886
query3	10994	4560	4445	4445
query4	53736	25534	23563	23563
query5	5289	513	493	493
query6	394	191	176	176
query7	5253	505	283	283
query8	334	250	232	232
query9	6990	2589	2584	2584
query10	431	298	258	258
query11	15174	15118	14885	14885
query12	153	110	111	110
query13	1263	535	399	399
query14	11042	6413	6859	6413
query15	214	198	182	182
query16	7043	630	467	467
query17	1045	700	577	577
query18	1526	416	304	304
query19	190	188	159	159
query20	124	131	133	131
query21	213	122	102	102
query22	4705	4810	4550	4550
query23	34045	33338	33153	33153
query24	5620	2429	2436	2429
query25	454	455	410	410
query26	661	273	153	153
query27	1841	495	330	330
query28	2810	2465	2424	2424
query29	567	559	437	437
query30	212	189	162	162
query31	882	908	803	803
query32	67	63	61	61
query33	447	359	314	314
query34	832	875	514	514
query35	795	847	757	757
query36	951	996	921	921
query37	122	102	78	78
query38	4322	4303	4231	4231
query39	1538	1477	1426	1426
query40	207	126	116	116
query41	58	55	56	55
query42	119	102	105	102
query43	511	539	514	514
query44	1355	833	833	833
query45	184	176	167	167
query46	890	1069	660	660
query47	1885	1890	1818	1818
query48	428	438	305	305
query49	705	497	414	414
query50	711	785	422	422
query51	4253	4307	4187	4187
query52	111	107	109	107
query53	246	260	193	193
query54	484	490	418	418
query55	84	78	86	78
query56	276	307	271	271
query57	1188	1217	1129	1129
query58	260	241	277	241
query59	2730	2976	2838	2838
query60	321	266	262	262
query61	121	117	122	117
query62	757	768	658	658
query63	226	188	190	188
query64	1488	1044	695	695
query65	3248	3159	3131	3131
query66	796	388	294	294
query67	16029	15712	15641	15641
query68	5407	767	501	501
query69	531	290	262	262
query70	1141	1125	1081	1081
query71	449	291	269	269
query72	6394	3673	3725	3673
query73	1063	744	346	346
query74	9175	9138	8835	8835
query75	3344	3136	2670	2670
query76	3942	1167	755	755
query77	542	364	269	269
query78	10024	10302	9441	9441
query79	2133	828	589	589
query80	620	531	450	450
query81	508	275	232	232
query82	226	152	117	117
query83	176	172	150	150
query84	291	90	72	72
query85	747	355	303	303
query86	356	318	292	292
query87	4496	4470	4542	4470
query88	3684	2288	2162	2162
query89	415	319	282	282
query90	1789	196	190	190
query91	130	135	107	107
query92	75	61	55	55
query93	1953	1035	564	564
query94	658	410	304	304
query95	349	260	257	257
query96	500	565	266	266
query97	2786	2835	2725	2725
query98	231	201	201	201
query99	1373	1413	1276	1276
Total cold run time: 294928 ms
Total hot run time: 190775 ms

@doris-robot

ClickBench: Total hot run time: 30.9 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ac316e861057bf6c1eeb09ed4c51b5b484e7bf89, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.07
query4	1.62	0.10	0.10
query5	0.42	0.41	0.42
query6	1.18	0.67	0.66
query7	0.02	0.02	0.01
query8	0.03	0.03	0.03
query9	0.58	0.54	0.53
query10	0.57	0.57	0.56
query11	0.15	0.10	0.11
query12	0.14	0.11	0.11
query13	0.61	0.60	0.60
query14	2.70	2.72	2.81
query15	0.92	0.85	0.84
query16	0.37	0.37	0.37
query17	1.01	1.04	1.06
query18	0.23	0.20	0.20
query19	1.87	1.81	1.99
query20	0.01	0.02	0.01
query21	15.38	0.87	0.53
query22	0.76	1.42	0.94
query23	14.70	1.38	0.59
query24	7.49	0.99	1.17
query25	0.48	0.31	0.10
query26	0.56	0.17	0.14
query27	0.05	0.05	0.05
query28	9.57	0.83	0.41
query29	12.54	3.88	3.27
query30	0.25	0.10	0.07
query31	2.81	0.58	0.38
query32	3.23	0.56	0.46
query33	3.02	3.10	3.01
query34	15.81	5.08	4.46
query35	4.53	4.46	4.51
query36	0.69	0.49	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.17	0.13	0.14
query41	0.09	0.03	0.03
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.14 s
Total hot run time: 30.9 s

@924060929
Contributor Author

run buildall

@924060929
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31776 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7d2162d3bfb337dd3c1e635df1e45158bf98d9ba, data reload: false

------ Round 1 ----------------------------------
q1	17598	5287	5087	5087
q2	2064	299	163	163
q3	10408	1298	744	744
q4	10210	1013	543	543
q5	7548	2349	2403	2349
q6	188	168	135	135
q7	897	753	591	591
q8	9314	1292	1127	1127
q9	4844	4720	4734	4720
q10	6830	2313	1886	1886
q11	481	277	256	256
q12	348	357	219	219
q13	17779	3707	3121	3121
q14	229	230	206	206
q15	520	472	461	461
q16	634	616	597	597
q17	584	873	342	342
q18	6953	6239	6264	6239
q19	1393	941	562	562
q20	313	323	190	190
q21	2941	2118	1942	1942
q22	370	335	296	296
Total cold run time: 102446 ms
Total hot run time: 31776 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5152	5159	5097	5097
q2	241	329	233	233
q3	2183	2691	2277	2277
q4	1420	1846	1353	1353
q5	4240	4160	4148	4148
q6	215	164	125	125
q7	1861	1843	1699	1699
q8	2631	2724	2610	2610
q9	7278	7234	7216	7216
q10	3017	3221	2806	2806
q11	584	540	486	486
q12	705	835	671	671
q13	3386	3934	3302	3302
q14	272	295	282	282
q15	515	470	469	469
q16	635	667	649	649
q17	1153	1609	1337	1337
q18	7647	7494	7286	7286
q19	809	798	826	798
q20	1942	2039	1867	1867
q21	5449	4956	4772	4772
q22	612	594	574	574
Total cold run time: 51947 ms
Total hot run time: 50057 ms

@doris-robot

TPC-DS: Total hot run time: 190304 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7d2162d3bfb337dd3c1e635df1e45158bf98d9ba, data reload: false

query1	1324	976	938	938
query2	6344	1835	1878	1835
query3	10982	4417	4434	4417
query4	54512	25757	22949	22949
query5	5039	543	504	504
query6	347	199	185	185
query7	4941	514	307	307
query8	316	250	229	229
query9	5774	2525	2505	2505
query10	411	309	251	251
query11	15093	15047	14886	14886
query12	149	109	105	105
query13	1064	527	424	424
query14	10208	6234	7038	6234
query15	214	209	193	193
query16	7048	672	499	499
query17	1093	728	607	607
query18	1525	435	322	322
query19	201	191	170	170
query20	128	122	133	122
query21	212	130	107	107
query22	4497	4623	4729	4623
query23	33998	33351	33560	33351
query24	5591	2464	2409	2409
query25	486	446	402	402
query26	725	272	157	157
query27	1860	489	336	336
query28	2792	2470	2407	2407
query29	584	582	428	428
query30	210	192	153	153
query31	879	861	813	813
query32	72	58	60	58
query33	437	359	324	324
query34	766	850	509	509
query35	787	815	780	780
query36	1002	1003	920	920
query37	136	103	76	76
query38	4344	4385	4290	4290
query39	1475	1460	1428	1428
query40	211	122	104	104
query41	51	52	71	52
query42	120	106	111	106
query43	502	532	501	501
query44	1308	821	825	821
query45	177	174	162	162
query46	890	1082	655	655
query47	1910	1913	1856	1856
query48	390	435	317	317
query49	720	538	431	431
query50	729	739	409	409
query51	4296	4287	4233	4233
query52	105	109	102	102
query53	239	269	199	199
query54	481	491	418	418
query55	87	80	90	80
query56	274	264	262	262
query57	1212	1248	1131	1131
query58	256	249	243	243
query59	2876	2935	2743	2743
query60	314	272	277	272
query61	138	118	144	118
query62	720	732	700	700
query63	237	193	187	187
query64	1918	1042	700	700
query65	3181	3131	3119	3119
query66	729	396	303	303
query67	15851	15528	15622	15528
query68	5506	794	595	595
query69	526	298	273	273
query70	1180	1114	1100	1100
query71	445	299	259	259
query72	6025	3681	3837	3681
query73	1307	753	355	355
query74	9046	8966	8747	8747
query75	3205	3153	2698	2698
query76	3967	1171	749	749
query77	541	378	278	278
query78	9979	10164	9401	9401
query79	2127	831	610	610
query80	668	534	499	499
query81	500	273	234	234
query82	254	155	119	119
query83	168	172	159	159
query84	289	99	73	73
query85	733	351	309	309
query86	352	314	299	299
query87	4564	4611	4397	4397
query88	3074	2233	2185	2185
query89	395	319	297	297
query90	1775	190	191	190
query91	142	134	112	112
query92	71	61	56	56
query93	2264	1058	593	593
query94	680	396	280	280
query95	350	270	258	258
query96	492	556	274	274
query97	2772	2827	2759	2759
query98	224	202	218	202
query99	1331	1379	1258	1258
Total cold run time: 292440 ms
Total hot run time: 190304 ms

@doris-robot

ClickBench: Total hot run time: 30.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7d2162d3bfb337dd3c1e635df1e45158bf98d9ba, data reload: false

query1	0.03	0.03	0.05
query2	0.07	0.03	0.04
query3	0.24	0.07	0.07
query4	1.61	0.10	0.10
query5	0.41	0.40	0.41
query6	1.16	0.66	0.67
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.58	0.52	0.51
query10	0.58	0.58	0.56
query11	0.15	0.12	0.11
query12	0.14	0.10	0.11
query13	0.61	0.60	0.61
query14	2.67	2.67	2.71
query15	0.93	0.84	0.84
query16	0.37	0.38	0.37
query17	1.01	1.02	1.02
query18	0.21	0.20	0.20
query19	1.91	1.78	1.93
query20	0.01	0.02	0.01
query21	15.36	0.93	0.57
query22	0.77	1.20	0.71
query23	14.87	1.38	0.65
query24	7.45	1.66	0.77
query25	0.54	0.16	0.13
query26	0.49	0.16	0.15
query27	0.05	0.05	0.05
query28	10.15	0.79	0.43
query29	12.52	3.96	3.27
query30	0.27	0.09	0.06
query31	2.82	0.58	0.38
query32	3.22	0.54	0.47
query33	2.96	2.93	3.02
query34	15.81	5.09	4.49
query35	4.49	4.51	4.49
query36	0.65	0.48	0.49
query37	0.09	0.07	0.06
query38	0.05	0.03	0.04
query39	0.03	0.03	0.03
query40	0.17	0.14	0.13
query41	0.09	0.03	0.02
query42	0.04	0.03	0.02
query43	0.04	0.03	0.02
Total cold run time: 105.68 s
Total hot run time: 30.44 s

@924060929
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31306 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 32063f9acd0c99093017423264d3d2331a311d7e, data reload: false

------ Round 1 ----------------------------------
q1	17588	5310	5074	5074
q2	2056	301	172	172
q3	10383	1416	722	722
q4	10209	1014	562	562
q5	7539	2487	2256	2256
q6	187	167	136	136
q7	911	753	602	602
q8	9309	1335	1124	1124
q9	4975	4546	4632	4546
q10	6816	2336	1883	1883
q11	472	284	260	260
q12	350	361	219	219
q13	17764	3681	3031	3031
q14	226	230	203	203
q15	510	468	463	463
q16	615	618	572	572
q17	607	860	333	333
q18	6610	6198	6239	6198
q19	1209	946	527	527
q20	317	323	203	203
q21	2759	2109	1905	1905
q22	361	341	315	315
Total cold run time: 101773 ms
Total hot run time: 31306 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5129	5098	5119	5098
q2	240	328	241	241
q3	2172	2624	2306	2306
q4	1402	1832	1347	1347
q5	4214	4152	4111	4111
q6	213	163	123	123
q7	1859	1801	1648	1648
q8	2555	2441	2512	2441
q9	7318	7197	7049	7049
q10	2979	3270	2725	2725
q11	583	502	494	494
q12	697	783	611	611
q13	3498	3889	3221	3221
q14	292	290	278	278
q15	513	483	465	465
q16	620	692	628	628
q17	1127	1538	1364	1364
q18	7469	7175	7161	7161
q19	802	826	849	826
q20	2006	2001	1906	1906
q21	5280	4930	4933	4930
q22	645	587	555	555
Total cold run time: 51613 ms
Total hot run time: 49528 ms

@doris-robot

TPC-DS: Total hot run time: 183992 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 32063f9acd0c99093017423264d3d2331a311d7e, data reload: false

query1	974	368	369	368
query2	6529	1879	1808	1808
query3	6797	219	227	219
query4	26342	23898	23427	23427
query5	4302	679	538	538
query6	312	192	178	178
query7	4604	504	290	290
query8	282	249	218	218
query9	8596	2529	2532	2529
query10	467	325	263	263
query11	15183	15163	14984	14984
query12	158	110	108	108
query13	1660	527	397	397
query14	9148	6249	6215	6215
query15	208	201	178	178
query16	7129	645	470	470
query17	935	712	560	560
query18	1963	397	316	316
query19	201	198	158	158
query20	120	119	122	119
query21	215	124	106	106
query22	4242	4384	4412	4384
query23	34553	33477	32991	32991
query24	7732	2358	2395	2358
query25	523	448	379	379
query26	1244	278	156	156
query27	2437	475	324	324
query28	4182	2420	2389	2389
query29	776	549	459	459
query30	230	184	153	153
query31	915	847	777	777
query32	80	60	65	60
query33	557	372	291	291
query34	779	852	491	491
query35	771	798	732	732
query36	985	979	893	893
query37	113	102	69	69
query38	4224	4092	4071	4071
query39	1428	1372	1393	1372
query40	208	115	100	100
query41	56	50	48	48
query42	124	109	109	109
query43	510	502	469	469
query44	1296	791	783	783
query45	179	172	162	162
query46	891	1054	664	664
query47	1776	1768	1734	1734
query48	408	435	302	302
query49	767	490	401	401
query50	696	734	408	408
query51	4165	4138	4122	4122
query52	113	102	95	95
query53	222	245	185	185
query54	503	487	392	392
query55	79	81	86	81
query56	255	262	246	246
query57	1127	1148	1074	1074
query58	273	248	240	240
query59	2537	2531	2582	2531
query60	279	281	256	256
query61	122	116	117	116
query62	806	713	661	661
query63	230	187	184	184
query64	4452	997	660	660
query65	3204	3124	3198	3124
query66	1145	396	294	294
query67	15832	15722	15387	15387
query68	8079	761	518	518
query69	465	293	261	261
query70	1221	1147	1140	1140
query71	411	290	263	263
query72	5770	3530	3724	3530
query73	717	766	362	362
query74	9004	9159	8891	8891
query75	3280	3150	2683	2683
query76	3297	1165	743	743
query77	597	373	279	279
query78	9897	10002	9369	9369
query79	2451	853	638	638
query80	621	522	480	480
query81	481	289	242	242
query82	666	128	97	97
query83	175	167	153	153
query84	286	105	72	72
query85	782	353	313	313
query86	331	301	289	289
query87	4489	4540	4555	4540
query88	3850	2300	2209	2209
query89	399	317	281	281
query90	1894	198	201	198
query91	133	142	116	116
query92	70	61	53	53
query93	1453	1006	575	575
query94	695	397	283	283
query95	360	280	265	265
query96	493	548	277	277
query97	2760	2886	2727	2727
query98	227	263	202	202
query99	1570	1420	1317	1317
Total cold run time: 271075 ms
Total hot run time: 183992 ms

@doris-robot

ClickBench: Total hot run time: 30.19 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 32063f9acd0c99093017423264d3d2331a311d7e, data reload: false

query1	0.03	0.04	0.03
query2	0.07	0.03	0.03
query3	0.24	0.06	0.06
query4	1.64	0.10	0.10
query5	0.42	0.42	0.40
query6	1.19	0.65	0.66
query7	0.03	0.01	0.01
query8	0.04	0.03	0.03
query9	0.60	0.52	0.52
query10	0.56	0.58	0.59
query11	0.15	0.10	0.10
query12	0.14	0.11	0.11
query13	0.61	0.59	0.60
query14	2.68	2.73	2.70
query15	0.91	0.85	0.85
query16	0.39	0.38	0.37
query17	1.02	1.03	1.06
query18	0.21	0.19	0.20
query19	1.92	1.77	1.95
query20	0.01	0.02	0.02
query21	15.37	0.91	0.58
query22	0.75	1.08	0.71
query23	15.00	1.38	0.66
query24	12.34	0.98	0.32
query25	0.31	0.10	0.09
query26	0.61	0.20	0.13
query27	0.06	0.05	0.05
query28	6.16	0.78	0.42
query29	12.55	3.92	3.25
query30	0.25	0.09	0.06
query31	2.82	0.59	0.39
query32	3.24	0.57	0.47
query33	2.98	3.00	3.03
query34	15.87	5.12	4.54
query35	4.54	4.59	4.57
query36	0.67	0.49	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.03	0.03
query40	0.17	0.13	0.12
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.87 s
Total hot run time: 30.19 s

@924060929
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31698 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a4dbc9d668c73f3e08fba2b8f09607e0934e435a, data reload: false

------ Round 1 ----------------------------------
q1	17618	5419	5124	5124
q2	2047	312	175	175
q3	10398	1341	721	721
q4	10228	1017	550	550
q5	7525	2400	2393	2393
q6	196	170	134	134
q7	920	755	615	615
q8	9317	1350	1110	1110
q9	4926	4663	4659	4659
q10	6829	2317	1875	1875
q11	475	274	250	250
q12	350	358	222	222
q13	17767	3692	3075	3075
q14	233	222	204	204
q15	511	462	458	458
q16	623	603	571	571
q17	584	854	331	331
q18	7335	6301	6251	6251
q19	1207	935	548	548
q20	311	335	191	191
q21	2835	2187	1939	1939
q22	351	335	302	302
Total cold run time: 102586 ms
Total hot run time: 31698 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5124	5171	5106	5106
q2	241	334	233	233
q3	2169	2711	2301	2301
q4	1424	1814	1398	1398
q5	4211	4134	4158	4134
q6	206	160	124	124
q7	1839	1833	1772	1772
q8	2596	2567	2507	2507
q9	7268	7115	7121	7115
q10	3038	3232	2749	2749
q11	579	519	496	496
q12	685	762	616	616
q13	3571	3978	3232	3232
q14	284	303	273	273
q15	514	465	454	454
q16	626	690	646	646
q17	1133	1603	1319	1319
q18	7624	7332	7229	7229
q19	861	1075	1156	1075
q20	1985	2104	1919	1919
q21	5350	4948	4857	4857
q22	626	587	549	549
Total cold run time: 51954 ms
Total hot run time: 50104 ms

@doris-robot

TPC-DS: Total hot run time: 183899 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a4dbc9d668c73f3e08fba2b8f09607e0934e435a, data reload: false

query1	961	373	364	364
query2	6529	1874	1827	1827
query3	6791	214	216	214
query4	25842	23538	23275	23275
query5	4358	683	512	512
query6	308	205	171	171
query7	4597	492	297	297
query8	280	235	236	235
query9	8634	2520	2518	2518
query10	463	301	244	244
query11	15392	15200	14873	14873
query12	155	112	105	105
query13	1665	510	390	390
query14	9057	6281	6276	6276
query15	216	193	171	171
query16	7124	647	459	459
query17	1174	715	582	582
query18	1921	401	295	295
query19	192	187	152	152
query20	119	116	116	116
query21	209	131	99	99
query22	4316	4373	4281	4281
query23	33614	32875	32961	32875
query24	7812	2334	2369	2334
query25	539	461	385	385
query26	1231	270	159	159
query27	2461	473	331	331
query28	4239	2418	2391	2391
query29	734	547	417	417
query30	230	184	169	169
query31	929	839	793	793
query32	76	64	64	64
query33	556	364	308	308
query34	774	838	521	521
query35	823	814	742	742
query36	957	996	890	890
query37	114	95	73	73
query38	4182	4171	4193	4171
query39	1454	1389	1400	1389
query40	213	116	102	102
query41	55	79	50	50
query42	128	108	113	108
query43	492	488	477	477
query44	1258	798	799	798
query45	177	166	161	161
query46	870	1019	636	636
query47	1755	1798	1717	1717
query48	384	413	309	309
query49	797	478	414	414
query50	678	713	427	427
query51	4158	4184	4174	4174
query52	104	105	94	94
query53	228	261	189	189
query54	478	502	416	416
query55	86	83	83	83
query56	261	284	268	268
query57	1161	1123	1071	1071
query58	243	241	250	241
query59	2520	2601	2426	2426
query60	305	306	256	256
query61	122	124	130	124
query62	764	723	664	664
query63	236	198	191	191
query64	4381	997	673	673
query65	3199	3141	3119	3119
query66	1130	403	312	312
query67	15695	15391	15360	15360
query68	2299	804	555	555
query69	423	316	266	266
query70	1169	1119	1123	1119
query71	336	286	280	280
query72	6156	3827	3862	3827
query73	631	743	376	376
query74	9119	9150	9052	9052
query75	3104	3157	2682	2682
query76	2232	1153	731	731
query77	345	367	292	292
query78	10044	10175	9228	9228
query79	1523	859	616	616
query80	1358	513	532	513
query81	549	276	233	233
query82	891	131	98	98
query83	253	177	158	158
query84	231	91	73	73
query85	799	347	296	296
query86	391	312	288	288
query87	4572	4443	4490	4443
query88	2994	2222	2193	2193
query89	403	328	293	293
query90	1815	197	197	197
query91	135	137	109	109
query92	60	58	56	56
query93	1118	1028	592	592
query94	669	419	294	294
query95	344	263	266	263
query96	490	526	280	280
query97	2723	2808	2687	2687
query98	222	202	205	202
query99	1301	1419	1279	1279
Total cold run time: 261605 ms
Total hot run time: 183899 ms

@doris-robot

ClickBench: Total hot run time: 30.38 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a4dbc9d668c73f3e08fba2b8f09607e0934e435a, data reload: false

query1	0.03	0.04	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.07
query4	1.63	0.10	0.10
query5	0.41	0.42	0.40
query6	1.17	0.67	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.60	0.52	0.51
query10	0.56	0.58	0.57
query11	0.15	0.10	0.10
query12	0.15	0.12	0.11
query13	0.62	0.61	0.59
query14	2.68	2.74	2.71
query15	0.92	0.86	0.84
query16	0.39	0.37	0.38
query17	1.02	1.07	1.03
query18	0.21	0.19	0.20
query19	1.92	1.81	2.03
query20	0.02	0.02	0.01
query21	15.36	0.89	0.55
query22	0.77	1.17	0.66
query23	14.95	1.39	0.65
query24	7.85	4.91	0.52
query25	0.31	0.27	0.17
query26	0.82	0.17	0.15
query27	0.06	0.05	0.05
query28	6.49	0.78	0.46
query29	12.54	4.08	3.29
query30	0.25	0.09	0.06
query31	2.83	0.58	0.39
query32	3.24	0.55	0.46
query33	2.95	2.95	3.01
query34	15.76	5.18	4.51
query35	4.55	4.54	4.52
query36	0.66	0.49	0.49
query37	0.09	0.07	0.07
query38	0.05	0.04	0.04
query39	0.02	0.03	0.02
query40	0.17	0.13	0.14
query41	0.07	0.02	0.02
query42	0.04	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 102.71 s
Total hot run time: 30.38 s

@doris-robot

ClickBench: Total hot run time: 31.25 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2da04ccc71f70007d0ecce139c088d0e1e494de9, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.03	0.03
query3	0.24	0.06	0.07
query4	1.61	0.10	0.10
query5	0.56	0.55	0.55
query6	1.19	0.72	0.71
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.57	0.53	0.53
query10	0.57	0.57	0.57
query11	0.15	0.10	0.11
query12	0.14	0.11	0.10
query13	0.61	0.59	0.59
query14	2.69	2.81	2.80
query15	0.93	0.85	0.85
query16	0.37	0.38	0.39
query17	1.00	1.00	1.03
query18	0.20	0.19	0.19
query19	1.85	1.82	1.96
query20	0.02	0.01	0.01
query21	15.36	0.88	0.55
query22	0.75	1.22	0.72
query23	14.87	1.36	0.60
query24	6.94	1.71	1.10
query25	0.52	0.37	0.13
query26	0.63	0.16	0.14
query27	0.05	0.05	0.05
query28	9.41	0.87	0.42
query29	12.55	4.01	3.33
query30	0.25	0.09	0.07
query31	2.84	0.59	0.37
query32	3.23	0.55	0.47
query33	2.96	3.00	3.00
query34	15.83	5.20	4.50
query35	4.60	4.54	4.58
query36	0.69	0.49	0.48
query37	0.10	0.06	0.07
query38	0.06	0.04	0.04
query39	0.04	0.02	0.02
query40	0.17	0.15	0.13
query41	0.08	0.03	0.02
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 104.87 s
Total hot run time: 31.25 s

@github-actions
Contributor

github-actions bot commented Mar 6, 2025

PR approved by anyone and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 7, 2025
@github-actions
Contributor

github-actions bot commented Mar 7, 2025

PR approved by at least one committer and no changes requested.

@924060929 924060929 merged commit 5ec7d34 into apache:master Mar 7, 2025
26 checks passed
@924060929 924060929 deleted the opt-in-filter branch March 7, 2025 08:01
924060929 added a commit that referenced this pull request Mar 26, 2025
)

Skip running PruneOlapScanTablet when there are lots of InPredicates; follow-up to
#47608
github-actions bot pushed a commit that referenced this pull request Mar 26, 2025
)

Skip running PruneOlapScanTablet when there are lots of InPredicates; follow-up to
#47608
github-actions bot pushed a commit that referenced this pull request Mar 26, 2025
)

Skip running PruneOlapScanTablet when there are lots of InPredicates; follow-up to
#47608
yiguolei pushed a commit that referenced this pull request Mar 28, 2025
dataroaring pushed a commit that referenced this pull request Apr 22, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…#47608)

This PR speeds up partition pruning for huge InPredicates, for example
in this SQL:
```sql
select *
from tbl
where dt in ('2024-01-02 01:00:00',  ... , '2024-05-02 02:00:00') -- about 2k literals
```

In my test case, which queries a table with 20k hourly (range / list)
partitions using 2k literals, this PR reduces the time from 12s to 160ms
when binary search filtering is disabled, and from 6s to 160ms when
binary search filtering is enabled.

The changes:
1. Add `SessionVariable.enable_binary_search_filtering_partitions` so
that binary search filtering of partitions can be disabled.
2. Cache the Set/RangeSet of the `InPredicate.options`, instead of
iterating over the options to evaluate them for every partition.
3. Intersect with the big RangeSet on the left and the small RangeSet on
the right. The order matters because TreeRangeSet is a red-black tree:
the right side has to iterate over all of its items, while the left side
can be searched quickly.
4. Skip tablet pruning if the number of InPredicate.options exceeds
`Config.max_distribution_pruner_recursion_depth`, because such a
predicate tends to hit all buckets.
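
Changes 2 and 3 can be illustrated with a minimal sketch. This is not the Doris code: the class, the `Partition` type, and both method names are hypothetical, values are simplified to `long`, and a plain `java.util.TreeSet` stands in for the RangeSet. It only shows the idea of building the sorted structure for the IN literals once and then searching it per partition, instead of scanning all literals for every partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class InPredicatePruningSketch {
    /** Hypothetical partition covering the half-open value range [lower, upper). */
    static final class Partition {
        final String name;
        final long lower;
        final long upper;

        Partition(String name, long lower, long upper) {
            this.name = name;
            this.lower = lower;
            this.upper = upper;
        }
    }

    /** Naive pruning: scans the IN literals again for every partition, O(P * N). */
    static List<String> pruneNaive(List<Partition> partitions, List<Long> literals) {
        List<String> hit = new ArrayList<>();
        for (Partition p : partitions) {
            for (long v : literals) {
                if (v >= p.lower && v < p.upper) {
                    hit.add(p.name);
                    break;
                }
            }
        }
        return hit;
    }

    /** Cached pruning: sort the literals once, then search per partition, O((N + P) log N). */
    static List<String> pruneCached(List<Partition> partitions, List<Long> literals) {
        // The big side is a red-black tree built a single time and reused.
        TreeSet<Long> cached = new TreeSet<>(literals);
        List<String> hit = new ArrayList<>();
        for (Partition p : partitions) {
            // The small side (one partition range) drives a single tree lookup.
            Long first = cached.ceiling(p.lower); // smallest literal >= lower bound
            if (first != null && first < p.upper) {
                hit.add(p.name);
            }
        }
        return hit;
    }

    public static void main(String[] args) {
        List<Partition> partitions = Arrays.asList(
                new Partition("p0", 0, 10),
                new Partition("p1", 10, 20),
                new Partition("p2", 20, 30));
        List<Long> literals = Arrays.asList(25L, 3L, 29L);
        System.out.println(pruneNaive(partitions, literals));  // [p0, p2]
        System.out.println(pruneCached(partitions, literals)); // [p0, p2]
    }
}
```

With P partitions and N literals, the naive version does up to P * N comparisons, while the cached version pays for one sort and then one logarithmic lookup per partition. That is the same asymmetry as in change 3: the large structure should be the one that is searched, and the small one the one that is iterated.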
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…che#49386)

Skip running PruneOlapScanTablet when there are lots of InPredicates; follow-up to
apache#47608
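
A rough sketch of why tablet pruning is skipped for large IN lists, under the assumption (mine, not stated in the PR) that values hash uniformly and independently into buckets. The class and method names are hypothetical, and the threshold is passed in as a parameter rather than read from the real `Config` class.

```java
final class TabletPruneGuardSketch {
    /**
     * Expected fraction of buckets hit by `distinctValues` values,
     * assuming uniform, independent hashing into `buckets` buckets.
     */
    static double expectedBucketHitRatio(int buckets, int distinctValues) {
        return 1.0 - Math.pow(1.0 - 1.0 / buckets, distinctValues);
    }

    /**
     * Hypothetical guard: prune tablets only while the IN list is no
     * larger than the configured threshold, and skip pruning otherwise.
     */
    static boolean shouldPruneTablets(int inOptionCount, int maxRecursionDepth) {
        return inOptionCount <= maxRecursionDepth;
    }

    public static void main(String[] args) {
        // With 16 buckets, 100 distinct values are expected to touch almost all of them.
        System.out.println(expectedBucketHitRatio(16, 100));
        // An IN list of 2000 options against a threshold of 100: pruning is skipped.
        System.out.println(shouldPruneTablets(2000, 100));
    }
}
```

Under that assumption the expected hit ratio approaches 1 quickly as the number of options grows, so the pruning work would rarely remove any tablet and the guard simply avoids paying for it.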
924060929 added a commit to 924060929/incubator-doris that referenced this pull request Jul 1, 2025
…#47608)

(cherry picked from commit 5ec7d34)
morrySnow pushed a commit that referenced this pull request Jul 2, 2025

7 participants