@morningman
Contributor

Proposed changes

PR #23026 added partition pruning support for Hive tables that contain `_HIVE_DEFAULT_PARTITION_`,
but with that change the `_HIVE_DEFAULT_PARTITION_` partition is always selected, whatever the predicate is.

PR #31613 added null partition support for list partitions of OLAP tables, so `_HIVE_DEFAULT_PARTITION_`
can now be treated as the null partition of a Hive table.

This PR changes the partition pruning logic accordingly.
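
To illustrate the intended behavior, here is a minimal sketch; it is not the code in this PR. It assumes a single string partition column and uses hypothetical names (`HiveNullPartitionPruneSketch`, `toNullable`, `pruneEquals`, `pruneIsNull`). The marker string is Hive's default value for `hive.exec.default.partition.name`, `__HIVE_DEFAULT_PARTITION__`. Once the marker is mapped to null, an equality predicate no longer selects the default partition, and only `IS NULL` does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of null-aware pruning; not the Doris implementation.
final class HiveNullPartitionPruneSketch {
    // Hive's default value for hive.exec.default.partition.name.
    static final String HIVE_DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__";

    // Treat the Hive default partition marker as a null partition value.
    static String toNullable(String rawValue) {
        return HIVE_DEFAULT_PARTITION.equals(rawValue) ? null : rawValue;
    }

    // Predicate `col = literal`: a null partition value can never match,
    // so the default partition is pruned.
    static List<String> pruneEquals(Map<String, String> partitions, String literal) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, String> entry : partitions.entrySet()) {
            String value = toNullable(entry.getValue());
            if (value != null && value.equals(literal)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    // Predicate `col IS NULL`: only the null (Hive default) partition matches.
    static List<String> pruneIsNull(Map<String, String> partitions) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, String> entry : partitions.entrySet()) {
            if (toNullable(entry.getValue()) == null) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }
}
```

Here `partitions` maps a partition name to its raw Hive value. A real pruner would also have to handle multi-column keys, typed values, and range predicates, which this sketch leaves out.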

@morningman changed the title from "Hive default partition" to "[fix](hive) support partition prune for _HIVE_DEFAULT_PARTITION_" on Mar 4, 2024
@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Mar 4, 2024
@github-actions
Contributor

github-actions bot commented Mar 4, 2024

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Mar 4, 2024

PR approved by anyone and no changes requested.

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38042 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e4ed84595eabb97802816081fe057b53a84b0dbf, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17664	4107	4029	4029
q2	2040	158	152	152
q3	10653	979	940	940
q4	4705	939	942	939
q5	7608	2947	2974	2947
q6	181	128	124	124
q7	1304	837	826	826
q8	9598	2106	2093	2093
q9	7211	6490	6436	6436
q10	8236	2534	2527	2527
q11	418	222	216	216
q12	790	326	318	318
q13	17968	2920	2969	2920
q14	271	253	261	253
q15	475	445	441	441
q16	469	397	411	397
q17	947	882	852	852
q18	6853	5887	5847	5847
q19	1566	1535	1521	1521
q20	545	307	285	285
q21	7484	3693	3691	3691
q22	797	300	288	288
Total cold run time: 107783 ms
Total hot run time: 38042 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4002	3999	4024	3999
q2	329	225	223	223
q3	2969	2881	2927	2881
q4	1836	1809	1814	1809
q5	5231	5243	5240	5240
q6	217	115	114	114
q7	2265	1862	1843	1843
q8	3211	3266	3299	3266
q9	8533	8507	8528	8507
q10	6161	3715	3727	3715
q11	534	429	453	429
q12	686	510	543	510
q13	13323	2804	2799	2799
q14	276	266	256	256
q15	465	450	445	445
q16	466	411	408	408
q17	1692	1663	1673	1663
q18	7984	7340	7291	7291
q19	1840	1618	1616	1616
q20	1948	1715	1678	1678
q21	4974	4832	4791	4791
q22	534	469	471	469
Total cold run time: 69476 ms
Total hot run time: 53952 ms

// If any partition key is hive default partition, return true.
// Only used for hive table.
public boolean isHiveDefaultPartition() {
    for (PartitionKey partitionKey : partitionKeys) {
Contributor


Doesn't isDefaultPartition() need to be deleted?

Contributor Author


No, isDefaultPartition() is for other cases.
A "default" partition is not the same as a "null" partition.
In Doris, _HIVE_DEFAULT_PARTITION_ is effectively the "null" partition.
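
As a small illustration of that mapping, the sketch below decodes a Hive-style partition name into nullable values. It is only a sketch under stated assumptions, not Doris code: the class and method names are hypothetical, the marker is Hive's default `hive.exec.default.partition.name` value (`__HIVE_DEFAULT_PARTITION__`), and path unescaping of special characters is ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Doris or Hive.
final class HivePartitionNameSketch {
    // Hive's default value for hive.exec.default.partition.name.
    static final String HIVE_DEFAULT_PARTITION = "__HIVE_DEFAULT_PARTITION__";

    // Split "k1=v1/k2=v2" into its values, mapping the default
    // partition marker to null so it behaves like a null partition value.
    static List<String> parseValues(String partitionName) {
        List<String> values = new ArrayList<>();
        for (String segment : partitionName.split("/")) {
            int eq = segment.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("not a key=value segment: " + segment);
            }
            String raw = segment.substring(eq + 1);
            values.add(HIVE_DEFAULT_PARTITION.equals(raw) ? null : raw);
        }
        return values;
    }
}
```

For example, `parseValues("year=2024/city=__HIVE_DEFAULT_PARTITION__")` yields the value `"2024"` followed by `null`.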

@doris-robot

TPC-DS: Total hot run time: 176748 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e4ed84595eabb97802816081fe057b53a84b0dbf, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	921	355	334	334
query2	7386	2083	2026	2026
query3	6699	215	206	206
query4	27091	20918	20978	20918
query5	4381	507	534	507
query6	281	183	171	171
query7	4629	307	292	292
query8	245	169	161	161
query9	8490	2206	2186	2186
query10	426	228	241	228
query11	14723	14216	14262	14216
query12	135	100	88	88
query13	1632	433	426	426
query14	8968	7003	6845	6845
query15	244	192	184	184
query16	7255	273	272	272
query17	948	611	580	580
query18	1924	298	293	293
query19	209	156	162	156
query20	95	97	87	87
query21	199	133	126	126
query22	4577	4402	4438	4402
query23	31729	30500	30319	30319
query24	12356	3082	3110	3082
query25	718	392	384	384
query26	1918	166	170	166
query27	3018	367	373	367
query28	6510	1820	1809	1809
query29	1330	617	610	610
query30	308	149	153	149
query31	918	722	732	722
query32	101	67	60	60
query33	754	277	253	253
query34	995	465	472	465
query35	920	825	791	791
query36	959	812	822	812
query37	279	69	70	69
query38	3196	3045	3164	3045
query39	1417	1369	1400	1369
query40	299	124	122	122
query41	61	56	55	55
query42	105	100	111	100
query43	440	397	386	386
query44	1075	736	715	715
query45	207	197	194	194
query46	1053	808	771	771
query47	1646	1543	1568	1543
query48	423	364	348	348
query49	1205	356	340	340
query50	784	375	383	375
query51	6723	6618	6584	6584
query52	107	97	101	97
query53	354	282	291	282
query54	321	241	252	241
query55	96	82	87	82
query56	259	232	233	232
query57	1090	1001	1001	1001
query58	252	219	220	219
query59	2465	2431	2260	2260
query60	262	252	274	252
query61	117	114	116	114
query62	662	407	397	397
query63	305	277	288	277
query64	6330	3423	3189	3189
query65	3047	3014	3016	3014
query66	1445	328	324	324
query67	15173	14287	14404	14287
query68	12577	536	599	536
query69	703	396	380	380
query70	1410	1123	1058	1058
query71	629	275	269	269
query72	9899	2580	2488	2488
query73	2663	339	340	339
query74	7162	6758	6755	6755
query75	6815	2712	2738	2712
query76	7981	1074	1166	1074
query77	882	251	254	251
query78	10021	9669	9539	9539
query79	9933	517	516	516
query80	1076	413	437	413
query81	493	217	212	212
query82	301	91	88	88
query83	249	149	144	144
query84	289	79	82	79
query85	1239	338	324	324
query86	377	280	290	280
query87	3378	3216	3172	3172
query88	3181	2288	2282	2282
query89	524	375	358	358
query90	2466	179	179	179
query91	160	129	125	125
query92	62	49	51	49
query93	3666	536	523	523
query94	1662	186	184	184
query95	456	347	354	347
query96	607	271	265	265
query97	3997	3849	3922	3849
query98	246	221	207	207
query99	1058	774	776	774
Total cold run time: 311352 ms
Total hot run time: 176748 ms

@doris-robot

ClickBench: Total hot run time: 31.11 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e4ed84595eabb97802816081fe057b53a84b0dbf, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.03
query2	0.07	0.02	0.02
query3	0.24	0.06	0.05
query4	1.65	0.10	0.10
query5	0.52	0.48	0.50
query6	1.27	0.66	0.67
query7	0.01	0.01	0.01
query8	0.04	0.03	0.03
query9	0.56	0.51	0.53
query10	0.57	0.56	0.58
query11	0.13	0.10	0.10
query12	0.13	0.11	0.10
query13	0.58	0.57	0.57
query14	0.74	0.76	0.76
query15	0.82	0.80	0.81
query16	0.37	0.38	0.36
query17	0.98	1.01	0.97
query18	0.25	0.24	0.26
query19	1.80	1.76	1.77
query20	0.02	0.01	0.01
query21	15.40	0.64	0.64
query22	2.87	4.89	2.64
query23	17.61	1.02	0.90
query24	2.28	0.62	0.38
query25	0.22	0.05	0.04
query26	0.18	0.14	0.14
query27	0.04	0.03	0.02
query28	11.92	0.84	0.82
query29	12.66	3.40	3.32
query30	0.59	0.60	0.53
query31	2.80	0.33	0.34
query32	3.36	0.44	0.44
query33	3.00	2.91	2.86
query34	15.50	4.33	4.34
query35	4.34	4.34	4.32
query36	1.08	1.00	1.02
query37	0.07	0.05	0.05
query38	0.04	0.03	0.04
query39	0.03	0.01	0.01
query40	0.18	0.13	0.14
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 105.1 s
Total hot run time: 31.11 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit e4ed84595eabb97802816081fe057b53a84b0dbf with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       17.0 seconds inserted 10000000 Rows, about 588K ops/s

@morningman morningman merged commit 2bbbdae into apache:master Mar 5, 2024
yiguolei pushed a commit that referenced this pull request Mar 6, 2024
@yiguolei yiguolei mentioned this pull request Mar 24, 2024
Labels

approved (indicates a PR has been approved by one committer), dev/2.0.x, reviewed

5 participants