@morningman morningman commented May 8, 2025

What problem does this PR solve?

Problem Summary:
Support enabling or disabling the Hive partition cache at the catalog level for Hive catalogs.

Previously, users who wanted to disable the Hive partition cache could only set
max_hive_partition_table_cache_num=0 in fe.conf and restart the FE.
That config affects all catalogs.

This PR adds a new catalog property, partition.cache.ttl-second.
If it is set to 0, the Hive partition cache is disabled for that catalog, so when a new partition is added,
Doris reads the new partition immediately.
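
As a rough illustration of how the property described above might be set, the statements below create a Hive catalog with the partition cache disabled and apply the same setting to an existing catalog. The catalog names `hive_no_cache` and `hive_existing` and the metastore URI are placeholders made up for this sketch, not values from this PR. Whether a change made through ALTER CATALOG takes effect without refreshing the catalog is not covered here and should be checked against the documentation.

```sql
-- Sketch only: catalog name and metastore URI are hypothetical placeholders.
-- Create a Hive (HMS) catalog whose partition cache is disabled.
CREATE CATALOG hive_no_cache PROPERTIES (
    'type' = 'hms',
    'hive.metastore.uris' = 'thrift://127.0.0.1:9083',
    -- 0 disables the Hive partition cache for this catalog only
    'partition.cache.ttl-second' = '0'
);

-- Apply the same setting to a catalog that already exists (hypothetical name).
ALTER CATALOG hive_existing SET PROPERTIES (
    'partition.cache.ttl-second' = '0'
);
```

Because the property is set per catalog, other catalogs keep their cache behavior and fe.conf does not need to change.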

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Thearas commented May 8, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34618 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f1c880d5084f696d392b48efdea2b5f73b3091bb, data reload: false

------ Round 1 ----------------------------------
q1	26228	5052	5003	5003
q2	2068	278	179	179
q3	10386	1271	705	705
q4	10220	984	534	534
q5	7544	2390	2333	2333
q6	185	164	134	134
q7	906	727	622	622
q8	9327	1299	1066	1066
q9	6918	5135	5097	5097
q10	6855	2330	1888	1888
q11	504	285	287	285
q12	355	352	214	214
q13	17798	3681	3184	3184
q14	224	236	211	211
q15	527	470	485	470
q16	417	434	367	367
q17	609	857	379	379
q18	7606	7151	6932	6932
q19	1404	961	558	558
q20	334	331	225	225
q21	4090	3336	3249	3249
q22	1050	983	983	983
Total cold run time: 115555 ms
Total hot run time: 34618 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5099	5080	5087	5080
q2	240	326	231	231
q3	2167	2628	2278	2278
q4	1369	1785	1398	1398
q5	4437	4423	4391	4391
q6	216	173	133	133
q7	2034	1930	1799	1799
q8	2617	2620	2618	2618
q9	7252	7162	7134	7134
q10	2968	3141	2745	2745
q11	588	518	504	504
q12	686	750	624	624
q13	3471	3932	3366	3366
q14	280	298	277	277
q15	514	475	476	475
q16	463	489	450	450
q17	1168	1562	1391	1391
q18	7832	7598	7381	7381
q19	798	906	1016	906
q20	1969	1960	1857	1857
q21	5373	4946	4713	4713
q22	1115	1058	997	997
Total cold run time: 52656 ms
Total hot run time: 50748 ms

@doris-robot

TPC-DS: Total hot run time: 189623 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f1c880d5084f696d392b48efdea2b5f73b3091bb, data reload: false

query1	1381	1121	1069	1069
query2	6135	1779	1761	1761
query3	11020	4610	4286	4286
query4	54943	25363	23025	23025
query5	5147	472	465	465
query6	341	216	189	189
query7	4906	497	291	291
query8	330	256	242	242
query9	5755	2580	2571	2571
query10	438	336	264	264
query11	15066	15003	14766	14766
query12	163	107	103	103
query13	1062	524	409	409
query14	10106	6411	6632	6411
query15	209	188	195	188
query16	7121	678	486	486
query17	1096	750	597	597
query18	1582	418	353	353
query19	197	195	172	172
query20	133	125	123	123
query21	207	127	111	111
query22	4370	4471	4342	4342
query23	34104	33507	33643	33507
query24	6673	2399	2381	2381
query25	447	460	420	420
query26	690	278	155	155
query27	2243	512	346	346
query28	3133	2148	2120	2120
query29	573	565	431	431
query30	273	222	195	195
query31	855	878	774	774
query32	71	71	69	69
query33	458	388	319	319
query34	757	872	526	526
query35	785	828	755	755
query36	944	1026	903	903
query37	113	101	81	81
query38	4225	4225	4314	4225
query39	1509	1454	1427	1427
query40	208	118	111	111
query41	57	55	52	52
query42	130	111	103	103
query43	485	503	464	464
query44	1314	816	824	816
query45	182	178	170	170
query46	848	1034	646	646
query47	1837	1891	1770	1770
query48	379	413	322	322
query49	660	541	424	424
query50	675	694	419	419
query51	4206	4166	4145	4145
query52	110	111	102	102
query53	232	265	191	191
query54	586	580	513	513
query55	87	82	91	82
query56	311	304	285	285
query57	1173	1202	1119	1119
query58	253	263	266	263
query59	2651	2725	2595	2595
query60	318	349	306	306
query61	130	127	138	127
query62	740	779	705	705
query63	233	195	185	185
query64	1468	1020	668	668
query65	4447	4326	4246	4246
query66	720	405	311	311
query67	15795	15600	15467	15467
query68	8474	870	505	505
query69	533	306	266	266
query70	1223	1122	1098	1098
query71	510	320	286	286
query72	5963	4936	2361	2361
query73	1168	700	337	337
query74	8952	9102	8898	8898
query75	3786	3175	2668	2668
query76	4218	1185	768	768
query77	624	361	284	284
query78	10095	10194	9262	9262
query79	2133	815	568	568
query80	571	508	444	444
query81	485	256	214	214
query82	422	128	102	102
query83	256	245	238	238
query84	301	115	79	79
query85	795	413	314	314
query86	437	294	283	283
query87	4421	4381	4272	4272
query88	3464	2188	2206	2188
query89	402	309	278	278
query90	1882	204	213	204
query91	142	137	110	110
query92	78	58	58	58
query93	1824	966	576	576
query94	674	414	296	296
query95	370	354	288	288
query96	489	556	273	273
query97	3144	3228	3109	3109
query98	229	210	200	200
query99	1377	1420	1331	1331
Total cold run time: 299238 ms
Total hot run time: 189623 ms

@doris-robot

ClickBench: Total hot run time: 28.79 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f1c880d5084f696d392b48efdea2b5f73b3091bb, data reload: false

query1	0.03	0.04	0.03
query2	0.12	0.11	0.11
query3	0.26	0.19	0.20
query4	1.60	0.20	0.11
query5	0.56	0.57	0.55
query6	1.18	0.72	0.72
query7	0.02	0.02	0.02
query8	0.04	0.03	0.04
query9	0.57	0.53	0.50
query10	0.59	0.57	0.55
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.62	0.59	0.60
query14	0.79	0.79	0.80
query15	0.88	0.85	0.84
query16	0.40	0.39	0.38
query17	1.02	1.04	1.02
query18	0.21	0.19	0.19
query19	1.85	1.77	1.84
query20	0.02	0.01	0.01
query21	15.43	0.96	0.58
query22	0.76	1.11	0.69
query23	15.02	1.38	0.65
query24	7.22	1.76	0.37
query25	0.39	0.15	0.18
query26	0.64	0.18	0.15
query27	0.05	0.05	0.05
query28	8.93	0.86	0.42
query29	12.54	3.99	3.36
query30	0.25	0.10	0.07
query31	2.85	0.62	0.39
query32	3.22	0.55	0.46
query33	3.03	3.03	3.07
query34	15.81	5.08	4.51
query35	4.48	4.53	4.47
query36	0.68	0.50	0.48
query37	0.09	0.06	0.06
query38	0.06	0.04	0.04
query39	0.04	0.02	0.02
query40	0.16	0.14	0.14
query41	0.08	0.03	0.02
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 102.86 s
Total hot run time: 28.79 s

@morningman morningman requested a review from Copilot May 9, 2025 03:21
@morningman morningman marked this pull request as ready for review May 9, 2025 03:21

Copilot AI left a comment

Pull Request Overview

This PR adds support for configuring the hive partition cache at the catalog level via a new property ("partition.cache.ttl-second") and updates existing cache-related configurations. Key changes include:

  • Removing a drop command in one of the Hive test suites.
  • Introducing new tests and modifying existing ones for both file meta cache and partition cache.
  • Updating cache initialization logic and configuration validations in the FE code.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| regression-test/suites/external_table_p0/hive/test_hive_star_qualifier.groovy | Commenting out a drop catalog command in the test suite. |
| regression-test/suites/external_table_p0/hive/test_hive_meta_cache.groovy | Adding new tests for invalid TTL values and proper caching behavior. |
| regression-test/suites/external_table_p0/export/hive_read/orc/test_hive_read_orc.groovy | Changing test suite tags to include additional external docker tags. |
| regression-test/data/external_table_p0/hive/test_hive_meta_cache.out | Updating generated expected output for the meta cache tests. |
| fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HiveMetaStoreCache.java | Changing method access modifiers and updating cache initialization logic. |
| fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSExternalCatalog.java | Introducing new constants and validations for the partition cache TTL property. |
| fe/fe-common/src/main/java/org/apache/doris/common/Config.java | Updating configuration descriptions to include an additional table type. |

Comments suppressed due to low confidence (2)

fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HiveMetaStoreCache.java:134

  • The access level of init() has been changed from private to public. Please ensure that this change is intended and document its external usage if applicable.
public void init() {

fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HiveMetaStoreCache.java:173

  • The method setNewFileCache() has been changed from public to private; ensure that this does not affect any intended external calls or testing, and update the documentation if necessary.
private void setNewFileCache() {

@morningman morningman added the usercase Important user case type label label May 9, 2025
@morningman
Contributor Author

run buildall

github-actions bot commented May 9, 2025

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 33648 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 00e0e7c8512d2a52b1d086add84a250c2022295b, data reload: false

------ Round 1 ----------------------------------
q1	26271	5343	4989	4989
q2	2074	275	178	178
q3	10397	1229	688	688
q4	10737	1033	523	523
q5	8147	2409	2289	2289
q6	181	165	134	134
q7	925	735	621	621
q8	9320	1292	1116	1116
q9	6989	5074	5124	5074
q10	6923	2329	1947	1947
q11	474	286	280	280
q12	354	359	213	213
q13	21744	3722	3106	3106
q14	232	225	216	216
q15	574	502	505	502
q16	858	450	381	381
q17	637	842	377	377
q18	8800	7429	6961	6961
q19	2371	958	536	536
q20	318	328	213	213
q21	3918	2577	2361	2361
q22	1045	999	943	943
Total cold run time: 123289 ms
Total hot run time: 33648 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5141	4982	5009	4982
q2	234	329	232	232
q3	2150	2630	2478	2478
q4	1445	1841	1451	1451
q5	4507	4352	4314	4314
q6	217	173	128	128
q7	1984	1893	1716	1716
q8	2648	2541	2454	2454
q9	7087	7138	7093	7093
q10	2958	3159	2723	2723
q11	567	511	483	483
q12	697	767	611	611
q13	3511	3970	3294	3294
q14	291	305	275	275
q15	528	504	497	497
q16	453	479	447	447
q17	1144	1525	1379	1379
q18	7756	7477	7482	7477
q19	757	756	790	756
q20	1927	2016	1905	1905
q21	5177	4990	4736	4736
q22	1061	1050	994	994
Total cold run time: 52240 ms
Total hot run time: 50425 ms

@doris-robot

TPC-DS: Total hot run time: 191154 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 00e0e7c8512d2a52b1d086add84a250c2022295b, data reload: false

query1	1428	1073	1067	1067
query2	6341	1763	1776	1763
query3	11007	4587	4313	4313
query4	54013	25106	23189	23189
query5	5121	515	450	450
query6	389	206	201	201
query7	5198	512	273	273
query8	325	265	233	233
query9	6683	2541	2541	2541
query10	424	312	260	260
query11	15055	14968	14784	14784
query12	162	113	102	102
query13	1189	510	404	404
query14	10077	6192	6319	6192
query15	193	195	180	180
query16	7017	668	504	504
query17	1092	782	643	643
query18	1523	400	362	362
query19	204	202	163	163
query20	122	120	115	115
query21	207	127	110	110
query22	4414	4358	4360	4358
query23	34066	33435	33369	33369
query24	6681	2431	2451	2431
query25	489	470	387	387
query26	688	279	155	155
query27	2284	503	336	336
query28	3160	2123	2128	2123
query29	578	545	434	434
query30	275	220	197	197
query31	914	869	769	769
query32	76	67	60	60
query33	478	354	298	298
query34	756	855	525	525
query35	806	824	739	739
query36	969	982	908	908
query37	105	101	71	71
query38	4236	4324	4157	4157
query39	1516	1450	1437	1437
query40	220	115	102	102
query41	57	52	51	51
query42	130	104	111	104
query43	492	501	479	479
query44	1322	803	813	803
query45	183	182	170	170
query46	849	1035	633	633
query47	1836	1880	1788	1788
query48	397	415	307	307
query49	710	526	409	409
query50	668	686	413	413
query51	4176	4311	4186	4186
query52	107	110	98	98
query53	224	259	184	184
query54	581	624	536	536
query55	77	81	77	77
query56	287	315	310	310
query57	1188	1192	1138	1138
query58	259	250	270	250
query59	2645	2776	2586	2586
query60	327	343	331	331
query61	136	122	141	122
query62	761	718	666	666
query63	227	181	184	181
query64	1443	1011	729	729
query65	4385	4201	4237	4201
query66	721	395	302	302
query67	16023	15377	15195	15195
query68	7084	889	520	520
query69	529	307	272	272
query70	1140	1081	1140	1081
query71	507	308	282	282
query72	6057	4790	4752	4752
query73	1193	649	336	336
query74	8948	9003	8583	8583
query75	3849	3213	2696	2696
query76	4323	1218	757	757
query77	603	360	285	285
query78	9937	10038	9187	9187
query79	2452	808	560	560
query80	624	503	456	456
query81	545	264	221	221
query82	446	125	95	95
query83	340	260	233	233
query84	289	107	87	87
query85	789	359	307	307
query86	362	302	284	284
query87	4339	4382	4312	4312
query88	3277	2215	2240	2215
query89	411	316	274	274
query90	1977	207	215	207
query91	144	142	113	113
query92	138	62	56	56
query93	1408	955	576	576
query94	645	407	298	298
query95	362	288	299	288
query96	487	570	275	275
query97	3173	3205	3080	3080
query98	228	205	204	204
query99	1443	1395	1308	1308
Total cold run time: 298741 ms
Total hot run time: 191154 ms

@doris-robot

ClickBench: Total hot run time: 29.15 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 00e0e7c8512d2a52b1d086add84a250c2022295b, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.10	0.12
query3	0.25	0.19	0.19
query4	1.60	0.19	0.19
query5	0.59	0.59	0.59
query6	1.20	0.71	0.71
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.58	0.51	0.52
query10	0.56	0.57	0.57
query11	0.15	0.11	0.11
query12	0.15	0.12	0.11
query13	0.62	0.61	0.59
query14	0.79	0.81	0.80
query15	0.89	0.86	0.84
query16	0.36	0.37	0.40
query17	1.05	1.01	1.03
query18	0.20	0.19	0.20
query19	1.86	1.84	1.81
query20	0.01	0.01	0.01
query21	15.40	0.92	0.58
query22	0.76	1.13	0.68
query23	14.97	1.41	0.60
query24	7.00	2.10	0.72
query25	0.51	0.14	0.10
query26	0.75	0.16	0.14
query27	0.05	0.06	0.05
query28	9.85	0.89	0.45
query29	12.55	4.07	3.37
query30	0.25	0.10	0.06
query31	2.82	0.60	0.39
query32	3.23	0.55	0.48
query33	3.12	3.01	3.10
query34	15.79	5.12	4.49
query35	4.52	4.50	4.47
query36	0.65	0.50	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.18	0.14	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.8 s
Total hot run time: 29.15 s

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 9, 2025

github-actions bot commented May 9, 2025

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit 00f5f58 into apache:master May 9, 2025
27 of 29 checks passed
github-actions bot pushed a commit that referenced this pull request May 9, 2025
github-actions bot pushed a commit that referenced this pull request May 9, 2025
yiguolei pushed a commit that referenced this pull request May 9, 2025
…50724 (#50762)

Cherry-picked from #50724

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
@yiguolei yiguolei mentioned this pull request May 13, 2025
dataroaring pushed a commit that referenced this pull request May 13, 2025
…50724 (#50761)

Cherry-picked from #50724

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
morningman added a commit that referenced this pull request May 19, 2025
### What problem does this PR solve?

Problem Summary:
Same as #50724: support enabling or disabling the schema cache at the catalog
level for all kinds of external catalogs.

Previously, users who wanted to disable the schema cache could only set
`max_external_schema_cache_num=0` in fe.conf and restart the FE.
That config affects all catalogs.

This PR adds a new catalog property, `schema.cache.ttl-second`.
If it is set to 0, the schema cache is disabled, so when a schema changes,
Doris reads the new schema immediately.
github-actions bot pushed a commit that referenced this pull request May 19, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
@gavinchou gavinchou mentioned this pull request Jun 11, 2025
github-actions bot pushed a commit that referenced this pull request Jun 24, 2025
Labels

approved Indicates a PR has been approved by one committer. dev/2.1.10-merged dev/3.0.6-merged reviewed usercase Important user case type label

8 participants