@morningman
Contributor

@morningman morningman commented May 15, 2025

What problem does this PR solve?

Problem Summary:
Like #50724, this PR supports enabling or disabling the schema cache at the catalog level, for all kinds of external catalogs.

Previously, a user who wanted to disable the schema cache could only set
`max_external_schema_cache_num=0` in fe.conf and restart FE,
and that setting affected all catalogs.

This PR adds a new catalog property, `schema.cache.ttl-second`.
If it is set to 0, the schema cache is disabled for that catalog, so when a schema changes
Doris reads the new schema immediately.
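
As a rough illustration of how the new property might be set, here is a minimal sketch. The catalog name `hive_no_schema_cache` and the metastore URI are placeholders invented for this example; only `schema.cache.ttl-second` comes from this PR, and the other properties assume a standard Hive Metastore catalog.

```sql
-- Sketch only: the catalog name and metastore address are hypothetical.
-- "schema.cache.ttl-second" = "0" is the property described in this PR;
-- setting it to 0 is meant to disable the schema cache for this catalog only.
CREATE CATALOG hive_no_schema_cache PROPERTIES (
    "type" = "hms",
    "hive.metastore.uris" = "thrift://127.0.0.1:9083",
    "schema.cache.ttl-second" = "0"
);
```

Other catalogs keep their own cache settings, so unlike the fe.conf approach this needs no FE restart and does not affect the rest of the cluster.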

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33754 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 18e7d8c4bca8b55108ff0af1c289f80c4f55ed83, data reload: false

------ Round 1 ----------------------------------
q1	25783	5076	4981	4981
q2	2078	275	190	190
q3	10390	1191	686	686
q4	10226	1003	525	525
q5	7551	2217	2364	2217
q6	187	164	137	137
q7	898	771	631	631
q8	9350	1255	1051	1051
q9	6804	5130	5143	5130
q10	7013	2333	1889	1889
q11	506	284	285	284
q12	349	362	213	213
q13	17785	3664	3101	3101
q14	218	228	220	220
q15	528	501	482	482
q16	424	430	376	376
q17	593	846	364	364
q18	7514	7189	7087	7087
q19	1226	935	563	563
q20	348	339	221	221
q21	4060	3211	2427	2427
q22	1020	1004	979	979
Total cold run time: 114851 ms
Total hot run time: 33754 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5113	5102	5083	5083
q2	237	326	233	233
q3	2147	2603	2274	2274
q4	1360	1746	1307	1307
q5	4425	4392	4433	4392
q6	206	168	126	126
q7	2040	1886	1755	1755
q8	2546	2540	2508	2508
q9	7138	7184	7083	7083
q10	2998	3190	2767	2767
q11	569	516	483	483
q12	686	766	609	609
q13	3540	3882	3266	3266
q14	293	291	270	270
q15	522	480	479	479
q16	444	524	438	438
q17	1148	1538	1371	1371
q18	7729	7532	7397	7397
q19	825	846	868	846
q20	1996	2063	1881	1881
q21	4775	4511	4375	4375
q22	1091	1057	1011	1011
Total cold run time: 51828 ms
Total hot run time: 49954 ms

@doris-robot

TPC-DS: Total hot run time: 192839 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 18e7d8c4bca8b55108ff0af1c289f80c4f55ed83, data reload: false

query1	1392	1084	1050	1050
query2	6218	1869	1849	1849
query3	10979	4506	4532	4506
query4	54248	25176	23292	23292
query5	5078	512	456	456
query6	357	210	200	200
query7	4906	499	290	290
query8	324	252	237	237
query9	5792	2605	2605	2605
query10	442	317	282	282
query11	15033	15011	14875	14875
query12	163	109	106	106
query13	1062	521	403	403
query14	10212	6457	6408	6408
query15	222	200	193	193
query16	7083	673	463	463
query17	1083	773	626	626
query18	1524	421	324	324
query19	203	196	176	176
query20	140	127	122	122
query21	253	125	114	114
query22	4389	4356	4344	4344
query23	34371	33712	33557	33557
query24	6853	2418	2426	2418
query25	457	480	420	420
query26	735	274	149	149
query27	2313	512	343	343
query28	3118	2137	2140	2137
query29	590	563	437	437
query30	269	223	191	191
query31	867	865	774	774
query32	74	71	62	62
query33	468	373	310	310
query34	770	861	549	549
query35	799	822	766	766
query36	956	1007	902	902
query37	114	99	78	78
query38	4209	4335	4263	4263
query39	1523	1455	1441	1441
query40	208	121	131	121
query41	56	58	55	55
query42	122	113	113	113
query43	495	526	490	490
query44	1339	824	822	822
query45	176	175	169	169
query46	837	1045	664	664
query47	1810	1884	1807	1807
query48	397	426	323	323
query49	713	513	440	440
query50	659	714	408	408
query51	4229	4238	4259	4238
query52	124	111	102	102
query53	232	261	192	192
query54	603	564	508	508
query55	82	81	80	80
query56	319	308	283	283
query57	1208	1204	1121	1121
query58	267	256	274	256
query59	2795	2857	2742	2742
query60	329	314	316	314
query61	127	123	122	122
query62	754	747	684	684
query63	231	189	199	189
query64	2103	1101	681	681
query65	4345	4283	4224	4224
query66	784	407	304	304
query67	16041	15634	15424	15424
query68	7386	889	520	520
query69	541	299	273	273
query70	1224	1155	1101	1101
query71	506	317	287	287
query72	5558	4693	4701	4693
query73	1349	579	344	344
query74	9243	9030	8875	8875
query75	3812	3199	2717	2717
query76	4231	1195	758	758
query77	614	352	276	276
query78	10151	10079	9347	9347
query79	6878	758	559	559
query80	714	513	450	450
query81	496	253	228	228
query82	739	125	99	99
query83	340	241	242	241
query84	308	120	97	97
query85	796	359	321	321
query86	399	323	288	288
query87	4533	4466	4323	4323
query88	4019	2316	2261	2261
query89	452	320	280	280
query90	1894	211	211	211
query91	145	142	110	110
query92	74	58	61	58
query93	4318	921	577	577
query94	663	403	284	284
query95	369	299	290	290
query96	503	576	279	279
query97	2706	2771	2610	2610
query98	240	202	202	202
query99	1454	1441	1286	1286
Total cold run time: 307629 ms
Total hot run time: 192839 ms

@doris-robot

ClickBench: Total hot run time: 29.24 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 18e7d8c4bca8b55108ff0af1c289f80c4f55ed83, data reload: false

query1	0.04	0.04	0.03
query2	0.11	0.11	0.11
query3	0.26	0.19	0.19
query4	1.60	0.19	0.19
query5	0.48	0.43	0.46
query6	1.18	0.66	0.67
query7	0.02	0.02	0.02
query8	0.04	0.04	0.04
query9	0.60	0.52	0.52
query10	0.56	0.58	0.56
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.60	0.60	0.59
query14	0.79	0.82	0.80
query15	0.87	0.86	0.86
query16	0.38	0.38	0.40
query17	1.04	1.07	1.01
query18	0.22	0.21	0.21
query19	1.98	1.84	1.74
query20	0.01	0.01	0.02
query21	15.39	0.90	0.54
query22	0.76	1.22	0.68
query23	14.85	1.33	0.63
query24	7.17	0.90	1.11
query25	0.47	0.23	0.16
query26	0.61	0.17	0.14
query27	0.05	0.05	0.05
query28	9.73	0.94	0.46
query29	12.56	4.09	3.35
query30	0.25	0.10	0.06
query31	2.82	0.62	0.39
query32	3.22	0.55	0.48
query33	3.04	3.19	3.02
query34	15.81	5.10	4.52
query35	4.45	4.59	4.48
query36	0.66	0.50	0.48
query37	0.08	0.06	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.02
query40	0.16	0.14	0.13
query41	0.08	0.03	0.02
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 103.42 s
Total hot run time: 29.24 s

@morningman morningman added dev/3.0.x usercase Important user case type label labels May 16, 2025
@morningman morningman changed the title from "[feat](hive) add catalog level schema cache property #50724" to "[feat](hive) add catalog level schema cache property" May 16, 2025
@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34170 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit fdb5aa7e5c6efb5f9e63618dbb5626cc5a2e43f7, data reload: false

------ Round 1 ----------------------------------
q1	26480	5045	5013	5013
q2	2094	291	186	186
q3	10376	1267	741	741
q4	10245	1007	545	545
q5	8484	2549	2325	2325
q6	268	161	133	133
q7	928	767	620	620
q8	9354	1326	1126	1126
q9	6897	5154	5127	5127
q10	6813	2310	1891	1891
q11	495	289	276	276
q12	362	384	217	217
q13	17815	3674	3098	3098
q14	232	231	209	209
q15	522	489	491	489
q16	446	431	385	385
q17	653	875	394	394
q18	7629	7232	7154	7154
q19	1272	970	567	567
q20	347	358	244	244
q21	4170	3216	2457	2457
q22	1060	1006	973	973
Total cold run time: 116942 ms
Total hot run time: 34170 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5102	5088	5053	5053
q2	242	326	232	232
q3	2214	2707	2303	2303
q4	1346	1844	1468	1468
q5	4604	4404	4421	4404
q6	210	164	128	128
q7	1935	1914	1780	1780
q8	2615	2563	2492	2492
q9	7288	7078	7224	7078
q10	3036	3184	2743	2743
q11	572	498	491	491
q12	669	750	617	617
q13	3426	3987	3247	3247
q14	294	311	276	276
q15	550	483	486	483
q16	433	490	437	437
q17	1177	1568	1400	1400
q18	7799	7355	7390	7355
q19	871	864	825	825
q20	2011	2026	1892	1892
q21	4747	4357	4343	4343
q22	1083	1040	954	954
Total cold run time: 52224 ms
Total hot run time: 50001 ms

@doris-robot

TPC-DS: Total hot run time: 186719 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit fdb5aa7e5c6efb5f9e63618dbb5626cc5a2e43f7, data reload: false

query1	1006	469	524	469
query2	6567	1945	1917	1917
query3	6747	220	217	217
query4	26423	23225	23594	23225
query5	4364	645	484	484
query6	308	219	206	206
query7	4618	497	302	302
query8	302	256	249	249
query9	8606	2704	2670	2670
query10	495	328	263	263
query11	15168	15037	14812	14812
query12	164	111	105	105
query13	1645	547	410	410
query14	8720	6179	6135	6135
query15	213	191	172	172
query16	7130	637	490	490
query17	940	730	572	572
query18	1982	407	316	316
query19	199	190	166	166
query20	125	119	124	119
query21	214	125	107	107
query22	4094	4168	4141	4141
query23	33990	33142	33168	33142
query24	8456	2410	2384	2384
query25	531	450	386	386
query26	1249	266	156	156
query27	2753	508	346	346
query28	4306	2156	2140	2140
query29	771	558	435	435
query30	279	220	188	188
query31	932	840	764	764
query32	71	65	63	63
query33	543	368	302	302
query34	816	879	511	511
query35	800	786	753	753
query36	970	990	869	869
query37	115	105	76	76
query38	4076	4103	4064	4064
query39	1475	1395	1406	1395
query40	212	121	108	108
query41	59	54	54	54
query42	118	111	138	111
query43	516	529	497	497
query44	1316	836	814	814
query45	182	173	168	168
query46	840	1026	632	632
query47	1749	1799	1776	1776
query48	399	414	311	311
query49	775	517	444	444
query50	646	694	409	409
query51	4149	4160	4038	4038
query52	110	104	99	99
query53	235	256	188	188
query54	584	577	502	502
query55	86	83	84	83
query56	326	320	298	298
query57	1126	1130	1107	1107
query58	266	262	263	262
query59	2609	2693	2705	2693
query60	332	326	326	326
query61	130	127	126	126
query62	786	724	658	658
query63	229	185	190	185
query64	4428	1109	800	800
query65	4314	4215	4247	4215
query66	1168	428	382	382
query67	15946	15700	15551	15551
query68	8454	894	522	522
query69	469	319	272	272
query70	1226	1117	1110	1110
query71	456	318	304	304
query72	5818	4729	4804	4729
query73	724	656	365	365
query74	8942	8806	8782	8782
query75	3966	3210	2771	2771
query76	3650	1184	751	751
query77	797	392	287	287
query78	10201	10236	9312	9312
query79	2515	797	582	582
query80	626	516	439	439
query81	467	267	222	222
query82	465	128	96	96
query83	285	253	230	230
query84	297	109	88	88
query85	787	364	307	307
query86	340	287	281	281
query87	4440	4513	4368	4368
query88	3141	2314	2345	2314
query89	395	321	284	284
query90	1924	205	204	204
query91	145	143	115	115
query92	82	63	56	56
query93	1417	954	569	569
query94	677	418	312	312
query95	379	289	284	284
query96	505	577	290	290
query97	2751	2804	2662	2662
query98	238	217	208	208
query99	1425	1411	1245	1245
Total cold run time: 273803 ms
Total hot run time: 186719 ms

@doris-robot

ClickBench: Total hot run time: 29.61 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit fdb5aa7e5c6efb5f9e63618dbb5626cc5a2e43f7, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.10	0.11
query3	0.25	0.19	0.20
query4	1.58	0.19	0.11
query5	0.42	0.42	0.42
query6	1.18	0.67	0.67
query7	0.03	0.02	0.02
query8	0.04	0.04	0.04
query9	0.58	0.52	0.52
query10	0.57	0.57	0.58
query11	0.15	0.11	0.11
query12	0.14	0.12	0.11
query13	0.60	0.61	0.59
query14	0.78	0.81	0.80
query15	0.88	0.86	0.85
query16	0.39	0.37	0.38
query17	1.05	1.07	1.07
query18	0.23	0.21	0.22
query19	1.93	1.85	1.80
query20	0.01	0.01	0.01
query21	15.41	0.85	0.53
query22	0.75	1.22	0.72
query23	14.83	1.35	0.62
query24	7.03	1.37	1.23
query25	0.47	0.27	0.09
query26	0.65	0.16	0.14
query27	0.05	0.04	0.04
query28	9.78	0.91	0.44
query29	12.54	3.97	3.36
query30	0.25	0.09	0.08
query31	2.82	0.61	0.38
query32	3.23	0.56	0.46
query33	3.08	3.08	3.06
query34	15.92	5.14	4.54
query35	4.53	4.59	4.52
query36	0.65	0.49	0.48
query37	0.08	0.07	0.06
query38	0.06	0.04	0.04
query39	0.03	0.03	0.02
query40	0.18	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 103.47 s
Total hot run time: 29.61 s

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@kaka11chen kaka11chen left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 19, 2025
@morningman morningman merged commit 2eee6d7 into apache:master May 19, 2025
25 of 27 checks passed
github-actions bot pushed a commit that referenced this pull request May 19, 2025
morningman added a commit that referenced this pull request May 21, 2025
…get_schema_from_table" mode (#51057)

### What problem does this PR solve?

Related PR: #50958

Problem Summary:

In #50958, I introduced the catalog-level schema cache switch.
But if the user also adds `"get_schema_from_table" = "true"`, the schema is
fetched from the `Hive Table` instead of the schema cache, so the schema
cache config does not work as expected.

This PR fixes it:

1. Always get a new `Hive Table` instance when getting the schema from the
table, so that the schema cache config works as intended.
2. `show create table xxx` always shows the latest schema of a Hive table.
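
A hedged sketch of the combination this follow-up fix addresses is shown below. The catalog, database, and table names and the metastore URI are placeholders made up for illustration; the two properties `get_schema_from_table` and `schema.cache.ttl-second` are the ones named in the PR text.

```sql
-- Sketch only: all object names and the metastore address are hypothetical.
-- Both properties are set together, which is the case fixed by #51057.
CREATE CATALOG hive_latest_schema PROPERTIES (
    "type" = "hms",
    "hive.metastore.uris" = "thrift://127.0.0.1:9083",
    "get_schema_from_table" = "true",
    "schema.cache.ttl-second" = "0"
);

-- According to the description above, this should reflect the latest Hive schema.
SWITCH hive_latest_schema;
USE some_db;
SHOW CREATE TABLE some_table;
```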
morningman added a commit that referenced this pull request May 21, 2025
…get_schema_from_table" mode (#51057)

koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…get_schema_from_table" mode (apache#51057)

dataroaring pushed a commit that referenced this pull request Jun 24, 2025
#51057 (#51011)

Cherry-picked from #50958 #51057

---------

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
github-actions bot pushed a commit that referenced this pull request Jun 24, 2025
morningman added a commit that referenced this pull request Jun 24, 2025
… (#52198)

Cherry-picked from #50958

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
morningman added a commit to morningman/doris that referenced this pull request Jun 24, 2025
…get_schema_from_table" mode (apache#51057)

Labels

approved Indicates a PR has been approved by one committer. dev/3.0.7-merged dev/3.1.0-merged reviewed usercase Important user case type label

6 participants