@morningman
Contributor

What problem does this PR solve?

Problem Summary:

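Whenever both the ExternalCatalog and ExternalDatabase monitors are needed, they must always be acquired via synchronized in the same order: ExternalCatalog first, then ExternalDatabase.
Otherwise a deadlock can occur, as in the thread dump below: STATS_FETCH-3 holds the JdbcExternalDatabase monitor and waits for the JdbcExternalCatalog monitor, while mysql-nio-pool-73 holds the JdbcExternalCatalog monitor and waits for the JdbcExternalDatabase monitor.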

Java stack information for the threads listed above:
===================================================
"STATS_FETCH-3":
        at org.apache.doris.datasource.ExternalCatalog.makeSureInitialized(ExternalCatalog.java:302)
        - waiting to lock <0x000000060b1c13f0> (a org.apache.doris.datasource.jdbc.JdbcExternalCatalog)
        at org.apache.doris.datasource.ExternalDatabase.makeSureInitialized(ExternalDatabase.java:163)
        - locked <0x000000060b1c14b0> (a org.apache.doris.datasource.jdbc.JdbcExternalDatabase)
        at org.apache.doris.datasource.ExternalDatabase.getTableNullable(ExternalDatabase.java:706)
        at org.apache.doris.datasource.ExternalDatabase.getTableNullable(ExternalDatabase.java:72)
        at org.apache.doris.catalog.DatabaseIf.getTableOrException(DatabaseIf.java:154)
        at org.apache.doris.statistics.util.StatisticsUtil.findTable(StatisticsUtil.java:461)
        at org.apache.doris.statistics.ColumnStatisticsCacheLoader.doLoad(ColumnStatisticsCacheLoader.java:41)
        at org.apache.doris.statistics.ColumnStatisticsCacheLoader.doLoad(ColumnStatisticsCacheLoader.java:29)
        at org.apache.doris.statistics.BasicAsyncCacheLoader.lambda$asyncLoad$0(BasicAsyncCacheLoader.java:39)
        at org.apache.doris.statistics.BasicAsyncCacheLoader$$Lambda$2610/0x00007f0b1d4a5c00.get(Unknown Source)
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(java.base@17.0.15/CompletableFuture.java:1768)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.15/ThreadPoolExecutor.java:1136)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.15/ThreadPoolExecutor.java:635)
        at java.lang.Thread.run(java.base@17.0.15/Thread.java:840)
"mysql-nio-pool-73":
        at org.apache.doris.datasource.ExternalDatabase.resetToUninitialized(ExternalDatabase.java:135)
        - waiting to lock <0x000000060b1c14b0> (a org.apache.doris.datasource.jdbc.JdbcExternalDatabase)
        at org.apache.doris.datasource.ExternalCatalog.refreshOnlyCatalogCache(ExternalCatalog.java:594)
        at org.apache.doris.datasource.ExternalCatalog.resetToUninitialized(ExternalCatalog.java:579)
        - locked <0x000000060b1c13f0> (a org.apache.doris.datasource.jdbc.JdbcExternalCatalog)
        at org.apache.doris.datasource.jdbc.JdbcExternalCatalog.resetToUninitialized(JdbcExternalCatalog.java:132)
        - locked <0x000000060b1c13f0> (a org.apache.doris.datasource.jdbc.JdbcExternalCatalog)
        at org.apache.doris.catalog.RefreshManager.refreshCatalogInternal(RefreshManager.java:75)
        at org.apache.doris.catalog.RefreshManager.handleRefreshCatalog(RefreshManager.java:58)
        at org.apache.doris.nereids.trees.plans.commands.refresh.RefreshCatalogCommand.run(RefreshCatalogCommand.java:79)
        at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:707)
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:545)
        at org.apache.doris.qe.StmtExecutor.queryRetry(StmtExecutor.java:507)
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:492)
        at org.apache.doris.qe.ConnectProcessor.executeQuery(ConnectProcessor.java:346)
        at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:246)
        at org.apache.doris.qe.MysqlConnectProcessor.handleQuery(MysqlConnectProcessor.java:233)
        at org.apache.doris.qe.MysqlConnectProcessor.dispatch(MysqlConnectProcessor.java:261)
        at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:443)
        at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52)
        at org.apache.doris.mysql.ReadListener$$Lambda$1018/0x00007f0b1cbc4000.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@17.0.15/ThreadPoolExecutor.java:1136)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@17.0.15/ThreadPoolExecutor.java:635)
        at java.lang.Thread.run(java.base@17.0.15/Thread.java:840)
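
The lock-ordering rule described above can be illustrated with a minimal sketch. This is not the code changed in this PR: `Catalog` and `Database` are hypothetical stand-ins for ExternalCatalog and ExternalDatabase, and calling the catalog before entering the database's own synchronized block is only one possible way to keep the catalog-then-database order.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LockOrderSketch {
    // Hypothetical stand-in for ExternalCatalog.
    static class Catalog {
        private boolean initialized;
        final Database db = new Database(this);

        // Takes only the catalog monitor.
        synchronized void makeSureInitialized() {
            initialized = true;
        }

        // Takes the catalog monitor first, then the database monitor.
        synchronized void resetToUninitialized() {
            initialized = false;
            db.resetToUninitialized();
        }
    }

    // Hypothetical stand-in for ExternalDatabase.
    static class Database {
        private final Catalog catalog;
        private boolean initialized;

        Database(Catalog catalog) {
            this.catalog = catalog;
        }

        void makeSureInitialized() {
            // Touch the catalog BEFORE taking this database's monitor, so this
            // thread never waits for the catalog while holding the database.
            catalog.makeSureInitialized();
            synchronized (this) {
                initialized = true;
            }
        }

        synchronized void resetToUninitialized() {
            initialized = false;
        }
    }

    // Runs the two conflicting call paths concurrently and reports whether
    // both finished within the timeout.
    public static boolean runConcurrently(int iterations) throws InterruptedException {
        Catalog catalog = new Catalog();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Mirrors the STATS_FETCH path: database initialization.
        pool.submit(() -> {
            for (int i = 0; i < iterations; i++) {
                catalog.db.makeSureInitialized();
            }
        });
        // Mirrors the REFRESH CATALOG path: catalog reset, then database reset.
        pool.submit(() -> {
            for (int i = 0; i < iterations; i++) {
                catalog.resetToUninitialized();
            }
        });
        pool.shutdown();
        return pool.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("finished without deadlock: " + runConcurrently(100_000));
    }
}
```

If `Database.makeSureInitialized()` were itself declared `synchronized` and called the catalog from inside, the two paths would take the monitors in opposite orders, which is the pattern visible in the thread dump.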

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34511 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c791e114072416778f909d8c136e9eca7babc213, data reload: false

------ Round 1 ----------------------------------
q1	17610	5179	5055	5055
q2	1955	306	206	206
q3	10249	1344	698	698
q4	10240	1020	526	526
q5	7668	2480	2341	2341
q6	207	171	140	140
q7	956	786	620	620
q8	9340	1410	1078	1078
q9	7625	5234	5226	5226
q10	6914	2409	1978	1978
q11	492	320	281	281
q12	371	391	240	240
q13	17786	3700	3101	3101
q14	254	250	241	241
q15	588	532	530	530
q16	440	445	390	390
q17	608	895	366	366
q18	7771	7299	7264	7264
q19	1263	983	596	596
q20	388	375	244	244
q21	3755	2664	2392	2392
q22	1080	1064	998	998
Total cold run time: 107560 ms
Total hot run time: 34511 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5152	5130	5195	5130
q2	267	345	244	244
q3	2207	2656	2311	2311
q4	1357	1790	1380	1380
q5	4437	4432	4496	4432
q6	216	182	135	135
q7	2079	1934	1851	1851
q8	2626	2636	2587	2587
q9	7362	7446	7324	7324
q10	3211	3259	2895	2895
q11	591	511	506	506
q12	673	789	635	635
q13	3678	3997	3279	3279
q14	300	318	299	299
q15	546	501	485	485
q16	454	529	460	460
q17	1176	1559	1391	1391
q18	7858	8007	7591	7591
q19	871	884	1077	884
q20	2118	2060	1941	1941
q21	5243	4558	4557	4557
q22	1115	1072	995	995
Total cold run time: 53537 ms
Total hot run time: 51312 ms

@doris-robot

TPC-DS: Total hot run time: 191843 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c791e114072416778f909d8c136e9eca7babc213, data reload: false

query1	1025	425	402	402
query2	6545	1726	1782	1726
query3	6776	233	234	233
query4	26292	23502	23588	23502
query5	4465	729	613	613
query6	346	250	235	235
query7	4643	516	309	309
query8	390	340	343	340
query9	8626	2933	2958	2933
query10	512	396	327	327
query11	15549	15062	14926	14926
query12	185	138	133	133
query13	1700	595	462	462
query14	9541	5995	5977	5977
query15	222	197	178	178
query16	7690	657	465	465
query17	1245	735	597	597
query18	2044	428	335	335
query19	210	201	173	173
query20	145	133	155	133
query21	237	146	127	127
query22	4174	4218	4172	4172
query23	34124	33374	33316	33316
query24	8134	2426	2376	2376
query25	553	498	439	439
query26	751	279	172	172
query27	2665	519	363	363
query28	4352	2233	2182	2182
query29	694	591	460	460
query30	305	237	212	212
query31	933	891	757	757
query32	95	94	87	87
query33	579	436	370	370
query34	832	864	552	552
query35	835	864	790	790
query36	1002	1022	932	932
query37	130	109	93	93
query38	4096	4260	4114	4114
query39	1527	1483	1462	1462
query40	259	146	145	145
query41	119	106	103	103
query42	134	120	124	120
query43	515	505	466	466
query44	1427	919	891	891
query45	188	182	188	182
query46	867	1015	647	647
query47	1806	1873	1835	1835
query48	412	481	342	342
query49	762	598	477	477
query50	679	727	429	429
query51	5371	5656	5557	5557
query52	129	127	119	119
query53	256	288	212	212
query54	626	616	551	551
query55	98	98	97	97
query56	371	367	344	344
query57	1267	1260	1154	1154
query58	320	313	314	313
query59	2603	2684	2627	2627
query60	388	374	371	371
query61	156	186	153	153
query62	830	718	682	682
query63	242	219	219	219
query64	3287	1074	746	746
query65	4249	4239	4225	4225
query66	1109	602	534	534
query67	16072	15915	15517	15517
query68	8724	915	546	546
query69	509	375	317	317
query70	1286	1143	1177	1143
query71	513	356	420	356
query72	5388	4817	4914	4817
query73	778	692	379	379
query74	9047	9292	8964	8964
query75	4121	3256	2694	2694
query76	3601	1166	783	783
query77	849	454	385	385
query78	10125	10267	9459	9459
query79	1352	812	600	600
query80	686	613	551	551
query81	499	275	243	243
query82	205	149	119	119
query83	302	295	273	273
query84	257	127	107	107
query85	786	508	364	364
query86	353	347	335	335
query87	4466	4496	4366	4366
query88	3053	2417	2412	2412
query89	409	350	311	311
query90	2080	247	256	247
query91	156	163	130	130
query92	94	84	84	84
query93	1092	974	605	605
query94	715	409	327	327
query95	423	335	333	333
query96	504	612	297	297
query97	2742	2807	2726	2726
query98	253	232	220	220
query99	1470	1449	1317	1317
Total cold run time: 275549 ms
Total hot run time: 191843 ms

@doris-robot

ClickBench: Total hot run time: 33.3 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c791e114072416778f909d8c136e9eca7babc213, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.04	0.05
query3	0.25	0.07	0.08
query4	1.62	0.11	0.10
query5	0.45	0.44	0.40
query6	1.15	0.66	0.68
query7	0.02	0.02	0.02
query8	0.05	0.04	0.04
query9	0.60	0.54	0.52
query10	0.59	0.56	0.58
query11	0.17	0.12	0.12
query12	0.16	0.12	0.13
query13	0.63	0.60	0.60
query14	0.83	0.82	0.82
query15	0.89	0.88	0.90
query16	0.37	0.38	0.40
query17	1.04	1.10	1.05
query18	0.23	0.22	0.22
query19	2.03	1.88	1.92
query20	0.01	0.01	0.01
query21	15.39	0.90	0.54
query22	0.78	1.39	0.76
query23	14.76	1.39	0.64
query24	6.55	1.10	1.24
query25	0.50	0.21	0.08
query26	0.54	0.17	0.16
query27	0.06	0.05	0.05
query28	10.14	1.00	0.44
query29	12.58	3.99	3.34
query30	3.16	3.08	3.06
query31	2.82	0.60	0.40
query32	3.24	0.55	0.48
query33	3.17	3.12	3.12
query34	15.62	5.46	4.82
query35	4.88	4.87	4.87
query36	0.70	0.51	0.50
query37	0.10	0.08	0.08
query38	0.06	0.06	0.05
query39	0.04	0.03	0.04
query40	0.19	0.16	0.16
query41	0.08	0.03	0.04
query42	0.04	0.03	0.04
query43	0.05	0.05	0.04
Total cold run time: 106.66 s
Total hot run time: 33.3 s

@github-actions
Contributor

PR approved by anyone and no changes requested.

@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) Jul 23, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit 1e7d2ff into apache:master Jul 23, 2025
33 of 34 checks passed
w41ter pushed a commit to w41ter/incubator-doris that referenced this pull request Jul 30, 2025