@morningman morningman commented Feb 13, 2025

What problem does this PR solve?

This pull request resets the interrupted flag of the current thread before certain operations, so that a stale interrupt does not cause lock acquisition to fail. The changes primarily affect the `BDBEnvironment` and `BDBJEJournal` classes.

Caused by: java.lang.InterruptedException
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) ~[?:1.8.0_352-352]
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871) ~[?:1.8.0_352-352]
        at com.sleepycat.je.latch.SharedLatchImpl.acquireShared(SharedLatchImpl.java:103) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.tree.Tree.getRootINInternal(Tree.java:447) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.tree.Tree.getRootIN(Tree.java:431) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.tree.Tree.search(Tree.java:2185) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.tree.Tree.getFirstNode(Tree.java:812) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.dbi.CursorImpl.positionFirstOrLast(CursorImpl.java:1776) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.dbi.CursorImpl.traverseDbWithCursor(CursorImpl.java:3953) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.dbi.DbTree.getDbNames(DbTree.java:1808) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        at com.sleepycat.je.Environment.getDatabaseNames(Environment.java:2458) ~[je-18.3.14-doris-SNAPSHOT.jar:18.3.14-doris-SNAPSHOT]
        ... 21 more

This is defensive logic: we found that other code sometimes sets the thread's interrupted flag and does not handle it.
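
The failure mode in the stack trace can be reproduced with plain JDK classes. The following is a minimal sketch, not the code from this PR: the class `InterruptFlagDemo` and the helpers `clearStaleInterrupt` and `tryRead` are hypothetical names chosen for illustration. It assumes only that the timed `ReadLock.tryLock` call seen in the trace throws `InterruptedException` when the calling thread's interrupted flag is already set, and that `Thread.interrupted()` returns the flag and clears it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class InterruptFlagDemo {

    // Hypothetical helper: clears a stale interrupted flag on the current
    // thread and reports whether it had been set.
    public static boolean clearStaleInterrupt() {
        // Thread.interrupted() returns the current flag and resets it to false.
        return Thread.interrupted();
    }

    // Tries to take the read lock with a timeout, the same JDK call that
    // appears in the stack trace. Returns false if the attempt was interrupted
    // or timed out.
    public static boolean tryRead(ReentrantReadWriteLock lock) {
        try {
            if (lock.readLock().tryLock(100, TimeUnit.MILLISECONDS)) {
                lock.readLock().unlock();
                return true;
            }
            return false;
        } catch (InterruptedException e) {
            // Thrown immediately when the flag is already set on entry,
            // even though the lock itself is free.
            return false;
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        // Simulate other code leaving the flag set without handling it.
        Thread.currentThread().interrupt();
        System.out.println("without reset: " + tryRead(lock));

        // Same situation, but the flag is cleared before the lock attempt.
        Thread.currentThread().interrupt();
        boolean wasSet = clearStaleInterrupt();
        System.out.println("flag was set: " + wasSet + ", with reset: " + tryRead(lock));
    }
}
```

Clearing the flag also discards an interrupt that may have been intentional, so a reset like this is probably best limited to code paths where a leftover interrupt is known to be harmless, which matches the defensive intent described above.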

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@morningman morningman marked this pull request as ready for review February 13, 2025 07:49
@morningman
Contributor Author

run buildall

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 13, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 31419 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 64cfc5413ff3c7dd57123bd8e485f23ee57cfe05, data reload: false

------ Round 1 ----------------------------------
q1	17613	5204	5023	5023
q2	2054	321	169	169
q3	10461	1242	744	744
q4	10261	1008	532	532
q5	7915	2293	2368	2293
q6	186	168	131	131
q7	867	762	599	599
q8	9298	1190	1129	1129
q9	4903	4838	4673	4673
q10	6867	2289	1889	1889
q11	481	292	250	250
q12	343	351	213	213
q13	17759	3685	3028	3028
q14	225	217	198	198
q15	512	470	462	462
q16	626	608	573	573
q17	553	867	349	349
q18	6850	6190	6315	6190
q19	1636	929	547	547
q20	313	328	184	184
q21	2901	2074	1946	1946
q22	365	329	297	297
Total cold run time: 102989 ms
Total hot run time: 31419 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5135	5063	5067	5063
q2	226	327	227	227
q3	2147	2676	2273	2273
q4	1405	1820	1370	1370
q5	4228	4128	4134	4128
q6	210	164	123	123
q7	1863	1810	1696	1696
q8	2660	2585	2508	2508
q9	7132	7254	7134	7134
q10	3010	3194	2795	2795
q11	572	515	501	501
q12	681	751	618	618
q13	3439	3840	3377	3377
q14	278	292	275	275
q15	509	456	465	456
q16	647	698	641	641
q17	1124	1536	1385	1385
q18	7537	7426	7393	7393
q19	767	818	925	818
q20	1984	2054	1848	1848
q21	5408	4865	4861	4861
q22	651	608	561	561
Total cold run time: 51613 ms
Total hot run time: 50051 ms

@doris-robot

TPC-DS: Total hot run time: 190699 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 64cfc5413ff3c7dd57123bd8e485f23ee57cfe05, data reload: false

query1	1305	931	953	931
query2	6240	1846	1829	1829
query3	10967	4435	4436	4435
query4	54851	25175	22995	22995
query5	5244	540	482	482
query6	350	187	188	187
query7	5005	503	296	296
query8	325	246	237	237
query9	6072	2591	2589	2589
query10	436	312	249	249
query11	15250	15179	15116	15116
query12	157	106	111	106
query13	1115	522	402	402
query14	10241	6630	7049	6630
query15	208	201	193	193
query16	6930	674	441	441
query17	1058	707	573	573
query18	1528	417	303	303
query19	194	196	175	175
query20	129	128	124	124
query21	223	121	103	103
query22	4389	4535	4411	4411
query23	34174	33576	33439	33439
query24	5549	2437	2453	2437
query25	463	465	400	400
query26	690	286	157	157
query27	1917	505	346	346
query28	2760	2482	2448	2448
query29	590	573	428	428
query30	221	181	164	164
query31	887	905	819	819
query32	111	64	62	62
query33	450	354	309	309
query34	780	885	511	511
query35	824	818	775	775
query36	983	1028	936	936
query37	130	99	78	78
query38	4362	4358	4406	4358
query39	1497	1444	1440	1440
query40	214	118	110	110
query41	51	52	48	48
query42	126	107	104	104
query43	508	517	488	488
query44	1351	827	811	811
query45	181	175	174	174
query46	866	1058	644	644
query47	1860	1888	1789	1789
query48	390	436	312	312
query49	697	528	430	430
query50	728	756	429	429
query51	4307	4340	4227	4227
query52	108	111	93	93
query53	229	263	190	190
query54	487	503	433	433
query55	83	83	81	81
query56	318	268	274	268
query57	1132	1187	1112	1112
query58	245	249	272	249
query59	2661	2748	2658	2658
query60	289	277	267	267
query61	118	120	113	113
query62	751	753	686	686
query63	239	199	196	196
query64	1463	1075	691	691
query65	3281	3126	3139	3126
query66	724	416	315	315
query67	15932	15436	15381	15381
query68	3172	810	537	537
query69	494	303	272	272
query70	1208	1128	1185	1128
query71	352	303	273	273
query72	6423	3733	3745	3733
query73	658	762	345	345
query74	9035	9238	8943	8943
query75	3178	3164	2718	2718
query76	2099	1158	783	783
query77	544	379	277	277
query78	10233	10132	9307	9307
query79	2577	782	592	592
query80	940	528	438	438
query81	538	277	243	243
query82	406	126	96	96
query83	177	200	150	150
query84	282	92	71	71
query85	774	365	314	314
query86	391	310	281	281
query87	4568	4511	4424	4424
query88	3028	2199	2186	2186
query89	396	317	282	282
query90	1608	190	193	190
query91	128	133	107	107
query92	63	59	54	54
query93	2675	1003	575	575
query94	682	394	298	298
query95	351	275	267	267
query96	492	566	267	267
query97	2809	2843	2751	2751
query98	223	201	204	201
query99	1317	1393	1257	1257
Total cold run time: 289765 ms
Total hot run time: 190699 ms

@doris-robot

ClickBench: Total hot run time: 31.01 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 64cfc5413ff3c7dd57123bd8e485f23ee57cfe05, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.03
query3	0.24	0.06	0.07
query4	1.63	0.10	0.11
query5	0.41	0.42	0.40
query6	1.19	0.67	0.65
query7	0.03	0.02	0.02
query8	0.04	0.03	0.03
query9	0.58	0.50	0.54
query10	0.59	0.58	0.57
query11	0.16	0.11	0.11
query12	0.15	0.11	0.11
query13	0.63	0.59	0.60
query14	2.80	2.85	2.66
query15	0.92	0.84	0.85
query16	0.38	0.39	0.38
query17	1.04	1.06	1.03
query18	0.22	0.19	0.19
query19	1.87	1.81	1.96
query20	0.01	0.02	0.01
query21	15.38	0.89	0.56
query22	0.74	1.28	0.95
query23	14.73	1.43	0.65
query24	7.94	4.40	0.99
query25	0.31	0.34	0.12
query26	0.86	0.18	0.15
query27	0.05	0.05	0.04
query28	6.37	0.77	0.43
query29	12.53	3.99	3.26
query30	0.25	0.09	0.06
query31	2.83	0.57	0.38
query32	3.22	0.54	0.46
query33	2.95	3.06	3.01
query34	15.90	5.09	4.49
query35	4.49	4.51	4.54
query36	0.67	0.49	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.03
query40	0.16	0.15	0.13
query41	0.08	0.03	0.03
query42	0.03	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 102.69 s
Total hot run time: 31.01 s

@morningman morningman merged commit 9e0c754 into apache:master Feb 14, 2025
30 of 31 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 14, 2025
github-actions bot pushed a commit that referenced this pull request Feb 14, 2025
morningman added a commit that referenced this pull request Feb 18, 2025
…7874 (#47943)

Cherry-picked from #47874

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
dataroaring pushed a commit that referenced this pull request Feb 24, 2025
…7874 (#47941)

Cherry-picked from #47874

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
@gavinchou gavinchou mentioned this pull request Apr 23, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
Labels

approved Indicates a PR has been approved by one committer. dev/2.0.x dev/2.1.9-merged dev/3.0.5-merged reviewed


5 participants