@kaijchen
Member

@kaijchen kaijchen commented Feb 5, 2025

What problem does this PR solve?

Issue Number: CORE-3230 CIR-6008 CIR-6145 CIR-6220

Problem Summary:

Improve the error message "close wait failed coz rpc error".
If a detailed cancel message is available, return it directly.
If there is no cancel message, return "VNodeChannel ... is cancelled".
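
The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code changed in this PR: the function name `cancelled_message` and both parameters are hypothetical, and `channel_desc` merely stands in for the channel details that the summary elides as "...".

```cpp
#include <string>

// Hypothetical helper (not the actual Doris implementation): choose the
// message to report when a node channel has been cancelled.
//   cancel_msg   - detailed cancel reason; empty if none was recorded
//   channel_desc - placeholder for the channel details elided as "..."
std::string cancelled_message(const std::string& cancel_msg,
                              const std::string& channel_desc) {
    if (!cancel_msg.empty()) {
        // A detailed reason exists: return it unchanged.
        return cancel_msg;
    }
    // No detailed reason: fall back to the generic message.
    return "VNodeChannel " + channel_desc + " is cancelled";
}
```

Returning the detailed reason unchanged, instead of wrapping it in a generic RPC error, keeps the root cause visible to whoever reads the load failure.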

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Feb 5, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaijchen
Member Author

kaijchen commented Feb 5, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 31960 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit ca60900ed85681ee447758ae3415930711e5b3e2, data reload: false

------ Round 1 ----------------------------------
q1	17601	5510	5357	5357
q2	2049	303	166	166
q3	10601	1221	763	763
q4	10213	954	523	523
q5	7895	2345	2145	2145
q6	194	165	131	131
q7	876	745	589	589
q8	9234	1338	1091	1091
q9	5118	4914	4850	4850
q10	6845	2311	1867	1867
q11	476	271	247	247
q12	336	355	220	220
q13	17764	3660	3066	3066
q14	234	236	203	203
q15	516	472	458	458
q16	643	632	579	579
q17	570	850	329	329
q18	6887	6324	6357	6324
q19	1985	944	537	537
q20	311	318	188	188
q21	2799	2188	2021	2021
q22	361	335	306	306
Total cold run time: 103508 ms
Total hot run time: 31960 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5577	5498	5484	5484
q2	235	330	237	237
q3	2276	2664	2325	2325
q4	1460	1849	1404	1404
q5	4269	4741	4629	4629
q6	167	159	125	125
q7	2039	1936	1852	1852
q8	2571	2800	2677	2677
q9	7359	7183	7331	7183
q10	3046	3263	2765	2765
q11	592	513	474	474
q12	646	728	595	595
q13	3455	3890	3265	3265
q14	280	301	286	286
q15	511	475	467	467
q16	662	702	656	656
q17	1343	1752	1264	1264
q18	7657	7514	7303	7303
q19	823	1146	1081	1081
q20	2048	2024	1874	1874
q21	5688	5215	4827	4827
q22	601	591	559	559
Total cold run time: 53305 ms
Total hot run time: 51332 ms

@doris-robot

TPC-DS: Total hot run time: 191938 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ca60900ed85681ee447758ae3415930711e5b3e2, data reload: false

query1	1328	948	910	910
query2	6244	2080	2091	2080
query3	10885	4413	4371	4371
query4	32562	23131	23106	23106
query5	3644	592	470	470
query6	268	198	214	198
query7	3985	476	306	306
query8	303	236	246	236
query9	9512	2679	2673	2673
query10	450	319	247	247
query11	17832	15158	15228	15158
query12	177	108	107	107
query13	1571	550	392	392
query14	10505	6343	7047	6343
query15	253	205	200	200
query16	7765	642	483	483
query17	1566	765	572	572
query18	2089	401	310	310
query19	209	186	156	156
query20	122	117	111	111
query21	209	123	105	105
query22	4593	4573	4593	4573
query23	34354	34095	33494	33494
query24	6319	2347	2315	2315
query25	476	449	386	386
query26	712	254	158	158
query27	2131	477	338	338
query28	5516	2482	2473	2473
query29	564	528	444	444
query30	215	184	156	156
query31	981	886	849	849
query32	71	59	56	56
query33	504	368	311	311
query34	771	919	532	532
query35	834	856	754	754
query36	1032	1062	986	986
query37	127	103	80	80
query38	4481	4331	4260	4260
query39	1472	1462	1411	1411
query40	207	116	100	100
query41	49	47	48	47
query42	115	101	100	100
query43	514	530	506	506
query44	1325	805	804	804
query45	189	205	177	177
query46	870	1041	660	660
query47	1900	1889	1889	1889
query48	382	405	325	325
query49	715	501	405	405
query50	690	702	395	395
query51	4273	4277	4238	4238
query52	106	103	97	97
query53	244	252	198	198
query54	513	492	434	434
query55	79	83	79	79
query56	270	269	246	246
query57	1198	1231	1143	1143
query58	245	245	243	243
query59	3226	3348	3349	3348
query60	298	290	265	265
query61	165	113	115	113
query62	771	745	661	661
query63	237	188	181	181
query64	3276	1085	681	681
query65	3304	3246	3260	3246
query66	787	428	292	292
query67	15981	15947	15351	15351
query68	2189	825	561	561
query69	425	296	265	265
query70	1225	1174	1140	1140
query71	323	283	256	256
query72	5885	3859	3826	3826
query73	651	749	370	370
query74	9555	9351	9024	9024
query75	3137	3155	2646	2646
query76	2034	1112	843	843
query77	328	376	263	263
query78	10198	10221	9344	9344
query79	980	884	590	590
query80	1132	527	439	439
query81	528	274	238	238
query82	207	154	122	122
query83	290	169	147	147
query84	230	87	75	75
query85	828	356	361	356
query86	395	324	297	297
query87	4427	4713	4424	4424
query88	3346	2220	2166	2166
query89	397	331	295	295
query90	1648	186	186	186
query91	132	136	104	104
query92	56	56	51	51
query93	942	875	547	547
query94	606	414	287	287
query95	336	266	250	250
query96	499	597	287	287
query97	2834	2877	2754	2754
query98	217	197	200	197
query99	1296	1365	1254	1254
Total cold run time: 275837 ms
Total hot run time: 191938 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 42.10% (11013/26161)
Line Coverage: 32.37% (92940/287116)
Region Coverage: 31.53% (47668/151185)
Branch Coverage: 27.54% (24112/87550)
Coverage Report: http://coverage.selectdb-in.cc/coverage/ca60900ed85681ee447758ae3415930711e5b3e2_ca60900ed85681ee447758ae3415930711e5b3e2/report/index.html

@doris-robot

ClickBench: Total hot run time: 30.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ca60900ed85681ee447758ae3415930711e5b3e2, data reload: false

query1	0.03	0.03	0.04
query2	0.08	0.03	0.03
query3	0.24	0.07	0.07
query4	1.62	0.11	0.10
query5	0.42	0.41	0.41
query6	1.18	0.65	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.58	0.51	0.51
query10	0.54	0.57	0.56
query11	0.14	0.11	0.10
query12	0.14	0.11	0.11
query13	0.60	0.60	0.61
query14	2.72	2.78	2.76
query15	0.90	0.83	0.83
query16	0.40	0.40	0.39
query17	1.02	1.07	1.04
query18	0.23	0.21	0.20
query19	1.85	1.85	1.94
query20	0.01	0.01	0.01
query21	15.39	0.91	0.60
query22	0.74	0.73	0.65
query23	15.40	1.44	0.57
query24	2.69	0.86	0.25
query25	0.18	0.19	0.20
query26	0.32	0.14	0.15
query27	0.05	0.06	0.06
query28	13.10	1.07	0.43
query29	12.59	3.87	3.26
query30	0.25	0.09	0.06
query31	2.83	0.58	0.39
query32	3.23	0.55	0.46
query33	3.04	3.06	3.04
query34	16.51	5.15	4.53
query35	4.51	4.46	4.48
query36	0.63	0.50	0.51
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.16	0.13	0.12
query41	0.09	0.03	0.02
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 104.71 s
Total hot run time: 30.2 s

dataroaring previously approved these changes Feb 10, 2025
Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved label Feb 10, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

liaoxin01 previously approved these changes Feb 10, 2025
Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

@kaijchen kaijchen dismissed stale reviews from liaoxin01 and dataroaring via 137ac2a February 10, 2025 12:55
@kaijchen
Member Author

run buildall

@github-actions github-actions bot removed the approved label Feb 10, 2025
@doris-robot

TPC-H: Total hot run time: 31392 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 137ac2a34ed63e9f9b1af7d3b7b7456dac26bed0, data reload: false

------ Round 1 ----------------------------------
q1	17606	5326	5117	5117
q2	2055	314	166	166
q3	10395	1390	716	716
q4	10226	1017	529	529
q5	7534	2410	2355	2355
q6	199	170	132	132
q7	900	746	612	612
q8	9323	1334	1093	1093
q9	5241	4572	4448	4448
q10	6829	2305	1892	1892
q11	479	279	265	265
q12	360	358	217	217
q13	17758	3616	3097	3097
q14	225	225	211	211
q15	510	460	477	460
q16	628	616	570	570
q17	568	885	351	351
q18	6631	6292	6165	6165
q19	1398	953	540	540
q20	318	334	200	200
q21	2771	2197	1938	1938
q22	364	327	318	318
Total cold run time: 102318 ms
Total hot run time: 31392 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5166	5100	5150	5100
q2	241	340	230	230
q3	2137	2702	2326	2326
q4	1475	1855	1433	1433
q5	4271	4154	4198	4154
q6	198	164	125	125
q7	1894	1839	1728	1728
q8	2644	2597	2567	2567
q9	7271	7097	7048	7048
q10	2997	3176	2748	2748
q11	576	527	535	527
q12	708	789	631	631
q13	3540	3922	3191	3191
q14	268	287	269	269
q15	508	459	474	459
q16	640	685	643	643
q17	1160	1601	1359	1359
q18	7524	7277	7350	7277
q19	797	823	957	823
q20	2006	2002	1865	1865
q21	5337	4939	4844	4844
q22	637	586	530	530
Total cold run time: 51995 ms
Total hot run time: 49877 ms

@doris-robot

TPC-DS: Total hot run time: 183726 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 137ac2a34ed63e9f9b1af7d3b7b7456dac26bed0, data reload: false

query1	970	402	370	370
query2	6515	1858	1849	1849
query3	6791	221	215	215
query4	26780	23713	23421	23421
query5	4348	703	501	501
query6	301	199	187	187
query7	4599	494	309	309
query8	304	239	239	239
query9	8588	2492	2489	2489
query10	481	309	242	242
query11	15770	15171	15130	15130
query12	159	107	105	105
query13	1651	517	383	383
query14	10323	6464	6129	6129
query15	210	200	179	179
query16	7426	618	464	464
query17	1185	706	550	550
query18	1959	392	293	293
query19	188	187	158	158
query20	117	114	116	114
query21	205	118	100	100
query22	4293	4087	3988	3988
query23	34065	33054	33274	33054
query24	7703	2407	2416	2407
query25	580	473	424	424
query26	1254	270	157	157
query27	2183	492	340	340
query28	3966	2396	2374	2374
query29	756	579	456	456
query30	238	194	162	162
query31	960	869	813	813
query32	75	67	62	62
query33	577	390	317	317
query34	792	874	486	486
query35	809	848	744	744
query36	943	987	908	908
query37	128	105	142	105
query38	4097	4206	4185	4185
query39	1480	1401	1402	1401
query40	212	114	103	103
query41	54	52	49	49
query42	124	105	103	103
query43	502	506	489	489
query44	1301	792	791	791
query45	178	169	164	164
query46	865	1038	628	628
query47	1739	1818	1741	1741
query48	382	409	317	317
query49	786	498	435	435
query50	684	724	429	429
query51	4205	4125	4164	4125
query52	105	117	95	95
query53	224	250	186	186
query54	497	480	402	402
query55	92	77	80	77
query56	272	274	250	250
query57	1146	1144	1113	1113
query58	245	237	238	237
query59	2690	2858	2688	2688
query60	300	289	270	270
query61	129	149	127	127
query62	783	718	654	654
query63	225	191	188	188
query64	4305	996	685	685
query65	3277	3151	3148	3148
query66	1118	410	313	313
query67	16099	15603	15184	15184
query68	7977	768	518	518
query69	459	300	265	265
query70	1219	1175	1119	1119
query71	417	297	261	261
query72	5654	3522	3768	3522
query73	748	755	361	361
query74	8975	8988	8755	8755
query75	3344	3173	2678	2678
query76	3276	1183	750	750
query77	625	368	291	291
query78	10023	10101	9300	9300
query79	2666	814	603	603
query80	598	520	510	510
query81	505	277	238	238
query82	705	156	115	115
query83	179	174	152	152
query84	237	96	79	79
query85	788	403	309	309
query86	372	303	295	295
query87	4433	4635	4379	4379
query88	3708	2207	2208	2207
query89	406	330	289	289
query90	1936	200	194	194
query91	141	154	112	112
query92	74	62	58	58
query93	1905	1011	586	586
query94	694	412	303	303
query95	354	269	260	260
query96	475	561	271	271
query97	2770	2824	2705	2705
query98	238	206	203	203
query99	1339	1399	1237	1237
Total cold run time: 273548 ms
Total hot run time: 183726 ms

@doris-robot

ClickBench: Total hot run time: 31.01 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 137ac2a34ed63e9f9b1af7d3b7b7456dac26bed0, data reload: false

query1	0.04	0.03	0.05
query2	0.07	0.04	0.03
query3	0.23	0.07	0.06
query4	1.62	0.10	0.10
query5	0.42	0.41	0.39
query6	1.17	0.66	0.67
query7	0.03	0.02	0.01
query8	0.04	0.03	0.03
query9	0.59	0.53	0.51
query10	0.57	0.59	0.57
query11	0.15	0.11	0.11
query12	0.14	0.11	0.12
query13	0.63	0.61	0.61
query14	2.71	2.80	2.70
query15	0.93	0.86	0.84
query16	0.37	0.37	0.38
query17	1.00	1.01	1.03
query18	0.22	0.19	0.19
query19	1.96	1.84	2.04
query20	0.01	0.01	0.01
query21	15.35	0.88	0.55
query22	0.75	1.09	0.71
query23	14.96	1.35	0.63
query24	7.17	1.08	1.15
query25	0.53	0.22	0.08
query26	0.49	0.16	0.14
query27	0.05	0.05	0.05
query28	9.57	0.87	0.43
query29	12.58	4.05	3.41
query30	0.26	0.08	0.06
query31	2.82	0.58	0.38
query32	3.22	0.55	0.47
query33	3.02	3.03	3.05
query34	15.69	5.10	4.48
query35	4.49	4.52	4.54
query36	0.66	0.49	0.49
query37	0.10	0.06	0.06
query38	0.05	0.04	0.04
query39	0.04	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.02 s
Total hot run time: 31.01 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 42.81% (11262/26309)
Line Coverage: 32.78% (94594/288565)
Region Coverage: 31.94% (48501/151841)
Branch Coverage: 27.82% (24462/87944)
Coverage Report: http://coverage.selectdb-in.cc/coverage/137ac2a34ed63e9f9b1af7d3b7b7456dac26bed0_137ac2a34ed63e9f9b1af7d3b7b7456dac26bed0/report/index.html

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved label Feb 12, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@liaoxin01 liaoxin01 merged commit 2a5530d into apache:master Feb 12, 2025
26 of 28 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 12, 2025
…r" (#47518)

Improve error message `"close wait failed coz rpc error"`.
Return detailed cancel message directly if there is one.
If there is no cancel message, return `"VNodeChannel ... is cancelled"`.
yiguolei pushed a commit that referenced this pull request Feb 18, 2025
…coz rpc error" #47518 (#47805)

Cherry-picked from #47518

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
…r" (apache#47518)

Improve error message `"close wait failed coz rpc error"`.
Return detailed cancel message directly if there is one.
If there is no cancel message, return `"VNodeChannel ... is cancelled"`.
dataroaring pushed a commit that referenced this pull request Feb 24, 2025
…coz rpc error" #47518 (#47804)

Cherry-picked from #47518

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…r" (apache#47518)

Improve error message `"close wait failed coz rpc error"`.
Return detailed cancel message directly if there is one.
If there is no cancel message, return `"VNodeChannel ... is cancelled"`.

Labels

approved (Indicates a PR has been approved by one committer.), dev/2.1.9-merged, dev/3.0.5-merged, reviewed
