Skip to content

Conversation

@platoneko
Copy link
Contributor

@platoneko platoneko commented Jun 21, 2024

Proposed changes

Fix data loss when node channel been cancelled before close wait.
When an error occurs in VNodeChannel::try_send_pending_block, invoking VNodeChannel::cancel sets the VNodeChannel to closed. In VTabletWriter::close, if VNodeChannel::cancel is called before VNodeChannel::close_wait, it bypasses the error handling code directly, causing the transaction to still be considered successful.

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@platoneko platoneko marked this pull request as ready for review June 21, 2024 07:22
@platoneko
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 36.47% (9008/24699)
Line Coverage: 28.01% (73885/263744)
Region Coverage: 27.49% (38378/139594)
Branch Coverage: 24.19% (19561/80862)
Coverage Report: http://coverage.selectdb-in.cc/coverage/e59c0df76cd7a68306c90503c00c523c358c8c33_e59c0df76cd7a68306c90503c00c523c358c8c33/report/index.html

@doris-robot
Copy link

TPC-H: Total hot run time: 39918 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e59c0df76cd7a68306c90503c00c523c358c8c33, data reload: false

------ Round 1 ----------------------------------
q1	17984	4421	4397	4397
q2	2655	197	197	197
q3	10990	1142	1105	1105
q4	10896	787	751	751
q5	7494	2685	2655	2655
q6	229	137	135	135
q7	981	608	611	608
q8	9404	2081	2063	2063
q9	8682	6466	6438	6438
q10	8914	3684	3712	3684
q11	446	243	231	231
q12	395	229	229	229
q13	18707	2989	3001	2989
q14	274	206	222	206
q15	521	476	479	476
q16	515	385	379	379
q17	954	611	705	611
q18	8031	7502	7380	7380
q19	7418	1390	1527	1390
q20	681	322	334	322
q21	4990	3331	3901	3331
q22	393	349	341	341
Total cold run time: 121554 ms
Total hot run time: 39918 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4500	4258	4216	4216
q2	372	280	268	268
q3	2923	2780	2690	2690
q4	1873	1627	1599	1599
q5	5209	5276	5261	5261
q6	218	125	129	125
q7	2074	1700	1759	1700
q8	3171	3319	3297	3297
q9	8313	8297	8349	8297
q10	3837	3648	3636	3636
q11	588	497	497	497
q12	792	600	599	599
q13	16637	2974	3015	2974
q14	288	291	256	256
q15	524	478	466	466
q16	456	416	414	414
q17	1755	1477	1479	1477
q18	7593	7451	7358	7358
q19	1671	1516	1599	1516
q20	1961	1797	1759	1759
q21	4891	4761	4590	4590
q22	637	567	569	567
Total cold run time: 70283 ms
Total hot run time: 53562 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 172383 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e59c0df76cd7a68306c90503c00c523c358c8c33, data reload: false

query1	927	378	384	378
query2	6468	2593	2436	2436
query3	6648	204	220	204
query4	19087	17292	17363	17292
query5	4158	481	474	474
query6	243	164	161	161
query7	4589	296	292	292
query8	347	314	295	295
query9	8566	2429	2429	2429
query10	610	307	304	304
query11	10554	9910	9998	9910
query12	147	87	84	84
query13	1655	380	385	380
query14	9405	7589	7088	7088
query15	232	191	180	180
query16	7830	264	255	255
query17	1906	529	536	529
query18	1958	273	262	262
query19	196	154	157	154
query20	91	84	80	80
query21	222	128	126	126
query22	4197	4097	3945	3945
query23	33808	33066	32852	32852
query24	11816	2863	2819	2819
query25	688	368	357	357
query26	1799	149	149	149
query27	3038	318	314	314
query28	7726	2067	2064	2064
query29	1167	621	596	596
query30	286	160	150	150
query31	943	760	735	735
query32	99	58	56	56
query33	758	297	273	273
query34	995	469	456	456
query35	750	634	636	634
query36	1106	951	945	945
query37	283	67	70	67
query38	2906	2753	2718	2718
query39	881	803	803	803
query40	286	131	128	128
query41	57	52	56	52
query42	128	101	101	101
query43	572	532	560	532
query44	1254	729	739	729
query45	200	164	161	161
query46	1084	711	703	703
query47	1808	1761	1769	1761
query48	365	297	301	297
query49	1200	425	401	401
query50	765	381	395	381
query51	6969	6860	6742	6742
query52	101	95	95	95
query53	361	294	286	286
query54	971	437	432	432
query55	79	76	75	75
query56	279	255	256	255
query57	1148	1040	1050	1040
query58	249	242	260	242
query59	3617	3132	3258	3132
query60	298	273	269	269
query61	90	91	94	91
query62	655	449	425	425
query63	322	294	290	290
query64	9827	2287	1720	1720
query65	3173	3125	3116	3116
query66	1375	334	340	334
query67	15517	15002	14891	14891
query68	4526	519	528	519
query69	514	463	353	353
query70	1079	1148	1135	1135
query71	395	271	278	271
query72	7223	5366	5311	5311
query73	748	322	319	319
query74	5891	5501	5412	5412
query75	3489	2652	2661	2652
query76	2711	979	851	851
query77	666	307	354	307
query78	10178	9713	9686	9686
query79	2212	509	508	508
query80	1560	464	476	464
query81	571	221	219	219
query82	718	101	102	101
query83	305	167	173	167
query84	264	86	89	86
query85	1841	276	269	269
query86	483	331	327	327
query87	3248	3052	3051	3051
query88	4011	2362	2378	2362
query89	491	381	386	381
query90	1813	187	197	187
query91	131	97	96	96
query92	65	51	50	50
query93	2216	517	502	502
query94	1285	190	188	188
query95	414	325	341	325
query96	598	268	271	268
query97	3183	3056	3105	3056
query98	284	201	203	201
query99	1275	832	859	832
Total cold run time: 276223 ms
Total hot run time: 172383 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 30.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e59c0df76cd7a68306c90503c00c523c358c8c33, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.22	0.05	0.04
query4	1.68	0.07	0.08
query5	0.50	0.49	0.49
query6	1.13	0.72	0.72
query7	0.02	0.02	0.02
query8	0.05	0.05	0.05
query9	0.54	0.52	0.49
query10	0.55	0.53	0.54
query11	0.15	0.12	0.11
query12	0.15	0.12	0.12
query13	0.59	0.58	0.60
query14	0.81	0.78	0.78
query15	0.82	0.82	0.81
query16	0.37	0.37	0.37
query17	0.97	1.00	1.03
query18	0.22	0.26	0.24
query19	1.90	1.83	1.72
query20	0.01	0.00	0.00
query21	15.42	0.68	0.66
query22	4.58	7.00	1.62
query23	18.27	1.40	1.35
query24	2.17	0.22	0.22
query25	0.15	0.09	0.09
query26	0.25	0.18	0.18
query27	0.08	0.09	0.07
query28	13.25	1.01	0.99
query29	12.63	3.24	3.26
query30	0.26	0.07	0.06
query31	2.85	0.38	0.37
query32	3.29	0.47	0.46
query33	2.87	2.91	2.91
query34	17.05	4.43	4.40
query35	4.50	4.48	4.49
query36	0.65	0.46	0.46
query37	0.18	0.16	0.16
query38	0.16	0.15	0.15
query39	0.04	0.04	0.04
query40	0.17	0.14	0.15
query41	0.09	0.06	0.04
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.81 s
Total hot run time: 30.39 s

Copy link
Contributor

@liaoxin01 liaoxin01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 21, 2024
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@gavinchou gavinchou merged commit 8df2c5b into apache:master Jun 24, 2024
dataroaring pushed a commit that referenced this pull request Jun 25, 2024
dataroaring pushed a commit that referenced this pull request Jun 25, 2024
dataroaring pushed a commit that referenced this pull request Jun 26, 2024
…se wait (#36662)

Fix data loss when node channel been cancelled before close wait.
When an error occurs in `VNodeChannel::try_send_pending_block`, invoking
`VNodeChannel::cancel` sets the `VNodeChannel` to closed. In
`VTabletWriter::close`, if `VNodeChannel::cancel` is called before
`VNodeChannel::close_wait`, it bypasses the error handling code
directly, causing the transaction to still be considered successful.
mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants