@platoneko (Contributor) commented Mar 25, 2024

Proposed changes

The single-replica cooldown protocol in Doris must maintain the invariant that the cooldowned rowsets of a replica correspond exactly to the cooldown meta id. However, skipping rowsets that merely have the same version when following cooldowned data breaks this invariant. Skipping only rowsets that have the same rowset id preserves it.
This PR also fixes another potential data loss problem.
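
To make the difference between the two skip rules concrete, here is a minimal sketch. It is not the Doris code: `RowsetMeta`, `pick_by_version`, and `pick_by_rowset_id` are hypothetical names invented for illustration, and the real follow-cooldown logic handles much more state. It only assumes that a rowset has an id and a version range, as described above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for rowset metadata (not the Doris type).
struct RowsetMeta {
    std::string rowset_id;
    int64_t start_version;
    int64_t end_version;
};

inline bool same_version(const RowsetMeta& a, const RowsetMeta& b) {
    return a.start_version == b.start_version && a.end_version == b.end_version;
}

// Version-based rule: a cooldowned rowset is skipped whenever the replica
// already holds ANY rowset covering the same version range, even if that
// rowset is a different one.
std::vector<RowsetMeta> pick_by_version(const std::vector<RowsetMeta>& local,
                                        const std::vector<RowsetMeta>& cooldowned) {
    std::vector<RowsetMeta> to_add;
    for (const auto& rs : cooldowned) {
        assert(rs.start_version <= rs.end_version);
        bool skip = false;
        for (const auto& l : local) {
            if (same_version(l, rs)) {
                skip = true;
                break;
            }
        }
        if (!skip) to_add.push_back(rs);
    }
    return to_add;
}

// Id-based rule: a cooldowned rowset is skipped only if the replica already
// holds that exact rowset, so the replica ends up with exactly the rowsets
// listed in the cooldown meta.
std::vector<RowsetMeta> pick_by_rowset_id(const std::vector<RowsetMeta>& local,
                                          const std::vector<RowsetMeta>& cooldowned) {
    std::unordered_set<std::string> local_ids;
    for (const auto& l : local) local_ids.insert(l.rowset_id);

    std::vector<RowsetMeta> to_add;
    for (const auto& rs : cooldowned) {
        assert(rs.start_version <= rs.end_version);
        if (local_ids.count(rs.rowset_id) == 0) to_add.push_back(rs);
    }
    return to_add;
}
```

In this sketch, a replica holding a rowset `local-A` for versions [2, 5] that follows a cooldown meta listing `remote-B` for the same versions would adopt nothing under the version-based rule, leaving its rowsets out of sync with the cooldown meta. Under the id-based rule it adopts `remote-B`.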

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring
Contributor

run buildall

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Mar 25, 2024

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.26% (8737/24778)
Line Coverage: 27.05% (71571/264540)
Region Coverage: 26.29% (37120/141184)
Branch Coverage: 23.18% (18984/81882)
Coverage Report: http://coverage.selectdb-in.cc/coverage/e21228bf2c53b83dfeaffaedae4fcc17101fb915_e21228bf2c53b83dfeaffaedae4fcc17101fb915/report/index.html

@platoneko
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.27% (8738/24778)
Line Coverage: 27.05% (71567/264552)
Region Coverage: 26.30% (37130/141202)
Branch Coverage: 23.18% (18982/81884)
Coverage Report: http://coverage.selectdb-in.cc/coverage/8757cf6ccd10bcae5c995de5805a44fb8178362a_8757cf6ccd10bcae5c995de5805a44fb8178362a/report/index.html

@doris-robot

TPC-H: Total hot run time: 37748 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8757cf6ccd10bcae5c995de5805a44fb8178362a, data reload: false

------ Round 1 ----------------------------------
q1	17627	4160	4114	4114
q2	2111	160	151	151
q3	10579	1130	1171	1130
q4	10230	754	804	754
q5	7451	3015	2979	2979
q6	202	130	124	124
q7	1035	578	573	573
q8	9338	1968	1969	1968
q9	7155	6575	6590	6575
q10	8423	3453	3552	3453
q11	430	225	215	215
q12	420	197	194	194
q13	17797	2841	2827	2827
q14	236	197	209	197
q15	517	457	450	450
q16	492	368	379	368
q17	945	525	620	525
q18	7057	6448	6375	6375
q19	3364	1396	1389	1389
q20	563	247	248	247
q21	3516	2846	2934	2846
q22	343	294	305	294
Total cold run time: 109831 ms
Total hot run time: 37748 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4143	4108	4066	4066
q2	328	229	234	229
q3	2959	2822	2803	2803
q4	1879	1557	1588	1557
q5	5297	5345	5307	5307
q6	194	118	120	118
q7	2199	1853	1863	1853
q8	3138	3276	3240	3240
q9	8642	8656	8665	8656
q10	3768	3764	3779	3764
q11	556	450	447	447
q12	726	537	551	537
q13	16924	2881	2854	2854
q14	286	247	250	247
q15	499	460	459	459
q16	473	444	409	409
q17	1710	1494	1456	1456
q18	7326	7205	7088	7088
q19	1627	1527	1529	1527
q20	1902	1717	1750	1717
q21	4786	4748	4591	4591
q22	556	470	469	469
Total cold run time: 69918 ms
Total hot run time: 53394 ms

@doris-robot

TPC-DS: Total hot run time: 181761 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 8757cf6ccd10bcae5c995de5805a44fb8178362a, data reload: false

query1	936	360	353	353
query2	6552	1933	1861	1861
query3	6709	214	214	214
query4	31959	21303	21289	21289
query5	4360	397	402	397
query6	263	175	176	175
query7	4622	306	295	295
query8	228	173	170	170
query9	9508	2335	2325	2325
query10	574	257	249	249
query11	15285	14202	14204	14202
query12	131	85	89	85
query13	1622	411	416	411
query14	10451	7994	8171	7994
query15	241	192	194	192
query16	7763	264	270	264
query17	1970	596	553	553
query18	2004	300	277	277
query19	342	155	163	155
query20	103	86	85	85
query21	201	130	130	130
query22	4962	4850	4842	4842
query23	33697	32690	32788	32690
query24	10828	2857	2839	2839
query25	623	381	392	381
query26	1173	155	166	155
query27	2548	345	365	345
query28	7005	1892	1865	1865
query29	891	627	637	627
query30	303	155	149	149
query31	984	729	761	729
query32	102	66	58	58
query33	765	267	257	257
query34	1005	478	490	478
query35	847	622	607	607
query36	1045	896	897	896
query37	113	64	64	64
query38	3589	3486	3485	3485
query39	1480	1468	1418	1418
query40	214	120	114	114
query41	50	50	48	48
query42	101	93	95	93
query43	470	439	439	439
query44	1153	719	717	717
query45	259	256	266	256
query46	1111	682	691	682
query47	1902	1834	1844	1834
query48	458	364	368	364
query49	1129	357	344	344
query50	755	379	377	377
query51	6668	6639	6680	6639
query52	106	92	90	90
query53	343	275	279	275
query54	316	230	240	230
query55	89	78	83	78
query56	253	232	228	228
query57	1208	1143	1145	1143
query58	240	206	213	206
query59	2791	2689	2434	2434
query60	264	255	254	254
query61	120	131	121	121
query62	678	441	457	441
query63	316	273	276	273
query64	5305	4077	3996	3996
query65	3057	3019	3023	3019
query66	880	365	358	358
query67	15451	14634	15153	14634
query68	7337	518	514	514
query69	620	381	369	369
query70	1231	1195	1161	1161
query71	518	267	266	266
query72	6633	2742	2579	2579
query73	749	314	314	314
query74	6797	6347	6342	6342
query75	3628	2179	2192	2179
query76	4974	873	890	873
query77	651	295	255	255
query78	11010	10222	10125	10125
query79	11700	528	530	528
query80	1927	372	381	372
query81	537	212	207	207
query82	875	86	89	86
query83	216	147	146	146
query84	285	86	81	81
query85	1462	332	320	320
query86	441	292	313	292
query87	3810	3510	3562	3510
query88	5444	2398	2410	2398
query89	525	366	369	366
query90	1975	180	178	178
query91	171	135	140	135
query92	62	51	49	49
query93	7294	508	485	485
query94	1240	184	185	184
query95	442	344	334	334
query96	613	271	272	271
query97	2686	2469	2488	2469
query98	231	213	211	211
query99	1226	914	933	914
Total cold run time: 310781 ms
Total hot run time: 181761 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 8757cf6ccd10bcae5c995de5805a44fb8178362a with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       21.2 seconds inserted 10000000 Rows, about 471K ops/s

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit fd13c8e into apache:master Mar 26, 2024

Labels

2.1.0-conflict, approved (Indicates a PR has been approved by one committer.), dev/2.0.8-merged, reviewed

5 participants