@bobhan1
Contributor

@bobhan1 bobhan1 commented May 29, 2025

What problem does this PR solve?

If a schema change (SC) job fails and is retried on the BE, the previous failed SC job does not clear the delete bitmap KVs it wrote to the meta service (MS), so a later retry may update existing delete bitmap KVs on the new tablet. Because a delete bitmap is split into multiple KVs when it is too large, we may read back wrong delete bitmaps if the previous KVs are not removed before the new ones are put.
This PR uses pending delete bitmaps to avoid this situation for SC.
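
The hazard described above can be illustrated with a minimal sketch. This is not the Doris implementation: the in-memory dict standing in for the MS KV store, the key layout, the 4-byte chunk size, and every function name here are assumptions made up for illustration. It only shows why overwriting split KVs in place can leave stale trailing chunks, and how recording the written keys as "pending" lets a retry remove them first.

```python
# Hypothetical model: a dict plays the role of the meta service KV store.
CHUNK = 4  # assumed maximum value size per KV, chosen small for the demo


def split_kvs(base, value, chunk=CHUNK):
    """Split one logical value into ordered (key, part) pairs."""
    parts = [value[i:i + chunk] for i in range(0, len(value), chunk)] or [b""]
    return [(f"{base}/{i:04d}", part) for i, part in enumerate(parts)]


def put_blind(store, base, value):
    """Overwrite chunks in place without removing older chunks."""
    for key, part in split_kvs(base, value):
        store[key] = part


def put_with_pending(store, pending, base, value):
    """Remove the keys recorded by a previous attempt, then put the new ones."""
    for key in pending.pop(base, []):
        store.pop(key, None)
    kvs = split_kvs(base, value)
    pending[base] = [key for key, _ in kvs]
    for key, part in kvs:
        store[key] = part


def read(store, base):
    """Reassemble the value by scanning all chunk keys under the prefix."""
    prefix = base + "/"
    return b"".join(store[k] for k in sorted(store) if k.startswith(prefix))


old_bitmap = b"A" * 12  # written by the failed attempt: 3 chunks
new_bitmap = b"B" * 6   # written by the retry: only 2 chunks

# Blind overwrite: the third chunk of the failed attempt survives.
blind_store = {}
put_blind(blind_store, "dbm/tablet1", old_bitmap)
put_blind(blind_store, "dbm/tablet1", new_bitmap)
blind_result = read(blind_store, "dbm/tablet1")

# Remove-before-put: the retry first deletes the keys recorded as pending.
safe_store, pending = {}, {}
put_with_pending(safe_store, pending, "dbm/tablet1", old_bitmap)
put_with_pending(safe_store, pending, "dbm/tablet1", new_bitmap)
safe_result = read(safe_store, "dbm/tablet1")
```

In this sketch `blind_result` ends with the leftover chunk from the failed attempt, while `safe_result` equals `new_bitmap`.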

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@bobhan1
Contributor Author

bobhan1 commented May 29, 2025

run buildall

@Thearas
Contributor

Thearas commented May 29, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1
Contributor Author

bobhan1 commented May 29, 2025

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 100.00% (7/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.26% (1114/1338)
Line Coverage 66.49% (18812/28294)
Region Coverage 66.10% (9322/14102)
Branch Coverage 56.04% (5032/8980)

@doris-robot

TPC-H: Total hot run time: 33933 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ce1ce2690b68d50dd4f36fc22159af66b0e61ae1, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	26192	4960	5006	4960
q2	1970	275	178	178
q3	10293	1284	706	706
q4	10217	1019	542	542
q5	7543	2319	2393	2319
q6	176	167	132	132
q7	910	729	604	604
q8	9322	1323	1130	1130
q9	6757	5115	5129	5115
q10	6861	2293	1923	1923
q11	491	286	280	280
q12	343	347	223	223
q13	17783	3672	3076	3076
q14	224	237	216	216
q15	558	486	489	486
q16	423	437	371	371
q17	595	863	351	351
q18	7566	7201	7222	7201
q19	2037	979	553	553
q20	337	325	220	220
q21	3698	3124	2345	2345
q22	1074	1058	1002	1002
Total cold run time: 115370 ms
Total hot run time: 33933 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5195	5113	5077	5077
q2	242	316	219	219
q3	2109	2657	2307	2307
q4	1381	1768	1329	1329
q5	4481	4400	4416	4400
q6	214	161	132	132
q7	2025	1892	1815	1815
q8	2641	2577	2467	2467
q9	7279	7244	6935	6935
q10	3044	3321	2763	2763
q11	588	517	491	491
q12	718	762	605	605
q13	3512	3898	3237	3237
q14	290	293	282	282
q15	521	479	479	479
q16	446	469	436	436
q17	1140	1527	1388	1388
q18	7846	7491	7380	7380
q19	804	882	1118	882
q20	1978	2058	1815	1815
q21	4867	4512	4364	4364
q22	1106	1065	1024	1024
Total cold run time: 52427 ms
Total hot run time: 49827 ms

@doris-robot

TPC-DS: Total hot run time: 192625 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ce1ce2690b68d50dd4f36fc22159af66b0e61ae1, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1403	1101	1062	1062
query2	6298	1895	1871	1871
query3	11141	4709	4420	4420
query4	53334	25670	23537	23537
query5	5216	594	466	466
query6	357	215	198	198
query7	4892	513	293	293
query8	316	236	208	208
query9	5605	2638	2661	2638
query10	457	329	271	271
query11	15186	15045	14810	14810
query12	153	108	116	108
query13	1046	533	416	416
query14	10185	6304	6265	6265
query15	202	200	167	167
query16	7096	669	539	539
query17	1069	703	590	590
query18	1554	401	306	306
query19	199	202	161	161
query20	140	118	125	118
query21	204	119	106	106
query22	4442	4410	4198	4198
query23	34269	33589	33566	33566
query24	6586	2430	2409	2409
query25	484	472	415	415
query26	690	270	149	149
query27	2419	519	348	348
query28	2977	2180	2174	2174
query29	579	560	436	436
query30	277	217	193	193
query31	872	860	805	805
query32	80	65	65	65
query33	478	390	310	310
query34	780	864	544	544
query35	830	855	769	769
query36	955	1000	913	913
query37	116	102	77	77
query38	4170	4211	4210	4210
query39	1698	1439	1462	1439
query40	229	128	108	108
query41	68	68	68	68
query42	122	114	109	109
query43	523	509	490	490
query44	1405	847	840	840
query45	180	178	170	170
query46	854	1054	649	649
query47	1848	1865	1754	1754
query48	402	427	311	311
query49	662	475	417	417
query50	658	708	411	411
query51	4226	4322	4270	4270
query52	110	105	100	100
query53	237	255	190	190
query54	574	596	504	504
query55	85	84	83	83
query56	312	317	341	317
query57	1175	1212	1106	1106
query58	263	269	266	266
query59	2767	2796	2804	2796
query60	335	315	308	308
query61	129	128	125	125
query62	726	737	680	680
query63	229	192	186	186
query64	1497	1065	682	682
query65	4207	4210	4144	4144
query66	718	393	304	304
query67	15898	15646	15349	15349
query68	6451	891	526	526
query69	543	308	276	276
query70	1179	1094	1105	1094
query71	508	318	294	294
query72	5976	4855	4931	4855
query73	1550	696	356	356
query74	8945	8867	9180	8867
query75	4100	3181	2676	2676
query76	4231	1189	764	764
query77	764	384	297	297
query78	10102	10142	9281	9281
query79	2314	809	563	563
query80	649	503	442	442
query81	465	261	215	215
query82	439	126	94	94
query83	294	253	229	229
query84	296	108	88	88
query85	768	359	311	311
query86	369	299	287	287
query87	4406	4396	4331	4331
query88	3829	2291	2251	2251
query89	398	316	279	279
query90	1781	202	204	202
query91	140	141	114	114
query92	75	59	60	59
query93	1902	934	582	582
query94	659	391	302	302
query95	369	286	279	279
query96	492	577	285	285
query97	2698	2812	2679	2679
query98	229	211	197	197
query99	1424	1393	1243	1243
Total cold run time: 297539 ms
Total hot run time: 192625 ms

@doris-robot

ClickBench: Total hot run time: 29.64 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ce1ce2690b68d50dd4f36fc22159af66b0e61ae1, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.02
query2	0.14	0.10	0.10
query3	0.37	0.20	0.20
query4	1.59	0.21	0.09
query5	0.43	0.41	0.40
query6	1.18	0.66	0.68
query7	0.02	0.01	0.02
query8	0.06	0.05	0.05
query9	0.62	0.52	0.53
query10	0.58	0.59	0.57
query11	0.25	0.13	0.13
query12	0.25	0.14	0.14
query13	0.64	0.62	0.61
query14	0.79	0.83	0.80
query15	0.97	0.87	0.87
query16	0.37	0.40	0.37
query17	1.09	1.04	1.02
query18	0.18	0.18	0.18
query19	1.96	1.82	1.89
query20	0.02	0.01	0.01
query21	15.41	0.97	0.67
query22	0.93	1.04	0.78
query23	14.69	1.57	0.76
query24	5.10	0.61	0.31
query25	0.16	0.09	0.09
query26	0.56	0.21	0.19
query27	0.09	0.08	0.09
query28	11.02	1.22	0.58
query29	12.63	4.12	3.38
query30	0.28	0.09	0.07
query31	2.83	0.63	0.41
query32	3.24	0.60	0.51
query33	3.02	3.09	3.08
query34	16.26	5.15	4.50
query35	4.51	4.51	4.55
query36	0.64	0.51	0.49
query37	0.20	0.16	0.17
query38	0.16	0.15	0.15
query39	0.05	0.04	0.05
query40	0.19	0.16	0.16
query41	0.11	0.06	0.07
query42	0.06	0.05	0.06
query43	0.06	0.04	0.04
Total cold run time: 103.75 s
Total hot run time: 29.64 s

@bobhan1 bobhan1 force-pushed the sc-remove-before-update-dbm branch from 8ed9703 to 5649633 Compare May 29, 2025 10:47
@bobhan1
Contributor Author

bobhan1 commented May 29, 2025

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 100.00% (5/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.26% (1114/1338)
Line Coverage 66.46% (18802/28291)
Region Coverage 66.16% (9329/14101)
Branch Coverage 56.03% (5030/8978)

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 4, 2025
@github-actions
Contributor

github-actions bot commented Jun 4, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Jun 4, 2025

PR approved by anyone and no changes requested.

Contributor

@zhannngchen zhannngchen left a comment

LGTM

@zhannngchen zhannngchen merged commit 2bedb15 into apache:master Jun 4, 2025
30 checks passed
bobhan1 added a commit to bobhan1/doris that referenced this pull request Jun 5, 2025
…fore update them in schema change (apache#51353)


Labels

approved Indicates a PR has been approved by one committer. dev/3.0.7-merged dev/3.1.0-merged p0_w reviewed

7 participants