@bobhan1 bobhan1 commented Mar 3, 2025

What problem does this PR solve?

  1. Remove the row-count check on the newly generated block during the publish phase of a partial update. When the rowset has multiple segments, the new segments generated for the conflicting rows in those segments are flushed concurrently through the same rowset writer, so rowset_writer->num_rows() may be larger than new_generated_rows.
  2. Correct the size reported for "rowset_ids to add:" when calculating the delete bitmap in the publish phase. Include invisible rowsets for txn_load.
  3. Add a config, enable_mow_verbose_log, that controls whether verbose logs are printed for delete bitmap calculation.
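
To illustrate item 1, here is a minimal single-threaded sketch that replays one possible interleaving of the concurrent flushes. It is not the Doris implementation: SharedWriter, rows_check_passes, and count_failed_checks are hypothetical names, and the equality form of the check is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a rowset writer shared by every segment's flush task.
// It only tracks the cumulative number of rows written through it.
struct SharedWriter {
    int64_t total_rows = 0;

    void add_rows(int64_t n) {
        assert(n >= 0);  // a flushed block cannot contain a negative row count
        total_rows += n;
    }

    int64_t num_rows() const { return total_rows; }
};

// Illustrative form of the removed check (assumed, not copied from Doris):
// it compares the writer-wide total against one segment's own row count.
bool rows_check_passes(const SharedWriter& writer, int64_t new_generated_rows) {
    return writer.num_rows() == new_generated_rows;
}

// Replays one possible interleaving on a single thread: every segment's block
// is flushed through the shared writer before any segment runs its check.
// Returns how many segments would fail the check.
int count_failed_checks(const std::vector<int64_t>& new_rows_per_segment) {
    SharedWriter writer;
    for (int64_t rows : new_rows_per_segment) {
        writer.add_rows(rows);
    }
    int failed = 0;
    for (int64_t rows : new_rows_per_segment) {
        if (!rows_check_passes(writer, rows)) {
            ++failed;
        }
    }
    return failed;
}
```

With a single segment the writer total and the segment's row count agree, so the check passes. With two or more segments sharing the writer, the total already includes rows from the other segments, which is why the check produced false failures.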

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Thearas commented Mar 3, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

bobhan1 changed the title from "[Fix](mow) Fix some logs" to "[Fix](mow) Fix some logs for mow" on Mar 3, 2025
@zhannngchen zhannngchen left a comment

LGTM

github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Mar 4, 2025

github-actions bot commented Mar 4, 2025

PR approved by at least one committer and no changes requested.

github-actions bot commented Mar 4, 2025

PR approved by anyone and no changes requested.

bobhan1 commented Mar 4, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 31774 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 38a763a87044bbf9c191f13958b9ec38d3b0a609, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17612	5255	5075	5075
q2	2057	319	164	164
q3	10565	1258	748	748
q4	10213	1007	540	540
q5	7547	2405	2392	2392
q6	186	166	137	137
q7	919	759	598	598
q8	9288	1293	1086	1086
q9	4893	4739	4833	4739
q10	6810	2309	1876	1876
q11	488	278	256	256
q12	355	350	218	218
q13	17779	3691	3068	3068
q14	233	228	207	207
q15	504	470	467	467
q16	639	629	581	581
q17	606	852	363	363
q18	6606	6343	6262	6262
q19	1357	958	558	558
q20	332	324	186	186
q21	2742	2151	1946	1946
q22	366	341	307	307
Total cold run time: 102097 ms
Total hot run time: 31774 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5135	5140	5127	5127
q2	242	333	236	236
q3	2172	2682	2304	2304
q4	1424	1890	1332	1332
q5	4244	4138	4138	4138
q6	207	166	124	124
q7	1873	1845	1674	1674
q8	2624	2566	2571	2566
q9	7224	7251	7255	7251
q10	3010	3225	2794	2794
q11	585	519	486	486
q12	671	792	637	637
q13	3427	3904	3291	3291
q14	279	300	265	265
q15	501	463	469	463
q16	637	680	650	650
q17	1165	1582	1364	1364
q18	7638	7236	7194	7194
q19	847	872	1017	872
q20	1975	2011	1862	1862
q21	5523	5029	4911	4911
q22	653	578	564	564
Total cold run time: 52056 ms
Total hot run time: 50105 ms
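
The totals above can be reproduced from the per-query rows. The column meanings are inferred from the numbers rather than stated by the bot: the fourth column equals the smaller of the second and third, and the Round 1 totals (102097 ms cold, 31774 ms hot) equal the sums of the first and fourth columns. A minimal sketch under that reading, with hypothetical names QueryTiming, Totals, and sum_run_times:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One table row under the assumed layout: a cold run followed by two hot runs,
// all in milliseconds. The fourth column (best hot run) is derived, not stored.
struct QueryTiming {
    int64_t cold_ms;
    int64_t hot1_ms;
    int64_t hot2_ms;
};

struct Totals {
    int64_t cold_ms;
    int64_t hot_ms;
};

// Sums the cold runs, and sums the faster of the two hot runs for each query.
Totals sum_run_times(const std::vector<QueryTiming>& rows) {
    Totals totals{0, 0};
    for (const QueryTiming& row : rows) {
        assert(row.cold_ms >= 0 && row.hot1_ms >= 0 && row.hot2_ms >= 0);
        totals.cold_ms += row.cold_ms;
        totals.hot_ms += std::min(row.hot1_ms, row.hot2_ms);
    }
    return totals;
}
```

Feeding all 22 Round 1 rows to sum_run_times should give the two reported totals; the same reading also fits the Round 2 rows, where the fourth column is again the smaller of the two hot runs.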

@doris-robot

TPC-DS: Total hot run time: 190466 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 38a763a87044bbf9c191f13958b9ec38d3b0a609, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1352	992	917	917
query2	6223	1889	1903	1889
query3	11105	4729	4590	4590
query4	56414	25515	23208	23208
query5	5077	548	488	488
query6	382	209	186	186
query7	4929	507	304	304
query8	329	260	240	240
query9	6027	2541	2571	2541
query10	431	295	242	242
query11	15155	15329	14833	14833
query12	155	110	101	101
query13	1056	506	366	366
query14	10819	6529	6318	6318
query15	209	197	193	193
query16	6960	660	479	479
query17	1066	703	549	549
query18	1365	386	311	311
query19	207	200	163	163
query20	137	130	117	117
query21	204	125	110	110
query22	4262	4425	4429	4425
query23	33920	33338	33459	33338
query24	5718	2447	2431	2431
query25	476	477	417	417
query26	677	274	164	164
query27	1684	499	330	330
query28	2757	2480	2456	2456
query29	575	592	459	459
query30	216	192	160	160
query31	851	871	821	821
query32	75	71	63	63
query33	489	384	327	327
query34	788	879	517	517
query35	832	836	778	778
query36	971	1007	911	911
query37	132	106	81	81
query38	4205	4335	4191	4191
query39	1504	1457	1447	1447
query40	212	124	110	110
query41	62	57	60	57
query42	129	104	114	104
query43	517	529	500	500
query44	1309	837	821	821
query45	189	176	167	167
query46	921	1083	664	664
query47	1830	1834	1728	1728
query48	378	413	301	301
query49	734	522	424	424
query50	707	754	424	424
query51	4267	4298	4227	4227
query52	114	108	101	101
query53	249	262	191	191
query54	493	490	419	419
query55	81	84	77	77
query56	276	303	260	260
query57	1161	1183	1128	1128
query58	239	244	235	235
query59	2811	2898	2805	2805
query60	292	293	265	265
query61	115	114	121	114
query62	761	736	710	710
query63	234	201	198	198
query64	1439	1037	663	663
query65	3324	3235	3252	3235
query66	805	401	306	306
query67	16273	15581	15400	15400
query68	7661	823	510	510
query69	533	304	269	269
query70	1168	1179	1118	1118
query71	482	291	268	268
query72	5588	3618	3712	3618
query73	1370	753	349	349
query74	8978	9177	8799	8799
query75	3725	3126	2677	2677
query76	4217	1183	743	743
query77	599	385	279	279
query78	9953	10049	9235	9235
query79	2314	824	594	594
query80	629	516	428	428
query81	515	270	249	249
query82	572	124	92	92
query83	179	183	149	149
query84	292	102	71	71
query85	768	362	309	309
query86	390	299	279	279
query87	4531	4347	4344	4344
query88	3823	2218	2237	2218
query89	413	318	283	283
query90	1879	192	191	191
query91	135	138	114	114
query92	74	58	62	58
query93	1821	1041	574	574
query94	656	385	300	300
query95	354	270	261	261
query96	487	556	267	267
query97	3346	3421	3285	3285
query98	217	206	204	204
query99	1468	1389	1236	1236
Total cold run time: 298793 ms
Total hot run time: 190466 ms

@doris-robot

ClickBench: Total hot run time: 31.06 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 38a763a87044bbf9c191f13958b9ec38d3b0a609, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.04
query2	0.07	0.04	0.03
query3	0.24	0.07	0.06
query4	1.62	0.10	0.10
query5	0.56	0.54	0.56
query6	1.20	0.72	0.72
query7	0.02	0.02	0.01
query8	0.04	0.04	0.03
query9	0.58	0.54	0.53
query10	0.56	0.58	0.57
query11	0.16	0.11	0.10
query12	0.14	0.11	0.11
query13	0.62	0.59	0.59
query14	2.83	2.83	2.71
query15	0.94	0.87	0.85
query16	0.38	0.37	0.36
query17	1.00	1.02	1.04
query18	0.22	0.20	0.20
query19	1.89	1.80	2.03
query20	0.01	0.01	0.01
query21	15.35	0.90	0.55
query22	0.77	1.26	0.71
query23	14.78	1.39	0.63
query24	6.66	1.76	1.20
query25	0.52	0.18	0.07
query26	0.55	0.17	0.14
query27	0.05	0.05	0.05
query28	9.50	0.90	0.42
query29	12.56	3.89	3.22
query30	0.25	0.09	0.06
query31	2.82	0.59	0.38
query32	3.22	0.55	0.47
query33	2.97	3.04	3.00
query34	15.79	5.19	4.46
query35	4.53	4.52	4.53
query36	0.66	0.50	0.48
query37	0.09	0.07	0.06
query38	0.06	0.05	0.04
query39	0.03	0.03	0.02
query40	0.17	0.14	0.13
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.6 s
Total hot run time: 31.06 s

@dataroaring dataroaring left a comment

LGTM

@doris-robot

BE UT Coverage Report

Increment line coverage 37.50% (3/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.84% (12241/26706)
Line Coverage 35.35% (103524/292844)
Region Coverage 34.51% (53044/153688)
Branch Coverage 30.22% (26874/88928)

bobhan1 commented Mar 5, 2025

run cloud_p0

@zhannngchen zhannngchen merged commit 987bcc9 into apache:master Mar 5, 2025
28 of 29 checks passed
bobhan1 added a commit to bobhan1/doris that referenced this pull request Mar 5, 2025
dataroaring pushed a commit to bobhan1/doris that referenced this pull request Mar 7, 2025
bobhan1 added a commit to bobhan1/doris that referenced this pull request Apr 15, 2025
bobhan1 added a commit to bobhan1/doris that referenced this pull request Apr 15, 2025
@yiguolei yiguolei mentioned this pull request May 13, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
Labels

approved (Indicates a PR has been approved by one committer.), dev/2.1.10-merged, dev/3.0.5-merged, reviewed

6 participants