@mymeiyi
Contributor

@mymeiyi mymeiyi commented Mar 20, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:
Fix a bug introduced by #48968: rowset_cache_version entries are not deleted when _rs_metas or _stale_rs_metas changes, which leads to the following check failure:

```
F20250319 13:41:54.708062  5890 tablet_meta.cpp:955] Check failed: false . tablet: 1742356296291, rowset_cache_version size: 1607, _rs_metas size: 135, _stale_rs_metas size: 707
*** Check failure stack trace: ***
    @     0x55aa24363916  google::LogMessage::SendToLog()
    @     0x55aa24360360  google::LogMessage::Flush()
    @     0x55aa24364159  google::LogMessageFatal::~LogMessageFatal()
    @     0x55aa19c476a9  doris::TabletMeta::delete_stale_rs_meta_by_version()
    @     0x55aa19bf38b3  doris::Tablet::_delete_stale_rowset_by_version()
    @     0x55aa19bf451a  doris::Tablet::delete_expired_stale_rowset()
    @     0x55aa19c264f8  doris::TabletManager::for_each_tablet()
    @     0x55aa19c2a473  doris::TabletManager::start_trash_sweep()
    @     0x55aa19bd3fa5  doris::StorageEngine::start_trash_sweep()
    @     0x55aa198b9da6  doris::StorageEngine::_garbage_sweeper_thread_callback()
```
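
For readers unfamiliar with the failure mode: in the log above the cache holds 1607 entries while the two rowset lists together hold only 135 + 707 = 842, so the cache has outlived the rowsets it describes. The following is a minimal sketch of that kind of invariant, not the actual Doris code. `ToyTabletMeta`, its members, and the choice to key everything by a rowset id string are all hypothetical and exist only for illustration; the real `TabletMeta` is structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a tablet's metadata; NOT the Doris TabletMeta class.
struct ToyTabletMeta {
    std::set<std::string> rs_metas;                       // visible rowsets, by id
    std::set<std::string> stale_rs_metas;                 // stale rowsets awaiting deletion
    std::map<std::string, int64_t> rowset_cache_version;  // one cache entry per tracked rowset

    void add_rowset(const std::string& id, int64_t version) {
        rs_metas.insert(id);
        rowset_cache_version[id] = version;
    }

    // Move a rowset from visible to stale. It is still tracked, so its cache entry stays.
    void mark_stale(const std::string& id) {
        if (rs_metas.erase(id) > 0) {
            stale_rs_metas.insert(id);
        }
    }

    // Drop a stale rowset and its cache entry together, then re-check the invariant.
    void delete_stale(const std::string& id) {
        if (stale_rs_metas.erase(id) > 0) {
            rowset_cache_version.erase(id);
        }
        assert(cache_is_consistent());
    }

    // Invariant: every cache key belongs to a visible or stale rowset.
    bool cache_is_consistent() const {
        for (const auto& entry : rowset_cache_version) {
            const std::string& id = entry.first;
            if (rs_metas.count(id) == 0 && stale_rs_metas.count(id) == 0) {
                return false;
            }
        }
        return true;
    }
};
```

In this toy model, any code path that erases from `rs_metas` or `stale_rs_metas` without also erasing the matching `rowset_cache_version` entry leaves `cache_is_consistent()` returning false, which is the same shape of inconsistency the check failure reports.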

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring
Contributor

Please add a comment explaining the problem this PR tries to fix.

@mymeiyi mymeiyi force-pushed the fix-mow branch 2 times, most recently from 80acfa5 to f5f298d Compare March 20, 2025 04:00
@mymeiyi
Contributor Author

mymeiyi commented Mar 20, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32746 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c52a0640a6fa1eaaaf07aeb9d7d90a00d4a23c95, data reload: false

------ Round 1 ----------------------------------
q1	24505	5043	5056	5043
q2	2038	330	207	207
q3	10330	1315	716	716
q4	10223	1018	524	524
q5	7600	2424	2417	2417
q6	183	165	131	131
q7	931	744	635	635
q8	9320	1300	1115	1115
q9	4959	4735	4933	4735
q10	6807	2325	1901	1901
q11	460	269	264	264
q12	363	355	219	219
q13	17779	3659	3090	3090
q14	223	224	210	210
q15	522	500	495	495
q16	639	616	601	601
q17	584	865	344	344
q18	6884	6567	6465	6465
q19	1201	954	553	553
q20	320	325	189	189
q21	2835	2132	1906	1906
q22	1040	1009	986	986
Total cold run time: 109746 ms
Total hot run time: 32746 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5140	5167	5103	5103
q2	238	324	233	233
q3	2162	2682	2321	2321
q4	1430	1825	1403	1403
q5	4263	4185	4478	4185
q6	222	170	130	130
q7	2023	1961	1766	1766
q8	2651	2546	2606	2546
q9	7276	7281	7166	7166
q10	2980	3204	2672	2672
q11	585	517	489	489
q12	660	761	627	627
q13	3497	3857	3344	3344
q14	289	285	270	270
q15	524	476	494	476
q16	653	683	660	660
q17	1151	1545	1427	1427
q18	7802	7554	7365	7365
q19	809	780	901	780
q20	1971	2044	1917	1917
q21	5399	4919	4630	4630
q22	1076	1045	1019	1019
Total cold run time: 52801 ms
Total hot run time: 50529 ms

@doris-robot

TPC-DS: Total hot run time: 185268 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c52a0640a6fa1eaaaf07aeb9d7d90a00d4a23c95, data reload: false

query1	1007	504	459	459
query2	6537	1932	1919	1919
query3	6800	217	219	217
query4	26504	23729	23149	23149
query5	4318	668	475	475
query6	305	201	183	183
query7	4606	516	302	302
query8	299	248	232	232
query9	8608	2594	2583	2583
query10	442	309	271	271
query11	15467	15074	14934	14934
query12	160	110	104	104
query13	1655	522	392	392
query14	9012	6180	6344	6180
query15	199	197	169	169
query16	7327	636	482	482
query17	1209	719	556	556
query18	1961	397	309	309
query19	194	182	154	154
query20	119	118	117	117
query21	216	125	105	105
query22	4184	4402	4146	4146
query23	33816	33027	32993	32993
query24	8107	2369	2367	2367
query25	551	451	378	378
query26	1218	273	151	151
query27	2652	476	328	328
query28	4328	2440	2382	2382
query29	748	597	422	422
query30	291	211	191	191
query31	935	837	782	782
query32	72	64	60	60
query33	559	351	322	322
query34	773	832	496	496
query35	794	843	736	736
query36	965	986	891	891
query37	117	94	76	76
query38	4231	4359	4125	4125
query39	1464	1384	1378	1378
query40	206	113	100	100
query41	57	56	50	50
query42	116	109	124	109
query43	503	524	477	477
query44	1282	783	781	781
query45	177	168	164	164
query46	825	1021	614	614
query47	1774	1804	1734	1734
query48	369	412	297	297
query49	780	508	401	401
query50	691	717	398	398
query51	4193	4238	4152	4152
query52	114	108	90	90
query53	227	262	192	192
query54	466	490	413	413
query55	82	77	82	77
query56	268	253	264	253
query57	1139	1176	1066	1066
query58	245	253	226	226
query59	2692	2790	2460	2460
query60	282	273	251	251
query61	121	114	113	113
query62	792	736	659	659
query63	220	183	183	183
query64	4314	983	705	705
query65	4421	4348	4355	4348
query66	1134	409	296	296
query67	15646	15579	15358	15358
query68	8599	819	497	497
query69	464	295	260	260
query70	1186	1132	1195	1132
query71	451	295	264	264
query72	5536	3574	3761	3574
query73	773	726	346	346
query74	9165	9167	8987	8987
query75	3903	3175	2722	2722
query76	3671	1169	733	733
query77	771	357	282	282
query78	10296	10030	9319	9319
query79	2612	824	577	577
query80	650	530	433	433
query81	509	261	221	221
query82	586	128	100	100
query83	211	170	154	154
query84	289	95	81	81
query85	799	441	296	296
query86	388	311	277	277
query87	4547	4446	4555	4446
query88	3559	2216	2230	2216
query89	398	318	284	284
query90	1880	208	209	208
query91	140	139	110	110
query92	75	63	56	56
query93	1842	1091	583	583
query94	669	413	257	257
query95	353	268	262	262
query96	486	565	284	284
query97	3339	3394	3337	3337
query98	234	200	207	200
query99	1442	1403	1287	1287
Total cold run time: 275709 ms
Total hot run time: 185268 ms

@doris-robot

ClickBench: Total hot run time: 31.6 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c52a0640a6fa1eaaaf07aeb9d7d90a00d4a23c95, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.10	0.10
query3	0.23	0.19	0.19
query4	1.59	0.20	0.19
query5	0.59	0.58	0.59
query6	1.21	0.72	0.72
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.59	0.53	0.51
query10	0.57	0.57	0.57
query11	0.16	0.11	0.11
query12	0.14	0.11	0.12
query13	0.61	0.60	0.60
query14	2.67	2.84	2.79
query15	0.94	0.85	0.85
query16	0.38	0.39	0.39
query17	1.01	1.06	1.06
query18	0.21	0.19	0.20
query19	1.90	1.99	1.86
query20	0.01	0.01	0.01
query21	15.69	0.89	0.54
query22	0.76	1.24	0.75
query23	14.82	1.36	0.63
query24	7.45	1.21	1.06
query25	0.53	0.34	0.09
query26	0.67	0.16	0.13
query27	0.05	0.04	0.04
query28	9.77	0.88	0.44
query29	12.56	3.91	3.28
query30	0.25	0.10	0.07
query31	2.81	0.59	0.38
query32	3.23	0.55	0.46
query33	3.04	2.99	3.01
query34	15.88	5.13	4.55
query35	4.54	4.49	4.53
query36	0.68	0.49	0.48
query37	0.09	0.07	0.06
query38	0.05	0.03	0.03
query39	0.03	0.02	0.03
query40	0.17	0.13	0.14
query41	0.09	0.03	0.03
query42	0.03	0.02	0.02
query43	0.04	0.03	0.02
Total cold run time: 106.26 s
Total hot run time: 31.6 s

@mymeiyi
Contributor Author

mymeiyi commented Mar 20, 2025

run cloud_p0

@doris-robot

BE UT Coverage Report

Increment line coverage 31.37% (16/51) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 48.80% (13069/26783)
Line Coverage 38.37% (112704/293700)
Region Coverage 37.17% (57295/154143)
Branch Coverage 32.27% (28802/89244)

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the "approved" label ("Indicates a PR has been approved by one committer.") on Mar 20, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@zhannngchen zhannngchen left a comment

LGTM

@zhannngchen zhannngchen merged commit 24e641b into apache:master Mar 20, 2025
40 of 43 checks passed
github-actions bot pushed a commit that referenced this pull request Mar 20, 2025
dataroaring pushed a commit that referenced this pull request Mar 21, 2025
Cherry-picked from #49295

Co-authored-by: meiyi <meiyi@selectdb.com>
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
Labels

approved (Indicates a PR has been approved by one committer.), dev/3.0.5-merged, reviewed

6 participants