@bobhan1
Contributor

@bobhan1 bobhan1 commented Mar 18, 2025

What problem does this PR solve?

During a sort schema change, converting historical rowsets may write many temporary rowsets and later merge them into one rowset if memory is insufficient. However, these temporary rowsets carry fake versions of the form [2^29+x, 2^29+x], so when Merger::vmerge_rowsets merges them into a single rowset, the values of __DORIS_VERSION_COL__ in them are wrongly replaced by these fake versions (see #16509).

This PR modifies these fake versions to avoid that.
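
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not Doris code: `Version`, `Row`, `Rowset`, and `merge_rowsets_sketch` are hypothetical stand-ins, and the rule that the merge overwrites the version column only when a rowset covers a single version (start == end) is an assumption made for illustration. The sketch only shows why a fake version like [2^29+x, 2^29+x] can clobber stored version values, and that a fake version that does not match the assumed rule would leave them intact.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the Doris types.
struct Version {
    int64_t first;
    int64_t second;
};

struct Row {
    int key;
    int64_t version_col;  // stands in for __DORIS_VERSION_COL__
};

struct Rowset {
    Version version;
    std::vector<Row> rows;
};

// Base of the fake version range quoted in the PR description: 2^29.
constexpr int64_t kFakeVersionBase = int64_t{1} << 29;

// ASSUMPTION: the merge overwrites the version column with the rowset's
// version when the rowset covers exactly one version (first == second).
// Real merging also sorts rows; that part is omitted here.
std::vector<Row> merge_rowsets_sketch(const std::vector<Rowset>& inputs) {
    std::vector<Row> out;
    for (const auto& rs : inputs) {
        assert(rs.version.first <= rs.version.second);
        const bool single_version = rs.version.first == rs.version.second;
        for (Row row : rs.rows) {
            if (single_version) {
                // A temp rowset with a fake single version loses its real values here.
                row.version_col = rs.version.first;
            }
            out.push_back(row);
        }
    }
    return out;
}
```

Under these assumptions, a temporary rowset tagged `{kFakeVersionBase + 1, kFakeVersionBase + 1}` has every row's version column rewritten to the fake value, while one tagged with a range that is not a single version keeps its stored values. Whether the actual change uses this exact approach is not stated in the description above.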

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1
Contributor Author

bobhan1 commented Mar 18, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32608 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0c804f2401e7fabc860360f891cdcab80df788c4, data reload: false

------ Round 1 ----------------------------------
q1	24518	5138	5055	5055
q2	2040	302	175	175
q3	10375	1277	721	721
q4	10226	1040	544	544
q5	7554	2340	2430	2340
q6	197	161	136	136
q7	918	768	625	625
q8	9321	1350	1077	1077
q9	4866	5032	4812	4812
q10	6798	2327	1929	1929
q11	470	272	254	254
q12	347	352	220	220
q13	17762	3666	3057	3057
q14	234	234	217	217
q15	538	507	490	490
q16	629	620	565	565
q17	572	841	356	356
q18	6844	6490	6336	6336
q19	1221	948	558	558
q20	325	333	213	213
q21	2755	2160	1971	1971
q22	1060	1022	957	957
Total cold run time: 109570 ms
Total hot run time: 32608 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5497	5104	5191	5104
q2	245	340	232	232
q3	2173	2716	2292	2292
q4	1457	1834	1449	1449
q5	4263	4164	4555	4164
q6	223	167	125	125
q7	2040	2009	1788	1788
q8	2589	2637	2614	2614
q9	7235	7199	7072	7072
q10	2954	3214	2777	2777
q11	587	499	480	480
q12	676	792	612	612
q13	3423	3828	3268	3268
q14	292	307	267	267
q15	549	506	481	481
q16	640	706	656	656
q17	1146	1573	1354	1354
q18	7781	7654	7551	7551
q19	801	837	974	837
q20	2010	2045	1904	1904
q21	5407	4866	4775	4775
q22	1058	1059	1000	1000
Total cold run time: 53046 ms
Total hot run time: 50802 ms

@doris-robot

TPC-DS: Total hot run time: 186121 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0c804f2401e7fabc860360f891cdcab80df788c4, data reload: false

query1	1024	493	466	466
query2	6561	1939	1980	1939
query3	6800	213	218	213
query4	26918	23717	23004	23004
query5	4442	632	489	489
query6	312	202	195	195
query7	4607	503	294	294
query8	308	255	254	254
query9	8688	2619	2602	2602
query10	443	313	265	265
query11	15459	15209	14837	14837
query12	170	112	106	106
query13	1659	513	412	412
query14	9618	6220	6413	6220
query15	215	192	173	173
query16	7610	628	476	476
query17	1194	701	576	576
query18	1937	404	318	318
query19	195	191	169	169
query20	130	116	120	116
query21	209	122	107	107
query22	4346	4374	4335	4335
query23	34033	33131	33117	33117
query24	8299	2421	2414	2414
query25	568	468	388	388
query26	1239	274	161	161
query27	2684	490	341	341
query28	4296	2417	2419	2417
query29	704	574	428	428
query30	280	212	187	187
query31	965	838	772	772
query32	75	63	66	63
query33	556	359	310	310
query34	783	855	504	504
query35	785	814	750	750
query36	964	985	888	888
query37	114	98	79	79
query38	4183	4151	4036	4036
query39	1438	1395	1382	1382
query40	213	114	107	107
query41	54	50	51	50
query42	120	100	105	100
query43	531	512	481	481
query44	1276	793	791	791
query45	177	174	165	165
query46	832	1030	630	630
query47	1776	1820	1760	1760
query48	387	422	313	313
query49	798	503	416	416
query50	668	729	417	417
query51	4133	4216	4107	4107
query52	106	102	103	102
query53	235	259	195	195
query54	481	501	422	422
query55	89	80	78	78
query56	268	267	244	244
query57	1165	1133	1080	1080
query58	277	231	231	231
query59	2612	2676	2745	2676
query60	273	275	256	256
query61	139	118	118	118
query62	816	733	704	704
query63	227	192	194	192
query64	4270	1062	667	667
query65	4432	4345	4308	4308
query66	1051	406	296	296
query67	15904	15883	15335	15335
query68	8269	877	508	508
query69	466	297	256	256
query70	1208	1110	1102	1102
query71	466	323	276	276
query72	5528	3529	3758	3529
query73	777	698	353	353
query74	8916	9110	9305	9110
query75	3781	3156	2680	2680
query76	3714	1206	762	762
query77	789	368	285	285
query78	10179	10288	9282	9282
query79	1255	850	594	594
query80	645	550	460	460
query81	466	265	223	223
query82	487	126	95	95
query83	171	167	151	151
query84	241	97	74	74
query85	785	361	401	361
query86	338	299	301	299
query87	4389	4601	4437	4437
query88	2898	2299	2305	2299
query89	396	323	283	283
query90	1948	210	209	209
query91	144	139	116	116
query92	76	59	55	55
query93	1147	1086	585	585
query94	681	411	303	303
query95	355	273	264	264
query96	494	562	276	276
query97	3345	3434	3372	3372
query98	230	213	205	205
query99	1499	1408	1273	1273
Total cold run time: 274204 ms
Total hot run time: 186121 ms

@doris-robot

ClickBench: Total hot run time: 31.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0c804f2401e7fabc860360f891cdcab80df788c4, data reload: false

query1	0.04	0.04	0.04
query2	0.11	0.11	0.11
query3	0.24	0.19	0.19
query4	1.59	0.20	0.19
query5	0.60	0.59	0.59
query6	1.21	0.72	0.71
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.60	0.53	0.54
query10	0.57	0.61	0.58
query11	0.16	0.10	0.10
query12	0.15	0.12	0.11
query13	0.62	0.61	0.60
query14	2.75	2.68	2.79
query15	0.92	0.85	0.86
query16	0.37	0.38	0.39
query17	1.02	1.06	1.04
query18	0.22	0.21	0.19
query19	1.89	1.95	1.88
query20	0.02	0.01	0.02
query21	15.36	0.93	0.55
query22	0.74	1.17	0.67
query23	14.96	1.39	0.65
query24	6.99	1.92	0.83
query25	0.49	0.18	0.20
query26	0.61	0.16	0.13
query27	0.05	0.05	0.05
query28	10.26	0.84	0.42
query29	12.60	3.98	3.28
query30	0.26	0.09	0.07
query31	2.84	0.58	0.38
query32	3.23	0.56	0.48
query33	2.99	3.03	3.04
query34	15.60	5.15	4.50
query35	4.51	4.59	4.52
query36	0.67	0.49	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.03
query40	0.17	0.14	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.81 s
Total hot run time: 31.39 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/2) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 48.88% (13091/26780)
Line Coverage 38.44% (112853/293557)
Region Coverage 37.25% (57388/154055)
Branch Coverage 32.32% (28829/89192)

Contributor

@zhannngchen zhannngchen left a comment

LGTM

github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Mar 18, 2025

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit c33fad6 into apache:master Mar 18, 2025
33 of 35 checks passed
github-actions bot pushed a commit that referenced this pull request Mar 18, 2025
…placed by fake version when merging tmp rowset in sort SC (#49193)
dataroaring pushed a commit that referenced this pull request Mar 19, 2025
…e wrongly replaced by fake version when merging tmp rowset in sort SC #49193 (#49219)

Cherry-picked from #49193

Co-authored-by: bobhan1 <baohan@selectdb.com>
yiguolei pushed a commit that referenced this pull request Mar 19, 2025
…__` be wrongly replaced by fake version when merging tmp rowset in sort SC #49193 (#49222)

pick #49193
@gavinchou gavinchou mentioned this pull request Apr 23, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…placed by fake version when merging tmp rowset in sort SC (apache#49193)
deardeng pushed a commit to deardeng/incubator-doris that referenced this pull request Dec 19, 2025
…__` be wrongly replaced by fake version when merging tmp rowset in sort SC apache#49193 (apache#49222) (apache#335)

pick apache#49193

Labels

approved (Indicates a PR has been approved by one committer.), dev/2.1.9-merged, dev/3.0.5-merged, p0_b, reviewed

6 participants