@hubgeter
Contributor

Backport of #48723.

What problem does this PR solve?

Problem Summary:
The native reader can now read Paimon tables whose top-level schema has been changed. Tables in which the internal schema of a struct column has been changed are not supported yet.

Top-level schema changes (supported):

-- Spark SQL
ALTER TABLE table_name ADD COLUMNS (c1 INT,c2 STRING);
ALTER TABLE table_name RENAME COLUMN c0 TO c1;
ALTER TABLE table_name DROP COLUMNS (c1, c2);
ALTER TABLE table_name ADD COLUMN c INT FIRST;
ALTER TABLE table_name ADD COLUMN c INT AFTER b;
ALTER TABLE table_name ALTER COLUMN col_a FIRST;
ALTER TABLE table_name ALTER COLUMN col_a AFTER col_b;

Changes to the internal schema of a struct column (not supported yet; support is planned for the next PR):

-- Spark SQL
ALTER TABLE table_name ADD COLUMN v.value.f3 STRING;
ALTER TABLE table_name RENAME COLUMN v.f1 TO f100;
ALTER TABLE table_name DROP COLUMN v.value.f3;
ALTER TABLE table_name ALTER COLUMN v.col_a FIRST;
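
The description does not say how the reader reconciles data files written under an older schema with the current one. One common approach in table formats that give every column a stable field ID is to match columns by ID instead of by name or position. The following is a minimal sketch under that assumption only; `Field` and `map_columns` are hypothetical names for illustration and are not Doris or Paimon APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Field:
    """Hypothetical column descriptor: an ID that never changes, plus a name."""
    field_id: int
    name: str


def map_columns(table_schema: List[Field],
                file_schema: List[Field]) -> Dict[str, Optional[str]]:
    """Map each current column name to the name it has in an older data file.

    A value of None means the file predates the column, so a reader would
    have to fill it with NULL (or a default). File columns whose ID no longer
    appears in the table schema were dropped and are simply ignored.
    """
    file_name_by_id = {f.field_id: f.name for f in file_schema}
    return {f.name: file_name_by_id.get(f.field_id) for f in table_schema}


# Data file written while the table had columns (c0, b, old).
file_schema = [Field(0, "c0"), Field(1, "b"), Field(2, "old")]

# Current schema after, in Spark SQL terms:
#   RENAME COLUMN c0 TO c1; DROP COLUMNS (old); ADD COLUMN c INT FIRST
table_schema = [Field(3, "c"), Field(0, "c1"), Field(1, "b")]

mapping = map_columns(table_schema, file_schema)
for current_name, file_name in mapping.items():
    print(current_name, "<-", file_name)
```

Under this reading, renames and reordering need no special handling at the top level because the ID is unchanged. The same lookup would have to be applied recursively to the children of a struct column to cover the nested statements listed above.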

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hubgeter hubgeter requested a review from morrySnow as a code owner June 24, 2025 03:22
@Thearas
Contributor

Thearas commented Jun 24, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.23% (1117/1342)
Line Coverage 66.47% (18978/28553)
Region Coverage 66.23% (9422/14227)
Branch Coverage 56.18% (5100/9078)

@hubgeter
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.20% (1119/1345)
Line Coverage 66.46% (18998/28586)
Region Coverage 66.18% (9424/14239)
Branch Coverage 56.15% (5098/9080)

@hubgeter
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.25% (1123/1349)
Line Coverage 66.76% (19336/28962)
Region Coverage 66.50% (9577/14402)
Branch Coverage 56.51% (5205/9210)

@doris-robot

TPC-H: Total hot run time: 39903 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit bef3bbfa1754bf0c7ae6fa8f5f677fa4574b1fef, data reload: false

------ Round 1 ----------------------------------
q1	17652	7061	6646	6646
q2	2069	175	170	170
q3	10645	1084	1200	1084
q4	10564	749	704	704
q5	7767	2929	2764	2764
q6	219	136	136	136
q7	998	635	622	622
q8	9352	1997	2044	1997
q9	6609	6454	6479	6454
q10	7027	2297	2302	2297
q11	458	261	259	259
q12	398	214	213	213
q13	17781	2980	2984	2980
q14	244	204	226	204
q15	513	468	472	468
q16	472	378	372	372
q17	1007	671	571	571
q18	7574	6692	6618	6618
q19	1350	987	977	977
q20	466	210	208	208
q21	4150	3227	3195	3195
q22	1106	1008	964	964
Total cold run time: 108421 ms
Total hot run time: 39903 ms
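
The report does not label its columns. The Round 1 figures are consistent with this reading: each row is the query name, one cold run, two hot runs, and the faster of the two hot runs, and the totals are the sums of the cold column and of the faster hot run. The sketch below checks that interpretation against the Round 1 rows; the column meaning is an inference from the numbers, not something the bot states, and `totals` is a hypothetical helper.

```python
# Round 1 rows copied from the report above (whitespace-separated):
# query, cold run, hot run 1, hot run 2, reported best hot run (all in ms).
ROUND_1 = """\
q1 17652 7061 6646 6646
q2 2069 175 170 170
q3 10645 1084 1200 1084
q4 10564 749 704 704
q5 7767 2929 2764 2764
q6 219 136 136 136
q7 998 635 622 622
q8 9352 1997 2044 1997
q9 6609 6454 6479 6454
q10 7027 2297 2302 2297
q11 458 261 259 259
q12 398 214 213 213
q13 17781 2980 2984 2980
q14 244 204 226 204
q15 513 468 472 468
q16 472 378 372 372
q17 1007 671 571 571
q18 7574 6692 6618 6618
q19 1350 987 977 977
q20 466 210 208 208
q21 4150 3227 3195 3195
q22 1106 1008 964 964
"""


def totals(report: str) -> tuple:
    """Return (total cold ms, total hot ms), taking min(hot1, hot2) per query."""
    cold_total = 0
    hot_total = 0
    for line in report.splitlines():
        _name, cold, hot1, hot2, _reported_best = line.split()
        cold_total += int(cold)
        hot_total += min(int(hot1), int(hot2))
    return cold_total, hot_total


cold_ms, hot_ms = totals(ROUND_1)
print(f"Total cold run time: {cold_ms} ms")
print(f"Total hot run time: {hot_ms} ms")
```

The computed sums match the two totals the bot reports for Round 1, which supports this reading of the columns.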

----- Round 2, with runtime_filter_mode=off -----
q1	6778	6589	6531	6531
q2	338	229	234	229
q3	2922	2809	2964	2809
q4	2027	1821	1825	1821
q5	5783	5778	5734	5734
q6	208	128	132	128
q7	2197	1775	1802	1775
q8	3409	3609	3515	3515
q9	8958	8906	8960	8906
q10	3567	3571	3525	3525
q11	607	494	496	494
q12	808	570	606	570
q13	8319	3188	3128	3128
q14	304	270	273	270
q15	512	472	463	463
q16	502	458	444	444
q17	1858	1593	1581	1581
q18	8269	7927	7692	7692
q19	1715	1510	1499	1499
q20	2119	1847	1807	1807
q21	5175	4930	4951	4930
q22	1138	1075	1027	1027
Total cold run time: 67513 ms
Total hot run time: 58878 ms

@doris-robot

TPC-DS: Total hot run time: 197340 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit bef3bbfa1754bf0c7ae6fa8f5f677fa4574b1fef, data reload: false

query1	1277	925	905	905
query2	6322	1961	1944	1944
query3	10963	4353	4406	4353
query4	61795	29107	23586	23586
query5	5267	464	486	464
query6	386	179	186	179
query7	5414	318	312	312
query8	315	226	227	226
query9	8441	2612	2592	2592
query10	467	274	262	262
query11	17757	15471	15565	15471
query12	162	106	104	104
query13	1419	454	427	427
query14	9785	7417	7049	7049
query15	211	178	190	178
query16	7167	523	522	522
query17	1172	583	601	583
query18	1900	329	315	315
query19	214	160	169	160
query20	121	114	116	114
query21	210	105	106	105
query22	4683	4629	4599	4599
query23	34521	33865	34518	33865
query24	6218	2928	2926	2926
query25	544	439	427	427
query26	666	183	170	170
query27	1955	363	367	363
query28	4091	2136	2148	2136
query29	708	461	455	455
query30	238	160	155	155
query31	998	842	822	822
query32	73	57	59	57
query33	485	306	303	303
query34	932	500	511	500
query35	826	718	758	718
query36	1060	962	974	962
query37	115	69	70	69
query38	4128	3978	3967	3967
query39	1485	1461	1467	1461
query40	198	103	101	101
query41	47	46	52	46
query42	113	105	103	103
query43	522	483	492	483
query44	1184	863	829	829
query45	189	181	168	168
query46	1189	749	724	724
query47	2069	1967	1953	1953
query48	433	347	371	347
query49	754	389	394	389
query50	832	438	444	438
query51	7373	7250	7187	7187
query52	100	95	90	90
query53	259	185	187	185
query54	581	472	457	457
query55	84	77	83	77
query56	261	244	251	244
query57	1334	1233	1238	1233
query58	230	218	222	218
query59	3222	3002	2897	2897
query60	279	265	263	263
query61	116	114	137	114
query62	790	722	675	675
query63	215	188	184	184
query64	1403	695	630	630
query65	3476	3228	3173	3173
query66	706	294	289	289
query67	15867	15589	15638	15589
query68	4148	605	583	583
query69	433	262	261	261
query70	1169	1082	1151	1082
query71	334	262	254	254
query72	6347	4012	4039	4012
query73	755	358	364	358
query74	10079	8947	9042	8947
query75	3362	2665	2639	2639
query76	2023	1154	1121	1121
query77	488	276	264	264
query78	10490	9647	9555	9555
query79	1476	604	611	604
query80	871	438	432	432
query81	503	212	220	212
query82	1260	95	87	87
query83	239	153	147	147
query84	284	81	77	77
query85	909	309	290	290
query86	320	290	282	282
query87	4382	4230	4277	4230
query88	3849	2369	2356	2356
query89	416	294	292	292
query90	2017	186	187	186
query91	143	107	110	107
query92	65	50	50	50
query93	1906	563	562	562
query94	750	290	295	290
query95	358	258	256	256
query96	612	279	280	279
query97	3275	3150	3141	3141
query98	216	204	194	194
query99	1606	1310	1298	1298
Total cold run time: 313402 ms
Total hot run time: 197340 ms

@doris-robot

ClickBench: Total hot run time: 32.01 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit bef3bbfa1754bf0c7ae6fa8f5f677fa4574b1fef, data reload: false

query1	0.04	0.03	0.03
query2	0.09	0.04	0.04
query3	0.22	0.05	0.05
query4	1.64	0.09	0.09
query5	0.52	0.52	0.51
query6	1.12	0.75	0.74
query7	0.03	0.02	0.02
query8	0.06	0.04	0.04
query9	0.56	0.53	0.49
query10	0.57	0.55	0.55
query11	0.17	0.12	0.12
query12	0.16	0.13	0.12
query13	0.60	0.61	0.61
query14	0.77	0.80	0.83
query15	0.85	0.84	0.83
query16	0.37	0.38	0.39
query17	1.05	1.03	1.03
query18	0.18	0.17	0.19
query19	1.89	1.77	1.79
query20	0.01	0.01	0.01
query21	15.40	0.68	0.67
query22	3.64	6.21	2.96
query23	18.23	1.50	1.42
query24	2.19	0.22	0.24
query25	0.16	0.08	0.08
query26	0.28	0.18	0.18
query27	0.09	0.08	0.08
query28	13.29	0.59	0.57
query29	12.68	3.32	3.29
query30	0.24	0.06	0.06
query31	2.86	0.40	0.40
query32	3.23	0.48	0.49
query33	2.98	3.03	3.02
query34	16.95	4.58	4.53
query35	4.62	4.55	4.65
query36	0.66	0.47	0.48
query37	0.21	0.16	0.16
query38	0.17	0.16	0.16
query39	0.05	0.04	0.04
query40	0.17	0.13	0.14
query41	0.10	0.05	0.05
query42	0.07	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 109.22 s
Total hot run time: 32.01 s

@hubgeter
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.25% (1123/1349)
Line Coverage 66.76% (19335/28962)
Region Coverage 66.46% (9571/14402)
Branch Coverage 56.48% (5202/9210)

@doris-robot

TPC-H: Total hot run time: 39954 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 781ca8792c4de8928ad2b7c21be28a1a9fb0cba6, data reload: false

------ Round 1 ----------------------------------
q1	17599	6753	6652	6652
q2	2088	172	172	172
q3	10648	1074	1168	1074
q4	10562	780	674	674
q5	7750	2957	2884	2884
q6	218	134	137	134
q7	969	620	598	598
q8	9371	1939	1982	1939
q9	6679	6427	6430	6427
q10	6972	2318	2290	2290
q11	468	255	255	255
q12	396	201	203	201
q13	17796	2985	2996	2985
q14	263	208	214	208
q15	522	458	458	458
q16	462	368	375	368
q17	994	541	646	541
q18	7467	6718	6715	6715
q19	1323	1001	1024	1001
q20	502	206	196	196
q21	3969	3194	3200	3194
q22	1094	988	1010	988
Total cold run time: 108112 ms
Total hot run time: 39954 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6608	6633	6598	6598
q2	329	231	239	231
q3	2903	2778	2872	2778
q4	2023	1771	1834	1771
q5	5732	5802	5820	5802
q6	221	130	125	125
q7	2238	1807	1869	1807
q8	3403	3579	3546	3546
q9	9000	8867	9022	8867
q10	3561	3529	3523	3523
q11	602	503	489	489
q12	820	576	643	576
q13	8313	3135	3194	3135
q14	311	270	284	270
q15	523	464	464	464
q16	478	428	443	428
q17	1865	1626	1613	1613
q18	8215	7869	7855	7855
q19	1690	1435	1523	1435
q20	2116	1837	1801	1801
q21	5244	4936	5021	4936
q22	1103	1036	1027	1027
Total cold run time: 67298 ms
Total hot run time: 59077 ms

@doris-robot

TPC-DS: Total hot run time: 196410 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 781ca8792c4de8928ad2b7c21be28a1a9fb0cba6, data reload: false

query1	1313	930	919	919
query2	6430	1912	1856	1856
query3	10809	4312	4090	4090
query4	61252	29379	23535	23535
query5	5226	440	447	440
query6	415	182	180	180
query7	5480	309	313	309
query8	315	235	228	228
query9	8459	2598	2592	2592
query10	476	267	262	262
query11	17514	15180	16031	15180
query12	164	108	109	108
query13	1458	442	440	440
query14	10842	7317	7128	7128
query15	206	186	173	173
query16	7143	480	479	479
query17	1120	580	576	576
query18	2021	307	309	307
query19	205	166	155	155
query20	120	110	113	110
query21	210	107	108	107
query22	4673	4417	4603	4417
query23	34630	33981	34358	33981
query24	6209	2872	2991	2872
query25	563	442	421	421
query26	655	171	172	171
query27	1876	365	350	350
query28	4169	2170	2178	2170
query29	698	450	460	450
query30	243	153	161	153
query31	1036	819	853	819
query32	74	63	59	59
query33	456	317	310	310
query34	924	515	521	515
query35	839	746	746	746
query36	1093	955	979	955
query37	112	65	70	65
query38	4050	3978	4018	3978
query39	1543	1499	1477	1477
query40	207	98	99	98
query41	46	44	44	44
query42	107	100	101	100
query43	540	500	475	475
query44	1176	803	809	803
query45	185	166	177	166
query46	1138	721	748	721
query47	2048	1955	1905	1905
query48	444	342	339	339
query49	716	384	386	384
query50	820	426	423	423
query51	7451	7334	7280	7280
query52	105	86	87	86
query53	260	201	188	188
query54	593	457	473	457
query55	77	77	80	77
query56	279	248	242	242
query57	1314	1222	1238	1222
query58	230	205	213	205
query59	3271	3034	2835	2835
query60	278	244	255	244
query61	112	142	107	107
query62	785	670	662	662
query63	219	182	183	182
query64	1374	641	631	631
query65	3285	3185	3195	3185
query66	704	310	309	309
query67	15823	15680	15469	15469
query68	4183	606	581	581
query69	422	259	257	257
query70	1125	1087	1122	1087
query71	355	258	251	251
query72	6343	4039	4102	4039
query73	756	361	357	357
query74	10599	9059	9293	9059
query75	3362	2622	2696	2622
query76	2037	1153	1085	1085
query77	506	276	265	265
query78	10566	9567	9580	9567
query79	1158	598	595	595
query80	840	421	422	421
query81	513	221	218	218
query82	206	90	87	87
query83	165	145	149	145
query84	292	85	78	78
query85	913	309	299	299
query86	334	294	299	294
query87	4416	4259	4197	4197
query88	4028	2377	2351	2351
query89	414	289	286	286
query90	1999	182	185	182
query91	140	105	102	102
query92	65	52	50	50
query93	1422	551	555	551
query94	739	295	300	295
query95	360	249	252	249
query96	604	282	281	281
query97	3343	3152	3151	3151
query98	212	201	195	195
query99	1917	1319	1291	1291
Total cold run time: 312942 ms
Total hot run time: 196410 ms

@doris-robot

ClickBench: Total hot run time: 31.34 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 781ca8792c4de8928ad2b7c21be28a1a9fb0cba6, data reload: false

query1	0.03	0.03	0.04
query2	0.10	0.05	0.04
query3	0.23	0.05	0.06
query4	1.64	0.07	0.08
query5	0.50	0.50	0.52
query6	1.14	0.75	0.76
query7	0.02	0.01	0.01
query8	0.05	0.05	0.05
query9	0.56	0.52	0.50
query10	0.54	0.56	0.58
query11	0.16	0.12	0.12
query12	0.15	0.12	0.12
query13	0.61	0.60	0.60
query14	0.79	0.80	0.82
query15	0.84	0.84	0.84
query16	0.38	0.37	0.36
query17	1.07	1.06	1.07
query18	0.18	0.18	0.19
query19	1.86	1.77	1.80
query20	0.01	0.01	0.01
query21	15.39	0.66	0.64
query22	4.20	6.32	2.34
query23	18.28	1.43	1.36
query24	2.22	0.22	0.23
query25	0.16	0.10	0.08
query26	0.27	0.19	0.18
query27	0.08	0.08	0.07
query28	13.28	0.60	0.57
query29	12.67	3.37	3.36
query30	0.25	0.06	0.05
query31	2.87	0.40	0.40
query32	3.23	0.48	0.48
query33	2.96	2.99	3.04
query34	16.94	4.52	4.50
query35	4.62	4.67	4.60
query36	0.67	0.49	0.47
query37	0.19	0.16	0.17
query38	0.17	0.16	0.15
query39	0.05	0.04	0.04
query40	0.17	0.13	0.13
query41	0.11	0.06	0.05
query42	0.06	0.04	0.05
query43	0.05	0.04	0.04
Total cold run time: 109.75 s
Total hot run time: 31.34 s
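
For a quick side-by-side of the two benchmark rounds in this thread (commit bef3bbfa… and commit 781ca879…), the hot-run totals can be compared as a relative change. This is a small sketch using only the totals the bot reported; `relative_change` is a hypothetical helper, and the sketch makes no claim about whether the differences are significant or just run-to-run noise.

```python
# Hot-run totals reported by the bot: (first round, second round).
HOT_TOTALS = {
    "TPC-H (ms)": (39903, 39954),
    "TPC-DS (ms)": (197340, 196410),
    "ClickBench (s)": (32.01, 31.34),
}


def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`; negative means faster."""
    return (after - before) / before * 100.0


changes = {
    name: relative_change(before, after)
    for name, (before, after) in HOT_TOTALS.items()
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

All three totals differ by a little over 2% at most between the two rounds.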

@hubgeter
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.43% (1128/1352)
Line Coverage 67.27% (19649/29209)
Region Coverage 66.93% (9685/14470)
Branch Coverage 56.87% (5266/9260)

…ge table. (apache#48723)
@hubgeter
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.46% (1130/1354)
Line Coverage 67.24% (19649/29222)
Region Coverage 67.26% (9978/14835)
Branch Coverage 56.91% (5270/9260)

@morrySnow morrySnow merged commit 384fb2b into apache:branch-3.1 Jun 28, 2025
18 of 21 checks passed