@Mryange
Contributor

@Mryange Mryange commented Mar 17, 2025

What problem does this PR solve?

The url_encode function previously performed a modulus operation on a signed number, so the bytes of multi-byte UTF-8 characters were encoded incorrectly. Converting the value to an unsigned number fixes the issue.

before
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %5.%23%0-%5.%10%/(   |
+----------------------+
now
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %E7%BC%96%E7%A0%81   |
+----------------------+
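
To illustrate why the unsigned conversion matters, here is a minimal sketch of percent-encoding in C++. It is not the Doris implementation: `percent_encode` and `kHex` are hypothetical names, and the set of characters left unescaped (the RFC 3986 unreserved set) is an assumption that may differ from what url_encode actually uses.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper, not the Doris implementation: percent-encodes every
// byte that is not an ASCII letter, digit, or one of "-_.~" (assumed set).
std::string percent_encode(std::string_view in) {
    static constexpr char kHex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size() * 3);
    for (char c : in) {
        // Convert to unsigned first. Where char is signed, bytes >= 0x80 are
        // negative, so `c % 16` is negative as well (e.g. -25 % 16 == -9).
        const unsigned char u = static_cast<unsigned char>(c);
        const bool unreserved = (u >= 'A' && u <= 'Z') || (u >= 'a' && u <= 'z') ||
                                (u >= '0' && u <= '9') || u == '-' || u == '_' ||
                                u == '.' || u == '~';
        if (unreserved) {
            out.push_back(c);
        } else {
            out.push_back('%');
            out.push_back(kHex[u / 16]);  // high nibble, always in 0..15
            out.push_back(kHex[u % 16]);  // low nibble, always in 0..15
        }
    }
    return out;
}
```

With this sketch, `percent_encode("编码")` on a UTF-8 input yields `%E7%BC%96%E7%A0%81`, matching the "now" output above.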

The strright function did not calculate the length according to the number of UTF-8 characters.

before
mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
|                            |
+----------------------------+
now

mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
| 你好世界                   |
+----------------------------+
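
Counting by UTF-8 characters rather than bytes can be sketched as follows. This is an illustration only, not the code from this PR: `utf8_right` is a hypothetical name, the input is assumed to be valid UTF-8, and returning an empty result for a non-positive length is an assumption about the desired behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper: returns the rightmost `n` UTF-8 characters of `s`.
// If `s` has fewer than `n` characters, the whole string is returned.
std::string_view utf8_right(std::string_view s, std::int64_t n) {
    if (n <= 0) return {};  // assumption: non-positive length gives an empty result
    std::size_t pos = s.size();
    while (pos > 0 && n > 0) {
        --pos;
        // A byte starts a character unless it is a continuation byte (10xxxxxx).
        if ((static_cast<unsigned char>(s[pos]) & 0xC0) != 0x80) --n;
    }
    return s.substr(pos);
}
```

Walking backwards from the end means the cost depends on the number of characters requested, not on the total string length, and taking the length as a 64-bit integer keeps very large lengths from overflowing in this sketch.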

The append_trailing_char_if_absent function did not consider the case where the trailing character is a multi-byte UTF-8 character.

before
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| NULL                                            |
+-------------------------------------------------+
now
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| 中文                                            |
+-------------------------------------------------+
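
The idea of validating the trailing argument by character count instead of byte count can be sketched like this. It is a hypothetical illustration, not the Doris code: the function names are made up for the example, `std::nullopt` stands in for SQL NULL, valid UTF-8 input is assumed, and returning NULL when the trailing argument is not exactly one character is an assumption based on the "before" output above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Counts UTF-8 characters by counting bytes that are not continuation bytes
// (10xxxxxx). Assumes the input is valid UTF-8.
std::size_t utf8_length(std::string_view s) {
    std::size_t n = 0;
    for (char c : s) {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical sketch: std::nullopt stands in for SQL NULL.
std::optional<std::string> append_trailing_char_if_absent(std::string_view str,
                                                          std::string_view tail) {
    // Validate by character count, not byte count: "文" is 1 character, 3 bytes.
    if (utf8_length(tail) != 1) return std::nullopt;
    std::string result(str);
    const bool has_tail = str.size() >= tail.size() &&
                          str.compare(str.size() - tail.size(), tail.size(), tail) == 0;
    if (!has_tail) result.append(tail);
    return result;
}
```

A check such as `tail.size() != 1` would reject '文' because it occupies three bytes, which is consistent with the NULL result shown before the fix.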

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Contributor Author

Mryange commented Mar 17, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32520 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 10872b968e9cb14569f5a192ea61601cf5cbd83e, data reload: false

------ Round 1 ----------------------------------
q1	17578	5247	5057	5057
q2	2055	299	164	164
q3	10604	1295	769	769
q4	10204	1059	541	541
q5	7549	2347	2425	2347
q6	185	167	130	130
q7	923	741	630	630
q8	9307	1347	1093	1093
q9	4920	4659	4737	4659
q10	6803	2309	1918	1918
q11	463	277	246	246
q12	350	351	223	223
q13	17784	3679	3082	3082
q14	243	229	207	207
q15	538	495	474	474
q16	630	596	579	579
q17	584	891	346	346
q18	6884	6593	6386	6386
q19	1225	951	546	546
q20	310	331	199	199
q21	3002	2133	1943	1943
q22	1059	1057	981	981
Total cold run time: 103200 ms
Total hot run time: 32520 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5150	5128	5108	5108
q2	241	327	229	229
q3	2180	2663	2304	2304
q4	1497	1838	1404	1404
q5	4259	4125	4184	4125
q6	215	167	126	126
q7	1909	1916	1834	1834
q8	2631	2661	2611	2611
q9	7256	7305	7304	7304
q10	3039	3264	2777	2777
q11	577	516	493	493
q12	728	769	598	598
q13	3449	3925	3299	3299
q14	305	296	276	276
q15	533	511	476	476
q16	648	691	640	640
q17	1163	1648	1351	1351
q18	7992	7727	7521	7521
q19	830	956	1176	956
q20	1995	2019	1877	1877
q21	5494	5014	4743	4743
q22	1105	1049	1049	1049
Total cold run time: 53196 ms
Total hot run time: 51101 ms

@doris-robot

TPC-DS: Total hot run time: 192556 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 10872b968e9cb14569f5a192ea61601cf5cbd83e, data reload: false

query1	1414	1021	1005	1005
query2	6260	1916	1960	1916
query3	11155	4558	4711	4558
query4	25672	23726	23291	23291
query5	4745	666	490	490
query6	319	207	203	203
query7	3985	506	295	295
query8	298	250	242	242
query9	8557	2620	2612	2612
query10	473	316	280	280
query11	15557	15233	14875	14875
query12	171	113	109	109
query13	1567	546	398	398
query14	9721	6552	6158	6158
query15	207	188	171	171
query16	7598	673	505	505
query17	1279	739	562	562
query18	2011	404	325	325
query19	193	179	160	160
query20	127	134	122	122
query21	210	125	115	115
query22	4708	4701	4668	4668
query23	34566	33865	33470	33470
query24	7662	2448	2389	2389
query25	491	478	393	393
query26	924	280	158	158
query27	2323	511	333	333
query28	4540	2471	2455	2455
query29	644	596	435	435
query30	279	230	200	200
query31	909	888	790	790
query32	72	63	63	63
query33	541	373	323	323
query34	782	858	491	491
query35	816	852	753	753
query36	978	995	905	905
query37	115	103	76	76
query38	4483	4183	4244	4183
query39	1524	1482	1474	1474
query40	216	119	106	106
query41	54	53	52	52
query42	120	112	109	109
query43	526	525	483	483
query44	1299	815	813	813
query45	183	173	168	168
query46	849	1025	636	636
query47	1857	1939	1838	1838
query48	386	418	313	313
query49	729	521	430	430
query50	729	756	426	426
query51	4295	4335	4284	4284
query52	110	104	96	96
query53	234	270	196	196
query54	488	487	407	407
query55	82	80	82	80
query56	279	269	268	268
query57	1245	1210	1128	1128
query58	247	248	234	234
query59	2800	2947	2877	2877
query60	305	307	277	277
query61	145	140	141	140
query62	797	742	665	665
query63	235	196	192	192
query64	3562	1067	696	696
query65	4533	4495	4522	4495
query66	804	406	308	308
query67	15751	15681	15259	15259
query68	8932	868	503	503
query69	474	294	259	259
query70	1192	1138	1117	1117
query71	480	295	256	256
query72	5755	3768	3649	3649
query73	766	697	367	367
query74	9023	9383	8980	8980
query75	3960	3141	2706	2706
query76	3654	1178	753	753
query77	768	358	272	272
query78	10016	10392	9320	9320
query79	2546	816	582	582
query80	608	511	438	438
query81	487	258	223	223
query82	705	121	99	99
query83	173	166	153	153
query84	242	102	73	73
query85	825	342	299	299
query86	380	310	264	264
query87	4388	4673	4314	4314
query88	3643	2251	2216	2216
query89	416	315	299	299
query90	1807	212	210	210
query91	146	137	106	106
query92	74	60	55	55
query93	1758	1075	582	582
query94	651	409	297	297
query95	354	273	251	251
query96	479	558	278	278
query97	3325	3413	3258	3258
query98	223	219	198	198
query99	1383	1395	1252	1252
Total cold run time: 280369 ms
Total hot run time: 192556 ms

@doris-robot

ClickBench: Total hot run time: 31.03 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 10872b968e9cb14569f5a192ea61601cf5cbd83e, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.03	0.03
query3	0.24	0.06	0.06
query4	1.63	0.10	0.10
query5	0.55	0.54	0.54
query6	1.18	0.71	0.72
query7	0.03	0.02	0.01
query8	0.04	0.04	0.03
query9	0.58	0.51	0.52
query10	0.59	0.61	0.59
query11	0.16	0.11	0.10
query12	0.14	0.11	0.11
query13	0.62	0.61	0.59
query14	2.69	2.81	2.72
query15	0.91	0.85	0.83
query16	0.39	0.37	0.39
query17	1.01	1.00	1.03
query18	0.22	0.19	0.20
query19	1.94	1.82	2.04
query20	0.01	0.01	0.01
query21	15.36	0.92	0.58
query22	0.75	1.13	0.64
query23	15.01	1.36	0.60
query24	6.96	1.30	0.97
query25	0.50	0.27	0.16
query26	0.53	0.16	0.13
query27	0.05	0.05	0.04
query28	10.27	0.87	0.41
query29	12.53	4.03	3.37
query30	0.25	0.09	0.07
query31	2.83	0.56	0.39
query32	3.22	0.55	0.46
query33	3.00	2.99	3.04
query34	15.62	5.14	4.52
query35	4.60	4.59	4.54
query36	0.67	0.50	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.02
query40	0.16	0.13	0.14
query41	0.09	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 105.69 s
Total hot run time: 31.03 s

@doris-robot

BE UT Coverage Report

Increment line coverage 53.85% (21/39) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 48.37% (12942/26756)
Line Coverage 37.85% (111019/293279)
Region Coverage 36.84% (56682/153878)
Branch Coverage 32.05% (28546/89054)

@Mryange
Contributor Author

Mryange commented Mar 20, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34030 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit aec8d41843779e095ea6b8bc1036c5f11631b347, data reload: false

------ Round 1 ----------------------------------
q1	23936	5044	5013	5013
q2	2052	287	164	164
q3	10423	1226	662	662
q4	10226	995	525	525
q5	7834	2399	2366	2366
q6	273	161	133	133
q7	916	759	615	615
q8	9309	1267	1009	1009
q9	7542	5137	5153	5137
q10	6858	2338	1882	1882
q11	483	281	264	264
q12	347	356	216	216
q13	17799	3706	3085	3085
q14	237	229	208	208
q15	554	483	485	483
q16	626	628	586	586
q17	571	868	330	330
q18	7928	7195	7161	7161
q19	1520	966	550	550
q20	313	334	211	211
q21	3866	3432	2469	2469
q22	1054	1023	961	961
Total cold run time: 114667 ms
Total hot run time: 34030 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5216	5117	5141	5117
q2	234	339	236	236
q3	2120	2675	2316	2316
q4	1410	1833	1384	1384
q5	4547	4463	4429	4429
q6	217	165	123	123
q7	1960	1894	1734	1734
q8	2581	2624	2576	2576
q9	7291	7099	7167	7099
q10	3008	3206	2757	2757
q11	563	517	499	499
q12	678	771	611	611
q13	3564	3914	3238	3238
q14	292	286	271	271
q15	523	472	473	472
q16	648	702	644	644
q17	1139	1647	1320	1320
q18	7645	7531	7451	7451
q19	792	806	853	806
q20	1932	1998	1832	1832
q21	5297	4815	4647	4647
q22	1069	1033	971	971
Total cold run time: 52726 ms
Total hot run time: 50533 ms

@doris-robot

TPC-DS: Total hot run time: 186701 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit aec8d41843779e095ea6b8bc1036c5f11631b347, data reload: false

query1	985	487	481	481
query2	6547	1976	1965	1965
query3	6812	225	220	220
query4	26670	23208	23101	23101
query5	4316	636	481	481
query6	271	187	182	182
query7	4613	498	271	271
query8	301	257	243	243
query9	8614	2580	2581	2580
query10	440	304	257	257
query11	15496	15215	14969	14969
query12	163	108	111	108
query13	1657	511	407	407
query14	10044	6180	6648	6180
query15	206	187	179	179
query16	7443	642	471	471
query17	1213	733	560	560
query18	1978	403	307	307
query19	196	190	159	159
query20	124	125	128	125
query21	214	123	105	105
query22	4433	4532	4330	4330
query23	33768	33043	33072	33043
query24	7661	2361	2373	2361
query25	517	443	400	400
query26	1224	264	141	141
query27	2202	489	317	317
query28	3971	2433	2385	2385
query29	762	552	422	422
query30	284	219	194	194
query31	927	858	803	803
query32	73	65	65	65
query33	558	362	300	300
query34	778	809	499	499
query35	800	812	733	733
query36	931	982	881	881
query37	123	100	79	79
query38	4037	4137	4027	4027
query39	1441	1393	1386	1386
query40	209	123	100	100
query41	56	54	52	52
query42	121	105	103	103
query43	511	504	473	473
query44	1279	795	795	795
query45	173	169	161	161
query46	830	1022	611	611
query47	1759	1822	1746	1746
query48	364	399	300	300
query49	777	494	433	433
query50	708	732	404	404
query51	4216	4168	4033	4033
query52	107	106	96	96
query53	216	265	184	184
query54	474	482	399	399
query55	80	78	83	78
query56	261	278	261	261
query57	1143	1168	1107	1107
query58	244	238	239	238
query59	2712	2748	2670	2670
query60	290	273	262	262
query61	127	123	126	123
query62	791	758	670	670
query63	225	183	179	179
query64	4323	985	686	686
query65	4393	4320	4321	4320
query66	1153	443	300	300
query67	15892	15465	15260	15260
query68	8177	867	502	502
query69	474	308	259	259
query70	1219	1141	1129	1129
query71	475	290	272	272
query72	5788	5046	4844	4844
query73	681	568	335	335
query74	8929	9101	8967	8967
query75	3927	3273	2707	2707
query76	3754	1174	764	764
query77	777	373	294	294
query78	10088	10097	9329	9329
query79	2917	812	561	561
query80	712	498	504	498
query81	497	256	222	222
query82	695	122	97	97
query83	176	170	156	156
query84	245	93	76	76
query85	807	349	306	306
query86	377	294	298	294
query87	4469	4423	4295	4295
query88	3729	2206	2259	2206
query89	388	312	269	269
query90	1835	205	210	205
query91	141	141	112	112
query92	77	58	59	58
query93	1932	1060	580	580
query94	655	406	306	306
query95	358	260	260	260
query96	483	553	276	276
query97	3368	3372	3284	3284
query98	230	207	205	205
query99	1418	1432	1272	1272
Total cold run time: 276025 ms
Total hot run time: 186701 ms

@doris-robot

ClickBench: Total hot run time: 31.57 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit aec8d41843779e095ea6b8bc1036c5f11631b347, data reload: false

query1	0.04	0.03	0.04
query2	0.11	0.10	0.10
query3	0.26	0.19	0.19
query4	1.59	0.20	0.19
query5	0.58	0.58	0.58
query6	1.18	0.73	0.71
query7	0.03	0.02	0.02
query8	0.04	0.04	0.04
query9	0.59	0.54	0.52
query10	0.58	0.57	0.56
query11	0.15	0.11	0.11
query12	0.15	0.11	0.10
query13	0.62	0.60	0.59
query14	2.77	2.69	2.67
query15	0.92	0.84	0.84
query16	0.38	0.36	0.36
query17	1.01	1.01	1.04
query18	0.22	0.19	0.20
query19	1.91	1.86	1.90
query20	0.01	0.01	0.01
query21	15.36	0.92	0.55
query22	0.76	1.19	0.92
query23	14.70	1.34	0.62
query24	7.49	1.01	1.09
query25	0.53	0.41	0.06
query26	0.59	0.16	0.14
query27	0.08	0.05	0.05
query28	9.79	0.84	0.41
query29	12.53	4.01	3.35
query30	0.25	0.09	0.06
query31	2.82	0.58	0.39
query32	3.23	0.56	0.46
query33	3.07	3.09	3.01
query34	15.58	5.12	4.52
query35	4.54	4.57	4.56
query36	0.67	0.50	0.48
query37	0.09	0.06	0.07
query38	0.05	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.13
query41	0.09	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 105.62 s
Total hot run time: 31.57 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 53.85% (21/39) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 49.98% (13370/26751)
Line Coverage 39.41% (115611/293353)
Region Coverage 38.13% (58716/153971)
Branch Coverage 33.26% (29628/89084)

@LiBinfeng-01
Contributor

We also need to check this case: select strright('привет', 2147483648)

Gabriel39 previously approved these changes May 6, 2025
@Mryange
Contributor Author

Mryange commented May 6, 2025

run buildall

@github-actions
Contributor

github-actions bot commented May 6, 2025

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels May 6, 2025
@github-actions
Contributor

github-actions bot commented May 6, 2025

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 33839 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit aec8d41843779e095ea6b8bc1036c5f11631b347, data reload: false

------ Round 1 ----------------------------------
q1	26607	5171	4998	4998
q2	2074	291	187	187
q3	10464	1266	719	719
q4	10249	999	518	518
q5	8129	2490	2318	2318
q6	190	164	135	135
q7	911	759	610	610
q8	9321	1288	1071	1071
q9	6802	5079	5125	5079
q10	6815	2308	1861	1861
q11	490	278	257	257
q12	343	348	212	212
q13	17758	3666	3094	3094
q14	244	221	211	211
q15	538	492	492	492
q16	423	431	372	372
q17	598	840	377	377
q18	7666	7342	7145	7145
q19	1274	936	552	552
q20	338	337	228	228
q21	4091	3345	2437	2437
q22	1014	1016	966	966
Total cold run time: 116339 ms
Total hot run time: 33839 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5073	5109	5056	5056
q2	240	328	238	238
q3	2187	2619	2272	2272
q4	1381	1781	1463	1463
q5	4516	4439	4355	4355
q6	221	164	125	125
q7	1981	1934	1742	1742
q8	2578	2462	2588	2462
q9	7174	7192	7255	7192
q10	2965	3181	2720	2720
q11	572	516	486	486
q12	688	746	603	603
q13	3479	3920	3362	3362
q14	271	299	279	279
q15	520	473	468	468
q16	453	484	461	461
q17	1133	1531	1313	1313
q18	7844	7556	7367	7367
q19	807	800	807	800
q20	1932	2028	1802	1802
q21	5021	4690	4564	4564
q22	1064	1035	953	953
Total cold run time: 52100 ms
Total hot run time: 50083 ms

@doris-robot

TPC-DS: Total hot run time: 185699 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit aec8d41843779e095ea6b8bc1036c5f11631b347, data reload: false

query1	1006	480	493	480
query2	6568	1855	1805	1805
query3	6755	217	219	217
query4	26497	23272	23259	23259
query5	4319	629	442	442
query6	314	203	192	192
query7	4620	487	280	280
query8	281	241	254	241
query9	8621	2560	2571	2560
query10	476	349	270	270
query11	15719	15105	14830	14830
query12	159	110	107	107
query13	1658	545	416	416
query14	8894	6128	6254	6128
query15	205	201	173	173
query16	7156	649	472	472
query17	1179	729	580	580
query18	1986	403	314	314
query19	203	193	162	162
query20	130	121	125	121
query21	221	126	111	111
query22	4185	4335	4138	4138
query23	33783	33088	33037	33037
query24	8537	2453	2401	2401
query25	547	453	398	398
query26	1244	278	156	156
query27	2731	489	331	331
query28	4325	2076	2041	2041
query29	759	565	438	438
query30	288	256	190	190
query31	923	830	760	760
query32	70	67	60	60
query33	559	403	303	303
query34	794	846	529	529
query35	767	833	735	735
query36	956	955	883	883
query37	116	103	98	98
query38	4141	4154	4065	4065
query39	1472	1429	1411	1411
query40	217	124	107	107
query41	59	56	54	54
query42	119	113	108	108
query43	496	494	468	468
query44	1327	791	803	791
query45	178	177	167	167
query46	839	1024	638	638
query47	1774	1779	1755	1755
query48	372	436	299	299
query49	782	517	428	428
query50	682	697	398	398
query51	4100	4104	4052	4052
query52	112	102	97	97
query53	229	252	185	185
query54	610	575	503	503
query55	83	81	87	81
query56	329	312	293	293
query57	1132	1137	1107	1107
query58	264	249	277	249
query59	2573	2687	2562	2562
query60	335	326	311	311
query61	131	149	129	129
query62	809	715	682	682
query63	232	190	183	183
query64	4349	1054	688	688
query65	4311	4203	4215	4203
query66	1151	434	313	313
query67	15773	15661	15334	15334
query68	8040	901	507	507
query69	528	304	263	263
query70	1163	1142	1118	1118
query71	466	319	291	291
query72	5642	4674	4811	4674
query73	735	638	342	342
query74	8897	9173	8680	8680
query75	3910	3246	2678	2678
query76	3759	1278	756	756
query77	783	390	288	288
query78	9951	10125	9246	9246
query79	1936	822	573	573
query80	596	542	444	444
query81	476	249	217	217
query82	448	132	98	98
query83	255	252	250	250
query84	267	101	90	90
query85	795	435	326	326
query86	337	307	287	287
query87	4340	4452	4365	4365
query88	3399	2223	2208	2208
query89	386	308	279	279
query90	1936	221	224	221
query91	145	143	113	113
query92	80	61	60	60
query93	1187	946	577	577
query94	668	403	308	308
query95	380	286	287	286
query96	490	561	278	278
query97	3154	3267	3083	3083
query98	235	210	204	204
query99	1447	1414	1270	1270
Total cold run time: 273363 ms
Total hot run time: 185699 ms

@doris-robot

ClickBench: Total hot run time: 29.47 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit aec8d41843779e095ea6b8bc1036c5f11631b347, data reload: false

query1	0.04	0.04	0.03
query2	0.13	0.10	0.11
query3	0.25	0.20	0.20
query4	1.59	0.20	0.19
query5	0.60	0.58	0.59
query6	1.21	0.72	0.74
query7	0.02	0.02	0.02
query8	0.05	0.04	0.04
query9	0.58	0.53	0.52
query10	0.58	0.57	0.56
query11	0.16	0.10	0.11
query12	0.15	0.11	0.12
query13	0.62	0.61	0.60
query14	0.78	0.80	0.80
query15	0.87	0.85	0.87
query16	0.39	0.38	0.39
query17	1.02	1.10	1.05
query18	0.22	0.21	0.20
query19	1.97	1.84	1.81
query20	0.01	0.01	0.02
query21	15.40	0.91	0.55
query22	0.75	1.21	0.71
query23	14.83	1.37	0.60
query24	6.70	1.47	1.01
query25	0.49	0.28	0.08
query26	0.62	0.16	0.13
query27	0.05	0.05	0.05
query28	10.07	0.87	0.45
query29	12.56	3.93	3.29
query30	0.26	0.09	0.06
query31	2.82	0.59	0.39
query32	3.24	0.55	0.47
query33	3.07	3.01	3.13
query34	15.68	5.12	4.51
query35	4.49	4.58	4.49
query36	0.67	0.50	0.49
query37	0.08	0.06	0.07
query38	0.06	0.05	0.03
query39	0.03	0.02	0.03
query40	0.17	0.13	0.13
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.04	0.02
Total cold run time: 103.44 s
Total hot run time: 29.47 s

@Mryange Mryange force-pushed the fix-utf8-3-to-1 branch from aec8d41 to 7d687b3 Compare May 6, 2025 04:04
@github-actions github-actions bot removed the approved label May 6, 2025
@Mryange
Contributor Author

Mryange commented May 6, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34003 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7d687b3f3610e9922b8e034690972b8276b192ef, data reload: false

------ Round 1 ----------------------------------
q1	26539	5103	5058	5058
q2	2099	299	189	189
q3	10388	1280	718	718
q4	10237	1004	534	534
q5	8460	2444	2353	2353
q6	262	164	131	131
q7	909	747	607	607
q8	9315	1341	1060	1060
q9	6891	5149	5084	5084
q10	6836	2332	1907	1907
q11	466	286	268	268
q12	347	356	216	216
q13	17789	3627	3055	3055
q14	237	227	220	220
q15	545	504	498	498
q16	419	430	368	368
q17	607	856	363	363
q18	7824	7111	7173	7111
q19	1350	978	575	575
q20	337	354	236	236
q21	4013	3411	2480	2480
q22	1020	999	972	972
Total cold run time: 116890 ms
Total hot run time: 34003 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5108	5072	5088	5072
q2	246	328	230	230
q3	2145	2667	2275	2275
q4	1382	1798	1478	1478
q5	4574	4434	4338	4338
q6	212	173	124	124
q7	1962	1888	1728	1728
q8	2607	2618	2541	2541
q9	7123	7226	7169	7169
q10	2994	3180	2758	2758
q11	579	519	498	498
q12	677	766	616	616
q13	3488	3848	3192	3192
q14	268	299	258	258
q15	536	497	512	497
q16	453	489	453	453
q17	1160	1505	1433	1433
q18	7865	7637	7424	7424
q19	837	862	978	862
q20	1905	1999	1798	1798
q21	4927	4602	4477	4477
q22	1071	1001	972	972
Total cold run time: 52119 ms
Total hot run time: 50193 ms

@doris-robot

TPC-DS: Total hot run time: 185172 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7d687b3f3610e9922b8e034690972b8276b192ef, data reload: false

query1	1003	469	495	469
query2	6565	1849	1816	1816
query3	6767	215	214	214
query4	26056	23265	22962	22962
query5	4353	590	446	446
query6	303	199	201	199
query7	4613	508	279	279
query8	295	258	279	258
query9	8556	2569	2575	2569
query10	497	321	254	254
query11	15812	15096	14755	14755
query12	159	110	102	102
query13	1644	501	410	410
query14	8851	6118	6278	6118
query15	208	191	169	169
query16	7158	636	446	446
query17	1211	737	588	588
query18	1979	409	296	296
query19	196	193	160	160
query20	119	118	119	118
query21	218	132	110	110
query22	4061	4253	4037	4037
query23	33925	33106	32865	32865
query24	8553	2436	2409	2409
query25	585	472	415	415
query26	1237	325	155	155
query27	2697	538	330	330
query28	4289	2076	2063	2063
query29	774	556	435	435
query30	282	216	182	182
query31	889	849	795	795
query32	75	66	60	60
query33	547	359	305	305
query34	790	908	523	523
query35	803	798	728	728
query36	942	1004	870	870
query37	117	99	77	77
query38	4197	4278	4089	4089
query39	1437	1420	1361	1361
query40	210	117	106	106
query41	62	54	55	54
query42	118	106	108	106
query43	506	499	471	471
query44	1338	795	793	793
query45	175	175	170	170
query46	841	1027	626	626
query47	1776	1798	1746	1746
query48	372	406	301	301
query49	790	530	421	421
query50	662	667	408	408
query51	4118	4106	4069	4069
query52	112	112	98	98
query53	224	257	194	194
query54	591	563	508	508
query55	89	82	81	81
query56	299	297	314	297
query57	1140	1160	1100	1100
query58	261	251	249	249
query59	2569	2696	2664	2664
query60	327	332	309	309
query61	159	127	124	124
query62	796	736	651	651
query63	224	189	187	187
query64	4364	1018	671	671
query65	4304	4258	4200	4200
query66	1203	410	310	310
query67	15904	15628	15471	15471
query68	8082	886	511	511
query69	479	301	272	272
query70	1143	1129	1146	1129
query71	450	312	297	297
query72	5537	4698	4633	4633
query73	701	583	340	340
query74	8907	9144	8642	8642
query75	3916	3222	2710	2710
query76	3761	1191	756	756
query77	797	369	289	289
query78	9831	10000	9414	9414
query79	3750	822	539	539
query80	626	505	457	457
query81	461	253	211	211
query82	558	121	100	100
query83	286	315	229	229
query84	302	100	83	83
query85	774	346	316	316
query86	338	313	265	265
query87	4446	4380	4318	4318
query88	2782	2200	2197	2197
query89	443	305	278	278
query90	1930	207	206	206
query91	144	139	112	112
query92	75	60	61	60
query93	2179	991	571	571
query94	676	402	283	283
query95	368	295	278	278
query96	486	569	277	277
query97	3147	3172	3093	3093
query98	229	219	205	205
query99	1445	1405	1279	1279
Total cold run time: 275346 ms
Total hot run time: 185172 ms

@doris-robot

ClickBench: Total hot run time: 29.11 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7d687b3f3610e9922b8e034690972b8276b192ef, data reload: false

query1	0.04	0.04	0.03
query2	0.13	0.10	0.11
query3	0.26	0.20	0.20
query4	1.58	0.19	0.10
query5	0.56	0.55	0.56
query6	1.19	0.71	0.72
query7	0.03	0.02	0.02
query8	0.04	0.03	0.04
query9	0.58	0.52	0.50
query10	0.58	0.58	0.56
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.61	0.59	0.59
query14	0.78	0.80	0.81
query15	0.87	0.85	0.87
query16	0.36	0.38	0.38
query17	1.05	1.04	1.03
query18	0.21	0.20	0.20
query19	1.87	1.79	1.86
query20	0.02	0.01	0.01
query21	15.41	0.88	0.53
query22	0.78	1.05	0.79
query23	14.96	1.39	0.61
query24	6.56	2.33	0.81
query25	0.49	0.20	0.07
query26	0.51	0.16	0.13
query27	0.06	0.05	0.04
query28	9.28	0.94	0.44
query29	12.53	3.94	3.26
query30	0.25	0.09	0.07
query31	2.82	0.60	0.38
query32	3.22	0.55	0.47
query33	3.00	3.03	3.05
query34	15.75	5.06	4.46
query35	4.55	4.56	4.54
query36	0.65	0.50	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.18	0.15	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 102.37 s
Total hot run time: 29.11 s

@doris-robot

BE UT Coverage Report

Increment line coverage 53.85% (21/39) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 55.10% (14917/27071)
Line Coverage 44.25% (131225/296545)
Region Coverage 42.91% (66903/155900)
Branch Coverage 37.51% (33763/90002)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 74.36% (29/39) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.37% (21353/26569)
Line Coverage 74.44% (220370/296032)
Region Coverage 72.70% (132562/182341)
Branch Coverage 66.05% (67748/102578)

@Mryange
Contributor Author

Mryange commented May 6, 2025

run p0

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 74.36% (29/39) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.38% (21357/26569)
Line Coverage 74.46% (220430/296032)
Region Coverage 72.76% (132670/182341)
Branch Coverage 66.11% (67811/102578)

@github-actions github-actions bot added the approved label May 7, 2025
@github-actions
Contributor

github-actions bot commented May 7, 2025

PR approved by at least one committer and no changes requested.

@BiteTheDDDDt BiteTheDDDDt merged commit db4810b into apache:master May 7, 2025
25 of 27 checks passed
github-actions bot pushed a commit that referenced this pull request May 7, 2025
…ght, append_trailing_char_if_absent (#49127)

Mryange added a commit to Mryange/doris that referenced this pull request May 7, 2025
…ght, append_trailing_char_if_absent (apache#49127)

### What problem does this PR solve?
 
The url_encode function previously performed a modulus operation on a
signed number. Converting it to an unsigned number will fix the issue.
```
before
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %5.%23%0-%5.%10%/(   |
+----------------------+
now
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %E7%BC%96%E7%A0%81   |
+----------------------+
```
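
To illustrate why signedness matters here, the following is a minimal Python sketch, not the Doris implementation. It assumes the byte is split into two hex digits with a shift and a modulus, and that RFC 3986 unreserved characters pass through unescaped. The names `as_signed_byte`, `c_style_mod`, and `percent_encode` are hypothetical helpers introduced only for this example.

```python
import math

HEX_DIGITS = "0123456789ABCDEF"
# Assumption: RFC 3986 "unreserved" characters pass through unescaped.
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~"
)


def as_signed_byte(b: int) -> int:
    """Reinterpret a byte value 0..255 as a signed 8-bit integer."""
    return b - 256 if b >= 0x80 else b


def c_style_mod(a: int, n: int) -> int:
    """Remainder as C computes it: the result takes the sign of the dividend."""
    return int(math.fmod(a, n))


def percent_encode(text: str) -> str:
    """Percent-encode the UTF-8 bytes of text, treating each byte as unsigned."""
    out = []
    for b in text.encode("utf-8"):  # iterating bytes yields ints in 0..255
        if b in UNRESERVED:
            out.append(chr(b))
        else:
            out.append("%" + HEX_DIGITS[b >> 4] + HEX_DIGITS[b % 16])
    return "".join(out)


first = "编码".encode("utf-8")[0]              # 0xE7, the lead byte of '编'
print(c_style_mod(as_signed_byte(first), 16))  # signed: a negative remainder
print(first % 16)                              # unsigned: a valid digit index
print(percent_encode("编码"))
```

Under these assumptions, every byte at or above 0x80 becomes negative when read as signed, so its remainder cannot index a hex-digit table. Reading the same byte as unsigned keeps both digits in the range 0 to 15.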

The strright function did not measure the string length in UTF-8
characters, so multi-byte input returned an empty result.
```
before
mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
|                            |
+----------------------------+
now
mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
| 你好世界                   |
+----------------------------+
```
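
As a rough sketch of character-based counting (an illustration in Python, not the code in this PR), the helper below walks backwards over UTF-8 bytes and skips continuation bytes so that each step covers one whole character. The name `strright_utf8` is hypothetical, the input is assumed to be valid UTF-8, and only positive lengths are handled. Returning an empty result for other lengths is a simplification made here.

```python
def strright_utf8(data: bytes, n: int) -> bytes:
    """Return the rightmost n UTF-8 characters of data (positive n only)."""
    if n <= 0:
        return b""  # simplification for this sketch
    pos = len(data)
    remaining = n
    while pos > 0 and remaining > 0:
        pos -= 1
        # Continuation bytes look like 0b10xxxxxx; step back to the lead byte.
        while pos > 0 and (data[pos] & 0xC0) == 0x80:
            pos -= 1
        remaining -= 1
    return data[pos:]


raw = "你好世界".encode("utf-8")
print(len(raw))                        # byte length, larger than the 4 characters
print(strright_utf8(raw, 5).decode())  # asks for more characters than exist
print(strright_utf8(raw, 2).decode())
```

The string has four characters but twelve bytes, so a request for five characters only returns the whole string when the length is counted in characters.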

The append_trailing_char_if_absent function did not consider the case
where the trailing character is a multi-byte UTF-8 character.
```
before
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| NULL                                            |
+-------------------------------------------------+
now
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| 中文                                            |
+-------------------------------------------------+
```
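
A minimal Python sketch of the intended behaviour follows, assuming that the function rejects a trailing argument that is not exactly one character and that the earlier NULL came from measuring that argument in bytes. Both points are assumptions made for illustration, and the Python function is a stand-in rather than the Doris code.

```python
from typing import Optional


def append_trailing_char_if_absent(s: str, ch: str) -> Optional[str]:
    """Append ch to s unless s already ends with it.

    Returns None (standing in for SQL NULL) when ch is not exactly one
    character. len() on a Python str counts code points, not bytes.
    """
    if len(ch) != 1:
        return None
    return s if s.endswith(ch) else s + ch


print(len("文".encode("utf-8")))                    # more than one byte
print(len("文"))                                    # but a single character
print(append_trailing_char_if_absent("中文", "文"))
print(append_trailing_char_if_absent("中文", "!"))
```

A byte-based length check would treat '文' as longer than one character, while a character-based check accepts it.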

Mryange added a commit to Mryange/doris that referenced this pull request May 7, 2025
…ght, append_trailing_char_if_absent (apache#49127)

yiguolei pushed a commit that referenced this pull request May 7, 2025
…ncode, strright, append_trailing_char_if_absent #49127 (#50660)

dataroaring pushed a commit that referenced this pull request May 8, 2025
@yiguolei yiguolei mentioned this pull request May 13, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…ght, append_trailing_char_if_absent (apache#49127)

@gavinchou gavinchou mentioned this pull request Jun 11, 2025

Labels

approved Indicates a PR has been approved by one committer. dev/2.1.10-merged dev/3.0.6-merged p0_w reviewed
