@BePPPower BePPPower commented Apr 27, 2025

picked from: #49036

Problem Summary:

The output formats of complex data types such as array, map and struct
differ between Hive and Doris.
When users migrate from Hive to Doris, they expect the same format so
that they don't need to modify their business code.

This PR mainly changes:

Add a new option to the session variable `serde_dialect`: if it is set to
`hive`, the output format of some data types returned to the MySQL client
changes:

Array
Doris: ["abc", "def", "", null, 1]
Hive: ["abc","def","",null,true]

Map
Doris: {"k1":null, "k2":"v3"}
Hive: {"k1":null,"k2":"v3"}

Struct
Doris: {"s_id":100, "s_name":"abc , "", "s_address":null}
Hive: {"s_id":100,"s_name":"abc ,"","s_address":null}
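For readers who want to see the difference concretely, here is a minimal Python sketch that reproduces the array and map examples above. It is an illustration only, not the Doris implementation: the function name `render` and its `dialect` parameter are hypothetical, and the rules it encodes (a `, ` separator versus `,`, and booleans printed as `1`/`0` versus `true`/`false`) are inferred from the examples in this description. Strings are quoted without escaping, which is an assumption.

```python
def render(value, dialect="doris"):
    """Render a nested value in the style of the examples above.

    Hypothetical helper for illustration; `dialect` is "doris" or "hive".
    """
    # Assumption from the examples: Doris separates items with ", ", Hive with ",".
    sep = "," if dialect == "hive" else ", "
    if value is None:
        return "null"
    # bool must be checked before any numeric handling, since bool is an int in Python.
    if isinstance(value, bool):
        if dialect == "hive":
            return "true" if value else "false"
        return "1" if value else "0"
    if isinstance(value, str):
        # Assumption: strings are wrapped in double quotes with no escaping.
        return '"' + value + '"'
    if isinstance(value, (list, tuple)):
        return "[" + sep.join(render(v, dialect) for v in value) + "]"
    if isinstance(value, dict):
        items = (render(k, dialect) + ":" + render(v, dialect) for k, v in value.items())
        return "{" + sep.join(items) + "}"
    return str(value)


array_value = ["abc", "def", "", None, True]
map_value = {"k1": None, "k2": "v3"}

print(render(array_value, "doris"))  # ["abc", "def", "", null, 1]
print(render(array_value, "hive"))   # ["abc","def","",null,true]
print(render(map_value, "doris"))    # {"k1":null, "k2":"v3"}
print(render(map_value, "hive"))     # {"k1":null,"k2":"v3"}
```

The key/value separator `:` is the same in both styles; only the item separator and the boolean rendering differ in this sketch.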

Related apache#37039
@BePPPower BePPPower requested a review from dataroaring as a code owner April 27, 2025 09:28

Thearas commented Apr 27, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman

run buildall

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 28, 2025
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.

@doris-robot

TeamCity cloud ut coverage result:
Function Coverage: 83.05% (1088/1310)
Line Coverage: 66.01% (18011/27285)
Region Coverage: 65.49% (8863/13534)
Branch Coverage: 55.43% (4787/8636)
Coverage Report: http://coverage.selectdb-in.cc/coverage/86459dc6c5801b790776bdea34e5add4476673ed_86459dc6c5801b790776bdea34e5add4476673ed_cloud/report/index.html

@doris-robot

TPC-H: Total hot run time: 40445 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 86459dc6c5801b790776bdea34e5add4476673ed, data reload: false

------ Round 1 ----------------------------------
q1	17597	7053	6721	6721
q2	2072	183	189	183
q3	10614	1113	1146	1113
q4	10503	734	836	734
q5	7763	2904	2913	2904
q6	224	138	132	132
q7	997	603	623	603
q8	9370	1987	2057	1987
q9	6651	6435	6454	6435
q10	6999	2323	2302	2302
q11	457	270	265	265
q12	402	218	208	208
q13	17783	2997	2997	2997
q14	261	211	217	211
q15	519	454	464	454
q16	645	581	589	581
q17	986	582	604	582
q18	7555	6593	6637	6593
q19	1396	1150	1054	1054
q20	492	208	208	208
q21	3925	3177	3331	3177
q22	1082	1001	1016	1001
Total cold run time: 108293 ms
Total hot run time: 40445 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6610	6640	6625	6625
q2	326	229	229	229
q3	2908	2825	2955	2825
q4	2029	1847	1843	1843
q5	5791	5758	5794	5758
q6	215	131	129	129
q7	2224	1839	1801	1801
q8	3410	3577	3523	3523
q9	8938	8854	8922	8854
q10	3548	3526	3518	3518
q11	587	493	521	493
q12	821	631	599	599
q13	10037	3156	3178	3156
q14	304	270	273	270
q15	509	461	460	460
q16	681	660	657	657
q17	1855	1637	1617	1617
q18	8315	7831	7806	7806
q19	1717	1552	1594	1552
q20	2057	1854	1860	1854
q21	5568	5370	5325	5325
q22	1144	1062	1045	1045
Total cold run time: 69594 ms
Total hot run time: 59939 ms

@doris-robot

BE UT Coverage Report

Increment line coverage 22.86% (8/35) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 40.21% (10558/26255)
Line Coverage 30.98% (89366/288423)
Region Coverage 30.12% (46093/153053)
Branch Coverage 26.63% (23566/88510)

@doris-robot

TPC-DS: Total hot run time: 197864 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 86459dc6c5801b790776bdea34e5add4476673ed, data reload: false

query1	1306	917	917	917
query2	6215	2059	2022	2022
query3	10845	4468	4362	4362
query4	60903	29271	23376	23376
query5	5219	465	448	448
query6	395	186	178	178
query7	5476	315	317	315
query8	311	224	223	223
query9	8403	2640	2631	2631
query10	469	280	258	258
query11	17428	15344	15462	15344
query12	159	102	110	102
query13	1411	459	427	427
query14	10461	7110	7117	7110
query15	209	177	175	175
query16	7184	453	508	453
query17	1186	596	585	585
query18	1865	323	321	321
query19	217	163	153	153
query20	113	110	107	107
query21	213	114	105	105
query22	4780	4561	4703	4561
query23	35229	34052	34377	34052
query24	6308	2954	2952	2952
query25	557	425	438	425
query26	656	179	166	166
query27	1870	369	353	353
query28	4223	2480	2450	2450
query29	671	441	420	420
query30	244	165	156	156
query31	996	836	828	828
query32	67	54	57	54
query33	449	275	280	275
query34	908	506	509	506
query35	883	756	768	756
query36	1096	969	961	961
query37	113	68	74	68
query38	4101	3952	3966	3952
query39	1509	1493	1448	1448
query40	196	114	104	104
query41	50	49	47	47
query42	118	99	96	96
query43	535	500	499	499
query44	1200	830	828	828
query45	186	173	172	172
query46	1166	723	721	721
query47	2039	1904	1938	1904
query48	482	374	390	374
query49	735	383	392	383
query50	852	430	420	420
query51	7267	7286	7297	7286
query52	106	90	90	90
query53	268	188	188	188
query54	571	461	470	461
query55	78	79	81	79
query56	258	240	254	240
query57	1240	1174	1140	1140
query58	217	228	215	215
query59	3235	3084	3043	3043
query60	274	243	266	243
query61	122	109	107	107
query62	774	699	706	699
query63	231	193	192	192
query64	1399	673	635	635
query65	3268	3218	3210	3210
query66	716	303	302	302
query67	16115	15532	15583	15532
query68	4356	583	575	575
query69	441	269	268	268
query70	1180	1048	1117	1048
query71	339	266	257	257
query72	6094	4085	4011	4011
query73	755	345	360	345
query74	10469	9151	9139	9139
query75	3353	2675	2619	2619
query76	2044	1102	1105	1102
query77	475	276	266	266
query78	10700	9611	9673	9611
query79	1565	593	586	586
query80	871	429	431	429
query81	514	232	238	232
query82	1293	87	90	87
query83	241	150	148	148
query84	286	81	75	75
query85	893	294	300	294
query86	321	287	285	285
query87	4468	4315	4248	4248
query88	3837	2375	2354	2354
query89	422	297	303	297
query90	2027	181	179	179
query91	178	145	145	145
query92	66	49	49	49
query93	2041	564	569	564
query94	756	296	290	290
query95	353	279	259	259
query96	609	282	278	278
query97	3298	3163	3192	3163
query98	208	210	190	190
query99	1564	1339	1266	1266
Total cold run time: 314363 ms
Total hot run time: 197864 ms

@doris-robot

ClickBench: Total hot run time: 32.91 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 86459dc6c5801b790776bdea34e5add4476673ed, data reload: false

query1	0.03	0.03	0.03
query2	0.09	0.04	0.04
query3	0.23	0.06	0.05
query4	1.66	0.07	0.08
query5	0.51	0.53	0.52
query6	1.14	0.75	0.75
query7	0.02	0.01	0.02
query8	0.05	0.05	0.05
query9	0.54	0.51	0.50
query10	0.56	0.56	0.55
query11	0.16	0.12	0.12
query12	0.16	0.13	0.13
query13	0.61	0.59	0.60
query14	2.75	2.84	2.89
query15	0.91	0.84	0.83
query16	0.37	0.37	0.37
query17	1.00	1.06	1.06
query18	0.18	0.18	0.18
query19	1.95	1.84	2.00
query20	0.02	0.01	0.02
query21	15.36	0.65	0.66
query22	3.80	7.80	1.73
query23	18.26	1.42	1.38
query24	2.18	0.23	0.24
query25	0.15	0.09	0.08
query26	0.28	0.17	0.18
query27	0.08	0.08	0.08
query28	13.22	0.59	0.57
query29	12.63	3.41	3.38
query30	0.24	0.06	0.06
query31	2.86	0.40	0.40
query32	3.24	0.48	0.48
query33	3.01	3.04	2.99
query34	16.67	4.52	4.51
query35	4.62	4.58	4.54
query36	0.66	0.50	0.48
query37	0.20	0.16	0.17
query38	0.16	0.16	0.15
query39	0.06	0.04	0.05
query40	0.16	0.14	0.14
query41	0.10	0.06	0.06
query42	0.06	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 110.99 s
Total hot run time: 32.91 s


@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 8b14577 into apache:branch-3.0 May 6, 2025
21 of 25 checks passed
Labels

approved Indicates a PR has been approved by one committer. reviewed

5 participants