
@zhiqiang-hhhh
Contributor

cherry pick from #38608

…8608)

* Target

Fix the unstable result of the hist function when NULL values are involved.

* Reproduce

The test result of
`regression-test/suites/query_p0/sql_functions/aggregate_functions/test_aggregate_all_functions2.groovy`
is unstable: the SQL `SELECT histogram(k7, 5) FROM baseall` sometimes
behaves as if the second argument were not passed in.

* Root reason

AggregateFunctionNullVariadicInline short-circuits on NULL: when a row is
NULL, the value is not added by the nested function. The histogram
implementation relies on its add method to get its second argument, so
when a block contains only NULL values, histogram never receives the
second argument even if the SQL is `select histogram(k7, 5)`, and a
max_bucket_num with the default value 128 is serialized. During merging,
if that block happens to be deserialized last, max_bucket_num in the merge
stage is set to 128, which leads to the wrong result.
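
To make the order dependence concrete, here is a minimal sketch of the behaviour described above. It is not the Doris code: `HistStateBefore`, its `values` field, and `final_bucket_num` are hypothetical names, and the unconditional assignment in `merge` is an assumption based only on the description in this PR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the histogram aggregate state
// as it behaved before the fix (names are illustrative, not from Doris).
struct HistStateBefore {
    std::size_t max_bucket_num = 128;  // default kept when add() never runs
    std::vector<int> values;

    // add() is the only place where the second SQL argument is captured.
    // For an all-NULL block the null wrapper never calls it.
    void add(int value, std::size_t bucket_arg) {
        max_bucket_num = bucket_arg;
        values.push_back(value);
    }

    // Assumed pre-fix merge: always adopt the incoming bucket count.
    void merge(const HistStateBefore& rhs) {
        max_bucket_num = rhs.max_bucket_num;
        values.insert(values.end(), rhs.values.begin(), rhs.values.end());
    }
};

// Merge partial states in the given order and return the bucket count
// that the final state ends up with.
std::size_t final_bucket_num(const std::vector<HistStateBefore>& partials) {
    HistStateBefore result;
    for (const auto& partial : partials) {
        result.merge(partial);
    }
    return result.max_bucket_num;
}
```

With one partial state that saw `add(1, 5)` and one that came from an all-NULL block, the final bucket count depends on which of the two is merged last, which matches the instability seen in the regression test.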

* Fix by

The initial value of max_bucket_num is now 0. During merging, aggregated
data whose max_bucket_num is 0 is discarded.
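
The idea of the fix can be sketched as follows. This is an illustration under assumptions, not the actual patch: `HistStateFixed`, its `values` field, and `merged_bucket_num` are hypothetical names; only the sentinel value 0 and the "discard on merge" rule come from the description above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified histogram state after the fix.
struct HistStateFixed {
    std::size_t max_bucket_num = 0;  // 0 marks a state that never saw add()
    std::vector<int> values;

    // Captures the second SQL argument together with the value.
    void add(int value, std::size_t bucket_arg) {
        max_bucket_num = bucket_arg;
        values.push_back(value);
    }

    // Skip partial states that never received the second argument,
    // so they cannot overwrite a bucket count that was really passed in.
    void merge(const HistStateFixed& rhs) {
        if (rhs.max_bucket_num == 0) {
            return;  // state from an all-NULL block: nothing to merge
        }
        max_bucket_num = rhs.max_bucket_num;
        values.insert(values.end(), rhs.values.begin(), rhs.values.end());
    }
};

// Merge partial states in the given order and return the final bucket count.
std::size_t merged_bucket_num(const std::vector<HistStateFixed>& partials) {
    HistStateFixed result;
    for (const auto& partial : partials) {
        result.merge(partial);
    }
    return result.max_bucket_num;
}
```

In this sketch the merged bucket count no longer depends on the merge order. A state built only from all-NULL blocks keeps max_bucket_num at 0, so the caller would still need to decide how to finalize that case.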
@zhiqiang-hhhh
Contributor Author

run buildall

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@github-actions
Contributor

github-actions bot commented Aug 5, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.73% (8100/21466)
Line Coverage: 29.37% (66332/225858)
Region Coverage: 28.88% (34233/118537)
Branch Coverage: 24.77% (17591/71030)
Coverage Report: http://coverage.selectdb-in.cc/coverage/06834ec9bbdf616ca11834b68a8171f55440842f_06834ec9bbdf616ca11834b68a8171f55440842f/report/index.html

@doris-robot

TPC-H: Total hot run time: 50176 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 06834ec9bbdf616ca11834b68a8171f55440842f, data reload: false

------ Round 1 ----------------------------------
q1	17663	4418	4409	4409
q2	2085	154	153	153
q3	10401	1917	1930	1917
q4	10339	1250	1311	1250
q5	8721	3896	3968	3896
q6	234	123	148	123
q7	2062	1604	1605	1604
q8	9316	2740	2734	2734
q9	10644	10418	10414	10414
q10	8684	3538	3528	3528
q11	419	252	247	247
q12	471	300	306	300
q13	18342	3976	4017	3976
q14	360	325	328	325
q15	502	457	457	457
q16	681	559	586	559
q17	1155	955	972	955
q18	7293	6813	6947	6813
q19	1791	1660	1633	1633
q20	552	319	311	311
q21	4416	4132	4124	4124
q22	542	448	450	448
Total cold run time: 116673 ms
Total hot run time: 50176 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4322	4296	4304	4296
q2	316	222	222	222
q3	4170	4171	4135	4135
q4	2749	2745	2767	2745
q5	7210	7103	7044	7044
q6	240	119	121	119
q7	3199	2833	2830	2830
q8	4353	4508	4490	4490
q9	16855	16896	16893	16893
q10	4228	4233	4295	4233
q11	761	698	691	691
q12	1036	871	862	862
q13	7293	3692	3751	3692
q14	446	459	430	430
q15	501	460	459	459
q16	740	674	688	674
q17	3834	3861	3817	3817
q18	8799	8665	8789	8665
q19	1708	1712	1683	1683
q20	2376	2148	2121	2121
q21	8499	8476	8449	8449
q22	1066	1033	1002	1002
Total cold run time: 84701 ms
Total hot run time: 79552 ms

@doris-robot

TPC-DS: Total hot run time: 204178 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 06834ec9bbdf616ca11834b68a8171f55440842f, data reload: false

query1	940	393	420	393
query2	6551	2887	2468	2468
query3	6922	213	209	209
query4	21584	18323	18137	18137
query5	19742	6547	6542	6542
query6	287	218	219	218
query7	4161	308	321	308
query8	432	443	411	411
query9	3116	2649	2593	2593
query10	418	316	297	297
query11	11312	10824	10890	10824
query12	128	77	75	75
query13	5601	685	707	685
query14	17615	13307	13715	13307
query15	369	252	251	251
query16	6478	294	273	273
query17	1726	1450	875	875
query18	2327	424	415	415
query19	215	150	147	147
query20	73	76	77	76
query21	196	99	92	92
query22	5300	5086	5084	5084
query23	32556	32164	32016	32016
query24	7079	6526	6554	6526
query25	529	434	428	428
query26	530	161	163	161
query27	1889	298	295	295
query28	6147	2355	2319	2319
query29	2970	2824	2744	2744
query30	241	172	163	163
query31	899	738	724	724
query32	67	60	62	60
query33	403	252	262	252
query34	861	481	485	481
query35	1138	905	978	905
query36	1351	1000	1290	1000
query37	92	58	61	58
query38	3116	2940	2946	2940
query39	1391	1317	1333	1317
query40	210	96	95	95
query41	45	43	44	43
query42	80	84	82	82
query43	786	741	773	741
query44	1132	724	727	724
query45	250	238	235	235
query46	1255	954	1004	954
query47	1784	1890	1976	1890
query48	1014	699	721	699
query49	631	373	383	373
query50	870	641	615	615
query51	4769	4685	4666	4666
query52	98	95	84	84
query53	458	340	327	327
query54	2703	2499	2463	2463
query55	93	78	81	78
query56	230	228	217	217
query57	1226	1146	1091	1091
query58	222	210	214	210
query59	4158	3988	3937	3937
query60	222	205	217	205
query61	96	93	94	93
query62	809	507	458	458
query63	497	343	346	343
query64	2573	1538	1500	1500
query65	3636	3538	3574	3538
query66	816	386	384	384
query67	16126	15671	15080	15080
query68	9860	631	625	625
query69	592	372	373	372
query70	1933	1503	1505	1503
query71	418	312	328	312
query72	6496	3551	3509	3509
query73	745	332	322	322
query74	6306	5814	5840	5814
query75	5320	3679	3667	3667
query76	6230	1156	1225	1156
query77	994	265	256	256
query78	12630	11654	11857	11654
query79	7094	630	635	630
query80	1091	396	404	396
query81	489	234	249	234
query82	1688	97	96	96
query83	178	140	136	136
query84	259	71	70	70
query85	883	336	337	336
query86	327	298	301	298
query87	3246	3058	3059	3058
query88	4678	2307	2324	2307
query89	392	302	300	300
query90	1963	215	207	207
query91	173	141	141	141
query92	57	59	60	59
query93	5444	603	576	576
query94	713	212	211	211
query95	1123	1088	1077	1077
query96	651	332	324	324
query97	6514	6303	6421	6303
query98	189	183	170	170
query99	2891	912	865	865
Total cold run time: 315108 ms
Total hot run time: 204178 ms

@doris-robot

ClickBench: Total hot run time: 30.82 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 06834ec9bbdf616ca11834b68a8171f55440842f, data reload: false

query1	0.03	0.02	0.02
query2	0.06	0.03	0.03
query3	0.25	0.05	0.06
query4	1.76	0.06	0.07
query5	0.54	0.53	0.53
query6	1.26	0.62	0.62
query7	0.02	0.01	0.01
query8	0.04	0.03	0.02
query9	0.53	0.49	0.49
query10	0.53	0.52	0.51
query11	0.12	0.09	0.09
query12	0.12	0.09	0.09
query13	0.63	0.61	0.62
query14	0.79	0.78	0.78
query15	0.78	0.77	0.76
query16	0.36	0.36	0.37
query17	1.02	1.01	1.02
query18	0.23	0.24	0.24
query19	1.90	1.80	1.85
query20	0.01	0.00	0.00
query21	15.52	0.56	0.55
query22	1.96	2.20	1.56
query23	17.26	0.99	0.93
query24	5.03	1.04	1.13
query25	0.34	0.09	0.07
query26	0.57	0.16	0.17
query27	0.05	0.04	0.04
query28	8.00	0.76	0.72
query29	12.58	2.38	2.34
query30	0.59	0.54	0.53
query31	2.81	0.40	0.37
query32	3.40	0.50	0.49
query33	3.07	3.06	3.09
query34	15.27	4.77	4.78
query35	4.88	4.85	4.85
query36	1.06	1.02	1.01
query37	0.06	0.05	0.05
query38	0.04	0.02	0.02
query39	0.05	0.01	0.02
query40	0.16	0.14	0.14
query41	0.07	0.02	0.01
query42	0.02	0.02	0.01
query43	0.03	0.02	0.01
Total cold run time: 103.8 s
Total hot run time: 30.82 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 06834ec9bbdf616ca11834b68a8171f55440842f with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       22.1 seconds inserted 10000000 Rows, about 452K ops/s

@github-actions
Contributor

github-actions bot commented Feb 2, 2025

We're closing this PR because it hasn't been updated in a while.
This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and feel free to ask a maintainer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Feb 2, 2025
@github-actions github-actions bot closed this Feb 3, 2025
@zhiqiang-hhhh zhiqiang-hhhh deleted the pick_38608_to_upstream_branch-2.0 branch February 3, 2025 09:55