@github-actions
Contributor

Cherry-picked from #48218

…unt. (#48218)

### What problem does this PR solve?
The previous PR (#46534) limited memory use during sample analysis of
a large partitioned table.
This PR makes the maximum number of rows and the maximum number of partitions to sample
configurable. Users can raise these values if the estimated NDV (number of distinct values) is not accurate
enough.
@github-actions github-actions bot requested a review from dataroaring as a code owner March 14, 2025 10:03
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Mar 14, 2025
@hello-stephen
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 39984 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 98b130188cbcd8e18590a2a436fefef4013d48c2, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17681	6722	6658	6658
q2	2080	164	179	164
q3	10597	1076	1219	1076
q4	10574	775	759	759
q5	7757	2831	2752	2752
q6	226	144	136	136
q7	1019	624	625	624
q8	9370	1959	2071	1959
q9	6542	6388	6376	6376
q10	6991	2286	2320	2286
q11	472	266	259	259
q12	399	212	211	211
q13	17792	3002	3020	3002
q14	238	203	216	203
q15	499	452	462	452
q16	669	596	584	584
q17	974	561	573	561
q18	7176	6665	6536	6536
q19	1404	1100	1001	1001
q20	497	210	202	202
q21	4004	3200	3245	3200
q22	1060	990	983	983
Total cold run time: 108021 ms
Total hot run time: 39984 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	6601	6579	6646	6579
q2	326	235	232	232
q3	2915	2733	2876	2733
q4	2066	1828	1804	1804
q5	5741	5758	5712	5712
q6	210	128	130	128
q7	2256	1839	1854	1839
q8	3377	3541	3527	3527
q9	8838	8902	8860	8860
q10	3543	3512	3512	3512
q11	604	490	511	490
q12	831	600	617	600
q13	11059	3219	3185	3185
q14	304	282	274	274
q15	529	459	464	459
q16	723	652	640	640
q17	1870	1631	1625	1625
q18	8239	7754	7733	7733
q19	1666	1661	1577	1577
q20	2038	1887	1856	1856
q21	5449	5374	5370	5370
q22	1133	1069	1009	1009
Total cold run time: 70318 ms
Total hot run time: 59744 ms
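
The totals in the TPC-H report above can be reproduced from the per-query rows. The column meanings are not stated by the bot; the sketch below assumes they are cold run, two hot runs, and the faster of the two hot runs, which is consistent with the reported sums (Round 1 first columns add up to 108021 ms, and the per-query minimum of the two hot runs adds up to 39984 ms). The function name `summarize` and the three-row sample are illustrative only.

```python
# Sketch: derive cold/hot totals from rows shaped like
#   "<query> <cold> <hot1> <hot2> [<best_hot>]"
# The column interpretation is an assumption inferred from the report.

def summarize(report: str) -> tuple[int, int]:
    """Return (total cold ms, total hot ms) for whitespace-separated rows."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        fields = line.split()
        cold, hot1, hot2 = (int(v) for v in fields[1:4])
        cold_total += cold
        # The hot total counts the faster of the two hot runs per query.
        hot_total += min(hot1, hot2)
    return cold_total, hot_total


# Three rows copied from Round 1 of the report, for illustration.
SAMPLE = """
q1 17681 6722 6658 6658
q2 2080 164 179 164
q6 226 144 136 136
"""

cold_ms, hot_ms = summarize(SAMPLE)
print(f"cold={cold_ms} ms, hot={hot_ms} ms")
```

Feeding all 22 rows of a round into `summarize` should give the "Total cold run time" and "Total hot run time" lines printed at the end of that round.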

@doris-robot

TPC-DS: Total hot run time: 197679 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 98b130188cbcd8e18590a2a436fefef4013d48c2, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1286	888	898	888
query2	6261	2040	2077	2040
query3	10828	4367	4358	4358
query4	60659	28570	23343	23343
query5	5229	449	428	428
query6	388	168	196	168
query7	5482	317	309	309
query8	315	224	219	219
query9	8357	2632	2626	2626
query10	473	267	273	267
query11	17839	15390	15582	15390
query12	169	112	103	103
query13	1407	470	462	462
query14	10374	7569	7562	7562
query15	202	183	176	176
query16	7218	472	464	464
query17	1196	600	588	588
query18	1929	317	329	317
query19	207	159	166	159
query20	119	116	110	110
query21	205	100	101	100
query22	4662	4373	4714	4373
query23	34516	33898	34397	33898
query24	6130	2966	2945	2945
query25	548	434	427	427
query26	663	173	170	170
query27	1850	352	358	352
query28	4303	2454	2458	2454
query29	714	475	476	475
query30	237	162	162	162
query31	989	801	832	801
query32	69	56	57	56
query33	419	295	300	295
query34	917	509	523	509
query35	867	726	724	724
query36	1096	983	981	981
query37	120	69	68	68
query38	4089	4020	4020	4020
query39	1532	1503	1484	1484
query40	204	99	99	99
query41	51	48	49	48
query42	114	102	104	102
query43	543	492	497	492
query44	1192	835	839	835
query45	191	170	175	170
query46	1184	719	735	719
query47	2017	1874	1934	1874
query48	492	392	391	391
query49	731	400	384	384
query50	855	434	438	434
query51	7377	7271	7288	7271
query52	112	97	95	95
query53	260	188	189	188
query54	579	468	459	459
query55	82	79	79	79
query56	272	246	256	246
query57	1275	1165	1145	1145
query58	223	234	210	210
query59	3226	3012	2756	2756
query60	272	262	256	256
query61	106	107	109	107
query62	763	671	663	663
query63	213	189	186	186
query64	1400	683	654	654
query65	3250	3185	3186	3185
query66	649	307	299	299
query67	16011	15574	15619	15574
query68	4341	591	583	583
query69	428	281	282	281
query70	1215	1138	1104	1104
query71	313	263	260	260
query72	6392	4078	4029	4029
query73	761	348	358	348
query74	10185	9245	8904	8904
query75	3384	2666	2629	2629
query76	2042	1165	1078	1078
query77	487	264	285	264
query78	10596	9574	9531	9531
query79	2021	606	605	605
query80	1307	429	428	428
query81	556	244	232	232
query82	1276	90	87	87
query83	162	139	139	139
query84	279	72	74	72
query85	1004	305	297	297
query86	357	279	308	279
query87	4449	4292	4516	4292
query88	3780	2391	2372	2372
query89	421	296	293	293
query90	1921	190	191	190
query91	179	146	150	146
query92	59	47	54	47
query93	2692	560	573	560
query94	758	301	301	301
query95	363	257	263	257
query96	647	292	286	286
query97	3293	3185	3106	3106
query98	214	215	199	199
query99	1648	1288	1303	1288
Total cold run time: 315038 ms
Total hot run time: 197679 ms

@doris-robot

ClickBench: Total hot run time: 32.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 98b130188cbcd8e18590a2a436fefef4013d48c2, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.03	0.02
query2	0.07	0.03	0.03
query3	0.23	0.06	0.06
query4	1.62	0.11	0.10
query5	0.53	0.54	0.48
query6	1.13	0.73	0.73
query7	0.02	0.02	0.02
query8	0.04	0.05	0.03
query9	0.55	0.50	0.51
query10	0.55	0.55	0.56
query11	0.15	0.12	0.11
query12	0.14	0.12	0.11
query13	0.62	0.62	0.59
query14	2.73	2.73	2.75
query15	0.89	0.84	0.82
query16	0.37	0.38	0.37
query17	1.07	1.01	1.05
query18	0.23	0.21	0.23
query19	1.96	1.86	1.98
query20	0.02	0.01	0.01
query21	15.36	0.60	0.56
query22	2.87	2.80	1.54
query23	17.13	0.96	0.79
query24	3.10	1.50	1.87
query25	0.26	0.11	0.04
query26	0.56	0.15	0.13
query27	0.04	0.05	0.04
query28	9.18	0.57	0.46
query29	12.58	3.24	3.17
query30	0.25	0.06	0.06
query31	2.84	0.37	0.37
query32	3.29	0.46	0.46
query33	2.99	3.00	3.05
query34	16.89	4.49	4.47
query35	4.54	4.51	4.52
query36	0.67	0.49	0.47
query37	0.10	0.07	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.12	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.02	0.03
Total cold run time: 105.99 s
Total hot run time: 32.2 s

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit bc1f6b3 into branch-3.0 Mar 14, 2025
22 of 24 checks passed
@github-actions github-actions bot deleted the auto-pick-48218-branch-3.0 branch March 14, 2025 12:29