
@github-actions
Contributor

Cherry-picked from #42399

…en analyzed for a long time. (#42399)

Support auto analyze for columns that haven't been analyzed for a long time.
Add a very low priority job queue for auto analyze to process such
columns.
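
A minimal sketch of how a very low priority queue could sit behind the regular auto analyze queues. All names here (`Priority`, `AnalyzeScheduler`, `submit`, `next_job`) and the number of priority levels are hypothetical and are not taken from the Doris code base. The sketch only illustrates that jobs for long-unanalyzed columns are picked up when no higher-priority job is waiting.

```python
from collections import deque
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical priority levels; a lower value is served first.
    HIGH = 0
    MID = 1
    LOW = 2
    VERY_LOW = 3  # columns that have not been analyzed for a long time


class AnalyzeScheduler:
    """Toy scheduler with one FIFO queue per priority level."""

    def __init__(self):
        self._queues = {p: deque() for p in Priority}

    def submit(self, priority, table, column):
        """Enqueue an analyze job for (table, column) at the given priority."""
        self._queues[priority].append((table, column))

    def next_job(self):
        """Return (priority, (table, column)) for the next job, or None."""
        # Scan the queues from highest to lowest priority.
        for p in Priority:
            if self._queues[p]:
                return p, self._queues[p].popleft()
        return None
```

In this toy model, a job submitted with `Priority.VERY_LOW` is returned by `next_job()` only after every other queue has been drained, so stale columns never delay the regular auto analyze work.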

The purpose of this change is to make sure every table can be auto
analyzed within a certain period of time. In earlier Doris versions,
users often ran into the following issue:
A user loads some new data into a large table every day, but the change
rate (the percentage of new data) is very low, because the table already
holds a large amount of old data. In this case, auto analyze is not
triggered for this table for a very long time, because the default
trigger threshold of auto analyze is 40% (more than 40% of the data in
the table has changed since the last analyze). This is likely to cause a
bad plan, because the min/max/ndv statistics are outdated.
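
To make the problem concrete, here is a small worked sketch of how long a table can wait under a 40% threshold. It assumes the change rate is measured as rows loaded since the last analyze divided by the row count at the last analyze. The function name `days_until_auto_analyze` and the sample numbers are illustrative only and do not reflect how Doris computes this internally.

```python
def days_until_auto_analyze(rows_at_last_analyze, rows_loaded_per_day,
                            threshold_percent=40):
    """Smallest number of daily loads after which the changed fraction
    is strictly greater than threshold_percent.

    Integer arithmetic is used to avoid floating-point rounding at the
    boundary: we need k * loaded * 100 > threshold_percent * rows.
    """
    if rows_loaded_per_day <= 0:
        raise ValueError("rows_loaded_per_day must be positive")
    limit = threshold_percent * rows_at_last_analyze
    return limit // (100 * rows_loaded_per_day) + 1


# Hypothetical table: 1 billion rows, 1 million new rows per day (0.1%).
print(days_until_auto_analyze(1_000_000_000, 1_000_000))  # 401
```

Under these assumed numbers the table would go more than a year without being auto analyzed, which is the kind of gap the very low priority queue is meant to close.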
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Dec 11, 2024
@doris-robot

run buildall

@doris-robot

TPC-H: Total hot run time: 40218 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit a2602bbc1672400c5e3003812c2cb8558db06b9a, data reload: false

------ Round 1 ----------------------------------
q1	17567	7341	7268	7268
q2	2073	164	197	164
q3	10677	1051	1135	1051
q4	10545	720	762	720
q5	7721	2768	2757	2757
q6	235	144	141	141
q7	951	604	598	598
q8	9583	1916	1961	1916
q9	8040	6397	6387	6387
q10	6983	2273	2327	2273
q11	467	264	261	261
q12	401	209	214	209
q13	17795	2940	2987	2940
q14	231	209	204	204
q15	559	510	511	510
q16	670	609	584	584
q17	963	602	594	594
q18	7098	6499	6472	6472
q19	1373	1039	993	993
q20	445	192	190	190
q21	3912	3044	3093	3044
q22	1060	942	954	942
Total cold run time: 109349 ms
Total hot run time: 40218 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7278	7190	7185	7185
q2	323	226	225	225
q3	2849	2846	2839	2839
q4	1975	1816	1797	1797
q5	5641	5693	5681	5681
q6	217	141	143	141
q7	2165	1759	1768	1759
q8	3284	3552	3474	3474
q9	8745	8769	8849	8769
q10	3517	3511	3492	3492
q11	587	488	489	488
q12	804	637	608	608
q13	16467	3146	3077	3077
q14	302	269	259	259
q15	569	501	528	501
q16	716	664	661	661
q17	1883	1607	1621	1607
q18	8089	7840	7591	7591
q19	2478	1557	1590	1557
q20	2045	1876	1852	1852
q21	5347	5296	5323	5296
q22	1120	1038	979	979
Total cold run time: 76401 ms
Total hot run time: 59838 ms

@doris-robot

TPC-DS: Total hot run time: 195842 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a2602bbc1672400c5e3003812c2cb8558db06b9a, data reload: false

query1	1230	923	939	923
query2	6243	2097	2069	2069
query3	10823	4012	4002	4002
query4	65860	29496	23522	23522
query5	5515	449	423	423
query6	452	160	166	160
query7	6010	321	306	306
query8	317	227	227	227
query9	9264	2676	2646	2646
query10	491	263	258	258
query11	18077	15246	15842	15246
query12	170	100	102	100
query13	1562	422	420	420
query14	11271	6475	7370	6475
query15	209	174	176	174
query16	7231	483	484	483
query17	1313	581	604	581
query18	1832	324	313	313
query19	207	166	156	156
query20	118	116	110	110
query21	63	46	69	46
query22	4768	4633	4696	4633
query23	34537	34058	34066	34058
query24	6071	2866	2872	2866
query25	521	392	381	381
query26	686	171	163	163
query27	1980	298	299	298
query28	4375	2558	2489	2489
query29	690	437	421	421
query30	245	159	164	159
query31	1020	793	834	793
query32	64	52	51	51
query33	440	293	274	274
query34	953	505	517	505
query35	848	724	735	724
query36	1065	941	972	941
query37	117	74	71	71
query38	4132	3959	3959	3959
query39	1508	1491	1467	1467
query40	146	87	85	85
query41	47	46	48	46
query42	115	104	99	99
query43	545	508	489	489
query44	1170	792	803	792
query45	188	184	172	172
query46	1161	745	775	745
query47	2006	1885	1925	1885
query48	453	374	372	372
query49	747	393	386	386
query50	829	428	415	415
query51	7399	7319	6996	6996
query52	99	94	91	91
query53	254	182	189	182
query54	572	458	449	449
query55	77	73	73	73
query56	256	251	264	251
query57	1181	1095	1099	1095
query58	212	213	207	207
query59	3183	3227	2971	2971
query60	294	256	256	256
query61	134	133	132	132
query62	793	649	665	649
query63	213	193	195	193
query64	1759	710	628	628
query65	3242	3162	3158	3158
query66	713	292	301	292
query67	15708	15427	15535	15427
query68	4464	563	551	551
query69	440	251	254	251
query70	1167	1120	1113	1113
query71	402	264	249	249
query72	6593	4174	4256	4174
query73	766	337	338	337
query74	10241	8840	8949	8840
query75	3350	2610	2645	2610
query76	1961	1077	1088	1077
query77	508	260	267	260
query78	10714	9668	9606	9606
query79	8363	591	583	583
query80	2174	421	426	421
query81	549	239	239	239
query82	1217	120	114	114
query83	253	141	142	141
query84	286	105	78	78
query85	1617	307	295	295
query86	463	294	283	283
query87	4446	4212	4174	4174
query88	5687	2424	2436	2424
query89	557	287	291	287
query90	2112	187	180	180
query91	179	139	142	139
query92	66	45	47	45
query93	6597	543	536	536
query94	864	291	294	291
query95	352	244	253	244
query96	629	281	274	274
query97	3311	3126	3103	3103
query98	219	204	200	200
query99	1613	1307	1294	1294
Total cold run time: 337934 ms
Total hot run time: 195842 ms

@Jibing-Li Jibing-Li merged commit bf0d060 into branch-3.0 Dec 11, 2024
15 checks passed
@github-actions github-actions bot deleted the auto-pick-42399-branch-3.0 branch December 11, 2024 08:48

4 participants