@qidaye
Contributor

@qidaye qidaye commented Feb 28, 2025

What problem does this PR solve?

Currently, the NGram bloom filter index supports only direct schema change, whereas users need to be able to build indexes incrementally.
The design goal is for ngrambf to support light_index_change in both local and cloud mode, so that indexes can be added incrementally or built on stock (existing) data.
Inverted indexes currently support light_schema_change only in local mode; in cloud mode they still use direct schema change. This PR does not involve inverted indexes, and their functionality remains unchanged.
Once this feature is complete, an NGram BF index can be built as shown below. The statements follow the existing syntax; no syntax is changed or added.

# add new index
alter table t1 add index idx_ngram_k2 (`k2`) using ngram_bf properties("bf_size" = "1024", "gram_size" = "3");
create index idx_ngram_k2 (`k2`) on t1 using ngram_bf properties("bf_size" = "1024", "gram_size" = "3");

# build index on stock data
build index idx_ngram_k2 on t1;

# show build ngram index 
show alter table column;

# cancel build index 
cancel build index on t1;

NOTE: Building an index by partition is not currently supported. To build an index on stock data, you must build it for all data, including data written after the index was added.

Building an index by partition will be supported in the next stage.
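
For readers unfamiliar with this index type, here is a toy Python sketch of how an n-gram bloom filter can answer a substring probe. It is not Doris's implementation. The class name `NGramBloomFilter`, the SHA-256-based hashing, the two hash functions, and the treatment of the filter size as a bit count (`num_bits`) are all assumptions made for illustration. Only the values 3 and 1024 echo the `gram_size` and `bf_size` properties in the statements above, and the real `bf_size` semantics may differ.

```python
import hashlib


def ngrams(text, gram_size=3):
    """Return every contiguous substring of length gram_size."""
    return [text[i:i + gram_size] for i in range(len(text) - gram_size + 1)]


class NGramBloomFilter:
    """Toy n-gram bloom filter (hypothetical, for illustration only)."""

    def __init__(self, num_bits=1024, gram_size=3, num_hashes=2):
        self.num_bits = num_bits
        self.gram_size = gram_size
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, gram):
        # Derive num_hashes bit positions per gram by salting the hash input.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, text):
        """Record every n-gram of a stored value."""
        for gram in ngrams(text, self.gram_size):
            for pos in self._positions(gram):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, pattern):
        """False means the pattern is definitely absent; True means maybe."""
        grams = ngrams(pattern, self.gram_size)
        if not grams:
            # Pattern shorter than gram_size: the filter cannot rule it out.
            return True
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for gram in grams
            for pos in self._positions(gram)
        )


bf = NGramBloomFilter(num_bits=1024, gram_size=3)
bf.add("hello world")
print(bf.might_contain("lo wor"))  # a true substring is never rejected
```

A bloom filter has no false negatives, so a real substring of an added value always passes. A pattern that was never added may still pass by chance, which is why a filter like this can only skip data and cannot confirm a match.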

Release note

Support light index change for NGram bf index

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@qidaye
Contributor Author

qidaye commented Feb 28, 2025

run buildall

@qidaye
Contributor Author

qidaye commented Feb 28, 2025

run compile

@qidaye force-pushed the light_index_change_in_cloud branch from 679e17b to d54f726 on March 3, 2025 03:13
@qidaye
Contributor Author

qidaye commented Mar 3, 2025

run buildall

@qidaye force-pushed the light_index_change_in_cloud branch from d54f726 to 6091a62 on March 3, 2025 13:38
@qidaye
Contributor Author

qidaye commented Mar 3, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 31476 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6091a62938881967a724e19b8b38a4995f69f622, data reload: false

------ Round 1 ----------------------------------
q1	17684	5216	5085	5085
q2	2046	298	166	166
q3	10456	1223	756	756
q4	10242	1021	540	540
q5	7809	2371	2326	2326
q6	187	165	131	131
q7	903	746	607	607
q8	9305	1468	1050	1050
q9	4852	4714	4826	4714
q10	6809	2303	1878	1878
q11	481	271	254	254
q12	349	354	220	220
q13	17776	3647	3045	3045
q14	225	227	208	208
q15	511	456	456	456
q16	618	612	567	567
q17	569	855	338	338
q18	6541	6254	6144	6144
q19	1201	939	526	526
q20	308	330	195	195
q21	2797	2160	1964	1964
q22	359	331	306	306
Total cold run time: 102028 ms
Total hot run time: 31476 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5098	5068	5061	5061
q2	231	329	235	235
q3	2161	2682	2322	2322
q4	1446	1834	1369	1369
q5	4211	4059	4105	4059
q6	204	164	128	128
q7	1859	1815	1650	1650
q8	2577	2642	2538	2538
q9	7274	7244	7196	7196
q10	2966	3192	2795	2795
q11	579	514	508	508
q12	711	761	617	617
q13	3523	3883	3257	3257
q14	305	304	270	270
q15	493	454	447	447
q16	652	690	646	646
q17	1119	1597	1333	1333
q18	7689	7389	7269	7269
q19	800	865	906	865
q20	1958	2051	1832	1832
q21	5394	4960	4696	4696
q22	608	583	559	559
Total cold run time: 51858 ms
Total hot run time: 49652 ms
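
The report does not label its columns. Assuming the four numeric columns are the cold run, two hot runs, and the faster of the two hot runs, the totals are column sums: adding all 22 Round 1 rows gives 102028 ms for the first column and 31476 ms for the last, which matches the reported totals. The Python sketch below applies the same arithmetic to the first three Round 1 rows. The function name `summarize` and the column interpretation are assumptions, not part of the benchmark tooling.

```python
# First three Round 1 rows copied from the TPC-H report above.
ROWS = """
q1 17684 5216 5085 5085
q2 2046 298 166 166
q3 10456 1223 756 756
"""


def summarize(report):
    """Sum the cold column and the best-hot column of a report excerpt."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        cold, hot1, hot2, best = int(cold), int(hot1), int(hot2), int(best)
        # Assumed meaning of the last column: the faster of the two hot runs.
        if best != min(hot1, hot2):
            raise ValueError(f"{name}: last column is not min(hot1, hot2)")
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


cold_total, hot_total = summarize(ROWS)
print(cold_total, hot_total)  # 30186 6007
```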

@doris-robot

TPC-DS: Total hot run time: 191500 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6091a62938881967a724e19b8b38a4995f69f622, data reload: false

query1	1308	979	947	947
query2	6221	1917	1831	1831
query3	11143	4775	4612	4612
query4	54142	25578	23704	23704
query5	5315	518	496	496
query6	384	180	182	180
query7	5268	526	291	291
query8	324	234	235	234
query9	7037	2553	2550	2550
query10	413	336	265	265
query11	15578	15308	14978	14978
query12	153	110	105	105
query13	1237	522	392	392
query14	10678	6580	6584	6580
query15	206	199	174	174
query16	7075	660	499	499
query17	1106	732	577	577
query18	1585	432	332	332
query19	209	203	183	183
query20	134	129	124	124
query21	211	132	114	114
query22	4604	4596	4257	4257
query23	34017	33349	33324	33324
query24	6203	2439	2442	2439
query25	442	476	400	400
query26	651	276	158	158
query27	1696	493	347	347
query28	3062	2452	2426	2426
query29	565	566	436	436
query30	218	195	163	163
query31	849	922	852	852
query32	71	61	60	60
query33	451	363	299	299
query34	764	876	499	499
query35	814	858	750	750
query36	970	999	895	895
query37	121	100	77	77
query38	4160	4391	4225	4225
query39	1495	1443	1467	1443
query40	236	117	102	102
query41	52	56	48	48
query42	122	116	116	116
query43	499	495	492	492
query44	1360	826	808	808
query45	185	180	179	179
query46	893	1093	667	667
query47	1846	1877	1765	1765
query48	383	416	304	304
query49	701	489	426	426
query50	713	769	426	426
query51	4403	4351	4320	4320
query52	113	114	102	102
query53	253	279	203	203
query54	482	494	415	415
query55	86	78	84	78
query56	266	266	257	257
query57	1173	1183	1116	1116
query58	234	243	250	243
query59	2781	2851	2536	2536
query60	304	270	275	270
query61	125	125	115	115
query62	743	751	698	698
query63	240	194	200	194
query64	1433	1069	677	677
query65	3394	3317	3248	3248
query66	732	393	314	314
query67	15985	15463	15355	15355
query68	7055	903	493	493
query69	544	292	262	262
query70	1164	1136	1117	1117
query71	512	307	271	271
query72	5733	3654	3829	3654
query73	1185	758	360	360
query74	8944	8990	8754	8754
query75	3829	3209	2699	2699
query76	4278	1210	749	749
query77	579	378	277	277
query78	10512	10527	9654	9654
query79	1436	861	582	582
query80	638	534	461	461
query81	498	285	261	261
query82	460	126	96	96
query83	210	229	155	155
query84	289	86	74	74
query85	732	348	304	304
query86	362	282	298	282
query87	4605	4486	4295	4295
query88	2891	2242	2201	2201
query89	399	316	285	285
query90	2023	200	194	194
query91	133	139	107	107
query92	69	60	54	54
query93	1195	1057	576	576
query94	680	424	304	304
query95	343	259	251	251
query96	483	553	267	267
query97	3320	3375	3304	3304
query98	219	209	205	205
query99	1406	1378	1260	1260
Total cold run time: 297343 ms
Total hot run time: 191500 ms

@doris-robot

ClickBench: Total hot run time: 30.62 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6091a62938881967a724e19b8b38a4995f69f622, data reload: false

query1	0.03	0.05	0.03
query2	0.07	0.04	0.03
query3	0.23	0.07	0.06
query4	1.63	0.10	0.09
query5	0.56	0.54	0.54
query6	1.19	0.72	0.71
query7	0.02	0.01	0.02
query8	0.04	0.03	0.04
query9	0.58	0.54	0.51
query10	0.57	0.55	0.56
query11	0.15	0.11	0.11
query12	0.14	0.11	0.12
query13	0.61	0.62	0.60
query14	2.80	2.68	2.69
query15	0.91	0.84	0.85
query16	0.38	0.37	0.38
query17	1.05	1.02	1.00
query18	0.21	0.20	0.19
query19	1.96	1.77	1.94
query20	0.01	0.01	0.01
query21	15.35	0.88	0.52
query22	0.74	1.07	0.61
query23	15.12	1.38	0.61
query24	7.01	1.54	0.95
query25	0.47	0.28	0.06
query26	0.50	0.16	0.13
query27	0.05	0.05	0.06
query28	9.50	0.87	0.43
query29	12.69	3.99	3.24
query30	0.25	0.09	0.07
query31	2.84	0.59	0.39
query32	3.22	0.54	0.46
query33	3.10	2.97	3.18
query34	15.76	5.19	4.51
query35	4.53	4.57	4.60
query36	0.66	0.50	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.17	0.13	0.13
query41	0.08	0.03	0.03
query42	0.04	0.02	0.03
query43	0.04	0.03	0.02
Total cold run time: 105.43 s
Total hot run time: 30.62 s

@qidaye force-pushed the light_index_change_in_cloud branch from 6091a62 to 7d08604 on March 5, 2025 12:56
@qidaye
Contributor Author

qidaye commented Mar 5, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32307 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7d0860410e203bfbe3319d37bd06e82fe6884d1a, data reload: false

------ Round 1 ----------------------------------
q1	17606	5202	5047	5047
q2	2052	301	187	187
q3	10385	1322	705	705
q4	10217	1012	522	522
q5	7554	2322	2355	2322
q6	188	166	136	136
q7	899	738	592	592
q8	9293	1254	1087	1087
q9	4858	4657	4708	4657
q10	6801	2286	1886	1886
q11	501	277	266	266
q12	342	355	215	215
q13	17771	3695	3043	3043
q14	230	233	205	205
q15	551	493	485	485
q16	620	626	604	604
q17	563	868	339	339
q18	6928	6506	6350	6350
q19	1222	961	543	543
q20	322	327	199	199
q21	2837	2182	1957	1957
q22	1071	1065	960	960
Total cold run time: 102811 ms
Total hot run time: 32307 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5154	5078	5084	5078
q2	252	336	234	234
q3	2159	2668	2289	2289
q4	1438	1784	1382	1382
q5	4229	4097	4145	4097
q6	208	159	123	123
q7	1882	1868	1792	1792
q8	2598	2579	2678	2579
q9	7275	7157	7207	7157
q10	3018	3218	2812	2812
q11	555	503	484	484
q12	666	783	629	629
q13	3501	3877	3286	3286
q14	279	307	283	283
q15	530	481	479	479
q16	659	678	639	639
q17	1188	1644	1304	1304
q18	7742	7560	7532	7532
q19	821	821	919	821
q20	1974	2004	1924	1924
q21	5478	4892	4919	4892
q22	1077	1024	1017	1017
Total cold run time: 52683 ms
Total hot run time: 50833 ms

@doris-robot

TPC-DS: Total hot run time: 185622 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7d0860410e203bfbe3319d37bd06e82fe6884d1a, data reload: false

query1	1001	397	379	379
query2	6537	1930	1944	1930
query3	6797	211	213	211
query4	26899	23716	23699	23699
query5	4353	666	481	481
query6	313	205	203	203
query7	4595	501	288	288
query8	281	239	234	234
query9	8604	2535	2539	2535
query10	464	307	250	250
query11	15704	15268	14839	14839
query12	151	111	103	103
query13	1648	511	380	380
query14	8925	6190	6302	6190
query15	209	179	173	173
query16	7135	627	464	464
query17	1188	692	538	538
query18	1940	389	310	310
query19	186	179	151	151
query20	122	116	114	114
query21	212	123	100	100
query22	4291	4216	4190	4190
query23	33693	32967	33311	32967
query24	7744	2314	2370	2314
query25	598	485	418	418
query26	1236	272	162	162
query27	2388	509	331	331
query28	4154	2414	2373	2373
query29	771	557	415	415
query30	281	216	185	185
query31	928	858	780	780
query32	72	65	66	65
query33	566	347	305	305
query34	782	844	498	498
query35	807	804	741	741
query36	952	983	897	897
query37	119	132	76	76
query38	4176	4164	4085	4085
query39	1444	1391	1357	1357
query40	212	116	103	103
query41	55	53	53	53
query42	119	109	101	101
query43	510	537	493	493
query44	1263	775	796	775
query45	177	168	168	168
query46	828	1021	631	631
query47	1748	1821	1708	1708
query48	377	405	294	294
query49	772	516	428	428
query50	689	728	400	400
query51	4125	4134	4094	4094
query52	101	104	93	93
query53	229	259	185	185
query54	493	478	410	410
query55	80	88	83	83
query56	278	262	251	251
query57	1137	1114	1052	1052
query58	243	238	260	238
query59	2759	2898	2866	2866
query60	273	280	264	264
query61	119	143	128	128
query62	804	734	675	675
query63	235	187	190	187
query64	4538	1115	768	768
query65	4412	4328	4290	4290
query66	1146	421	328	328
query67	15740	15639	15273	15273
query68	8094	870	511	511
query69	469	296	255	255
query70	1203	1117	1099	1099
query71	409	291	278	278
query72	5567	3559	3784	3559
query73	833	736	344	344
query74	9026	9192	9000	9000
query75	3319	3142	2701	2701
query76	3251	1171	718	718
query77	701	366	289	289
query78	9857	10027	9342	9342
query79	2186	819	575	575
query80	653	519	444	444
query81	485	262	228	228
query82	539	130	100	100
query83	168	170	148	148
query84	237	94	77	77
query85	797	356	306	306
query86	380	316	283	283
query87	4536	4455	4384	4384
query88	3634	2161	2141	2141
query89	392	310	289	289
query90	1921	188	190	188
query91	140	150	115	115
query92	72	69	60	60
query93	1764	1030	568	568
query94	692	416	295	295
query95	354	281	252	252
query96	473	543	266	266
query97	3331	3386	3287	3287
query98	230	206	199	199
query99	1342	1373	1251	1251
Total cold run time: 272783 ms
Total hot run time: 185622 ms

@doris-robot

ClickBench: Total hot run time: 30.68 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7d0860410e203bfbe3319d37bd06e82fe6884d1a, data reload: false

query1	0.03	0.03	0.03
query2	0.08	0.03	0.04
query3	0.24	0.07	0.06
query4	1.62	0.11	0.10
query5	0.57	0.54	0.55
query6	1.19	0.73	0.71
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.59	0.54	0.52
query10	0.56	0.58	0.57
query11	0.15	0.11	0.10
query12	0.14	0.11	0.11
query13	0.61	0.60	0.60
query14	2.80	2.70	2.68
query15	0.92	0.84	0.84
query16	0.38	0.39	0.38
query17	1.02	1.03	1.05
query18	0.21	0.19	0.20
query19	1.85	1.75	1.98
query20	0.01	0.01	0.01
query21	15.35	0.90	0.54
query22	0.75	1.22	0.74
query23	14.81	1.39	0.61
query24	8.02	1.18	0.69
query25	0.58	0.23	0.14
query26	0.63	0.16	0.15
query27	0.06	0.06	0.05
query28	9.87	0.84	0.44
query29	12.58	3.94	3.23
query30	0.26	0.09	0.07
query31	2.81	0.58	0.38
query32	3.23	0.54	0.46
query33	2.96	3.01	3.03
query34	15.76	5.11	4.54
query35	4.62	4.53	4.53
query36	0.65	0.50	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.16	0.13	0.13
query41	0.09	0.02	0.03
query42	0.03	0.02	0.02
query43	0.03	0.04	0.02
Total cold run time: 106.45 s
Total hot run time: 30.68 s

@qidaye force-pushed the light_index_change_in_cloud branch from 7d08604 to 4ede4ef on March 5, 2025 14:21
@qidaye
Contributor Author

qidaye commented Mar 5, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32126 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4ede4ef03d9471e0208fb5f451ee5aed66b78ba4, data reload: false

------ Round 1 ----------------------------------
q1	17576	5121	5008	5008
q2	2062	285	174	174
q3	10593	1218	741	741
q4	10200	995	525	525
q5	7638	2394	2298	2298
q6	187	168	135	135
q7	895	755	595	595
q8	9298	1255	1050	1050
q9	4781	4558	4934	4558
q10	6799	2319	1899	1899
q11	482	267	260	260
q12	345	347	216	216
q13	17765	3675	3063	3063
q14	254	231	212	212
q15	542	492	482	482
q16	609	602	590	590
q17	559	842	337	337
q18	6795	6394	6354	6354
q19	1274	940	523	523
q20	320	334	192	192
q21	2791	2180	1938	1938
q22	1090	1031	976	976
Total cold run time: 102855 ms
Total hot run time: 32126 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5093	5076	5051	5051
q2	238	331	233	233
q3	2178	2644	2263	2263
q4	1393	1833	1369	1369
q5	4241	4109	4109	4109
q6	214	159	121	121
q7	1887	1830	1742	1742
q8	2615	2670	2601	2601
q9	7185	7303	7143	7143
q10	3041	3200	2805	2805
q11	594	527	480	480
q12	664	761	629	629
q13	3362	3939	3214	3214
q14	277	290	275	275
q15	533	472	461	461
q16	660	651	632	632
q17	1111	1622	1311	1311
q18	7667	7748	7566	7566
q19	794	795	800	795
q20	2012	2024	1871	1871
q21	5533	5066	4836	4836
q22	1100	1070	1010	1010
Total cold run time: 52392 ms
Total hot run time: 50517 ms

@doris-robot

TPC-DS: Total hot run time: 192143 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4ede4ef03d9471e0208fb5f451ee5aed66b78ba4, data reload: false

query1	1409	1052	1023	1023
query2	6217	1966	1969	1966
query3	11145	4573	4525	4525
query4	53784	25728	23207	23207
query5	5147	614	490	490
query6	350	203	195	195
query7	4947	501	293	293
query8	319	250	247	247
query9	5687	2564	2548	2548
query10	432	305	249	249
query11	15176	15019	14892	14892
query12	164	118	111	111
query13	1090	527	408	408
query14	10825	6327	6925	6327
query15	204	210	184	184
query16	7166	682	484	484
query17	1086	726	601	601
query18	1561	427	337	337
query19	193	198	173	173
query20	129	127	122	122
query21	260	136	106	106
query22	4234	4506	4233	4233
query23	33867	33384	33320	33320
query24	6044	2433	2460	2433
query25	464	468	402	402
query26	739	268	157	157
query27	1897	522	360	360
query28	2772	2504	2423	2423
query29	594	557	448	448
query30	279	216	216	216
query31	877	864	789	789
query32	72	66	64	64
query33	463	349	333	333
query34	757	866	504	504
query35	822	870	764	764
query36	964	994	911	911
query37	122	102	75	75
query38	4251	4203	4127	4127
query39	1482	1472	1423	1423
query40	213	114	100	100
query41	52	51	52	51
query42	126	111	109	109
query43	513	525	508	508
query44	1303	788	798	788
query45	183	173	161	161
query46	864	1032	647	647
query47	1837	1914	1785	1785
query48	369	419	304	304
query49	711	502	419	419
query50	703	779	432	432
query51	4301	4337	4240	4240
query52	110	109	107	107
query53	223	266	190	190
query54	488	492	421	421
query55	87	84	82	82
query56	294	269	259	259
query57	1171	1199	1120	1120
query58	255	239	242	239
query59	2896	3054	2951	2951
query60	298	269	263	263
query61	132	151	115	115
query62	761	733	669	669
query63	234	196	197	196
query64	2293	1052	695	695
query65	4630	4420	4359	4359
query66	766	390	347	347
query67	15943	15841	15277	15277
query68	7146	877	502	502
query69	542	294	283	283
query70	1213	1143	1128	1128
query71	507	308	265	265
query72	5492	3759	3796	3759
query73	1464	741	348	348
query74	9225	9193	8828	8828
query75	3834	3148	2701	2701
query76	4154	1185	770	770
query77	700	371	276	276
query78	10102	9979	9310	9310
query79	2864	828	587	587
query80	727	535	443	443
query81	489	270	225	225
query82	666	126	96	96
query83	241	166	206	166
query84	288	86	73	73
query85	774	369	303	303
query86	378	293	286	286
query87	4596	4510	4557	4510
query88	3496	2202	2175	2175
query89	405	323	292	292
query90	1796	191	195	191
query91	142	138	111	111
query92	72	59	59	59
query93	2054	1038	579	579
query94	663	424	303	303
query95	348	267	255	255
query96	483	573	270	270
query97	3390	3395	3283	3283
query98	226	207	197	197
query99	1455	1382	1265	1265
Total cold run time: 299679 ms
Total hot run time: 192143 ms

@doris-robot

ClickBench: Total hot run time: 31.1 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4ede4ef03d9471e0208fb5f451ee5aed66b78ba4, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.06
query4	1.61	0.10	0.11
query5	0.55	0.53	0.55
query6	1.18	0.71	0.73
query7	0.03	0.02	0.02
query8	0.04	0.03	0.04
query9	0.60	0.53	0.54
query10	0.57	0.58	0.57
query11	0.15	0.10	0.11
query12	0.14	0.11	0.11
query13	0.62	0.60	0.60
query14	2.65	2.68	2.69
query15	0.92	0.85	0.85
query16	0.41	0.38	0.37
query17	1.02	1.04	1.04
query18	0.21	0.19	0.19
query19	1.92	1.77	2.00
query20	0.02	0.01	0.01
query21	15.35	0.91	0.56
query22	0.76	1.28	0.93
query23	14.71	1.36	0.63
query24	7.47	1.61	0.86
query25	0.48	0.38	0.07
query26	0.59	0.17	0.14
query27	0.06	0.05	0.05
query28	8.98	0.81	0.44
query29	12.54	3.95	3.30
query30	0.24	0.08	0.07
query31	2.82	0.58	0.39
query32	3.22	0.55	0.46
query33	3.04	3.01	3.06
query34	15.55	5.12	4.50
query35	4.58	4.57	4.56
query36	0.68	0.48	0.48
query37	0.09	0.06	0.06
query38	0.06	0.04	0.03
query39	0.03	0.03	0.03
query40	0.17	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.03	0.02	0.02
Total cold run time: 104.55 s
Total hot run time: 31.1 s

@qidaye force-pushed the light_index_change_in_cloud branch from 4ede4ef to f25084a on March 6, 2025 02:44
@qidaye
Contributor Author

qidaye commented Mar 6, 2025

run buildall

@qidaye force-pushed the light_index_change_in_cloud branch from f25084a to 8723c95 on March 6, 2025 03:18
@qidaye
Contributor Author

qidaye commented Mar 6, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 32896 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8723c952f6958afffc29c7831ffc8a8f06611b60, data reload: false

------ Round 1 ----------------------------------
q1	17579	5284	5129	5129
q2	2046	320	172	172
q3	10382	1356	713	713
q4	10205	1041	539	539
q5	7539	2439	2416	2416
q6	190	175	138	138
q7	939	781	630	630
q8	9311	1312	1185	1185
q9	5046	4644	4817	4644
q10	6849	2340	1911	1911
q11	478	269	263	263
q12	349	364	221	221
q13	17765	3766	3114	3114
q14	242	244	206	206
q15	553	513	503	503
q16	620	617	578	578
q17	599	881	354	354
q18	7057	6542	6386	6386
q19	1963	988	548	548
q20	312	331	187	187
q21	3017	2342	2070	2070
q22	1037	1040	989	989
Total cold run time: 104078 ms
Total hot run time: 32896 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5243	5144	5175	5144
q2	244	333	229	229
q3	2190	2703	2322	2322
q4	1517	1883	1403	1403
q5	4295	4141	4201	4141
q6	208	164	125	125
q7	2032	1879	1767	1767
q8	2651	2594	2531	2531
q9	7300	7182	7170	7170
q10	3031	3223	2792	2792
q11	581	504	495	495
q12	683	755	623	623
q13	3450	3983	3339	3339
q14	279	309	284	284
q15	547	506	499	499
q16	646	673	620	620
q17	1150	1599	1352	1352
q18	7746	7625	7510	7510
q19	866	853	892	853
q20	2007	2035	1827	1827
q21	5526	4894	4969	4894
q22	1094	1006	989	989
Total cold run time: 53286 ms
Total hot run time: 50909 ms

@doris-robot

TPC-DS: Total hot run time: 185891 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 8723c952f6958afffc29c7831ffc8a8f06611b60, data reload: false

query1	992	399	379	379
query2	6534	1934	1987	1934
query3	6789	217	217	217
query4	26098	23403	23336	23336
query5	4360	640	487	487
query6	305	205	191	191
query7	4623	503	294	294
query8	295	251	239	239
query9	8664	2580	2581	2580
query10	456	322	258	258
query11	15688	15154	15332	15154
query12	173	113	108	108
query13	1672	524	402	402
query14	10427	6340	6638	6340
query15	206	190	168	168
query16	7659	618	428	428
query17	1331	714	549	549
query18	1980	402	297	297
query19	187	191	153	153
query20	123	112	113	112
query21	209	119	103	103
query22	4305	4343	4261	4261
query23	33948	32698	32865	32698
query24	7026	2366	2360	2360
query25	515	476	381	381
query26	1214	272	154	154
query27	2470	482	323	323
query28	4314	2441	2384	2384
query29	722	550	429	429
query30	293	222	194	194
query31	966	834	779	779
query32	100	64	62	62
query33	560	364	310	310
query34	809	850	508	508
query35	791	829	729	729
query36	942	976	900	900
query37	127	108	74	74
query38	4044	4175	4026	4026
query39	1441	1419	1397	1397
query40	205	117	101	101
query41	58	54	54	54
query42	120	104	102	102
query43	519	533	481	481
query44	1307	792	793	792
query45	176	175	164	164
query46	844	1020	620	620
query47	1742	1841	1710	1710
query48	368	404	297	297
query49	812	521	428	428
query50	684	735	413	413
query51	4169	4171	4175	4171
query52	112	110	99	99
query53	236	253	187	187
query54	500	487	411	411
query55	84	81	78	78
query56	288	267	252	252
query57	1110	1142	1073	1073
query58	250	238	239	238
query59	2632	2793	2761	2761
query60	285	293	278	278
query61	121	117	121	117
query62	814	708	692	692
query63	232	194	190	190
query64	4254	1056	652	652
query65	4427	4310	4322	4310
query66	1059	404	315	315
query67	15919	15424	15555	15424
query68	7734	887	497	497
query69	459	303	267	267
query70	1174	1115	1093	1093
query71	415	290	264	264
query72	5529	3784	3764	3764
query73	748	742	406	406
query74	9240	9115	8738	8738
query75	3197	3172	2698	2698
query76	3253	1189	751	751
query77	474	362	277	277
query78	10075	10210	9261	9261
query79	1445	831	596	596
query80	599	541	452	452
query81	517	255	226	226
query82	211	130	98	98
query83	181	180	158	158
query84	247	90	80	80
query85	755	355	358	355
query86	373	308	295	295
query87	4457	4556	4391	4391
query88	2886	2254	2232	2232
query89	388	310	283	283
query90	1935	218	217	217
query91	142	140	109	109
query92	83	62	60	60
query93	1255	1053	581	581
query94	683	408	309	309
query95	360	274	265	265
query96	495	550	278	278
query97	3338	3373	3287	3287
query98	229	204	200	200
query99	1345	1381	1295	1295
Total cold run time: 270907 ms
Total hot run time: 185891 ms

@doris-robot

ClickBench: Total hot run time: 31.13 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 8723c952f6958afffc29c7831ffc8a8f06611b60, data reload: false

query1	0.04	0.04	0.04
query2	0.08	0.03	0.04
query3	0.24	0.07	0.07
query4	1.61	0.11	0.11
query5	0.56	0.56	0.55
query6	1.18	0.73	0.73
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.59	0.54	0.51
query10	0.57	0.57	0.56
query11	0.16	0.10	0.10
query12	0.15	0.11	0.10
query13	0.62	0.60	0.60
query14	2.67	2.82	2.69
query15	0.95	0.86	0.84
query16	0.38	0.37	0.38
query17	1.01	1.04	1.07
query18	0.21	0.20	0.19
query19	1.89	2.01	1.79
query20	0.01	0.02	0.01
query21	15.36	0.86	0.53
query22	0.76	1.32	0.71
query23	14.74	1.42	0.64
query24	6.89	1.52	1.16
query25	0.49	0.25	0.07
query26	0.50	0.17	0.15
query27	0.05	0.05	0.05
query28	9.41	0.86	0.43
query29	12.57	3.96	3.29
query30	0.25	0.09	0.06
query31	2.83	0.60	0.39
query32	3.23	0.56	0.47
query33	3.00	3.01	3.06
query34	15.72	5.11	4.51
query35	4.54	4.51	4.50
query36	0.67	0.50	0.48
query37	0.09	0.06	0.06
query38	0.05	0.03	0.04
query39	0.03	0.02	0.03
query40	0.17	0.14	0.12
query41	0.08	0.02	0.02
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 104.5 s
Total hot run time: 31.13 s

@qidaye force-pushed the light_index_change_in_cloud branch from 8723c95 to 6a7dbbf on March 6, 2025 07:25
@airborne12 force-pushed the light_index_change_in_cloud branch from 283baf0 to 0211cdd on May 7, 2025 06:21
@airborne12
Member

run buildall

@doris-robot

TPC-H: Total hot run time: 33875 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0211cdd6c1b032480c6b7d1f4f1945aaa3035f05, data reload: false

------ Round 1 ----------------------------------
q1	25989	5030	5007	5007
q2	2056	272	188	188
q3	10410	1225	679	679
q4	10225	1005	548	548
q5	7545	2377	2279	2279
q6	181	169	133	133
q7	931	722	632	632
q8	9327	1224	1098	1098
q9	6773	5057	5144	5057
q10	6837	2297	1877	1877
q11	492	286	274	274
q12	347	348	208	208
q13	18034	3675	3137	3137
q14	242	233	209	209
q15	527	491	489	489
q16	419	428	368	368
q17	590	848	352	352
q18	7614	7154	7165	7154
q19	1422	946	555	555
q20	333	338	226	226
q21	3921	2654	2430	2430
q22	1056	989	975	975
Total cold run time: 115271 ms
Total hot run time: 33875 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5089	5046	5107	5046
q2	234	320	235	235
q3	2179	2603	2297	2297
q4	1392	1772	1366	1366
q5	4416	4341	4379	4341
q6	223	174	132	132
q7	2030	1944	1753	1753
q8	2573	2582	2545	2545
q9	7190	7213	6976	6976
q10	3037	3184	2708	2708
q11	585	512	516	512
q12	660	772	590	590
q13	3489	3901	3340	3340
q14	275	292	279	279
q15	525	483	477	477
q16	453	490	425	425
q17	1144	1551	1379	1379
q18	7806	7399	7381	7381
q19	768	836	971	836
q20	1958	2030	1911	1911
q21	5113	4707	4667	4667
q22	1113	1091	1057	1057
Total cold run time: 52252 ms
Total hot run time: 50253 ms

@doris-robot

TPC-DS: Total hot run time: 192363 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0211cdd6c1b032480c6b7d1f4f1945aaa3035f05, data reload: false

query1	1410	1110	1093	1093
query2	6347	1831	1812	1812
query3	11032	4667	4521	4521
query4	52563	24958	23036	23036
query5	4962	508	464	464
query6	312	222	199	199
query7	4884	485	289	289
query8	316	260	238	238
query9	5373	2560	2555	2555
query10	439	327	270	270
query11	15007	15010	14882	14882
query12	159	105	108	105
query13	1048	529	389	389
query14	10104	6335	6251	6251
query15	211	208	174	174
query16	7095	675	529	529
query17	1101	754	619	619
query18	1613	428	330	330
query19	203	200	171	171
query20	125	121	139	121
query21	210	126	113	113
query22	4359	4481	4298	4298
query23	34214	33423	33646	33423
query24	6588	2439	2430	2430
query25	477	469	406	406
query26	713	277	150	150
query27	2375	517	333	333
query28	3336	2117	2085	2085
query29	584	571	443	443
query30	274	227	193	193
query31	870	865	783	783
query32	98	65	61	61
query33	458	362	307	307
query34	775	860	534	534
query35	795	830	753	753
query36	936	982	913	913
query37	111	105	77	77
query38	4285	4248	4189	4189
query39	1505	1470	1449	1449
query40	212	123	111	111
query41	66	62	59	59
query42	125	107	111	107
query43	504	503	479	479
query44	1312	816	810	810
query45	184	174	171	171
query46	861	1049	660	660
query47	1835	1857	1787	1787
query48	388	417	335	335
query49	691	492	437	437
query50	650	708	414	414
query51	4217	4224	4109	4109
query52	110	109	108	108
query53	229	260	189	189
query54	618	583	513	513
query55	79	82	83	82
query56	300	299	286	286
query57	1188	1227	1201	1201
query58	278	266	255	255
query59	2734	2811	2744	2744
query60	332	323	298	298
query61	135	128	126	126
query62	763	757	671	671
query63	236	192	187	187
query64	2040	1089	708	708
query65	4351	4222	4264	4222
query66	792	394	303	303
query67	15892	15555	15209	15209
query68	7066	886	506	506
query69	536	315	260	260
query70	1227	1112	1135	1112
query71	507	310	330	310
query72	5789	4807	4792	4792
query73	1446	628	346	346
query74	9139	9151	9079	9079
query75	3808	3197	2711	2711
query76	4220	1224	754	754
query77	617	352	277	277
query78	10047	10142	9290	9290
query79	2611	767	568	568
query80	640	504	442	442
query81	497	253	220	220
query82	462	129	95	95
query83	347	243	237	237
query84	291	104	85	85
query85	805	440	310	310
query86	381	305	285	285
query87	4411	4400	4299	4299
query88	3427	2200	2189	2189
query89	398	309	280	280
query90	1905	210	204	204
query91	144	157	109	109
query92	77	59	52	52
query93	1708	957	576	576
query94	681	398	301	301
query95	366	294	277	277
query96	486	555	275	275
query97	3149	3215	3094	3094
query98	230	212	200	200
query99	1455	1405	1259	1259
Total cold run time: 297262 ms
Total hot run time: 192363 ms

@doris-robot

ClickBench: Total hot run time: 28.98 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0211cdd6c1b032480c6b7d1f4f1945aaa3035f05, data reload: false

query1	0.04	0.04	0.03
query2	0.14	0.10	0.11
query3	0.24	0.18	0.20
query4	1.59	0.20	0.11
query5	0.57	0.56	0.54
query6	1.20	0.73	0.71
query7	0.03	0.02	0.02
query8	0.05	0.03	0.04
query9	0.58	0.52	0.51
query10	0.56	0.57	0.57
query11	0.16	0.11	0.12
query12	0.16	0.12	0.12
query13	0.62	0.59	0.60
query14	0.78	0.80	0.79
query15	0.88	0.83	0.86
query16	0.39	0.38	0.37
query17	1.04	1.00	1.01
query18	0.20	0.20	0.19
query19	1.92	1.82	1.77
query20	0.01	0.01	0.02
query21	15.39	0.89	0.54
query22	0.75	1.05	0.77
query23	14.90	1.38	0.59
query24	7.04	1.77	0.82
query25	0.47	0.12	0.09
query26	0.57	0.15	0.14
query27	0.05	0.05	0.05
query28	9.33	0.88	0.43
query29	12.59	3.95	3.27
query30	0.26	0.09	0.07
query31	2.83	0.58	0.37
query32	3.22	0.54	0.48
query33	3.03	3.00	3.06
query34	15.58	5.10	4.49
query35	4.48	4.55	4.48
query36	0.65	0.50	0.49
query37	0.08	0.06	0.07
query38	0.05	0.04	0.04
query39	0.03	0.03	0.02
query40	0.19	0.14	0.13
query41	0.08	0.03	0.03
query42	0.03	0.02	0.03
query43	0.04	0.04	0.02
Total cold run time: 102.8 s
Total hot run time: 28.98 s

Member

@airborne12 airborne12 left a comment

LGTM

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) May 8, 2025
@github-actions
Contributor

github-actions bot commented May 8, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented May 8, 2025

PR approved by anyone and no changes requested.

Contributor

@zzzxl1993 zzzxl1993 left a comment

LGTM

@airborne12 airborne12 merged commit 99d4294 into apache:master May 8, 2025
28 of 30 checks passed
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…pache#48461)

### What problem does this PR solve?

Currently, the NGram bloom filter index supports only direct schema
change, but users need to be able to build indexes incrementally.
The design goal is for the NGram BF index to support light_index_change
in both local and cloud mode, so that indexes can be added incrementally
or built on stock data.
Inverted indexes currently support light_schema_change only in local
mode; in cloud mode they still use direct schema change. This PR does
not involve inverted indexes, and their functionality remains unchanged.
Once this feature is complete, NGram BF indexes can be built as shown
here. This follows the existing syntax and involves no changes or
additions.

```sql
# add new index
alter table t1 add index idx_ngram_k2 (`k2`) using ngram_bf properties("bf_size" = "1024", "gram_size" = "3");
create index idx_ngram_k2 (`k2`) on t1 using ngram_bf properties("bf_size" = "1024", "gram_size" = "3");

# build index on stock data
build index idx_ngram_k2 on t1;

# show build ngram index 
show alter table column;

# cancel build index 
cancel build index on t1;
```

**NOTE:** Building an index by partition is not currently supported. To
build an index on stock data, you must build it for all data, including
new data written after the index was added.

Build index by partition will be supported in the next stage.
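
For readers unfamiliar with the `gram_size` and `bf_size` properties used above, the following is a minimal conceptual sketch of an n-gram bloom filter in Python. It is not Doris's implementation: the class name, the `num_hashes` parameter, the SHA-256 based hashing, and the reading of `bf_size` as a size in bytes are all assumptions made for illustration only.

```python
import hashlib


def ngrams(text: str, gram_size: int) -> list[str]:
    """Return every contiguous substring of length gram_size."""
    return [text[i:i + gram_size] for i in range(len(text) - gram_size + 1)]


class NgramBloomFilter:
    """Toy n-gram bloom filter (hypothetical; not Doris's format or hashes)."""

    def __init__(self, bf_size: int = 1024, gram_size: int = 3, num_hashes: int = 2):
        self.num_bits = bf_size * 8   # assumption: bf_size is counted in bytes
        self.gram_size = gram_size
        self.num_hashes = num_hashes  # hypothetical knob for this sketch
        self.bits = bytearray(bf_size)

    def _positions(self, gram: str):
        # Derive num_hashes bit positions per gram from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        """Record all n-grams of a column value in the filter."""
        for gram in ngrams(value, self.gram_size):
            for pos in self._positions(gram):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, pattern: str) -> bool:
        """False means no indexed value can contain the pattern as a substring."""
        grams = ngrams(pattern, self.gram_size)
        if not grams:
            return True  # pattern shorter than gram_size: nothing to check
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for gram in grams
            for pos in self._positions(gram)
        )


bf = NgramBloomFilter(bf_size=1024, gram_size=3)
bf.add("apache doris")
print(bf.might_contain("doris"))  # a substring of an added value is never rejected
```

In a sketch like this, a larger `bf_size` lowers the false-positive rate at the cost of space, and a pattern shorter than `gram_size` cannot be pruned at all.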

### Release note

Support light index change for the NGram BF index.
airborne12 pushed a commit to airborne12/apache-doris that referenced this pull request Jul 7, 2025
…pache#48461)

airborne12 pushed a commit to airborne12/apache-doris that referenced this pull request Jul 9, 2025
…pache#48461)

morrySnow pushed a commit that referenced this pull request Jul 9, 2025
…x without parser and ngram bf index #48461 #52251 (#52894)

cherry pick from #48461 and #52251

---------

Co-authored-by: qiye <jianliang5669@gmail.com>
airborne12 added a commit to airborne12/apache-doris that referenced this pull request Jul 9, 2025
…x without parser and ngram bf index apache#48461 apache#52251 (apache#52894)

cherry pick from apache#48461 and apache#52251

---------

Co-authored-by: qiye <jianliang5669@gmail.com>
eldenmoon pushed a commit that referenced this pull request Aug 15, 2025
…4777)

This problem was introduced by #48461; this restriction should not have been removed.
github-actions bot pushed a commit that referenced this pull request Aug 15, 2025
…4777)

This problem was introduced by #48461; this restriction should not have been removed.
github-actions bot pushed a commit that referenced this pull request Aug 15, 2025
…4777)

This problem was introduced by #48461; this restriction should not have been removed.

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.0.7-merged, dev/3.1.0-merged, reviewed

7 participants