@hubgeter
Contributor

Proposed changes

Fix this bug:
Scenario: an Iceberg table uses the ORC storage format.
SQL: `select * from iceberg_orc;`
When this SQL is executed, the position delete filter is not applied, so rows removed by position delete files are still returned.
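
To make the failure mode concrete, here is a minimal sketch of a read path in which position deletes are applied whether or not the query has a predicate. It is not Doris code: `read_batch`, its parameters, and the use of plain `std::vector` columns are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper, not a Doris API.
// values:            one column of a batch read from the ORC file
// batch_start:       file-level row number of the first row in the batch
// deleted_positions: file-level row numbers from position delete files,
//                    sorted ascending
// predicate:         optional row predicate; empty when the query has none
std::vector<int> read_batch(const std::vector<int>& values, int64_t batch_start,
                            const std::vector<int64_t>& deleted_positions,
                            const std::function<bool(int)>& predicate) {
    std::vector<int> out;
    for (size_t i = 0; i < values.size(); ++i) {
        const int64_t row_id = batch_start + static_cast<int64_t>(i);
        const bool deleted = std::binary_search(deleted_positions.begin(),
                                                deleted_positions.end(), row_id);
        if (deleted) {
            continue;  // dropped even when there is no predicate
        }
        if (predicate && !predicate(values[i])) {
            continue;  // predicate filtering is a separate, optional step
        }
        out.push_back(values[i]);
    }
    return out;
}
```

With values `{10, 20, 30, 40}`, a batch starting at row 0, and deleted positions `{1, 3}`, calling `read_batch` with an empty predicate returns `{10, 30}`. The bug described above corresponds to skipping the delete check whenever the predicate is empty.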

Issue Number: close #xxx

…rm position delete when reading the orc file without a predicate.
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@hubgeter
Contributor Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.67% (8982/25180)
Line Coverage: 27.32% (74256/271756)
Region Coverage: 26.56% (38371/144460)
Branch Coverage: 23.38% (19570/83704)
Coverage Report: http://coverage.selectdb-in.cc/coverage/12c5fd0dc6733e7c842abefa8238b1767c359651_12c5fd0dc6733e7c842abefa8238b1767c359651/report/index.html

@doris-robot

TPC-H: Total hot run time: 40233 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 12c5fd0dc6733e7c842abefa8238b1767c359651, data reload: false

------ Round 1 ----------------------------------
q1	17610	4387	4285	4285
q2	2037	191	182	182
q3	10508	1188	1149	1149
q4	10158	824	732	732
q5	7514	2659	2594	2594
q6	216	131	133	131
q7	1011	570	578	570
q8	9236	2110	2047	2047
q9	8845	6527	6465	6465
q10	8880	3685	3689	3685
q11	449	247	236	236
q12	439	212	208	208
q13	18549	2950	2970	2950
q14	253	224	214	214
q15	507	473	481	473
q16	483	387	377	377
q17	961	720	765	720
q18	8196	7343	7360	7343
q19	6834	1555	1502	1502
q20	643	312	303	303
q21	5014	3794	3994	3794
q22	338	273	281	273
Total cold run time: 118681 ms
Total hot run time: 40233 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4315	4182	4185	4182
q2	366	264	275	264
q3	2955	2736	2713	2713
q4	1850	1580	1553	1553
q5	5255	5254	5281	5254
q6	210	127	125	125
q7	2252	1829	1888	1829
q8	3179	3346	3352	3346
q9	8373	8304	8326	8304
q10	3906	3730	3726	3726
q11	577	486	471	471
q12	760	579	558	558
q13	16438	2961	2969	2961
q14	305	272	263	263
q15	521	485	466	466
q16	464	406	411	406
q17	1749	1494	1465	1465
q18	7555	7486	7374	7374
q19	1659	1579	1501	1501
q20	1970	1753	1774	1753
q21	4929	4696	4994	4696
q22	579	488	489	488
Total cold run time: 70167 ms
Total hot run time: 53698 ms

@doris-robot

TPC-DS: Total hot run time: 188082 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 12c5fd0dc6733e7c842abefa8238b1767c359651, data reload: false

query1	912	371	340	340
query2	7246	2280	2436	2280
query3	6657	209	214	209
query4	23057	21233	21173	21173
query5	4149	422	414	414
query6	262	172	172	172
query7	4591	313	291	291
query8	234	191	180	180
query9	8482	2449	2429	2429
query10	435	270	254	254
query11	14747	14353	14151	14151
query12	136	91	92	91
query13	1663	386	387	386
query14	10657	7742	8231	7742
query15	228	171	171	171
query16	7985	273	270	270
query17	1832	567	559	559
query18	2025	285	280	280
query19	195	160	159	159
query20	94	88	86	86
query21	193	136	125	125
query22	5136	4868	4846	4846
query23	34096	33667	33816	33667
query24	11777	2823	2945	2823
query25	653	372	358	358
query26	1778	164	154	154
query27	2994	325	330	325
query28	7467	2061	2055	2055
query29	1079	637	597	597
query30	294	150	148	148
query31	966	747	740	740
query32	97	57	52	52
query33	751	248	254	248
query34	1102	488	491	488
query35	841	668	682	668
query36	1107	913	897	897
query37	276	68	67	67
query38	2922	2779	2796	2779
query39	1608	1565	1565	1565
query40	278	125	123	123
query41	41	38	37	37
query42	105	94	97	94
query43	573	532	526	526
query44	1232	740	743	740
query45	264	245	246	245
query46	1065	742	714	714
query47	1938	1902	1880	1880
query48	379	303	300	300
query49	1187	400	387	387
query50	764	404	394	394
query51	6855	6809	6816	6809
query52	109	86	92	86
query53	348	283	293	283
query54	958	429	449	429
query55	74	72	74	72
query56	242	225	222	222
query57	1260	1164	1126	1126
query58	224	206	200	200
query59	3457	3379	3183	3183
query60	259	234	262	234
query61	90	92	86	86
query62	693	465	464	464
query63	308	281	281	281
query64	9740	7375	7372	7372
query65	3161	3104	3103	3103
query66	1385	346	337	337
query67	15498	15293	15389	15293
query68	4670	539	536	536
query69	478	302	306	302
query70	1122	1161	1119	1119
query71	409	257	281	257
query72	7250	2579	2350	2350
query73	706	327	326	326
query74	6586	6165	6123	6123
query75	3507	2673	2642	2642
query76	2998	1036	1041	1036
query77	406	273	267	267
query78	10669	10220	10194	10194
query79	2507	523	515	515
query80	899	431	432	431
query81	509	220	219	219
query82	873	93	90	90
query83	273	168	169	168
query84	291	88	84	84
query85	1544	272	256	256
query86	463	322	308	308
query87	3277	3088	3083	3083
query88	4377	2449	2456	2449
query89	468	374	405	374
query90	2015	187	195	187
query91	131	99	100	99
query92	58	51	50	50
query93	1761	513	500	500
query94	1159	185	187	185
query95	393	293	296	293
query96	591	278	269	269
query97	3234	2995	2998	2995
query98	237	218	211	211
query99	1146	905	918	905
Total cold run time: 288218 ms
Total hot run time: 188082 ms

Contributor

@morningman left a comment

LGTM

@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on May 14, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

        RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(block, columns_to_filter,
                                                               (*_delete_rows_filter_ptr)));
    } else {
        std::unique_ptr<IColumn::Filter> filter(new IColumn::Filter(block->rows(), 1));
Member

Why do we need to create a filter when there are no deleted rows?

Contributor Author

`_delete_rows_filter_ptr` is used for transactional Hive ORC tables, not for position deletes.
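
One way to read this exchange, sketched under assumptions: a filter already populated for transactional Hive deletes may or may not exist, and position deletes need some filter to mark rows in either case. When none exists, an all-ones ("keep every row") filter sized to the batch gives the position delete step something to write into. The names `Filter`, `mark_position_deletes`, and `effective_filter` below are hypothetical stand-ins, not the actual Doris functions, and the real reader may combine the two filters differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Stand-in for IColumn::Filter: 1 keeps the row, 0 drops it.
using Filter = std::vector<uint8_t>;

// Hypothetical stand-in for the position delete step: clears the filter
// entry of every row whose file-level row id appears in `deleted`.
void mark_position_deletes(Filter& filter, int64_t batch_start,
                           const std::set<int64_t>& deleted) {
    for (size_t i = 0; i < filter.size(); ++i) {
        if (deleted.count(batch_start + static_cast<int64_t>(i)) > 0) {
            filter[i] = 0;
        }
    }
}

// `existing_filter` models _delete_rows_filter_ptr: non-null only when a
// transactional-delete filter was built for this batch (an assumption).
Filter effective_filter(const Filter* existing_filter, size_t rows,
                        int64_t batch_start, const std::set<int64_t>& deleted) {
    // No existing filter: start from "keep all rows", as in the else branch.
    Filter filter = existing_filter != nullptr ? *existing_filter : Filter(rows, 1);
    mark_position_deletes(filter, batch_start, deleted);
    return filter;
}
```

For a three-row batch starting at row 10 with no existing filter and row 11 deleted, `effective_filter` yields `{1, 0, 1}`. Starting from all ones means the same filtering call can be used in both branches.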

@AshinGau
Member

LGTM

@morningman merged commit 788abf2 into apache:master on May 15, 2024
hubgeter added a commit to hubgeter/doris that referenced this pull request May 15, 2024
…rm position delete when reading the orc file without a predicate. (apache#34814)

Fix this bug:
Scenario: an Iceberg table uses the ORC storage format.
SQL: `select * from iceberg_orc;`
When this SQL is executed, the position delete filter is not applied.
morningman pushed a commit that referenced this pull request May 15, 2024
…rm position delete when reading the orc file without a predicate. (#34814) (#34882)

bp #34814

Labels

approved (indicates a PR has been approved by one committer), dev/2.1.3-merged, dev/3.0.0-merged, reviewed

5 participants