
@keanji-x
Contributor

@keanji-x keanji-x commented May 13, 2024

Proposed changes

Lazily get the ExpressionMap when comparing hypergraphs, so the mapping is built only when it is actually needed.
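
A minimal sketch of the lazy-evaluation idea described above, assuming a memoizing `Supplier` wrapper. The class `LazyExpressionMapSketch`, the `memoize` helper, and the string-keyed map are hypothetical stand-ins for illustration and are not the actual Doris classes; the real change works on `Expression` mappings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch; names are illustrative and not taken from Doris. */
final class LazyExpressionMapSketch {

    /** Wraps a supplier so the delegate runs at most once (not thread-safe). */
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean computed;

            @Override
            public T get() {
                if (!computed) {
                    value = delegate.get();
                    computed = true;
                }
                return value;
            }
        };
    }

    /** Counts how often the mapping is built; for demonstration only. */
    static int buildCount = 0;

    /** Stand-in for an expensive query-to-view expression mapping. */
    static Map<String, String> buildQueryToViewMapping() {
        buildCount++;
        Map<String, String> mapping = new HashMap<>();
        mapping.put("q.l_orderkey = q.o_orderkey", "v.l_orderkey = v.o_orderkey");
        return mapping;
    }

    public static void main(String[] args) {
        Supplier<Map<String, String>> lazyMapping =
                memoize(LazyExpressionMapSketch::buildQueryToViewMapping);
        System.out.println("builds before first use: " + buildCount);
        lazyMapping.get();
        lazyMapping.get();
        System.out.println("builds after two lookups: " + buildCount);
    }
}
```

If a comparison never asks for the mapping, the build step is skipped entirely; repeated lookups reuse the first result.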

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@keanji-x
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39939 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 797161beeb0b45defc4a9786f69d15cad1ea168e, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17625	4286	4248	4248
q2	2012	189	193	189
q3	10501	1276	1155	1155
q4	10457	818	784	784
q5	7487	2741	2636	2636
q6	218	134	132	132
q7	958	531	529	529
q8	9282	2096	2084	2084
q9	9325	6662	6611	6611
q10	9702	3740	3690	3690
q11	473	239	239	239
q12	506	211	215	211
q13	17777	2950	2946	2946
q14	263	218	225	218
q15	514	476	476	476
q16	507	387	378	378
q17	978	646	764	646
q18	8087	7360	7377	7360
q19	5244	1491	1490	1490
q20	655	306	317	306
q21	5053	3332	3964	3332
q22	362	279	282	279
Total cold run time: 117986 ms
Total hot run time: 39939 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4396	4221	4216	4216
q2	374	278	281	278
q3	2988	2757	2724	2724
q4	1927	1616	1638	1616
q5	5266	5279	5230	5230
q6	209	123	123	123
q7	1716	1372	1357	1357
q8	3185	3305	3318	3305
q9	8357	8322	8293	8293
q10	3861	3693	3686	3686
q11	570	495	498	495
q12	784	606	619	606
q13	16459	2946	2964	2946
q14	307	272	259	259
q15	512	477	493	477
q16	456	417	410	410
q17	1763	1499	1457	1457
q18	7487	7422	7319	7319
q19	1678	1543	1574	1543
q20	1979	1754	1757	1754
q21	4910	4941	4963	4941
q22	583	507	538	507
Total cold run time: 69767 ms
Total hot run time: 53542 ms

@doris-robot

TPC-DS: Total hot run time: 186290 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 797161beeb0b45defc4a9786f69d15cad1ea168e, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	909	361	345	345
query2	6448	2429	2322	2322
query3	6658	205	212	205
query4	23162	21094	21071	21071
query5	4125	413	435	413
query6	251	174	169	169
query7	4588	294	296	294
query8	244	190	194	190
query9	8381	2431	2416	2416
query10	422	251	263	251
query11	14674	14356	14135	14135
query12	132	90	85	85
query13	1650	376	355	355
query14	9840	8490	6940	6940
query15	279	167	169	167
query16	8045	261	255	255
query17	1888	546	569	546
query18	2008	284	266	266
query19	241	147	148	147
query20	94	85	89	85
query21	187	134	123	123
query22	5200	4873	4919	4873
query23	34332	33579	33690	33579
query24	11873	2873	2793	2793
query25	631	377	366	366
query26	1733	153	153	153
query27	3018	324	327	324
query28	7609	2081	2047	2047
query29	989	642	592	592
query30	294	147	154	147
query31	974	745	731	731
query32	91	53	54	53
query33	752	247	239	239
query34	1055	479	489	479
query35	821	681	653	653
query36	1077	909	920	909
query37	141	66	73	66
query38	2885	2776	2771	2771
query39	1616	1541	1570	1541
query40	271	121	123	121
query41	42	40	38	38
query42	102	98	95	95
query43	567	532	530	530
query44	1233	727	730	727
query45	266	255	250	250
query46	1103	725	743	725
query47	1985	1919	1887	1887
query48	363	301	305	301
query49	1188	410	405	405
query50	783	386	393	386
query51	6841	6573	6587	6573
query52	106	88	89	88
query53	350	285	273	273
query54	980	451	442	442
query55	74	73	70	70
query56	253	226	237	226
query57	1279	1156	1189	1156
query58	233	225	212	212
query59	3377	3305	3233	3233
query60	267	246	240	240
query61	106	104	107	104
query62	672	464	463	463
query63	321	285	282	282
query64	9818	7490	7454	7454
query65	3172	3132	3107	3107
query66	1386	353	349	349
query67	15489	15438	15089	15089
query68	6122	546	529	529
query69	549	301	311	301
query70	1128	1129	1166	1129
query71	454	263	265	263
query72	7666	2540	2388	2388
query73	744	332	324	324
query74	6571	6178	6029	6029
query75	3843	2656	2630	2630
query76	3338	950	971	950
query77	620	260	260	260
query78	10756	10215	9995	9995
query79	3695	507	515	507
query80	1292	430	424	424
query81	502	222	219	219
query82	1036	95	100	95
query83	198	164	169	164
query84	271	89	126	89
query85	1465	261	257	257
query86	470	292	293	292
query87	3338	3146	3103	3103
query88	4390	2439	2428	2428
query89	494	374	385	374
query90	1984	188	184	184
query91	121	95	94	94
query92	58	50	47	47
query93	4634	505	488	488
query94	1159	174	181	174
query95	409	296	297	296
query96	598	273	264	264
query97	3202	2972	2988	2972
query98	246	220	209	209
query99	1318	890	891	890
Total cold run time: 294615 ms
Total hot run time: 186290 ms

@keanji-x
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41725 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit de229c5b5879e8a3f36c92176c9496ca58ce6691, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	6742	4241	4234	4234
q2	952	183	204	183
q3	5293	1121	1252	1121
q4	1010	816	776	776
q5	2625	2677	2642	2642
q6	223	136	136	136
q7	1075	737	691	691
q8	2012	2107	2095	2095
q9	6857	6818	6745	6745
q10	4084	3905	3934	3905
q11	384	249	253	249
q12	387	224	220	220
q13	16342	2943	3080	2943
q14	269	221	221	221
q15	524	466	488	466
q16	508	416	392	392
q17	964	721	722	721
q18	8300	7895	7906	7895
q19	2439	1567	1525	1525
q20	933	313	322	313
q21	10871	3978	4140	3978
q22	356	274	293	274
Total cold run time: 73150 ms
Total hot run time: 41725 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4367	4372	4440	4372
q2	385	257	263	257
q3	3137	2880	2777	2777
q4	1878	1598	1556	1556
q5	5350	5287	5257	5257
q6	211	120	123	120
q7	2207	1824	1922	1824
q8	3177	3334	3297	3297
q9	8386	8356	8333	8333
q10	3821	3665	3644	3644
q11	579	473	477	473
q12	757	588	595	588
q13	3535	2930	2953	2930
q14	270	254	250	250
q15	514	473	478	473
q16	450	414	412	412
q17	1739	1487	1476	1476
q18	7536	7561	7304	7304
q19	1639	1560	1583	1560
q20	1982	1776	1802	1776
q21	4963	4732	4744	4732
q22	578	477	480	477
Total cold run time: 57461 ms
Total hot run time: 53888 ms

@doris-robot

TPC-DS: Total hot run time: 187401 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit de229c5b5879e8a3f36c92176c9496ca58ce6691, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	912	362	342	342
query2	6446	2312	2165	2165
query3	6654	203	211	203
query4	22826	21218	21123	21123
query5	4118	422	416	416
query6	254	166	169	166
query7	4591	286	298	286
query8	240	183	203	183
query9	8614	2396	2365	2365
query10	432	259	246	246
query11	14826	14200	14124	14124
query12	126	91	85	85
query13	1637	368	367	367
query14	9243	7716	8399	7716
query15	248	170	166	166
query16	8133	262	259	259
query17	1715	566	531	531
query18	2116	273	265	265
query19	231	148	142	142
query20	87	85	82	82
query21	195	127	132	127
query22	5056	4894	4881	4881
query23	34058	33420	33540	33420
query24	6606	2900	2878	2878
query25	557	376	355	355
query26	691	154	151	151
query27	1989	310	327	310
query28	4157	2057	2039	2039
query29	825	599	596	596
query30	251	158	150	150
query31	955	746	747	746
query32	90	55	55	55
query33	470	253	242	242
query34	851	479	498	479
query35	762	701	682	682
query36	1042	918	892	892
query37	103	73	67	67
query38	2921	2773	2802	2773
query39	1610	1616	1561	1561
query40	193	122	121	121
query41	43	40	39	39
query42	101	96	120	96
query43	570	525	512	512
query44	1064	725	733	725
query45	268	246	252	246
query46	1062	727	732	727
query47	1992	1875	1917	1875
query48	366	296	302	296
query49	864	398	393	393
query50	767	387	372	372
query51	6888	6735	6777	6735
query52	99	96	85	85
query53	352	279	277	277
query54	522	431	435	431
query55	74	70	73	70
query56	239	223	217	217
query57	1248	1145	1173	1145
query58	215	198	198	198
query59	3398	3244	3001	3001
query60	245	262	236	236
query61	95	88	84	84
query62	637	468	463	463
query63	306	278	283	278
query64	8526	7428	7397	7397
query65	3133	3112	3124	3112
query66	788	341	345	341
query67	15679	15127	15070	15070
query68	4484	542	524	524
query69	465	310	320	310
query70	1172	1138	1082	1082
query71	359	272	267	267
query72	7859	2639	2373	2373
query73	704	319	322	319
query74	6644	6662	6654	6654
query75	3334	2680	2710	2680
query76	2262	988	1036	988
query77	425	275	276	275
query78	11051	10506	10212	10212
query79	2634	518	517	517
query80	1778	446	504	446
query81	525	218	217	217
query82	708	91	94	91
query83	260	164	165	164
query84	269	80	82	80
query85	1800	277	260	260
query86	522	307	325	307
query87	3291	3168	3118	3118
query88	3981	2334	2330	2330
query89	475	371	390	371
query90	2026	180	186	180
query91	121	96	93	93
query92	56	51	50	50
query93	2954	511	498	498
query94	1193	175	180	175
query95	391	303	301	301
query96	601	269	266	266
query97	3191	2958	2993	2958
query98	232	218	214	214
query99	1222	903	917	903
Total cold run time: 273361 ms
Total hot run time: 187401 ms

@keanji-x
Contributor Author

run buildall

@keanji-x
Contributor Author

run buildall

queryToViewAllExpressionMapping.putAll(getQueryToViewFilterEdgeExpressionMapping());
return queryToViewAllExpressionMapping;
public Expression getQueryJoinExprFromView(Expression viewJoinExpr) {
return queryToViewJoinEdgeExpressionMappingSupplier.get().inverse().get(viewJoinExpr);
Contributor


`inverse()` needs to be computed on each call. Maybe we should construct a `viewToQueryJoinEdgeExpressionMapping` field in the context for performance.
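
One way to read this suggestion, as a rough sketch only: keep the view-to-query direction in its own field and build it once, instead of deriving it on every lookup. `InverseMappingSketch` and its string-keyed maps are hypothetical and only mirror the names in the comment; the sketch assumes the mapping is one-to-one and says nothing about how the PR's `BiMap` actually implements `inverse()`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of caching the inverse mapping as a field. */
final class InverseMappingSketch {
    private final Map<String, String> queryToViewJoinEdge;
    // Built on first lookup and reused afterwards (not thread-safe).
    private Map<String, String> viewToQueryJoinEdge;

    InverseMappingSketch(Map<String, String> queryToViewJoinEdge) {
        // Defensive copy so later changes by the caller do not affect the cache.
        this.queryToViewJoinEdge = new HashMap<>(queryToViewJoinEdge);
    }

    /** Returns the query-side expression for a view-side expression, or null. */
    String getQueryJoinExprFromView(String viewJoinExpr) {
        if (viewToQueryJoinEdge == null) {
            Map<String, String> inverse = new HashMap<>();
            // Assumes a one-to-one mapping; duplicate values would overwrite.
            for (Map.Entry<String, String> e : queryToViewJoinEdge.entrySet()) {
                inverse.put(e.getValue(), e.getKey());
            }
            viewToQueryJoinEdge = inverse;
        }
        return viewToQueryJoinEdge.get(viewJoinExpr);
    }

    public static void main(String[] args) {
        Map<String, String> forward = new HashMap<>();
        forward.put("q.ps_partkey = q.o_orderkey", "v.ps_partkey = v.o_orderkey");
        InverseMappingSketch ctx = new InverseMappingSketch(forward);
        System.out.println(ctx.getQueryJoinExprFromView("v.ps_partkey = v.o_orderkey"));
    }
}
```

Whether this pays off depends on how costly the existing `inverse()` call really is, which is worth measuring before adding a second field to keep in sync.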

}

public Expression getQueryFilterExprFromView(Expression viewJoinExpr) {
return queryToViewFilterEdgeExpressionMappingSupplier.get().inverse().get(viewJoinExpr);
Contributor


The same as above.

public BiMap<Expression, Expression> getQueryToViewNodeExpressionMapping() {
return queryToViewNodeExpressionMappingSupplier.get();
public Expression getQueryNodeExpFromView(Expression viewJoinExpr) {
return queryToViewNodeExpressionMappingSupplier.get().inverse().get(viewJoinExpr);
Contributor


The same as above.

@keanji-x
Contributor Author

run buildall

@github-actions
Contributor

PR approved by anyone and no changes requested.

@keanji-x changed the title from "[feat](Nereids): lazy get expression map" to "[feat](Nereids): lazy get expression map when comparing hypergraph" May 16, 2024
@morrySnow
Contributor

run performance

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 17, 2024
@morrySnow morrySnow merged commit 5a2a013 into apache:master May 17, 2024
starocean999 pushed a commit that referenced this pull request Dec 3, 2024
…by materialized view fail (#44575)

For example, suppose the MV definition is as follows and the query is identical.
Both contain the same filter `l_orderkey is null or l_orderkey <> 1`, but it
appears at two different positions. This caused the rewrite to fail; this PR
fixes it.
```sql
select 
  o_custkey, 
  o_orderdate, 
  o_shippriority, 
  o_comment, 
  o_orderkey, 
  orders.public_col as col1, 
  l_orderkey, 
  l_partkey, 
  l_suppkey, 
  lineitem.public_col as col2, 
  ps_partkey, 
  ps_suppkey, 
  partsupp.public_col as col3, 
  partsupp.public_col * 2 as col4, 
  o_orderkey + l_orderkey + ps_partkey * 2, 
  sum(
    o_orderkey + l_orderkey + ps_partkey * 2
  ), 
  count() as count_all 
from 
  (
    select 
      o_custkey, 
      o_orderdate, 
      o_shippriority, 
      o_comment, 
      o_orderkey, 
      orders.public_col as public_col 
    from 
      orders
  ) orders 
  left join (
    select 
      l_orderkey, 
      l_partkey, 
      l_suppkey, 
      lineitem.public_col as public_col 
    from 
      lineitem 
    where 
      l_orderkey is null 
      or l_orderkey <> 1
  ) lineitem on l_orderkey = o_orderkey 
  inner join (
    select 
      ps_partkey, 
      ps_suppkey, 
      partsupp.public_col as public_col 
    from 
      partsupp
  ) partsupp on ps_partkey = o_orderkey 
  and ps_suppkey = o_custkey 
where 
  l_orderkey is null 
  or l_orderkey <> 1 
group by 
  1, 
  2, 
  3, 
  4, 
  5, 
  6, 
  7, 
  8, 
  9, 
  10, 
  11, 
  12, 
  13, 
  14;
```
Related PR: #34753 
Fix the failure to rewrite by materialized view when the same filter appears
at different positions.
github-actions bot pushed a commit that referenced this pull request Dec 3, 2024
…by materialized view fail (#44575)

github-actions bot pushed a commit that referenced this pull request Dec 3, 2024
…by materialized view fail (#44575)


Labels

approved Indicates a PR has been approved by one committer. dev/2.1.4-merged dev/3.0.0-merged reviewed


5 participants