@github-actions github-actions bot commented Jan 6, 2025

Cherry-picked from #46099

…rimming SQL input (#46099)

- Currently, the SQL cache in Doris can miss because semantically
identical queries are treated as different when they differ only in:
  - Extra whitespace characters in the SQL query
  - SQL comments that don't affect query execution
- For example, these queries are semantically identical but would
generate different cache keys:
  ```sql
  SELECT * FROM table;
  -- Same query with comments and extra spaces
  /* Comment */  SELECT   *   FROM   table  ;
  ```
- This PR improves the SQL cache hit rate by:
  - Trimming whitespace from SQL queries
  - Removing SQL comments before calculating the cache key MD5
- As a result, queries that are semantically identical and differ
only in whitespace or comments hit the same cache entry,
improving cache efficiency and reducing unnecessary query executions.
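
The normalization described above could look roughly like the following Python sketch. It is illustrative only and is not the Doris implementation (the real change lives in the Doris frontend). The function names `strip_sql_comments` and `cache_key` are hypothetical, and the scanner assumes `--` line comments, `/* */` block comments, and quote-delimited literals with backslash escapes. It does not special-case hint-style comments, and it only trims leading and trailing whitespace, so internal runs of spaces are left as they are.

```python
import hashlib


def strip_sql_comments(sql: str) -> str:
    """Remove -- and /* */ comments that appear outside quoted text (sketch)."""
    out = []
    i, n = 0, len(sql)
    quote = None  # active quote character, or None when outside a literal
    while i < n:
        ch = sql[i]
        if quote is not None:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy a backslash-escaped character verbatim.
                out.append(sql[i + 1])
                i += 2
                continue
            if ch == quote:
                quote = None
            i += 1
        elif ch in ("'", '"', "`"):
            quote = ch
            out.append(ch)
            i += 1
        elif sql.startswith("--", i):
            # Line comment: skip to the end of the line, keep the newline.
            end = sql.find("\n", i)
            i = n if end == -1 else end
        elif sql.startswith("/*", i):
            # Block comment: skip past the closing */ (or to the end if unclosed).
            end = sql.find("*/", i + 2)
            i = n if end == -1 else end + 2
            out.append(" ")  # keep neighbouring tokens separated
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def cache_key(sql: str) -> str:
    """MD5 hex digest of the comment-free, trimmed query text (hypothetical key)."""
    normalized = strip_sql_comments(sql).strip()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()
```

With this sketch, `cache_key("SELECT 1")` and `cache_key("  /* c */ SELECT 1 -- trailing\n")` produce the same digest, while comment markers inside string literals such as `'--x'` are left untouched.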

Thearas commented Jan 6, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Jan 6, 2025
@dataroaring dataroaring reopened this Jan 6, 2025

Thearas commented Jan 6, 2025

run buildall

@924060929

run buildall

@doris-robot

TPC-H: Total hot run time: 40799 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f1dd2005d5b56a63900fa77124fe7efc0255c509, data reload: false

------ Round 1 ----------------------------------
q1	17584	7343	7265	7265
q2	2051	182	190	182
q3	11022	1072	1273	1072
q4	10544	718	839	718
q5	7737	2832	2821	2821
q6	239	149	144	144
q7	981	613	617	613
q8	9369	1966	2048	1966
q9	6659	6386	6403	6386
q10	7039	2297	2359	2297
q11	477	263	259	259
q12	410	212	210	210
q13	17796	3016	3004	3004
q14	241	222	210	210
q15	551	511	508	508
q16	707	624	615	615
q17	975	609	542	542
q18	7227	6717	6775	6717
q19	1404	1059	1001	1001
q20	476	208	193	193
q21	4028	3075	3167	3075
q22	1126	1001	1007	1001
Total cold run time: 108643 ms
Total hot run time: 40799 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7219	7258	7202	7202
q2	318	230	235	230
q3	2905	2893	2936	2893
q4	2046	1852	1775	1775
q5	5681	5761	5730	5730
q6	222	137	142	137
q7	2279	1786	1856	1786
q8	3361	3524	3567	3524
q9	8815	8911	8879	8879
q10	3605	3547	3506	3506
q11	610	526	501	501
q12	844	602	582	582
q13	9208	3244	3206	3206
q14	314	292	274	274
q15	583	523	524	523
q16	707	673	704	673
q17	1828	1634	1617	1617
q18	8314	7864	7555	7555
q19	1663	1603	1610	1603
q20	2115	1885	1872	1872
q21	5505	5306	5416	5306
q22	1170	1042	1059	1042
Total cold run time: 69312 ms
Total hot run time: 60416 ms

@doris-robot

TPC-DS: Total hot run time: 197501 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f1dd2005d5b56a63900fa77124fe7efc0255c509, data reload: false

query1	1318	928	945	928
query2	6263	2064	2076	2064
query3	10957	4313	4339	4313
query4	66686	28818	23472	23472
query5	4997	462	470	462
query6	419	171	165	165
query7	5652	311	319	311
query8	308	236	218	218
query9	9096	2652	2646	2646
query10	459	263	251	251
query11	17541	15327	15859	15327
query12	156	104	104	104
query13	1534	458	441	441
query14	10488	7326	7319	7319
query15	199	184	178	178
query16	7245	580	505	505
query17	1117	571	589	571
query18	1823	329	312	312
query19	231	155	155	155
query20	113	109	109	109
query21	203	102	108	102
query22	4887	4317	4438	4317
query23	34459	34145	34652	34145
query24	6166	2851	2904	2851
query25	517	415	401	401
query26	644	172	177	172
query27	1932	353	352	352
query28	4214	2441	2405	2405
query29	700	447	448	447
query30	240	159	156	156
query31	998	822	807	807
query32	65	52	56	52
query33	442	290	293	290
query34	940	520	524	520
query35	838	728	730	728
query36	1115	984	991	984
query37	122	76	78	76
query38	4187	4095	4084	4084
query39	1530	1496	1469	1469
query40	207	100	98	98
query41	49	47	50	47
query42	118	100	104	100
query43	541	504	519	504
query44	1183	806	824	806
query45	189	165	178	165
query46	1149	737	739	737
query47	2056	1900	1938	1900
query48	496	400	382	382
query49	752	396	400	396
query50	839	437	431	431
query51	7204	7352	7155	7155
query52	105	90	95	90
query53	269	187	181	181
query54	572	465	455	455
query55	80	78	79	78
query56	285	243	249	243
query57	1243	1135	1103	1103
query58	226	221	226	221
query59	3191	3286	2861	2861
query60	283	258	253	253
query61	107	108	119	108
query62	791	686	656	656
query63	216	190	201	190
query64	1367	656	627	627
query65	3266	3232	3206	3206
query66	706	298	300	298
query67	15823	15555	15540	15540
query68	3598	604	572	572
query69	450	265	264	264
query70	1166	1069	1112	1069
query71	356	251	267	251
query72	6190	4051	3982	3982
query73	743	345	347	345
query74	10040	9038	8964	8964
query75	3399	2679	2654	2654
query76	1856	1080	1058	1058
query77	508	277	269	269
query78	10565	9586	9689	9586
query79	1176	604	582	582
query80	847	411	436	411
query81	505	242	240	240
query82	1286	115	116	115
query83	171	141	149	141
query84	281	82	85	82
query85	819	293	286	286
query86	338	290	310	290
query87	4563	4422	4228	4228
query88	3495	2398	2345	2345
query89	411	291	293	291
query90	2031	188	190	188
query91	180	146	164	146
query92	63	49	49	49
query93	1284	538	546	538
query94	787	290	291	290
query95	352	256	252	252
query96	616	286	285	285
query97	3313	3215	3195	3195
query98	219	192	195	192
query99	1584	1325	1301	1301
Total cold run time: 317306 ms
Total hot run time: 197501 ms

@doris-robot

ClickBench: Total hot run time: 33.64 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f1dd2005d5b56a63900fa77124fe7efc0255c509, data reload: false

query1	0.03	0.03	0.03
query2	0.06	0.03	0.03
query3	0.24	0.06	0.06
query4	1.62	0.10	0.11
query5	0.51	0.50	0.50
query6	1.14	0.72	0.71
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.55	0.50	0.50
query10	0.54	0.54	0.55
query11	0.14	0.10	0.09
query12	0.14	0.11	0.13
query13	0.61	0.61	0.60
query14	3.04	2.96	3.10
query15	0.89	0.83	0.82
query16	0.38	0.38	0.38
query17	1.01	1.06	1.00
query18	0.24	0.21	0.22
query19	1.85	1.81	2.00
query20	0.01	0.02	0.01
query21	15.35	0.55	0.58
query22	2.71	2.85	2.43
query23	16.83	1.12	0.80
query24	3.25	1.09	1.51
query25	0.35	0.12	0.12
query26	0.38	0.13	0.13
query27	0.06	0.04	0.04
query28	10.28	1.12	1.08
query29	12.58	3.31	3.25
query30	0.25	0.06	0.06
query31	2.85	0.37	0.37
query32	3.26	0.46	0.47
query33	2.99	3.00	3.04
query34	16.90	4.45	4.57
query35	4.54	4.57	4.53
query36	0.68	0.48	0.50
query37	0.09	0.06	0.06
query38	0.05	0.04	0.03
query39	0.04	0.02	0.02
query40	0.16	0.13	0.12
query41	0.09	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.82 s
Total hot run time: 33.64 s

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 3dfb51e into branch-3.0 Mar 20, 2025
22 of 23 checks passed
@github-actions github-actions bot deleted the auto-pick-46099-branch-3.0 branch March 20, 2025 07:01