@github-actions
Contributor

Cherry-picked from #50149

…50149)

### What problem does this PR solve?

Problem Summary:

```sql
SELECT t1.no
      ,t1.sub_str
      ,t1.str
      ,substring_index(t1.str, t1.sub_str, -1)
      ,t2.rst2
FROM (
       SELECT 1 AS no, 'BBB' AS sub_str, 'AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06' AS str UNION ALL
       SELECT 2 AS no, 'ccc' AS sub_str, 'zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06' AS str UNION ALL
       SELECT 3 AS no, 'DDD' AS sub_str, 'AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06' AS str UNION ALL
       SELECT 4 AS no, 'DDD' AS sub_str, 'sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06' AS str UNION ALL
       SELECT 5 AS no, 'eee' AS sub_str, 'cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06' AS str UNION ALL
       SELECT 6 AS no, 'A_01' AS sub_str, 'AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06' AS str 
     ) t1
LEFT JOIN (
       SELECT 1 AS no, 'BBB' AS sub_str,  substring_index('AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06', 'BBB', -1) AS rst2 UNION ALL
       SELECT 2 AS no, 'ccc' AS sub_str,  substring_index('zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06', 'ccc', -1) AS rst2 UNION ALL
       SELECT 3 AS no, 'DDD' AS sub_str,  substring_index('AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06', 'DDD', -1) AS rst2 UNION ALL
       SELECT 4 AS no, 'DDD' AS sub_str,  substring_index('sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06', 'DDD', -1) AS rst2 UNION ALL
       SELECT 5 AS no, 'eee' AS sub_str,  substring_index('cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06', 'eee', -1) AS rst2 UNION ALL
       SELECT 6 AS no, 'A_01' AS sub_str, substring_index('AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06', 'A_01', -1) AS rst2 
     ) t2
     ON t1.no = t2.no AND t1.sub_str = t2.sub_str
ORDER BY t1.no;
```

Previous results in Doris (an error on master, wrong values on 2.1):
```
-- master
function except for the first argument, other parameter must be a constant

-- 2.1
+------+---------+-------------------------------------------+-------------------------------------------------+-------------------------------------------+
| no   | sub_str | str                                       | substring_index(`t1`.`str`, `t1`.`sub_str`, -1) | rst2                                      |
+------+---------+-------------------------------------------+-------------------------------------------------+-------------------------------------------+
|    1 | BBB     | AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06 | _02|CCC_03|DDD_04|EEE_05|FFF_06                 | _02|CCC_03|DDD_04|EEE_05|FFF_06           |
|    2 | ccc     | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06       | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 |
|    3 | DDD     | AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06 | _02|CCC_03|DDD_04|EEE_05|FFF_06                 | _04|EEE_05|FFF_06                         |
|    4 | DDD     | sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06 | sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06       | _04|rfv_05|rgb_06                         |
|    5 | eee     | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06       | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 |
|    6 | A_01    | AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06 | AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06       | |dsd_02|ert_03|bgt_04|fgh_05|hyb_06       |
+------+---------+-------------------------------------------+-------------------------------------------------+-------------------------------------------+
```

MySQL result:
```
+----+---------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| no | sub_str | str                                       | substring_index(t1.str, t1.sub_str, -1)   | rst2                                      |
+----+---------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
|  1 | BBB     | AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06 | _02|CCC_03|DDD_04|EEE_05|FFF_06           | _02|CCC_03|DDD_04|EEE_05|FFF_06           |
|  2 | ccc     | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 |
|  3 | DDD     | AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06 | _04|EEE_05|FFF_06                         | _04|EEE_05|FFF_06                         |
|  4 | DDD     | sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06 | _04|rfv_05|rgb_06                         | _04|rfv_05|rgb_06                         |
|  5 | eee     | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 |
|  6 | A_01    | AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06 | |dsd_02|ert_03|bgt_04|fgh_05|hyb_06       | |dsd_02|ert_03|bgt_04|fgh_05|hyb_06       |
+----+---------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
```

Current Doris results:
```
+------+---------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
| no   | sub_str | str                                       | substring_index(t1.str, t1.sub_str, -1)   | rst2                                      |
+------+---------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
|    1 | BBB     | AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06 | _02|CCC_03|DDD_04|EEE_05|FFF_06           | _02|CCC_03|DDD_04|EEE_05|FFF_06           |
|    2 | ccc     | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 | zyz_01|zyz_02|CCC_03|qwe_04|qwe_05|qwe_06 |
|    3 | DDD     | AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06 | _04|EEE_05|FFF_06                         | _04|EEE_05|FFF_06                         |
|    4 | DDD     | sgr_01|wsc_02|CCC_03|DDD_04|rfv_05|rgb_06 | _04|rfv_05|rgb_06                         | _04|rfv_05|rgb_06                         |
|    5 | eee     | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 | cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06 |
|    6 | A_01    | AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06 | |dsd_02|ert_03|bgt_04|fgh_05|hyb_06       | |dsd_02|ert_03|bgt_04|fgh_05|hyb_06       |
+------+---------+-------------------------------------------+-------------------------------------------+-------------------------------------------+
```

With this change, Doris accepts a non-constant delimiter in `substring_index` and returns the same results as MySQL.
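
For reference, the expected values in the MySQL table can be reproduced with a small model of the documented `SUBSTRING_INDEX` semantics: a case-sensitive search for the delimiter, evaluated per row. The Python below is only an illustrative sketch, not the Doris implementation; the function name `substring_index`, the `rows` list, and the handling of an empty delimiter are assumptions made for this example, and overlapping-delimiter edge cases are not claimed to match MySQL exactly.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Illustrative model of SUBSTRING_INDEX(s, delim, count); not Doris code."""
    # Assumption: an empty delimiter or a zero count yields an empty string.
    if not delim or count == 0:
        return ""
    if count > 0:
        # Walk left to right to the count-th delimiter, keep what precedes it.
        start = 0
        pos = -1
        for _ in range(count):
            pos = s.find(delim, start)
            if pos == -1:
                return s  # fewer than `count` delimiters: whole string
            start = pos + len(delim)
        return s[:pos]
    # Negative count: walk right to left, keep what follows the delimiter.
    end = len(s)
    for _ in range(-count):
        end = s.rfind(delim, 0, end)
        if end == -1:
            return s  # fewer than |count| delimiters: whole string
    return s[end + len(delim):]


# Rows 3, 5 and 6 from the query above: (no, sub_str, str).
rows = [
    (3, "DDD", "AAA_01|BBB_02|CCC_03|DDD_04|EEE_05|FFF_06"),
    (5, "eee", "cdr_01|vfr_02|dfc_03|DDD_04|EEE_05|FFF_06"),
    (6, "A_01", "AAA_01|dsd_02|ert_03|bgt_04|fgh_05|hyb_06"),
]

# The delimiter is taken from each row, not fixed once for the whole column.
results = {no: substring_index(s, delim, -1) for no, delim, s in rows}
```

In this sketch row 5 comes back unchanged because the match is case-sensitive (`eee` does not match `EEE`). Evaluating row 3 with the first row's delimiter `BBB` instead of its own `DDD` gives `_02|CCC_03|DDD_04|EEE_05|FFF_06`, which is the value shown in the 2.1 output above.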
@github-actions github-actions bot requested a review from dataroaring as a code owner April 22, 2025 11:06
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Apr 22, 2025
@hello-stephen
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 40015 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 13ee91145f7b0e0b2a939a0aa1d19f4d3ac45da3, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17597	6740	6582	6582
q2	2075	177	169	169
q3	10955	1089	1215	1089
q4	10563	705	755	705
q5	7752	2792	2857	2792
q6	214	134	131	131
q7	988	636	615	615
q8	9342	1913	2027	1913
q9	6612	6440	6438	6438
q10	7041	2278	2284	2278
q11	473	264	266	264
q12	395	209	212	209
q13	17803	2991	2957	2957
q14	230	201	214	201
q15	499	462	473	462
q16	671	599	593	593
q17	985	603	553	553
q18	7075	6718	6657	6657
q19	1463	1129	1025	1025
q20	462	203	206	203
q21	4053	3265	3183	3183
q22	1132	1027	996	996
Total cold run time: 108380 ms
Total hot run time: 40015 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	6628	6575	6551	6551
q2	332	237	233	233
q3	2897	2769	2929	2769
q4	2072	1794	1795	1794
q5	5750	5752	5721	5721
q6	205	130	127	127
q7	2219	1800	1815	1800
q8	3392	3566	3506	3506
q9	8899	8871	8948	8871
q10	3575	3529	3477	3477
q11	602	505	501	501
q12	791	573	593	573
q13	8975	3113	3209	3113
q14	308	266	267	266
q15	521	481	470	470
q16	694	652	658	652
q17	1836	1622	1614	1614
q18	8222	7714	7729	7714
q19	1675	1473	1544	1473
q20	2067	1808	1841	1808
q21	5434	5356	5291	5291
q22	1157	1092	1056	1056
Total cold run time: 68251 ms
Total hot run time: 59380 ms

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit f7153b7 into branch-3.0 Apr 27, 2025
21 of 24 checks passed
@github-actions github-actions bot deleted the auto-pick-50149-branch-3.0 branch April 27, 2025 10:02