mrhhsg (Member) commented Mar 24, 2025

What problem does this PR solve?

If the two arrays share at least one non-null element, they are considered overlapping, and the result is 1.
If the two arrays share no non-null element and either array contains a null element, the result is null.
Otherwise, the result is 0.

select arrays_overlap([1, 2, 3], [1, null]);  -- result should be 1

select arrays_overlap([2, 3], [1, null]);  -- result should be null

select arrays_overlap([2, 3], [1]);   -- result should be 0
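
The three rules above can be modelled outside the engine with a small Python sketch. This is not the Doris implementation; the function name and signature are hypothetical and chosen only to mirror the SQL examples, with Python's `None` standing in for SQL null. It covers only the cases the description states (it says nothing about, for example, a whole array argument being null).

```python
from typing import Optional, Sequence


def arrays_overlap(left: Sequence[Optional[int]],
                   right: Sequence[Optional[int]]) -> Optional[int]:
    """Illustrative model of the null-aware result described above."""
    left_values = {x for x in left if x is not None}
    right_values = {x for x in right if x is not None}

    # Rule 1: a shared non-null element means the arrays overlap.
    if left_values & right_values:
        return 1

    # Rule 2: no shared element, but a null element could stand for a match,
    # so the answer is unknown (null).
    if None in left or None in right:
        return None

    # Rule 3: no shared element and no null elements.
    return 0


print(arrays_overlap([1, 2, 3], [1, None]))  # 1
print(arrays_overlap([2, 3], [1, None]))     # None
print(arrays_overlap([2, 3], [1]))           # 0
```

The three calls correspond one-to-one to the three queries above.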

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Thearas (Contributor) commented Mar 24, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

mrhhsg (Member, Author) commented Mar 24, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34233 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 79cbf2161f4e8dce06ee4924c54b57db65a733a7, data reload: false

------ Round 1 ----------------------------------
q1	26108	5166	5079	5079
q2	2098	301	164	164
q3	10388	1236	701	701
q4	10250	994	535	535
q5	7516	2376	2341	2341
q6	187	161	133	133
q7	898	725	601	601
q8	9301	1240	1125	1125
q9	6982	5182	5010	5010
q10	6868	2331	1925	1925
q11	488	267	270	267
q12	349	354	215	215
q13	17775	3687	3079	3079
q14	243	250	217	217
q15	539	473	494	473
q16	626	626	606	606
q17	583	858	346	346
q18	7542	7314	7226	7226
q19	1648	967	557	557
q20	305	326	198	198
q21	4259	3439	2487	2487
q22	1071	1015	948	948
Total cold run time: 116024 ms
Total hot run time: 34233 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5241	5155	5158	5155
q2	245	330	236	236
q3	2152	2584	2296	2296
q4	1428	1804	1495	1495
q5	4484	4441	4403	4403
q6	212	184	134	134
q7	2015	1921	1796	1796
q8	2570	2638	2547	2547
q9	7329	7211	7197	7197
q10	2987	3229	2757	2757
q11	589	519	495	495
q12	666	756	626	626
q13	3560	3962	3403	3403
q14	276	300	266	266
q15	515	465	472	465
q16	621	701	641	641
q17	1136	1588	1395	1395
q18	7816	7574	7504	7504
q19	860	823	902	823
q20	1958	1969	1849	1849
q21	5432	5193	4904	4904
q22	1084	1071	1033	1033
Total cold run time: 53176 ms
Total hot run time: 51420 ms

@doris-robot

TPC-DS: Total hot run time: 193754 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 79cbf2161f4e8dce06ee4924c54b57db65a733a7, data reload: false

query1	1418	1054	1058	1054
query2	6177	1917	1893	1893
query3	11070	4472	4468	4468
query4	52597	25154	23070	23070
query5	5166	567	484	484
query6	364	205	196	196
query7	4953	511	285	285
query8	347	255	237	237
query9	6170	2602	2631	2602
query10	437	324	268	268
query11	15480	15119	14852	14852
query12	169	112	117	112
query13	1104	521	398	398
query14	10083	6338	6320	6320
query15	208	192	185	185
query16	7086	657	513	513
query17	1050	736	566	566
query18	1522	407	323	323
query19	203	216	161	161
query20	129	129	119	119
query21	209	119	109	109
query22	4733	4732	4630	4630
query23	34113	33553	33388	33388
query24	6683	2396	2441	2396
query25	457	480	426	426
query26	703	277	145	145
query27	2278	513	330	330
query28	3025	2453	2489	2453
query29	589	564	437	437
query30	266	227	190	190
query31	878	867	781	781
query32	69	64	67	64
query33	441	370	293	293
query34	757	856	542	542
query35	803	859	777	777
query36	931	1028	933	933
query37	123	105	81	81
query38	4177	4340	4141	4141
query39	1491	1436	1442	1436
query40	210	124	107	107
query41	53	53	53	53
query42	126	105	100	100
query43	502	484	489	484
query44	1332	819	823	819
query45	183	174	172	172
query46	857	1027	635	635
query47	1903	1871	1813	1813
query48	402	419	319	319
query49	725	519	449	449
query50	717	781	415	415
query51	4309	4412	4271	4271
query52	109	109	102	102
query53	224	257	178	178
query54	525	496	419	419
query55	81	81	85	81
query56	285	314	270	270
query57	1176	1223	1146	1146
query58	248	248	253	248
query59	2696	2857	2695	2695
query60	320	293	273	273
query61	160	153	145	145
query62	731	726	689	689
query63	229	183	192	183
query64	1705	1043	728	728
query65	4594	4493	4445	4445
query66	720	394	290	290
query67	16012	15928	15461	15461
query68	6815	886	504	504
query69	534	305	255	255
query70	1273	1054	1059	1054
query71	492	292	277	277
query72	5617	5120	5023	5023
query73	1305	591	348	348
query74	9032	9108	8943	8943
query75	3809	3264	2723	2723
query76	4264	1201	741	741
query77	636	361	273	273
query78	10135	10192	9303	9303
query79	3471	810	537	537
query80	649	527	449	449
query81	492	256	224	224
query82	509	180	96	96
query83	277	179	155	155
query84	292	99	75	75
query85	798	359	322	322
query86	376	309	282	282
query87	4400	4538	4450	4450
query88	3403	2234	2197	2197
query89	396	309	273	273
query90	1799	210	211	210
query91	141	150	119	119
query92	72	62	54	54
query93	2761	1061	576	576
query94	678	407	293	293
query95	355	282	278	278
query96	488	561	270	270
query97	3317	3406	3322	3322
query98	233	207	202	202
query99	1412	1397	1278	1278
Total cold run time: 299824 ms
Total hot run time: 193754 ms

@doris-robot

ClickBench: Total hot run time: 30.73 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 79cbf2161f4e8dce06ee4924c54b57db65a733a7, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.11	0.10
query3	0.25	0.18	0.19
query4	1.60	0.19	0.18
query5	0.58	0.58	0.59
query6	1.20	0.71	0.72
query7	0.03	0.02	0.02
query8	0.04	0.03	0.03
query9	0.56	0.52	0.53
query10	0.59	0.58	0.58
query11	0.15	0.11	0.11
query12	0.14	0.11	0.11
query13	0.62	0.59	0.60
query14	2.68	2.72	2.68
query15	0.94	0.86	0.84
query16	0.38	0.36	0.39
query17	1.01	1.03	1.06
query18	0.21	0.20	0.19
query19	1.89	1.93	1.84
query20	0.01	0.01	0.01
query21	15.37	0.88	0.53
query22	0.76	1.26	0.61
query23	14.93	1.39	0.67
query24	7.30	2.01	0.37
query25	0.30	0.18	0.09
query26	0.67	0.16	0.13
query27	0.06	0.05	0.05
query28	8.70	0.89	0.45
query29	12.55	4.02	3.37
query30	0.25	0.09	0.07
query31	2.83	0.59	0.40
query32	3.22	0.54	0.47
query33	3.05	3.08	3.04
query34	15.66	5.15	4.48
query35	4.53	4.49	4.54
query36	0.66	0.49	0.48
query37	0.09	0.07	0.07
query38	0.06	0.04	0.04
query39	0.03	0.02	0.02
query40	0.17	0.14	0.13
query41	0.08	0.03	0.03
query42	0.03	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 104.38 s
Total hot run time: 30.73 s

@hello-stephen (Contributor)

BE UT Coverage Report

Increment line coverage 78.18% (43/55) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 50.23% (13441/26760)
Line Coverage 39.67% (116453/293520)
Region Coverage 38.40% (59205/154186)
Branch Coverage 33.53% (29904/89194)

yiguolei added the usercase, dev/2.1.x, and dev/3.0.x labels Mar 24, 2025
2019-01-01 26823b3995ee38bd145ddd910b2f6300 ["x"]
2019-01-01 a648a447b8f71522f11632eba4b4adde ["p", "q", "r", "s", "t"]
2019-01-01 a9fb5c985c90bf05f3bee5ca3ae95260 ["u", "v"]
2019-01-01 ee27ee1da291e46403c408e220bed6e1 ["y"]

Review comment (Contributor):

We changed arrays overlap, so why was the array contains case modified?

@github-actions (Contributor)

PR approved by at least one committer and no changes requested.

github-actions bot added the approved label Mar 25, 2025

@github-actions (Contributor)

PR approved by anyone and no changes requested.

const UInt8* dst_nullmap_data, UInt8* dst_data) const {
const ColumnArrayExecutionData& right_data, UInt8* dst_nullmap_data,
UInt8* dst_data) const {
using ExecutorImpl = OverlapSetImpl<T>;

Review comment (Contributor):

Is this logic the same as in the inverted index case?

amorynan (Contributor) left a comment

LGTM

airborne12 (Member) left a comment

LGTM

yiguolei merged commit 4acae63 into apache:master Mar 29, 2025
33 of 38 checks passed
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Mar 31, 2025
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Mar 31, 2025
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Apr 1, 2025
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Apr 3, 2025
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Apr 3, 2025
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Apr 3, 2025
mrhhsg added a commit to mrhhsg/doris that referenced this pull request Apr 3, 2025
yiguolei pushed a commit that referenced this pull request Apr 4, 2025
Pick #49403
dataroaring pushed a commit that referenced this pull request Apr 9, 2025
PICK #49403
gavinchou mentioned this pull request Apr 23, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
yiguolei pushed a commit to apache/doris-website that referenced this pull request Jul 18, 2025
apache/doris#49403

## Versions 

- [x] dev
- [ ] 3.0
- [ ] 2.1
- [ ] 2.0

## Languages

- [x] Chinese
- [x] English

## Docs Checklist

- [ ] Checked by AI
- [ ] Test Cases Built
Labels

approved, dev/2.1.10-merged, dev/3.0.5-merged, reviewed, usercase

8 participants