perf: Optimize array_has_any() with scalar arg #20385

Merged
alamb merged 23 commits into apache:main from neilconway:neilc/optimize-array-has-any-scalar on Feb 24, 2026

@neilconway
Contributor

@neilconway neilconway commented Feb 16, 2026

Which issue does this PR close?

Closes #20384. See #18181 for related context.

Rationale for this change

When array_has_any is passed a scalar for either of its arguments, we can use a much faster algorithm: rather than doing O(N*M) comparisons for each row of the columnar arg, we can build a hash table on the scalar argument and probe it instead.

What changes are included in this PR?

  • Add benchmark to cover the one-scalar-arg case
  • Implement optimization as described above

Note that we fall back to a linear scan when the scalar arg is at or below a size threshold (<= 8 elements), because benchmarks suggested that probing a HashSet is not profitable for very small arrays.
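A minimal sketch of the strategy described above, not the code from this PR: the scalar side is prepared once and then probed for every row. The names `ScalarProbe`, `HASH_THRESHOLD`, and `array_has_any_scalar` are hypothetical, plain `i64` vectors stand in for Arrow list arrays, and null handling is ignored.

```rust
use std::collections::HashSet;

/// Hypothetical cutoff mirroring the "<= 8 elements" note above.
const HASH_THRESHOLD: usize = 8;

/// Lookup structure built once from the scalar argument.
enum ScalarProbe<'a> {
    /// Small scalar list: scan it linearly on every probe.
    Linear(&'a [i64]),
    /// Larger scalar list: probe a hash set instead.
    Hashed(HashSet<i64>),
}

impl<'a> ScalarProbe<'a> {
    fn new(scalar: &'a [i64]) -> Self {
        if scalar.len() <= HASH_THRESHOLD {
            ScalarProbe::Linear(scalar)
        } else {
            ScalarProbe::Hashed(scalar.iter().copied().collect())
        }
    }

    fn contains(&self, v: i64) -> bool {
        match self {
            ScalarProbe::Linear(s) => s.contains(&v),
            ScalarProbe::Hashed(h) => h.contains(&v),
        }
    }
}

/// For each row, report whether it shares at least one element with `scalar`.
fn array_has_any_scalar(rows: &[Vec<i64>], scalar: &[i64]) -> Vec<bool> {
    let probe = ScalarProbe::new(scalar);
    rows.iter()
        .map(|row| row.iter().any(|v| probe.contains(*v)))
        .collect()
}
```

With the hashed variant, each row costs roughly one probe per row element rather than one comparison per (row element, scalar element) pair, which is where the savings for large scalar lists would come from.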

Are these changes tested?

Yes. Tests pass, and the change was benchmarked.

Are there any user-facing changes?

No.

@github-actions (bot) added the "documentation" label (Improvements or additions to documentation) on Feb 16, 2026
@neilconway
Contributor Author

Benchmark results:

 group                                          base                                    target
  -----                                          ----                                    ------
  array_has_all/all_found_small_needle/10        1.10      7.3±2.65ms        ? ?/sec     1.00      6.6±0.08ms        ? ?/sec
  array_has_all/all_found_small_needle/100       1.00     14.9±0.04ms        ? ?/sec     1.02     15.2±0.09ms        ? ?/sec
  array_has_all/all_found_small_needle/500       1.00     52.0±0.11ms        ? ?/sec     1.02     53.0±0.17ms        ? ?/sec
  array_has_all/not_all_found/10                 1.00      6.3±0.05ms        ? ?/sec     1.00      6.3±0.03ms        ? ?/sec
  array_has_all/not_all_found/100                1.00     13.7±0.07ms        ? ?/sec     1.01     13.9±0.09ms        ? ?/sec
  array_has_all/not_all_found/500                1.00     46.5±0.22ms        ? ?/sec     1.03     47.8±0.33ms        ? ?/sec
  array_has_all_strings/all_found/10             1.18      5.3±0.02ms        ? ?/sec     1.00      4.5±0.01ms        ? ?/sec
  array_has_all_strings/all_found/100            1.00     14.9±0.11ms        ? ?/sec     1.00     14.8±0.05ms        ? ?/sec
  array_has_all_strings/all_found/500            1.00     56.9±0.24ms        ? ?/sec     1.00     57.1±1.13ms        ? ?/sec
  array_has_all_strings/not_all_found/10         1.00      3.9±0.03ms        ? ?/sec     1.19      4.6±0.01ms        ? ?/sec
  array_has_all_strings/not_all_found/100        1.00     13.4±0.03ms        ? ?/sec     1.00     13.3±0.19ms        ? ?/sec
  array_has_all_strings/not_all_found/500        1.00     67.5±0.12ms        ? ?/sec     1.01     67.9±0.25ms        ? ?/sec
  array_has_any/no_match/10                      1.00      7.4±0.08ms        ? ?/sec     1.00      7.4±0.04ms        ? ?/sec
  array_has_any/no_match/100                     1.00     23.7±0.06ms        ? ?/sec     1.00     23.8±0.07ms        ? ?/sec
  array_has_any/no_match/500                     1.00     96.0±0.14ms        ? ?/sec     1.01     97.0±0.19ms        ? ?/sec
  array_has_any/scalar_no_match/10               3.62      7.9±0.08ms        ? ?/sec     1.00      2.2±0.02ms        ? ?/sec
  array_has_any/scalar_no_match/100              1.17     24.3±0.11ms        ? ?/sec     1.00     20.8±0.09ms        ? ?/sec
  array_has_any/scalar_no_match/500              1.00     96.8±0.30ms        ? ?/sec     1.40    135.2±0.83ms        ? ?/sec
  array_has_any/scalar_some_match/10             6.53      6.9±0.08ms        ? ?/sec     1.00   1060.1±4.31µs        ? ?/sec
  array_has_any/scalar_some_match/100            1.37     14.8±0.10ms        ? ?/sec     1.00     10.8±0.12ms        ? ?/sec
  array_has_any/scalar_some_match/500            1.00     49.2±0.15ms        ? ?/sec     1.67     82.3±0.83ms        ? ?/sec
  array_has_any/some_match/10                    1.00      6.5±0.08ms        ? ?/sec     1.01      6.6±0.26ms        ? ?/sec
  array_has_any/some_match/100                   1.00     14.9±0.07ms        ? ?/sec     1.01     15.0±0.04ms        ? ?/sec
  array_has_any/some_match/500                   1.00     52.1±0.11ms        ? ?/sec     1.01     52.9±0.12ms        ? ?/sec
  array_has_any_scalar/i64_no_match/1            15.87     6.1±0.05ms        ? ?/sec     1.00    386.8±1.76µs        ? ?/sec
  array_has_any_scalar/i64_no_match/10           12.87     6.4±0.04ms        ? ?/sec     1.00    496.2±3.32µs        ? ?/sec
  array_has_any_scalar/i64_no_match/100          18.87     9.8±0.15ms        ? ?/sec     1.00    520.2±5.31µs        ? ?/sec
  array_has_any_scalar/i64_no_match/1000         114.06    69.1±0.79ms        ? ?/sec    1.00    605.9±9.78µs        ? ?/sec
  array_has_any_scalar/string_no_match/1         12.91     3.8±0.01ms        ? ?/sec     1.00    290.8±1.47µs        ? ?/sec
  array_has_any_scalar/string_no_match/10        7.09      5.7±0.09ms        ? ?/sec     1.00    801.0±7.75µs        ? ?/sec
  array_has_any_scalar/string_no_match/100       30.32    29.3±0.23ms        ? ?/sec     1.00   967.6±15.09µs        ? ?/sec
  array_has_any_scalar/string_no_match/1000      323.46   311.0±2.99ms        ? ?/sec    1.00   961.5±14.46µs        ? ?/sec
  array_has_any_strings/no_match/10              1.01      5.0±0.04ms        ? ?/sec     1.00      4.9±0.02ms        ? ?/sec
  array_has_any_strings/no_match/100             1.05     22.2±0.18ms        ? ?/sec     1.00     21.1±0.03ms        ? ?/sec
  array_has_any_strings/no_match/500             1.05    133.6±0.58ms        ? ?/sec     1.00    127.7±0.33ms        ? ?/sec
  array_has_any_strings/scalar_no_match/10       7.02      6.6±0.02ms        ? ?/sec     1.00    939.6±2.90µs        ? ?/sec
  array_has_any_strings/scalar_no_match/100      3.11     26.5±0.11ms        ? ?/sec     1.00      8.5±0.04ms        ? ?/sec
  array_has_any_strings/scalar_no_match/500      1.38    166.5±1.60ms        ? ?/sec     1.00    120.6±0.51ms        ? ?/sec
  array_has_any_strings/scalar_some_match/10     5.42      5.5±0.03ms        ? ?/sec     1.00   1010.9±5.30µs        ? ?/sec
  array_has_any_strings/scalar_some_match/100    1.93     15.5±0.05ms        ? ?/sec     1.00      8.0±0.09ms        ? ?/sec
  array_has_any_strings/scalar_some_match/500    1.00     57.0±0.18ms        ? ?/sec     1.15     65.4±0.93ms        ? ?/sec
  array_has_any_strings/some_match/10            1.01      4.3±0.01ms        ? ?/sec     1.00      4.3±0.01ms        ? ?/sec
  array_has_any_strings/some_match/100           1.07     14.6±0.06ms        ? ?/sec     1.00     13.6±0.03ms        ? ?/sec
  array_has_any_strings/some_match/500           1.04     53.7±0.17ms        ? ?/sec     1.00     51.8±0.20ms        ? ?/sec
  array_has_i64/found/10                         1.00    709.5±4.90µs        ? ?/sec     1.01    713.5±4.85µs        ? ?/sec
  array_has_i64/found/100                        1.00  1146.7±45.85µs        ? ?/sec     1.04  1190.1±80.82µs        ? ?/sec
  array_has_i64/found/500                        1.02      4.7±0.16ms        ? ?/sec     1.00      4.6±0.11ms        ? ?/sec
  array_has_i64/not_found/10                     1.01    696.9±5.06µs        ? ?/sec     1.00    692.3±2.63µs        ? ?/sec
  array_has_i64/not_found/100                    1.00  1124.8±41.63µs        ? ?/sec     1.12  1254.4±117.95µs        ? ?/sec
  array_has_i64/not_found/500                    1.00      4.6±0.18ms        ? ?/sec     1.00      4.6±0.08ms        ? ?/sec
  array_has_strings/found/10                     1.00   1238.9±7.20µs        ? ?/sec     1.00   1238.2±6.99µs        ? ?/sec
  array_has_strings/found/100                    1.01      3.2±0.05ms        ? ?/sec     1.00      3.1±0.03ms        ? ?/sec
  array_has_strings/found/500                    1.01     15.5±0.31ms        ? ?/sec     1.00     15.3±0.22ms        ? ?/sec
  array_has_strings/not_found/10                 1.01    776.4±4.59µs        ? ?/sec     1.00    771.5±3.34µs        ? ?/sec
  array_has_strings/not_found/100                1.02      6.5±0.10ms        ? ?/sec     1.00      6.4±0.04ms        ? ?/sec
  array_has_strings/not_found/500                1.00     16.8±0.16ms        ? ?/sec     1.01     16.9±0.11ms        ? ?/sec

@neilconway force-pushed the neilc/optimize-array-has-any-scalar branch from cc2d735 to ef696bd on February 16, 2026 22:49
The previous implementation tested the cost of building an array_has()
`Expr` (!), not actually evaluating the array_has() operation itself.
Refactor things along the way.
@neilconway force-pushed the neilc/optimize-array-has-any-scalar branch from 83484af to b3bbf3a on February 17, 2026 16:02
@neilconway
Contributor Author

Note that the commit fixing up the benchmarks is shared with #20374 -- I can also pull that out into a separate PR, because it's a prerequisite for any performance work on these functions.
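To illustrate the benchmark problem described in the commit message above (timing the construction of an expression instead of its evaluation), here is a std-only sketch. `HasAnyExpr`, `time_construction`, and `time_evaluation` are hypothetical stand-ins; the real benchmarks use DataFusion types and a benchmark harness, neither of which is reproduced here.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Stand-in for an expression: cheap to build, comparatively expensive to evaluate.
struct HasAnyExpr<'a> {
    haystack: &'a [i64],
    needles: &'a [i64],
}

impl<'a> HasAnyExpr<'a> {
    fn evaluate(&self) -> bool {
        self.needles.iter().any(|n| self.haystack.contains(n))
    }
}

/// Times only construction: the kind of measurement the commit message warns about.
fn time_construction(haystack: &[i64], needles: &[i64]) -> Duration {
    let start = Instant::now();
    black_box(HasAnyExpr { haystack, needles });
    start.elapsed()
}

/// Builds the expression outside the timed region and times evaluation only.
fn time_evaluation(haystack: &[i64], needles: &[i64]) -> (bool, Duration) {
    let expr = HasAnyExpr { haystack, needles };
    let start = Instant::now();
    let result = black_box(expr.evaluate());
    (result, start.elapsed())
}
```

The first function would report roughly the same time no matter how large the inputs are, which is why a benchmark shaped like it cannot detect changes to the evaluation path.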

Comment thread datafusion/functions-nested/benches/array_has.rs Outdated
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
use crate::utils::make_scalar_function;

use std::any::Any;
use std::collections::HashSet;
Contributor

Do we get performance gains if we use hashbrown instead?

Contributor Author

I'm happy to benchmark, although std HashSet uses hashbrown internally these days; have we found that using hashbrown directly leads to better performance in other circumstances?

Contributor

Main point of reference is this reply:

Though it's a few years old at this point, so I don't know if things have changed since.
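For context on this exchange: the std `HashSet` shares its table implementation with hashbrown, but its default hasher (`RandomState`, currently SipHash-based) is chosen for HashDoS resistance, whereas the hashbrown crate defaults to a faster hasher. The sketch below keeps the std table and swaps only the hasher, to show that the hash function is a separately tunable part. The hand-written FNV-1a is an illustrative stand-in, not the hasher hashbrown uses, and `Fnv1a`, `FastSet`, and `build_set` are hypothetical names.

```rust
use std::collections::HashSet;
use std::hash::{BuildHasherDefault, Hasher};

/// Simple 64-bit FNV-1a hasher, used only for illustration.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.0 ^= u64::from(*b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Same table as `std::collections::HashSet`, different hash function.
type FastSet<T> = HashSet<T, BuildHasherDefault<Fnv1a>>;

/// Builds a set from a slice using the custom hasher.
fn build_set(values: &[i64]) -> FastSet<i64> {
    values.iter().copied().collect()
}
```

Whether a cheaper hasher is an acceptable trade-off depends on whether the hashed keys can be influenced by untrusted input; that question is outside what this thread settles.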

Comment thread datafusion/functions-nested/src/array_has.rs Outdated

let col_list: ArrayWrapper = col_arr.as_ref().try_into()?;
let all_col_strings = string_array_to_vec(col_list.values().as_ref());
let col_offsets: Vec<usize> = col_list.offsets().collect();
Contributor

We could probably avoid collecting the offsets by peeking the iterator instead.

Contributor Author

Interestingly, this turned out to be much slower for all but the smallest case:

array_has_any_scalar/string_no_match/1
                        time:   [97.474 µs 98.334 µs 99.302 µs]
                        change: [−15.448% −14.766% −14.081%] (p = 0.00 < 0.05)
                        Performance has improved.
array_has_any_scalar/string_no_match/10
                        time:   [298.73 µs 317.71 µs 343.54 µs]
                        change: [+32.141% +60.866% +85.741%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 20 outliers among 100 measurements (20.00%)
  3 (3.00%) low mild
  17 (17.00%) high severe
array_has_any_scalar/string_no_match/100
                        time:   [437.14 µs 455.34 µs 480.03 µs]
                        change: [+22.766% +39.379% +59.120%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 18 outliers among 100 measurements (18.00%)
  1 (1.00%) low mild
  17 (17.00%) high severe
array_has_any_scalar/string_no_match/1000
                        time:   [332.13 µs 351.25 µs 376.77 µs]
                        change: [+28.480% +54.273% +78.992%] (p = 0.00 < 0.05)
                        Performance has regressed.

I didn't dig into why; maybe dynamic dispatch because of the iterator adds a bunch of overhead? I'll leave this as-is for now.
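To make the two alternatives in this thread concrete, here is a sketch of both ways to turn list offsets into per-row `(start, end)` ranges: collecting into a `Vec` and sliding a window over it, versus walking a peekable iterator. Both function names are hypothetical and plain `usize` offsets stand in for the Arrow offset buffer. The sketch shows only the shape of the code and makes no claim about which variant is faster.

```rust
/// Collect the offsets first, then slide a window over adjacent pairs.
fn ranges_collected(offsets: impl Iterator<Item = usize>) -> Vec<(usize, usize)> {
    let offsets: Vec<usize> = offsets.collect();
    offsets.windows(2).map(|w| (w[0], w[1])).collect()
}

/// Avoid the intermediate Vec by peeking at the next offset.
fn ranges_peeked(offsets: impl Iterator<Item = usize>) -> Vec<(usize, usize)> {
    let mut iter = offsets.peekable();
    let mut ranges = Vec::new();
    while let Some(start) = iter.next() {
        // The final offset has no successor, so it only ever serves as an end.
        if let Some(&end) = iter.peek() {
            ranges.push((start, end));
        }
    }
    ranges
}
```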

    columnar_arg: &ColumnarValue,
    scalar_values: &ArrayRef,
) -> Result<ColumnarValue> {
    let scalar_strings = string_array_to_vec(scalar_values.as_ref());
Contributor

I wonder if we should defer creating this Vec: if we take the hashing path, we effectively allocate a Vec just to populate a HashSet, which seems redundant.

Contributor Author

I tried this but it didn't seem to help the benchmarks, so I'll keep things as they were for now.
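For readers following along, this is roughly what "deferring the Vec" could look like: decide on the container from the known length, then collect straight into it, so the hashing path never builds an intermediate `Vec`. This is a sketch under assumptions, not the variant that was benchmarked in this thread; `ScalarStrings`, `build_scalar_strings`, and `HASH_THRESHOLD` are hypothetical names, and `&str` items stand in for values read from a string array.

```rust
use std::collections::HashSet;

/// Hypothetical cutoff; the PR description mentions a threshold of 8 elements.
const HASH_THRESHOLD: usize = 8;

/// Either representation of the scalar strings, built directly from the source.
enum ScalarStrings<'a> {
    Small(Vec<&'a str>),
    Large(HashSet<&'a str>),
}

/// `len` is known up front (as it would be for an array), so the container is
/// chosen before any element is materialized.
fn build_scalar_strings<'a>(
    len: usize,
    values: impl Iterator<Item = &'a str>,
) -> ScalarStrings<'a> {
    if len <= HASH_THRESHOLD {
        ScalarStrings::Small(values.collect())
    } else {
        ScalarStrings::Large(values.collect())
    }
}
```

As reported above, the author did not see a benchmark improvement from this kind of change, so it is a matter of code shape more than measured speed.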

Comment thread datafusion/functions-nested/src/array_has.rs Outdated
@github-actions (bot) added the "functions" label (Changes to functions implementation) on Feb 20, 2026
@neilconway
Contributor Author

@Jefffrey Thank you for the detailed code review! 🙏 I addressed all of your comments; please let me know if you have more feedback.

Contributor

@Jefffrey Jefffrey left a comment

How do the benchmarks look now?

@neilconway
Contributor Author

Benchmarks comparing the latest version (with hashbrown) versus main:

 group                                          base                                    target
  -----                                          ----                                    ------
  array_has_any/no_match/10                      1.00      7.4±0.12ms        ? ?/sec     1.00      7.4±0.12ms        ? ?/sec
  array_has_any/no_match/100                     1.01     23.9±1.06ms        ? ?/sec     1.00     23.6±0.28ms        ? ?/sec
  array_has_any/no_match/500                     1.00     96.6±1.07ms        ? ?/sec     1.02     98.5±4.88ms        ? ?/sec
  array_has_any/scalar_no_match/10               3.89      8.5±0.64ms        ? ?/sec     1.00      2.2±0.05ms        ? ?/sec
  array_has_any/scalar_no_match/100              1.14     24.3±0.58ms        ? ?/sec     1.00     21.4±1.29ms        ? ?/sec
  array_has_any/scalar_no_match/500              1.00     96.8±0.71ms        ? ?/sec     1.47    142.4±1.39ms        ? ?/sec
  array_has_any/scalar_some_match/10             6.62      7.0±0.24ms        ? ?/sec     1.00  1051.8±10.58µs        ? ?/sec
  array_has_any/scalar_some_match/100            1.21     14.8±0.34ms        ? ?/sec     1.00     12.2±1.14ms        ? ?/sec
  array_has_any/scalar_some_match/500            1.00     53.5±5.72ms        ? ?/sec     1.73     92.3±8.25ms        ? ?/sec
  array_has_any/some_match/10                    1.08      6.9±0.55ms        ? ?/sec     1.00      6.4±0.04ms        ? ?/sec
  array_has_any/some_match/100                   1.01     14.9±0.25ms        ? ?/sec     1.00     14.8±0.24ms        ? ?/sec
  array_has_any/some_match/500                   1.00     52.3±0.14ms        ? ?/sec     1.01     52.9±2.27ms        ? ?/sec
  array_has_any_scalar/i64_no_match/1            17.81     6.7±0.61ms        ? ?/sec     1.00    375.0±4.67µs        ? ?/sec
  array_has_any_scalar/i64_no_match/10           15.74     7.0±0.57ms        ? ?/sec     1.00   444.8±10.79µs        ? ?/sec
  array_has_any_scalar/i64_no_match/100          15.98    10.2±0.70ms        ? ?/sec     1.00   638.0±42.57µs        ? ?/sec
  array_has_any_scalar/i64_no_match/1000         134.73    73.0±7.67ms        ? ?/sec    1.00   542.0±15.41µs        ? ?/sec
  array_has_any_scalar/string_no_match/1         16.10     4.1±0.03ms        ? ?/sec     1.00    254.2±2.92µs        ? ?/sec
  array_has_any_scalar/string_no_match/10        13.87     5.8±0.44ms        ? ?/sec     1.00    418.3±8.81µs        ? ?/sec
  array_has_any_scalar/string_no_match/100       56.63    32.3±2.88ms        ? ?/sec     1.00   570.6±51.41µs        ? ?/sec
  array_has_any_scalar/string_no_match/1000      691.94  319.4±19.85ms        ? ?/sec    1.00   461.6±11.67µs        ? ?/sec
  array_has_any_strings/no_match/10              1.01      5.1±0.10ms        ? ?/sec     1.00      5.0±0.09ms        ? ?/sec
  array_has_any_strings/no_match/100             1.04     23.3±1.87ms        ? ?/sec     1.00     22.4±1.09ms        ? ?/sec
  array_has_any_strings/no_match/500             1.00    133.0±1.73ms        ? ?/sec     1.03   137.4±10.46ms        ? ?/sec
  array_has_any_strings/scalar_no_match/10       7.30      6.8±0.44ms        ? ?/sec     1.00   926.6±13.70µs        ? ?/sec
  array_has_any_strings/scalar_no_match/100      3.70     29.5±2.04ms        ? ?/sec     1.00      8.0±0.06ms        ? ?/sec
  array_has_any_strings/scalar_no_match/500      1.79    164.2±1.46ms        ? ?/sec     1.00     91.9±1.24ms        ? ?/sec
  array_has_any_strings/scalar_some_match/10     6.56      5.2±0.07ms        ? ?/sec     1.00    791.9±7.79µs        ? ?/sec
  array_has_any_strings/scalar_some_match/100    2.81     15.6±0.07ms        ? ?/sec     1.00      5.6±0.09ms        ? ?/sec
  array_has_any_strings/scalar_some_match/500    3.07     57.6±1.24ms        ? ?/sec     1.00     18.7±0.26ms        ? ?/sec
  array_has_any_strings/some_match/10            1.01      4.4±0.19ms        ? ?/sec     1.00      4.3±0.06ms        ? ?/sec
  array_has_any_strings/some_match/100           1.00     15.5±1.62ms        ? ?/sec     1.00     15.5±1.31ms        ? ?/sec
  array_has_any_strings/some_match/500           1.00     59.7±5.05ms        ? ?/sec     1.02     60.7±5.82ms        ? ?/sec

@neilconway
Contributor Author

@Jefffrey Got it; the default hashbrown hash function does seem like a better choice. Interestingly, the benchmarks are significantly better in some cases. This compares the feature branch with hashbrown (target) against the std HashSet (base):

  group                                          base                                   target
  -----                                          ----                                   ------
  array_has_any/no_match/10                      1.00      7.4±0.08ms        ? ?/sec    1.00      7.4±0.04ms        ? ?/sec
  array_has_any/no_match/100                     1.00     23.0±0.05ms        ? ?/sec    1.02     23.6±0.05ms        ? ?/sec
  array_has_any/no_match/500                     1.00     91.7±0.11ms        ? ?/sec    1.05     96.2±0.53ms        ? ?/sec
  array_has_any/scalar_no_match/10               1.00      2.1±0.02ms        ? ?/sec    1.00      2.1±0.00ms        ? ?/sec
  array_has_any/scalar_no_match/100              1.00     20.3±0.07ms        ? ?/sec    1.01     20.5±0.06ms        ? ?/sec
  array_has_any/scalar_no_match/500              1.00    134.4±1.10ms        ? ?/sec    1.00    134.6±0.37ms        ? ?/sec
  array_has_any/scalar_some_match/10             1.00  1038.6±12.05µs        ? ?/sec    1.00   1036.2±4.87µs        ? ?/sec
  array_has_any/scalar_some_match/100            1.00     10.7±0.09ms        ? ?/sec    1.00     10.7±0.07ms        ? ?/sec
  array_has_any/scalar_some_match/500            1.00     83.1±0.39ms        ? ?/sec    1.00     83.2±0.40ms        ? ?/sec
  array_has_any/some_match/10                    1.01      6.5±0.03ms        ? ?/sec    1.00      6.4±0.04ms        ? ?/sec
  array_has_any/some_match/100                   1.00     14.6±0.06ms        ? ?/sec    1.01     14.8±0.05ms        ? ?/sec
  array_has_any/some_match/500                   1.00     50.1±0.13ms        ? ?/sec    1.06     52.9±0.22ms        ? ?/sec
  array_has_any_scalar/i64_no_match/1            1.00    359.7±1.46µs        ? ?/sec    1.04    373.2±2.89µs        ? ?/sec
  array_has_any_scalar/i64_no_match/10           1.91    844.9±9.22µs        ? ?/sec    1.00    441.6±9.23µs        ? ?/sec
  array_has_any_scalar/i64_no_match/100          1.59  1003.3±34.17µs        ? ?/sec    1.00   629.3±21.51µs        ? ?/sec
  array_has_any_scalar/i64_no_match/1000         1.77   955.1±12.20µs        ? ?/sec    1.00   540.2±12.02µs        ? ?/sec
  array_has_any_scalar/string_no_match/1         1.01    256.7±1.83µs        ? ?/sec    1.00    255.1±1.92µs        ? ?/sec
  array_has_any_scalar/string_no_match/10        1.97   826.3±13.46µs        ? ?/sec    1.00    420.2±8.06µs        ? ?/sec
  array_has_any_scalar/string_no_match/100       1.65   910.6±19.59µs        ? ?/sec    1.00    552.9±17.14µs        ? ?/sec
  array_has_any_scalar/string_no_match/1000      1.90   874.5±12.71µs        ? ?/sec    1.00    459.8±8.70µs        ? ?/sec
  array_has_any_strings/no_match/10              1.00      5.0±0.01ms        ? ?/sec    1.00      5.0±0.02ms        ? ?/sec
  array_has_any_strings/no_match/100             1.01     22.2±0.05ms        ? ?/sec    1.00     22.0±0.03ms        ? ?/sec
  array_has_any_strings/no_match/500             1.00    128.7±0.18ms        ? ?/sec    1.03    132.1±1.15ms        ? ?/sec
  array_has_any_strings/scalar_no_match/10       1.00    863.4±2.22µs        ? ?/sec    1.07    920.9±1.92µs        ? ?/sec
  array_has_any_strings/scalar_no_match/100      1.00      7.3±0.02ms        ? ?/sec    1.10      8.0±0.02ms        ? ?/sec
  array_has_any_strings/scalar_no_match/500      1.00     87.1±0.14ms        ? ?/sec    1.05     91.4±0.14ms        ? ?/sec
  array_has_any_strings/scalar_some_match/10     1.00    769.2±2.00µs        ? ?/sec    1.03    790.9±3.03µs        ? ?/sec
  array_has_any_strings/scalar_some_match/100    1.00      4.1±0.17ms        ? ?/sec    1.04      4.3±0.22ms        ? ?/sec
  array_has_any_strings/scalar_some_match/500    1.00     16.9±0.08ms        ? ?/sec    1.08     18.2±0.07ms        ? ?/sec
  array_has_any_strings/some_match/10            1.00      4.3±0.02ms        ? ?/sec    1.00      4.3±0.01ms        ? ?/sec
  array_has_any_strings/some_match/100           1.01     14.3±0.05ms        ? ?/sec    1.00     14.1±0.04ms        ? ?/sec
  array_has_any_strings/some_match/500           1.00     53.5±0.11ms        ? ?/sec    1.00     53.6±0.07ms        ? ?/sec

@alamb added the "performance" label (Make DataFusion faster) on Feb 24, 2026
@alamb alamb added this pull request to the merge queue Feb 24, 2026
@alamb
Contributor

alamb commented Feb 24, 2026

Thanks @Jefffrey @martin-g and @neilconway

Merged via the queue into apache:main with commit 585bbf3 Feb 24, 2026
32 checks passed
de-bgunter pushed a commit to de-bgunter/datafusion that referenced this pull request Mar 24, 2026
## Which issue does this PR close?

- Closes apache#20384.
- See apache#18181 for related context.

---------

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
