[Rust] [DataFusion] Improve performance of IN list function

The initial implementation of IN and NOT IN followed the "functional first, and then fast" approach.

There are several potential performance improvements for the IN and NOT IN implementation in DataFusion, such as optimizing for large lists (using a hash table rather than repeated comparisons) and short-circuiting the evaluation once a result is known.
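
As a rough illustration of the two ideas above, here is a minimal sketch in plain Rust. It is not DataFusion's implementation: the function name `in_list`, the constant `HASH_THRESHOLD`, and its value are hypothetical, the column is modeled as a plain `i64` slice rather than an Arrow array, and null handling is ignored entirely.

```rust
use std::collections::HashSet;

/// Hypothetical cutoff: above this many list items, build a hash set.
/// The right value would have to be found by benchmarking.
const HASH_THRESHOLD: usize = 8;

/// Evaluates `value IN (list)` (or `NOT IN` when `negated` is true)
/// for every value in `column`. Nulls are not modeled in this sketch.
fn in_list(column: &[i64], list: &[i64], negated: bool) -> Vec<bool> {
    if list.len() > HASH_THRESHOLD {
        // Large list: pay the cost of building the set once, then each
        // row is an expected O(1) lookup instead of O(list.len()) comparisons.
        let set: HashSet<i64> = list.iter().copied().collect();
        column.iter().map(|v| set.contains(v) != negated).collect()
    } else {
        // Small list: a linear scan is likely cheaper than hashing.
        // `any` short-circuits, stopping at the first matching item.
        column
            .iter()
            .map(|v| list.iter().any(|x| x == v) != negated)
            .collect()
    }
}
```

Calling `in_list(&[1, 5, 9], &[1, 2, 3], false)` marks only the first row as a match, and passing `negated = true` inverts each entry. Whether a threshold like this pays off in practice depends on the data type and list size, so it would need measuring against the existing kernel.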

There are a bunch of good ideas in the comments on this PR: https://github.com/apache/arrow/pull/9038/files

**Reporter**: [Andrew Lamb](https://issues.apache.org/jira/browse/ARROW-11182) / @alamb
#### Related issues:
- [[Rust] [DataFusion] Add support for is_in](https://github.com/apache/arrow/issues/26342) (relates to)

<sub>**Note**: *This issue was originally created as [ARROW-11182](https://issues.apache.org/jira/browse/ARROW-11182). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>
