[Rust][DataFusion] Improve performance of equality to a constant predicate support #26181

@asfimport

Description

I noticed this behavior while working on support for DictionaryArrays and wanted to capture it in a ticket in case someone has time to work on it.

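To evaluate an equality predicate against a constant, such as d1 = 'three', DataFusion effectively creates an array with the value 'three' repeated once per row and passes it to the array-to-array equality compute kernel. This is suboptimal: every evaluation allocates and fills a throwaway array as long as the input, when each element could instead be compared against the single constant directly.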

Here is what the predicate looks like:

        predicate: BinaryExpr {
            left: CastExpr {
                expr: Column {
                    name: "d1",
                },
                cast_type: Utf8,
            },
            op: Eq,
            right: Literal {
                value: Utf8("three"),
            },
        },
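
To make the cost concrete, here is a minimal sketch in plain Rust that contrasts the two strategies. It uses string slices as a stand-in for an Arrow array, and the function names `eq_via_repeated_literal` and `eq_scalar` are hypothetical, chosen for illustration; they are not Arrow or DataFusion APIs, and the real kernels also have to handle nulls and packed boolean output.

```rust
// Hypothetical stand-ins for illustration; not Arrow or DataFusion APIs.

/// Approach described in this issue: expand the literal into a
/// full-length column, then compare the two columns element by element.
fn eq_via_repeated_literal(column: &[&str], literal: &str) -> Vec<bool> {
    // One extra allocation of `column.len()` entries on every evaluation.
    let repeated: Vec<&str> = vec![literal; column.len()];
    column
        .iter()
        .zip(repeated.iter())
        .map(|(a, b)| a == b)
        .collect()
}

/// Alternative: compare each element against the scalar directly,
/// with no intermediate array.
fn eq_scalar(column: &[&str], literal: &str) -> Vec<bool> {
    column.iter().map(|value| *value == literal).collect()
}

fn main() {
    let d1 = ["one", "two", "three", "three"];
    let expanded = eq_via_repeated_literal(&d1, "three");
    let direct = eq_scalar(&d1, "three");
    // Both strategies produce the same selection mask.
    assert_eq!(expanded, direct);
    println!("{:?}", direct);
}
```

Both functions return the same mask; the difference is only that the scalar version skips building the repeated array, which is the saving this ticket asks for.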

Reporter: Andrew Lamb / @alamb
Assignee: Yordan Pavlov / @yordan-pavlov

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-10173. Please see the migration documentation for further details.
