Optimize filters to remove redundant IsNotNull checks #938

@andygrove

What is the problem the feature request solves?

I am comparing native query plans between Comet and Ballista for TPC-H q1 and noticed significant differences in both the filter expressions and the filter performance:

Comet (total filter time 7.2 seconds):

FilterExec: col_6@6 IS NOT NULL AND col_6@6 <= 1998-09-24

Ballista (total filter time 3.3 seconds):

FilterExec: l_shipdate@6 <= 10493

The differences are:

  • Comet evaluates 3 expressions (And, IsNotNull, LtEq) compared to 1 expression in Ballista (LtEq)
  • Comet compares the date to a date literal, Ballista compares to an integer literal
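
The two literals appear to denote the same date: 10493 matches 1998-09-24 when counted as days since the Unix epoch, which is how Arrow's Date32 type stores dates. The sketch below checks that arithmetic. The helper names `date_to_days` and `days_to_date` are made up for illustration, and it is an assumption that the Ballista plan is printing a Date32-style day count.

```python
from datetime import date, timedelta

# Date32-style encoding: number of days since 1970-01-01.
EPOCH = date(1970, 1, 1)


def date_to_days(d: date) -> int:
    """Convert a calendar date to a day count since the Unix epoch."""
    return (d - EPOCH).days


def days_to_date(days: int) -> date:
    """Convert a day count since the Unix epoch back to a calendar date."""
    return EPOCH + timedelta(days=days)


# The literal in the Comet plan and the literal in the Ballista plan.
comet_literal = date(1998, 9, 24)
ballista_literal = 10493

print(date_to_days(comet_literal))     # 10493
print(days_to_date(ballista_literal))  # 1998-09-24
```

If that assumption holds, the two plans compare against the same value and differ only in how the literal is typed or displayed.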

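We can likely improve Comet performance by eliding the redundant IsNotNull and And. The IsNotNull check adds no filtering here: for a row where col_6 is null, the LtEq comparison evaluates to null rather than true, so the filter drops that row anyway. I am not sure whether the date versus int literal makes a difference, but we should check.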
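
As a rough illustration of the elision, here is a minimal sketch over a toy expression tree. It is not Comet or DataFusion code: the classes `Col`, `Lit`, `IsNotNull`, `LtEq`, `And` and the function `elide_redundant_is_not_null` are hypothetical stand-ins, and the rule only recognizes `LtEq` as a null-rejecting comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class IsNotNull:
    child: object


@dataclass(frozen=True)
class LtEq:
    left: object
    right: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def conjuncts(expr):
    """Flatten nested And nodes into a list of conjuncts."""
    if isinstance(expr, And):
        return conjuncts(expr.left) + conjuncts(expr.right)
    return [expr]


def null_rejected_columns(expr):
    """Columns that must be non-null for expr to evaluate to true."""
    if isinstance(expr, LtEq):
        return {e for e in (expr.left, expr.right) if isinstance(e, Col)}
    return set()


def elide_redundant_is_not_null(pred):
    """Drop IsNotNull(col) when another conjunct already rejects null col."""
    parts = conjuncts(pred)
    rejected = set()
    for p in parts:
        rejected |= null_rejected_columns(p)
    kept = [p for p in parts
            if not (isinstance(p, IsNotNull) and p.child in rejected)]
    result = kept[0]
    for p in kept[1:]:
        result = And(result, p)
    return result


def evaluate(expr, row):
    """Evaluate with SQL three-valued logic; None stands for null."""
    if isinstance(expr, Col):
        return row[expr.name]
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, IsNotNull):
        return evaluate(expr.child, row) is not None
    if isinstance(expr, LtEq):
        l, r = evaluate(expr.left, row), evaluate(expr.right, row)
        return None if l is None or r is None else l <= r
    if isinstance(expr, And):
        l, r = evaluate(expr.left, row), evaluate(expr.right, row)
        if l is False or r is False:
            return False
        return None if l is None or r is None else True
    raise TypeError(f"unsupported expression: {expr!r}")


original = And(IsNotNull(Col("col_6")), LtEq(Col("col_6"), Lit(10493)))
optimized = elide_redundant_is_not_null(original)

rows = [{"col_6": 10000}, {"col_6": 10493}, {"col_6": 11000}, {"col_6": None}]
# A filter keeps a row only when the predicate is exactly true.
kept_original = [r for r in rows if evaluate(original, r) is True]
kept_optimized = [r for r in rows if evaluate(optimized, r) is True]
```

On these sample rows both predicates keep the same two rows, and the optimized form evaluates one expression per row instead of three. A real rule would need to cover the other null-rejecting comparison operators and decide whether the rewrite belongs in plan translation or in the native optimizer.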

Describe the potential solution

No response

Additional context

No response
