
TPC-H q10 performance regression (expression for filter with added alias is not pushed down)  #1367

@Dandandan


Describe the bug
The fastest I get on master is 10s. After #1366 it's around 7s.
This used to be <2s.
Looking at the plan, it looks like some filters are not pushed down successfully (o_orderdate@9 >= 8674 AND o_orderdate@9 < 8766 AND l_returnflag@13 = R).

#1319 added aliases during constant folding. It might be better to do this only for Projection (or at least not for Filter).

As you can see in the plan below, the Filter expression carries an AS alias, which prevents filter push down from working.

[2021-11-26T17:12:17Z DEBUG datafusion::execution::context] Optimized logical plan:
     Sort: #revenue DESC NULLS FIRST
      Projection: #customer.c_custkey, #customer.c_name, #SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS revenue, #customer.c_acctbal, #nation.n_name, #customer.c_address, #customer.c_phone, #customer.c_comment
        Aggregate: groupBy=[[#customer.c_custkey, #customer.c_name, #customer.c_acctbal, #customer.c_phone, #nation.n_name, #customer.c_address, #customer.c_comment]], aggr=[[SUM(#lineitem.l_extendedprice * Int64(1) - #lineitem.l_discount)]]
          Join: #customer.c_nationkey = #nation.n_nationkey
            Filter: #orders.o_orderdate >= Date32("8674") AND #orders.o_orderdate < Date32("8766") AND #lineitem.l_returnflag = Utf8("R") AS orders.o_orderdate >= CAST(Utf8("1993-10-01") AS Date32) AND orders.o_orderdate < CAST(Utf8("1994-01-01") AS Date32) AND lineitem.l_returnflag = Utf8("R")
              Join: #orders.o_orderkey = #lineitem.l_orderkey
                Join: #customer.c_custkey = #orders.o_custkey
                  TableScan: customer projection=Some([0, 1, 2, 3, 4, 5, 7])
                  TableScan: orders projection=Some([0, 1, 4])
                TableScan: lineitem projection=Some([0, 5, 6, 8])
            TableScan: nation projection=Some([0, 1])
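
Why the alias on the Filter line blocks push down can be sketched with a toy model. This is only an illustration under assumptions: the `Expr` and `Op` types and the helpers `split_conjunction`, `unalias`, and `q10_predicate` are hypothetical and are not DataFusion's actual types or optimizer code. The idea is that push down typically splits a predicate on its top-level ANDs so each part can move below a join on its own. If the whole predicate is wrapped in an alias, the splitter sees one opaque expression that references both `orders` and `lineitem`, so nothing can move.

```rust
// Toy expression tree. NOT DataFusion's `Expr`; just enough to model the issue.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    Binary { left: Box<Expr>, op: Op, right: Box<Expr> },
    Alias(Box<Expr>, String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    And,
    Eq,
    GtEq,
    Lt,
}

fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn lit(value: &str) -> Expr {
    Expr::Literal(value.to_string())
}

fn binary(left: Expr, op: Op, right: Expr) -> Expr {
    Expr::Binary { left: Box::new(left), op, right: Box::new(right) }
}

/// Splits a predicate on its top-level ANDs.
/// Anything that is not an AND (including an Alias) is kept as one opaque part.
fn split_conjunction(expr: &Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::Binary { left, op: Op::And, right } => {
            split_conjunction(left, out);
            split_conjunction(right, out);
        }
        other => out.push(other.clone()),
    }
}

/// Strips any outer aliases so the conjunction underneath becomes visible.
fn unalias(expr: &Expr) -> &Expr {
    match expr {
        Expr::Alias(inner, _) => unalias(inner),
        other => other,
    }
}

/// The constant-folded q10 predicate from the plan above, without the alias.
fn q10_predicate() -> Expr {
    let lower = binary(col("orders.o_orderdate"), Op::GtEq, lit("8674"));
    let upper = binary(col("orders.o_orderdate"), Op::Lt, lit("8766"));
    let flag = binary(col("lineitem.l_returnflag"), Op::Eq, lit("R"));
    binary(binary(lower, Op::And, upper), Op::And, flag)
}
```

In this model, calling `split_conjunction` on `Expr::Alias(Box::new(q10_predicate()), ...)` yields a single part, while calling it on `unalias(&aliased)` yields the three individual comparisons. That mirrors the two options suggested in the issue: avoid adding the alias in Filter in the first place, or have the push down logic look through it.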

To Reproduce
Run q10 against some Parquet data using the latest master, then compare with an older version (which version still needs to be determined).

Expected behavior

Additional context


Labels: bug, performance
