
SimplifyExpressions errors when simplifying power fn #5996

@wolffcm


Describe the bug

When calling the power function with a constant in the second argument, I get an error:

Internal error: Optimizer rule 'simplify_expressions' failed due to unexpected error: Internal error: The expr has more than one schema, could not determine data type. This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker.

To Reproduce

With datafusion-cli:

select power(f, 100) from (values(10.0)) as t(f);

I was also able to reproduce this with a unit test in datafusion/optimizer/src/simplify_expressions/simplify_exprs.rs:

    #[test]
    fn simplify_power() -> Result<()> {
        // A single non-nullable Float64 column named "f".
        let schema = Schema::new(vec![
            Field::new("f", DataType::Float64, false),
        ]);
        let table_scan = table_scan(Some("test"), &schema, None)
            .expect("creating scan")
            .build()
            .expect("building plan");

        // Project power(f, 1): the exponent is a constant literal.
        let plan = LogicalPlanBuilder::from(table_scan)
            .project(vec![call_fn("power", vec![col("f"), lit(1)])?])?
            .build()?;
        let rule = SimplifyExpressions::new();
        // The rule should succeed here; it currently fails with the
        // "more than one schema" internal error.
        let _optimized_plan = rule
            .try_optimize(&plan, &OptimizerContext::new())
            .unwrap()
            .expect("failed to optimize plan");
        Ok(())
    }

Expected behavior

Calling power with a constant is a reasonable thing to want to do, so I would expect this to work.

Additional context

Unless the optimizer is configured with skip_failed_rules set to false, this could manifest in other ways, or the query might succeed just fine.

Related issue: #4685

I found this specifically when invoking power, but the same issue probably occurs when simplifying other expressions.

In IOx we want to run with skip_failed_rules set to false specifically so we can return reasonable error messages to the user: https://github.com/influxdata/influxdb_iox/issues/7330
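To illustrate why the setting matters, here is a minimal toy sketch of an optimizer loop that either skips a failing rule or reports it. It is not DataFusion's implementation: `Plan`, `Rule`, `optimize`, `failing_rule`, and `uppercase_rule` are hypothetical names invented for this example, and only the `skip_failed_rules` flag name comes from the discussion above.

```rust
// Toy model only: these types are hypothetical and are not DataFusion's API.
type Plan = String;

/// A named rewrite rule over the toy plan type.
struct Rule {
    name: &'static str,
    apply: fn(&Plan) -> Result<Plan, String>,
}

/// Applies each rule in order. When `skip_failed_rules` is true, a failing
/// rule is skipped and the plan it was given is kept; otherwise the first
/// error is returned to the caller.
fn optimize(plan: Plan, rules: &[Rule], skip_failed_rules: bool) -> Result<Plan, String> {
    let mut current = plan;
    for rule in rules {
        match (rule.apply)(&current) {
            Ok(next) => current = next,
            // The failure is swallowed, so the user never sees it.
            Err(_) if skip_failed_rules => {}
            Err(e) => return Err(format!("Optimizer rule '{}' failed: {}", rule.name, e)),
        }
    }
    Ok(current)
}

/// Stand-in for a rule that hits an internal error.
fn failing_rule(_plan: &Plan) -> Result<Plan, String> {
    Err("could not determine data type".to_string())
}

/// Stand-in for a rule that succeeds.
fn uppercase_rule(plan: &Plan) -> Result<Plan, String> {
    Ok(plan.to_uppercase())
}
```

In this model, running both rules with the flag set to true yields an optimized plan and hides the failure, while setting it to false surfaces the failing rule's name in the error.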

This test failure seems to happen in simpl_pow when trying to get the data type of the exponent argument:
https://github.com/apache/arrow-datafusion/blob/fcd8b899e2a62f798413c536f47078289ece9d05/datafusion/optimizer/src/simplify_expressions/utils.rs#L403
And then the SimplifyContext can't get the type because there is "more than one schema":
https://github.com/apache/arrow-datafusion/blob/fcd8b899e2a62f798413c536f47078289ece9d05/datafusion/optimizer/src/simplify_expressions/context.rs#L135
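The failure mode described above can be modeled with a small self-contained sketch. It assumes, as the error text suggests, that the context refuses to resolve a type whenever it holds anything other than exactly one schema. `ToyContext`, `Schema`, and this `DataType` enum are simplified, hypothetical stand-ins rather than DataFusion's real types, and type lookup is reduced to finding a column by name.

```rust
// Toy model only: simplified stand-ins, not DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Float64,
}

/// A schema here is just a list of (column name, type) pairs.
type Schema = Vec<(&'static str, DataType)>;

/// Holds every schema that was registered for the expression being simplified.
struct ToyContext {
    schemas: Vec<Schema>,
}

impl ToyContext {
    /// Resolves a column's type only when exactly one schema is registered.
    /// With two or more schemas it gives up, even if every schema agrees.
    fn get_data_type(&self, column: &str) -> Result<DataType, String> {
        if self.schemas.len() == 1 {
            self.schemas[0]
                .iter()
                .find(|(name, _)| *name == column)
                .map(|(_, ty)| ty.clone())
                .ok_or_else(|| format!("no column named {}", column))
        } else {
            Err("The expr has more than one schema, could not determine data type".to_string())
        }
    }
}
```

Under this assumption, a context built with a single schema resolves `f` to `Float64`, while a context that received the same schema twice returns the "more than one schema" error.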
