Description
Describe the bug
This bug is very similar to #4844: the EliminateCrossJoin rule discards the Filter condition of an InnerJoin.
The fix for the previous bug (#4869) was to skip the rule if an InnerJoin with a Filter was found.
Unfortunately, that fix only checks the top InnerJoin, so nested InnerJoins are not checked and can lose their filter conditions.
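To illustrate the difference between checking only the top join and checking every join, here is a minimal sketch on a toy plan type. The `Plan` enum and the functions `top_join_has_filter`, `any_join_has_filter`, and `repro_plan` are hypothetical and are not DataFusion's `LogicalPlan` or the actual code from #4869; they only model the behavior described above.

```rust
// Hypothetical, simplified plan model. This is NOT DataFusion's LogicalPlan.
#[allow(dead_code)]
#[derive(Debug)]
enum Plan {
    Scan(&'static str),
    InnerJoin {
        left: Box<Plan>,
        right: Box<Plan>,
        // Extra join condition besides the equijoin keys, e.g. "t1.a > 20".
        filter: Option<&'static str>,
    },
    Filter {
        predicate: &'static str,
        input: Box<Plan>,
    },
}

/// Models a check that inspects only the first join found below any Filter nodes.
fn top_join_has_filter(plan: &Plan) -> bool {
    match plan {
        Plan::Scan(_) => false,
        Plan::Filter { input, .. } => top_join_has_filter(input),
        Plan::InnerJoin { filter, .. } => filter.is_some(),
    }
}

/// Recursive check: true if any inner join anywhere in the tree has a filter.
fn any_join_has_filter(plan: &Plan) -> bool {
    match plan {
        Plan::Scan(_) => false,
        Plan::Filter { input, .. } => any_join_has_filter(input),
        Plan::InnerJoin { left, right, filter } => {
            filter.is_some() || any_join_has_filter(left) || any_join_has_filter(right)
        }
    }
}

/// Same shape as the reproduction: Filter -> Join(Join(t1, t3) with filter, t2).
fn repro_plan() -> Plan {
    let nested = Plan::InnerJoin {
        left: Box::new(Plan::Scan("t1")),
        right: Box::new(Plan::Scan("t3")),
        filter: Some("t1.a > 20"),
    };
    let top = Plan::InnerJoin {
        left: Box::new(nested),
        right: Box::new(Plan::Scan("t2")),
        filter: None,
    };
    Plan::Filter { predicate: "t1.a > 15", input: Box::new(top) }
}
```

On `repro_plan()`, the top-level check reports no filter while the recursive check finds the nested one, which is the gap this issue describes.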
To Reproduce
Add the below test to eliminate_cross_join.rs:
#[test]
fn eliminate_cross_not_possible_nested_inner_join_with_filter() -> Result<()> {
    let t1 = test_table_scan_with_name("t1")?;
    let t2 = test_table_scan_with_name("t2")?;
    let t3 = test_table_scan_with_name("t3")?;

    // the rule must not rewrite the plan: the nested inner join has a filter
    let plan = LogicalPlanBuilder::from(t1)
        .join(
            t3,
            JoinType::Inner,
            (vec!["t1.a"], vec!["t3.a"]),
            Some(col("t1.a").gt(lit(20u32))),
        )?
        .join(t2, JoinType::Inner, (vec!["t1.a"], vec!["t2.a"]), None)?
        .filter(col("t1.a").gt(lit(15u32)))?
        .build()?;

    // the expected plan is the unoptimized plan itself
    let formatted = plan.display_indent_schema().to_string();
    let before_optimization: Vec<&str> = formatted.trim().lines().collect();
    assert_optimized_plan_eq(&plan, before_optimization);

    Ok(())
}

The test will fail in assert_optimized_plan_eq with this error:
expected:
[
    "Filter: t1.a > UInt32(15) [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
    "  Inner Join: t1.a = t2.a [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
    "    Inner Join: t1.a = t3.a Filter: t1.a > UInt32(20) [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
    "      TableScan: t1 [a:UInt32, b:UInt32, c:UInt32]",
    "      TableScan: t3 [a:UInt32, b:UInt32, c:UInt32]",
    "    TableScan: t2 [a:UInt32, b:UInt32, c:UInt32]",
]
actual:
[
    "Filter: t1.a > UInt32(15) [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
    "  Inner Join: t1.a = t2.a [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
    "    Inner Join: t1.a = t3.a [a:UInt32, b:UInt32, c:UInt32, a:UInt32, b:UInt32, c:UInt32]",
    "      TableScan: t1 [a:UInt32, b:UInt32, c:UInt32]",
    "      TableScan: t3 [a:UInt32, b:UInt32, c:UInt32]",
    "    TableScan: t2 [a:UInt32, b:UInt32, c:UInt32]",
]
Here, expected is the plan before the optimization is applied and actual is the plan after it is applied.
The filter on the nested InnerJoin (t1.a > UInt32(20)) was incorrectly dropped.
Expected behavior
If any InnerJoin in the plan, including a nested one, has a Filter, the rule should either be skipped or rewrite the plan without losing that filter.
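As a rough sketch of the second option (rewriting without losing filters), a join-flattening pass could collect each join's filter alongside its inputs so that the filters can be re-applied to the rewritten plan. The `Plan` type and the `flatten` and `nested_joins` functions below are hypothetical and simplified; how DataFusion's rule is actually structured is not shown in this issue.

```rust
// Hypothetical, simplified plan model. This is NOT DataFusion's LogicalPlan.
#[allow(dead_code)]
#[derive(Debug)]
enum Plan {
    Scan(&'static str),
    InnerJoin {
        left: Box<Plan>,
        right: Box<Plan>,
        // Extra join condition besides the equijoin keys.
        filter: Option<&'static str>,
    },
}

/// Walks a tree of inner joins and records both the scanned tables and
/// every join filter, so no filter is silently discarded while flattening.
fn flatten(
    plan: &Plan,
    tables: &mut Vec<&'static str>,
    filters: &mut Vec<&'static str>,
) {
    match plan {
        Plan::Scan(name) => tables.push(*name),
        Plan::InnerJoin { left, right, filter } => {
            flatten(left, tables, filters);
            flatten(right, tables, filters);
            if let Some(f) = filter {
                // Keep the condition so it can be re-applied after the rewrite.
                filters.push(*f);
            }
        }
    }
}

/// Join(Join(t1, t3) with filter "t1.a > 20", t2), as in the reproduction.
fn nested_joins() -> Plan {
    let nested = Plan::InnerJoin {
        left: Box::new(Plan::Scan("t1")),
        right: Box::new(Plan::Scan("t3")),
        filter: Some("t1.a > 20"),
    };
    Plan::InnerJoin {
        left: Box::new(nested),
        right: Box::new(Plan::Scan("t2")),
        filter: None,
    }
}
```

A rewrite built on this would place the collected filters back onto the new joins or into a Filter node above them; skipping the rule entirely remains the simpler short-term option.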
Additional context
#4866 will probably fix this bug, but I think a quicker fix should be applied here first to restore the correctness of query results.