Skip to content

Incorrect results for join condition against current master branch #4844

@DDtKey

Description

@DDtKey

Describe the bug
Join condition(on with between) works incorrectly.
Looks like ignored and returned cartesian product.

It used to work for latest stable release (15.0.0 from crates.io)
But I tested it against current master branch, hash 3cc607de4ce6e9e1fd537091e471858c62f58653.

To Reproduce
Steps to reproduce the behavior:
students.csv:

name,mark
Stuart,28
Amina,89
Christen,50
Salma,77
Samantha,21

grades.csv:

grade,min,max
1,0,14
2,15,35
3,36,55
4,56,79
5,80,100
MRE:
use datafusion::prelude::{CsvReadOptions, SessionContext};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let students_path = "../students.csv";
    let grades_path = "../grades.csv";

    // Datafusion execution
    let ctx = SessionContext::new();

    ctx.register_csv("students", students_path, CsvReadOptions::default())
        .await?;

    ctx.register_csv("grades", grades_path, CsvReadOptions::default())
        .await?;

    let data_frame = ctx.sql("SELECT s.*, g.grade FROM students s join grades g on s.mark between g.min and g.max WHERE grade > 2 ORDER BY s.mark DESC").await?;

    data_frame.show().await?;

    Ok(())
}

It will return:

+----------+------+-------+
| name     | mark | grade |
+----------+------+-------+
| Amina    | 89   | 3     |
| Amina    | 89   | 4     |
| Amina    | 89   | 5     |
| Salma    | 77   | 3     |
| Salma    | 77   | 4     |
| Salma    | 77   | 5     |
| Christen | 50   | 3     |
| Christen | 50   | 4     |
| Christen | 50   | 5     |
| Stuart   | 28   | 3     |
| Stuart   | 28   | 4     |
| Stuart   | 28   | 5     |
| Samantha | 21   | 3     |
| Samantha | 21   | 4     |
| Samantha | 21   | 5     |
+----------+------+-------+

Expected behavior
It should be the same as for datafusion = "15.0.0":

+----------+------+-------+
| name     | mark | grade |
+----------+------+-------+
| Amina    | 89   | 5     |
| Salma    | 77   | 4     |
| Christen | 50   | 3     |
+----------+------+-------+

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions