@github-actions github-actions bot commented Feb 8, 2025

Cherry-picked from #47625

### What problem does this PR solve?

Related PR: #43255

Problem Summary:
Example:
```sql
CREATE TABLE table_a (
    id INT,
    age INT
) STORED AS ORC;

INSERT INTO table_a VALUES
(1, null),
(2, 18),
(3, null),
(4, 25);

CREATE TABLE table_b (
    id INT,
    age INT
) STORED AS ORC;

INSERT INTO table_b VALUES
(1, null),
(2, null),
(3, 1000000),
(4, 100);
```
Run the following query:
```sql
select * from table_a inner join table_b on table_a.age <=> table_b.age and table_b.id in (1,3);
```
When this SQL is executed, the backend generates a runtime filter on the
`table_a` side of the join, equivalent to the condition
`WHERE table_a.age IN (NULL, 1000000)`. Because `<=>` is a null-aware
comparison operator, the `IN` predicate must also be null-aware, so that a
`NULL` age matches the `NULL` in the list. However, the ORC predicate
pushdown API does not support null-aware `IN` predicates. The current
approach therefore ignores the `NULL` value, so the pushed-down predicate
effectively becomes `age IN (1000000)`. No row of `table_a` satisfies it,
and the query returns an empty result set.
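
The mismatch can be illustrated with a small Python model in which `None` stands in for SQL `NULL`. This is only a sketch of the semantics described above, not Doris or ORC code; the function names `null_safe_in` and `pushed_down_in` are hypothetical.

```python
# Model of the runtime filter built from the table_b rows with id IN (1, 3).
# None stands in for SQL NULL. All names are illustrative assumptions.

def null_safe_in(value, candidates):
    """Null-aware IN: NULL matches NULL, mirroring the <=> operator."""
    return any(value is None if c is None else value == c for c in candidates)


def pushed_down_in(value, candidates):
    """Plain IN after NULL literals are dropped from the list.

    A NULL column value never satisfies a plain IN predicate.
    """
    non_null = [c for c in candidates if c is not None]
    return value is not None and value in non_null


table_a_age = [None, 18, None, 25]  # ages of table_a rows with id 1..4
runtime_filter = [None, 1000000]    # ages of table_b rows with id 1 and 3

# Rows the join needs to see versus rows that survive the pushed-down filter.
expected = [a for a in table_a_age if null_safe_in(a, runtime_filter)]
actual = [a for a in table_a_age if pushed_down_in(a, runtime_filter)]
```

In this model `expected` keeps the two `NULL` ages, while `actual` is empty, which corresponds to the empty result set reported above.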

To fix this bug, we adjusted the logic so that predicates with null-aware
comparisons are no longer pushed down to ORC, which produces the correct
result:
```text
+------+------+------+------+
| id   | age  | id   | age  |
+------+------+------+------+
|    1 | NULL |    1 | NULL |
|    3 | NULL |    1 | NULL |
+------+------+------+------+
```
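
The idea behind the fix can be sketched in Python as follows. This is a simplified model under stated assumptions, not the actual backend change: `InPredicate`, `can_push_down`, and `scan` are hypothetical names, and `None` stands in for SQL `NULL`.

```python
from dataclasses import dataclass


@dataclass
class InPredicate:
    column: str
    values: list
    null_aware: bool  # True when the filter comes from a <=> join condition


def can_push_down(pred):
    # The reader-side filter cannot express "NULL matches NULL",
    # so null-aware predicates are kept out of the pushdown path.
    return not pred.null_aware


def scan(rows, pred):
    col = pred.column
    if can_push_down(pred):
        # Pushdown path: plain IN, a NULL column value never matches.
        return [r for r in rows if r[col] is not None and r[col] in pred.values]
    # Fallback path: evaluate the null-aware IN outside the reader.
    return [r for r in rows
            if any(r[col] is None if v is None else r[col] == v
                   for v in pred.values)]


table_a = [{"id": 1, "age": None}, {"id": 2, "age": 18},
           {"id": 3, "age": None}, {"id": 4, "age": 25}]
table_b = [{"id": 1, "age": None}, {"id": 2, "age": None},
           {"id": 3, "age": 1000000}, {"id": 4, "age": 100}]

build = [b for b in table_b if b["id"] in (1, 3)]
pred = InPredicate("age", [b["age"] for b in build], null_aware=True)
probe = scan(table_a, pred)

# Python's == treats None == None as True, so it models <=> here.
result = [(a["id"], a["age"], b["id"], b["age"])
          for a in probe for b in build if a["age"] == b["age"]]
```

With `null_aware=True` the predicate skips the pushdown path, and `result` contains the same two rows as the table above.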
@github-actions github-actions bot requested a review from dataroaring as a code owner February 8, 2025 11:57

Thearas commented Feb 8, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Feb 8, 2025
@dataroaring dataroaring reopened this Feb 8, 2025

Thearas commented Feb 8, 2025

run buildall

@morningman morningman closed this Feb 8, 2025
@CalvinKirs CalvinKirs deleted the auto-pick-47625-branch-3.0 branch March 28, 2025 06:43