-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
Thanks to the issue: #1619 Arrow-datafusion decided to add bitwise operation (using the standards of postgresql db). But, an optimizer of these expressions does not exist, therefore a lot of bitwise operations execute slowly due to useless work.
I decided to solve this problem.
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Corresponding issues and pull requests:
#3430
#1619
Describe the solution you'd like
Below, there are rules, on which it is built the optimizer:
First table (null and 0):
| BitwiseAnd (&) | BitwiseOr (|) | BitwiseXor (^) | BitwiseShiftRight (>>) | BitwiseShiftLeft (<<) |
|---|---|---|---|---|
| A & null -> null | A | null -> null | A ^ null -> null | A >> null -> null | A << null -> null |
| null & A -> null | null | A -> null | null ^ A -> null | null >> A -> null | null << A -> null |
| A & 0 -> 0 | A | 0 -> A | A ^ 0 -> A | A >> 0 -> A | A << 0 -> A |
| 0 & A -> 0 | 0 | A -> A | 0 ^ A -> A | - | - |
Second table (equality):
*if A not nullable
| BitwiseAnd (&) | BitwiseOr (|) | BitwiseXor (^) | BitwiseShiftRight (>>) | BitwiseShiftLeft (<<) |
|---|---|---|---|---|
| A & A -> A | A | A -> A | A ^ A -> 0 | - | - |
Third table (bitwise not (Negative, see https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/expr.rs#L122-L123)):
*If A not nullable
*! - Bitwise NOT (Negative)
| BitwiseAnd (&) | BitwiseOr (|) | BitwiseXor (^) | BitwiseShiftRight (>>) | BitwiseShiftLeft (<<) |
|---|---|---|---|---|
| !A & A -> 0 | !A | A -> -1 | !A ^ A -> -1 | - | - |
| A & !A -> 0 | A | !A -> -1 | A ^ !A -> -1 | - | - |
Fourth table (InList):
*If B not null
*Considering Commutativity
| BitwiseAnd (&) | BitwiseOr (|) | BitwiseXor (^) | BitwiseShiftRight (>>) | BitwiseShiftLeft (<<) |
|---|---|---|---|---|
| A & (A & B) -> (A & B) | A | (A | B) -> (A | B) | A ^ (A ^ B) -> B | - | - |
Fifth table (cross values):
*Some values were taken from fourth table.
*If A and B not nullable
*Considering Commutativity
| - | BitwiseAnd (&) (A & B) | BitwiseOr (|) (A | B) | BitwiseXor (^) (A ^ B) | BitwiseShiftRight (>>) (A >> B) | BitwiseShiftLeft (<<) (A << B) |
|---|---|---|---|---|---|
| BitwiseAnd (&) | A & (A & B) -> (A & B) | A | (A & B) -> A | - | - | - |
| BitwiseOr (|) | A & (A | B) -> A | A | (A | B) -> (A | B) | - | - | - |
| BitwiseXor (^) | - | - | A ^ (A ^ B) -> B | - | - |
| BitwiseShiftRight (>>) | - | - | - | - | - |
| BitwiseShiftLeft (<<) | - | - | - | - | - |
Describe alternatives you've considered
If bitwise operations are not very popular, that PR can slightly slow down performance.