Closed
Labels: enhancement (New feature or request)
Description
Is your feature request related to a problem or challenge?
We are starting to look at more advanced methods for filter pushdown, and I have been thinking about porting `SimplifyWithGuarantee`. The critical functionality we need is the ability to evaluate a predicate against some statistics and get back the residual expression. For example, given the predicate `x = 1 AND y < 2`:
- file 1 with stats `0 <= x <= 20` and `0 < y <= 1` => residual filter `x = 1` (`y < 2` is always satisfied) => scan this file with the `x = 1` filter
- file 2 with stats `3 < y <= 10` => residual filter `false` => don't scan this file since it will never satisfy the predicate
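To make the two cases above concrete, here is a minimal, self-contained sketch of residual computation for a conjunction of simple comparisons. It is not DataFusion code: `Bounds`, `Cmp`, `Truth`, `Residual`, `evaluate`, `residual`, and `demo` are all hypothetical names invented for illustration. It also assumes integer columns with closed min/max bounds, so `0 < y <= 1` is modeled as `[1, 1]` and `3 < y <= 10` as `[4, 10]`.

```rust
use std::collections::HashMap;

/// Closed integer bounds for one column, as might come from file statistics.
/// Hypothetical type, not part of DataFusion.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    min: i64,
    max: i64,
}

/// A comparison of a column against an integer literal (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Cmp {
    Eq(&'static str, i64),
    Lt(&'static str, i64),
    Gt(&'static str, i64),
}

impl Cmp {
    fn column(&self) -> &'static str {
        match *self {
            Cmp::Eq(c, _) | Cmp::Lt(c, _) | Cmp::Gt(c, _) => c,
        }
    }
}

/// What the statistics tell us about a single comparison.
enum Truth {
    True,
    False,
    Unknown,
}

/// Outcome of simplifying a conjunction against statistics.
#[derive(Debug, Clone, PartialEq)]
enum Residual {
    /// No row can match, so the file can be skipped.
    AlwaysFalse,
    /// Conjuncts that still need per-row evaluation (empty means always true).
    Filter(Vec<Cmp>),
}

fn evaluate(cmp: &Cmp, stats: &HashMap<&str, Bounds>) -> Truth {
    // Without statistics for the column, nothing can be decided.
    let b = match stats.get(cmp.column()) {
        Some(b) => b,
        None => return Truth::Unknown,
    };
    match *cmp {
        Cmp::Eq(_, v) if b.min == v && b.max == v => Truth::True,
        Cmp::Eq(_, v) if v < b.min || v > b.max => Truth::False,
        Cmp::Lt(_, v) if b.max < v => Truth::True,
        Cmp::Lt(_, v) if b.min >= v => Truth::False,
        Cmp::Gt(_, v) if b.min > v => Truth::True,
        Cmp::Gt(_, v) if b.max <= v => Truth::False,
        _ => Truth::Unknown,
    }
}

/// Drop conjuncts the statistics prove true; short-circuit on one proven false.
fn residual(conjuncts: &[Cmp], stats: &HashMap<&str, Bounds>) -> Residual {
    let mut remaining = Vec::new();
    for cmp in conjuncts {
        match evaluate(cmp, stats) {
            Truth::True => {}
            Truth::False => return Residual::AlwaysFalse,
            Truth::Unknown => remaining.push(*cmp),
        }
    }
    Residual::Filter(remaining)
}

/// The two files from the example, with the predicate `x = 1 AND y < 2`.
fn demo() -> (Residual, Residual) {
    let predicate = [Cmp::Eq("x", 1), Cmp::Lt("y", 2)];
    let file1: HashMap<&str, Bounds> = HashMap::from([
        ("x", Bounds { min: 0, max: 20 }),
        ("y", Bounds { min: 1, max: 1 }),
    ]);
    let file2: HashMap<&str, Bounds> = HashMap::from([("y", Bounds { min: 4, max: 10 })]);
    (residual(&predicate, &file1), residual(&predicate, &file2))
}
```

In this sketch, `demo()` yields a residual of just `x = 1` for file 1 and `AlwaysFalse` for file 2. A real implementation would have to work on arbitrary expression trees, handle nulls and non-integer types, and deal with open bounds, none of which is attempted here.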
Describe the solution you'd like
I think a straightforward port of that function would be useful, but if there is a design that integrates better with existing functionality, I'm open to other designs.
/// Given a guarantee expression and a predicate expression, simplify the predicate expression.
///
/// # Example
///
/// This is useful, for instance, when filtering data that has statistics. For example,
/// if the statistics tell you `x > 2` (the guarantee), and you want to filter with
/// `x > 1 and y < 0`, then you can simplify the predicate to `y < 0`. Alternatively,
/// if the predicate is `x < 1 and y < 0`, then you know directly from the
/// statistics that the predicate will always be false, so your filter can
/// immediately return an empty result.
///
/// ```
/// use datafusion_expr::{col, lit};
///
/// let guarantee = col("x").gt(lit(2));
/// let predicate = col("x").gt(lit(1)).and(col("y").lt(lit(0)));
/// assert_eq!(predicate.simplify_with_guarantee(&guarantee), col("y").lt(lit(0)));
///
/// let predicate = col("x").lt(lit(1)).and(col("y").lt(lit(0)));
/// assert_eq!(predicate.simplify_with_guarantee(&guarantee), lit(false));
/// ```
pub fn simplify_with_guarantee(&self, guarantee: &Expr) -> Expr {
    todo!()
}
Describe alternatives you've considered
It seems like the current solutions with PruningPredicate don't give you the residual expression.
Additional context
This is related to #5830