Conversation
need to merge #19821 first

```rust
    }
    _ => {
        return internal_err!(
            "First argument of `DATE_PART` must be non-null scalar Utf8"
```

same as DF date_part, part is a literal
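To illustrate the point above (the `part` argument must be a literal so it can be resolved at plan time), here is a minimal standalone sketch. The `Expr` enum and `extract_part` function are hypothetical simplifications, not the actual DataFusion types:

```rust
// Hypothetical simplified expression type for illustration only.
enum Expr {
    Utf8Literal(String),
    Column(String),
}

// The part argument must be a non-null Utf8 literal; anything else
// (e.g. a column reference) is rejected up front, matching the
// `internal_err!` branch quoted above.
fn extract_part(expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Utf8Literal(s) => Ok(s.clone()),
        _ => Err("First argument of `DATE_PART` must be non-null scalar Utf8".to_string()),
    }
}
```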
```rust
use datafusion_expr::planner::{ExprPlanner, PlannerResult};

#[derive(Default, Debug)]
pub struct SparkFunctionPlanner;
```
If we're including this planner now, I feel we should update the lib docs with an example of using this
https://github.com/apache/datafusion/blob/main/datafusion/spark/src/lib.rs
yes, I can do that. I also think it would be nice to provide a way to register the expr planner and the udfs at the same time. We could do something like a `with_spark_features`? Could track that in a separate issue/PR.
That sounds good as a followup 👍
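A purely hypothetical sketch of the `with_spark_features` idea discussed above -- no such API exists in DataFusion today, and `SessionBuilder` here is a stand-in type. The point is a single builder call that registers both the Spark UDFs and the Spark expr planner, instead of two separate registration steps:

```rust
// Stand-in builder type; in practice this would be DataFusion's
// session state builder.
#[derive(Default)]
struct SessionBuilder {
    udfs: Vec<&'static str>,
    expr_planners: Vec<&'static str>,
}

impl SessionBuilder {
    // Hypothetical convenience method: registers the Spark function set
    // and the SparkFunctionPlanner in one call.
    fn with_spark_features(mut self) -> Self {
        self.udfs.push("spark_date_part");
        self.expr_planners.push("SparkFunctionPlanner");
        self
    }
}
```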
```rust
    internal_err!("spark date_part should have been simplified to standard date_part")
}

fn simplify(
```
I like that we're using simplify here 👍
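The simplify rewrite needs a mapping from Spark's extra part aliases to the canonical DataFusion part names before delegating to the standard `date_part`. A minimal sketch (the alias table is abridged from the Spark source linked in the PR description, and the function name is hypothetical):

```rust
// Abridged alias table; see the linked Spark source for the full list.
fn normalize_part(part: &str) -> Option<&'static str> {
    match part.to_ascii_lowercase().as_str() {
        "year" | "y" | "years" | "yr" | "yrs" => Some("year"),
        "month" | "mon" | "mons" | "months" => Some("month"),
        "day" | "d" | "days" => Some("day"),
        _ => None, // unknown parts fall through to an error
    }
}
```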
```rust
let date_part_expr = Expr::ScalarFunction(ScalarFunction::new_udf(
    datafusion_functions::datetime::date_part(),
    vec![part_expr, date_expr],
));
```
One concern is if the nullability of the output field will match here?
ah you're right, should we update it to be nullable depending on the inputs?
Yes, I think that would be the easiest/most consistent way to fix this
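A sketch of the suggested fix, with a simplified stand-in `Field` type: the output field is marked nullable whenever any input field is nullable.

```rust
// Simplified stand-in for an Arrow field; only nullability matters here.
struct Field {
    nullable: bool,
}

// The output is nullable if any input is nullable, which keeps the
// declared schema consistent with what the standard date_part produces.
fn output_nullable(inputs: &[Field]) -> bool {
    inputs.iter().any(|f| f.nullable)
}
```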
Merged!
Thanks @cht42
## Which issue does this PR close?

- Closes apache#19822.
- Part of apache#15914.

## Rationale for this change

The current `date_part` function in DataFusion has a few differences from the Spark implementation:

- day-of-week parts are 1-indexed in Spark but 0-indexed in DataFusion
- Spark supports a few more aliases for certain parts

Full list of Spark-supported aliases: https://github.com/apache/spark/blob/a03bedb6c1281c5263a42bfd20608d2ee005ab05/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala#L3356-L3371

## What changes are included in this PR?

A new `date_part` function in the `spark` crate.

## Are these changes tested?

Yes, with SLT tests.

## Are there any user-facing changes?

Yes.
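The indexing difference called out above can be shown with a small hypothetical helper: DataFusion's `dow` returns Sunday = 0 through Saturday = 6, while Spark's `dayofweek` returns Sunday = 1 through Saturday = 7, so the Spark wrapper adds 1 to the result.

```rust
// Hypothetical helper illustrating the offset between the two conventions.
fn spark_dayofweek(datafusion_dow: i64) -> i64 {
    datafusion_dow + 1
}
```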