[GLUTEN-10566][VL] Add Spark unix_timestamp support with timestamp and format arguments#10567
[GLUTEN-10566][VL] Add Spark unix_timestamp support with timestamp and format arguments#10567rui-mo merged 16 commits intoapache:mainfrom
Conversation
|
Run Gluten Clickhouse CI on x86 |
1 similar comment
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
@rui-mo can you review this? test failures are unrelated |
rui-mo
left a comment
There was a problem hiding this comment.
Thanks! Would you please add a test like below to ensure the Project operator does not fallback?
|
Run Gluten Clickhouse CI on x86 |
2 similar comments
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
80d6434 to
38377ac
Compare
|
Run Gluten Clickhouse CI on x86 |
|
@rui-mo Seems like even if we passed just one argument in the function, still the signature of the function is retrieved from the original expr, and it will still fallback. Fixing this in gluten might be complex, as we would need to create expression again. Do you think making this change in Velox would make more sense ? |
| replaceWithExpressionTransformer0(t.timeExp, attributeSeq, expressionsMap), | ||
| replaceWithExpressionTransformer0(t.format, attributeSeq, expressionsMap) | ||
| ), | ||
| children, |
There was a problem hiding this comment.
A more typical approach is to implement a custom transformer for this function and put the specific logic within it. For example,
BackendsApiManager.getSparkPlanExecApiInstance.genTruncTimestampTransformer returns the transformer for TruncTimestamp. I’d prefer addressing this issue in Gluten, since introducing a function with an unused parameter to Velox would be confusing.
There was a problem hiding this comment.
Thanks @rui-mo for the suggestion. Made changes accordingly
|
Run Gluten Clickhouse CI on x86 |
4c36450 to
5aaec82
Compare
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
Test failures are unrelated. @rui-mo can you review? |
| ), | ||
| toUnixTimestamp | ||
| ) | ||
| } |
There was a problem hiding this comment.
The specific transformer can be used to handle different logics between Timestamp and String types, and 'ExpressionConverter' could be much simplified to just call the transformer. FYI: I added the modified version in my commit: rui-mo@0430595.
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
@nimesh1601 It appears that the Clickhouse UT fails because of below error. The way to solve it is to implement different transformers for CH and Velox backends, like the
|
|
Run Gluten Clickhouse CI on x86 |
| substraitExprName: String, | ||
| timeExp: ExpressionTransformer, | ||
| format: ExpressionTransformer, | ||
| original: Expression): ExpressionTransformer = { |
There was a problem hiding this comment.
The implementation can be removed from this API, and allow backend to provide specific implementation.
| ) | ||
| } | ||
|
|
||
| case class ToUnixTimestampTransformer( |
There was a problem hiding this comment.
Consider moving this transformer to 'backends-velox/src/main/scala/org/apache/gluten/expression/ExpressionTransformer.scala' because it's only used by Velox.
|
Run Gluten Clickhouse CI on x86 |
|
@nimesh1601 Can you please follow up this comment: #10567 (comment)? It helps avoid duplication. |
|
Run Gluten Clickhouse CI on x86 |
What changes are proposed in this pull request?
Support unix_timestamp and to_unix_timestamp registered with arguments: (TIMESTAMP VARCHAR) by ignoring format arguments to be in parity with Spark
How was this patch tested?
Uts