[SPARK-42052][SQL] Codegen Support for HiveSimpleUDF #40397
cc @cloud-fan

def evaluate(): Any

final def doGenCode(ctx: CodegenContext, ev: ExprCode, dataType: DataType): ExprCode = {
It's weird to implement codegen in the evaluator. If we really want to deduplicate the code, let's add HiveUDFExpressionBase later.
OK, let's keep some redundant logic for now.
  }
}

abstract class HiveUDFEvaluatorBase[UDFType <: AnyRef](
Can we move the evaluators to a separate file?
Ok
lazy val function = funcWrapper.createFunction[UDF]()
private val isUDFDeterministic = {
  val udfType = evaluator.function.getClass.getAnnotation(classOf[HiveUDFType])
  udfType != null && udfType.deterministic() && !udfType.stateful()
The code seems to be the same as in the generic UDF. Maybe we can move it to HiveUDFEvaluatorBase.
OK
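A rough sketch of what sharing this logic in a base class could look like. Everything here is hypothetical and simplified: `EvaluatorBaseSketch`, `flags`, and `UpperCaseEvaluator` are invented names, and the flag lookup stands in for reading Hive's `UDFType` annotation, which is not reproduced here.

```scala
// Hypothetical simplification of a shared evaluator base class.
// Names and bodies are illustrative only, not the code merged in this PR.
abstract class EvaluatorBaseSketch[F <: AnyRef](createFunction: () => F) {
  // Created on first use, mirroring the `lazy val function` in the diff.
  lazy val function: F = createFunction()

  // Assumption: subclasses report (deterministic, stateful), or None when
  // no metadata is available (the `udfType != null` case in the diff).
  protected def flags(f: F): Option[(Boolean, Boolean)]

  // Lazy because it reads the lazy `function`.
  lazy val isUDFDeterministic: Boolean =
    flags(function).exists { case (deterministic, stateful) => deterministic && !stateful }

  def setArg(index: Int, value: Any): Unit
  def evaluate(): Any
}

// A toy evaluator that upper-cases its single string argument.
final class UpperCaseEvaluator
  extends EvaluatorBaseSketch[String => String](() => (s: String) => s.toUpperCase) {

  private var arg: String = _

  protected def flags(f: String => String): Option[(Boolean, Boolean)] =
    Some((true, false))

  def setArg(index: Int, value: Any): Unit = {
    arg = value.asInstanceOf[String]
  }

  def evaluate(): Any = function(arg)
}
```

With this shape, the determinism check lives in one place and each concrete evaluator only supplies argument handling and the call itself.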
case (child, idx) =>
  evaluator.setArg(idx, child.eval(input))
Suggested change:
- case (child, idx) =>
-   evaluator.setArg(idx, child.eval(input))
+ case (child, idx) => evaluator.setArg(idx, child.eval(input))
|  $resultTerm = ($resultType) $refEvaluator.evaluate();
|  ${ev.isNull} = $resultTerm == null;
|} catch (Throwable e) {
|  throw QueryExecutionErrors.failedExecuteUserDefinedFunctionError(
Shall we move the try-catch to evaluator.evaluate()?
BTW this seems like an unrelated change. The previous code does not rethrow the exception.
Done
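For readers following the try-catch discussion, here is a minimal sketch of centralizing the error wrapping inside `evaluate()` so generated code can call it without its own try-catch. This is not the merged code: `EvaluatorSketch`, `doEvaluate`, and `UdfExecutionError` are hypothetical names, the error class stands in for Spark's `QueryExecutionErrors` helper, and it catches only non-fatal errors, whereas the generated snippet above catches `Throwable`.

```scala
import scala.util.control.NonFatal

// Hypothetical error type standing in for the Spark error helper.
final class UdfExecutionError(name: String, cause: Throwable)
  extends RuntimeException(s"Failed to execute user defined function ($name)", cause)

abstract class EvaluatorSketch(name: String) {
  // Subclasses implement the actual UDF call here.
  protected def doEvaluate(): Any

  // Callers (interpreted eval or generated code) get wrapped errors for free.
  final def evaluate(): Any =
    try doEvaluate()
    catch {
      // Assumption: wrap only non-fatal errors, keeping the original as cause.
      case NonFatal(e) => throw new UdfExecutionError(name, e)
    }
}
```

Whether the exception should be rethrown at all is the open question raised in the comment above; the sketch only shows where the handling would live.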
  udfType != null && udfType.deterministic() && !udfType.stateful()
}

def returnType: DataType
We have to add this method here because it will be used in exception handling.
lazy val function = funcWrapper.createFunction[UDFType]()

@transient
val isUDFDeterministic = {
It should be a lazy val, as it accesses a lazy val.
Done
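To illustrate the point about `lazy val`, here is a small standalone sketch using hypothetical stand-in classes (`Eager`, `Deferred`, and the `created` counter are invented for the demo). A strict `val` that reads a `lazy val` forces it during construction, which defeats the laziness.

```scala
// Hypothetical stand-ins; these are not the classes from this PR.
object LazyValDemo {
  // Counts how many times the "function" has been created.
  var created = 0

  class Eager {
    lazy val function: String = { created += 1; "udf" }
    // A strict val is evaluated in the constructor, so it forces `function`.
    val isDeterministic: Boolean = function.nonEmpty
  }

  class Deferred {
    lazy val function: String = { created += 1; "udf" }
    // A lazy val defers the access until someone first reads it.
    lazy val isDeterministic: Boolean = function.nonEmpty
  }

  def main(args: Array[String]): Unit = {
    new Eager
    println(s"after Eager construction: created = $created")    // 1
    val d = new Deferred
    println(s"after Deferred construction: created = $created") // still 1
    d.isDeterministic
    println(s"after first access: created = $created")          // 2
  }
}
```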
@cloud-fan Can we merge it to master? After that, I will try to refactor HiveGenericUDTF & HiveUDAFFunction. Thanks!

Thanks, merging to master!
…pleUDF (#1288)

### What changes were proposed in this pull request?
- As a subtask of [SPARK-42050](https://issues.apache.org/jira/browse/SPARK-42050), this PR adds Codegen Support for HiveSimpleUDF
- Extract a `HiveUDFEvaluatorBase` class for the common behaviors of HiveSimpleUDFEvaluator & HiveGenericUDFEvaluator.

### Why are the changes needed?
- Improve codegen coverage and performance.
- Following #39949. Make the code more concise.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Add new UT. Pass GA.

Closes #40397 from panbingkun/refactor_HiveSimpleUDF.

Authored-by: panbingkun <pbk1982@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>