[BEAM-10925] Add rule to replace Calc with BeamCalcRel for ZetaSQL UDFs. #13841
| root.rel.getCluster().invalidateMetadataQuery(); | ||
| return (BeamRelNode) plannerImpl.transform(0, desiredTraits, root.rel); | ||
| BeamRelNode beamRelNode = (BeamRelNode) plannerImpl.transform(0, desiredTraits, root.rel); | ||
| LOG.info("BEAMPlan>\n" + RelOptUtil.toString(beamRelNode)); |
Nit: do we need this LOG.info?
Not really, but it may be useful someday. I actually left this in from your original PR (#12398).
I guess that was added for debugging.
Keeping this LOG.info is fine. I see that CalciteQueryPlanner has a similar LOG.info.
I guess LOG.debug would probably be better. I might change both planners in a separate PR.
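To illustrate why a debug-level call can be preferable for a plan dump, here is a minimal sketch. It uses `java.util.logging` as a stand-in because the planner's actual logging setup is not shown in this thread; `PlanLoggingSketch` and `logPlan` are hypothetical names. The point is only that the potentially large plan string is not built unless the debug-like level is enabled.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

class PlanLoggingSketch {
  // Hypothetical logger; stands in for the planner's LOG field.
  static final Logger LOG = Logger.getLogger(PlanLoggingSketch.class.getName());

  /**
   * Logs the plan at FINE (a debug-like level). The supplier is only invoked
   * when FINE is enabled, so the plan string is not built otherwise.
   * Returns whether the plan string was actually built.
   */
  static boolean logPlan(Supplier<String> planDump) {
    boolean[] built = {false};
    LOG.fine(
        () -> {
          built[0] = true;
          return "BEAMPlan>\n" + planDump.get();
        });
    return built[0];
  }
}
```

With the logger at its usual INFO level, `logPlan` skips the supplier entirely; raising the level to FINE makes it build and emit the message.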
amaliujia
left a comment
LGTM
| @Override | ||
| public boolean matches(RelOptRuleCall x) { | ||
| return ZetaSQLQueryPlanner.hasUdfInProjects(x); |
This is insufficient and likely introduces data corruption bugs if UDFs and regular operators are mixed. I plan to review this pull request on Monday.
apilloud
left a comment
This needs tests. What happens if ZetaSQL builtins are mixed with UDFs? It should fail to plan, but as implemented everything will be planned as BeamCalcRel.
| import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.logical.LogicalCalc; | ||
| /** {@link ConverterRule} to replace {@link Calc} with {@link BeamCalcRel}. */ | ||
| public class BeamJavaUdfCalcRule extends ConverterRule { |
nit: You can drop the convert method if you extend BeamCalcRule instead...
| } | ||
| /** Returns true if the argument contains any user-defined Java functions. */ | ||
| static boolean hasUdfInProjects(RelOptRuleCall x) { |
nit: I would probably stick this method in BeamZetaSqlCalcRule. What you have here are operators not supported by ZetaSqlCalc.
We will need to reject queries that mix UDFs and builtin functions before Calc/rule splitting happens.
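The check discussed in this thread could look roughly like the sketch below. It does not use Calcite or Beam classes: `Expr`, `Kind`, `Target`, `contains`, and `chooseCalc` are all hypothetical stand-ins. It assumes that builtin-only projections go to the ZetaSQL calc, that UDF-only projections go to the Java calc, and that a mix of the two should fail to plan. Like `hasUdfInProjects`, it inspects only the projections.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MixedCalcSketch {
  /** Hypothetical classification of an expression node. */
  enum Kind { INPUT_REF, BUILTIN, JAVA_UDF }

  /** Hypothetical expression tree node: a kind plus zero or more operands. */
  static final class Expr {
    final Kind kind;
    final List<Expr> operands;

    Expr(Kind kind, Expr... operands) {
      this.kind = kind;
      this.operands = Collections.unmodifiableList(Arrays.asList(operands));
    }
  }

  /** Hypothetical planning targets (assumed mapping, not Beam's API). */
  enum Target { ZETASQL_CALC, JAVA_CALC }

  /** Returns true if the expression or any nested operand has the given kind. */
  static boolean contains(Expr expr, Kind kind) {
    if (expr.kind == kind) {
      return true;
    }
    for (Expr operand : expr.operands) {
      if (contains(operand, kind)) {
        return true;
      }
    }
    return false;
  }

  /** Picks a target for the projections, rejecting a mix of UDFs and builtins. */
  static Target chooseCalc(List<Expr> projects) {
    boolean hasUdf = false;
    boolean hasBuiltin = false;
    for (Expr project : projects) {
      hasUdf |= contains(project, Kind.JAVA_UDF);
      hasBuiltin |= contains(project, Kind.BUILTIN);
    }
    if (hasUdf && hasBuiltin) {
      throw new UnsupportedOperationException(
          "Mixing Java UDFs with builtin functions is not supported");
    }
    return hasUdf ? Target.JAVA_CALC : Target.ZETASQL_CALC;
  }
}
```

A predicate that only asks "is there any UDF?" would send the mixed case to the Java calc, which is the situation the reviewers flag. Checking both conditions makes the mixed case an explicit planning failure instead.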
The rule isn't used yet because there are no USER_DEFINED_JAVA_SCALAR_FUNCTIONS yet.
Most of this code was originally written by @amaliujia.
R: @amaliujia @apilloud