
@apilloud
Member

This change makes BeamCalcRel output a Calcite row and then converts it to a Beam row using ordinary hand-written code rather than generated code. This makes the conversion much easier to debug and fixes nested rows.
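
To illustrate the idea, here is a minimal sketch of converting a positional row into a schema-backed row with plain recursive code. It is not the code from this PR: `CalcRowSketch`, `SimpleSchema`, `SimpleRow`, and `toRow` are hypothetical stand-ins rather than Beam or Calcite classes, and the sketch assumes the upstream row arrives as an `Object[]` with nested rows as nested `Object[]` values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Beam or Calcite classes.
final class CalcRowSketch {

  /** Field names plus, per field, a nested schema if that field is itself a row (else null). */
  static final class SimpleSchema {
    final String[] names;
    final SimpleSchema[] nested;

    SimpleSchema(String[] names, SimpleSchema[] nested) {
      if (names.length != nested.length) {
        throw new IllegalArgumentException("names and nested must have the same length");
      }
      this.names = names;
      this.nested = nested;
    }
  }

  /** A converted row: values stored in schema order. */
  static final class SimpleRow {
    final SimpleSchema schema;
    final List<Object> values;

    SimpleRow(SimpleSchema schema, List<Object> values) {
      this.schema = schema;
      this.values = values;
    }

    Object get(String name) {
      for (int i = 0; i < schema.names.length; i++) {
        if (schema.names[i].equals(name)) {
          return values.get(i);
        }
      }
      throw new IllegalArgumentException("No such field: " + name);
    }
  }

  /** Converts a positional Object[] row, recursing into fields that the schema marks as rows. */
  static SimpleRow toRow(SimpleSchema schema, Object[] calciteRow) {
    if (calciteRow.length != schema.names.length) {
      throw new IllegalArgumentException(
          "Expected " + schema.names.length + " fields, got " + calciteRow.length);
    }
    List<Object> values = new ArrayList<>(calciteRow.length);
    for (int i = 0; i < calciteRow.length; i++) {
      Object value = calciteRow[i];
      SimpleSchema child = schema.nested[i];
      if (child != null && value != null) {
        // Assumption: a nested row is itself represented as an Object[].
        values.add(toRow(child, (Object[]) value));
      } else {
        values.add(value);
      }
    }
    return new SimpleRow(schema, values);
  }
}
```

Because the conversion is ordinary code, a breakpoint or a stack trace inside `toRow` points at a real source line, which is the debugging benefit described above.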




Member

@kennknowles kennknowles left a comment


Does this have a performance cost?

@apilloud
Member Author

I ran our Nexmark tests, and the results were within the noise bounds of our performance dashboard. I don't think code generation has a huge benefit for this piece. What it gives us here is unrolled loops and dead-code elimination, both of which the Java runtime will do anyway.
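
As a rough illustration of the trade-off described above, the sketch below contrasts the straight-line shape that generated code tends to have with an equivalent hand-written loop. `UnrollSketch` and both method names are hypothetical and not taken from the PR. The two methods produce the same result for a three-field row; whether the JIT actually unrolls the loop depends on the runtime, so treat this as a picture of the argument rather than a benchmark.

```java
// Hypothetical illustration; not code from the PR.
final class UnrollSketch {

  /** Shape typical of generated code: one statement per field, no loop. */
  static Object[] copyUnrolled3(Object[] in) {
    Object[] out = new Object[3];
    out[0] = in[0];
    out[1] = in[1];
    out[2] = in[2];
    return out;
  }

  /** Hand-written equivalent: a plain loop that works for any field count. */
  static Object[] copyLoop(Object[] in) {
    Object[] out = new Object[in.length];
    for (int i = 0; i < in.length; i++) {
      out[i] = in[i];
    }
    return out;
  }
}
```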

@apilloud
Member Author

This one tries to be very generous about the types it accepts from Calcite. I added tests covering a few cases that didn't appear to have coverage.
R: @ibzib
cc: @robinyqiu


@ibzib ibzib left a comment


This is much easier to read than the generated code before.

I'm generally unsure about why we can expect such a wide range of Java types for each schema type. Do we have an idea about where the variance comes from (Calcite?)?


Nice, maybe I can reuse these for ZetaSQL eventually.


Do we know why this happens? Is there a bug tracking it?

Member Author


I opened https://issues.apache.org/jira/browse/BEAM-12175. I'm not sure about this one, but it may be related to the issue with Numbers.


Any idea why there are so many different possible types here? Same question for everywhere else we can't just do a simple cast.

Member Author


Based on the tests that fail, Calcite will probably consider this a feature. It looks like when the implementation of an internal operator is missing, Calcite just substitutes a 'compatible' one. For example, Integer round(Integer x) can silently become Long round(Long x). I think some cast operations may also be treated as no-ops. This bug is preexisting, but I opened https://issues.apache.org/jira/browse/BEAM-12176 to track it.
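
To make the consequence concrete: if an expression typed as Integer can come back as a Long, a plain `(Integer) value` cast throws `ClassCastException`, so the conversion has to accept any compatible `Number`. The sketch below shows one way to do that. `LenientNumbers` and `toInteger` are hypothetical names, and the choice to fail on overflow rather than truncate is an assumption of this sketch, not necessarily what the PR does.

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration; not the PR's implementation.
final class LenientNumbers {

  /** Accepts any integral Number (or BigDecimal) and narrows it to Integer, failing on overflow. */
  static Integer toInteger(Object value) {
    if (value == null) {
      return null;
    }
    if (value instanceof Integer) {
      return (Integer) value;
    }
    if (value instanceof BigDecimal) {
      // Throws ArithmeticException if there is a fractional part or the value does not fit.
      return ((BigDecimal) value).intValueExact();
    }
    if (value instanceof Long || value instanceof Short || value instanceof Byte) {
      // Throws ArithmeticException if a Long does not fit in an int.
      return Math.toIntExact(((Number) value).longValue());
    }
    throw new IllegalArgumentException(
        "Cannot convert " + value.getClass().getName() + " to Integer");
  }
}
```

With this shape, `toInteger(7L)` returns the Integer 7 where a direct cast would have thrown, while a value that cannot be represented still fails loudly instead of being silently truncated.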

@apilloud apilloud merged commit e040608 into apache:master Apr 15, 2021
@apilloud apilloud deleted the calcout branch April 15, 2021 20:18
@apilloud apilloud mentioned this pull request Sep 7, 2021