python sdk: fix several bugs regarding avro <-> beam schema conversion #30770
|
CC: @robertwb , mentioned in the description but I'm not entirely sure about my logic for converting between signed and unsigned 2's complement numbers to satisfy VarInt.java. Would you be able to assist me here? |
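For readers following the signed/unsigned question, here is a minimal sketch of reinterpreting integers between signed and unsigned two's complement. The helper names `to_unsigned` and `to_signed` and the default 32-bit width are assumptions made for illustration; this is not the code from the PR, and the width that VarInt.java expects should be checked against the Java source.

```python
def to_unsigned(value: int, bits: int = 32) -> int:
    """Reinterpret a signed two's complement integer as unsigned (hypothetical helper)."""
    low, high = -(1 << (bits - 1)), 1 << (bits - 1)
    if not low <= value < high:
        raise ValueError(f"{value} does not fit in a signed {bits}-bit integer")
    # Masking keeps the low `bits` bits, which maps negatives to the upper half.
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int = 32) -> int:
    """Reinterpret an unsigned integer as signed two's complement (hypothetical helper)."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in an unsigned {bits}-bit integer")
    sign_bit = 1 << (bits - 1)
    # Flipping the sign bit and subtracting it sign-extends the value.
    return (value ^ sign_bit) - sign_bit


print(to_unsigned(-1))          # 4294967295
print(to_signed(4294967295))    # -1
```

The two helpers are inverses of each other within the given width, so a value converted on the way in can be restored on the way out.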
|
tests failing due to the |
|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
|
assign set of reviewers see comment above about the failing tests. |
|
Assigning reviewers. If you would like to opt out of this review, comment R: @riteshghorse for label python. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
riteshghorse
left a comment
Minor nit. Tagging our schema expert R: @ahmedabu98
ahmedabu98
left a comment
Thanks for catching and fixing these bugs!
Can we add some unit tests for the helper methods too? (i.e., avro_union_type_to_beam_type, avro_atomic_value_to_beam_atomic_value, beam_atomic_value_to_avro_atomic_value)
|
@ahmedabu98 could you take another look at this PR? |
|
@benkonz thanks for making the changes. will try to take a look next week |
|
@ahmedabu98 thanks for taking another look at my PR! I've been pretty busy lately, but I'll try to find some time this week to address your comments. |
|
@ahmedabu98 I addressed your review comments, could you take another look? Thanks! |
ahmedabu98
left a comment
Thank you for addressing the comments, this looks great! I appreciate the documentation too
We may need to add support for nullable arrays and maps later, but this PR is good to go
|
This breaks validatesCrossLanguageRunnerPythonUsingSql test: e.g. https://github.com/apache/beam/actions/runs/8820828279/job/24215280605 |
@Abacn oh, that was a test I added in this PR that was marked as skipped when the CI ran. It was working when I last tested on my machine. Let me do some debugging. We can revert my PR if necessary until we can get the tests passing. |
attempts to fix several issues regarding avro <-> beam schema conversion in the python sdk
IOException("varint overflow " + r) exception to get thrown when using the converted beam schema with a SqlTransform
NOTE: I'm not entirely sure about my logic here, since the resulting beam.Rows holding the converted 2's complement number will get returned if the user doesn't apply a SqlTransform to their pipeline, but the previous implementation was also causing errors, so I'll need some help here
avroio.py's beam_value_to_avro_value function, it recurses on avro_value_to_beam_value, rather than the beam_value_to_avro_value function. I'm 99% sure this should be the other way around; however, since the two functions are so similar, this previously wasn't causing issues. I only noticed it when I added the custom logic to convert to/from signed and unsigned numbers.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
CHANGES.md with noteworthy changes.
See the Contributor Guide for more tips on how to make the review process smoother.
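The recursion issue in the description can be illustrated with a toy model. The functions `avro_to_beam`, `beam_to_avro`, and `beam_to_avro_buggy` below are hypothetical stand-ins, not the real avroio.py functions, and the atomic rule (signed 32-bit on the Avro side, unsigned 32-bit on the Beam side) is an assumption chosen only to make the two directions differ.

```python
MASK32 = (1 << 32) - 1
SIGN32 = 1 << 31


def avro_to_beam(value):
    """Toy converter: recurse through containers, then convert the atomic value."""
    if isinstance(value, list):
        return [avro_to_beam(v) for v in value]
    if isinstance(value, dict):
        return {k: avro_to_beam(v) for k, v in value.items()}
    return value & MASK32  # assumed rule: signed -> unsigned


def beam_to_avro(value):
    """Toy inverse converter that correctly recurses on itself."""
    if isinstance(value, list):
        return [beam_to_avro(v) for v in value]
    if isinstance(value, dict):
        return {k: beam_to_avro(v) for k, v in value.items()}
    return (value ^ SIGN32) - SIGN32  # assumed rule: unsigned -> signed


def beam_to_avro_buggy(value):
    """Same as beam_to_avro, but nested values recurse in the wrong direction."""
    if isinstance(value, list):
        return [avro_to_beam(v) for v in value]
    if isinstance(value, dict):
        return {k: avro_to_beam(v) for k, v in value.items()}
    return (value ^ SIGN32) - SIGN32


record = {"id": -1, "scores": [-2, 3]}
beam_side = avro_to_beam(record)
print(beam_to_avro(beam_side) == record)        # True
print(beam_to_avro_buggy(beam_side) == record)  # False
```

While the atomic conversions are identical in both directions, the buggy variant returns the same result as the correct one, which matches the observation that the mix-up went unnoticed until direction-specific logic was added.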