-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-9044] Protobuf options to Schema options #10529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-9044] Protobuf options to Schema options #10529
Conversation
d984367 to
7a2dbac
Compare
7a2dbac to
4eb8f25
Compare
4eb8f25 to
c6578d0
Compare
7aa174e to
ccc7eb1
Compare
31da15f to
d5f3bf3
Compare
|
Run Java_Examples_Dataflow PreCommit |
1 similar comment
|
Run Java_Examples_Dataflow PreCommit |
|
@reuvenlax friendly ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to switch this metadata to use options. However you are translating proto option names directly to field names, which means there's no guarantee that these other options won't conflict. Maybe prefix all of these options with something to prevent potential conflict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to switch this metadata to use options. However you are translating proto option names directly to field names, which means there's no guarantee that these other options won't conflict. Maybe prefix all of these options with something to prevent potential conflict?
What about beam:option:proto:, that would make an proto option vptech.data.v1.description into beam:option:proto:vptech.data.v1.description. Making it a URI format
beam:option : fixed prefix for options, proto : for the extension (will be avro for avro), vptech.data.v1 package of the extension and desciption the proto extension field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Reuven was referring to the metadata that's added in withMetaData, since that's the "special" proto metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it, I don't think so... because I told Reuven I would do metadata after this. Still I think it's a good idea... and with your comment I want to revise the URI:
beam:option:proto:field:vptech.data.v1.descriptionfor a proto field optionbeam:option:proto:message:vptech.data.v1.descriptionfor a proto message optionbeam:option:proto:meta:numberfor a proto field numberbeam:option:proto:meta:type_namefor a proto type name
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to force all Beam option names to have a prefix - I think that's the wrong approach. I'm suggesting that we create well-known prefixes for these proto options (when translated to Beam options), so we can distinguish them from other options. The prefixes you suggested sound fine to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now all the options are prefix. I moved the removal of the metadata to another ticket, because it's more involved:
[BEAM-9604] BIP-1: Remove schema metadata usage for Protobuf extension
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, I think we should add a well-known prefix here.
|
Run Java PreCommit |
|
Run Java PreCommit |
496d77a to
f7185f6
Compare
|
@reuvenlax if you don't have remarks (removing metadata will be in another PR) I can make a private build of master. I got a pipeline full of schema aware stuff to test this on. |
|
Run Java PreCommit |
|
I've decided to add the removal of metadata usage in the proto implementation as an additional commit, this makes the PR bigger but it also makes it complete. The option based storing of proto number and message name makes it much cleaner then the removed metadata based one. |
|
@reuvenlax friendly reminder for reviewing the changes. It would be great if this would make the release (2.21) that will be cut in 6 days. |
|
LGTM |
Convert protobuf options to Row schema options. It gets the options from a proto message descriptor and converts them into beam schema options on the Schema. It does the same from proto field descriptors, converting them into beam field options. The converter creates typed options as the proto options are also fully typed. For the convertion it uses the ProtoDynamicSchema to resolve the types.
a7d7492 to
e741380
Compare
|
Thanks, now I can start working on the next parts and think about documentation. Hooray! |
[BEAM-9044] Protobuf options to Schema options
Convert protobuf options to Row schema options. It gets the options from a proto message descriptor and converts them into beam schema options on the Schema. It does the same from proto field descriptors, converting them into beam field options.
The converter creates typed options as the proto options are also fully typed. For the convertion it uses the ProtoDynamicSchema to resolve the types.
Pre-Commit Tests Status (on master branch)
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.