-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[BEAM-215] Override Create in the SparkPipelineRunner #214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
R: @amitsela |
This allows existing pipelines to continue to function by keeping the graph structure identical while replacing Create with a Read.
78afdb3 to
f98addc
Compare
| import org.apache.beam.sdk.values.PCollection.IsBounded; | ||
| import org.apache.beam.sdk.values.PInput; | ||
|
|
||
| public class SinglePrimitiveOutputPTransform<T> extends PTransform<PInput, PCollection<T>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the name change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
This could be a more specific override (probably should be, in order to facilitate quick removal), but as written is a relatively general override
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually a bit confused as to how/whether this works. The translator is expecting Create.Values
I think the two options are:
- Wait until the Spark runner supports
Read. - Override
Create.Valuesto a Spark-specific clone of it as here, but alter the translator to translate the new class.
Am I missing something?
|
I think @kennknowles is right, the translator will look for Create.Values as in https://github.com/apache/incubator-beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java#L783 |
|
If the objective is to allow existing pipeline implementations with Create.Values to construct the pipeline with Read instead - I guess that SinglePrimitiveOutputPTransform should be translated to the runner's create() instead of Create.Values |
|
I seems as if https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java#L369 still get's Create.Values as the transform... or that's actually the entry point to translation.. |
|
The use of |
|
Ah, that is excellent. It should also be done for the |
|
(and it is also very confusing - hence working to remove this code path) |
|
Oh.. OK. Awesome. |
|
Create.Values is being removed from the list of model-primitive transforms, and replaced with an implementation based on Read.Bounded This is a transition solution that allows the implementation of Create.Values to remain a Primitive that the SparkPipelineRunner supports and allowing the default implementation of Create to move to a composite built on top of Read.Bounded (in #183). When the SparkPipelineRunner supports the Read primitive, this should be removed, and we can get rid of create in the Spark TransformTranslator. |
|
Yea, the primitive transforms are moving to the list here. We've been altering the SDK to match, while adding overrides to the runners so there is no behavioral change. |
|
Yep, I remember now :) |
|
It works for arbitrary PTransforms |
Closes apache#214. Closes apache#215. Closes apache#216. Co-authored-by: Christopher Wilcox <crwilcox@google.com>
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
[BEAM-<Jira issue #>] Description of pull requestmvn clean verify. (Even better, enableTravis-CI on your fork and ensure the whole test matrix passes).
<Jira issue #>in the title with the actual Jira issuenumber, if there is one.
Individual Contributor License Agreement.
This allows existing pipelines to continue to function by keeping the
graph structure identical while replacing Create with a Read.
After BEAM-17 this override can be removed.
See #183 for more information.