
Conversation

@tgroh (Member) commented Apr 20, 2016

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

This allows existing pipelines to continue to function by keeping the
graph structure identical while replacing Create with a Read.

After BEAM-17 this override can be removed.

See #183 for more information.
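
The replacement transform keeps the graph shape identical by emitting its output directly as a primitive, with no expanded sub-transforms. The sketch below approximates that behavior against the 2016-era SDK API (PCollection.createPrimitiveOutputInternal, WindowingStrategy.globalDefault, IsBounded.BOUNDED); it is a rough reconstruction, and names, signatures, and error messages may not match the PR's code exactly.

import org.apache.beam.sdk.coders.CannotProvideCoderException;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.util.WindowingStrategy;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollection.IsBounded;
import org.apache.beam.sdk.values.PInput;

// Rough sketch only: approximates the override's behavior with the 2016-era
// SDK; the actual class in this PR may differ in details.
public class SinglePrimitiveOutputPTransform<T> extends PTransform<PInput, PCollection<T>> {
  private final PTransform<PInput, PCollection<T>> original;

  public SinglePrimitiveOutputPTransform(PTransform<PInput, PCollection<T>> original) {
    this.original = original;
  }

  @Override
  public PCollection<T> apply(PInput input) {
    // Produce a primitive output directly, so the enclosing TransformTreeNode has
    // no expanded children and the runner translates the node itself.
    PCollection<T> output = PCollection.<T>createPrimitiveOutputInternal(
        input.getPipeline(), WindowingStrategy.globalDefault(), IsBounded.BOUNDED);
    try {
      output.setCoder(original.getDefaultOutputCoder(input, output));
    } catch (CannotProvideCoderException e) {
      throw new IllegalArgumentException("Unable to infer a coder for the output.", e);
    }
    return output;
  }
}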

@tgroh (Member, Author) commented Apr 20, 2016

R: @amitsela
CC: @kennknowles

@tgroh force-pushed the spark_override_create branch from 78afdb3 to f98addc on April 20, 2016, 16:07.
import org.apache.beam.sdk.values.PCollection.IsBounded;
import org.apache.beam.sdk.values.PInput;

public class SinglePrimitiveOutputPTransform<T> extends PTransform<PInput, PCollection<T>> {
Member commented:

Why the name change?

tgroh (Member, Author) replied:

?

This could be a more specific override (it probably should be, to facilitate quick removal), but as written it is a relatively general override.

Member commented:

I'm actually a bit confused as to how/whether this works. The translator is expecting Create.Values.

I think the two options are:

  1. Wait until the Spark runner supports Read.
  2. Override Create.Values to a Spark-specific clone of it as here, but alter the translator to translate the new class.

Am I missing something?

@amitsela (Member) commented Apr 20, 2016

If the objective is to allow existing pipeline implementations with Create.Values to construct the pipeline with Read instead, I guess that SinglePrimitiveOutputPTransform should be translated to the runner's create() instead of Create.Values.

@amitsela (Member) commented Apr 20, 2016

It seems as if https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java#L369 still gets Create.Values as the transform... or that's actually the entry point to translation.
I'm not completely sure this override applies; I'll take a deeper look.

@tgroh (Member, Author) commented Apr 20, 2016

Using super.apply(Override, PInput), as opposed to input.apply(PTransform), causes the Pipeline to replace the contents of the Create.Values apply method with the contents of the override, while keeping the transform recorded in the TransformTreeNode representing this application unchanged.
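
A minimal sketch of that runner-side hook, assuming the pre-first-stable-release PipelineRunner.apply(PTransform, InputT) override point inside a PipelineRunner subclass such as the Spark runner; the actual change in this PR may differ in names and in how it handles generics:

// Sketch only: the runner intercepts Create.Values and hands a replacement to
// super.apply(...), which expands the replacement while the TransformTreeNode
// still records the original Create.Values for the translator to match on.
@Override
public <OutputT extends POutput, InputT extends PInput> OutputT apply(
    PTransform<InputT, OutputT> transform, InputT input) {
  if (transform instanceof Create.Values) {
    @SuppressWarnings({"rawtypes", "unchecked"})
    PTransform<InputT, OutputT> replacement =
        (PTransform) new SinglePrimitiveOutputPTransform((PTransform) transform);
    return super.apply(replacement, input);
  }
  return super.apply(transform, input);
}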

@kennknowles (Member) commented:

Ah, that is excellent. It should also be done for the GroupByKey override.

@kennknowles (Member) commented:

(and it is also very confusing, hence the work to remove this code path)

@amitsela (Member) commented:

Oh, OK. Awesome.
So is Create.Values being deprecated? And is this a "transition" solution?

@tgroh (Member, Author) commented Apr 20, 2016

Create.Values is being removed from the list of model-primitive transforms and replaced with an implementation based on Read.Bounded.

This is a transition solution: it allows the implementation of Create.Values to remain a primitive that the SparkPipelineRunner supports, while allowing the default implementation of Create to move to a composite built on top of Read.Bounded (in #183).

When the SparkPipelineRunner supports the Read primitive, this override should be removed, and we can get rid of the create translation in the Spark TransformTranslator.
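
To make the translator point concrete: runner translators of that era looked up an evaluator keyed by the transform's class, which is why the graph node must keep reporting Create.Values until a Read evaluator exists. The snippet below is purely illustrative; the interface, registry, and stand-in class are hypothetical, not the actual Spark TransformTranslator API.

import java.util.HashMap;
import java.util.Map;

// Purely illustrative: class-keyed dispatch in the style runner translators used
// at the time. Every name below is a hypothetical stand-in, not real Spark code.
public class TranslatorDispatchSketch {

  // Stand-in for a per-transform evaluator.
  interface TransformEvaluator {
    void evaluate(String transformName);
  }

  // Stand-in for Create.Values, used only to key the registry in this sketch.
  static class CreateValuesStandIn {}

  private static final Map<Class<?>, TransformEvaluator> EVALUATORS = new HashMap<>();

  static {
    // Keyed by the class recorded in the graph node; because the override keeps
    // the node reporting Create.Values, this entry keeps matching. Once the
    // runner supports Read, a Read.Bounded entry would take its place.
    EVALUATORS.put(CreateValuesStandIn.class,
        name -> System.out.println("translating " + name + " via the runner's create()"));
  }

  static TransformEvaluator evaluatorFor(Class<?> transformClass) {
    TransformEvaluator evaluator = EVALUATORS.get(transformClass);
    if (evaluator == null) {
      throw new UnsupportedOperationException("No evaluator registered for " + transformClass);
    }
    return evaluator;
  }

  public static void main(String[] args) {
    evaluatorFor(CreateValuesStandIn.class).evaluate("Create.Values");
  }
}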

@kennknowles (Member) commented:

Yea, the primitive transforms are moving to the list here. We've been altering the SDK to match, while adding overrides to the runners so there is no behavioral change.

@amitsela (Member) commented:

Yep, I remember now :)
So +1 from me.
BTW @tgroh, will the behaviour you described for super.apply(Override, PInput) only work for PTransform<PInput, PCollection<T>>, or also for transformations like GroupByKey?

@tgroh (Member, Author) commented Apr 20, 2016

It works for arbitrary PTransforms.

@dhalperi (Contributor) commented:

@tgroh looks like you have LGTMs from everyone? I am happy to play the role of mergebot if you need a committer in order to unblock you on #183

@asfgit closed this in de601a8 on Apr 20, 2016.