Make beam-examples run with spark runner #533
R: @amitsela How was the spark-core library provided in the previous tests?
opts.setRunner(SparkRunner.class);
Pipeline pipeline = Pipeline.create(opts);
public void testE2ETfIdfSpark() throws Exception {
SparkPipelineOptions options = PipelineOptionsFactory.as(SparkPipelineOptions.class);
Could you use PipelineOptionsFactory.fromArgs to be runner agnostic? Is there a benefit in that? Maybe apply the same to all runners?
I would still duplicate a WordCount copy into the Spark runner like I did in #539 because it's widely used in the runner's unit tests. This could be resolved by removing the provided scope on Spark dependencies from the Spark runner, but I don't think that's a good idea. Looping in @jbonofre, WDYT? This could make the Spark runner JAR very heavy, and what about different Spark distributions on clusters?
add R: @davorbonaci
@peihe anything new here? #539 is passing tests now, but like you said, it doesn't eliminate code duplication. I don't see this working if the runner doesn't have a
Pinging @jbonofre: from your experience with customers, is Spark usually provided?
While my point of view on things is that of a Spark (+YARN) cluster, I'm starting to get the feeling that there is a lot of interest in "out-of-the-box" packaging. Let me raise that on the mailing list to get people's thoughts on this, and I might change the build to either use compile scope or use profiles or something.