Skip to content

Conversation

@echauchot
Copy link
Contributor

@echauchot echauchot commented May 12, 2017

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify.
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

R: @dpmills @dhalperi
CC: @stasl @aviemzur @aljoscha because we discussed NexMark together :)
CC: @mshields822 I know you don not work on it anymore, but you might be interested :)
CC: @ssisk for reflexion for IT tests

This is a port of the NexMark queries to Beam, to be used as integration tests.
This can also be used as A-B testing (no-regression or performance comparison between 2 versions of the same engine or of the same runner)

This a continuation of the previous PRs (#99 and #366) from Mark Shields.
The code has changed quite a bit: some queries have changed to use new Beam APIs and there where some big refactorings. More important, we can now run all the queries in all the runners. Nevertheless, there are still some open issues in Nexmark (https://github.com/iemejia/beam/issues) and in Beam upstream (see issue links in https://issues.apache.org/jira/browse/BEAM-160)

Here is a doc that present NexMark components and pseudo code of the queries to ease the review : https://drive.google.com/open?id=1VgnGiVu8vSfm7Et-xAtQYv0PlEpqeyfmhpQUNPmWRJs

Everything needed to launch the queries is in the Readme. There is also a support matrix towards the runners.

Please do not squash commits because there are several authors Mark, Ismaël and I.

Good review :) !

@coveralls
Copy link

Coverage Status

Coverage decreased (-2.07%) to 68.13% when pulling 05698df on iemejia:BEAM-160-nexmark into d6ac39a on apache:master.

@jbonofre
Copy link
Member

R: @jbonofre

@dhalperi
Copy link
Contributor

R: -@dhalperi

return options.isStreaming();
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these hard-coded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option allows the user to specify if we wants to run the pipelines in streaming mode or in batch mode. If he sets streaming=true in the options, then the first step of the pipelines will be
Read.from(new UnboundedEventSource) otherwise it will be Read.from(new BoundedEventSource)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the unclear comment; I was referring to the coresPerWorker and maxNumWorkers() functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO I think they are leftovers: coresPerWorker is never used, so I will remove it. And maxNumWorkers() is only used in query 10 (write event logs to Google Cloud Storage) to define the number of shards.

I will start a discussion on the ML about whether we should keep query 10 like this (with Google specifics as the aim was to test a specific Google customer use case) or use a more generic (Beam filesystems) data storage technology. If so, the configuration parameter is likely to disappear also. But this refactoring of query 10 does not block the PR I think. It could be done afterwards.

*/
private void invokeBuilderForPublishOnlyPipeline(PipelineBuilder<NexmarkOptions> builder) {
builder.build(options);
// throw new UnsupportedOperationException(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete commented code if it's not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes .The PR in general lacks a big cleaning :)

* If monitoring, wait until the publisher pipeline has run long enough to establish
* a backlog on the Pubsub topic. Otherwise, return immediately.
*/
private void waitForPublisherPreload() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider just deleting this along with the other preload-related functionality, since it doesn't work in this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only left conf.preloadSeconds that is used to count initialization time in the time before cancelling the job.

sinkEventsToAvro(source);
}

// Special hacks for Query 10 (big logger).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add more comments about what the hacks are and why they are needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

* Setup pipeline with codes and some other options.
*/
public static void setupPipeline(CoderStrategy coderStrategy, Pipeline p) {
//TODO Ismael check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new direct runner doesn't support disabling randomization at all, so this can be deleted. Do the tests pass on DirectRunner without it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all the queries and and all the unit tests pass in direct runner in batch and in streaming mode even if randomization is not disabled so I can safely remove this piece of code.

/**
* Return a transform to write given number of bytes to durable store on every record.
*/
public static <T> ParDo.SingleOutput<T, T> diskBusy(String name, final long bytes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can use the State API now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, forgot that one, thanks!

public void encode(Event value, OutputStream outStream)
throws CoderException, IOException {
if (value.newPerson != null) {
INT_CODER.encode(0, outStream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the tag values in an enum

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed that one also, thanks!

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.6%) to 68.55% when pulling 403f1ec on iemejia:BEAM-160-nexmark into d6ac39a on apache:master.

@asfbot
Copy link

asfbot commented May 30, 2017

--none--

@asfbot
Copy link

asfbot commented May 31, 2017

Build finished.
--none--

@echauchot
Copy link
Contributor Author

@dpmills thanks for your comments, I addressed all of them, PTAL.

@asfbot
Copy link

asfbot commented May 31, 2017

Build finished.
--none--

@asfbot
Copy link

asfbot commented May 31, 2017

Build finished.
--none--

@dpmills
Copy link
Contributor

dpmills commented May 31, 2017

LGTM

I'm not a committer however, so over to @dhalperi for comitting.

@dhalperi
Copy link
Contributor

I don't know how to merge this. Too many commits, too many authors, request not to squash.

This is going to really pollute history if merged as-is, but that is quite possibly by design.

I'll let @iemejia do it, he's a committer :)

@dhalperi
Copy link
Contributor

Additionally, I believe there are as-yet-unresolved issues about module names and paths of code.

R: @davorbonaci for some input.

@iemejia
Copy link
Member

iemejia commented May 31, 2017

@dhalperi I will see with @echauchot to reduce the number of commits, I think we can still squash some of the latest commits (that addressed the review comments + did the latest fixes before the PR), anyway you have to consider that this is a long ongoing work with three different authors so it is normal to have a reasonable number of commits, I won't consider this pollution because there are not intentionally tiny commits, most of them are like this by design, but we will see still how to reduce them.

@iemejia
Copy link
Member

iemejia commented May 31, 2017

For reference @davorbonaci the last point Dan mentions is the paths, currently Nexmark uses the original path that Mark had for the PR: integration/java/nexmark This introduces a new directory structure for integration tests on Beam, this can make sense I think and even be used to have also part of the things @ssisk is working on.
In a brief discussion with Dan we also talked about the possibility of putting this into the examples directory but after discussing with @echauchot we considered that it didn't make sense either because Nexmark is not just a example, it has clearly a different scope.
So in the end we need you to confirm us if you are ok with this new path or to propose us some alternative so we can refactor it if needed.

@echauchot
Copy link
Contributor Author

@dhalperi we managed to reduce the number of commits from 55 to 27: we wanted to keep some big refactorings or some big subjects (like queries fixing/improving) separated. I just pushed the branch. WDYT?

@coveralls
Copy link

Coverage Status

Coverage decreased (-2.1%) to 68.548% when pulling 6232a61 on iemejia:BEAM-160-nexmark into 217f085 on apache:master.

Copy link
Member

@davorbonaci davorbonaci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! Left a few comments.

@@ -0,0 +1,282 @@
<!--
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sdks/java/nexmark?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have it in the sdk! I am in favor of it. The only question I ask myself is: Nexmark is more a benchmarking suite, not an API or framework. Is it not a problem to have a piece of software that is not used to build software in the sdk module?

under the License.
-->

# Running NexMark on Beam on Flink on Google Compute Platform
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to the website; contribution section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out! Actually it is a leftover, I forgot that file. I propose to simply remove it because we have not tested to run Nexmark that way since Mark did; more important, I prefer communicating about running NexMark on a standard Flink cluster similarly to what is described in the readme about Spark cluster.

under the License.
-->

# NEXMark integration suite
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

website?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you mean having this content in both beam website and the readme, right?

Copy link
Contributor Author

@echauchot echauchot Jun 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, what I propose is to add a new section to the website with a short introduction to Nexmark similar to the talk we gave at the ApacheCon and a reference the Readme (once it is merged) to avoid duplication of information.
Do you find it ok?

<relativePath>../pom.xml</relativePath>
</parent>

<artifactId>beam-integration-java-nexmark</artifactId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beam-sdks-java-nexmark


<parent>
<groupId>org.apache.beam</groupId>
<artifactId>beam-integration-java-parent</artifactId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beam-sdks-java-parent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 modulo my interrogation in your first comment

/**
* Nexmark.
*/
package org.apache.beam.integration.nexmark;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please double-check the output of the aggregated javadoc. Nexmark packages shouldn't be there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.beam.integration.nexmark.queries;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

org.apache.beam.sdk.nexmark.XXX (across the board)

pom.xml Outdated
<module>sdks</module>
<module>runners</module>
<module>examples</module>
<module>integration</module>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert?

@@ -0,0 +1,51 @@
<?xml version="1.0" encoding="UTF-8"?>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert?

@@ -0,0 +1,37 @@
<?xml version="1.0" encoding="UTF-8"?>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert?

@echauchot
Copy link
Contributor Author

Thanks @davorbonaci for your comments ! I'll be in meetings today, but I'll address them ASAP.

@coveralls
Copy link

Coverage Status

Coverage decreased (-2.2%) to 68.524% when pulling bd0bac5 on iemejia:BEAM-160-nexmark into 217f085 on apache:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.008%) to 70.674% when pulling 45206f6 on iemejia:BEAM-160-nexmark into 217f085 on apache:master.

@echauchot
Copy link
Contributor Author

echauchot commented Jun 16, 2017

Davor, thanks for your review, I have addressed your comments, see the last 4 commits. I have also rebased on master. I will squash all the review comments into Fix general issues commit when you find the PR is ok.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.006%) to 70.67% when pulling 85d743f on iemejia:BEAM-160-nexmark into c528fb2 on apache:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.01%) to 70.687% when pulling 85d743f on iemejia:BEAM-160-nexmark into c528fb2 on apache:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 70.695% when pulling 85d743f on iemejia:BEAM-160-nexmark into c528fb2 on apache:master.

@reuvenlax
Copy link
Contributor

Is this PR blocked on something right now?

@echauchot
Copy link
Contributor Author

@jbonofre moved to 2.2.0-SNAPSHOT and rebased on master

@reuvenlax
Copy link
Contributor

@davorbonaci, any more concerns with this PR?

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.01%) to 69.961% when pulling b3d1f81 on iemejia:BEAM-160-nexmark into 07e8cd5 on apache:master.

Copy link
Member

@davorbonaci davorbonaci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sorry for my delay here. At this point, I think this should just be merged. I do think there are open questions such as whether we release this module to Maven Central, does the code belong to the regular jar or test-jar, etc.

Regardless of those questions, please move forward and we'll figure things out, as needed.

@echauchot
Copy link
Contributor Author

echauchot commented Aug 23, 2017

I'm very glad to see nexmark move forward, thanks! I'm doing a final rebase of some commits.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.007%) to 69.974% when pulling 4a6c9f0 on iemejia:BEAM-160-nexmark into 9b175cc on apache:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.01%) to 69.971% when pulling 6d068f7 on iemejia:BEAM-160-nexmark into 9b175cc on apache:master.

@echauchot echauchot force-pushed the BEAM-160-nexmark branch 2 times, most recently from 75a4ae5 to 7aa698c Compare August 23, 2017 13:11
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.01%) to 69.967% when pulling 7aa698c on iemejia:BEAM-160-nexmark into 9b175cc on apache:master.

…ation code

- Use state API in NexmarkUtils.diskBusy()
- Remove commented code for direct runner randomization disabling: direct runner no more allows disabling randomization and queries and UT pass
Clean pom, exclude nexmark packages from aggregated javadoc, put spark logs in WARN

Update execution matrix in README: Flink termination of streaming pipelines is now ok as far as Nexmark is concerned

Remove how to run Nexmark on Flink on Google Cloud Platform from README

Update command lines in README after moving nexmark from integration to sdks module
@coveralls
Copy link

Coverage Status

Coverage increased (+0.003%) to 69.977% when pulling d83bd07 on iemejia:BEAM-160-nexmark into f0ce31b on apache:master.

@asfgit asfgit closed this in 64ff21f Aug 23, 2017
@mshields822
Copy link
Contributor

Thanks for all your great work on this Etienne and Ismaël! Happy to see it closed at last.

Best,
-Mark

@iemejia
Copy link
Member

iemejia commented Aug 23, 2017

No, the big credit is up to you @mshields822 this would not have been possible at all without your work, we just pushed it a bit to the last line, you did the long journey. Thanks for the support and of course for your nice creation.

@echauchot
Copy link
Contributor Author

I could not have said it better :)

@iemejia iemejia deleted the BEAM-160-nexmark branch August 24, 2017 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants