Skip to content

MINOR: refactor streams system test class hierachy#2392

Closed
mjsax wants to merge 2 commits intoapache:trunkfrom
mjsax:minor-system-test-rework
Closed

MINOR: refactor streams system test class hierachy#2392
mjsax wants to merge 2 commits intoapache:trunkfrom
mjsax:minor-system-test-rework

Conversation

@mjsax
Copy link
Copy Markdown
Member

@mjsax mjsax commented Jan 18, 2017

No description provided.

@mjsax
Copy link
Copy Markdown
Member Author

mjsax commented Jan 18, 2017

@enothereska @dguy @guozhangwang

Reworking in preparation to add new system tests.

Also for 0.10.2

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 18, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/965/
Test PASSed (JDK 8 and Scala 2.11).

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 18, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/963/
Test PASSed (JDK 8 and Scala 2.12).

@guozhangwang
Copy link
Copy Markdown
Contributor

guozhangwang commented Jan 18, 2017

LGTM overall, could you trigger one simplebenchmark test locally to make sure it works properly @mjsax ?

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 18, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/963/
Test FAILed (JDK 7 and Scala 2.10).

@mjsax
Copy link
Copy Markdown
Member Author

mjsax commented Jan 18, 2017

I did test with streams_shutdown_deadlock_test.py and it works there. (Also with the new test to be added for client-broker backwards compatibility).

For streams_simple_benchmark_test.py I found one problem that I fixed. But the benchmark never complete locally but crashes and times out. As it starts up, I guess it is working fine.

The error I get in stderr is (seems to be unrelated to the changes though -- WDYT?):

Exception in thread "StreamThread-3" org.apache.kafka.streams.errors.StreamsException: stream-thread [StreamThread-3] Failed to rebalance
        at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:617)
        at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:369)
Caused by: java.lang.NullPointerException
        at org.apache.kafka.streams.processor.internals.ProcessorStateManager.register(ProcessorStateManager.java:133)
        at org.apache.kafka.streams.processor.internals.AbstractProcessorContext.register(AbstractProcessorContext.java:99)
        at org.apache.kafka.streams.state.internals.RocksDBStore.init(RocksDBStore.java:160)
        at org.apache.kafka.streams.state.internals.ChangeLoggingKeyValueBytesStore.init(ChangeLoggingKeyValueBytesStore.java:39)
        at org.apache.kafka.streams.state.internals.ChangeLoggingKeyValueStore.init(ChangeLoggingKeyValueStore.java:60)
        at org.apache.kafka.streams.state.internals.MeteredKeyValueStore$7.run(MeteredKeyValueStore.java:100)
        at org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:188)
        at org.apache.kafka.streams.state.internals.MeteredKeyValueStore.init(MeteredKeyValueStore.java:131)
        at org.apache.kafka.streams.processor.internals.AbstractTask.initializeStateStores(AbstractTask.java:86)
        at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:141)
        at org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:839)
        at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1212)
        at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1185)
        at org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:942)
        at org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
        at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
        at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:230)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:336)
        at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:300)
        at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:261)
        at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1025)
        at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:990)
        at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:587)
        ... 1 more
```

@mjsax mjsax force-pushed the minor-system-test-rework branch from 1e70ba7 to d36791c Compare January 18, 2017 06:26
@asfbot
Copy link
Copy Markdown

asfbot commented Jan 18, 2017

Build finished. 2890 tests run, 0 skipped, 1 failed.
Test FAILed (JDK 8 and Scala 2.12).

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 18, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/976/
Test FAILed (JDK 8 and Scala 2.12).

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 18, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/978/
Test PASSed (JDK 8 and Scala 2.11).

@asfbot
Copy link
Copy Markdown

asfbot commented Jan 18, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.10/976/
Test FAILed (JDK 7 and Scala 2.10).

@enothereska
Copy link
Copy Markdown
Contributor

Given that this changes the python code, it'd be good to trigger all system tests, including the smoke one and bounce one before merging.

@enothereska
Copy link
Copy Markdown
Contributor

Also the benchmark test has always ran nightly, that needs to be part of tests to be run (as @mjsax has already done, thanks)

@enothereska
Copy link
Copy Markdown
Contributor

@guozhangwang in an ideal world, this PR and PR: https://github.com/apache/kafka/pull/2391/files should go together.

@guozhangwang
Copy link
Copy Markdown
Contributor

@enothereska System tests will take about 5 hours, and given that we are going to have nightly builds in another 10 hour, I'd say just watch the upcoming system test after merging this.

asfgit pushed a commit that referenced this pull request Jan 18, 2017
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Eno Thereska, Guozhang Wang

Closes #2392 from mjsax/minor-system-test-rework

(cherry picked from commit d8a7756)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
@guozhangwang
Copy link
Copy Markdown
Contributor

LGTM. Merged to trunk and 0.10.2.

@asfgit asfgit closed this in d8a7756 Jan 18, 2017
@mjsax mjsax deleted the minor-system-test-rework branch January 18, 2017 22:05
@enothereska
Copy link
Copy Markdown
Contributor

enothereska commented Jan 19, 2017

@guozhangwang I think this is really bad. System tests for streams take just 5 mins/test, you don't have to run all the system tests, just streams.

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Jan 19, 2017

I agree with @enothereska, a change like this should not be merged before running the Streams system tests via the branch builder. It doesn't take long since there are a small number of Streams tests.

For Core, whenever we change shared security code, we run all the tests before merging even though they take several hours as there is a lot of value in catching issues before the code is merged.

if len(self.pids(node)) == 0:
raise RuntimeError("No process ids recorded")

def collect_data(self, node):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this removed?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This did not happen intentionally... Will open a PR to fix it.

@guozhangwang
Copy link
Copy Markdown
Contributor

@enothereska @ijuma My apologies, I mis-understood Eno's comment before and was careless for merging the PR.

with node.account.monitor_log(self.STDOUT_FILE) as monitor:
node.account.ssh(self.start_cmd(node))
monitor.wait_until('StreamsSmokeTest instance started', timeout_sec=15, err_msg="Never saw message indicating StreamsSmokeTest finished startup on " + str(node.account))
monitor.wait_until('StreamsTest instance started', timeout_sec=60, err_msg="Never saw message indicating StreamsTest finished startup on " + str(node.account))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably doesn't need to be increased since they start fast.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

soenkeliebau pushed a commit to soenkeliebau/kafka that referenced this pull request Feb 7, 2017
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Eno Thereska, Guozhang Wang

Closes apache#2392 from mjsax/minor-system-test-rework
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants