
MINOR: add test for StreamsSmokeTestDriver#6231

Merged
bbejeck merged 14 commits into apache:trunk from vvcephei:test-streams-smoketest
Feb 15, 2019

Conversation

@vvcephei
Contributor

@vvcephei vvcephei commented Feb 4, 2019

Also, add more output for debuggability

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@vvcephei
Contributor Author

vvcephei commented Feb 4, 2019

Hi @bbejeck @mjsax @ableegoldman @guozhangwang ,

Please take a look at this when you get the chance.

The primary concern is adding the test. It will help us verify changes to the smoke test (such as adding suppression).

I've also added some extra output to the smoke test stdout, which will hopefully aid us in diagnosing the flaky tests.

Finally, I bundled in some cleanup. It was my intention to do that in a separate PR, but it wound up getting smashed together during refactoring.

Please let me know if you'd prefer for me to pull any of these out into a separate request.

Thanks,
-John

* Runs an in-memory, "embedded" Kafka cluster with 1 ZooKeeper instance and supplied number of Kafka brokers.
*/
public class EmbeddedKafkaCluster extends ExternalResource {
public class EmbeddedKafkaCluster extends ExternalResource implements AutoCloseable {
Contributor Author

Allows using the embedded kafka with try-with-resources.
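To illustrate the point of the change: once a class implements AutoCloseable, a test can scope it with try-with-resources and get cleanup even on exceptions. A minimal sketch, where MiniCluster is a hypothetical stand-in for EmbeddedKafkaCluster:

```java
// Sketch: a resource that, like the modified EmbeddedKafkaCluster,
// implements AutoCloseable so tests can use try-with-resources.
class MiniCluster implements AutoCloseable {
    private boolean running;

    void start() { running = true; }

    boolean isRunning() { return running; }

    @Override
    public void close() { running = false; } // stop brokers, ZooKeeper, etc.
}

public class TryWithResourcesExample {
    public static void main(String[] args) {
        MiniCluster observed;
        try (MiniCluster cluster = new MiniCluster()) {
            cluster.start();
            observed = cluster;
            System.out.println("running=" + cluster.isRunning()); // running=true
        } // close() is invoked automatically here, even if the body throws
        System.out.println("running=" + observed.isRunning()); // running=false
    }
}
```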

public class SmokeTestClient extends SmokeTestUtil {

private final Properties streamsProperties;
private final String name;
Contributor Author

adds a name to the streams instance to help correlate logs and stdout events

e.printStackTrace();
}
streams.setUncaughtExceptionHandler((t, e) -> {
System.out.println(name + ": SMOKE-TEST-CLIENT-EXCEPTION");
Contributor Author

add new outputs with the name, but avoid touching existing lines that are used in the system tests.

Member

To what extent are the existing lines used? How difficult would it be to update the tests to avoid duplicate output? For example, if we only grep for the pattern, adding a prefix name should not require any updates to system test code.

Contributor Author

Yeah, I've been thinking about this... I agree I should either confirm your suspicion or just ditch the new lines.

Member

I agree. I think that adding a prefix with a space should be fine for the test.

Contributor Author

👍 I've removed the unprefixed lines, and am running the system tests to see.

import static org.apache.kafka.streams.tests.SmokeTestDriver.generate;
import static org.apache.kafka.streams.tests.SmokeTestDriver.verify;

public class SmokeTestDriverTest {
Contributor Author

Here's the new test.

@mjsax mjsax added the streams label Feb 5, 2019

final KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
final List<TopicPartition> partitions = getAllPartitions(consumer, "echo", "max", "min", "dif", "sum", "cnt", "avg", "wcnt", "tagg");
final KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props);
final List<TopicPartition> partitions = getAllPartitions(consumer, "data", "echo", "max", "min", "dif", "sum", "cnt", "avg", "wcnt", "tagg");
Member

Why do we need to verify the input topic?

Contributor Author

I was curious if there was dirty state from a prior run (there was). Also, this enables us to print the input events for the same key so we can analytically verify where the output went wrong.


final int recordsGenerated = allData.size() * maxRecordsPerKey;
int recordsProcessed = 0;
final Map<String, AtomicInteger> processed = Stream.of("data", "echo", "max", "min", "dif", "sum", "cnt", "avg", "wcnt", "tagg").collect(Collectors.toMap(t -> t, t -> new AtomicInteger(0)));
Member

Nit: line too long

}

private static boolean verifyMin(final Map<String, Integer> map, final Map<String, Set<Integer>> allData, final boolean print) {
private static <V> void addEvent(final String key, final HashMap<String, LinkedList<V>> eventsMap, final V value) {
Member

Don't think we need this method. Use Map#computeIfAbsent() instead?
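The suggestion here is that Map#computeIfAbsent can subsume the dedicated addEvent helper: it creates the per-key list on first use and returns it, so appending becomes a one-liner. A sketch (the "key-1" key and Long values are illustrative):

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

public class ComputeIfAbsentExample {
    public static void main(String[] args) {
        Map<String, LinkedList<Long>> events = new HashMap<>();

        // Instead of a dedicated addEvent(key, map, value) helper,
        // computeIfAbsent installs an empty list for new keys, then we append.
        events.computeIfAbsent("key-1", k -> new LinkedList<>()).add(42L);
        events.computeIfAbsent("key-1", k -> new LinkedList<>()).add(43L);

        System.out.println(events.get("key-1")); // [42, 43]
    }
}
```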

final HashMap<String, LinkedList<Long>> cntEvents = new HashMap<>();
final HashMap<String, LinkedList<Double>> avgEvents = new HashMap<>();
final HashMap<String, LinkedList<Long>> wcntEvents = new HashMap<>();
final HashMap<String, LinkedList<Long>> taggEvents = new HashMap<>();
Member

Why do we need all output events? Also, if we store all output events, why do we still need the current ones from below? Seems redundant?

Contributor Author

Also for debugging. If a key has the wrong value, we can compare the input and output events to see where it went wrong.

Yes, it's redundant with the last result below, I can take it out.

}

@Test
public void shouldWorkWithRebalance() throws InterruptedException, IOException {
Member

nit: simplify to throw Exception

Member

I agree, just throw Exception here.

Thread.sleep(1000);

// add a new client
final SmokeTestClient smokeTestClient = new SmokeTestClient("streams" + numClientsCreated++);
Member

nit: add dash -> "streams-" + numClientsCreated++

}
for (final SmokeTestClient client : clients) {
client.close();
}
Member

Why not both loops?

Contributor Author

Sorry, I don't follow. The first one is async and tells all the instances to stop. The second one blocks until they do stop. Otherwise, we'd be waiting for them to stop one at a time.

Member

Oops. That comment is weird. I meant: why do we need both loops?
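The two-loop shutdown discussed above can be sketched as follows. FakeClient and its close()/awaitClose() methods are hypothetical stand-ins for SmokeTestClient; the point is that kicking off all shutdowns before waiting makes total wall-clock time roughly the slowest shutdown, not the sum:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for SmokeTestClient: close() only *requests* shutdown;
// awaitClose() blocks until the background work has finished.
class FakeClient {
    private final CountDownLatch stopped = new CountDownLatch(1);

    void close() {
        Thread worker = new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) { } // simulate shutdown work
            stopped.countDown();
        });
        worker.start(); // returns immediately; shutdown proceeds in the background
    }

    void awaitClose() throws InterruptedException {
        stopped.await();
    }
}

public class TwoLoopShutdown {
    public static void main(String[] args) throws InterruptedException {
        List<FakeClient> clients = new ArrayList<>();
        for (int i = 0; i < 3; i++) clients.add(new FakeClient());

        long start = System.nanoTime();
        for (FakeClient c : clients) c.close();      // loop 1: start all shutdowns
        for (FakeClient c : clients) c.awaitClose(); // loop 2: wait for all of them
        long ms = (System.nanoTime() - start) / 1_000_000;

        // The shutdowns overlap, so this prints roughly 200ms, not ~600ms.
        System.out.println("all stopped in ~" + ms + "ms");
    }
}
```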

@vvcephei
Contributor Author

vvcephei commented Feb 5, 2019

Question for the reviewers:

We currently base our expectations on the data that we produced (i.e., the producer encodes the expected values in the key itself), and this is what we compare the results against.

It seems more resilient to instead consume the input topic and compute the expected results analytically (by finding the max, min, etc.). Doing so would render us blind to input events that got dropped (or otherwise corrupted) by the broker, though.

Alternatively, now that we're printing the input events and output events when there's a failure, we can more easily reason about what actually happened. So maybe we should instead just take the step in this PR and then decide what to do from there?

What do you think?

Member

@bbejeck bbejeck left a comment


Thanks @vvcephei, left a couple of minor comments; otherwise looks good.



final int maxRecordsPerKey) {
return generate(kafka, numKeys, maxRecordsPerKey, true);
}

Member

Why did we remove this?

Contributor Author

Just to simplify the paths. It was used in exactly one place. Might as well just pass the flag from there.

@mjsax
Member

mjsax commented Feb 5, 2019

It seems more resilient to instead consume the input topic and compute the expected results analytically (by finding the max, min, etc.). Doing so would render us blind to input events that got dropped (or otherwise corrupted) by the broker, though.

I am fine with this approach.

Alternatively, now that we're printing the input events and output events when there's a failure, we can more easily reason about what actually happened.

I am actually a little concerned with this, because we pipe a lot of data through... Not sure if it is feasible (or helpful) to compare all the raw data manually in case of failure. It's like searching for a needle in a haystack.

@vvcephei
Contributor Author

vvcephei commented Feb 5, 2019

@mjsax :

I am actually a little concerned with this, because we pipe a lot of data through... Not sure if it is feasible (or helpful) to compare all the raw data manually in case of failure. It's like searching for a needle in a haystack.

Note, it's only the raw data for the relevant key.
Right now, we get a message like "key='6-1006', expected min=7, got min=5".
It doesn't seem infeasible to search a list of 1000 numbers to determine if 5 is really in the list or not. Seeing the full input/output, at least in the short-term, can help us answer the question that we have not been able to answer so far: "is Streams computing the wrong result, or does the test have the wrong expectation?"

Is there a better way to approach this that I'm not seeing?

@mjsax
Member

mjsax commented Feb 5, 2019

It doesn't seem infeasible to search a list of 1000 numbers

Maybe. Not sure.

Is there a better way to approach this that I'm not seeing?

Don't think so. Was just raising a question. Have no better suggestion.

Contributor Author

@vvcephei vvcephei left a comment


@mjsax @bbejeck ,

Thanks for your reviews!

I've updated the PR in response to the request to refactor the driver for less redundancy. I also addressed your comments.

Thanks,
-John

streams4.close();

System.out.println("shutdown");
}
Contributor Author

Removed the main method. It actually didn't work, and it wasted a bunch of my time debugging it. Its function is now served by the integration test.

if (retry++ > MAX_RECORD_EMPTY_RETRIES) {
break;
}
break;
Contributor Author

The prior logic would keep polling for up to 30 seconds after we pulled as many records as we generated, as long as the verification kept failing.

If we pull more than we generated, it's already a failure, so we might as well fail fast. If you agree, then the only remaining logic was to poll for 30 seconds in .5 second increments. Now that we have long polling in the consumer, we might as well just poll for up to 30 seconds once.

Member

Just one minor point, if we don't have EOS enabled, we could end up pulling duplicates in some tests like the StreamsUpgradeTest::upgrade_downgrade_brokers

Contributor Author

This is a good point.

I have noticed in running the test locally that the long-poll does seem slower than our prior code here, so maybe I'll go ahead and revert this part of the diff.
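The fail-fast polling logic under discussion can be sketched in isolation. Here drain() and the Iterator-of-batch-sizes poll source are hypothetical simplifications of the consumer loop: keep polling until we have consumed at least as many records as were generated (surplus is already a failure, so there is no point polling further), and give up after too many consecutive empty polls:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FailFastPolling {
    // Hypothetical poll source standing in for consumer.poll(Duration): each
    // call yields the number of records fetched (0 = an empty poll).
    static int drain(Iterator<Integer> polls, int recordsGenerated, int maxEmptyRetries) {
        int processed = 0;
        int emptyRetries = 0;
        while (processed < recordsGenerated) {
            int batch = polls.hasNext() ? polls.next() : 0;
            if (batch == 0) {
                if (emptyRetries++ > maxEmptyRetries) break; // give up: data never arrived
            } else {
                emptyRetries = 0;
                processed += batch;
            }
        }
        // Fail fast: once we have at least as many records as we produced,
        // stop polling; any surplus already means the verification will fail.
        return processed;
    }

    public static void main(String[] args) {
        List<Integer> polls = Arrays.asList(4, 0, 3, 3);
        int processed = drain(polls.iterator(), 10, 2);
        System.out.println("processed=" + processed); // processed=10
    }
}
```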

break;
default:
System.out.println("unknown topic: " + record.topic());
for (final ConsumerRecord<String, byte[]> record : records) {
Contributor Author

This refactor is just to eliminate redundancies in the code and hopefully make it easier to read.

final KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
final List<TopicPartition> partitions = getAllPartitions(consumer, "echo", "max", "min", "dif", "sum", "cnt", "avg", "wcnt", "tagg");
final KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props);
final String[] topics = {"data", "echo", "max", "min", "dif", "sum", "cnt", "avg", "tagg"};
Contributor Author

note: "wcnt" is removed from the list. We weren't verifying it, and it turns out, we don't produce to it at all. So I guess it's a fossil from a prior refactor.

}

@Test
public void shouldWorkWithRebalance() throws InterruptedException {
Contributor Author

A test you can run to make sure the smoke test driver actually works.

Member

@bbejeck bbejeck left a comment


Thanks for this @vvcephei overall looks good. I just have one minor comment.

Additionally, I'll kick off a branch builder for the streams upgrade test, which is failing at the moment; it will be a good test of whether looking through the results is any easier than with our current approach.

EDIT: I just saw that is the system test you kicked off. So I'll just wait until it's done and look through the results then.


@vvcephei
Contributor Author

vvcephei commented Feb 5, 2019

@bbejeck , the system test job is: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2319/console

result: http://testing.confluent.io/confluent-kafka-branch-builder-system-test-results/?prefix=2019-02-05--001.1549412112--vvcephei--test-streams-smoketest--80bd3f1/

I did notice one failure scroll by, and I had the same thought. I'd definitely welcome your thoughts on seeing the output in context.

@vvcephei
Contributor Author

vvcephei commented Feb 6, 2019

Java 11 build was healthy, but timed out after 3 hours :(

Java8 had one unrelated test failure:

  • kafka.api.SaslSslAdminClientIntegrationTest.testMinimumRequestTimeouts

Retest this, please.

@vvcephei
Contributor Author

vvcephei commented Feb 7, 2019

Java 11 passed: https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2182/
Both java 8 failures were unrelated:
https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/19276/

kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup
kafka.api.AdminClientIntegrationTest.testMinimumRequestTimeouts

@vvcephei
Contributor Author

vvcephei commented Feb 7, 2019

@mjsax I think this PR is ready. Do you mind making another quick pass when you have a chance?

Thanks,
-John

if (!verificationResult.passed()) {
verificationResult = verifyAll(inputs, events);
}
success &= verificationResult.passed();
Member

Should this be |= ?

Contributor Author

no, it's only a success if all the checks pass. So I think it's correct as an "and". Does that seem right?

Member

Ack. I guess I mixed up old code and new code -- also the comment "give it one more try if it's not already passing" is confusing. The interplay between verificationResult.passed() and success seems a little convoluted.
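The semantics under discussion reduce to a small distinction: with &=, a single failed check makes the aggregate false (every check must pass); with |=, a single passed check would make it true. A minimal sketch:

```java
public class AndAggregation {
    public static void main(String[] args) {
        boolean[] checks = { true, true, false, true };

        boolean success = true;
        for (boolean pass : checks) {
            success &= pass; // one failed check makes the whole run a failure
        }
        System.out.println("success=" + success); // success=false

        boolean anyPassed = false;
        for (boolean pass : checks) {
            anyPassed |= pass; // |= would report success if *any* check passed
        }
        System.out.println("anyPassed=" + anyPassed); // anyPassed=true
    }
}
```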

final Number value;
switch (record.topic()) {
case "data": {
value = intSerde.deserializer().deserialize("", record.value());
Member

We should pass in record.topic() instead of "" into deserialize.

if (print) {
System.out.println("verifying min");
case "echo": {
value = intSerde.deserializer().deserialize("", record.value());
Member

This is a repetition of the "data" case -- similar below -- we should put all int/long/double cases etc. together to avoid code duplication using the "case fall-through" pattern.

Contributor Author

I actually did that in my follow-up PR. I'll go ahead and pull it into this one.
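The fall-through pattern suggested above groups topics that share a value type under one deserializer. A sketch, assuming simplified big-endian deserializers in place of the Kafka serdes; the exact topic-to-type grouping here is illustrative, not the driver's actual mapping:

```java
import java.nio.ByteBuffer;

public class FallThroughSwitch {
    // Simplified stand-ins for the Integer/Long deserializers.
    static int asInt(byte[] bytes) { return ByteBuffer.wrap(bytes).getInt(); }
    static long asLong(byte[] bytes) { return ByteBuffer.wrap(bytes).getLong(); }

    static Number deserialize(String topic, byte[] value) {
        switch (topic) {
            // Fall-through groups the topics that share a value type,
            // so each deserializer appears only once.
            case "data":
            case "echo":
            case "max":
            case "min":
            case "dif":
                return asInt(value);
            case "sum":
            case "cnt":
            case "tagg":
                return asLong(value);
            default:
                throw new IllegalArgumentException("unknown topic: " + topic);
        }
    }

    public static void main(String[] args) {
        byte[] intBytes = ByteBuffer.allocate(4).putInt(7).array();
        byte[] longBytes = ByteBuffer.allocate(8).putLong(9L).array();
        System.out.println(deserialize("echo", intBytes)); // 7
        System.out.println(deserialize("cnt", longBytes)); // 9
    }
}
```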

private static VerificationResult verifyAll(final Map<String, Set<Integer>> inputs,
final Map<String, Map<String, LinkedList<Number>>> events) {
final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
final PrintStream resultStream = new PrintStream(byteArrayOutputStream);
Member

Should we use try-with-resource here?

Contributor Author

sure.
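For the record, the suggested change is mechanical: wrapping the PrintStream in try-with-resources guarantees it is flushed and closed even if verification throws part-way through, while the ByteArrayOutputStream buffer remains readable afterwards. A minimal sketch (the message text is illustrative):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CapturedResultStream {
    public static void main(String[] args) {
        final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // try-with-resources flushes and closes the stream even if the
        // verification code inside the block throws.
        try (PrintStream resultStream = new PrintStream(buffer)) {
            resultStream.println("fail: key=6-1006 min expected=7 actual=5");
        }
        // The captured text survives the stream's close().
        System.out.print(buffer.toString());
    }
}
```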

pass = verifyTAgg(resultStream, inputs, events.get("tagg"));
pass &= verify(resultStream, "min", inputs, events, SmokeTestDriver::getMin);
pass &= verify(resultStream, "max", inputs, events, SmokeTestDriver::getMax);
pass &= verify(resultStream, "dif", inputs, events, key -> getMax(key).intValue() - getMin(key).intValue());
Member

Is it a good idea to convert long to int here? Are we sure it's zero risk? Also, is it future-proof if we change the test data to potentially bigger values that don't fit into ints?

Contributor Author

It's not converting long to int. It's reading a Number as an int. Since we populated the Number with an int to begin with, it should be perfectly safe.

pass &= verify(resultStream, "max", inputs, events, SmokeTestDriver::getMax);
pass &= verify(resultStream, "dif", inputs, events, key -> getMax(key).intValue() - getMin(key).intValue());
pass &= verify(resultStream, "sum", inputs, events, SmokeTestDriver::getSum);
pass &= verify(resultStream, "cnt", inputs, events, key1 -> getMax(key1).intValue() - getMin(key1).intValue() + 1L);
Member

As above

if (!expected.equals(actual)) {
resultStream.printf("fail: key=%s %s=%s expected=%s%n\t inputEvents=%s%n\toutputEvents=%s%n",
key,
topicName,
Member

Why use topicName here? Should it not be "failed: key=X actual=A expected=B..." instead of "failed: key=X <topicName>=A expected=B..."?

Contributor Author

Oh, I was just reproducing the existing log message, which says something like "failed: key=X min=A expected=B".

I agree with you. I'll refactor it a bit.

@vvcephei
Contributor Author

vvcephei commented Feb 12, 2019

Addressed the comments and pulled in some additional improvements I made while debugging the suppression logic.

Kicked off another system test run: https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2343/

System tests all passed, except for the upgrade test, which is currently broken:
http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/2019-02-12--001.1550036000--vvcephei--test-streams-smoketest--512ea92/report.html

@vvcephei
Contributor Author

Failures were unrelated broker tests.

Retest this, please.

@mjsax
Member

mjsax commented Feb 13, 2019

https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2338/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testMinimumRequestTimeouts/

java.lang.AssertionError: Expected an exception of type org.apache.kafka.common.errors.TimeoutException; got type org.apache.kafka.common.errors.SslAuthenticationException
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at kafka.utils.TestUtils$.assertFutureExceptionTypeEquals(TestUtils.scala:1435)
	at kafka.api.AdminClientIntegrationTest.testMinimumRequestTimeouts(AdminClientIntegrationTest.scala:1071)

Retest this please.

@vvcephei
Contributor Author

More unrelated core test failures for Java 11:
https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/2344/

02:02:03.455 kafka.admin.ConfigCommandTest > classMethod FAILED
02:02:03.455     java.lang.AssertionError: Found unexpected threads during @AfterClass, allThreads=Set(data-plane-kafka-network-thread-0-ListenerName(PLAINTEXT)-PLAINTEXT-0, SessionTracker, data-plane-kafka-network-thread-1-ListenerName(PLAINTEXT)-PLAINTEXT-1, data-plane-kafka-request-handler-4, data-plane-kafka-network-thread-2-ListenerName(PLAINTEXT)-PLAINTEXT-2, Reference Handler, data-plane-kafka-network-thread-0-ListenerName(PLAINTEXT)-PLAINTEXT-2, ProcessThread(sid:0 cport:40574):, ExpirationReaper-0-Heartbeat, SyncThread:0, kafka-coordinator-heartbeat-thread | sticky-group, data-plane-kafka-request-handler-0, executor-Heartbeat, ExpirationReaper-1-Heartbeat, data-plane-kafka-network-thread-2-ListenerName(PLAINTEXT)-PLAINTEXT-0, SensorExpiryThread, ExpirationReaper-0-ElectPreferredLeader, kafka-log-cleaner-thread-0, kafka-scheduler-8, ExpirationReaper-2-Heartbeat, ExpirationReaper-2-ElectPreferredLeader, metrics-meter-tick-thread-1, TxnMarkerSenderThread-1, /0:0:0:0:0:0:0:1:46408 to /0:0:0:0:0:0:0:1:34525 workers Thread 3, Signal Dispatcher, ExpirationReaper-0-Rebalance, ExpirationReaper-2-DeleteRecords, controller-event-thread, transaction-log-manager-0, executor-Rebalance, ThrottledChannelReaper-Produce, ExpirationReaper-1-Fetch, ExpirationReaper-1-Rebalance, ExpirationReaper-2-Produce, kafka-scheduler-4, kafka-scheduler-6, ExpirationReaper-2-Rebalance, ReplicaFetcherThread-0-2, data-plane-kafka-request-handler-7, ExpirationReaper-2-Fetch, kafka-scheduler-0, ExpirationReaper-1-Produce, Controller-0-to-broker-1-send-thread, ExpirationReaper-0-Fetch, ExpirationReaper-1-DeleteRecords, Finalizer, ThrottledChannelReaper-Fetch, kafka-scheduler-2, data-plane-kafka-network-thread-1-ListenerName(PLAINTEXT)-PLAINTEXT-0, data-plane-kafka-network-thread-2-ListenerName(PLAINTEXT)-PLAINTEXT-1, data-plane-kafka-request-handler-3, data-plane-kafka-network-thread-0-ListenerName(PLAINTEXT)-PLAINTEXT-1, executor-Fetch, 
data-plane-kafka-network-thread-1-ListenerName(PLAINTEXT)-PLAINTEXT-2, ExpirationReaper-0-Produce, ReplicaFetcherThread-0-0, data-plane-kafka-request-handler-5, data-plane-kafka-socket-acceptor-ListenerName(PLAINTEXT)-PLAINTEXT-0, ThrottledChannelReaper-Request, ExpirationReaper-1-ElectPreferredLeader, /config/changes-event-process-thread, Test worker, NIOServerCxn.Factory:/127.0.0.1:0, data-plane-kafka-request-handler-1, /0:0:0:0:0:0:0:1:46408 to /0:0:0:0:0:0:0:1:34525 workers Thread 2, kafka-scheduler-9, metrics-meter-tick-thread-2, main, TxnMarkerSenderThread-2, ExpirationReaper-0-DeleteRecords, kafka-scheduler-3, Controller-0-to-broker-2-send-thread, ExpirationReaper-1-topic, daemon-consumer-assignment, kafka-scheduler-5, Controller-0-to-broker-0-send-thread, ReplicaFetcherThread-0-1, data-plane-kafka-request-handler-6, scala-execution-context-global-11990, group-metadata-manager-0, LogDirFailureHandler, kafka-scheduler-7, ExpirationReaper-2-topic, Test worker-EventThread, TxnMarkerSenderThread-0, ExpirationReaper-0-topic, Common-Cleaner, kafka-scheduler-1, Test worker-SendThread(localhost:40574), data-plane-kafka-request-handler-2), unexpected=Set(kafka-coordinator-heartbeat-thread | sticky-group, controller-event-thread)

Java 8 passed (https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/19443/)

Retest this, please.

Member

@mjsax mjsax left a comment


LGTM.

\cc @bbejeck for merging (your review was a while back, so I wanna make sure you are still +1)

Member

@bbejeck bbejeck left a comment


Thanks for this @vvcephei. LGTM

@bbejeck bbejeck merged commit 3656ad9 into apache:trunk Feb 15, 2019
@vvcephei vvcephei deleted the test-streams-smoketest branch February 15, 2019 16:25
@vvcephei
Contributor Author

Thanks, @bbejeck !

jarekr pushed a commit to confluentinc/kafka that referenced this pull request Apr 18, 2019
* ak/trunk: (45 commits)
  KAFKA-7487: DumpLogSegments misreports offset mismatches (apache#5756)
  MINOR: improve JavaDocs about auto-repartitioning in Streams DSL (apache#6269)
  KAFKA-7935: UNSUPPORTED_COMPRESSION_TYPE if ReplicaManager.getLogConfig returns None (apache#6274)
  KAFKA-7895: Fix stream-time reckoning for suppress (apache#6278)
  KAFKA-6569: Move OffsetIndex/TimeIndex logger to companion object  (apache#4586)
  MINOR: add log indicating the suppression time (apache#6260)
  MINOR: Make info logs for KafkaConsumer a bit more verbose (apache#6279)
  KAFKA-7758: Reuse KGroupedStream/KGroupedTable with named repartition topics (apache#6265)
  KAFKA-7884; Docs for message.format.version should display valid values (apache#6209)
  MINOR: Save failed test output to build output directory
  MINOR: add test for StreamsSmokeTestDriver (apache#6231)
  MINOR: Fix bugs identified by compiler warnings (apache#6258)
  KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 4] (apache#5433)
  MINOR: fix bypasses in ChangeLogging stores (apache#6266)
  MINOR: Make MockClient#poll() more thread-safe (apache#5942)
  MINOR: drop dbAccessor reference on close (apache#6254)
  KAFKA-7811: Avoid unnecessary lock acquire when KafkaConsumer commits offsets (apache#6119)
  KAFKA-7916: Unify store wrapping code for clarity (apache#6255)
  MINOR: Add missing Alter Operation to Topic supported operations list in AclCommand
  KAFKA-7921: log at error level for missing source topic (apache#6262)
  ...
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
* MINOR: add test for StreamsSmokeTestDriver
Hi @bbejeck @mjsax @ableegoldman @guozhangwang ,

Please take a look at this when you get the chance.

The primary concern is adding the test. It will help us verify changes to the smoke test (such as adding suppression).

I've also added some extra output to the smoke test stdout, which will hopefully aid us in diagnosing the flaky tests.

Finally, I bundled in some cleanup. It was my intention to do that in a separate PR, but it wound up getting smashed together during refactoring.

Please let me know if you'd prefer for me to pull any of these out into a separate request.

Thanks,
-John

Also, add more output for debuggability

* cleanup

* cleanup

* refactor

* refactor

* remove redundant printlns

* Update EmbeddedKafkaCluster.java

* move to integration package

* replace early-exit on pass

* use classrule for embedded kafka

* pull in smoke test improvements from side branch

* try-with-resources

* format events instead of printing long lines

* minor formatting fix

Reviewers:  Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
