KAFKA-9509: Fixing flakiness of MirrorConnectorsIntegrationTest.testReplication #8048
hachikuji merged 8 commits into apache:trunk
ok to test
hachikuji
left a comment
Thanks, left a few comments.
kkonstantine
left a comment
Thanks @skaundinya15
This test is indeed flaky.
After a first pass on the PR I have a suggestion regarding your assertion code. We have a similar assertion elsewhere.
Thanks for the initial reviews @hachikuji and @kkonstantine! I've updated as per your comments by using the method posted by @kkonstantine and checking to see if the connector and its tasks were up after each connector is configured. @hachikuji Now that the logic has changed a bit, do you think it's okay to do the connector + task check after each connector is configured or do you think we should refactor it out so we can do it per
kkonstantine
left a comment
Thanks @skaundinya15
Almost there, I think. One more comment regarding how the assertion is validated.
Additionally, I'd change the jenkins.sh file locally and run this test repeatedly (at least 100 times) to see if it ever fails again (both locally and in a Jenkins branch builder).
ryannedolan
left a comment
Thanks, big improvement :)
kkonstantine
left a comment
Thanks for fixing this. LGTM
Thanks for the reviews @kkonstantine and @hachikuji! @hachikuji I just updated the PR with your suggested changes, so ready for another round of reviews whenever you are.
hachikuji
left a comment
Looks good, just a few small comments.
ok to test
retest this please
ok to test
Checkstyle failed:
@hachikuji Checkstyle issues should be fixed
ok to test
retest this please
retest this please
retest this please
hachikuji
left a comment
LGTM. Thanks for the fix!
…eplication (#8048)
The test case `org.apache.kafka.connect.mirror.MirrorConnectorsIntegrationTest.testReplication` has recently become increasingly flaky. This PR aims to make the test more deterministic. Specifically, the flakiness was due to a timing issue: the tasks did not always start up in time before the test began running. This PR remediates that by introducing a status check after every connector is started. The status check verifies that the connector is found on the Connect cluster and that its tasks have been created and are up and running. These checks run before the test body so that we can be confident the connectors and tasks have started correctly before the test runs.
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>
JIRA: https://issues.apache.org/jira/browse/KAFKA-9509
As the JIRA indicates, `org.apache.kafka.connect.mirror.MirrorConnectorsIntegrationTest.testReplication` has recently become an increasingly flaky test. This PR aims to make the test more deterministic. Specifically, the flakiness was due to a timing issue: the tasks did not always start up in time before the test began running. This PR remediates that by introducing a status check after every connector is started. The status check verifies that the connector is found on the Connect cluster and that its tasks have been created and are up and running. These checks run before the test body so that we can be confident the connectors and tasks have started correctly before the test runs.
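For readers who want a feel for the kind of status check described above, here is a minimal, self-contained sketch of the idea: poll a connector's status until the connector and at least a minimum number of tasks report as running, or give up after a timeout. It is not the code from this PR. The names `ConnectorReadiness`, `Status`, `isUp`, and `waitUntilUp` are hypothetical, and the `"RUNNING"` state string and the status supplier are stand-ins for whatever the embedded Connect cluster actually exposes.

```java
import java.util.List;
import java.util.function.Supplier;

public class ConnectorReadiness {

    /** Hypothetical snapshot of a connector's status; not the Kafka Connect API. */
    public static final class Status {
        final String connectorState;
        final List<String> taskStates;

        public Status(String connectorState, List<String> taskStates) {
            this.connectorState = connectorState;
            this.taskStates = taskStates;
        }
    }

    /**
     * True only when the connector was found (non-null status), is RUNNING,
     * and has at least minTasks tasks, all of which are RUNNING.
     */
    public static boolean isUp(Status status, int minTasks) {
        return status != null
            && "RUNNING".equals(status.connectorState)
            && status.taskStates.size() >= minTasks
            && status.taskStates.stream().allMatch("RUNNING"::equals);
    }

    /**
     * Polls the supplier until isUp holds or the timeout elapses.
     * Returns false on timeout or if the waiting thread is interrupted.
     */
    public static boolean waitUntilUp(Supplier<Status> statusSupplier, int minTasks,
                                      long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (isUp(statusSupplier.get(), minTasks)) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

In a test, a call such as `waitUntilUp(...)` would sit right after each connector is configured, and the test would fail fast with a clear message if it returns `false`, rather than proceeding and failing later for timing reasons.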