Skip to content

KAFKA-7940: wait until we've got the expected partition size#9777

Merged
hachikuji merged 4 commits intoapache:trunkfrom
showuon:KAFKA-7940
Feb 1, 2021
Merged

KAFKA-7940: wait until we've got the expected partition size#9777
hachikuji merged 4 commits intoapache:trunkfrom
showuon:KAFKA-7940

Conversation

@showuon
Copy link
Copy Markdown
Member

@showuon showuon commented Dec 22, 2020

The test will create 99 partitions in a topic, and expect we can get the partition info after 15 seconds. If we can't get the partition info within 15 secs, we'll get the error:

org.scalatest.exceptions.TestFailedException: Partition [group1_largeTopic,69] metadata not propagated after 15000 ms

Obviously, 15 secs is not enough to complete the 99 partitions creation under slow system. So, fix it by explicitly wait until we've got the expected partition size before retrieving each partition info.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@showuon
Copy link
Copy Markdown
Member Author

showuon commented Dec 22, 2020

@rajinisivaram , please help review this PR. Thanks.

@showuon
Copy link
Copy Markdown
Member Author

showuon commented Jan 7, 2021

@rajinisivaram , please help review this PR. Thanks.

@showuon
Copy link
Copy Markdown
Member Author

showuon commented Jan 14, 2021

@rajinisivaram @junrao , please help review this PR. Thanks.

@showuon
Copy link
Copy Markdown
Member Author

showuon commented Jan 21, 2021

@rajinisivaram @junrao , please help review this PR. Thanks.

@showuon
Copy link
Copy Markdown
Member Author

showuon commented Jan 27, 2021

@omkreddy @hachikuji , could you take a look for this PR?
The failed test cases keep increasing with the same root cause while we try to create topic with many partitions, it'll fail that to propagate the metadata in time.

https://ci-builds.apache.org/job/Kafka/job/kafka-trunk-jdk11/439/testReport/junit/kafka.server/MultipleListenersWithDefaultJaasContextTest/testProduceConsume__/
https://ci-builds.apache.org/job/Kafka/job/kafka-trunk-jdk11/437/testReport/kafka.api/PlaintextConsumerTest/testMultiConsumerStickyAssignment__/

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@showuon Thanks for the patch and sorry for the delay. I left a couple comments.

Comment thread core/src/test/scala/unit/kafka/utils/TestUtils.scala Outdated
Comment thread core/src/test/scala/unit/kafka/utils/TestUtils.scala Outdated
Comment thread core/src/test/scala/unit/kafka/utils/TestUtils.scala Outdated
@showuon
Copy link
Copy Markdown
Member Author

showuon commented Feb 1, 2021

@hachikuji , thanks for the comments. I've updated:

  1. rename the method:
    waitUntilMetadataIsPropagatedWithExpectedSize -> waitForAllPartitionMetadata
    waitUntilMetadataIsPropagated -> waitForPartitionMetadata
  2. return the partition metadata for waitForAllPartitionMetadata and waitForPartitionMetadata.
  3. skip the calls to waitUntilMetadataIsPropagated and waitUntilLeaderIsElectedOrChanged, get the info from the returned partition metadata
  4. Use server.metadataCache directly instead of server.dataPlaneRequestProcessor.metadataCache

Thank you.

TestUtils.waitForPartitionMetadata(servers, topic3, 3)
TestUtils.waitForPartitionMetadata(servers, topic3, 4)
TestUtils.waitForPartitionMetadata(servers, topic3, 5)
TestUtils.waitForPartitionMetadata(servers, topic3, 6)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't replace them with waitForAllPartitionsMetadata because I'm afraid it'll break the original testing purposes. And same as other places.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the patch!

@hachikuji hachikuji merged commit 7feb557 into apache:trunk Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants