Skip to content

KAFKA-18653: Fix mocks and potential thread leak issues causing silent RejectedExecutionException in share group broker tests#18725

Merged
AndrewJSchofield merged 10 commits intoapache:trunkfrom
adixitconfluent:kafka-18653
Jan 29, 2025
Merged

Conversation

@adixitconfluent
Copy link
Copy Markdown
Contributor

What

  1. Develocity shows possible thread leak in a few SharePartitionManager tests.
  2. On running SharePartitionMangerTest suite locally, we can observe testReplicaManagerFetchShouldNotProceed and testPendingInitializationShouldCompleteFetchRequest throwing silent RejectedExecutionException.
  3. There are mocks missing in a couple of tests on broker for share groups and cleanup.

@github-actions github-actions Bot added triage PRs from the community core Kafka Broker tests Test fixes (including flaky tests) KIP-932 Queues for Kafka small Small PRs labels Jan 28, 2025
@adixitconfluent adixitconfluent marked this pull request as ready for review January 28, 2025 05:21
@adixitconfluent adixitconfluent changed the title KAFKA-18653: Fix mocks and potential thread leak issues causing silent RejectedExecutionException in share group tests KAFKA-18653: Fix mocks and potential thread leak issues causing silent RejectedExecutionException in share group broker tests Jan 28, 2025
@AndrewJSchofield AndrewJSchofield removed the triage PRs from the community label Jan 28, 2025
Copy link
Copy Markdown
Contributor

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good, some queries.

Comment thread core/src/test/java/kafka/server/share/DelayedShareFetchTest.java
Comment thread core/src/test/java/kafka/server/share/DelayedShareFetchTest.java Outdated
@github-actions github-actions Bot removed the small Small PRs label Jan 28, 2025
@adixitconfluent
Copy link
Copy Markdown
Contributor Author

Test failures are unrelated.

Copy link
Copy Markdown
Contributor

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes, looking good. Some comments.

Comment thread core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala
Comment thread core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala Outdated
Comment thread core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala
Comment thread core/src/test/java/kafka/server/share/DelayedShareFetchTest.java Outdated
Comment thread core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala
Comment thread core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala Outdated
Copy link
Copy Markdown
Contributor

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes, LGTM!

Copy link
Copy Markdown
Member

@AndrewJSchofield AndrewJSchofield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of tiny points. The logs from the already passing test runs are much cleaner with these mocking changes.

* the records are fetched and acquired.
*
* @return A boolean which indicates whether the fetch lock is acquired.
* Visible for testing.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the javadoc, this sentence is part of the description for the return value. Either remove "Visible for testing" (preferred) or make sure it fits in the javadoc more nicely.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed "Visible for testing"


DelayedShareFetch(
/**
* This function returns an instance of delayed share fetch operation for completing share fetch requests instantaneously or with delay.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructs rather than returns I think.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@AndrewJSchofield AndrewJSchofield merged commit dd1f2b8 into apache:trunk Jan 29, 2025
ijuma added a commit to ijuma/kafka that referenced this pull request Jan 30, 2025
…ibrdkafka-compressed-produce-fails

* apache-github/trunk:
  MINOR: prevent exception from HdrHistogram (apache#18674)
  KAFKA-18653: Fix mocks and potential thread leak issues causing silent RejectedExecutionException in share group broker tests (apache#18725)
  KAFKA-18646: Null records in fetch response breaks librdkafka (apache#18726)
  KAFKA-18619: New consumer topic metadata events should set requireMetadata flag (apache#18668)
  KAFKA-18488: Improve KafkaShareConsumerTest (apache#18728)
@Test
def testDelayedShareFetchPurgatoryOperationExpiration(): Unit = {
val mockLogMgr = TestUtils.createLogManager(config.logDirs.map(new File(_)))
val rm = new ReplicaManager(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reminder, it's crucial to utilize a try-finally block to ensure proper closure of the ReplicaManager. Failing to do so can result in an unreleased thread from the purgatory, potentially leading to errors in subsequent integration tests that incorporate thread leak detection.

org.opentest4j.AssertionFailedError: Found 1 unexpected threads during @BeforeAll: executor-ShareFetch ==> expected: <true> but was: <false> 

Also, the error can be reproduced by following command.

./gradlew core:test --tests ReplicaManagerTest --tests SaslApiVersionsRequestTest --tests LeaderEpochIntegrationTest --tests RequestQuotaTest -PmaxParallelForks=1

I will fix it in https://issues.apache.org/jira/browse/KAFKA-18770

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chia7712 , make sense.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I missed it in review. Should have caught it.

pdruley pushed a commit to pdruley/kafka that referenced this pull request Feb 12, 2025
…t RejectedExecutionException in share group broker tests (apache#18725)

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
manoj-mathivanan pushed a commit to manoj-mathivanan/kafka that referenced this pull request Feb 19, 2025
…t RejectedExecutionException in share group broker tests (apache#18725)

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield <aschofield@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-approved core Kafka Broker KIP-932 Queues for Kafka tests Test fixes (including flaky tests)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants