Skip to content

KAFKA-9830: Implement AutoCloseable in ErrorReporter and subclasses#8442

Merged
kkonstantine merged 6 commits intoapache:trunkfrom
gharris1727:dlq-leak-producer
Apr 30, 2020
Merged

KAFKA-9830: Implement AutoCloseable in ErrorReporter and subclasses#8442
kkonstantine merged 6 commits intoapache:trunkfrom
gharris1727:dlq-leak-producer

Conversation

@gharris1727
Copy link
Copy Markdown
Contributor

@gharris1727 gharris1727 commented Apr 8, 2020

  • The DeadLetterQueueReporter has a KafkaProducer that it must close to clean up resources
  • Currently, the producer and its threads are leaked every time a task is stopped
  • Responsibility for cleaning up ErrorReporters is transitively assigned to the
    ProcessingContext, RetryWithToleranceOperator, and WorkerSinkTask/WorkerSinkTask classes
  • One new unit test in ErrorReporterTest asserts that the producer is closed by the dlq reporter

Signed-off-by: Greg Harris gregh@confluent.io

More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.

Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

* The DeadLetterQueueReporter has a KafkaProducer that it must close to clean up resources
* Currently, the producer and it's threads are leaked every time a task is stopped
* Responsibility for cleaning up ErrorReporters is transitively assigned to the
    ProcessingContext, RetryWithToleranceOperator, and WorkerSinkTask/WorkerSinkTask classes
* One new unit test in ErrorReporterTest asserts that the producer is closed by the dlq reporter

Signed-off-by: Greg Harris <gregh@confluent.io>
@gharris1727
Copy link
Copy Markdown
Contributor Author

cc @wicknicks @C0urante @ncliang for review

Copy link
Copy Markdown
Contributor

@wicknicks wicknicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the quick fix, @gharris1727. one nit.

…atibility

Signed-off-by: Greg Harris <gregh@confluent.io>
… task stop.

Signed-off-by: Greg Harris <gregh@confluent.io>
@gharris1727
Copy link
Copy Markdown
Contributor Author

Thanks for your comments @wicknicks @C0urante, this is ready for a second pass.

Copy link
Copy Markdown
Contributor

@C0urante C0urante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still have a slight preference against adding a default implementation for ErrorReporter::close but will leave that to greater minds. LGTM, nice find @gharris1727!

Copy link
Copy Markdown
Contributor

@wicknicks wicknicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM!

@kkonstantine kkonstantine marked this pull request as ready for review April 8, 2020 03:57
@kkonstantine
Copy link
Copy Markdown
Contributor

ok to test

…rown.

Signed-off-by: Greg Harris <gregh@confluent.io>
… exceptions

Also add a test to verify that multiple reporters are closed despite exceptions

Signed-off-by: Greg Harris <gregh@confluent.io>
@gharris1727
Copy link
Copy Markdown
Contributor Author

@kkonstantine It appears that the tests failed because of some unrelated checkstyle issue on trunk that has been fixed, can you re-run the tests?

@kkonstantine
Copy link
Copy Markdown
Contributor

ok to test

@kkonstantine
Copy link
Copy Markdown
Contributor

retest this please

Copy link
Copy Markdown
Contributor

@kkonstantine kkonstantine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gharris1727 !
Important not to leak producers when tasks close

I have a couple suggestions to make this more concise and a recommendation to avoid instantiating an exception if it's not going to be used.

Signed-off-by: Greg Harris <gregh@confluent.io>
@gharris1727
Copy link
Copy Markdown
Contributor Author

@kkonstantine I've implemented your feedback and this is ready for another pass.

@kkonstantine
Copy link
Copy Markdown
Contributor

retest this please

Copy link
Copy Markdown
Contributor

@kkonstantine kkonstantine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jdk14 green, jdk8 2 failures in streams tests, jdk11 4 failures, also unrelated.

LGTM!

@kkonstantine kkonstantine merged commit 0ee19e8 into apache:trunk Apr 30, 2020
kkonstantine pushed a commit that referenced this pull request Apr 30, 2020
…8442)

* The DeadLetterQueueReporter has a KafkaProducer that it must close to clean up resources
* Currently, the producer and its threads are leaked every time a task is stopped
* Responsibility for cleaning up ErrorReporters is transitively assigned to the
    ProcessingContext, RetryWithToleranceOperator, and WorkerSinkTask/WorkerSinkTask classes
* One new unit test in ErrorReporterTest asserts that the producer is closed by the dlq reporter

Reviewers: Arjun Satish <arjun@confluent.io>, Chris Egerton <chrise@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>
ijuma added a commit to confluentinc/kafka that referenced this pull request Apr 30, 2020
…/master`

* apache-github/trunk: (45 commits)
  MINOR: Fix broken JMX link in docs by adding missing starting double quote (apache#8587)
  KAFKA-9652: Fix throttle metric in RequestChannel and request log due to KIP-219 (apache#8567)
  KAFKA-9922: Update demo instructions in examples README (apache#8559)
  KAFKA-9830: Implement AutoCloseable in ErrorReporter and subclasses (apache#8442)
  KAFKA-9875: Make integration tests more resilient (apache#8578)
  KAFKA-9932: Don't load configs from ZK when the log has already been loaded (apache#8582)
  KAFKA-9925: decorate pseudo-topics with app id (apache#8574)
  KAFKA-9832: fix attempt to commit non-running tasks (apache#8580)
  KAFKA-9127: don't create StreamThreads for global-only topology (apache#8540)
  MINOR: add support for kafka 2.4 and 2.5 to downgrade test
  KAFKA-9176: Retry on getting local stores from KafkaStreams (apache#8568)
  KAFKA-9823: Follow-up, check state for handling commit error response (apache#8548)
  KAFKA-6145: KIP-441: Add TaskAssignor class config (apache#8541)
  MINOR: Fix partition numbering from 0 to P-1 instead of P in docs (apache#8572)
  KAFKA-9921: disable caching on stores configured to retain duplicates (apache#8564)
  Minor: remove redundant check in auto preferred leader election (apache#8566)
  MINOR: Update the link to the Raft paper in docs (apache#8560)
  MINOR: Fix typos in config properties in MM2 test (apache#8561)
  MINOR: Improve producer test BufferPoolTest#testCloseNotifyWaiters. (apache#7982)
  MINOR: document how to escape json parameters to ducktape tests (apache#8546)
  ...
ijuma added a commit to confluentinc/kafka that referenced this pull request Apr 30, 2020
There was a minor conflict in gradle.properties because the default
Scala version changed upstream to Scala 2.13. I kept the upstream
change.

Related to this, I have updated Jenkinsfile to compile and validate
with Scala 2.12 in a separate stage so that we ensure we maintain
compatibility. Unlike Apache Kafka, we only run the tests with the
default Scala version, which is now 2.13.

* apache-github/trunk: (45 commits)
MINOR: Fix broken JMX link in docs by adding missing starting double
quote (apache#8587)
KAFKA-9652: Fix throttle metric in RequestChannel and request log due
to KIP-219 (apache#8567)
  KAFKA-9922: Update demo instructions in examples README (apache#8559)
KAFKA-9830: Implement AutoCloseable in ErrorReporter and subclasses
(apache#8442)
  KAFKA-9875: Make integration tests more resilient (apache#8578)
KAFKA-9932: Don't load configs from ZK when the log has already been
loaded (apache#8582)
  KAFKA-9925: decorate pseudo-topics with app id (apache#8574)
  KAFKA-9832: fix attempt to commit non-running tasks (apache#8580)
KAFKA-9127: don't create StreamThreads for global-only topology
(apache#8540)
  MINOR: add support for kafka 2.4 and 2.5 to downgrade test
  KAFKA-9176: Retry on getting local stores from KafkaStreams (apache#8568)
KAFKA-9823: Follow-up, check state for handling commit error response
(apache#8548)
  KAFKA-6145: KIP-441: Add TaskAssignor class config (apache#8541)
MINOR: Fix partition numbering from 0 to P-1 instead of P in docs
(apache#8572)
KAFKA-9921: disable caching on stores configured to retain duplicates
(apache#8564)
Minor: remove redundant check in auto preferred leader election
(apache#8566)
  MINOR: Update the link to the Raft paper in docs (apache#8560)
  MINOR: Fix typos in config properties in MM2 test (apache#8561)
MINOR: Improve producer test BufferPoolTest#testCloseNotifyWaiters.
(apache#7982)
MINOR: document how to escape json parameters to ducktape tests
(apache#8546)
  ...
@gharris1727 gharris1727 deleted the dlq-leak-producer branch April 30, 2020 21:36
gharris1727 added a commit to gharris1727/kafka that referenced this pull request Apr 30, 2020
…pache#8442)

* The DeadLetterQueueReporter has a KafkaProducer that it must close to clean up resources
* Currently, the producer and its threads are leaked every time a task is stopped
* Responsibility for cleaning up ErrorReporters is transitively assigned to the
    ProcessingContext, RetryWithToleranceOperator, and WorkerSinkTask/WorkerSinkTask classes
* One new unit test in ErrorReporterTest asserts that the producer is closed by the dlq reporter

Reviewers: Arjun Satish <arjun@confluent.io>, Chris Egerton <chrise@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>
kkonstantine pushed a commit that referenced this pull request May 6, 2020
…8442)

* The DeadLetterQueueReporter has a KafkaProducer that it must close to clean up resources
* Currently, the producer and its threads are leaked every time a task is stopped
* Responsibility for cleaning up ErrorReporters is transitively assigned to the
    ProcessingContext, RetryWithToleranceOperator, and WorkerSinkTask/WorkerSinkTask classes
* One new unit test in ErrorReporterTest asserts that the producer is closed by the dlq reporter

Reviewers: Arjun Satish <arjun@confluent.io>, Chris Egerton <chrise@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>
kkonstantine pushed a commit that referenced this pull request May 6, 2020
…8442)

* The DeadLetterQueueReporter has a KafkaProducer that it must close to clean up resources
* Currently, the producer and its threads are leaked every time a task is stopped
* Responsibility for cleaning up ErrorReporters is transitively assigned to the
    ProcessingContext, RetryWithToleranceOperator, and WorkerSinkTask/WorkerSinkTask classes
* One new unit test in ErrorReporterTest asserts that the producer is closed by the dlq reporter

Reviewers: Arjun Satish <arjun@confluent.io>, Chris Egerton <chrise@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>
qq619618919 pushed a commit to qq619618919/kafka that referenced this pull request May 12, 2020
…pache#8442)

* The DeadLetterQueueReporter has a KafkaProducer that it must close to clean up resources
* Currently, the producer and its threads are leaked every time a task is stopped
* Responsibility for cleaning up ErrorReporters is transitively assigned to the
    ProcessingContext, RetryWithToleranceOperator, and WorkerSinkTask/WorkerSinkTask classes
* One new unit test in ErrorReporterTest asserts that the producer is closed by the dlq reporter

Reviewers: Arjun Satish <arjun@confluent.io>, Chris Egerton <chrise@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants