Skip to content

KAFKA-9955: Prevent SinkTask::close from shadowing other exceptions#8618

Merged
kkonstantine merged 12 commits intoapache:trunkfrom
gharris1727:close-exception-handling
May 16, 2020
Merged

KAFKA-9955: Prevent SinkTask::close from shadowing other exceptions#8618
kkonstantine merged 12 commits intoapache:trunkfrom
gharris1727:close-exception-handling

Conversation

@gharris1727
Copy link
Copy Markdown
Contributor

@gharris1727 gharris1727 commented May 5, 2020

  • Catches exceptions from WorkerSinkTask::closePartitions call
  • If two exceptions are thrown the closePartitions exception is suppressed
  • Add unit test that throws exceptions in put and close to verify that
    the exception from put is propagated out of WorkerSinkTask::execute

Signed-off-by: Greg Harris gregh@confluent.io

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

* Catches exceptions from SinkTask::close call in WorkerSinkTask
* Logs exception at ERROR level but does not propagate exception upward
* Add unit test that throws exceptions in put and close to verify that
  the exception from put is propagated out of WorkerSinkTask::execute

Signed-off-by: Greg Harris <gregh@confluent.io>
@gharris1727 gharris1727 force-pushed the close-exception-handling branch from 337699b to 1a2cdbb Compare May 5, 2020 04:01
@gharris1727
Copy link
Copy Markdown
Contributor Author

@C0urante @ncliang Could you take a look at this when you have a chance?

This avoids swallowing exceptions that would have stopped the task in absence of an exception from start or put.

Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Copy link
Copy Markdown
Contributor

@C0urante C0urante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @gharris1727! This looks great, and the confusing exception logging here is definitely something that's bitten me in the past when debugging connectors. Just a few comments on the test; the functional bits look fine.

Signed-off-by: Greg Harris <gregh@confluent.io>
…ions

Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Copy link
Copy Markdown
Contributor

@C0urante C0urante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Greg, LGTM!

@gharris1727
Copy link
Copy Markdown
Contributor Author

@rhauch @kkonstantine This is ready for committer review. Thanks for taking a look!

Copy link
Copy Markdown
Contributor

@ncliang ncliang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gharris1727 . I have a question about how the suppressed exceptions are logged.

try {
// Make sure any uncommitted data has been committed and the task has
// a chance to clean up its state
try (QuietClosable ignored = this::closePartitions) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand how the suppressed exception is logged and not just silently swallowed? Do we need to call getSuppressed and log those somewhere or use one of those closeQuietly methods on Utils?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is not changing any of the printing logic, and that's still handled by the caller, WorkerTask::doRun. This is roughly what a suppressed exception looks like when it gets logged (from the test setup, not a live connector):

org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
	at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:569)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:327)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:229)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
	<snipped>
	Suppressed: java.lang.RuntimeException
		at org.easymock.internal.MockInvocationHandler.invoke(MockInvocationHandler.java:46)
		at org.easymock.internal.ObjectMethodsFilter.invoke(ObjectMethodsFilter.java:101)
		at org.easymock.internal.ClassProxyFactory$MockMethodInterceptor.intercept(ClassProxyFactory.java:97)
		at org.apache.kafka.connect.sink.SinkTask$$EnhancerByCGLIB$$713f645b.close(&lt;generated&gt;)
		at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:402)
		at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:599)
		at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:202)
		... 57 more
Caused by: java.lang.RuntimeException
	at org.easymock.internal.MockInvocationHandler.invoke(MockInvocationHandler.java:46)
	at org.easymock.internal.ObjectMethodsFilter.invoke(ObjectMethodsFilter.java:101)
	at org.easymock.internal.ClassProxyFactory$MockMethodInterceptor.intercept(ClassProxyFactory.java:68)
	at org.apache.kafka.connect.sink.SinkTask$$EnhancerByCGLIB$$713f645b.put(&lt;generated&gt;)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:547)
	... 60 more

Suppressed exceptions are a native Java feature, and log4j supports printing their stacktraces.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again naming here can be misleading. ignored is more like unused in the try block.
But also that's not the point of this idiom. It's about suppressing exceptions from finally instead of the originator.

How about suppressible? Also, unused might be even better, because ignored is untrue especially if closePartitions is the only method that throws. But I think suppressible highlights the intentions here specifically.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used ignored as a way to trigger the IDE to avoid highlighting the unused variable.
I can change this to suppressible since the behavior is better described (suppressible "if another exception occurs first").

Copy link
Copy Markdown
Contributor

@kkonstantine kkonstantine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a nice improvement.
I have a few comments aiming to offer more clarity.
Also,

Catches exceptions from WorkerSinkTask::closePartitions call

That hasn't changed from before right? The fix now preserves all the exceptions and suppresses the ones thrown on the (former) "finally" block.

Comment thread clients/src/main/java/org/apache/kafka/common/utils/Utils.java Outdated
try {
// Make sure any uncommitted data has been committed and the task has
// a chance to clean up its state
try (QuietClosable ignored = this::closePartitions) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again naming here can be misleading. ignored is more like unused in the try block.
But also that's not the point of this idiom. It's about suppressing exceptions from finally instead of the originator.

How about suppressible? Also, unused might be even better, because ignored is untrue especially if closePartitions is the only method that throws. But I think suppressible highlights the intentions here specifically.

}

@Test
public void testSinkTasksHandleCloseErrors() throws Exception {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also write a test where an exception is thrown only by sinkTask.close? Actually, we could keep the name for this new test here as is, and the new test could be named in a way that tells us that the exceptions on close are suppressed in the presence of exceptions in the main try block.

Copy link
Copy Markdown
Contributor Author

@gharris1727 gharris1727 May 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ive written the test as described, and verified that the exception from close is caught by WorkerTask::doRun.

However, it never gets wrapped in a ConnectException like exceptions from the other connector methods, but i'm not sure that it's in-scope to change this in this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see if I understand what you are describing as wrapped.
My use case is as follows:
SinkTask#close attempts to release resources and if it fails it throws a ConnectException as we'd expect from connector developers to do (currently it throws a RuntimeException which might be less representative).

With your fix this exception can appear as suppressed when an exception happens in SinkTask#put and that's what your test is guarding against.

My point is to add the missing test case for when the exception on close is the only exception that is thrown. There is a variety of ways to do that, but I agree with you, this test is not there now. However, I don't think this necessarily makes it out of scope.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added that test, so please look at the diff.

By wrapping, i'm completely disregarding the behavior from the connector, I'm only discussing the additional layers of exceptions added by the framework before printing.

At the moment, the RuntimeException from put is wrapped by a ConnectException, but the RuntimeException from close is never wrapped in a ConnectException. I'm questioning whether this is the ideal behavior, and whether we should add that wrapping layer, or consider it out-of-scope for this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the testing I had in mind. That's great.
No need to change the exception type right now.

Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
@kkonstantine
Copy link
Copy Markdown
Contributor

ok to test

@kkonstantine
Copy link
Copy Markdown
Contributor

retest this please

@rhauch
Copy link
Copy Markdown
Contributor

rhauch commented May 14, 2020

ok to test

Copy link
Copy Markdown
Contributor

@kkonstantine kkonstantine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jdk11 and 14 green. jdk8 has some troubles starting recently.

LGTM

@kkonstantine kkonstantine merged commit 62fa8fc into apache:trunk May 16, 2020
kkonstantine pushed a commit that referenced this pull request May 16, 2020
…8618)

* If two exceptions are thrown the `closePartitions` exception is suppressed
* Add unit tests that throw exceptions in put and close to verify that
  the exceptions are propagated and suppressed appropriately out of WorkerSinkTask::execute

Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Konstantine Karantasis <konstantine@confluent.io>
kkonstantine pushed a commit that referenced this pull request May 16, 2020
…8618)

* If two exceptions are thrown the `closePartitions` exception is suppressed
* Add unit tests that throw exceptions in put and close to verify that
  the exceptions are propagated and suppressed appropriately out of WorkerSinkTask::execute

Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Konstantine Karantasis <konstantine@confluent.io>
kkonstantine pushed a commit that referenced this pull request May 16, 2020
…8618)

* If two exceptions are thrown the `closePartitions` exception is suppressed
* Add unit tests that throw exceptions in put and close to verify that
  the exceptions are propagated and suppressed appropriately out of WorkerSinkTask::execute

Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Konstantine Karantasis <konstantine@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants