KAFKA-15696: Refactor closing consumer #14937
Conversation
| @Test | ||
| public void testGroupIdNotNullAndValid() { | ||
| // close the default consumer | ||
| shutDown(); |
Isn't this going to happen in afterAll anyway?
The test spins up another consumer, so we should shut down the one created in BeforeEach.
|
Yes, I think using events is much clearer. @kirktrue do you agree with this approach? Then I'd suggest we close the other PR and continue with this one. |
3207f21 to 2b590bf Compare
| final AtomicReference<Throwable> firstException) { | ||
| try { | ||
| applicationEventHandler.addAndGet(event, timer); | ||
| } catch (TimeoutException e) { |
We don't really throw timeout exceptions during closing, because if the user tries to close with a 0 duration then all ops would be timed out. The current implementation just polls, but since we cannot poll the client directly, we need to wait until the future completes or times out, and then keep going.
Yes, we have issues with timeouts of 0 elsewhere. There's a Jira somewhere to solve it, but it's not been designed/fixed.
@kirktrue and I discussed the potential tasks for dealing with zero timeout. This needs to be examined, perhaps after the preview, so we will spin off a Jira ticket for this specific issue.
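To illustrate the "wait until the future completes or the timer runs out, then keep going" behaviour discussed here, a minimal sketch; the helper name and signature are assumptions for illustration, not the PR's actual code:

```java
// Minimal sketch (not the PR's actual code): block on a close-related future for at most
// the remaining close time, swallow the timeout, and continue shutting down.
// Assumes java.util.concurrent.*, org.apache.kafka.common.utils.Timer, and an slf4j `log` field.
private void waitOnCloseFuture(final CompletableFuture<Void> future,
                               final Timer timer,
                               final AtomicReference<Throwable> firstException) {
    try {
        future.get(timer.remainingMs(), TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
        // close(Duration.ZERO) lands here for every operation; proceed rather than fail
        log.debug("Timed out waiting for a close operation; continuing shutdown");
    } catch (InterruptedException | ExecutionException e) {
        firstException.compareAndSet(null, e);
    } finally {
        timer.update();
    }
}
```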
0158de2 to a2e3ed1 Compare
|
Hi @kirktrue - I rewrote the previous PR based on your feedback. I thought driving the close via events is a better and clearer pattern, so thanks for the suggestions. Would you have time to take a look at this PR? @lucasbru - Thanks for reviewing the PR - I've decided, per your suggestion, to use Kirk's approach to close the consumer. Let me know what you think. |
| @@ -274,79 +269,18 @@ private void closeInternal(final Duration timeout) { | |||
| } | |||
|
|
|||
| void cleanup() { | |||
Really not much to do when shutting down the network thread: we will try one more time to send the unsent requests and poll the network client to make sure all requests are sent.
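Roughly, the cleanup described here could look like the following sketch; the method and field names are assumptions, not the PR's exact implementation:

```java
// Hypothetical sketch of the network thread's cleanup(): try once more to send any
// unsent requests and poll the client so they reach the socket, then close quietly.
// Method names/signatures here are assumptions, not the PR's exact code.
void cleanup() {
    try {
        sendUnsentRequests(closeTimer);                     // assumed helper, see discussion below
        networkClientDelegate.poll(0, time.milliseconds()); // one final poll
    } catch (Exception e) {
        log.error("Unexpected error during shutdown. Proceed with closing.", e);
    } finally {
        closeQuietly(networkClientDelegate, "network client delegate");
    }
}
```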
kirktrue
left a comment
This is really tricky, @philipnee 😞
I think we need to resolve the behavioral ambiguity around a user invoking close(0) ASAHP.
| } catch (Exception e) { | ||
| log.error("Unexpected error during shutdown. Proceed with closing.", e); | ||
| } finally { | ||
| networkClientDelegate.awaitPendingRequests(timer); |
Network requests are tied to the CompletableApplicationEvents, right? Can we just rely on the events to wait for their network I/O to complete via the addAndGet() method?
Can there be other requests (not tied to the closing application events) that we want to wait for as long as we still have time?
I think we actually don't need this here, because after checking the code, runAtClose already handles the closing. If the timer runs out, we don't need to poll again; if all requests are completed before the timer runs out, we don't need to re-poll either.
| case COMMIT: | ||
| log.debug("Sending unsent commit before closing."); | ||
| sendUnsentCommit(); | ||
| event.future().complete(null); |
This is a bit of a different pattern from our other CompletableApplicationEvents. In the other events, we completed the Future when the response was processed. In these events, we're completing them just after sending off the request. Is that truly what we want?
Yeah, as long as our timeout did not expire, we probably want to wait for the response, right?
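As a rough sketch of the alternative being suggested (completing the event only once the commit request's response comes back, rather than right after sending), with hypothetical names:

```java
// Hypothetical: chain the close event's future to the commit request's own future so it
// completes when the response (or failure) is processed, not at send time.
CompletableFuture<Void> commitRequestDone = sendUnsentCommit(); // assumed to return the request's future
commitRequestDone.whenComplete((ignored, error) -> {
    if (error != null)
        event.future().completeExceptionally(error);
    else
        event.future().complete(null);
});
```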
| case PREP_CLOSING: | ||
| processPrepClosingEvent((ConsumerCloseApplicationEvent) event); | ||
| return; | ||
|
|
Any reason we can't have these as separate types like the other events?
| private void sendUnsentCommit() { | ||
| if (!requestManagers.commitRequestManager.isPresent()) | ||
| return; | ||
| NetworkClientDelegate.PollResult res = requestManagers.commitRequestManager.get().pollOnClose(); | ||
| if (res.unsentRequests.isEmpty()) | ||
| return; | ||
| // NetworkThread will continue to poll the networkClientDelegate | ||
| networkClientDelegate.addAll(res); | ||
| } | ||
|
|
I'm not quite understanding why this needs to be done as a special case. Why can't we rely on the normal runOnce() invocation to poll() the request managers?
| private final Logger log; | ||
| private final ConsumerMetadata metadata; | ||
| private final RequestManagers requestManagers; | ||
| private final NetworkClientDelegate networkClientDelegate; |
I'm uncomfortable with introducing the NetworkClientDelegate at this layer. It's centralized in ConsumerNetworkThread so that we can reason about where the various network I/O is performed.
| */ | ||
| package org.apache.kafka.clients.consumer.internals.events; | ||
|
|
||
| public class ConsumerCloseApplicationEvent extends CompletableApplicationEvent<Void> { |
I'm happy to have a superclass for 'close' events, but having a type and a task gets a bit muddy, doesn't it?
Why not just have separate event types as per the rest of the codebase?
| final AtomicReference<Throwable> firstException) { | ||
| try { | ||
| applicationEventHandler.addAndGet(event, timer); | ||
| } catch (TimeoutException e) { |
Yes, we have issues with timeouts of 0 elsewhere. There's a Jira somewhere to solve it, but it's not been designed/fixed.
| // Poll to ensure that request has been written to the socket. Wait until either the timer has expired or until | ||
| // all requests have received a response. | ||
| do { | ||
| while (timer.notExpired() && !requestFutures.stream().allMatch(Future::isDone)) { |
Why are you changing this back?
If the close timer has expired, should we proceed with closing without sending the request? I'm undecided on this. @kirktrue wdyt?
| return EMPTY; | ||
|
|
||
| List<NetworkClientDelegate.UnsentRequest> requests = pendingRequests.drainOnClose(); | ||
| System.out.print("ddraining + " + requests); |
| } | ||
| } | ||
|
|
||
| private CompletableFuture<Void> onLeavePrepare() { |
Not sure that is the best name to describe what it does.
| } catch (Exception e) { | ||
| log.error("Unexpected error during shutdown. Proceed with closing.", e); | ||
| } finally { | ||
| networkClientDelegate.awaitPendingRequests(timer); |
Can there be other requests (not tied to the closing application events) that we want to wait for as long as we still have time?
| case COMMIT: | ||
| log.debug("Sending unsent commit before closing."); | ||
| sendUnsentCommit(); | ||
| event.future().complete(null); |
Yeah, as long as our timeout did not expire, we probably want to wait for the response, right?
| droppedPartitions.addAll(subscriptions.assignedPartitions()); | ||
| if (!subscriptions.hasAutoAssignedPartitions() || droppedPartitions.isEmpty()) | ||
| return CompletableFuture.completedFuture(null); | ||
| // TODO: Invoke rebalanceListener via KAFKA-15276 |
@kirktrue - I am not 100% sure what the right way to invoke the listener is. Are we returning a completable future? The current implementation blocks on listener invocation, which means we need to do future.get(forever). If the listener is broken in some way, then we are stuck here.
Can we merge it without resolving this comment?
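One possible shape for this, sketched with a bounded wait instead of future.get(forever) so a broken listener cannot hang close(); the names here are illustrative, not KAFKA-15276's actual API:

```java
// Hypothetical: invoke the rebalance listener for the dropped partitions and wait only
// for the remaining close time, logging instead of blocking forever on a broken listener.
CompletableFuture<Void> listenerDone = invokeOnPartitionsRevoked(droppedPartitions); // assumed helper
try {
    listenerDone.get(closeTimer.remainingMs(), TimeUnit.MILLISECONDS);
} catch (TimeoutException e) {
    log.warn("ConsumerRebalanceListener did not finish before the close timeout; proceeding");
} catch (InterruptedException | ExecutionException e) {
    log.error("Error invoking ConsumerRebalanceListener during close", e);
}
```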
| final Timer timer) { | ||
| // These are the optional outgoing requests at the | ||
| List<NetworkClientDelegate.PollResult> pollResults = requestManagers.stream() | ||
| requestManagers.stream() |
See the comment in the FetchRequestManager.
Can we merge it without resolving this comment?
This is not a conflict, actually - these are just some changes to how the fetch request manager closes.
| */ | ||
| @Override | ||
| public PollResult pollOnClose() { | ||
| // TODO: move the logic to poll to handle signal close |
I added a method signalClose() to the interface. I wonder if we should keep letting the network thread poll the network client as usual, until we actually invoke close. This means close would do very little besides checking whether there are any pending requests.
Yes, using the normal poll loop sounds like a good idea. We should still probably sendUnsentRequests once when the timeout has passed.
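The pattern being agreed on here could look roughly like this sketch (generic types and names, not the actual CommitRequestManager):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: signalClose() just flips a flag; the network thread keeps driving
// the usual poll() loop, which drains the close-time requests on its next pass.
class ClosableRequestManagerSketch {
    private final List<String> pendingCommits = new ArrayList<>(); // stand-in for unsent requests
    private volatile boolean closing = false;

    void signalClose() {
        closing = true; // prepare for shutdown; nothing is sent from the caller's thread
    }

    List<String> poll() {
        if (!closing)
            return Collections.emptyList(); // normal path: nothing special in this sketch
        // Close path: hand the final requests (e.g. an auto-commit) to the network thread once.
        List<String> toSend = new ArrayList<>(pendingCommits);
        pendingCommits.clear();
        return toSend;
    }
}
```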
| final String groupId = "consumerGroupA"; | ||
| final ConsumerConfig config = new ConsumerConfig(requiredConsumerPropertiesAndGroupId(groupId)); | ||
| final LinkedBlockingQueue<BackgroundEvent> backgroundEventQueue = new LinkedBlockingQueue<>(); | ||
| try (final AsyncKafkaConsumer<String, String> consumer = |
Removing the try-with-resources so we can close the consumer with a 0 timeout. @lucasbru
Please be aware that all these changes will be unnecessary/conflicting once we move to mocks, which is rather soon I believe, see:
We will get lots of conflicts, and I think whoever has to resolve the conflicts will probably just skip the changes from this PR, because the commit logic in the background thread will not be executed anymore in this test.
OK, makes sense, I can revert these changes.
I didn't mean to revert the changes, but to discard them when we merge with the AsyncKafkaConsumer refactoring. Is it possible these changes were required to avoid running OOM?
One option would be to rebase this PR on the refactoring, which could possibly resolve these OOMs.
| ConsumerNetworkThread.runAtClose(singletonList(Optional.of(fetcher)), networkClientDelegate, timer); | ||
|
|
||
| // the network is polled during the last state of clean up. | ||
| networkClientDelegate.poll(time.timer(1)); |
Does Kirk need to confirm this change before we can merge the PR?
c0e3780 to 86e6943 Compare
lucasbru
left a comment
I had a look; it seems like we are still iterating on the design. Left some comments.
| final Supplier<ApplicationEventProcessor> applicationEventProcessorSupplier = ApplicationEventProcessor.supplier(logContext, | ||
| metadata, | ||
| applicationEventQueue, | ||
| applicationEventQueue, |
| logContext, | ||
| metadata, | ||
| applicationEventQueue, | ||
| applicationEventQueue, |
| @@ -178,27 +171,11 @@ static void runAtClose(final Collection<Optional<? extends RequestManager>> requ | |||
| final NetworkClientDelegate networkClientDelegate, | |||
| final Timer timer) { | |||
Seems timer is now completely unused.
Noted, let's get rid of runAtClose altogether in a future PR.
| private void sendUnsentRequests(final Timer timer) { | ||
| // Poll to ensure that request has been written to the socket. Wait until either the timer has expired or until | ||
| // all requests have received a response. | ||
| while (!networkClientDelegate.unsentRequests().isEmpty() && timer.notExpired()) { |
Closing with timeout 0 would mean we don't send any closing requests, right? I think we should poll nevertheless, so we should check the timer at the end.
I think if we used the normal poll loop as long as the timeout > 0, this function may not need to check the timer anyway, since it's only used if the time ran out and there are still unsent requests.
Sorry, to reiterate on your comment, perhaps your suggestion is: if time has run out, we do client.poll(0) to try to send and process the requests one last time; if the time hasn't run out and there are still requests to be sent, we continue to poll until all requests are sent or the timer runs out. Is this what you meant?
Well, if all requests are sent, I wouldn't time out. But otherwise, yes.
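Putting that agreement into a sketch (poll at least once even with a zero timeout, stop as soon as the unsent queue drains or the timer expires); the delegate's method names follow this PR's diff but are otherwise assumptions:

```java
// Hypothetical sketch: even close(0) gets one last poll so already-queued requests can
// hit the socket; otherwise keep polling until everything is sent or the timer expires.
private void sendUnsentRequests(final Timer timer) {
    do {
        networkClientDelegate.poll(0, timer.currentTimeMs()); // assumed poll(timeoutMs, currentTimeMs)
        timer.update();
    } while (!networkClientDelegate.unsentRequests().isEmpty() && timer.notExpired());
}
```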
| */ | ||
| @Override | ||
| public PollResult pollOnClose() { | ||
| // TODO: move the logic to poll to handle signal close |
Yes, using the normal poll loop sounds like a good idea. We should still probably sendUnsentRequests once when the timeout has passed.
739bd4f to afb605d Compare
lucasbru
left a comment
Approved from my side with minor comments, but it needs to be rebased, approved by Kirk, and to pass CI.
| closeQuietly(fetchBuffer, "Failed to close the fetch buffer", firstException); | ||
| closeQuietly(() -> applicationEventHandler.close(Duration.ofMillis(closeTimer.remainingMs())), "Failed shutting down network thread", firstException); | ||
| closeTimer.update(); | ||
| // Ensure all async commit callbacks are invoked |
Might not even need this comment.
| droppedPartitions.addAll(subscriptions.assignedPartitions()); | ||
| if (!subscriptions.hasAutoAssignedPartitions() || droppedPartitions.isEmpty()) | ||
| return CompletableFuture.completedFuture(null); | ||
| // TODO: Invoke rebalanceListener via KAFKA-15276 |
Can we merge it without resolving this comment?
| final Timer timer) { | ||
| // These are the optional outgoing requests at the | ||
| List<NetworkClientDelegate.PollResult> pollResults = requestManagers.stream() | ||
| requestManagers.stream() |
Can we merge it without resolving this comment?
| ConsumerNetworkThread.runAtClose(singletonList(Optional.of(fetcher)), networkClientDelegate, timer); | ||
|
|
||
| // the network is polled during the last state of clean up. | ||
| networkClientDelegate.poll(time.timer(1)); |
Does Kirk need to confirm this change before we can merge the PR?
afb605d to 0fc4a0a Compare
| assertTrue(applicationEventsQueue.isEmpty()); | ||
| } | ||
|
|
||
| @Test |
Removed because they are irrelevant now.
kirktrue
left a comment
Thanks for the PR, @philipnee. Please make sure to file those follow-up Jiras 😉
|
Restarting the build as the previous one was aborted. @philipnee have you run the integration tests on this one? |
|
Sorry @lucasbru - I forgot to post the integration results, but I did run them. Let me do that for the record. |
|
CI ran. Can you check if this PR is in any way related? https://ci-builds.apache.org/blue/rest/organizations/jenkins/pipelines/Kafka/pipelines/kafka-pr/branches/PR-14937/runs/26/nodes/11/steps/87/log/?start=0 |
|
@philipnee If tests are flaky in the PR that are also flaky on trunk, I don't think this is blocking the merge given the current state of the CI. The problem is that the builds fail completely, so there is no way to tell whether this PR introduces new problems or not. I think this may also be caused by other changes on trunk. For now, I think we have to restart CI and hope for the situation to improve, or debug the build failures ourselves. I looked into it briefly (checking the last few runs on trunk), and it seemed to me that a lot of failures of Gradle Executors followed the execution of |
|
@philipnee I think part of the reason why things are running particularly badly this weekend might be a change in the transaction tests. I proposed a revert here: #15029. I will retrigger the CI for this PR for now. Let's see if we get the revert PR merged; then we can rebase this PR and hopefully get stable runs again. |
|
Thanks @lucasbru |
d09d3be to 3dc624d Compare
more clean up
clean up
clean up
fix broken tests
clean up
refactor based on PR comment
clean up
Update ApplicationEventProcessorTest.java
Update MetadataTest.java
d22c9ef to 1786208 Compare
|
Hey @lucasbru - I think your patch fixed the long running/OOM issue with the async consumer test. The tests finished within 3.5 hours in this commit: 1786208. They are observed sparsely in other builds I've seen, so I'm disabling them in the latest commit: 11a3ae6 |
| List<ConsumerPartitionAssignor> assignors, | ||
| String groupId, | ||
| String clientId) { | ||
| return new AsyncKafkaConsumer<>( |
@philipnee I removed this on purpose from the test file so as to use the normal constructor and not just test a mock construction of the class. I'm sure there could have been a better way than to reintroduce the constructor that I removed. However, since we are otherwise not converging with this PR, I am going to merge this. Please consider following up with a PR that removes this constructor again.
We drive the consumer closing via events and rely on the still-alive network thread to complete these operations. This work encompasses several tickets: KAFKA-15696/KAFKA-15548.

When closing the consumer we need to perform a few tasks. Here is the top-level overview: we want to keep the network thread alive until we are ready to shut down, i.e., until no more requests need to be sent out. To achieve this, I implemented a method, signalClose(), to signal the managers to prepare for shutdown. Once we signal the network thread to close, the managers prepare their remaining requests to be sent out on the next event loop, and the network thread can be closed after issuing these events. The application thread's task is straightforward: 1. tell the background thread to perform the close events, and 2. block on certain events until they succeed or the timer runs out. Once all requests are sent out, we close the network thread and other components as usual.

Here I outline the changes in detail:
- AsyncKafkaConsumer: shutdown procedures, plus several utility functions to ensure proper exceptions are thrown during shutdown
- AsyncKafkaConsumerTest: examined each individual test and fixed the ones that block for too long or log errors
- CommitRequestManager: signalClose()
- FetchRequestManagerTest: changes due to the change in pollOnClose()
- ApplicationEventProcessor: handle CommitOnClose and LeaveGroupOnClose; the latter triggers leaveGroup(), which should be completed on the next heartbeat (or we time out on the application thread)

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Kirk True <ktrue@confluent.io>
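For readers skimming the description, a condensed sketch of the application-thread side of this flow; the event class names and helpers are assumptions for illustration, not the exact code merged here:

```java
// Hypothetical outline of closeInternal(): drive the close via events, block on each one
// until it succeeds or the timer runs out, then shut down the network thread as usual.
private void closeInternal(final Duration timeout) {
    final Timer closeTimer = time.timer(timeout);
    final AtomicReference<Throwable> firstException = new AtomicReference<>();

    // 1. Ask the background thread to send the final commit and leave the group.
    for (CompletableApplicationEvent<Void> event :
            Arrays.<CompletableApplicationEvent<Void>>asList(
                    new CommitOnCloseEvent(), new LeaveGroupOnCloseEvent())) { // assumed event types
        try {
            applicationEventHandler.addAndGet(event, closeTimer);
        } catch (Exception e) {
            firstException.compareAndSet(null, e);
        } finally {
            closeTimer.update();
        }
    }

    // 2. Once the close events are done (or timed out), close the network thread and the
    //    remaining components with whatever time is left.
    closeQuietly(() -> applicationEventHandler.close(Duration.ofMillis(closeTimer.remainingMs())),
            "Failed shutting down network thread", firstException);

    if (firstException.get() != null)
        throw new KafkaException("Failed to close Kafka consumer", firstException.get());
}
```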