Parallelize supervisor stop logic to make it run faster#17535
kfaraz merged 10 commits into apache:master
Conversation
}
log.info("Waiting for [%d] supervisors to shutdown", stopFutures.size());
try {
  FutureUtils.coalesce(stopFutures).get(80, TimeUnit.SECONDS);
I don't think we should use a timeout of 80s here since each supervisor could have a different value of shutdown timeout. We could either just do get() with no args (which would be no worse than what the code is currently doing) or use a longer timeout.
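The reviewer's point can be sketched as follows. This is a hypothetical illustration (not the actual Druid code): instead of a hard-coded 80 seconds, the wait could be bounded by the largest configured per-supervisor shutdown timeout. The class and parameter names (`StopWait`, `shutdownTimeoutsMillis`) are invented for the sketch, and `CompletableFuture.allOf` stands in for `FutureUtils.coalesce`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Sketch: derive the overall wait bound from per-supervisor shutdown timeouts
// rather than hard-coding 80s, which may be shorter than a supervisor's own
// configured tuningConfig.shutdownTimeout.
class StopWait
{
  static boolean awaitAll(List<CompletableFuture<Void>> stopFutures, List<Long> shutdownTimeoutsMillis)
  {
    // Upper bound: the slowest supervisor's allowed shutdown time, plus slack.
    long maxTimeoutMillis =
        shutdownTimeoutsMillis.stream().mapToLong(Long::longValue).max().orElse(0L);
    try {
      CompletableFuture.allOf(stopFutures.toArray(new CompletableFuture[0]))
                       .get(maxTimeoutMillis + 10_000, TimeUnit.MILLISECONDS);
      return true;
    }
    catch (Exception e) {
      // Timed out or failed: log and move on rather than blocking forever.
      return false;
    }
  }
}
```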
{
this.metadataSupervisorManager = metadataSupervisorManager;
this.shutdownExec = MoreExecutors.listeningDecorator(
  Execs.multiThreaded(25, "supervisor-manager-shutdown-%d")
25 may be excessive in some cases and inadequate in others. Maybe initialize the executor lazily inside the stop() method, then the number of required threads can be computed at run time. The shutdownExec need not be a class-level field either.
Alternatively, instead of using a completely new executor, you could consider using the scheduledExec inside each supervisor. That executor basically just sits idle most of the time and is responsible only for submitting RunNotice to the notice queue.
You could add a stopAsync method to SeekableStreamSupervisor that does the following:
- returns a future that we coalesce and wait upon
- internally submits a runnable to the scheduledExec to perform the actual stop
I guess the only thing we will miss out on is parallelizing the autoscaler.stop(), which should not be a concern?
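The suggested shape of stopAsync might look like the following. This is only a sketch of the reviewer's idea, with invented names (`SupervisorSketch`) and `CompletableFuture` standing in for Guava's `SettableFuture`/`ListenableFuture`; the real supervisor's stop logic is elided.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Sketch: reuse the supervisor's own (mostly idle) scheduled executor to run
// stop(), returning a future the manager can coalesce and wait on.
class SupervisorSketch
{
  private final ScheduledExecutorService scheduledExec =
      Executors.newSingleThreadScheduledExecutor();
  volatile boolean stopped = false;

  void stop(boolean stopGracefully)
  {
    // ... push shutdown notice, wait for it to run, etc. (elided) ...
    stopped = true;
  }

  CompletableFuture<Void> stopAsync(boolean stopGracefully)
  {
    CompletableFuture<Void> stopFuture = new CompletableFuture<>();
    scheduledExec.submit(() -> {
      try {
        stop(stopGracefully);
        stopFuture.complete(null);
      }
      catch (Throwable t) {
        stopFuture.completeExceptionally(t);
      }
    });
    return stopFuture;
  }
}
```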
The issue I was running into with this strategy is that part of the stop logic is shutting down the scheduledExec executor itself, and I couldn't really think of a great way to avoid this chicken-and-egg problem.
You could perhaps work around that problem by doing something like this:
- stopAsync sets the supervisor state to STOPPING
- stopAsync then submits a stop() runnable to the scheduledExec
- the buildRunTask method should check and submit the RunNotice only if the state of the supervisor is not STOPPING
- stop() can call scheduledExec.shutdown() instead of scheduledExec.shutdownNow()
Another alternative is to simply create a shutdownExec inside stopAsync.
Difference from the current approach would be that the SupervisorManager doesn't need to handle the lifecycle of the shutdown executor.
Let me know what you think.
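The first workaround above can be sketched like this. All names (`StoppableSupervisor`, `maybeSubmitRunNotice`) are illustrative, not the actual Druid API; the key ordering is that the STOPPING flag blocks new run notices before stop() is submitted, and that stop() uses shutdown() so it does not interrupt the very task that is executing it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the STOPPING-state workaround to the chicken-and-egg problem of
// stop() needing to shut down the executor it runs on.
class StoppableSupervisor
{
  private final ExecutorService scheduledExec = Executors.newSingleThreadExecutor();
  private final AtomicBoolean stopping = new AtomicBoolean(false);

  // Analogue of buildRunTask: skip scheduling once we are STOPPING.
  boolean maybeSubmitRunNotice()
  {
    if (stopping.get()) {
      return false; // no new RunNotice while stopping
    }
    scheduledExec.submit(() -> { /* enqueue RunNotice (elided) */ });
    return true;
  }

  CompletableFuture<Void> stopAsync()
  {
    stopping.set(true);                  // 1. set state to STOPPING
    CompletableFuture<Void> future = new CompletableFuture<>();
    scheduledExec.submit(() -> {         // 2. run stop() on scheduledExec itself
      stop();
      future.complete(null);
    });
    return future;
  }

  private void stop()
  {
    // 3. shutdown() lets the in-flight stop task finish; shutdownNow() would
    //    interrupt the task that is currently running this method.
    scheduledExec.shutdown();
  }
}
```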
I think this makes sense, will update
if (!stopGracefully) {
  stopped = true;
  break;
}
If we have already parallelized the stop of supervisors, is this still needed?
we could probably pull it out
*/
void stop(boolean stopGracefully);

default ListenableFuture<Void> stopAsync(boolean stopGracefully)
Maybe add a javadoc.
Also, does this need the stopGracefully parameter?
default ListenableFuture<Void> stopAsync(boolean stopGracefully)
{
  SettableFuture<Void> stopFuture = SettableFuture.create();
  stop(stopGracefully);
If this method throws an exception, the future should be completed with the exception.
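The fix the reviewer is asking for could look like this. A minimal sketch using a JDK `CompletableFuture` in place of Guava's `SettableFuture` (the interface name and signature here mirror the diff but are illustrative): if stop() throws, the returned future is completed exceptionally so callers waiting on it fail fast instead of hanging forever.

```java
import java.util.concurrent.CompletableFuture;

// Sketch: a default stopAsync that propagates failures from stop() into the
// returned future rather than leaving the future forever pending.
interface Supervisor
{
  void stop(boolean stopGracefully);

  default CompletableFuture<Void> stopAsync(boolean stopGracefully)
  {
    CompletableFuture<Void> stopFuture = new CompletableFuture<>();
    try {
      stop(stopGracefully);
      stopFuture.complete(null);
    }
    catch (Exception e) {
      // Complete the future with the exception so waiters see the failure.
      stopFuture.completeExceptionally(e);
    }
    return stopFuture;
  }
}
```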
…blestream/supervisor/SeekableStreamSupervisor.java Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
Thanks for the changes, @georgew5656 !
Sometimes the LifecycleStop method of SupervisorManager (SupervisorManager.stop()) can take a long time to run, because it iterates through all running supervisors and calls stop() on each of them serially. Each streaming supervisor's stop() call pushes a ShutdownNotice onto its notice queue and then waits up to tuningConfig.shutdownTimeout for the notice to run and set stopped = true. The total run time can therefore be the sum of tuningConfig.shutdownTimeout (default 80 seconds) across all supervisors.
This long stop time can cause lots of issues, most notably Overlord leadership issues when the ZK leader is terminated (but the ensemble maintains quorum). An Overlord pod that disconnects and then reconnects to ZK can get becomeLeader queued up behind stopLeader, since the two methods share a giant lock.
This PR makes SupervisorManager.stop() complete faster to prevent this issue.
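The core idea of the change can be sketched as follows. This is an illustrative sketch, not the actual Druid code (`ParallelStopSketch` and its parameter names are invented, and `CompletableFuture` stands in for Guava's `ListenableFuture`): stopping every supervisor on a shutdown pool and waiting on all of them bounds total stop time by the slowest supervisor instead of the sum over all supervisors.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: fan the stop() calls out to a pool sized from the number of running
// supervisors, then wait for all of them to finish.
class ParallelStopSketch
{
  static void stopAll(List<Runnable> supervisorStops)
  {
    ExecutorService shutdownExec =
        Executors.newFixedThreadPool(Math.max(1, supervisorStops.size()));
    try {
      CompletableFuture.allOf(
          supervisorStops.stream()
                         .map(stop -> CompletableFuture.runAsync(stop, shutdownExec))
                         .toArray(CompletableFuture[]::new)
      ).join();
    }
    finally {
      shutdownExec.shutdown();
    }
  }
}
```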
Release note
Improved recovery time for Overlord leadership after ZooKeeper nodes are bounced.
Key changed/added classes in this PR
SupervisorManager
SeekableStreamSupervisor