Skip to content

Add graceful shutdown timeout#4716

Closed
niketh wants to merge 6 commits intoapache:masterfrom
niketh:add-shutdown-timeout
Closed

Add graceful shutdown timeout#4716
niketh wants to merge 6 commits intoapache:masterfrom
niketh:add-shutdown-timeout

Conversation

@niketh
Copy link
Copy Markdown
Contributor

@niketh niketh commented Aug 24, 2017

We see queries failing while realtime handoff is happening mainly because Jetty is not being shutdown gracefully. Here is the error
Failure getting results from[http://host:port/druid/v2/] because of [org.jboss.netty.channel.ChannelException: Channel disconnected

The solution is to add a graceful shutdown. This will also reduce failed queries on shutdown on the broker and historical also.

@niketh niketh force-pushed the add-shutdown-timeout branch 2 times, most recently from fda7651 to 27fcbf8 Compare August 24, 2017 23:39
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread.currentThread().interrupt(); maybe and/or rethrow

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Addressed it.

@niketh niketh force-pushed the add-shutdown-timeout branch from 27fcbf8 to e9d329a Compare August 25, 2017 00:01
public void stop()
{
if (activeConnections.get() != 0) {
long graceFullShutDownTime = (int) server.getAttribute(GRACEFUL_SHUTDOWN_TIME);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gracefulShutDownTime?

}
catch (InterruptedException e) {
log.error("Sleep has been interrupted, while waiting for active requests to be zero");
Thread.currentThread().interrupt();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this further wait if the current thread is interrupted early?

@niketh niketh closed this Aug 29, 2017
@niketh niketh reopened this Aug 29, 2017
@niketh
Copy link
Copy Markdown
Contributor Author

niketh commented Aug 29, 2017

@jihoonson Incorporated your comments

@niketh niketh closed this Aug 29, 2017
@niketh niketh reopened this Aug 29, 2017
@niketh niketh closed this Aug 29, 2017
@niketh niketh reopened this Aug 29, 2017
{
if (activeConnections.get() != 0) {
long graceFullShutDownTime = (int) server.getAttribute(GRACEFUL_SHUTDOWN_TIME);
long graceFulShutDownTime = (long) server.getAttribute(GRACEFUL_SHUTDOWN_TIME);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

graceFul should be graceful.

@Override
public void stop()
{
if (activeConnections.get() != 0) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think activeConnections.get() > 0 is more appropriate because we don't have to wait even if it returns a negative value due to some bugs.

}
catch (InterruptedException e) {
log.error("Sleep has been interrupted, while waiting for active requests to be zero");
stopImmediately();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you added to docs, graceful shutdown is useful to wait for running queries to be completed. So, I think when an InterruptedException occurs earlier while waiting for the given GRACEFUL_SHUTDOWN_TIME, we first check activeConnections is 0 and then further wait for remaining time if there remains more active connections.
What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When an InterruptedException occurs, the thread should finish whatever it was doing as soon as possible. In this specific case , it should shutdown ASAP

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Looks good to me.

private static final Logger log = new Logger(JettyServerModule.class);

private static final AtomicInteger activeConnections = new AtomicInteger();
private static final String GRACEFUL_SHUTDOWN_TIME = "graceful_shutdown_time";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: gracefulShutdownTimeout would be more consistent with the property name.

log.info("Waiting for [%s] milliseconds for active requests to be zero, current active requests=[%s]",
gracefulShutDownTime, activeConnections.get());
try {
Thread.sleep(gracefulShutDownTime);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple questions.

  1. Will the server always unannounce itself before entering this stop() code? How is that assured? If it happens, that's good; but if not, that would mean new requests could keep coming in continuously even while waiting for shutdown. Inevitably the last few will fail.
  2. Could this code check activeConnections every few ms, rather than always sleeping for the gracefulShutdownTimeout? Otherwise, it looks like an extra 5s of potentially needless delay.
  3. If both of the above are assured, then I think it could make sense to set the default gracefulShutdownTimeout to somewhat longer, like a few minutes.

public void stop()
{
if (activeConnections.get() > 0) {
long gracefulShutDownTime = (long) server.getAttribute(GRACEFUL_SHUTDOWN_TIME);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: gracefulShutdownTimeout would be more consistent with the property name.

@niketh niketh force-pushed the add-shutdown-timeout branch from 57826ae to 127ed33 Compare September 5, 2017 22:37
@niketh niketh closed this Sep 6, 2017
@niketh niketh reopened this Sep 6, 2017
@niketh
Copy link
Copy Markdown
Contributor Author

niketh commented Sep 6, 2017

@gianm

  1. Servers always will announce before they shutdown
  2. Added code to check activeConnections periodically
  3. Increases timeout to 15s


@JsonProperty
@NotNull
private Period gracefulShutdownTimeout = new Period("PT15s");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation says it is 5s by default. should probably be PT5s .

long startTime = System.currentTimeMillis();
long gracefulShutDownTimeout = (long) server.getAttribute(GRACEFUL_SHUTDOWN_TIMEOUT);

while (activeConnections.get() > 0 &&
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think activeConnections is number of socket connections setup and not necessarily number of requests in progress or queued. connection pooling clients would generally never close connections and the count here might not reduce.
is there a more reliable way of looking at number of requests being processed instead ?

@niketh niketh force-pushed the add-shutdown-timeout branch 3 times, most recently from 374937d to e6181c4 Compare September 12, 2017 21:27
@drcrallen
Copy link
Copy Markdown
Contributor

Done in #5429

@drcrallen drcrallen closed this Mar 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants