
Profiler can hang the JVM when request is exited after ABORTED state #43

@alexandru-es

In the
void APIThrottler::OnCreationError(const grpc::Status& st) {
method, the throttler sleeps until the retry operation should happen. The sleep duration varies, but on average it's around 38 minutes.

clock_->SleepFor(NanosToTimeSpec(backoff_ns));

If the JVM wants to exit before that, it's kept hanging by the sleep that was set.

The throttler has a closed_ field that indicates that the JVM is shutting down. That field should probably be used to decide whether we still need to sleep or can return early to release the resources.

So instead of an uninterruptible "clock_->SleepFor 38 minutes",
there could be something more permissive, like

wait_start_time = time before wait
while (now - wait_start_time < 38 minutes AND !closed_) {
    sleep 30s
}

or any other interruptible timeout, so that when the JVM wants to exit it doesn't have to wait for this.

The JVM shutdown signal is set here:

LOG(INFO) << "On VM death";

The next line stops the worker, which then notifies the throttler that it's stopping. But if the throttler is stuck in the sleep, it won't react to the notification until the sleep is over, which prevents the JVM from exiting.
