-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Timeout for LockAcquireAction #4461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
e88e9bf
0151042
d9ad40c
40930e0
b7cbb86
fb6dec4
751ced2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -40,7 +40,7 @@ | |
| import io.druid.java.util.common.Pair; | ||
| import io.druid.java.util.common.guava.Comparators; | ||
| import io.druid.java.util.common.guava.FunctionalIterable; | ||
|
|
||
| import io.druid.server.initialization.ServerConfig; | ||
| import org.joda.time.DateTime; | ||
| import org.joda.time.Interval; | ||
|
|
||
|
|
@@ -52,6 +52,7 @@ | |
| import java.util.NavigableSet; | ||
| import java.util.Set; | ||
| import java.util.TreeMap; | ||
| import java.util.concurrent.TimeUnit; | ||
| import java.util.concurrent.locks.Condition; | ||
| import java.util.concurrent.locks.ReentrantLock; | ||
|
|
||
|
|
@@ -67,6 +68,7 @@ public class TaskLockbox | |
| private final TaskStorage taskStorage; | ||
| private final ReentrantLock giant = new ReentrantLock(true); | ||
| private final Condition lockReleaseCondition = giant.newCondition(); | ||
| protected final long lockTimeoutMillis; | ||
|
|
||
| private static final EmittingLogger log = new EmittingLogger(TaskLockbox.class); | ||
|
|
||
|
|
@@ -76,10 +78,21 @@ public class TaskLockbox | |
|
|
||
| @Inject | ||
| public TaskLockbox( | ||
| TaskStorage taskStorage | ||
| TaskStorage taskStorage, | ||
| ServerConfig serverConfig | ||
| ) | ||
| { | ||
| this.taskStorage = taskStorage; | ||
| this.lockTimeoutMillis = serverConfig.getMaxIdleTime().getMillis(); | ||
| } | ||
|
|
||
| public TaskLockbox( | ||
| TaskStorage taskStorage, | ||
| long lockTimeoutMillis | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should every lockAcquireAction have the same timeout? I think it would be valuable if we can change the timeout depending on task specs in the future. For example, a task can have a long timeout if it should acquire a lock for a long interval.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Using maxIdleTime as a default value sounds good.
I think we need to set different timeouts for each lock request in very near future because I'm working on prioritized locking (#4479, #1679) and this timeout feature will be great if tasks can set different timeouts according to their priorities.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, I'll create a separate PR where client can send locktimeout, if not default
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok. Thanks!
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
| ) | ||
| { | ||
| this.taskStorage = taskStorage; | ||
| this.lockTimeoutMillis = lockTimeoutMillis; | ||
| } | ||
|
|
||
| /** | ||
|
|
@@ -179,15 +192,36 @@ public int compare(Pair<Task, TaskLock> left, Pair<Task, TaskLock> right) | |
| */ | ||
| public TaskLock lock(final Task task, final Interval interval) throws InterruptedException | ||
| { | ||
| long timeout = lockTimeoutMillis; | ||
| giant.lock(); | ||
| try { | ||
| Optional<TaskLock> taskLock; | ||
| while (!(taskLock = tryLock(task, interval)).isPresent()) { | ||
| lockReleaseCondition.await(); | ||
| long startTime = System.currentTimeMillis(); | ||
| lockReleaseCondition.await(timeout, TimeUnit.MILLISECONDS); | ||
| long timeDelta = System.currentTimeMillis() - startTime; | ||
| if (timeDelta >= timeout) { | ||
| log.error( | ||
| "Task [%s] can not acquire lock for interval [%s] within [%s] ms", | ||
| task.getId(), | ||
| interval, | ||
| lockTimeoutMillis | ||
| ); | ||
|
|
||
| throw new InterruptedException(String.format( | ||
| "Task [%s] can not acquire lock for interval [%s] within [%s] ms", | ||
| task.getId(), | ||
| interval, | ||
| lockTimeoutMillis | ||
| )); | ||
| } else { | ||
| timeout -= timeDelta; | ||
| } | ||
| } | ||
|
|
||
| return taskLock.get(); | ||
| } finally { | ||
| } | ||
| finally { | ||
| giant.unlock(); | ||
| } | ||
| } | ||
|
|
@@ -247,6 +281,12 @@ private Optional<TaskLock> tryLock(final Task task, final Interval interval, fin | |
| if (foundPosse.getTaskLock().getInterval().contains(interval) && foundPosse.getTaskLock().getGroupId().equals(task.getGroupId())) { | ||
| posseToUse = foundPosse; | ||
| } else { | ||
| //Could be a deadlock for LockAcquireAction: same task trying to acquire lock for overlapping interval | ||
| if (foundPosse.getTaskIds().contains(task.getId())) { | ||
| log.makeAlert("Same Task is trying to acquire lock for overlapping interval") | ||
| .addData("task", task.getId()) | ||
| .addData("interval", interval); | ||
| } | ||
| return Optional.absent(); | ||
| } | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maxIdelTime looks to be deprecated (http://druid.io/docs/latest/configuration/indexing-service.html).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serverConfig.getMaxIdleTime() corresponds to
druid.server.http.maxIdleTimeand is not deprecated.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right.