Scheduler/TaskReaper: handle unassigned tasks marked for shutdown #2574
Conversation
// Also ignore tasks that have not yet been assigned but desired state is beyond TaskStateRunning
// This can happen if you update, delete or scale down a service before its tasks were assigned.
if t.Status.State == api.TaskStatePending && t.DesiredState > api.TaskStateRunning {
Why is this comparison not t.Status.State <= api.TaskStatePending?
The < is already handled in the condition above. @nishanttotla
@anshulpundir sorry maybe I'm missing something, but for case api.EventUpdateTask:, I don't see a specific condition about this.
My bad, please ignore my comments. I got confused between two changes.
Codecov Report
@@ Coverage Diff @@
## master #2574 +/- ##
==========================================
- Coverage 62.26% 61.55% -0.72%
==========================================
Files 134 134
Lines 21736 21805 +69
==========================================
- Hits 13535 13422 -113
- Misses 6754 6935 +181
- Partials 1447 1448 +1
// Also clean up tasks which have not yet been assigned but have been
// marked for shutdown (likely because of a service update).
if t.Status.State < api.TaskStateAssigned && t.DesiredState == api.TaskStateShutdown {
Based on IRL discussion with @anshulpundir, we think there can be a race condition here (see also the related discussion in #2557). I'm unable to come up with an exact scenario, but it's something to think about cc @dperny @cyli. If y'all think this looks good, we can merge it.
UPDATE: Based on another IRL discussion with @anshulpundir, we concluded that this change looks fine in conjunction with the change in manager/scheduler/scheduler.go. The main idea is that as long as the current state is <= PENDING, the task will not have reached the agent, so it is safe to remove it more aggressively in the task reaper. After this change, we think it is also safe to merge #2557.
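The invariant being discussed can be sketched in a few lines of Go (a minimal, self-contained sketch with simplified stand-in types, not swarmkit's actual api package):

```go
package main

import "fmt"

// TaskState is a simplified stand-in for swarmkit's api.TaskState.
// The ordering matters: a task progresses Pending -> Assigned -> Running -> Shutdown.
type TaskState int

const (
	TaskStatePending TaskState = iota
	TaskStateAssigned
	TaskStateRunning
	TaskStateShutdown
)

// TaskStatus and Task mirror only the fields the check needs.
type TaskStatus struct{ State TaskState }

type Task struct {
	Status       TaskStatus
	DesiredState TaskState
}

// canReapUnassigned reports whether the reaper may delete the task outright:
// while the current state is below ASSIGNED the task has never reached an
// agent, so if it is already marked for shutdown nothing else will act on it.
func canReapUnassigned(t Task) bool {
	return t.Status.State < TaskStateAssigned && t.DesiredState == TaskStateShutdown
}

func main() {
	pending := Task{Status: TaskStatus{TaskStatePending}, DesiredState: TaskStateShutdown}
	running := Task{Status: TaskStatus{TaskStateRunning}, DesiredState: TaskStateShutdown}
	fmt.Println(canReapUnassigned(pending)) // true: never reached an agent
	fmt.Println(canReapUnassigned(running)) // false: must go through normal shutdown
}
```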
Actually, apologies for being so mercurial, but I have 3 questions on further examination:

- For the task reaper, do we also need to find pending tasks that are in desired state shutdown when the task reaper first starts up? I wasn't entirely sure from reading that moby issue, but do the tasks get stuck permanently in pending/shutdown, and if so, would they get stuck there forever if an update was lost due to a change in leadership?
- Would it be possible to add tests for reaping these cases?
- Would it make sense to also add a test for the case where the scheduler won't schedule a task if it's pending but the desired state is > running?
I think they get stuck regardless since the reaper doesn't consider tasks with current state < assigned.
yup, updating the PR!
@anshulpundir Er sorry, I asked that question backwards. I meant: do we need to add a similar check for current state < assigned and desired state == shutdown to https://github.com/anshulpundir/swarmkit/blob/b1bbd23bf33f90d643016a54f31bfcb4eaaa9c6e/manager/orchestrator/taskreaper/task_reaper.go#L72? Because if the task reaper gets the update that a pending task should shut down, but then dies because of a leader change, the next leader may not get the update event. So would that pending/shutdown task get stuck forever?
Ahh yes, I meant to update that. But yes, thanks for pointing that out! @cyli
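The startup scan under discussion could look roughly like this (an illustrative sketch; the types and the initCleanup name are stand-ins, not swarmkit's actual API). On startup, the new leader's reaper queries the store for tasks with desired state SHUTDOWN and queues the never-assigned ones for cleanup, so they aren't stuck waiting for an update event that was lost during the leadership change:

```go
package main

import "fmt"

// TaskState is a simplified stand-in for swarmkit's api.TaskState ordering:
// Pending -> Assigned -> Running -> Shutdown.
type TaskState int

const (
	TaskStatePending TaskState = iota
	TaskStateAssigned
	TaskStateRunning
	TaskStateShutdown
)

type Task struct {
	ID           string
	State        TaskState // current (observed) state
	DesiredState TaskState
}

// initCleanup emulates the reaper's init-time pass: shutdownTasks stands in
// for the result of a store query for desired state SHUTDOWN. Tasks that
// never reached an agent are queued for deletion on a later tick rather than
// being left to wait for an update event the new leader never received.
func initCleanup(shutdownTasks []Task) []string {
	var cleanup []string
	for _, t := range shutdownTasks {
		if t.State < TaskStateAssigned {
			cleanup = append(cleanup, t.ID)
		}
	}
	return cleanup
}

func main() {
	tasks := []Task{
		{ID: "t1", State: TaskStatePending, DesiredState: TaskStateShutdown},
		{ID: "t2", State: TaskStateRunning, DesiredState: TaskStateShutdown},
	}
	fmt.Println(initCleanup(tasks)) // only t1 is queued; t2 shuts down normally
}
```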
… running. Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>
Good point @cyli. On that note, we might want to think about modularizing some code in the task reaper, since we've been having to put every new condition in two places.
ping @cyli @nishanttotla for review.
// Set both task states to RUNNING.
updatedTask1 := observedTask1.Copy()
updatedTask1.DesiredState = api.TaskStateRunning
@anshulpundir sorry I don't follow why this is removed?
Because the desired state is api.TaskStateRunning when a task is created.
// Normally task2/task3 should get assigned first since they are preassigned tasks.
assignment3 := watchAssignment(t, watch)
assert.Equal(t, assignment3.ID, "task1")
assert.Equal(t, assignment3.NodeID, "node1")
Should we also assert here that task2/task3 are still in the same state?
cyli left a comment:
Apologies @anshulpundir, just one question on task history and pending/shutdown tasks.
if err != nil {
	log.G(ctx).WithError(err).Error("failed to find tasks with desired state SHUTDOWN in task reaper init")
}
for _, t := range shutdownTasks {
Non-blocking optimization: I was wondering if it'd make sense to move this loop outside of the view? If there are a lot of shutdown tasks, this could block whatever lock the view function is holding.
Also a great point.
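The suggested optimization is a common pattern: copy out only the IDs while the view is held, then do the per-task work after releasing it. A generic Go sketch, with the view modeled as a plain mutex-guarded map (not swarmkit's actual store API):

```go
package main

import (
	"fmt"
	"sync"
)

// store models a view-guarded task store: reading tasks requires holding mu.
type store struct {
	mu    sync.Mutex
	tasks map[string]bool // taskID -> "never assigned but marked for shutdown"
}

// reapableIDs holds the lock only long enough to copy the matching IDs out,
// so a long cleanup pass does not block other users of the view.
func (s *store) reapableIDs() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	var ids []string
	for id, reapable := range s.tasks {
		if reapable {
			ids = append(ids, id)
		}
	}
	return ids
}

func main() {
	s := &store{tasks: map[string]bool{"t1": true, "t2": false}}
	ids := s.reapableIDs()
	// The (potentially slow) per-task cleanup now runs without the lock held.
	for _, id := range ids {
		fmt.Println("cleaning up", id)
	}
}
```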
}
for _, t := range shutdownTasks {
	if t.Status.State < api.TaskStateAssigned {
		tr.cleanup = append(tr.cleanup, t.ID)
Would this clean up right away if these are added here? If there are any orphaned or removed tasks, line 126 calls tr.tick(), which cleans up all tasks in tr.cleanup immediately.
Perhaps this loop should be moved below the orphaned and removed tasks check? Would it make sense to add a check to make sure that these tasks are handled through the regular history process?
Nice catch. I'll fix this. I think I can also add a unit test for this.
testutils.Expect(t, watch, state.EventCommit{})
// Set the task state for the restarted task to PENDING to simulate allocation.
Non-blocking: To simulate what actually happens, should this be set to pending before the force update, so that the task is already in the pending state when it is shut down? If not, why must it occur here instead of before the force update?
…n assigned but have been marked for shutdown. Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>
// This check is important to ignore tasks which are running or need to be running,
// but to delete tasks which are either past running,
// or have not reached running but need to be shutdown (because of a service update, for example).
if t.DesiredState == api.TaskStateRunning && t.Status.State <= api.TaskStateRunning {
I'm not sure this is really needed. Don't new tasks always start with desired state running?
Not sure why you think this is not needed. We need to shut down tasks with desired state SHUTDOWN and actual state < ASSIGNED.
This condition skips tasks which are not covered by the condition above. LMK if this is not clear and we can discuss IRL. @nishanttotla
Oh oops, I missed the change from || to &&, because previously it would count a task with a desired state of SHUTDOWN and a current state of PENDING as running. Thanks for fixing.
Oops, I missed it too. This is needed, sorry for the slip.
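The || vs. && difference that tripped everyone up can be made concrete with a small sketch (simplified stand-in types; keep/keepBuggy are illustrative names, not swarmkit functions): with &&, a PENDING task whose desired state is SHUTDOWN is no longer counted as running, so the reaper is free to delete it.

```go
package main

import "fmt"

// TaskState is a simplified stand-in for swarmkit's api.TaskState ordering:
// Pending -> Assigned -> Running -> Shutdown.
type TaskState int

const (
	TaskStatePending TaskState = iota
	TaskStateAssigned
	TaskStateRunning
	TaskStateShutdown
)

type Task struct {
	State        TaskState
	DesiredState TaskState
}

// keep mirrors the corrected check: only tasks that need to be running AND
// have not progressed past running are treated as running.
func keep(t Task) bool {
	return t.DesiredState == TaskStateRunning && t.State <= TaskStateRunning
}

// keepBuggy mirrors the earlier || version: it also counts a task with
// desired state SHUTDOWN and current state PENDING as running, so such a
// task could never be deleted and got stuck.
func keepBuggy(t Task) bool {
	return t.DesiredState == TaskStateRunning || t.State <= TaskStateRunning
}

func main() {
	stuck := Task{State: TaskStatePending, DesiredState: TaskStateShutdown}
	fmt.Println(keep(stuck), keepBuggy(stuck)) // false true: && lets it be reaped
}
```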
Relevant changes:

- moby/swarmkit#2551 RoleManager will remove deleted nodes from the cluster membership
- moby/swarmkit#2574 Scheduler/TaskReaper: handle unassigned tasks marked for shutdown
- moby/swarmkit#2561 Avoid predefined error log
- moby/swarmkit#2557 Task reaper should delete tasks with removed slots that were not yet assigned
- moby/swarmkit#2587 [fips] Agent reports FIPS status
- moby/swarmkit#2603 Fix manager/state/store.timedMutex

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Addresses moby/moby#36699
scheduler: changes to ignore unassigned tasks.
task reaper: changes to cleanup unassigned tasks in terminal state.