Task reaper should delete tasks with removed slots that were not yet assigned#2557
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2557 +/- ##
==========================================
+ Coverage 61.74% 62.19% +0.45%
==========================================
Files 51 134 +83
Lines 7112 21761 +14649
==========================================
+ Hits 4391 13535 +9144
- Misses 2264 6783 +4519
- Partials 457 1443 +986 |
|
LGTM |
| // haven't been assigned yet can be cleaned up right away | ||
| for _, t := range removeTasks { | ||
| if t.Status.State >= api.TaskStateCompleted { | ||
| if t.Status.State >= api.TaskStateCompleted || t.Status.State < api.TaskStateAssigned { |
There was a problem hiding this comment.
nit: I'd suggest reordering it as:
t.Status.State < api.TaskStateAssigned || t.Status.State >= api.TaskStateCompleted
which reads slightly better, since the two checks then follow the order of the task states.
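A minimal sketch of the reordered check, for illustration only. `TaskState` and its constants here are local stand-ins and an assumption of this sketch, not the swarmkit `api` package; only their relative ordering matters.

```go
package main

import "fmt"

// TaskState is a hypothetical stand-in for swarmkit's api.TaskState.
// Only the relative ordering of the values matters in this sketch.
type TaskState int

const (
	StateNew TaskState = iota
	StatePending
	StateAssigned
	StateRunning
	StateCompleted
	StateShutdown
)

// canCleanUpNow reports whether a task in a removed slot can be deleted
// right away: either it was never assigned, or it has already finished.
func canCleanUpNow(s TaskState) bool {
	return s < StateAssigned || s >= StateCompleted
}

func main() {
	states := []TaskState{StateNew, StatePending, StateAssigned, StateRunning, StateCompleted, StateShutdown}
	for _, s := range states {
		fmt.Printf("state=%d cleanup=%v\n", s, canCleanUpNow(s))
	}
}
```

Written this way, the two halves of the condition read in the same order as the states themselves: "before ASSIGNED" first, "at or past COMPLETE" second.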
| // add tasks that are yet unassigned or have progressed beyond COMPLETE, with | ||
| // desired state REMOVE. These tasks are associated with slots that were removed | ||
| // as part of a service scale down or service removal. | ||
| if t.DesiredState == api.TaskStateRemove && (t.Status.State >= api.TaskStateCompleted || t.Status.State < api.TaskStateAssigned) { |
|
@anshulpundir good point about races. Unless the agent doesn't try to run tasks that have desired state As of now, I don't think the agent does this check cc @stevvooe |
| // add tasks that are yet unassigned or have progressed beyond COMPLETE, with | ||
| // desired state REMOVE. These tasks are associated with slots that were removed | ||
| // as part of a service scale down or service removal. | ||
| if t.DesiredState == api.TaskStateRemove && (t.Status.State >= api.TaskStateCompleted || t.Status.State < api.TaskStateAssigned) { |
There was a problem hiding this comment.
Also, do we expect to get task update events for tasks that are marked REMOVE before they are assigned to the agent? If a task is marked REMOVE, does the orchestrator even assign it to an agent? @nishanttotla
There was a problem hiding this comment.
Good point @anshulpundir. It would be fair to expect the orchestrator to not assign those tasks at all. Let me check this.
There was a problem hiding this comment.
@anshulpundir so it seems like the dispatcher never checks desired state, only current state to decide to send down tasks: https://github.com/docker/swarmkit/blob/master/manager/dispatcher/dispatcher.go#L819
Does it make sense to check desired state here as well? Why should the dispatcher send a task to an agent if its desired state is past SHUTDOWN?
There was a problem hiding this comment.
From discussion IRL with @nishanttotla: It looks like the controller on the agent is responsible for looking at the desired state and acting on it (either by shutting down or not running at all). The dispatcher sends down all the tasks that are assigned to the agent, and any changes to those tasks. So if a task changes from DesiredState = RUNNING to DesiredState = SHUTDOWN it needs to send that update to the agent so that on the agent side the task gets shut down. If the dispatcher skips sending the update, the task will just keep running on the agent.
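To make the split of responsibilities described above concrete, here is a rough sketch. All types and function names (`Task`, `tasksForNode`, `agentAction`) are hypothetical and are not the swarmkit dispatcher or agent code; the sketch assumes that any desired state beyond RUNNING means the task should stop or never start.

```go
package main

import "fmt"

// DesiredState is a hypothetical stand-in for a task's desired state;
// it is not the swarmkit api type.
type DesiredState int

const (
	DesiredRunning DesiredState = iota
	DesiredShutdown
	DesiredRemove
)

// Task carries only the fields this sketch needs.
type Task struct {
	ID      string
	NodeID  string
	Desired DesiredState
}

// tasksForNode plays the dispatcher: it forwards every task assigned to
// the node and does not filter on desired state.
func tasksForNode(all []Task, nodeID string) []Task {
	var out []Task
	for _, t := range all {
		if t.NodeID == nodeID {
			out = append(out, t)
		}
	}
	return out
}

// agentAction plays the agent-side controller: it looks at the desired
// state and decides whether to run the task or stop it.
func agentAction(t Task) string {
	if t.Desired > DesiredRunning {
		return "stop"
	}
	return "run"
}

func main() {
	all := []Task{
		{ID: "t1", NodeID: "node-a", Desired: DesiredRunning},
		{ID: "t2", NodeID: "node-a", Desired: DesiredShutdown},
		{ID: "t3", NodeID: "node-b", Desired: DesiredRunning},
	}
	for _, t := range tasksForNode(all, "node-a") {
		fmt.Printf("%s: %s\n", t.ID, agentAction(t))
	}
}
```

If `tasksForNode` filtered out `t2`, the agent would never learn that the task should stop, which is the failure mode described in the comment.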
09f9f94 to c61b8ed
c61b8ed to 1df0bf7
|
Rebased on #2574 and added a test case. cc @cyli @anshulpundir @dperny |
| Slot: 3, | ||
| DesiredState: api.TaskStateRemove, | ||
| Status: api.TaskStatus{ | ||
| State: api.TaskStatePending, |
There was a problem hiding this comment.
Please also add a test for TaskStateNew
There was a problem hiding this comment.
Will do. Regarding the comments below, I'll also rebase and add some more tests for the case of a running task reaper.
|
LGTM after updating the tests for NEW state. |
cyli
left a comment
There was a problem hiding this comment.
LGTM, but can we add a test for task reaper reaping REMOVEd tasks that are unassigned while the task reaper is running? (e.g. not on task reaper init).
|
Nishant mentioned this yesterday. I think there are other tests in this file that don't cover both scenarios.
|
|
@anshulpundir I'm wondering if it'd be worth creating an exhaustive list of tasks in every combination of desired state + current state, and making sure the correct ones are reaped on start, and doing the same to create/update. |
|
@cyli exhaustive list can't hurt. It'll help systematize task reaper tests, and possibly catch bugs faster. I can take on that task in a separate PR. |
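One possible shape for such an exhaustive list, as a sketch only. The state names below are local stand-ins (an assumption, not the swarmkit `api` constants), and `reapOnEvent` mirrors only the condition quoted in this PR, not the full task reaper logic.

```go
package main

import "fmt"

// TaskState is a hypothetical stand-in for swarmkit's api.TaskState.
type TaskState int

const (
	New TaskState = iota
	Pending
	Assigned
	Running
	Completed
	Shutdown
	Remove
)

var allStates = []TaskState{New, Pending, Assigned, Running, Completed, Shutdown, Remove}

// combo is one (desired state, current state) pair.
type combo struct {
	Desired TaskState
	Current TaskState
}

// allCombos enumerates every pairing of desired and current state.
func allCombos() []combo {
	var out []combo
	for _, d := range allStates {
		for _, c := range allStates {
			out = append(out, combo{Desired: d, Current: c})
		}
	}
	return out
}

// reapOnEvent mirrors the condition from the diff: desired state REMOVE,
// and the task is either not yet assigned or already at or past COMPLETE.
func reapOnEvent(c combo) bool {
	return c.Desired == Remove && (c.Current < Assigned || c.Current >= Completed)
}

func main() {
	reaped := 0
	for _, c := range allCombos() {
		if reapOnEvent(c) {
			reaped++
		}
	}
	fmt.Printf("%d of %d combinations reaped\n", reaped, len(allCombos()))
}
```

A table-driven test could iterate over `allCombos()` and compare the reaper's behavior against a predicate like this one, once for tasks present at reaper start and once for create/update events.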
…assigned Signed-off-by: Nishant Totla <nishanttotla@gmail.com>
1df0bf7 to 7f63825
Relevant changes:
- moby/swarmkit#2551 RoleManager will remove deleted nodes from the cluster membership
- moby/swarmkit#2574 Scheduler/TaskReaper: handle unassigned tasks marked for shutdown
- moby/swarmkit#2561 Avoid predefined error log
- moby/swarmkit#2557 Task reaper should delete tasks with removed slots that were not yet assigned
- moby/swarmkit#2587 [fips] Agent reports FIPS status
- moby/swarmkit#2603 Fix manager/state/store.timedMutex
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

Relevant changes:
- moby/swarmkit#2551 RoleManager will remove deleted nodes from the cluster membership
- moby/swarmkit#2574 Scheduler/TaskReaper: handle unassigned tasks marked for shutdown
- moby/swarmkit#2561 Avoid predefined error log
- moby/swarmkit#2557 Task reaper should delete tasks with removed slots that were not yet assigned
- moby/swarmkit#2587 [fips] Agent reports FIPS status
- moby/swarmkit#2603 Fix manager/state/store.timedMutex
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: 333b2f28fef4ba857905e7263e7b9bbbf7c522fc
Component: engine

Relevant changes:
- moby/swarmkit#2551 RoleManager will remove deleted nodes from the cluster membership
- moby/swarmkit#2574 Scheduler/TaskReaper: handle unassigned tasks marked for shutdown
- moby/swarmkit#2561 Avoid predefined error log
- moby/swarmkit#2557 Task reaper should delete tasks with removed slots that were not yet assigned
- moby/swarmkit#2587 [fips] Agent reports FIPS status
- moby/swarmkit#2603 Fix manager/state/store.timedMutex
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
(cherry picked from commit 333b2f28fef4ba857905e7263e7b9bbbf7c522fc)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This PR addresses #2555.
cc @docker/core-swarmkit-maintainers
Signed-off-by: Nishant Totla <nishanttotla@gmail.com>