Better handling of Supervisor spec update #4812

@pjain1

Description

If a modified supervisor spec is posted to the overlord for a supervisor that already exists and is currently running, the overlord posts a GracefulShutdownNotice to the running supervisor and starts a new one with the updated spec. Ideally, when the new supervisor starts running, it should start new tasks at the offsets where the old tasks are expected to finish (being optimistic). However, I have observed that it currently tries to signal the old tasks to stop without publishing and starts new tasks at the same start offsets as the old ones.
As far as I can tell, this happens because when the new supervisor runs its discoverTasks method, it discovers the old tasks, but they may not yet be in the PUBLISHING state, since tasks take some time to persist pending data before switching to publishing. The supervisor therefore compares each task's spec with the current spec, concludes that the tasks are not current, and tries to stop them. Ideally, the tasks should be placed in pendingCompletionTaskGroups so that the supervisor waits for them to finish and starts the new tasks from where the old tasks finish.
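
To make the difference between the observed and the desired behaviour concrete, here is a minimal sketch in Python. It is a simplified model, not Druid's actual code: `Task`, `Supervisor`, `spec_version`, `end_offset`, and the `wait_for_stale_tasks` flag are all hypothetical names invented for illustration, a single partition is assumed, and the old task's expected end offset is assumed to be known (the "optimistic" case described above).

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskStatus(Enum):
    READING = "READING"
    PUBLISHING = "PUBLISHING"


@dataclass
class Task:
    task_id: str
    spec_version: str      # hypothetical stand-in for "task spec matches supervisor spec"
    status: TaskStatus
    start_offset: int
    end_offset: int        # assumed: offset at which the task is expected to finish


@dataclass
class Supervisor:
    spec_version: str
    pending_completion: list = field(default_factory=list)
    stopped_without_publish: list = field(default_factory=list)

    def discover_tasks(self, tasks, wait_for_stale_tasks):
        """Classify discovered tasks and return start offsets for replacement tasks."""
        new_start_offsets = []
        for task in tasks:
            is_stale = task.spec_version != self.spec_version
            if task.status is TaskStatus.PUBLISHING or (is_stale and wait_for_stale_tasks):
                # Wait for the task to finish; the replacement continues from its end.
                self.pending_completion.append(task.task_id)
                new_start_offsets.append(task.end_offset)
            elif is_stale:
                # Observed behaviour: the task is stopped without publishing,
                # so the replacement re-reads from the same start offset.
                self.stopped_without_publish.append(task.task_id)
                new_start_offsets.append(task.start_offset)
            # Otherwise the spec matches and the task is still reading:
            # it is kept as an active task and needs no replacement.
        return new_start_offsets


# An old task that received the shutdown notice but has not reached PUBLISHING yet.
old_task = Task("old-task", "v1", TaskStatus.READING, start_offset=100, end_offset=200)

observed = Supervisor(spec_version="v2")
observed_offsets = observed.discover_tasks([old_task], wait_for_stale_tasks=False)

desired = Supervisor(spec_version="v2")
desired_offsets = desired.discover_tasks([old_task], wait_for_stale_tasks=True)
```

In this model the observed path restarts from offset 100 and discards the old task's work, while the desired path parks the old task in the pending-completion list and starts the replacement at offset 200.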
