Description
There are many scenarios in which a user may clear running TaskInstances. The current behavior is to set the TaskInstance state to SHUTDOWN and then send SIGTERM to the task, which causes it to fail and its on_failure_callback to be called. This can be noisy. It usually makes more sense to silently clear and terminate the running instance, then retry. Here are some example scenarios:
- A long-running task needs to pick up a code change. The user makes the change and clears the task. The user probably doesn't want on_failure_callback to be called when they clear it. They just want the task to be restarted gracefully with the code change.
- A task failed for some external reason. The user fixed the underlying issue and cleared the failed task to retry it. Soon after clearing it, they realize the fix is not good enough and introduce another fix, then clear the task again while it is still running. This kills the task and makes it fail. A better behavior would be to silently kill the task and retry gracefully.
The point I'm making is that clearing a running TaskInstance is different from marking a running TaskInstance failed. At the moment, both operations do the same thing: the TaskInstance is first set to SHUTDOWN and then FAILED.
One suggestion is to introduce a new State called CLEARED_WHEN_RUNNING. As the name suggests, a TaskInstance should be set to this state when it's cleared while running. Most places can handle this state the same way SHUTDOWN is handled. The exception is TaskInstance.is_eligible_to_retry, where it should always be treated as eligible for retry.
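To make the suggestion concrete, here is a rough, self-contained sketch of the idea. It is not Airflow's actual code: `TaskInstanceSketch`, its fields, and `handle_termination` are simplified stand-ins I made up for illustration, and `CLEARED_WHEN_RUNNING` is the proposed state, which does not exist today. The retry check for the other states is also an assumption about how the comparison might look.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class State(str, Enum):
    RUNNING = "running"
    SHUTDOWN = "shutdown"
    CLEARED_WHEN_RUNNING = "cleared_when_running"  # proposed state (hypothetical)
    UP_FOR_RETRY = "up_for_retry"
    FAILED = "failed"


@dataclass
class TaskInstanceSketch:
    """Simplified stand-in for a task instance, for illustration only."""

    state: State
    try_number: int = 1
    max_tries: int = 0
    on_failure_callback: Optional[Callable[[], None]] = None

    def is_eligible_to_retry(self) -> bool:
        # Proposed change: a task cleared while running is always retried,
        # regardless of how many tries it has left.
        if self.state == State.CLEARED_WHEN_RUNNING:
            return True
        # Assumed check for every other state: retry only while tries remain.
        return self.try_number <= self.max_tries

    def handle_termination(self) -> None:
        """Hypothetical handler run after the task process is terminated."""
        if self.is_eligible_to_retry():
            # Silent path: reschedule without calling on_failure_callback.
            self.state = State.UP_FOR_RETRY
        else:
            self.state = State.FAILED
            if self.on_failure_callback is not None:
                self.on_failure_callback()
```

With this shape, a task in CLEARED_WHEN_RUNNING ends up in UP_FOR_RETRY without firing on_failure_callback, while a task in SHUTDOWN with no retries left still ends up FAILED and fires the callback as it does today.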
Are you willing to submit a PR?
Yes!