-
-
Notifications
You must be signed in to change notification settings - Fork 748
Send active task durations from worker to scheduler #4192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| active_durations={ | ||
| key: now - self.tasks[key].start_time | ||
| for key in self.active_threads.values() | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should reclaim the term executing from above? That seems to be derivative state from this
|
Another situation where this might be useful is if the submitted tasks are very long running but the cluster hasn't seen them before. In the current situation this forces users to configure some estimation of a tasks runtime in the configuration to get reasonable responsive scaling behaviour using the |
Agreed. I think/hope that this should remove most of the need for the config values in the future. |
|
That's a great point @fjetter, I totally agree with you. I suspect there are a few situations where we can use this duration information to make more informed scheduling decisions. This PR is just a first step to get active task durations to the scheduler |
|
Not sure why |
This PR updates the worker's heartbeat to the scheduler to include the duration of actively running tasks. This information can help us make more informed scheduling decisions when there's a discrepancy between the expected task duration (stored on the corresponding
TaskPrefix.duration_average) and the actual duration of a running task. For example, this situation can arise in work stealing scenarios when a task takes much longer than expected based on the average task duration. Additionally, sending this information to the scheduler via worker heartbeat helps avoid significant message-per-task overhead.cc @mrocklin @sheer-coiled for thoughts