-
-
Notifications
You must be signed in to change notification settings - Fork 748
Description
At the moment, we use loop.call_later and loop.call_later in several places to fire-and-forget schedule background tasks. Since we do not keep track of them, they are never cleaned up until they are done. See the snippet for an example:
distributed/distributed/scheduler.py
Lines 4287 to 4295 in 50d2911
| def remove_worker_from_events(): | |
| # If the worker isn't registered anymore after the delay, remove from events | |
| if address not in self.workers and address in self.events: | |
| del self.events[address] | |
| cleanup_delay = parse_timedelta( | |
| dask.config.get("distributed.scheduler.events-cleanup-delay") | |
| ) | |
| self.loop.call_later(cleanup_delay, remove_worker_from_events) |
One issue this causes is the prevention of garbage collection in tests (see #6353). To ensure that we clean up all scheduled tasks, we to implement functionality for managing background tasks and ensure proper cleanup on pending tasks. This should be implemented on the Server to make it available to all subclasses.
To advance our efforts to get rid of tornado, we want to implement this functionality using asyncio and remove the calls to loop.call_later and loop.add_callback.
Sketch of a possible implementation in Server:
class Server(...):
...
def __init__(...):
...
self._background_tasks: set[asyncio.Task]
...
def add_background_task(self, coro):
task = create_task(coro())
self._background_tasks.add(task)
task.add_done_callback(lambda _: self._background_tasks.remove(task))
...
async def close(self, ...):
for ts in self._background_tasks:
ts.cancel()
await asyncio.gather(*self._background_tasks)cc @fjetter, @graingert