Replace loop.call_later and loop.add_callback with background tasks #6359

@hendrikmakait

At the moment, we use loop.call_later and loop.add_callback in several places to schedule fire-and-forget background tasks. Since we do not keep track of them, they are not cleaned up until they are done. See the snippet for an example:

def remove_worker_from_events():
    # If the worker isn't registered anymore after the delay, remove from events
    if address not in self.workers and address in self.events:
        del self.events[address]

cleanup_delay = parse_timedelta(
    dask.config.get("distributed.scheduler.events-cleanup-delay")
)
self.loop.call_later(cleanup_delay, remove_worker_from_events)

One issue this causes is that it prevents garbage collection in tests (see #6353). To ensure that we clean up all scheduled tasks, we need to implement functionality for managing background tasks and ensure proper cleanup of pending tasks. This should be implemented on the Server to make it available to all subclasses.

To advance our efforts to get rid of tornado, we want to implement this functionality using asyncio and remove the calls to loop.call_later and loop.add_callback.
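As a rough illustration of what replacing a delayed callback with plain asyncio could look like, here is a minimal sketch. The helper name `call_after`, the worker address, and the short delay are assumptions made for the example and are not part of distributed; the point is only that the delayed call becomes a task we hold a reference to, so it can be awaited or cancelled.

```python
import asyncio


async def call_after(delay, callback, *args):
    """Hypothetical helper: sleep for `delay` seconds, then call `callback`."""
    await asyncio.sleep(delay)
    callback(*args)


async def main():
    # Toy stand-ins for the scheduler state used in the snippet above.
    events = {"tcp://worker-1": ["add-worker"]}
    workers = {}  # the worker is no longer registered

    def remove_worker_from_events(address):
        # If the worker isn't registered anymore after the delay, remove from events
        if address not in workers and address in events:
            del events[address]

    # Unlike loop.call_later, this returns a task we can track, await, or cancel.
    task = asyncio.create_task(
        call_after(0.01, remove_worker_from_events, "tcp://worker-1")
    )
    await task
    return events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Cancelling the task before the delay elapses means the callback never runs, which is the behaviour we would want on shutdown.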

Sketch of a possible implementation in Server:

class Server(...):
    ...
    def __init__(...):
        ...
        self._background_tasks: set[asyncio.Task] = set()
        ...

    def add_background_task(self, coro):
        # coro is a coroutine function; the set keeps a strong reference to the task
        task = asyncio.create_task(coro())
        self._background_tasks.add(task)
        task.add_done_callback(self._background_tasks.discard)
        ...

    async def close(self, ...):
        for ts in self._background_tasks:
            ts.cancel()
        # return_exceptions=True so CancelledError from cancelled tasks is not raised here
        await asyncio.gather(*self._background_tasks, return_exceptions=True)
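To make the idea easier to try out, here is a self-contained, runnable variant of the sketch. The class name `BackgroundTaskOwner` and the `demo` coroutine are hypothetical stand-ins and not the distributed API; the example assumes that `close` should cancel whatever is still pending and swallow the resulting `CancelledError`.

```python
import asyncio


class BackgroundTaskOwner:
    """Hypothetical stand-in for Server; not the distributed API."""

    def __init__(self):
        self._background_tasks: set[asyncio.Task] = set()

    def add_background_task(self, coro_func, *args):
        # Must be called while an event loop is running.
        task = asyncio.create_task(coro_func(*args))
        self._background_tasks.add(task)
        # discard() drops the strong reference once the task has finished.
        task.add_done_callback(self._background_tasks.discard)
        return task

    async def close(self):
        # Take a snapshot: done callbacks mutate the set while we await.
        tasks = list(self._background_tasks)
        for task in tasks:
            task.cancel()
        # return_exceptions=True collects CancelledError instead of raising it.
        await asyncio.gather(*tasks, return_exceptions=True)


async def demo():
    owner = BackgroundTaskOwner()
    finished = []

    async def work(name, delay):
        await asyncio.sleep(delay)
        finished.append(name)

    owner.add_background_task(work, "fast", 0)
    owner.add_background_task(work, "slow", 60)
    await asyncio.sleep(0.05)  # give the fast task time to complete
    await owner.close()  # cancels the slow task
    return finished, len(owner._background_tasks)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Using `discard` rather than `remove` in the done callback avoids a `KeyError` if the task has already been dropped from the set by other means.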

cc @fjetter, @graingert
