Graceful handling of sockets (or whatever) getting closed while in use #36

@njsmith

Suppose we have one task happily doing its thing:

async def task1(sock):
    await sock.sendall(b"...")

and simultaneously, another task is a jerk:

async def task2(sock):
    sock.close()

It would be nice to handle this gracefully.

How graceful can we get? There is a limit here, which is that (a) it's Python so we can't actually stop people from closing things if they insist, and (b) the OS APIs we depend on don't necessarily handle this in a helpful way. Specifically I believe that for epoll and kqueue, if a file descriptor that they're watching gets closed, they just silently stop watching it, which in the situation above would mean task1 blocks forever or until cancelled. (Windows select -- or at least the select.select wrapper on Windows -- seems to return immediately with the given socket object marked as readable.)
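For anyone who wants to check the epoll half of that claim, here is a small sketch using only the stdlib. The helper name `events_before_and_after_close` is made up for illustration, and it assumes nothing else holds a duplicate of the descriptor (epoll only drops the registration once the last reference to the underlying file is gone). It returns None on platforms without epoll.

```python
import select
import socket


def events_before_and_after_close():
    """Poll a registered socket before and after closing it (Linux epoll only)."""
    if not hasattr(select, "epoll"):
        return None  # not Linux: nothing to demonstrate
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        # A fresh socket is immediately writable, so EPOLLOUT fires right away.
        ep.register(a.fileno(), select.EPOLLOUT)
        before = ep.poll(0)
        # Close the socket without telling epoll about it.
        a.close()
        # No error and no event: the registration has silently disappeared.
        after = ep.poll(0.1)
        return before, after
    finally:
        ep.close()
        b.close()


result = events_before_and_after_close()
```

A task waiting on that registration would never be woken up, which matches the "blocks forever or until cancelled" scenario.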

As an extra complication, there are really two cases here: the one where the object gets closed just before we hand it to the IO layer, and the one where it gets closed while in possession of the IO layer.

And for sockets, one more wrinkle: when a stdlib socket.socket object is closed, then its fileno() starts returning -1. This is actually kinda convenient, because at least we can't accidentally pass in a valid fd/handle that has since been assigned to a different object.
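The fileno() behaviour is easy to see with a stdlib socket pair (the variable names are just for illustration):

```python
import socket

a, b = socket.socketpair()
fd_before = a.fileno()  # a real, non-negative descriptor while the socket is open
a.close()
fd_after = a.fileno()   # -1 once the socket object has been closed
b.close()
```

So a check like `sock.fileno() < 0` is enough to detect the "closed just before we hand it to the IO layer" case for socket objects, though not for raw integer fds.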

Some things we could do:

  • In our close methods, first check with the IOManager whether the object is in use, and if so cancel those uses first. (On Windows we can't necessarily cancel immediately, but I guess that's OK b/c on Windows it looks like closing the handle will essentially trigger a cancellation already; it's the other platforms where we have to emulate this.)

  • In IOManager methods that take an object-with-fileno()-or-fd-or-handle, make sure to validate the fd/handle while still in the caller's context. I think on epoll/kqueue we're OK right now because the wait_* methods immediately register the fd, and on Windows the register_for_iocp method is similar. But for Windows select, the socket could be invalid and we won't notice until it gets selected on in the select thread. Or it could become invalid on its way to the select thread, or in between calls to select... right now I think this will just cause the select loop to blow up.
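A rough sketch of how the two ideas could fit together. Everything here is hypothetical: `ToyIOManager`, `ClosedResourceError`, `notify_closing` and `close_gracefully` are invented names, the "waiters" are plain callbacks rather than real tasks, and no actual epoll/kqueue/IOCP registration happens. It only shows the bookkeeping: validate in the caller's context, and wake waiters before the fd goes away.

```python
import socket


class ClosedResourceError(Exception):
    """Hypothetical error delivered to waiters whose socket was closed under them."""


class ToyIOManager:
    """Toy stand-in for an IO manager; tracks waiters per fd and nothing else."""

    def __init__(self):
        self._waiters = {}  # fd -> list of callbacks to invoke on close

    def wait_readable(self, sock, on_cancel):
        # Validate while still in the caller's context, so a closed socket
        # fails here instead of blowing up later inside the IO loop.
        fd = sock.fileno()
        if fd < 0:
            raise ValueError("socket is already closed")
        self._waiters.setdefault(fd, []).append(on_cancel)
        return fd

    def notify_closing(self, fd):
        # Wake every waiter on this fd with an error before the fd goes away.
        waiters = self._waiters.pop(fd, [])
        for on_cancel in waiters:
            on_cancel(ClosedResourceError("socket closed while in use"))
        return len(waiters)


def close_gracefully(manager, sock):
    """What a close method might do: cancel outstanding uses, then close."""
    fd = sock.fileno()
    if fd >= 0:
        manager.notify_closing(fd)
    sock.close()


manager = ToyIOManager()
a, b = socket.socketpair()
errors = []
manager.wait_readable(a, errors.append)  # "task1" starts waiting
close_gracefully(manager, a)             # "task2" closes the socket

try:
    manager.wait_readable(a, errors.append)
    rejected = False
except ValueError:
    rejected = True  # the closed socket is caught in the caller's context
b.close()
```

This does nothing for someone who calls os.close() on the raw fd behind our back, which is limit (a) above; it only covers closes that go through our own close methods.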
