
[discussion] Mechanism for "sharing" tasks between different branches of the task tree #266

@njsmith


Use case 1: there's an expensive idempotent operation that multiple tasks might want to call. Examples within trio itself would include getaddrinfo for a particular host, or WaitForSingleObject on a single object (as needed for e.g. the Windows version of Popen.wait). If two tasks make the same call at the same time, then we'd like to coalesce those into a single underlying call and then report the result back to both callers.
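For concreteness, the coalescing in use case 1 can be sketched as a "single-flight" helper. This is a hypothetical illustration, not anything in trio: `SingleFlight` and its method names are invented, and asyncio stands in for the event loop since the trio-side API is exactly what's up for discussion here.

```python
import asyncio

class SingleFlight:
    """Coalesce concurrent calls that share a key into one underlying call.

    Hypothetical sketch -- names invented for illustration, not trio API.
    """
    def __init__(self):
        self._pending = {}  # key -> task running the real call

    async def run(self, key, afn, *args):
        if key not in self._pending:
            # First caller with this key: start the real call exactly once.
            task = asyncio.ensure_future(afn(*args))
            task.add_done_callback(lambda _t: self._pending.pop(key, None))
            self._pending[key] = task
        # All callers (including the first) await the same task; shield()
        # keeps one waiter's cancellation from killing the shared call.
        return await asyncio.shield(self._pending[key])

async def demo():
    calls = 0

    async def fake_getaddrinfo(host):
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # pretend to do a DNS lookup
        return [(host, "93.184.216.34")]

    sf = SingleFlight()
    # Five tasks "call getaddrinfo" concurrently for the same host...
    results = await asyncio.gather(
        *(sf.run("example.com", fake_getaddrinfo, "example.com")
          for _ in range(5)))
    return calls, results

n_calls, results = asyncio.run(demo())
print(n_calls)  # ...but only one underlying call actually ran
```

The `asyncio.shield` there is doing the kill-safety work: a cancelled waiter just stops waiting, while the shared call keeps running for everyone else.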

Use case 2: there are multiple user-level objects that underneath are multiplexed onto a single object. (For example: multiple HTTP/2 channels on top of a single TCP connection.) Now you need the invariant: whenever at least one task is blocked in await channel.receive_some(), there needs to be a task reading from the underlying TCP connection -- but there should never be more than one task receiving from the underlying TCP connection.

In some cases the right solution to this is probably to create one background task to handle receiving from the TCP connection. For example, in HTTP/2 it's expected that you can respond to pings at any time, even if all the logical channels are paused, so HTTP/2 isn't really a good example here -- the TCP receiver needs to be running constantly for as long as the connection is open. But in other protocols you might want to only receive on the TCP channel when there are tasks waiting for data. (IIRC something like this shows up in ZeroMQ's API; or the SSH protocol has logical channels, but I don't think it has any pings.)

In this case it's possible in principle to have the first task that arrives in channel.receive_some initiate a tcp_transport.receive_some, and later tasks block waiting for it, and then if the first task gets cancelled it hands off the job of calling tcp_transport.receive_some to someone else... but this is super complicated. It would be nice if there were some standard way for all the tasks in channel.receive_some to simply "share" a single call to tcp_transport.receive_some (or more realistically: protocol_object.pump_underlying_transport).

The interesting thing about these is that they don't fit well into trio's nursery system, BUT they're still possible to do without violating the invariants that the nursery system was created to enforce -- in particular, if the shared call crashes, then we have somewhere to report that. (Though there might be a minor issue with throwing the same exception into multiple receivers and still getting a sensible traceback -- maybe Error.unwrap should copy.copy the exception object before raising it?)

I was originally thinking about something like allowing tasks to be in a special "coparented" state, where multiple nurseries supervise them at the same time. But this gets complicated quickly: you need special cases in the nursery code, and there are issues like... if this is a background task providing a service to the code in this nursery (and also other nurseries), then do we need to make sure that it stays alive as long as the non-shared tasks? So should we shield it from cancellation until after all the other tasks exit? That seems tough.

I think a better idea is: don't model it as a task, just model it as a call (and it might be combined with other identical calls). This is exactly what you want for "use case 1" anyway, not a background task. For "use case 2", there's some subtlety because when the protocol pump receives some data on channel 7, it probably shouldn't exit immediately and wake everyone up; instead it just wants to wake up the task that was waiting for data on channel 7. But this can be done by having that task spawn one task to run the pump, and a second task to wait on a Queue or whatever, and then when the second task gets woken by the pump it cancels the first task, which either leaves the shared task to keep pumping on behalf of other tasks, or else if this was the only task waiting on a channel then it cancels the pump task. (So obviously the pump task needs to be cancel-safe, but that's probably needed anyway.)
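The "pump runs exactly while someone is waiting" part of this can be sketched with a reference-counted guard: the first waiter starts the pump, the last one out cancels it. Again this is a made-up illustration (`SharedPump` is not a real API anywhere), with asyncio standing in for trio:

```python
import asyncio

class SharedPump:
    """Keep exactly one pump task alive while at least one receiver waits.

    Hypothetical sketch: reference-counts waiters; when the last waiter
    leaves, the shared pump call is cancelled.
    """
    def __init__(self, pump_fn):
        self._pump_fn = pump_fn
        self._waiters = 0
        self._task = None

    async def __aenter__(self):
        self._waiters += 1
        if self._task is None:  # first waiter starts the pump
            self._task = asyncio.ensure_future(self._pump_fn())

    async def __aexit__(self, *exc):
        self._waiters -= 1
        if self._waiters == 0:  # last waiter out cancels the pump
            task, self._task = self._task, None
            task.cancel()
            try:
                await task
            except asyncio.CancelledError:
                pass

async def demo():
    events = []
    chan = asyncio.Queue()

    async def pump():
        events.append("pump started")
        try:
            i = 0
            while True:
                await asyncio.sleep(0.001)
                await chan.put(i)  # pretend: read TCP, demux to channels
                i += 1
        except asyncio.CancelledError:
            events.append("pump stopped")
            raise

    guard = SharedPump(pump)

    async def receive():
        async with guard:  # the pump runs only while we're inside
            return await chan.get()

    got = await asyncio.gather(receive(), receive())
    return events, sorted(got)

events, got = asyncio.run(demo())
print(events, got)
```

So the pump must be cancel-safe, exactly as noted above -- here cancellation just lands at whatever await it's blocked in.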

Maybe:

```python
call = CoalescableCall(afn, *args)
await call.run()
```

Attributes: call.started, call.finished, call.cancelled (I guess we want to not hold onto the result, and make run an error if finished or cancelled?). And we punt to the user to figure out how they want to pass these things between tasks, what key to use to look it up, etc.
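A rough sketch of how those attributes and the run-after-completion error might fit together -- asyncio standing in for trio internals, and every name here is either the proposal above or invented:

```python
import asyncio

class CoalescableCall:
    """Sketch of the API floated above (asyncio standing in for trio).

    Doesn't hold onto the result after completion: calling run() again
    once the call has finished or been cancelled is an error.
    """
    def __init__(self, afn, *args):
        self._afn, self._args = afn, args
        self._task = None
        self.started = False
        self.finished = False
        self.cancelled = False

    async def run(self):
        if self.finished or self.cancelled:
            raise RuntimeError("call already completed")
        if not self.started:  # first caller starts the underlying call
            self.started = True
            self._task = asyncio.ensure_future(self._afn(*self._args))
        try:
            # shield: one caller's cancellation doesn't kill the shared call
            result = await asyncio.shield(self._task)
        except asyncio.CancelledError:
            if self._task.cancelled():
                self.cancelled = True
            raise
        self.finished = True
        return result

async def demo():
    async def work():
        await asyncio.sleep(0.001)
        return 42

    call = CoalescableCall(work)
    a, b = await asyncio.gather(call.run(), call.run())  # coalesced
    try:
        await call.run()  # re-running a finished call is an error
        err = None
    except RuntimeError as e:
        err = str(e)
    return a, b, call.started, call.finished, err

a, b, started, finished, err = asyncio.run(demo())
print(a, b, started, finished, err)
```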

CoalescableCall is a terrible name. CombinedRun?

Bonus: doing it this way should actually be pretty straightforward using spawn_system_task. (It's sort of like run_in_trio_thread.)

Prior art / inspiration: Flatt & Findler (2004), Kill-safe synchronization abstractions. They're worried about cases like a queue that has a worker task attached to it, and then is accessed from multiple tasks; if a task that's doing queue.put gets cancelled, you don't want this to kill the worker thread or allow the queue object to be gc'ed, because it might leave the object in an inconsistent state. OTOH if both the queue.put and the queue.get tasks get cancelled, then it's ok (in fact mandatory) to kill the worker thread and gc the queue object. So they define a kind of co-parenting primitive.
