This issue is one of the main goals of the 0.5.0 release cycle. It is blocking stable event-based I/O, which is blocking networking in the standard library, which is blocking the Zig Package Manager (#943).
Note: Zig's coroutines are and will continue to be "stackless" coroutines. They are nothing more than a language-supported way to do Continuation Passing Style.
Background:
Status quo coroutines have some crippling problems:
- Because Zig uses LLVM's coroutine support, it has to follow the paradigm LLVM sets, which is that the coroutine allocates its own memory and destroys itself. In Zig this means that calling a coroutine can fail, even though that doesn't have to be possible. It also causes issues such as #1194 (provide guarantees about whether memory goes in the coroutine frame or stack frame).
- It also means that every coroutine has to be generic and accept a hidden allocator parameter. This makes it difficult to take advantage of coroutines to solve safe recursion (#1006).
- One of the slowest things LLVM does in debug builds, discovered with `-ftime-report`, is the coroutine splitting pass. Putting the pass in Zig frontend code will speed it up. I examined the implementation of LLVM's coroutine support, and it does not appear to provide any advantages over doing the splitting in the frontend.
- Optimizations are completely disabled due to LLVM bugs with coroutines (#802, enable optimizations for coroutines).
- Other related issues:
  - #1363 (coroutine semantics are unsound. proposal to fix them)
  - #1260 (syntax for guaranteed coroutine frame allocation elision)
  - #1197 (ability to "bundle" more errors before the first suspend point in an async function)
  - #1163 (safety assertion for resuming a coroutine which is awaiting)
  - #1870 (fs tests hangs on FreeBSD)
The Plan
Step 1. Result Location Mechanism
Implement the "result location" mechanism from the Copy Elision proposal (#287). This makes it so that every expression has a "result location" where the result will go. Consider this example:

    test "example result location" {
        var w = hi();
    }
    const Point = struct {
        x: f32,
        y: f32,
    };
    fn hi() Point {
        return Point{
            .x = 1,
            .y = 2,
        };
    }

What actually ends up happening is that `hi` gets a secret pointer parameter which is the address of `w` and initializes it directly.
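To make the hidden pointer concrete, here is a minimal sketch that spells it out by hand. `hiExplicit` is a hypothetical name, not part of the proposal, and the sketch assumes a recent Zig release in which the `std.testing` helpers return errors. It only models the observable effect; the code the compiler actually generates may differ.

```zig
const std = @import("std");

const Point = struct {
    x: f32,
    y: f32,
};

// Hypothetical hand-written counterpart of `hi`: the result pointer that the
// compiler would pass implicitly is an ordinary parameter here.
fn hiExplicit(result: *Point) void {
    result.* = Point{
        .x = 1,
        .y = 2,
    };
}

test "explicit result pointer" {
    var w: Point = undefined;
    hiExplicit(&w); // plays the role of `var w = hi();`
    try std.testing.expectEqual(@as(f32, 1), w.x);
    try std.testing.expectEqual(@as(f32, 2), w.y);
}
```

Saved to a file, this can be checked with `zig test`.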
Step 2. New Calling Syntax
Next, instead of coroutines being generic across an allocator parameter, they will use the result location as the coroutine frame. So the callsite can choose where the memory for the coroutine frame goes.

    pub fn main() void {
        var x = myAsyncFunction();
    }

In the above example, the coroutine frame goes into the variable `x`, which in this example is in the stack frame (or coroutine frame) of `main`.
The type of x is @Frame(myAsyncFunction). Every function will have a unique type associated with its coroutine frame. This means the memory can be manually managed, for example it can be put into the heap like this:

    const ptr = try allocator.create(@Frame(myAsyncFunction));
    ptr.* = myAsyncFunction();

`@Frame` could also be used to put a coroutine frame into a struct, or in static memory. Because every function has its own frame type, it also means that, for now, it won't be possible to call a function that isn't comptime-known. For example, function pointers don't work unless the function pointer parameter is `comptime`.
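As a rough illustration of what "the caller chooses where the frame lives" means for a stackless coroutine, the sketch below writes a frame by hand instead of using `@Frame`. `CounterFrame` and `step` are hypothetical names invented for this example, and the layout is an assumption, not what the compiler would generate. It assumes a recent Zig release.

```zig
const std = @import("std");

// Hypothetical stand-in for a compiler-generated coroutine frame: the resume
// point and all state that lives across suspend points are plain fields.
const CounterFrame = struct {
    state: enum { start, suspended, done } = .start,
    total: u32 = 0,

    // Runs the function body from the current resume point to the next
    // suspend point (or to completion).
    fn step(self: *CounterFrame) void {
        switch (self.state) {
            .start => {
                self.total += 1;
                self.state = .suspended; // first suspend point
            },
            .suspended => {
                self.total += 10;
                self.state = .done; // body finished
            },
            .done => unreachable, // resuming a finished frame is a bug
        }
    }
};

test "frame in the caller's stack frame" {
    var frame = CounterFrame{};
    frame.step();
    frame.step();
    try std.testing.expectEqual(@as(u32, 11), frame.total);
}

test "frame on the heap" {
    const allocator = std.testing.allocator;
    const ptr = try allocator.create(CounterFrame);
    defer allocator.destroy(ptr);
    ptr.* = CounterFrame{};
    ptr.step();
    ptr.step();
    try std.testing.expectEqual(@as(u32, 11), ptr.total);
}
```

In both tests the code that drives the frame is identical; only the storage differs, and only the heap variant can fail with an allocation error.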
The @Frame type will also represent the "promise"/"future" (#858) of the return value. The await syntax on this type will suspend the current coroutine, putting a pointer to its own handle into the awaited coroutine, which will tail-call resume the awaiter when the value is ready.
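The await handshake described above can be sketched with two hand-written frames. Everything here (`Child`, `Parent`, `awaitChild`, `finish`, `resumeAfterAwait`) is hypothetical and exists only for illustration. How an already-finished child is handled is an assumption of this sketch, not something the proposal specifies. It assumes a recent Zig release.

```zig
const std = @import("std");

// Hypothetical frame of the awaited coroutine.
const Child = struct {
    awaiter: ?*Parent = null, // handle of the coroutine awaiting this one
    result: ?i32 = null, // the "promise" value, null until ready

    // Called when the child's return value is ready.
    fn finish(self: *Child, value: i32) void {
        self.result = value;
        // Resume the awaiter, if any. A compiler could emit this as a tail call.
        if (self.awaiter) |parent| parent.resumeAfterAwait(self);
    }
};

// Hypothetical frame of the awaiting coroutine.
const Parent = struct {
    doubled: ?i32 = null,

    // Models `await child`: register with the child and suspend by returning.
    fn awaitChild(self: *Parent, child: *Child) void {
        if (child.result != null) {
            // Assumption: a ready value lets the awaiter continue immediately.
            self.resumeAfterAwait(child);
        } else {
            child.awaiter = self;
        }
    }

    // The code that follows the await expression.
    fn resumeAfterAwait(self: *Parent, child: *Child) void {
        self.doubled = child.result.? * 2;
    }
};

test "awaiter is resumed when the value is ready" {
    var child = Child{};
    var parent = Parent{};
    parent.awaitChild(&child);
    try std.testing.expect(parent.doubled == null); // still suspended
    child.finish(21);
    try std.testing.expectEqual(@as(?i32, 42), parent.doubled);
}
```

Neither frame allocates: both live wherever the test placed them, which is the property the result-location design is after.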
Next Steps
From this point the current plan is to start going down the path of #1778 and try it out. There are some problems to solve.
- Does every suspend point in a coroutine represent a cancellation point? Does it cascade (if a coroutine which is awaiting another one gets canceled, does it cancel the target as well)?
- How do `defer` and `errdefer` interact with suspension and cancellation? How does resource management work if the cleanup is after a suspend/cancellation point?
- Is cancellation even a thing? Since the caller has ownership of the memory of the coroutine frame, it might not be necessary to support in the language. Coroutines could simply document whether they had to be awaited to avoid resource leaks or not.
- Do we want to try to solve generators?
This proposal for coroutines in Zig gets us closer to a final iteration of how things will be, but you can see it may require some more design iterations as we learn more through experimentation.