Description
This is a pretty popular blog post from 2015 that helps illustrate
one way of thinking about concurrency and async I/O:
What Color is Your Function?
Here's a proposal along these lines. I think we should at least consider it.
Depends on:
- result location mechanism (previously: well-defined copy-eliding semantics) #287
- the coroutine rewrite #2377
- single- vs. multi-threaded as a builtin compile-time option #1764
- remove the namespace type and make every file an empty struct #1047
- thread-local variables #924
- expose the root source file with @import("root") #2189: put the root source file, aliased as @import("root"), in all packages, so that the standard library and third-party packages can access it.
Goals:
- Ability to write code that works in an event-driven or blocking context. For example, libraries should be able to use OS features such as the file system and networking without having an opinion about whether to use blocking or event-based APIs.
- In an application that never creates an event loop, there should be no code generated to deal with event loops.
- In an application that never uses blocking I/O, there should be no runtime overhead associated with detecting the event loop, and no code generated at all to do blocking I/O.
- An OS kernel should be able to use this mechanism effectively. Potentially it would even be able to use external Zig packages that make standard library API calls, and have those directed to functions in the root source file.
- Code can express concurrency, such as two independent writes, and then wait for them both to be done; if this code is used from a blocking I/O application with --single-threaded, it is as if it were implemented fully in a blocking fashion.
In summary, writing Zig code should be maximally portable. Proper Zig libraries
could then work in constrained-memory environments, multi-threaded environments,
single-threaded environments, blocking I/O applications, evented I/O applications,
inside OS kernels, and in userland applications, and not only work correctly in
these environments, but work optimally.
Implementation:
- Drop the async notation from functions. A function is a coroutine if it has an
await or suspend in it. This is still part of the function's prototype; however,
whether it is a coroutine or not is inferred.
  - Note that this gives functions the property that, depending on comptime values,
a function could either be a coroutine or a normal function.
- A function is also a coroutine if it calls a coroutine naively, without storing the coroutine frame somewhere, because this implies an await.
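To make the comptime-dependent property concrete, here is a hedged sketch under the proposed inference rules; the function name and condition are hypothetical, and the suspend body is elided:

```zig
// Hypothetical example of the proposed inference rule: whether a given
// instantiation of maybeSuspend is a coroutine depends on a
// comptime-known condition.
fn maybeSuspend(comptime use_event_loop: bool) void {
    if (use_event_loop) {
        // The presence of suspend makes this instantiation a coroutine.
        suspend {
            // ... schedule resumption with an event loop ...
        }
    }
    // With use_event_loop == false, the branch is comptime-eliminated
    // and this instantiation is a normal (blocking) function.
}
```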
Standard library functions that perform I/O, such as std.os.File.write, have bodies that look like
this:
pub fn write(file: *File, bytes: []const u8) usize {
// Note that this is an if with comptime-known condition.
if (std.event.loop.instance) |event_loop| {
const msg = std.event.fs.Msg{
.Write = std.event.fs.Msg.Write{
.handle = file.handle,
.ptr = bytes.ptr,
.len = bytes.len,
.result = undefined,
},
};
suspend {
event_loop.queueFsWrite(@handle(), &msg);
}
return msg.Write.result;
} else {
// blocking call
return std.os.linux.write(file.handle, bytes.ptr, bytes.len);
}
}

In std/event/loop.zig:
threadlocal var per_thread_instance: ?*Loop = null;
var global_state: Loop = undefined;
const io_mode = @fieldOrDefault(@import("root"), "io_mode", IoMode.Blocking);
const default_instance: ?*Loop = switch (io_mode) {
.Blocking => null,
.Evented => &global_state,
.Mixed => per_thread_instance,
};
const instance: ?*Loop = @fieldOrDefault(@import("root"), "event_loop", default_instance);

In the root source file, pub const io_mode determines whether
the application is 100% blocking, 100% evented, or mixed (per-thread).
If nothing is specified then the application is 100% blocking.
Or an application can take even more control, and set the event loop instance directly.
This would potentially be used for OS kernels, which need an event loop specific to their
own code.
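For example, a kernel's root source file could supply its own instance directly. This is only a sketch of the proposed override mechanism; kernel_loop and its initialization are hypothetical:

```zig
// Sketch: root source file of a hypothetical OS kernel overriding the
// event loop instance.
const std = @import("std");

var kernel_loop: std.event.Loop = undefined; // initialized during boot

// Under this proposal, the standard library would pick this up via
// @fieldOrDefault(@import("root"), "event_loop", default_instance).
pub const event_loop: ?*std.event.Loop = &kernel_loop;
```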
When the IO mode is Mixed, in the standard library event loop implementation,
worker threads get a thread-local variable per_thread_instance set to the event loop
instance pointer. The "main thread" which calls run also sets this thread-local
variable to the event loop instance pointer. In this way, sections of the codebase
can be isolated from each other: threads which are not in the thread pool of the
event loop get blocking I/O (or potentially belong to a different event loop), while
threads belonging to a given event loop's thread pool find their owner instance.
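The bookkeeping described above can be sketched roughly as follows; the method names are illustrative, not an actual std.event.Loop API:

```zig
// Sketch of the Mixed-mode thread-local bookkeeping described above.
fn workerThreadMain(loop: *Loop) void {
    // Each thread in the loop's pool finds its owner via the
    // thread-local variable; threads outside the pool see null and
    // therefore get blocking I/O.
    per_thread_instance = loop;
    loop.workerRun(); // hypothetical: process queued tasks
}

pub fn run(loop: *Loop) void {
    // The "main thread" that calls run also belongs to the loop.
    per_thread_instance = loop;
    loop.dispatchUntilDone(); // hypothetical: drive the loop to completion
}
```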
Now let's look at some everyday code that wants to call write:
fn foo() void {
const rc = file.write("hello\n");
}
// assume this is in root source file
pub const io_mode = .Evented;

Because io_mode is Evented, the write function ends up calling suspend
and is therefore a coroutine. And following this, foo is also therefore a
coroutine, since it calls write.
When a coroutine calls a coroutine in this manner, it does a tail-async-await
coroutine call using its own stack.
But what if the code wants to express possible concurrency?
fn foo() void {
var future1 = async file1.write("hello1\n");
var future2 = async file2.write("hello2\n");
const rc1 = await future1;
const rc2 = await future2;
}

async is a keyword that takes any expression, which could be a block, but in
this case is a function call. The async expression itself returns a coroutine frame
type, which supports await. The inner expression result is stored in the frame,
and is the result when using await. I'll elaborate more on async blocks later.
If an application is compiled with IoMode.Blocking, then the write function is
blocking. When async is applied to an expression that is entirely blocking, the
result of the async expression is simply the result of the inner expression.
The types of future1 and future2 remain the same as before for consistency,
but in the generated code they are just the result values, and the await
expressions are no-ops. The function is essentially rewritten as:
fn foo() void {
const rc1 = file1.write("hello1\n");
const rc2 = file2.write("hello2\n");
}

This makes foo blocking as well.
What about a CPU bound task?
fn areTheyEqual() bool {
var pi_frame = async blk: {
std.event.loop.startCpuTask();
break :blk calculatePi();
};
var e_frame = async blk: {
std.event.loop.startCpuTask();
break :blk calculateE();
};
const pi = await pi_frame;
const e = await e_frame;
return pi == e;
}

Here, startCpuTask is defined as:
fn startCpuTask() void {
if (@import("builtin").is_single_threaded) {
return;
} else if (std.event.loop.instance) |event_loop| {
suspend {
event_loop.onNextTick(@handle());
}
}
}

So, if you build this function in multi-threaded mode, with io_mode != IoMode.Blocking,
the async expression suspends in the startCpuTask and then gets resumed by the event
loop on another thread. areTheyEqual becomes a coroutine. So even though
calculatePi and calculateE are blocking functions, they end up executing in different
threads.
If you build this application in --single-threaded mode, the body of startCpuTask
ends up being just return;, so it is not a coroutine. The async expressions in
areTheyEqual then contain only blocking calls, which means they turn into normal
expressions, and after the function is analyzed, it looks like this:
fn areTheyEqual() bool {
const pi = calculatePi();
const e = calculateE();
return pi == e;
}