Skip to content

Conversation

@oldnewthing
Copy link
Member

@oldnewthing oldnewthing commented Oct 14, 2021

Normally, when awaiting an IAsyncInfo, we register a completion handler and resume the coroutine from the handler. However, this results in stack build-up if the IAsyncInfo has already completed because the IAsyncInfo will call the completion handler synchronously. The stack is

  • caller -> awaiter -> await_suspend -> Completed -> Handler -> caller

We want to avoid the deep stack that leads back to the caller. This is particularly problematic when the co_await happens in a loop: Each iteration of the loop eats more stack.

One way to do this is to sniff the IAsyncInfo's Status in await_ready and return false if it is already in a terminal state. However, this adds three virtual calls to the co_await code path. (QI + Status + Release.)

Instead, we detect the situation by having the handler check whether it is being called while await_suspend is still running:

  • caller -> awaiter -> await_suspend -> Completed -> Handler

If the Handler realizes that it's inside await_suspend, it delegates responsibility to resume the coroutine to await_suspend and returns immediately. This unwinds the stack back to await_suspend:

  • caller -> awaiter -> await_suspend

The await_suspend detects that the Handler wants to resume the coroutine, so it just returns false. This unwinds the stack all the way to the caller:

  • caller

which can now resume with no stack usage.

This works great, except that there is a code generation bug in MSVC C++17 prerelease coroutines that occurs if await_suspend returns bool. (Improper promotion decision.) This bug is fixed in MSVC 16.11.

However, MSVC 16.11 prerelease coroutines are ABI-compatible with pre-16.11 prerelease coroutines, and it's possible that somebody is going to mix some pre-16.11 coroutine code with post-16.11 coroutine code. Changing the the behavior of await_suspend based on the compiler version would introduce ODR violations.

Therefore, if we are using MSVC prerelease coroutines, we cannot have await_suspend return bool, just in case this code ever links against code compiled with pre-16.11. await_suspend must return void. The best we can do is have it resume the coroutine inline.

  • caller -> awaiter -> await_suspend -> caller

This is not great, but it's the best we can do with MSVC prerelease coroutines. Hopefully, everybody will move to C++20 coroutines, either by compiling in C++20 mode or opting into C++20 style coroutines in C++17 mode via /await:strict. Or at least they will upgrade their toolchain to MSVC 16.11 or higher. Then we can finally drop support for MSVC prerelease coroutines less than 16.11.

We take this opportunity to simplify the await_suspend by consolidating the three things that the Completion handler has to manage into the single disconnect_aware_handler. (The three things are disconnection, apartment switching, and now synchronous completion.)

Note that the suspending flag must be atomic, because it's possible that the IAsyncInfo completes on another thread. The completion uses release semantics (and await_suspend uses acquire semantics) to ensure that all state is properly transferred from the completing thread to the awaiting thread. Normally, resume_apartment would establish the necessary memory barriers, but in this short-circuit, we can't use resume_apartment. (Fortunately, we don't need it because the await_suspend is already running in the correct apartment.)

… C++20 mode)

Normally, when awaiting an IAsyncInfo, we register a completion handler
and resume the coroutine from the handler. However, this results in stack
build-up if the IAsyncInfo has already completed because the IAsyncInfo will
call the completion handler synchronously. The stack is

* caller -> awaiter -> await_suspend -> Completed -> Handler -> caller

We want to avoid the deep stack that leads back to the caller.

One way to do this is to sniff the IAsyncInfo's Status in `await_ready`
and return `false` if it is already in a terminal state. However, this
adds a virtual call to the `co_await` code path.

Instead, we detect the situation by having the handler check whether
it is being called while `await_suspend` is still running:

* caller -> awaiter -> await_suspend -> Completed -> Handler

If the Handler realizes that it's inside `await_suspend`, it delegates
responsibility to resume the coroutine to `await_suspend` and returns
immediately. This unwinds the stack back to `await_suspend`:

* caller -> awaiter -> await_suspend

The `await_suspend` detects that the Handler wants to resume the
coroutine, so it just returns `false`. This unwinds the stack all
the way to the caller:

* caller

which can now resume with no stack usage.

This works great, except that there is a code generation bug
in MSVC C++17 prerelease coroutines that occurs if
`await_suspend` returns `bool`. (Improper promotion decision.)
This bug is [fixed in MSVC 16.11](https://devblogs.microsoft.com/cppblog/cpp20-coroutine-improvements-in-visual-studio-2019-version-16-11/).

However, MSVC 16.11 prerelease coroutines are ABI-compatible with
pre-16.11 prerelease coroutines, and it's possible that somebody
is going to mix some pre-16.11 coroutine code with post-16.11
coroutine code. Changing the the behavior of `await_suspend` based
on the compiler version would introduce ODR violations.

Therefore, if we are using MSVC prerelease coroutines, we cannot
have `await_suspend` return `bool`. It must return `void`.
The best we can do is have `await_suspend` resume the coroutine
inline.

* caller -> awaiter -> await_suspend -> caller

This is not great, but it's the best we can do with MSVC prerelease
coroutines. Hopefully, everybody will move to C++20 coroutines,
either by compiling in C++20 mode or asking for C++20 coroutines
in C++17 mode via `/await:strict`. Or at least they will upgrade
their toolchain to MSVC 16.11 or higher. Then we can finally
drop support for MSVC prerelease coroutines less than 16.11.

We take this opportunity to simplify the `await_suspend` by
consolidating the three things that the Completion handler has
to manage into the single `disconnect_aware_handler`.
(The three things are disconnection, apartment switching, and
now synchronous completion.)

Note that the `suspending` flag must be atomic, because it's
possible that the IAsyncInfo completes on another thread.
The completion uses release semantics (and `await_suspend`
uses acquire semantics) to ensure that all state is properly
transferred from the completing thread to the awaiting thread.
Normally, `resume_apartment` would establish the necessary memory
barriers, but in this short-circuit, we can't use `resume_apartment`.
(Fortunately, we don't need it because the `await_suspend` is already
running in the correct apartment.)
initial = approximate_stack_pointer();
co_await AlreadyCompleted();
uintptr_t consumed = initial - approximate_stack_pointer();
#ifdef _RESUMABLE_FUNCTIONS_SUPPORTED
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are both branches being tested? Perhaps we need to copy this test to the test_cpp20 project so we can test both C++17 (old) and C++20 (new) behaviors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I manually tested both ways by hacking the header to pretend we're always in C++20 mode, but we should exercise the code in C++20 mode for realsies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out that our C++20 test was broken. It was using /await which means "Give me C++17 style coroutines."

@jlaanstra
Copy link
Contributor

This is not great, but it's the best we can do with MSVC prerelease coroutines. Hopefully, everybody will move to C++20 coroutines, either by compiling in C++20 mode or opting into C++20 style coroutines in C++17 mode via /await:strict. Or at least they will upgrade their toolchain to MSVC 16.11 or higher. Then we can finally drop support for MSVC prerelease coroutines less than 16.11.

Do we want to raise a build warning when a project is targeting C++ 17 with /await instead of /await:strict?

@sylveon
Copy link
Contributor

sylveon commented Oct 15, 2021

This uncovered the fact that the C++20 tests were not actually
using C++20 coroutines (!) because we were still passing the
`/await` flag, which sends us back to C++17 coroutines.

Updated the `cppwinrt.props` so you can opt into stdcpplatest.
(If you say nothing, then you get the default, which is currently
C++17 with `/await`, but that could change as time passes.)
Copy link
Collaborator

@kennykerr kennykerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@kennykerr kennykerr merged commit 4c7e2b5 into microsoft:master Oct 21, 2021
@oldnewthing oldnewthing deleted the sync-completion branch October 21, 2021 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants