-
Notifications
You must be signed in to change notification settings - Fork 263
Awaiting an already-completed IAsyncAction does not consume stack (in C++20 mode) #1040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… C++20 mode) Normally, when awaiting an IAsyncInfo, we register a completion handler and resume the coroutine from the handler. However, this results in stack build-up if the IAsyncInfo has already completed because the IAsyncInfo will call the completion handler synchronously. The stack is * caller -> awaiter -> await_suspend -> Completed -> Handler -> caller We want to avoid the deep stack that leads back to the caller. One way to do this is to sniff the IAsyncInfo's Status in `await_ready` and return `false` if it is already in a terminal state. However, this adds a virtual call to the `co_await` code path. Instead, we detect the situation by having the handler check whether it is being called while `await_suspend` is still running: * caller -> awaiter -> await_suspend -> Completed -> Handler If the Handler realizes that it's inside `await_suspend`, it delegates responsibility to resume the coroutine to `await_suspend` and returns immediately. This unwinds the stack back to `await_suspend`: * caller -> awaiter -> await_suspend The `await_suspend` detects that the Handler wants to resume the coroutine, so it just returns `false`. This unwinds the stack all the way to the caller: * caller which can now resume with no stack usage. This works great, except that there is a code generation bug in MSVC C++17 prerelease coroutines that occurs if `await_suspend` returns `bool`. (Improper promotion decision.) This bug is [fixed in MSVC 16.11](https://devblogs.microsoft.com/cppblog/cpp20-coroutine-improvements-in-visual-studio-2019-version-16-11/). However, MSVC 16.11 prerelease coroutines are ABI-compatible with pre-16.11 prerelease coroutines, and it's possible that somebody is going to mix some pre-16.11 coroutine code with post-16.11 coroutine code. Changing the the behavior of `await_suspend` based on the compiler version would introduce ODR violations. Therefore, if we are using MSVC prerelease coroutines, we cannot have `await_suspend` return `bool`. It must return `void`. The best we can do is have `await_suspend` resume the coroutine inline. * caller -> awaiter -> await_suspend -> caller This is not great, but it's the best we can do with MSVC prerelease coroutines. Hopefully, everybody will move to C++20 coroutines, either by compiling in C++20 mode or asking for C++20 coroutines in C++17 mode via `/await:strict`. Or at least they will upgrade their toolchain to MSVC 16.11 or higher. Then we can finally drop support for MSVC prerelease coroutines less than 16.11. We take this opportunity to simplify the `await_suspend` by consolidating the three things that the Completion handler has to manage into the single `disconnect_aware_handler`. (The three things are disconnection, apartment switching, and now synchronous completion.) Note that the `suspending` flag must be atomic, because it's possible that the IAsyncInfo completes on another thread. The completion uses release semantics (and `await_suspend` uses acquire semantics) to ensure that all state is properly transferred from the completing thread to the awaiting thread. Normally, `resume_apartment` would establish the necessary memory barriers, but in this short-circuit, we can't use `resume_apartment`. (Fortunately, we don't need it because the `await_suspend` is already running in the correct apartment.)
test/test/async_completed.cpp
Outdated
| initial = approximate_stack_pointer(); | ||
| co_await AlreadyCompleted(); | ||
| uintptr_t consumed = initial - approximate_stack_pointer(); | ||
| #ifdef _RESUMABLE_FUNCTIONS_SUPPORTED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are both branches being tested? Perhaps we need to copy this test to the test_cpp20 project so we can test both C++17 (old) and C++20 (new) behaviors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I manually tested both ways by hacking the header to pretend we're always in C++20 mode, but we should exercise the code in C++20 mode for realsies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out that our C++20 test was broken. It was using /await which means "Give me C++17 style coroutines."
Do we want to raise a build warning when a project is targeting C++ 17 with /await instead of /await:strict? |
|
Then brand new projects would have that warning: https://github.com/microsoft/cppwinrt/blob/master/nuget/Microsoft.Windows.CppWinRT.targets#L882 |
This uncovered the fact that the C++20 tests were not actually using C++20 coroutines (!) because we were still passing the `/await` flag, which sends us back to C++17 coroutines. Updated the `cppwinrt.props` so you can opt into stdcpplatest. (If you say nothing, then you get the default, which is currently C++17 with `/await`, but that could change as time passes.)
kennykerr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Normally, when awaiting an IAsyncInfo, we register a completion handler and resume the coroutine from the handler. However, this results in stack build-up if the IAsyncInfo has already completed because the IAsyncInfo will call the completion handler synchronously. The stack is
We want to avoid the deep stack that leads back to the caller. This is particularly problematic when the
co_awaithappens in a loop: Each iteration of the loop eats more stack.One way to do this is to sniff the IAsyncInfo's Status in
await_readyand returnfalseif it is already in a terminal state. However, this adds three virtual calls to theco_awaitcode path. (QI + Status + Release.)Instead, we detect the situation by having the handler check whether it is being called while
await_suspendis still running:If the Handler realizes that it's inside
await_suspend, it delegates responsibility to resume the coroutine toawait_suspendand returns immediately. This unwinds the stack back toawait_suspend:The
await_suspenddetects that the Handler wants to resume the coroutine, so it just returnsfalse. This unwinds the stack all the way to the caller:which can now resume with no stack usage.
This works great, except that there is a code generation bug in MSVC C++17 prerelease coroutines that occurs if
await_suspendreturnsbool. (Improper promotion decision.) This bug is fixed in MSVC 16.11.However, MSVC 16.11 prerelease coroutines are ABI-compatible with pre-16.11 prerelease coroutines, and it's possible that somebody is going to mix some pre-16.11 coroutine code with post-16.11 coroutine code. Changing the the behavior of
await_suspendbased on the compiler version would introduce ODR violations.Therefore, if we are using MSVC prerelease coroutines, we cannot have
await_suspendreturnbool, just in case this code ever links against code compiled with pre-16.11.await_suspendmust returnvoid. The best we can do is have it resume the coroutine inline.This is not great, but it's the best we can do with MSVC prerelease coroutines. Hopefully, everybody will move to C++20 coroutines, either by compiling in C++20 mode or opting into C++20 style coroutines in C++17 mode via
/await:strict. Or at least they will upgrade their toolchain to MSVC 16.11 or higher. Then we can finally drop support for MSVC prerelease coroutines less than 16.11.We take this opportunity to simplify the
await_suspendby consolidating the three things that the Completion handler has to manage into the singledisconnect_aware_handler. (The three things are disconnection, apartment switching, and now synchronous completion.)Note that the
suspendingflag must be atomic, because it's possible that the IAsyncInfo completes on another thread. The completion uses release semantics (andawait_suspenduses acquire semantics) to ensure that all state is properly transferred from the completing thread to the awaiting thread. Normally,resume_apartmentwould establish the necessary memory barriers, but in this short-circuit, we can't useresume_apartment. (Fortunately, we don't need it because theawait_suspendis already running in the correct apartment.)