@oldnewthing

The cancellation token learned a new method enable_propagation(). It takes an optional boolean parameter (default true) which specifies whether propagation of cancellation into child coroutines is enabled. It returns the previous setting, in case you want to restore it.

If cancellation propagation is enabled in a coroutine, then calling IAsyncXxx.Cancel() on the coroutine's asynchronous activity will try to propagate the cancellation into whatever the coroutine is currently co_awaiting. Currently, the following awaitables are supported:

  • IAsyncXxx
  • resume_after()
  • resume_on_signal()

Example:

IAsyncOperation<bool> CheckAsync()
{
    // Enable cancellation propagation.
    auto cancel = co_await get_cancellation_token();
    cancel.enable_propagation();

    HttpClient client;
    auto result = co_await client.TryGetAsync(Uri(L"https://www.microsoft.com"));

    co_return result.Succeeded() && result.ResponseMessage().IsSuccessStatusCode();
}

IAsyncOperation<bool> checkOperation;

fire_and_forget StartButton_Click()
{
    checkOperation = CheckAsync();
    try
    {
        if (co_await checkOperation)
        {
            // Everything is just fine.
        }
        else
        {
            // Couldn't connect to server.
        }
    }
    catch (hresult_canceled const&)
    {
        // Operation was canceled.
    }
}

void CancelButton_Click()
{
    checkOperation.Cancel();
}

When the user clicks the Start button, we begin the CheckAsync() operation and save it. If the operation is taking too long and the user clicks the Cancel button, we cancel the operation. Normally, this would cancel the CheckAsync() operation, but that's all: the CheckAsync() operation is reported as canceled and the catch clause runs, but the TryGetAsync() call inside CheckAsync() keeps running to its normal completion.

Enabling cancellation propagation means that when the CheckAsync() call is canceled, the TryGetAsync() call is also canceled.

Cancellation propagation allows a coroutine to respond more quickly to cancellation. Without cancellation propagation, the coroutine would have to wait for the TryGetAsync() to complete naturally (probably due to network timeout) before it could clean up. In some cases, the natural completion of the awaitable could be very long (resume_after(24h)) or may never happen at all (resume_on_signal(never_signaled)), causing the coroutine to become effectively leaked, since it is waiting for something that will never happen.

Note: All names are provisional. Suggestions for better names are gratefully welcomed.

An awaiter can mark itself as supporting cancellation propagation by inheriting publicly from enable_await_cancellation. (This is analogous to enable_shared_from_this.)

If the awaiter is created by a coroutine in which cancellation propagation has been enabled, its enable_cancellation() method is called with a cancellable_promise pointer.

The cancellable_promise represents a hook into the cancellation of a coroutine, so that an awaiter can cancel the awaited-for object when the calling coroutine is cancelled.

The awaiter's enable_cancellation() calls the cancellable promise's set_canceller() method, passing a callback function pointer and a context pointer (typically the awaiter's this pointer). If the outer coroutine is cancelled, then the callback function is called with the context pointer as a parameter. The callback function should arrange for the cancellation of the awaited-on object so that the co_await can complete. It should also arrange that the await_resume() method throws the hresult_canceled exception.

The cancellation callback is called under a spinlock, so it should work quickly. If it needs to do expensive work, it should finish the work asynchronously. Note that once the cancellation callback returns, the awaiter is eligible for destruction, so make sure to extend the lifetimes of any objects you need if they would normally be destructed by the awaiter.

An example of this pattern can be found in the standard awaiter for IAsyncAction (known internally as await_adapter). Here is a simplified version:

template<typename Async>
struct await_adapter : enable_await_cancellation
{
    await_adapter(Async const& async) : async(async) { }

    Async const& async;

    void enable_cancellation(cancellable_promise* promise)
    {
        // Register a captureless lambda as the canceller; the awaiter is the context.
        promise->set_canceller([](void* context)
        {
            cancel_asynchronously(static_cast<await_adapter*>(context)->async);
        }, this);
    }

    // Static so that the captureless lambda can call it without an object.
    // Takes the async by value so that the copy outlives the awaiter.
    static fire_and_forget cancel_asynchronously(Async async)
    {
        co_await resume_background();
        async.Cancel();
    }

    ... other awaiter stuff unchanged ...
};

When cancellation is enabled, we set a lambda as our canceller, with the await_adapter itself as the context parameter. If cancellation is indeed requested, we make a copy of the async (which is a single call to AddRef, which should be fast), and pass it to cancel_asynchronously() to continue the work on a background thread. (We cleverly use coroutines to help implement coroutines.)

Cancelling on a background thread solves two problems. The first is that it allows execution on the main thread to continue, releasing the spinlock. The second is that it avoids deep recursion, because cancelling the awaited-for asynchronous activity could very well trigger another cancellation propagation if that asynchronous activity was itself waiting for something else. The cancellation propagates all the way down the call chain until it reaches a primitive like resume_on_signal(), or reaches something that does not support cancellation propagation.

We do not need to make any changes to await_resume() because the asynchronous operation will complete with the Canceled status, and the existing code already converts that to an hresult_canceled exception.

The actual implementation is slightly more complicated in that the call to async.Cancel() is itself done under a try/catch to deal with the possibility that cancellation fails, for example, because the RPC server has died. The cancellation is best effort, and in the case of RPC server death, the loss of the operation will be detected by the disconnect_aware_handler and converted into an RPC exception.

Awaiters for custom objects may also derive from enable_await_cancellation in order to participate in cancellation propagation. For example:

struct io_operation
{
    struct awaitable : enable_await_cancellation
    {
        io_operation& m_op;

        awaitable(io_operation& op) : m_op(op) { }

        void enable_cancellation(cancellable_promise* promise)
        {
            promise->set_canceller([](void* context)
            {
                auto op = static_cast<io_operation*>(context);
                CancelIoEx(op->handle, &op->overlapped);
            }, std::addressof(m_op));
        }

        ... other awaitable members ...
    };

    auto operator co_await()
    {
        return awaitable{ *this };
    }

    ... other io_operation members ...
};

The awaitable needs to be ready to cancel the operation even before await_ready() is called, or after await_resume() has been called. (In the latter case, of course, the operation has already completed, so any attempt to cancel it is moot. Nevertheless, the code needs to handle the case of a too-late cancellation.)

Implementation details

One of the main motivations behind the design is to keep the "no cancellation" code path fast. Cancellation is extremely rare, so it is better to make the "no cancellation" code path fast, even if it comes at the expense of making the cancellation code path slower or more complicated.

We use a callback function and pointer because they are lighter weight than an object with a virtual method.

The portion of the coroutine promise that is exposed to clients is the cancellable_promise. The methods are as follows:

  • set_canceller: Called by the implementation of the awaiter.
  • revoke_canceller: Called by the destructor of enable_await_cancellation to clean up when the awaiter destructs.
  • cancel: Called by the promise when the enclosing coroutine is cancelled.

The canceller function pointer can hold one of three values:

  • nullptr: No canceller has been set, or cancellation has already been attempted.
  • canceller: The canceller to call.
  • cancelling_ptr: Sentinel value that indicates that cancellation is in progress.

The set_canceller method sets the canceller pointer with release semantics so that the state of the awaiter is published to the cancelling thread.

The revoke_canceller method sets the canceller to nullptr, but spins if the previous value was cancelling_ptr, indicating that we should not destruct the awaiter because cancellation is in progress. It uses acquire memory order so that the state of the awaiter is properly received from the cancelling thread (if any) before we destruct it.

The cancel method atomically obtains the current canceller and sets the canceller to the cancelling_ptr sentinel value to indicate that cancellation is in progress. This exchange is performed with acquire semantics to ensure that the state of the awaiter is properly received from the publishing thread before we call the callback.

The cancel method then uses an RAII type to ensure that the canceller returns to nullptr when the cancel method is done. An RAII type is important here, so that we properly clean up even if the canceller callback throws an exception. If there is a canceller callback, we call it. (If the function pointer is cancelling_ptr, then it means that cancellation is already in progress, so we shouldn't try to cancel again.)

The RAII type returns the canceller to nullptr with release semantics so that the state of the awaiter is published to the awaiting thread.
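The protocol described above can be sketched in portable C++ roughly as follows. This is a minimal sketch under stated assumptions, not the code from this PR: the member names m_canceller and m_context and the sentinel value 1 are borrowed from the generated code shown further down, the guard type name cancellation_guard is made up, and revoke_canceller here uses a compare-exchange loop (a swapped-in variant of the plain exchange visible in the generated code) so that it clears the pointer only once the sentinel is gone.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

using canceller_t = void (*)(void* context);

struct cancellable_promise
{
    // Sentinel meaning "a cancellation callback is running right now".
    static canceller_t cancelling_ptr() noexcept
    {
        return reinterpret_cast<canceller_t>(std::uintptr_t{ 1 });
    }

    void set_canceller(canceller_t canceller, void* context) noexcept
    {
        assert(canceller != nullptr);
        m_context = context;
        // Release: publish the awaiter state (and m_context) to a cancelling thread.
        m_canceller.store(canceller, std::memory_order_release);
    }

    void revoke_canceller() noexcept
    {
        auto current = m_canceller.load(std::memory_order_acquire);
        for (;;)
        {
            if (current == cancelling_ptr())
            {
                // A cancellation is in progress: let it finish before the awaiter dies.
                std::this_thread::yield();
                current = m_canceller.load(std::memory_order_acquire);
            }
            else if (m_canceller.compare_exchange_weak(current, nullptr,
                std::memory_order_acquire, std::memory_order_acquire))
            {
                return;
            }
        }
    }

    void cancel()
    {
        // Acquire: receive the awaiter state published by set_canceller.
        auto canceller = m_canceller.exchange(cancelling_ptr(), std::memory_order_acquire);

        // RAII guard: return to nullptr even if the callback throws.
        struct cancellation_guard
        {
            cancellable_promise* promise;
            ~cancellation_guard()
            {
                promise->m_canceller.store(nullptr, std::memory_order_release);
            }
        } guard{ this };

        if (canceller != nullptr && canceller != cancelling_ptr())
        {
            canceller(m_context);
        }
    }

private:
    std::atomic<canceller_t> m_canceller{ nullptr };
    void* m_context{ nullptr };
};
```

In this sketch a registered callback runs at most once: a second cancel() finds nullptr, and a cancel() after revoke_canceller() also finds nullptr.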

The enable_await_cancellation class serves both as a marker class (so that we know that the awaiter supports cancellation propagation) and as the place where the bookkeeping of revoking the canceller when the awaiter destructs is handled. Detecting enable_await_cancellation is done with is_convertible_v to avoid edge cases like operator co_await returning a const object as an awaiter. (It also means that an awaiter can delegate the cancellation to a sub-object if it wants to take finer control over object layout.)
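To make the detection rule concrete, here is a small sketch that assumes the check converts an lvalue of the awaiter type to enable_await_cancellation&. The exact expression used in the PR may differ; the trait name supports_cancellation_v and the three awaiter types are hypothetical, and enable_await_cancellation is an empty stand-in.

```cpp
#include <type_traits>

struct enable_await_cancellation { };  // empty stand-in for the real class

// Hypothetical trait: true if an lvalue of the awaiter type converts to a
// non-const enable_await_cancellation reference.
template <typename Awaiter>
inline constexpr bool supports_cancellation_v =
    std::is_convertible_v<std::remove_reference_t<Awaiter>&, enable_await_cancellation&>;

struct plain_awaiter { };

struct cancellable_awaiter : enable_await_cancellation { };

// Delegates to a sub-object through a conversion operator instead of inheriting.
struct delegating_awaiter
{
    enable_await_cancellation m_cancellation;
    operator enable_await_cancellation&() noexcept { return m_cancellation; }
};

static_assert(!supports_cancellation_v<plain_awaiter>);
static_assert(supports_cancellation_v<cancellable_awaiter>);
static_assert(supports_cancellation_v<cancellable_awaiter&&>);
// A const awaiter cannot hand out a non-const reference, so it is treated as unsupported.
static_assert(!supports_cancellation_v<cancellable_awaiter const>);
static_assert(supports_cancellation_v<delegating_awaiter>);
```

A check based on std::is_base_of_v would accept the const awaiter and reject the delegating one, which is the difference this sketch is meant to show.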

The notify_awaiter is augmented to receive a cancellable_promise pointer as part of its construction, which is propagated into the wrapped awaiter's enable_await_cancellation if present. This also triggers the call to enable_cancellation.

The promise_base contains a cancellable_promise, which it passes to the notify_awaiter constructor if cancellation propagation is enabled. It calls the cancellable_promise::cancel() method when the coroutine is cancelled in order to propagate the cancellation.
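The hand-off between the promise, the awaiter wrapper, and enable_await_cancellation might look roughly like the sketch below. It is single-threaded on purpose (no atomics) to show only the bookkeeping. The names set_cancellable_promise, cancellable_or_null, has_canceller, attach, and flag_awaiter are hypothetical, and attach stands in for the hook-up that notify_awaiter performs during construction.

```cpp
#include <cassert>
#include <utility>

using canceller_t = void (*)(void* context);

// Reduced stand-in: the real type uses atomics and a sentinel value.
struct cancellable_promise
{
    void set_canceller(canceller_t canceller, void* context) noexcept
    {
        m_context = context;
        m_canceller = canceller;
    }
    void revoke_canceller() noexcept { m_canceller = nullptr; }
    void cancel()
    {
        if (auto canceller = std::exchange(m_canceller, nullptr)) canceller(m_context);
    }
    bool has_canceller() const noexcept { return m_canceller != nullptr; }

private:
    canceller_t m_canceller{ nullptr };
    void* m_context{ nullptr };
};

// Marker base that also revokes the canceller when the awaiter destructs.
struct enable_await_cancellation
{
    enable_await_cancellation() = default;
    enable_await_cancellation(enable_await_cancellation const&) = delete;
    enable_await_cancellation& operator=(enable_await_cancellation const&) = delete;
    ~enable_await_cancellation()
    {
        if (m_promise) m_promise->revoke_canceller();
    }
    void set_cancellable_promise(cancellable_promise* promise) noexcept
    {
        assert(m_promise == nullptr);  // attach at most once
        m_promise = promise;
    }

private:
    cancellable_promise* m_promise{ nullptr };
};

struct promise_base
{
    bool m_propagate_cancellation{ false };
    cancellable_promise m_cancellable;

    // One boolean test decides between a real pointer and nullptr.
    cancellable_promise* cancellable_or_null() noexcept
    {
        return m_propagate_cancellation ? &m_cancellable : nullptr;
    }
};

// Stand-in for what notify_awaiter does with the pointer it receives.
template <typename Awaiter>
void attach(Awaiter& awaiter, cancellable_promise* promise)
{
    if (promise)
    {
        static_cast<enable_await_cancellation&>(awaiter).set_cancellable_promise(promise);
        awaiter.enable_cancellation(promise);
    }
}

// Example awaiter: the canceller just records that it was called.
struct flag_awaiter : enable_await_cancellation
{
    bool cancelled{ false };
    void enable_cancellation(cancellable_promise* promise)
    {
        promise->set_canceller([](void* context)
        {
            static_cast<flag_awaiter*>(context)->cancelled = true;
        }, this);
    }
};
```

With propagation disabled, attach receives nullptr and does nothing, so the awaiter's destructor also has nothing to revoke.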

The cancellation_token has a new enable_propagation method which calls back into the promise to specify whether cancellation propagation is now enabled or not, and it returns the previous state, in case you want to restore it.
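Returning the previous state makes a save-and-restore pattern easy to write. The following is a minimal sketch, assuming propagation is off by default and modelling the promise-side flag with a plain bool. The types promise_state and propagation_scope are hypothetical helpers for illustration and are not part of this PR.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the promise-side flag that the token calls into.
struct promise_state
{
    // Sets the new value and returns the previous one.
    bool enable_propagation(bool value = true) noexcept
    {
        return std::exchange(m_propagate_cancellation, value);
    }
    bool propagation_enabled() const noexcept { return m_propagate_cancellation; }

private:
    bool m_propagate_cancellation{ false };  // assumed default: disabled
};

// Hypothetical helper: change the setting for one scope, then restore it.
struct propagation_scope
{
    explicit propagation_scope(promise_state& state, bool value = true) noexcept
        : m_state(state), m_previous(state.enable_propagation(value)) { }
    ~propagation_scope() { m_state.enable_propagation(m_previous); }
    propagation_scope(propagation_scope const&) = delete;
    propagation_scope& operator=(propagation_scope const&) = delete;

private:
    promise_state& m_state;
    bool m_previous;
};

inline void example()
{
    promise_state state;
    assert(!state.propagation_enabled());
    {
        propagation_scope scope(state);       // enabled inside this block
        assert(state.propagation_enabled());
    }
    assert(!state.propagation_enabled());     // previous setting restored
}
```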

Cancellation propagation support for resume_after and resume_on_signal is very similar. To avoid race conditions if the coroutine is cancelled while await_suspend is still setting up the thread pool objects, we track the state of the awaiter in an atomic variable whose value is one of idle (no thread pool object is active), pending (waiting for thread pool object to queue a callback), or canceled (the operation has been cancelled).

After creating the thread pool objects, we transition from idle to pending and accelerate to completion if the coroutine has already been cancelled. This transition uses release semantics so that the state of the awaiter is published to any potential cancelling thread.

When resuming, we transition back to idle and throw the hresult_canceled exception if we had been cancelled. This transition can be done with relaxed semantics because the cancellation callback does not mutate the awaiter.

If cancellation is requested, we transition unconditionally to the cancelled state, and if the previous state was pending, we also have to accelerate to completion so that we can get out of the pending state. This transition requires acquire semantics because we need to read the awaiter state that was published by await_suspend.
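The three transitions can be modelled without any thread pool objects. In the sketch below, which is an illustration under assumptions and not the PR's code, the type timer_like_awaiter and its member names are hypothetical, fire_immediately only counts how often "accelerate to completion" would run, and on_resume returns true where the real awaiter would throw hresult_canceled.

```cpp
#include <atomic>
#include <cassert>

enum class state { idle, pending, canceled };

struct timer_like_awaiter
{
    std::atomic<state> m_state{ state::idle };
    int accelerations{ 0 };  // how many times completion was forced

    // Stand-in for resetting the thread pool object so that it fires now.
    void fire_immediately() noexcept { ++accelerations; }

    // End of await_suspend, after the thread pool object has been set up.
    void on_suspended() noexcept
    {
        state expected = state::idle;
        // Release on success: publish the awaiter state to a cancelling thread.
        if (!m_state.compare_exchange_strong(expected, state::pending,
            std::memory_order_release, std::memory_order_relaxed))
        {
            // Cancellation arrived during setup: complete right away.
            fire_immediately();
        }
    }

    // Cancellation callback.
    void on_cancel() noexcept
    {
        // Acquire: read the awaiter state published by on_suspended.
        if (m_state.exchange(state::canceled, std::memory_order_acquire) == state::pending)
        {
            fire_immediately();
        }
    }

    // await_resume: returns true if the wait was cancelled.
    bool on_resume() noexcept
    {
        auto previous = m_state.exchange(state::idle, std::memory_order_relaxed);
        assert(previous != state::idle);  // resuming implies we suspended or were cancelled
        return previous == state::canceled;
    }
};
```

Whichever side loses the race between on_suspended and on_cancel is the one that forces completion, so completion is forced exactly once when a cancellation arrives.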

To accelerate to completion, we first call the corresponding Set...Ex function with the magic parameters that cancel the thread pool callback. The possible return values are:

  • If the thread pool object hasn't been set yet, then it returns FALSE. This case cannot happen because we don't transition to pending state until after we set the thread pool callback.
  • If the thread pool object was set and hasn't been signaled, then it returns TRUE. In this case, we need to force a new callback.
  • If the thread pool object was set and already scheduled a thread pool callback (which may or may not have been called yet), then it returns FALSE. In this case, we don't need to force a callback, because one is on its way or has already occurred.

Therefore, we need to force a new callback if the Set...Ex function returns TRUE. For timers, we do this by resetting the timer with a due time of zero, meaning "now". For waits, we do this by resetting the wait with the current process handle (which will never be signaled) and a timeout of zero, meaning "now".

The subtlety in the case of a wait is that we use the current process handle rather than the original handle, because the original handle may have gone invalid in the meantime, and we don't want the SetThreadpoolWaitEx to fail with ERROR_INVALID_HANDLE.

Additional cost for co_await is a boolean test (to see whether cancellation propagation is enabled) that is used to decide whether to pass the address of a member variable or a null pointer. That pointer is then tested for non-null, and if non-null, the set_canceller method is called to register the cancellation callbacks. This pointer is also checked when the awaiter is destructed, and if non-null, then cleanup is performed.

Cost summary:

A bool (slips into padding) and two pointers in the promise.

If cancellation propagation is not enabled, a boolean test, a memory write of a null pointer, and two null pointer tests. Constant propagation can remove the first null pointer test. The memory write is to the awaiter, which is being constructed, so it will share the cache line with the other members of the awaiter.

If cancellation propagation is enabled, but no cancellation occurs, then a boolean test, a null pointer test, writing two pointers to memory, and a release fence; when the await completes, a null pointer test and an acquire fence.

If cancellation propagation is enabled, and cancellation occurs, then the cost goes up significantly, but that is expected because that's the point of the feature.

Code generation for MSVC-x64 at await:

    xor     rbx, rbx
    lea     rcx, m_canceller
    cmp     byte ptr m_propagate_cancellation, 0
    cmove   rcx, rbx                ; rcx = nullptr or &m_canceller
    mov     awaiter.m_promise, rbx
    (initialize other members of the awaiter)
    test    rcx, rcx                ; Q: is cancellation propagation enabled?
    jz      skip                    ; N: Don't register for cancellation
    lea     rax, awaiter            ; m_context for canceller
    mov     awaiter.m_promise, rcx  ; awaiter.m_promise = &m_canceller
    mov     awaiter.m_context, rax  ; cancellation context
    lea     rax, [callback]         ; callback function
    mov     awaiter.m_canceller, rax
skip:

Code generation at resume (awaiter destruction):

    mov     rbx, m_promise
    test    rbx, rbx                ; Q: need to clear the callback?
    je      done                    ; N: Nothing to do
    xor     eax, eax
    xchg    rax, m_promise->m_canceller ; try to clear the callback
    cmp     rax, 1                  ; Q: racing against canceller?
    jne     done                    ; N: nothing to do
spin:
    call    _Thrd_yield             ; Let the canceller finish
    xor     eax, eax
    xchg    rax, m_promise->m_canceller ; try to clear the callback
    cmp     rax, 1                  ; Q: racing against canceller?
    je      spin
done:

Profile-guided optimization may help the compiler decide which branch to optimize as the fallthrough case.

Fixes #690

The cancellation token learned a new method `enable_propagation()`.
It takes an optional boolean parameter (default true) which specifies
whether propagation of cancellation into child coroutines is enabled.
It returns the previous setting, in case you want to restore it.

If cancellation propagation is enabled in a coroutine, then
calling `IAsyncXxx.Cancel()` on the coroutine's asynchronous
activity will try to propagate the cancellation into whatever
the coroutine is `co_await`ing for. Currently, the following
awaitables are supported:

* `IAsyncXxx`
* `resume_after()`
* `resume_on_signal()`

Example:

```cpp
IAsyncAction<bool> CheckAsync()
{
    // Enable cancellation propagation.
    auto cancel = co_await get_cancellation_token();
    cancel.enable_propagation();

    HttpClient client;
    auto result = co_await client.TryGetAsync(Uri(L"https://www.microsoft.com"));

    co_return result.Succeeded() && result.ResponseMessage().IsSuccessStatusCode();
}

IAsyncAction<bool> checkOperation;

fire_and_forget StartButton_Click()
{
    checkOperation = CheckAsync();
    try
    {
        if (co_await checkOperation) {
            // Everything is just fine.
        }
        else
        {
            // Couldn't connect to server.
        }
    }
    catch (hresult_canceled const&)
    {
        // Operation was canceled.
    }
}

void CancelButton_Click()
{
    checkOperation.Cancel();
}
```

When the user clicks the Start button, we begin the `CheckAsync()`
operation and save it. If the operation is taking too long and
the user clicks the Cancel button, we Cancel the operation.
Normally, this would cancel the `CheckAsync()` operation, but
that's all. The `TryGetAsync()` call inside `CheckAsync()` continues
to run to its normal completion. The `CheckAsync()` operation has
been cancelled, and the `catch` clause runs, but the `TryGetAsync()`
call is still running.

Enabling cancellation propagation means that when the
`CheckAsync()` call is canceled, the `TryGetAsync()` call is
also canceled.

Cancellation propagation allows a coroutine to respond more
quickly to cancellation. Without cancellation propagation,
the coroutine would have to wait for the `TryGetAsync()` to
complete naturally (probably due to network timeout) before it
could clean up. In some cases, the natural completion of the
awaitable could be very long (`resume_after(24h)`) or may
never happen at all (`resume_on_signal(never_signaled)`),
causing the coroutine to become effectively leaked, since
it is waiting for something that will never happen.

> **Note**: All names are provisional. Suggestions for better
> names are gratefully welcomed.

An awaiter can mark itself as supporting cancellation
propagation by inheriting publically from `enable_await_cancellation`.
(This is analogous to `enable_shared_from_this`.)

If the awaiter is created by a coroutine in which
cancellation propagation has been enabled, its
`enable_cancellation()` method is called with a
`cancellable_promise` pointer.

The `cancellable_promise` represents a hook into the
cancellation of a coroutine, so that an awaiter can cancel
the awaited-for object when the calling coroutine is cancelled.

The awaiter's `enable_cancellation()` calls the cancellable
promise's `set_cancelled()` method, passing a callback function
pointer and a context pointer (typically the awaiter's `this` pointer).
If the outer coroutine is cancelled, then the callback function
is called with the context pointer as a parameter. The callback
function should arrange for the cancellation of the awaited-on
object so that the `co_await` can complete. It should also arrange
that the `await_resume()` method throws the `hresult_canceled()`
exception.

The cancellation callback is called under a spinlock, so it
should work quickly. If it needs to do expensive work, it should
finish the work asynchronously. Note that once the cancellation
callback returns, the awaiter is eligible for destruction, so make
sure to extend the lifetimes of any objects you need if they
would normally be destructed by the awaiter.

An example of this pattern can be found in the standard
awaiter for `IAsyncAction` (known internally as `await_adapter`).
Here is a simplified version:

```cpp
template<typename Async>
struct await_adapter : enable_await_cancellation
{
    await_adapter(Async const& async) : async(async) { }

    Async const& async;

    void enable_cancellation(cancellable_promise* promise)
    {
        promise->set_canceller([](void* context)
        {
            cancel_asynchronously(reinterpret_cast<await_adapter*>(context)->async);
        }, this);

    fire_and_forget cancel_asynchronously(Async async)
    {
        co_await resume_background();
        async.Cancel();
    }

    ... other awaiter stuff unchanged ...
};
```

When cancellation is enabled, we set a lambda as our canceller,
with the `await_adapter` itself as the context parameter. If
cancellation is indeed requested, we make a copy of the `async`
(which is a single call to `AddRef`, which should be fast),
and pass it to `cancel_asynchronously()` to continue the work
on a background thread. (We cleverly use coroutines to help
implement coroutines.)

Cancelling on a background thread solves two problems. The
first is that it allows execution on the main thread to
continue, releasing the spinlock. The second is that it
avoids deep recursion, because cancelling the awaited-for
asynchronous activity could very well trigger another
cancellation propagation if that asynchronous activity was
itself waiting for something else. The cancellation propagates
all the way down the call chain until it reaches a primitive
like `resume_on_signal()`, or reaches something that does
not support cancellation propagation.

We do not need to make any changes to `await_resume()` because
the asynchronous operation will complete with the `Canceled`
status, and the existing code already converts that to an
`hresult_canceled` exception.

The actual implementation is slightly more complicated in that
the call to `async.Cancel()` is itself done under a try/catch
to deal with the possibility that cancellation fails, for example,
because the RPC server has died. The cancellation is best effort,
and in the case of RPC server death, the loss of the operation
will be detected by the `disconnect_aware_handler` and converted
into an RPC exception.

Awaiters for custom objects may also derive from
`enable_await_cancellation` in order to participate in
cancellation propagation. For example:

```cpp
struct io_operation
{
    struct awaitable : enable_await_cancellation
    {
        io_operation& m_op;

        awaitable(io_operation& op) : m_op(op) { }

        void enable_cancellation(cancellable_promise* promise)
        {
            promise->set_canceller([](void* context)
            {
                auto op = static_cast<io_operation*>(context);
                CancelIoEx(op->handle, &op->overlapped);
            }, std::addressof(m_op));
        }

        ... other awaitable members ...
    };

    auto operator co_await()
    {
        return awaitable{ *this };
    }

    ... other io_operation members ...
};
```

The awaitable needs to be ready to cancel the operation
even before `await_ready()` is called, or after `await_resume()`
has been called. (In the latter case, of course, the operation
has already completed, so any attempt to cancel it is moot.
Nevertheless, the code needs to handle the case of a too-late
cancellations.)

## Implementation details

One of the main motivations behind the design is to keep the
"no cancellation" code path fast. Cancellation is extremely rare,
so it is better to make the "no cancellation" code path fast,
even if it comes at the expense of making the cancellation code
path slower or more complicated.

We use a callback function and pointer because they are
lighter weight than an object with a virtual method.

The portion of the coroutine promise that is exposed to
clients is the `cancellable_promise`. The methods are
as follows:

* `set_canceller`: Called by the implementation of the
  awaiter.
* `revoke_canceller`: Call by the destructor of
   `enable_cancellation` to clean up when the awaiter
  destructs.
* `cancel`: Called by the promise when the enclosing
  coroutine is cancelled.

The canceller function pointer can hold one of three
values:

* `nullptr`: No canceller has been set, or cancellation
  has already been attempted.
* canceller: The canceller to call.
* `cancelling_ptr`: Sentinel value that indicates that
  cancellation is in progress.

The `set_canceller` method sets the canceller pointer with
release semantics so that the state of the awaiter
is published to the cancelling thread.

The `revoke_canceller` method sets the canceller to
`nullptr`, but spins if the previous value was `cancelling_ptr`,
indicating that we should not destruct the awaiter because
cancellation is in progress. It uses acquire memory order
so that the state of the awaiter is properly received from
the cancelling thread (if any) before we destruct it.

The `cancel` method atomically obtains the current canceller
and sets the canceller to the `cancelling_ptr` sentinel value
to indicate that cancellation is in progress. This exchange
is performed with acquire semantics to ensure that the state
of the awaiter is properly received from the publishing
thread before we call the callback.

The `cancel` method then uses an RAII type to ensure that the
canceller returns to `nullptr` when the cancel method is done.
An RAII type is important here, so that we properly clean up
even if the canceller callback throws an exception. If there
is a canceller callback, we call it. (If the function pointer
is `cancelling_ptr`, then it means that cancellation is already
in progress, so we shouldn't try to cancel again.)

The RAII type returns the canceller to `nullptr` with release
semantics so that the state of the awaiter is published to
awaiting thread.

The `enable_await_cancellation` class serves both as a marker
class (so that we know that the awaiter supports cancellation
propagation) as well as handling the bookkeeping of revoking
the canceller when the awaiter destructs. Detecting the
`enable_await_cancellation` is done by `is_convertible_v`
to avoid edge cases like `operator co_await` returning a const
object as an awaiter. (It also means that an awaiter can
delegate the cancellation to a sub-object if it wants to
take finer control over object layout.)

The `notify_awaiter` is augmented to receive a
`cancellable_promise` pointer as part of its construction,
which is propagated into the wrapped awaiter's `enable_await_cancellation`
if present. This also triggers the call to `enable_cancellation`.

The `promise_base` contains a `cancellable_promise`, which
it passes to the `notify_awaiter` constructor if cancellation
propagation is enabled. It calls the `cancellable_promise::cancel()`
method when the coroutine is cancelled in order to propagate
the cancellation.

The `cancellation_token` has a new `enable_propagation` method
which calls back into the promise to specify whether cancellation
propagation is now enabled or not, and it returns the previous state,
in case you want to restore it.

Cancellation propagation support for `resume_after` and
`resume_on_signal` are very similar. To avoid race conditions
if the coroutine is cancelled while `await_suspend` is still
setting up the thread pool objects, we track the state of
the awaiter in an atomic variable whose values are either
`idle` (no thread pool object is active),
`pending` (waiting for thread pool object to queue a callback),
and `canceled` (the operation has been cancelled).

After creating the thread pool objects, we transition from
`idle` to `pending` and accelerate to completion if the
coroutine has already been cancelled. This transition uses
release semantics so that the state of the awaiter is published
to any potential cancelling thread.

When resuming, we transition back to `idle` and throw the
`hresult_canceled()` exception if we had been cancelled.
This transition can be done with relaxed semantics
because the cancellation callback does not mutate the awaiter.

If cancellation is requested, we transition unconditionally
to the cancelled state, and if the previous state was pending,
we also have to accelerate to completion so that we can get
out of the pending state. This transition requires acquire
semantics because we need to read the awaiter state that
was published by `await_suspend`.

To accelerate to completion, we first call the corresponding
`Set...Ex` function with the magic parameters that cancel the
thread pool callback. The possible return values are

* If the thread pool object hasn't been set yet,
  then it returns `FALSE`. This case cannot happen because
  we don't transition to `pending` state until after we set
  the thread pool callback.
* If the thread pool object was set and hasn't been signaled,
  then it returns `TRUE`. In this case, we need to force a
  new callback.
* If the thread pool object was set and already scheduled a
  thread pool callback (which may or not have been called yet),
  then it returns `FALSE`. In this case, we don't need to force
  a callback, because one is on its way or has already occurred.

Therefore, we need to force a new callback if the `Set...Ex` function
returns `TRUE`. For timers, we do this by resetting the timer with
a due time of zero, meaning "now". For waits, we do this by resetting
the wait with the current process handle (which will never be signaled)
and a timeout of zero, meaning "now".

The subtlety in the case of a wait is that we use the current process
handle rather than the original handle, because the original handle
may have gone invalid in the meantime, and we don't want the
`SetThreadpoolWaitEx` to fail with `ERROR_INVALID_HANDLE`.

Additional cost for `co_await` is a boolean test (to see whether
cancellation propagation is enabled) that is used to decide whether
to pass the address of a member variable or a null pointer. That
pointer is then tested for non-null, and if non-null, the
`set_canceller` method is called to register the cancellation callbacks.
This pointer is also checked when the awaiter is destructed,
and if non-null, then cleanup is performed.

Cost summary:

A bool (slips into padding) and two pointers in the promise.

If cancellation propagation is not enabled, a boolean test,
a memory write of a null pointer, and two null pointer tests.
Constant propagation can remove the first null pointer test.
The memory write is to the awaiter, which is being constructed,
so it will share the cache line with the other members of the
awaiter.

If cancellation propagation is enabled, but no cancellation
occurs, then a boolean test, a null pointer test, writing two
pointers to memory, and a release fence; when the await completes,
a null pointer test and an acquire fence.

If cancellation propagation is enabled, and cancellation occurs,
then the cost goes up significantly, but that is expected because
that's the point of the feature.

Code generation for MSVC-x64 at await:

    xor     rbx, rbx
    lea     rcx, m_canceller
    cmp     byte ptr m_propagate_cancellation, 0
    cmove   rcx, rbx                ; rcx = nullptr or &m_canceller
    mov     awaiter.m_promise, rbx
    (initialize other members of the awaiter)
    test    rcx, rcx                ; Q: is cancellation propagation enabled?
    jz      skip                    ; N: Don't register for cancellation
    lea     rax, awaiter            ; m_context for canceller
    mov     awaiter.m_promise, rcx  ; awaiter.m_promise = &m_canceller
    mov     awaiter.m_context, rax  ; cancellation context
    lea     rax, [callback]         ; callback function
    mov     awaiter.m_canceller, rax
skip:

Profile-guided optimization may help the compiler decide which
branch to optimize as the fallthrough case.

Code generation at resume (awaiter destruction):

    mov     rbx, m_promise
    test    rbx, rbx                ; Q: need to clear the callback?
    je      done                    ; N: Nothing to do
    xor     eax, eax
    xchg    rax, m_promise->m_canceller ; try to clear the callback
    cmp     rax, 1                  ; Q: racing against canceller?
    jne     done                    ; N: nothing to do
spin:
    call    _Thrd_yield             ; Let the canceller finish
    xor     eax, eax
    xchg    rax, m_promise->m_canceller ; try to clear the callback
    cmp     rax, 1                  ; Q: racing against canceller?
    je      spin
done:

Profile-guided optimization may help the compiler realize that
spinning almost never happens.
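
The resume-side protocol in the listing can be approximated in portable C++ as follows. This is a sketch under assumptions, not the library's code: `canceller_slot` and its member names are hypothetical, and the value 1 is taken from the listing as the "canceller is running" marker. The sketch uses a compare-exchange in `revoke` rather than the unconditional `xchg` shown above, so that the marker stays in place until the canceller clears it. It also assumes at most one `cancel` call at a time.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

using canceller_t = void (*)(void*);

// Hypothetical slot holding the cancellation callback and its context.
class canceller_slot
{
public:
    // Await path: publish the context first, then the callback.
    void set(canceller_t callback, void* context)
    {
        m_context = context;
        m_callback.store(callback, std::memory_order_release);
    }

    // Resume path: clear the callback, yielding while a canceller is mid-call.
    void revoke()
    {
        canceller_t current = m_callback.load(std::memory_order_acquire);
        for (;;)
        {
            if (current == busy())
            {
                std::this_thread::yield(); // let the canceller finish
                current = m_callback.load(std::memory_order_acquire);
            }
            else if (m_callback.compare_exchange_weak(current, nullptr,
                         std::memory_order_acq_rel, std::memory_order_acquire))
            {
                return; // callback cleared (or there was none)
            }
        }
    }

    // Cancel path: claim the callback, run it, then clear the marker.
    void cancel()
    {
        canceller_t callback = m_callback.exchange(busy(), std::memory_order_acquire);
        if (callback != nullptr && callback != busy())
        {
            callback(m_context);
        }
        m_callback.store(nullptr, std::memory_order_release);
    }

private:
    // The marker value 1 is never a valid function address.
    static canceller_t busy()
    {
        return reinterpret_cast<canceller_t>(static_cast<std::uintptr_t>(1));
    }

    std::atomic<canceller_t> m_callback{ nullptr };
    void* m_context{ nullptr };
};
```

In the common case `revoke` finds either the original callback or null, so the yield branch is never taken, which matches the expectation that spinning almost never happens.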

Fixes #690
@oldnewthing marked this pull request as ready for review August 13, 2020 22:03

    int32_t __stdcall WINRT_IMPL_TrySubmitThreadpoolCallback(void(__stdcall *callback)(void*, void* context), void* context, void*) noexcept;
    winrt::impl::ptp_timer __stdcall WINRT_IMPL_CreateThreadpoolTimer(void(__stdcall *callback)(void*, void* context, void*), void* context, void*) noexcept;
    void __stdcall WINRT_IMPL_SetThreadpoolTimer(winrt::impl::ptp_timer timer, void* time, uint32_t period, uint32_t window) noexcept;
    int32_t __stdcall WINRT_IMPL_SetThreadpoolTimerEx(winrt::impl::ptp_timer timer, void* time, uint32_t period, uint32_t window, void* reserved) noexcept;

Collaborator

This signature doesn't match the exported function - also why the x86 build is failing.

    WINRT_IMPL_LINK(TrySubmitThreadpoolCallback, 12)
    WINRT_IMPL_LINK(CreateThreadpoolTimer, 12)
    WINRT_IMPL_LINK(SetThreadpoolTimer, 16)
    WINRT_IMPL_LINK(SetThreadpoolTimerEx, 20)

Collaborator

Should be 16.
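
For readers unfamiliar with the numbers in `WINRT_IMPL_LINK`: judging from the other entries, the second argument appears to be the x86 `__stdcall` argument byte count, that is, the sum of the argument sizes with each rounded up to a 4-byte stack slot. The following sketch shows why a four-parameter function yields 16 and a five-parameter declaration yields 20. The helper names (`x86_arg_bytes`, `stdcall_bytes`, `timer_handle`) are hypothetical, and pointers are modeled as 4 bytes regardless of the host so the arithmetic also holds when compiled for x64. The four-parameter shape matches the documented `SetThreadpoolTimerEx`.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Bytes one argument occupies on the x86 stack: pointers are modeled as
// 4 bytes (whatever the host), everything else rounds up to a 4-byte slot.
template <typename T>
constexpr std::size_t x86_arg_bytes =
    std::is_pointer_v<T> ? 4 : (sizeof(T) + 3) / 4 * 4;

template <typename Signature>
struct stdcall_bytes;

// Sum the slot sizes of every parameter in the function type.
template <typename R, typename... Args>
struct stdcall_bytes<R(Args...)>
    : std::integral_constant<std::size_t,
          (std::size_t{ 0 } + ... + x86_arg_bytes<Args>)>
{
};

using timer_handle = void*; // hypothetical stand-in for the timer handle type

// Four parameters: timer, due time, period, window.
using four_params = std::int32_t(timer_handle, void*, std::uint32_t, std::uint32_t);

// Five parameters: the same plus a trailing reserved pointer.
using five_params = std::int32_t(timer_handle, void*, std::uint32_t, std::uint32_t, void*);

static_assert(stdcall_bytes<four_params>::value == 16);
static_assert(stdcall_bytes<five_params>::value == 20);
```

A mismatch between the declared parameter list and the exported function changes this count, which on x86 changes the decorated symbol name and so surfaces as a link failure rather than a runtime error.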

@kennykerr
Collaborator

Is this PR still active?

@oldnewthing
Member Author

> Is this PR still active?

Was hoping to get naming feedback before fixing the other issues.

@kennykerr
Collaborator

Thanks, will review the names asap.

@kennykerr
Collaborator

Thanks Raymond, I reviewed the names. I couldn't think of better names. 😉

Collaborator

@kennykerr left a comment

Looks good - thanks!

@kennykerr merged commit c20a75b into microsoft:master Sep 18, 2020
@oldnewthing deleted the chained-cancellation branch September 20, 2020 04:39
oldnewthing added a commit to microsoft/Windows-universal-samples that referenced this pull request Dec 24, 2020
* AudioCreation: Add microphone selection to scenario 2
* BarcodeScanner: Add missing namespace qualifier #1253
* HttpClient: Take advantage of C++/WinRT cancellation feature microsoft/cppwinrt#721
* LockScreenApps: Provide workaround for Settings app issue #1249
* MediaTranscoding: Explicitly set the H.264 encoder level to 5.2 when needed
* MobileBroadband: Fix order of parameters #1255
* C++/WinRT template: Use angle brackets for C++/WinRT headers #1231
* Fix broken links and typos in README files #1235 #1245 #1256
* New samples: WebSocket (C++WinRT)

Linked issue: cancellation_token<TPromise> doesn't remove cancellation callback, requiring more defensive code