Add support for runtime async in the scanner#121622
MichalStrehovsky merged 5 commits into dotnet:main
Before we start generating native code using RyuJIT, we do an IL scanning pass where we build the whole program view. The whole program view builds a dependency graph that is similar to the one we create during code generation. Instead of generating code, we look at method IL and generate whatever dependency nodes are going to be needed for real codegen (`ConstructedEETypeNode` for `newobj`, more scanned method nodes for calls, etc.). This PR adds support for scanning methods that compile into a state machine. It also suppresses some of the whole program optimizations around the special `Continuation` type.
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
Pull Request Overview
This PR adds support for runtime async in the IL scanner phase of Native AOT compilation. The scanner builds a whole program view by analyzing method IL and generating dependency nodes before actual code generation begins.
Key Changes:
- Added pattern matching to detect task await patterns in async methods during IL scanning
- Introduced infrastructure to track and add async-related dependencies (continuation types, async helpers, resumption stubs)
- Refactored continuation type access to use a centralized property to improve code maintainability
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| ILImporter.Scanner.cs | Implements async pattern detection (MatchTaskAwaitPattern) and adds dependencies for async infrastructure when scanning calls to async methods from async state machines |
| ILScanner.cs | Prevents devirtualization optimizations on continuation types since they're dynamically generated and not well tracked |
| CorInfoImpl.cs | Refactors to use the new centralized ContinuationType property instead of directly calling GetKnownType |
| CompilerTypeSystemContext.Async.cs | Introduces public ContinuationType property for centralized access to the base continuation type and refactors internal hashtable to use it |
src/coreclr/tools/aot/ILCompiler.Compiler/IL/ILImporter.Scanner.cs

    // If this is the task await pattern, we're actually going to call the variant
    // so switch our focus to the variant.
    if (method.GetMethodDefinition().Signature.ReturnsTaskOrValueTask()
        && MatchTaskAwaitPattern())

In the JIT case the call to `MatchTaskAwaitPattern` is conditioned on a non-release flag. When the await pattern optimization is disabled, the code should still work correctly, and running that way is a good "stress" test for the implementation.
There are ways the optimization can be defeated (store the call result in a field, then await the field), so "unoptimized" scenarios are possible, but they may not be covered by regular tests because we tend not to write such code intentionally. Disabling the optimization therefore provides extra test coverage that can sometimes be useful.
Just something to consider.
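
For readers following the thread, here is a minimal C# sketch of the source shapes being discussed: the direct call-then-await shape, the same shape with `ConfigureAwait` in between, and the "defeated" shape that goes through a field. The type, method, and field names (`AwaitShapes`, `ComputeAsync`, `s_pending`) are hypothetical and are not taken from the PR. Whether a given shape is actually matched by `MatchTaskAwaitPattern` is as described in the comments here and is not verified by this sketch; with an ordinary compiler all three methods compile as regular async methods.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitShapes
{
    // Hypothetical field, used only to break the direct "call then await" shape.
    private static Task<int> s_pending = Task.FromResult(0);

    private static async Task<int> ComputeAsync()
    {
        await Task.Yield();
        return 42;
    }

    // Direct shape: the task-returning call feeds straight into the await.
    public static async Task<int> DirectAsync() => await ComputeAsync();

    // Same shape, with ConfigureAwait between the call and the await.
    public static async Task<int> DirectConfiguredAsync() =>
        await ComputeAsync().ConfigureAwait(false);

    // "Defeated" shape: the result is stored in a field and the field is awaited.
    public static async Task<int> ViaFieldAsync()
    {
        s_pending = ComputeAsync();
        return await s_pending;
    }

    public static async Task Main()
    {
        Console.WriteLine(await DirectAsync());           // 42
        Console.WriteLine(await DirectConfiguredAsync()); // 42
        Console.WriteLine(await ViaFieldAsync());         // 42
    }
}
```

All three methods produce the same result; the difference under discussion is only in which call target the compiler (and therefore the scanner) ends up depending on.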
Also note that `MatchTaskAwaitPattern` starts with a non-async call to something task-returning, and then the result is fed into the `Await` call (or possibly into `ConfigureAwait` first and then into `Await`).
The pattern match may need to be reordered with respect to the above `&& method.IsAsync`.
An interesting mental exercise is what happens to:
`int x = await await ReturnsTaskOfTaskOfInt();`
In the inner await, the task may represent asynchrony in `ReturnsTaskOfTaskOfInt` and is thus optimizable into an async call. For the outer await, the `Task<int>` is just the data type of the inner result. The `Await` helper will unwrap the result in an async-friendly way, but it would not be optimizable into an async call.
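
A small C# sketch of that exercise, with the intermediate type spelled out. `ReturnsTaskOfTaskOfInt` is the name used in the comment; its body here and the wrapper names (`NestedAwait`, `UnwrapInlineAsync`, `UnwrapStepwiseAsync`) are assumptions made purely for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class NestedAwait
{
    // Hypothetical body: the outer task completes asynchronously,
    // and its result is itself a Task<int>.
    private static async Task<Task<int>> ReturnsTaskOfTaskOfInt()
    {
        await Task.Yield();
        return Task.FromResult(7);
    }

    // The expression from the comment, written inline.
    public static async Task<int> UnwrapInlineAsync() =>
        await await ReturnsTaskOfTaskOfInt();

    // The same computation with the two awaits separated.
    public static async Task<int> UnwrapStepwiseAsync()
    {
        // Inner await: operand is a direct task-returning call.
        Task<int> inner = await ReturnsTaskOfTaskOfInt();
        // Outer await: operand is a Task<int> value, not a call.
        int x = await inner;
        return x;
    }

    public static async Task Main()
    {
        Console.WriteLine(await UnwrapInlineAsync());   // 7
        Console.WriteLine(await UnwrapStepwiseAsync()); // 7
    }
}
```

The stepwise version makes it visible that only the inner await has a call as its operand, which is the distinction the comment draws.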
> In the JIT case the call to `MatchTaskAwaitPattern` is conditioned on a non-release flag. When the await pattern optimization is disabled the code should still work correctly and is a good "stress" test for the implementation.
We don't run the scanner unless we're optimizing. The scanner can and should assume the optimization will happen (we also special case various intrinsics as mustExpand when compiling for native AOT). If the optimization doesn't happen when the method is NoOptimization (can't tell - looks like it still does), we could gate it on that, but otherwise we don't want to assume this optimization doesn't happen when building whole program view.
Assuming it might not happen means we'd waste virtual slots because we'd need to assume both variant slots are always used when scanning (one of the objectives of scanning is to eliminate unused virtual slots). We really do care about working set.
> Thus disabling the optimization is an extra test coverage that can be useful sometimes.

Like I wrote on Teams, we don't have coverage of unoptimized async codegen in the src/tests/async tree because all the tests do `<Optimize>True</Optimize>`.
The await transform based on the IL pattern match happens even in debug codegen. I do not think we would ever change that. The ability to disable it is a JIT debug/checked only option based on an environment variable. I do not think ILC needs to try to support it. (IIUC this would be a problem since the scanner would underestimate the set.)
Down the road it is likely we will build more cases where we transform to direct calls to async variants. For example, `await (x ? FooAsync() : BarAsync())` if the JIT is able to fold `x` away. I imagine ILC won't be precise in these scenarios, but we will end up with an overestimate, and that is fine.
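
As an illustration of the conditional shape mentioned above, here is a minimal C# sketch. `FooAsync` and `BarAsync` are the names from the comment; their bodies and the wrapper `PickAsync` are hypothetical. Note that the parentheses are required in C#: `await` binds more tightly than `?:`, so `await x ? ... : ...` would try to await `x` itself.

```csharp
using System;
using System.Threading.Tasks;

public static class ConditionalAwait
{
    // Hypothetical bodies; only the names come from the discussion.
    private static async Task<int> FooAsync()
    {
        await Task.Yield();
        return 1;
    }

    private static async Task<int> BarAsync()
    {
        await Task.Yield();
        return 2;
    }

    // The awaited operand is a conditional expression over two task-returning calls,
    // so the call target is only known once x is known.
    public static async Task<int> PickAsync(bool x) =>
        await (x ? FooAsync() : BarAsync());

    public static async Task Main()
    {
        Console.WriteLine(await PickAsync(true));  // 1
        Console.WriteLine(await PickAsync(false)); // 2
    }
}
```

A scanner that cannot fold `x` would presumably have to treat both callees as reachable, which corresponds to the overestimate described in the comment.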
I did not mean a flag hooked up to `<Optimize>`, or doing regular runs with the flag disabled.
This optimization is always on in the released compiler.
The only reasons to disable the optimization are:
- investigating something.
- getting some extra test coverage during bring-up.
The same can be achieved by simply commenting out the call to the pattern matcher in the source and rebuilding. A knob could be slightly more convenient.
This was just a mild suggestion. If it does not fit into the workflow of the NativeAOT codebase, it is completely OK not to have a switch.
@dotnet/ilc-contrib this is ready for another round of reviews. This might be the last thing we need to start running src/tests/async (I do have the async tree passing locally already).
@jakobbotsch Could you please review that this matches the JIT importer behavior? (Also, any changes in that behavior will need to be reflected here going forward.)
Cc @dotnet/ilc-contrib (@jakobbotsch for the `impMatchTaskAwaitPattern` copy)