Avoid resolving direct awaits to thunks (IL scanner part)#124536
Avoid resolving direct awaits to thunks (IL scanner part)#124536MichalStrehovsky merged 2 commits intodotnet:mainfrom
Conversation
Reacts to dotnet#124283. Fixes native AOT outerloops with runtime async turned on.
|
Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib |
There was a problem hiding this comment.
Pull request overview
This PR implements the IL scanner part of the optimization from PR #124283 to avoid resolving direct awaits to thunks in NativeAOT. When a direct call to a Task-returning method is encountered during async scanning, the compiler now avoids creating unnecessary async variant thunks, which reduces JIT overhead and enables better optimization.
Changes:
- Added test for static virtual interface methods with async to validate the fix
- Refactored
IsCallEffectivelyDirectinto a reusable extension method - Enhanced IL scanner to skip async variant resolution for direct calls that would result in thunks
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/tests/async/staticvirtual/staticvirtual.csproj | New test project using IL SDK with C# compilation |
| src/tests/async/staticvirtual/staticvirtual.cs | Test case for static virtual interface methods with async, validates the thunk avoidance optimization |
| src/coreclr/tools/Common/Compiler/MethodExtensions.cs | Added IsCallEffectivelyDirect extension method for reuse across codebase |
| src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs | Removed IsCallEffectivelyDirect private method and updated to use extension method |
| src/coreclr/tools/aot/ILCompiler.RyuJit/JitInterface/CorInfoImpl.RyuJit.cs | Updated call site to use extension method |
| src/coreclr/tools/aot/ILCompiler.Compiler/IL/ILImporter.Scanner.cs | Added logic to avoid resolving direct calls to async thunks, refactored async variant checks, and updated to use extension method |
There was a problem hiding this comment.
I don't quite understand why it is necessary for correctness. If we think we use the thunk then won't that be an overestimate of what is needed since the thunk also references the real function? This same kind of misreporting should happen whenever we inline something.
Similar JIT optimizations will happen in the future (what I was commenting about in #121622 (comment))
|
The problem here is that there are two virtual slots for each async method (the thunk and the real body each get a separate slot). IL scanner was predicting a different slot would be needed than what RyuJIT then asked for. We treeshake unused virtual slots based on whole program analysis. |
|
Even though these cases being optimized are not virtual? I expect in the future we will have JIT optimizations that will switch between thunk and the non-thunk versions in ways that will be impossible to predict by the IL scanner. Do you have any thoughts about how we can handle this? |
These were virtuals (see the added test case). Static virtuals, but we still need to track them like this if it's in shared code. Scanner predicted we're calling the async-call variant of the interface method, but RyuJIT needed the non-async one.
We can always go conservative in the scanner. At that point, we'll hopefully be able to measure real-world impact of being conservative and take it from there. Since right now we can't measure real world impacts and we can still model this precisely, we're doing precise. |
Does it mean the logic I added has a bug? If the call is virtually dispatched then we should make sure we switch to the async variant so that we don't pessimize the actual runtime async overrides. |
Maybe it was the other way around. I'm not sure how it was anymore; this was a constrained call case. Either way, the scanner will now match JitInterface and the compiler crash is gone. |
|
I do think there is a bug in the logic. In your test case I see a task-returning thunk created for I suppose we will need to pass additional flag for whether the call/callvirt is constrained to |
Reacts to #124283. Fixes native AOT outerloops with runtime async turned on.
Cc @dotnet/ilc-contrib @jakobbotsch