-
Notifications
You must be signed in to change notification settings - Fork 5.3k
JIT: detect address of field as an invariant inlining arg #845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: detect address of field as an invariant inlining arg #845
Conversation
Update the check for arg invariance to include addresses of fields in local structs. This allows the inliner to directly substitute more arguments into the body of the inlinee. Resolves dotnet/coreclr#27630.
|
@dotnet/jit-contrib PTAL Fixes the Diff summary for x86: and for x64: |
|
|
||
| if ((curArgVal->OperKind() & GTK_CONST) || | ||
| ((curArgVal->gtOper == GT_ADDR) && (curArgVal->AsOp()->gtOp1->gtOper == GT_LCL_VAR))) | ||
| if ((curArgVal->OperKind() & GTK_CONST) || isAddressInLocal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's some code between these 2 changes that doesn't appear in GitHub:
if (curArgVal->gtFlags & GTF_ALL_EFFECT)
{
inlCurArgInfo->argHasGlobRef = (curArgVal->gtFlags & GTF_GLOB_REF) != 0;
inlCurArgInfo->argHasSideEff = (curArgVal->gtFlags & (GTF_ALL_EFFECT & ~GTF_GLOB_REF)) != 0;
}It looks to me that in some cases this will unnecessarily set argHasGlobRef when the arg is a local address. Not sure what impact that has on CQ, if any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw your comment about that over on dotnet/coreclr#27630. But I didn't see any global ref flag on the test case here, at least when inlining.
I can look into bypassing that for invariant args.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, here I see it in M1 on FIELD and ADDR:
STMT00002 (IL 0x00F... ???)
[000012] I-C-G------- * CALL ref System.Int32.ToString (exactContextHnd=0x02F7E975)
[000011] ----G------- this in ecx \--* ADDR byref
[000010] ----G------- \--* FIELD int i
[000009] ------------ \--* ADDR byref
[000008] ------------ \--* LCL_VAR int V01 loc0
|
Note this change (when it fires, which isn't often) results in fewer temps. Looking at the regression for x86 on Seems like it would be better when decomposing a long copy to not create overlapping lifetimes, else we might need a lot of spilling, as we end up doing here here: mov dword ptr [ebp-40H], ecx
mov ecx, dword ptr [ebp+0CH]
mov dword ptr [ebp-44H], ecx
mov ecx, dword ptr [ebp-40H]
mov dword ptr [ebp-28H], ecx
mov ecx, dword ptr [ebp-44H]
mov dword ptr [ebp-2CH], ecx
mov ecx, dword ptr [ebp-28H]I also see the global flag set on the arg tree that ultmately feeds into for this, perhaps related in some way. |
|
|
Also conservative for a field of a struct param, on x64/arm64. Looking into cleaning up the global effect flag separately. I think the x64/arm64 conservatism for struct params may be needed because reflection can invoke methods with implicit by-ref structs referring to heap allocated structs, though one would hope it would not matter (as even if on the the heap, they should not alias one another, so not sure why we'd care)? Would have been nice to have the comment here be a bit more insightful. #if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
// These structs are passed by reference; we should probably be able to treat these
// as non-global refs, but downstream logic expects these to be marked this way. |
Maybe you could open an issue, and point to that case. |
Sure, though I don't know where it shows up without this change. Long decomposition in general is something we ought to improve (see https://github.com/dotnet/coreclr/issues/18339#issuecomment-468355602), but I'm not sure how much priority we should put on x86 CQ.... |
|
@AndyAyersMS - thanks for opening the issues. Presumably the long decomposition matters on arm32 as well as x86, though I'm not certain that changes the priority question. |
CarolEidt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thanks
Nope, not related... still see the long decomp issue even with the prototype fix for global flags. |
|
I think the arm test run finished but did not get reported back. Am going to ignore it. |
Update the check for arg invariance to include addresses of fields in local
structs. This allows the inliner to directly substitute more arguments into
the body of the inlinee.
Resolves dotnet/coreclr#27630.