Fix an incorrect CSE case with struct retyping. #34676
PTAL @erozenfeld @briansull, cc @dotnet/jit-contrib
src/coreclr/src/jit/morph.cpp
Outdated
    bool canCSE = tree->CanCSE();
    // That doesn't make any sense, was it supposed to clean all flags except GTF_NODE_MASK here? 2012<
    tree->gtFlags &= GTF_NODE_MASK;
    // If yes then why a year after (2013) was this cleaning added, that obviously does nothing.
I don't think you want to check in these two comments.
Yes, I will drop that commit before merging; I want to see whether somebody knows more about these changes.
src/coreclr/src/jit/importer.cpp
Outdated
    op = op1->AsOp()->gtOp1;
    if (canCSE)
    {
        op->ClearDoNotCSE();
It isn't necessary to guard this call to ClearDoNotCSE()
Why isn't it? It could be a volatile LCL_VAR that needs that flag.
The implementation of CanCSE is
return ((gtFlags & GTF_DONT_CSE) == 0);
So this is basically saying if the flag is set then clear it using the AND NOT operation.
Not sure how a volatile LCL_VAR would behave any differently here
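The point about the AND NOT operation can be sketched in isolation. The snippet below is a minimal model, not the JIT's actual GenTree: the Node type, the helper ClearGuardIsRedundant, and the numeric value of GTF_DONT_CSE are assumptions made for illustration. It only shows that guarding a clear on the same node's own flag is redundant; it says nothing about a guard computed from a different node.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit value; the real GTF_DONT_CSE constant in the JIT may differ.
constexpr std::uint32_t GTF_DONT_CSE = 0x00200000u;

// Minimal stand-in for a tree node that only carries a flags word.
struct Node
{
    std::uint32_t gtFlags = 0;

    bool CanCSE() const { return (gtFlags & GTF_DONT_CSE) == 0; }
    void SetDoNotCSE() { gtFlags |= GTF_DONT_CSE; }
    void ClearDoNotCSE() { gtFlags &= ~GTF_DONT_CSE; } // AND NOT: clears only this bit
};

// Hypothetical helper: returns true when "clear only if set" and
// "clear unconditionally" leave the same flags on the SAME node.
bool ClearGuardIsRedundant(Node n)
{
    Node guarded = n;
    if (!guarded.CanCSE())
    {
        guarded.ClearDoNotCSE();
    }

    Node unguarded = n;
    unguarded.ClearDoNotCSE();

    return guarded.gtFlags == unguarded.gtFlags;
}
```

Under these assumptions the helper returns true for any starting flags, which is why a guard only matters when the condition is read from one node and the clear is applied to another.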
> So this is basically saying if the flag is set then clear it using the AND NOT operation.

Maybe I am missing something, but I don't understand.
We check the flag on the IND(ADDR(X)) tree and clear it on X. X was marked as DONT_CSE because it was an ADDR source; IND(ADDR(X)) could be marked as DONT_CSE for other reasons that we don't change here. So if IND(ADDR(X)) was marked with that flag, then after the transformation X should have this flag as well.
For example, in IND1(ADDR2(IND3(ADDR4(X)))), X and IND3 initially have DONT_CSE because they are ADDR sources; when we optimize IND3(ADDR4(X)) we have to keep DONT_CSE on X.
Yes, I see.
I missed that we were overwriting op after setting canCSE:
    bool canCSE = op->CanCSE();
    op = op1->AsOp()->gtOp1;
This sequence is confusing and could use a comment saying that we are preserving any DONT_CSE that was set on the original version of op.
Actually I'm not sure that this implementation does properly preserve this flag.
When a parent node sets DONT_CSE on a node it means that it wants the child node to always have this flag set.
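The sequence discussed in this thread can be sketched with a toy tree. This is a simplified model that mirrors the shape of the snippet under review; Node, Oper, FoldIndOfAddr, and the flag value are hypothetical names introduced here, and whether this approach fully preserves the flag is exactly the open question raised above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag value and node kinds, for illustration only.
constexpr std::uint32_t GTF_DONT_CSE = 0x1u;

enum class Oper { LCL_VAR, ADDR, IND };

struct Node
{
    Oper          oper;
    Node*         op1;
    std::uint32_t flags;

    Node(Oper o, Node* child = nullptr, std::uint32_t f = 0) : oper(o), op1(child), flags(f) {}

    bool CanCSE() const { return (flags & GTF_DONT_CSE) == 0; }
    void ClearDoNotCSE() { flags &= ~GTF_DONT_CSE; }
};

// Folds IND(ADDR(X)) to X and returns X.
Node* FoldIndOfAddr(Node* ind)
{
    assert(ind != nullptr && ind->oper == Oper::IND);
    assert(ind->op1 != nullptr && ind->op1->oper == Oper::ADDR);

    // Read the flag from the outer IND BEFORE switching to X.
    const bool canCSE = ind->CanCSE();

    Node* x = ind->op1->op1;
    assert(x != nullptr);

    // X was marked DONT_CSE only because it was an ADDR source. Drop the mark
    // only when the tree being replaced was itself allowed to be CSE'd;
    // otherwise X inherits the restriction of the tree it replaces.
    if (canCSE)
    {
        x->ClearDoNotCSE();
    }
    return x;
}
```

In this model, folding an IND that carries DONT_CSE leaves the flag on X, while folding an unmarked IND clears it.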
src/coreclr/src/jit/importer.cpp
Outdated
    GenTree* byReferenceStruct = gtCloneExpr(thisptr->gtGetOp1());
    assert(byReferenceStruct != nullptr);
    GenTreeLclVar* byReferenceStruct = gtCloneExpr(lclVar)->AsLclVar();
    assert(!byReferenceStruct->CanCSE());
Did you want to remove the assert? I believe that gtCloneExpr can return nullptr for some complex trees.
assert(byReferenceStruct != nullptr);
CanCSE dereferences the pointer, so it is not necessary to check it separately, but I will restore the assert if it reads better.
An A/V in a Checked build is worse than an assert firing, in terms of debuggability.
Checker phases are typically used to enforce correctness issues. Failing to perform a CSE isn't a correctness issue and often won't even significantly change the performance of the generated code. It will show up in code diffs, which looks like what your bug fix is trying to correct here.
I am fine with accepting a small amount of textual diffs and we often get them when code changes the behavior of the CSE heuristics.
How is it different from, for example,
GTF_EXCEPT flags are a bit different IMO: these are flags that get pushed up the tree to all of the parent nodes. The DONT_CSE flag is a local flag that only applies to the current node, and typically the parent node is responsible for setting it on one or both of its child nodes. That said, I am not opposed to adding some kind of checker if it provides good value.
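The difference between a propagated flag and a local flag can be sketched as follows. This is an illustrative model only: the Node layout, the flag values, and CheckExceptPropagated are assumptions and are not taken from the JIT. It shows why a propagated flag has a cheap structural invariant (a parent must carry whatever its children carry), while a local flag such as DONT_CSE has no invariant that can be derived from the children alone.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical flag values, for illustration only.
constexpr std::uint32_t GTF_EXCEPT   = 0x1u; // propagated up to all parents
constexpr std::uint32_t GTF_DONT_CSE = 0x2u; // local to the node it is set on

struct Node
{
    std::uint32_t      flags = 0;
    std::vector<Node*> children;
};

// Returns true if every node that has a child with GTF_EXCEPT also has
// GTF_EXCEPT itself. GTF_DONT_CSE is deliberately ignored: a parent sets it
// on a child, so nothing about the parent's own flags can be checked here.
bool CheckExceptPropagated(const Node* n)
{
    for (const Node* child : n->children)
    {
        if (((child->flags & GTF_EXCEPT) != 0) && ((n->flags & GTF_EXCEPT) == 0))
        {
            return false;
        }
        if (!CheckExceptPropagated(child))
        {
            return false;
        }
    }
    return true;
}
```

A checker for DONT_CSE would instead need per-operator knowledge of why each parent sets the flag, which is where the cost discussed in this thread comes from.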
We currently have 56 places in the code where we set
src/coreclr/src/jit/valuenum.cpp
Outdated
    if ((indType == TYP_REF) && (varTypeIsStruct(elemTyp)))
    {
        // This whole block is over-optimistic, we don't have any information
I checked that if we delete that unsafe condition, we get an 8-byte regression in framework assemblies (1 method) and 2 in SPMI (pri1 and BING).
That regression will be fixed by #33225, so I think I will just delete this block before the merge.
erozenfeld
left a comment
Left a few comments.
src/coreclr/src/jit/morph.cpp
Outdated
    if (!canCSE)
    {
        tree->SetDoNotCSE();
    }
Same comment: why not set DONT_CSE without checking it first?
Never mind, I misread the change.
src/coreclr/src/jit/morph.cpp
Outdated
    tree->AsLclVarCommon()->SetLclNum(fieldLclIndex);
    tree->gtType = fieldType;
    bool canCSE = tree->CanCSE();
    // That doesn't make any sense, was it supposed to clean all flags except GTF_NODE_MASK here? 2012<
I can help you find the change that introduced this.
Co-Authored-By: Eugene Rozenfeld <erozen@microsoft.com>
This reverts commit cd24383.
src/coreclr/src/jit/valuenum.cpp
Outdated
That change gives us one regression in framework crossgening:
8 (12.70% of base) : Newtonsoft.Json.dasm - Newtonsoft.Json.Serialization.JsonTypeReflector:get_DynamicCodeGeneration():bool
I have updated the PR, deleted the suspicious optimization from
briansull
left a comment
LGTM now
erozenfeld
left a comment
LGTM
The failing assert was:
That was a combination of two issues:
- IND struct under ADDR was not marked with DONT_CSE; it came from INDEX morphing;
- IND for the same LCL_VAR, but as ref from call return retyping: VNApplySelectorsTypeCheck saw this IND ref and returned struct type for it without checking sizes/structHandles (we don't have handles in this case, because we don't have a mechanism to keep them on pointers and we access the array through a ref (pointer) local var);

-> CSE saw IND struct and IND ref with the same VN and without DONT_CSE and failed.

The first issue was fixed in the third commit, 86a9f22. It introduced a few diffs (20 methods in frameworks, then a few in BING/pri1 SPMI), all textual, not size diffs. I have fixed them in commit cd24383.
The solution adds this flag on each node under ADDR and checks it in debugCheckFlags. It is questionable because we will for sure forget to clear the DONT_CSE flag from time to time, and we don't have a checker to assert when it is set without a reason, as we do for the CALL, ASG, EXCEPT flags.

We could add a checker (in another PR) but it won't be cheap because:
- DONT_CSE is used not only for correctness but for profitability as well. I had a change that cleaned it from NULLCHECK nodes, but the diffs were negative: CSE started to create CSE copies for them, but the null checks were later deleted by assertion propagation, so changing the CSE logic to have zero diffs would be non-trivial;
- DONT_CSE is required for correctness, and there are TODO comments about deleting some of these flags that reference tests that fail if you do.

Maybe it would be better to drop the commit for SPMI diffs and accept textual diffs instead of adding these non-obvious conditions. I am sure there are some other places where we forgot to clean that flag; they just don't produce any diffs.
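The check described for commit 86a9f22 (every ADDR source is marked DONT_CSE) could look roughly like the sketch below. It is a toy model with unary nodes only; Node, Oper, AddrSourcesMarkedDontCSE, and the flag value are hypothetical names, and the real debugCheckFlags logic in the JIT is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag value and node kinds, for illustration only.
constexpr std::uint32_t GTF_DONT_CSE = 0x1u;

enum class Oper { LCL_VAR, ADDR, IND };

struct Node
{
    Oper          oper;
    Node*         op1;
    std::uint32_t flags;

    Node(Oper o, Node* child = nullptr, std::uint32_t f = 0) : oper(o), op1(child), flags(f) {}

    bool CanCSE() const { return (flags & GTF_DONT_CSE) == 0; }
};

// Walks a chain of unary nodes and returns false as soon as it finds an ADDR
// whose operand is still allowed to be CSE'd.
bool AddrSourcesMarkedDontCSE(const Node* n)
{
    while (n != nullptr)
    {
        if ((n->oper == Oper::ADDR) && (n->op1 != nullptr) && n->op1->CanCSE())
        {
            return false;
        }
        n = n->op1;
    }
    return true;
}
```

Note that this only detects a missing flag. It cannot detect a flag that is set without a reason, which is the gap the description points out.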
Commits:
6128613: Fix old printing issues.
df5266c: add a repro test that doesn't require crossgen2.
86a9f22: Check that all ADDR sources are marked as DONT_CSE.
cd24383: Fix text (no size) diffs found by SPMI (could be dropped).
c0300c4: Add a question/comment (will be dropped/fixed before merge).
Fixes #33884.