JIT: extend copy prop to local fields #74384
Allow copy prop to update GT_LCL_FLD nodes. Update local assertion gen for block opts to use the pre-morph tree to generate copy or zero assertions, since the semantics of the post-morph tree are often obscured by the copy/zero expansions.
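As a rough illustration of the first change, the sketch below models copy propagation into a local field use: given an active copy assertion `dst == src`, a read of a field of `dst` is retargeted to the same offset in `src`. `LclFldUse`, `CopyMap`, and `propagateCopyIntoLclFld` are hypothetical names for this toy model, not JIT APIs, and the sketch assumes both locals share a layout.

```cpp
#include <cassert>
#include <map>

// Toy stand-in for a GT_LCL_FLD use: which local is read, and at what offset.
struct LclFldUse
{
    int      lclNum;
    unsigned offset;
};

// Active copy assertions, keyed by destination local: dst -> src.
using CopyMap = std::map<int, int>;

// If a copy assertion covers the local being read, read the source instead.
// Returns true when the use was rewritten.
bool propagateCopyIntoLclFld(LclFldUse& use, const CopyMap& copies)
{
    auto it = copies.find(use.lclNum);
    if (it == copies.end())
    {
        return false; // no copy assertion about this local
    }
    use.lclNum = it->second; // offset is unchanged (assumes identical layouts)
    return true;
}
```

With a copy assertion `V29 == V14`, a field read of V29 at offset 8 becomes a field read of V14 at offset 8, which can leave the copy itself dead.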
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
This is an alternate take on #73719. Lots of good-looking improvements. A few big regressions that I haven't investigated yet. Also some opportunities for follow-up, e.g. if we have
cc @dotnet/jit-contrib
src/coreclr/jit/morph.cpp
Outdated
if ((oldTree != nullptr) && oldTree->OperIsBlkOp())
{
    optAssertionGen(oldTree);
}
I don't think we can really use oldTree here... It should really be a DEBUGARG; the contract is that it can be reshaped/destroyed/reused in arbitrary ways.
All we really need to know is the identity of the local(s) involved, so I suppose we can find some other way to convey this information.
Would calling this from somewhere inside fgMorphCopy/InitBlock not work? We are already doing this "manual" assertion generation for field assigns there.
It needs to survive fgKillDependentAssertions somehow.
Do it after the killing?
Hmm, I am missing something here.
By the time we get to fgMorphTreeDone, for a decomposed store, we'll have a COMMA chain, which will not be recognized as a store (and thus will not kill the assertions).
What I was thinking as "the kill point" is this code:
runtime/src/coreclr/jit/morphblock.cpp
Lines 255 to 259 in b06ab8c
// Kill everything about m_dstLclNum (and its field locals)
if (m_comp->optLocalAssertionProp && (m_comp->optAssertionCount > 0))
{
    m_comp->fgKillDependentAssertions(m_dstLclNum DEBUGARG(m_asg));
}
Where we still have the original assign.
Edit: this does still leave the wrinkle of killing the assertions of non-decomposed stores, but that can, presumably, be dealt with by moving all of the struct copy assertion generation out of fgMorphTreeDone.
That might work. It scatters the gen/kill logic around a bit more, but I suppose we've already crossed that bridge.
I certainly want to avoid having to inspect the expanded tree to figure out what to kill/preserve/gen.
Strange that we kill assertions for init block but not for copy block.
Ah, I guess we inherit that.
Tried a variant of what's discussed above: putting the assertion gen over in MorphInitBlockHelper::Morph after the dest is prepared but before the copy is morphed.
This misses some cases where we turn the copy into a single assignment to its field.
;;; current PR ("hacky revival of old tree")
Morphing BB01 of 'SocketConnection:DisposeAsync():ValueTask:this'
fgMorphTree BB01, STMT00008 (before)
[000037] IA--------- * ASG struct (init)
[000034] D------N--- +--* LCL_VAR struct<System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder, 8>(P) V04 tmp1
+--* ref V04.m_task (offs=0x00) -> V09 tmp6
[000036] ----------- \--* CNS_INT int 0
Notify VM instruction set (SSE2) must be supported.
MorphInitBlock:
MorphBlock for dst tree, before:
[000034] D----+-N--- * LCL_VAR struct<System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder, 8>(P) V04 tmp1
* ref V04.m_task (offs=0x00) -> V09 tmp6
MorphBlock after:
[000034] D----+-N--- * LCL_VAR struct<System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder, 8>(P) V04 tmp1
* ref V04.m_task (offs=0x00) -> V09 tmp6
PrepareDst for [000034] have found a local var V04.
using field by field initialization.
GenTreeNode creates assertion:
[000047] -A--------- * ASG ref
In BB01 New Local Constant Assertion: V09 == null, index = #01
MorphInitBlock (after):
[000047] -A---+----- * ASG ref
[000045] D------N--- +--* LCL_VAR ref V09 tmp6
[000046] ----------- \--* CNS_INT ref null
The assignment [000047] using V09 removes: Constant Assertion: V09 == null
GenTreeNode creates assertion:
[000047] -A---+----- * ASG ref
In BB01 New Local Constant Assertion: V09 == null, index = #01
GenTreeNode creates assertion:
[000037] IA--------- * ASG struct (init)
In BB01 New Local Constant Assertion: V04 == ZeroObj, index = #02
compared to
;;; alternate version (revive during morph block init)
Morphing BB01 of 'SocketConnection:DisposeAsync():ValueTask:this'
fgMorphTree BB01, STMT00008 (before)
[000037] IA--------- * ASG struct (init)
[000034] D------N--- +--* LCL_VAR struct<System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder, 8>(P) V04 tmp1
+--* ref V04.m_task (offs=0x00) -> V09 tmp6
[000036] ----------- \--* CNS_INT int 0
Notify VM instruction set (SSE2) must be supported.
MorphInitBlock:
MorphBlock for dst tree, before:
[000034] D----+-N--- * LCL_VAR struct<System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder, 8>(P) V04 tmp1
* ref V04.m_task (offs=0x00) -> V09 tmp6
MorphBlock after:
[000034] D----+-N--- * LCL_VAR struct<System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder, 8>(P) V04 tmp1
* ref V04.m_task (offs=0x00) -> V09 tmp6
PrepareDst for [000034] have found a local var V04.
GenTreeNode creates assertion:
[000037] IA--------- * ASG struct (init)
In BB01 New Local Constant Assertion: V04 == ZeroObj, index = #01
using field by field initialization.
GenTreeNode creates assertion:
[000047] -A--------- * ASG ref
In BB01 New Local Constant Assertion: V09 == null, index = #02
MorphInitBlock (after):
[000047] -A---+----- * ASG ref
[000045] D------N--- +--* LCL_VAR ref V09 tmp6
[000046] ----------- \--* CNS_INT ref null
The assignment [000047] using V09 removes: Constant Assertion: V09 == null
The assignment [000047] using V04 removes: Constant Assertion: V04 == ZeroObj
GenTreeNode creates assertion:
[000047] -A---+----- * ASG ref
In BB01 New Local Constant Assertion: V09 == null, index = #01
I could either try and recognize those cases in fgMorphTreeDone and avoid the kill, or update optAssertionGen to understand that assertions about an entire field of a struct also apply to the struct, so that the assertion gets killed but then gets revived again.
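The difference between the two dumps above can be modeled with a small toy. This is only a sketch under simplifying assumptions: `ZeroTracker` and `zeroFactSurvives` are hypothetical names, a single "struct is all zero" fact stands in for the `V04 == ZeroObj` assertion, and the `earlyGenDone` flag stands in for remembering that gen/kill was already handled before the init was expanded into field stores.

```cpp
#include <cassert>
#include <set>

// Toy model: the set of struct locals currently known to be all-zero.
struct ZeroTracker
{
    std::set<int> zeroStructs;

    void genZero(int structLcl) { zeroStructs.insert(structLcl); }

    // A store to a promoted field is also treated as a def of its parent
    // struct, so it removes any fact about the parent.
    void killForFieldStore(int parentLcl) { zeroStructs.erase(parentLcl); }
};

// Models "V04 = 0" expanded into a single field store "V09 = null".
// Returns whether the V04 zero fact is still present afterwards.
bool zeroFactSurvives(bool earlyGenDone)
{
    const int   V04 = 4;
    ZeroTracker tracker;

    tracker.genZero(V04); // generated from the unexpanded init

    if (!earlyGenDone)
    {
        // The post-morph kill sees the field store and drops the parent fact.
        tracker.killForFieldStore(V04);
    }
    return tracker.zeroStructs.count(V04) != 0;
}
```

In this model the fact survives only when the later kill is skipped, which is the behavior the first option (recognizing these cases in fgMorphTreeDone) would aim for.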
//
if (tree->OperIs(GT_LCL_FLD))
{
    if (copyVarDsc->IsEnregisterableLcl() || copyVarDsc->lvPromotedStruct())
Nit: one almost never wants to call lvPromotedStruct() instead of just checking lvPromoted. If we start having non-struct promoted variables in the future, we'll want this behavior for them as well.
Orthogonally, this is a bit conservative. If we already know copyVarDsc is not going to be enregistered, it should be ok to propagate it, so lvaGetPromotionType(copyVarDsc) == PROMOTION_TYPE_INDEPENDENT should in theory be a better check.
I saw quite a few more regressions with promoted structs. Will have to look more closely.
> I saw quite a few more regressions with promoted structs
I suppose that addresses the PROMOTION_TYPE_INDEPENDENT idea. Guessing these could have been from SSA-based optimization not working as well (since promoted structs are excluded from all of them).
Also wondering if it's worth checking for transitive copies. For scalar types it matters less -- one hopes LSRA, say, can clean up all the residual copies. But for structs it could be a bigger deal. Edit: I suppose this should just fall out, if we have say a chain of struct assigns and then a partial use: we should produce and then But perhaps we could see cases where we think forwarding
SPMI: despite the overall code size reduction, throughput increases, which is a little puzzling.
Interestingly we'll CSE LCL_FLD with zero VNs but will not constant prop zeros into LCL_FLDs. So will look into enabling constant prop I guess. Still not settled on how to actually plumb through the info.
src/coreclr/jit/assertionprop.cpp
Outdated
if (varTypeIsStruct(tree))
{
    tree->BashToZeroConst(TYP_INT);
}
else
{
    tree->BashToZeroConst(tree->TypeGet());
}
Isn't the !varTypeIsStruct branch dead code?
(We have an assert to that effect at the top)
Yep.
src/coreclr/jit/assertionprop.cpp
Outdated
// TODO: create proper simd zero constant
//
if (varTypeIsSIMD(tree))
{
    return false;
}
We should skip creating ZeroObj assertions for varTypeIsSIMD. ZeroObj is really meant for TYP_STRUCT only.
case GT_OBJ:
case GT_BLK:
{
    GenTree* const addr = op2->AsIndir()->Addr();

    if (addr->OperIs(GT_ADDR))
    {
        GenTree* const base = addr->AsOp()->gtOp1;

        if (base->OperIs(GT_LCL_VAR))
        {
            // layout compat?
            op2 = base;
            goto IS_COPY;
        }
    }

    goto DONE_ASSERTION;
}
Is this still necessary after #71584?
(If it is, it should have a TODO-ADDR: delete once <...> comment)
Will check once I pick up those changes.
There are still some cases where it helps -- about 700 all told across SPMI
;;; String:Replace(ushort,ushort):String:this (MethodHash=5aa7fa78) #16249 in ASP.NET
fgMorphTree BB15, STMT00064 (before)
[000326] -A--------- * ASG simd32 (copy)
[000324] D------N--- +--* LCL_VAR simd32<System.Numerics.Vector`1[System.UInt16]> V29 tmp12
[000208] n---------- \--* OBJ simd32<System.Numerics.Vector`1[System.UInt16]>
[000207] ----------- \--* ADDR byref
[000203] -------N--- \--* LCL_VAR simd32<System.Numerics.Vector`1[System.UInt16]> V14 loc11
;; (with above)
GenTreeNode creates assertion:
[000326] -A--------- * ASG simd32 (copy)
In BB15 New Local Copy Assertion: V29 == V14, index = #01
...
Assertion prop in BB15:
Copy Assertion: V29 == V14, index = #01
[000320] ----------- * LCL_VAR simd32<System.Numerics.Vector`1[System.UInt16]> V14 loc11
;; (without above)
no assertion, no copy prop
SIMD types, as one would suspect...
So the comment should be "TODO-ADDR: delete once local morph folds SIMD-typed indirections.", and we definitely need to check for layout compatibility.
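To make the "check for layout compatibility" point concrete, here is a minimal sketch of recognizing the OBJ(ADDR(LCL_VAR)) shape and refusing the copy assertion when the indirection does not cover the local exactly. The `Node` and `Oper` types and `localReadByIndir` are hypothetical, and comparing sizes is only an assumed stand-in for whatever layout check the JIT would actually use.

```cpp
#include <cassert>

// Hypothetical, simplified node shapes; the JIT's GenTree is far richer.
enum class Oper { LclVar, Addr, Obj };

struct Node
{
    Oper        oper;
    const Node* op1;    // address operand (Obj) or base (Addr); nullptr for LclVar
    int         lclNum; // meaningful for LclVar only
    unsigned    size;   // bytes read (Obj) or bytes occupied (LclVar)
};

// Returns the local read by OBJ(ADDR(LCL_VAR)) when the sizes agree,
// or -1 when the shape or the size does not match.
int localReadByIndir(const Node& obj)
{
    if (obj.oper != Oper::Obj || obj.op1 == nullptr || obj.op1->oper != Oper::Addr)
    {
        return -1;
    }
    const Node* base = obj.op1->op1;
    if (base == nullptr || base->oper != Oper::LclVar)
    {
        return -1;
    }
    // Assumed stand-in for a layout compatibility check.
    if (base->size != obj.size)
    {
        return -1;
    }
    return base->lclNum;
}
```

A 32-byte read of a 32-byte local yields that local as the copy source; a narrower read of the same local is rejected rather than treated as a full copy.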
src/coreclr/jit/morph.cpp
Outdated
if (tree->OperIsConst())
{
    goto DONE;
    assert("ERROR: Did not morph this node!");
- assert("ERROR: Did not morph this node!");
+ assert(!"ERROR: Did not morph this node!");
Is it intentional that we no longer check for double morphing?
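The reason for the suggested `!` is that a string literal converts to a non-null pointer, which is always true, so `assert("message")` can never fire. The sketch below shows the two conditions side by side; `kMsg`, `conditionWithoutNegation`, and `conditionWithNegation` are names made up for this example.

```cpp
#include <cassert>

// The message text is taken from the snippet under review.
const char* const kMsg = "ERROR: Did not morph this node!";

// What assert("...") evaluates: a non-null pointer, i.e. always true.
bool conditionWithoutNegation()
{
    return static_cast<bool>(kMsg);
}

// What assert(!"...") evaluates: the negation, i.e. always false,
// so the assert fires whenever control reaches it.
bool conditionWithNegation()
{
    return !kMsg;
}
```

The negated form is a common idiom for "this point must not be reached" that still shows the message in the assertion failure text.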
Ah, thanks. That checking depended on old tree so I'd have to pass something like that down.
Seems like an odd contract though.
Still producing SIMD ZEROOBJ but no longer looking for them.
@dotnet/jit-contrib think this is getting close. I want to spend some more time looking at regressions since some of them are sizeable. So far they fall into a few camps:
;; before
mov rdx, gword ptr [rbp-40H]
; gcrRegs +[rdx]
mov ecx, dword ptr [rbp-38H]
movsx rax, word ptr [rbp-34H]
mov gword ptr [rbp-60H], rdx
mov dword ptr [rbp-58H], ecx
mov word ptr [rbp-54H], ax
mov byte ptr [rbp-52H], 0
;; size=77 bbWeight=4 PerfScore 97.00
G_M43996_IG07: ; , nogc, extend
vmovdqu xmm0, xmmword ptr [rbp-60H]
vmovdqu xmmword ptr [rbp-30H], xmm0
;; size=10 bbWeight=4 PerfScore 16.00
G_M43996_IG08: ; , isz, extend
mov rdi, gword ptr [rbp-30H]
;; after
mov rdx, gword ptr [rbp-40H]
; gcrRegs +[rdx]
mov ecx, dword ptr [rbp-38H]
movsx rax, word ptr [rbp-34H]
mov gword ptr [rbp-70H], rdx
mov dword ptr [rbp-68H], ecx
mov word ptr [rbp-64H], ax
mov byte ptr [rbp-62H], 0
mov gword ptr [rbp-50H], rdx ;; dead copy
mov dword ptr [rbp-48H], ecx
mov word ptr [rbp-44H], ax
mov byte ptr [rbp-42H], 0
mov gword ptr [rbp-30H], rdx
mov dword ptr [rbp-28H], ecx
mov word ptr [rbp-24H], ax
mov byte ptr [rbp-22H], 0
mov rdi, gword ptr [rbp-30H]
(plus another identical case later)
I suspect that second copy is perhaps partially dead but perhaps kept alive by the same issue with I wonder how hard it would be to model this in liveness.
@kunalspathak can you review?
Sure, will do it later today.
kunalspathak
left a comment
Added some questions specifically around optAssertionPropDone. I didn't quite understand it. Can you please explain the design around it?
// Block ops handle assertion kill/gen specially.
// See PrepareDst and PropagateAssertions
//
if (optAssertionPropDone != nullptr)
So here we say that we are doing struct1 = struct2 and that it is a candidate for copy prop?
No, all that happens over in optAssertionProp_LclVar and the new optAssertionProp_LclFld, before the tree is morphed.
This check is asking whether or not we've already killed and generated assertions from the post-morph tree.
Global morph runs over each basic block in order, and in each block, over each statement in order, and in each statement, each node in order.
During global morph we enable "local" assertion prop; at the start of each block, we clear out the assertion table. Before morphing each node, we try and apply the currently active assertions to the tree. Then when we're done morphing the node, we kill off any assertions that are no longer true and try and generate new assertions. This is more or less standard forward data flow on assertions.
The set of assertions we track is pretty limited: var = constant; var is nonnull; var = other var.
The first part of this PR is to enable assertion prop into LCL_FLD uses. It was inspired by code I saw in #66776 where we copied a struct and then just read one field of the copy.
The approach outlined above works pretty well for most nodes, but does not work well for struct inits and struct copies, because the code in assertion prop can't always recognize the zeroings and copies once they've been expanded. In particular, when a struct copy gets expanded as a field-by-field copy, the sequence of stores to the LCL_FLDs of the dest would kill any assertion about the dest.
The second part of this PR tries to improve things by generating assertions for struct inits and copies after the source and dest are processed but before the assign node itself is processed (note we were already doing the kills early on). That way we know what assertions to gen.
We need to remember that we've done this "early assertion gen / kill" for these particular trees so we don't do kills again later on in
That makes sense. Thank you for the explanation.
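The forward data flow described in the explanation above can be sketched as a small gen/kill table. This is a toy model only: `LocalAssertionTable`, `Assertion`, and `Kind` are hypothetical names, the three kinds mirror the "var = constant; var is nonnull; var = other var" list, and the real assertion table in the JIT is considerably more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Kind { ConstEq, NonNull, CopyEq };

struct Assertion
{
    Kind kind;
    int  lcl; // local the fact is about
    int  arg; // constant (ConstEq) or source local (CopyEq); unused for NonNull
};

class LocalAssertionTable
{
public:
    // Called at the start of each basic block.
    void resetForBlock() { m_table.clear(); }

    // Called after morphing a node that establishes a new fact.
    void gen(const Assertion& a) { m_table.push_back(a); }

    // Called after a store to `lcl`: drop every fact that mentions it,
    // either as the subject or as the source of a copy.
    void killDependent(int lcl)
    {
        m_table.erase(std::remove_if(m_table.begin(), m_table.end(),
                                     [lcl](const Assertion& a) {
                                         return (a.lcl == lcl) ||
                                                (a.kind == Kind::CopyEq && a.arg == lcl);
                                     }),
                      m_table.end());
    }

    // Returns the source local of an active copy assertion about `lcl`, or -1.
    int copySourceOf(int lcl) const
    {
        for (const Assertion& a : m_table)
        {
            if (a.kind == Kind::CopyEq && a.lcl == lcl)
            {
                return a.arg;
            }
        }
        return -1;
    }

    std::size_t count() const { return m_table.size(); }

private:
    std::vector<Assertion> m_table;
};
```

In this model a store to either side of `V29 == V14` removes the copy assertion, which is why a field-by-field expansion that stores into the destination ends up discarding the very fact the unexpanded copy would have generated.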
In dotnet#74384 I modified morph to generate assertions before a block op was morphed, but after the source and dest were morphed. This misses generating some assertions that arise once the expanded form is known. We now re-run assertion gen on the expanded tree, when we do an `OneAsgBlock` expansion. Other cases may also prove profitable. Closes dotnet#75229.

