Fix lcl fld addr. #39424
PTAL @CarolEidt @dotnet/jit-contrib
AndyAyersMS
left a comment
Seems like the same pattern of code added to quite a number of places. Should we make this into a new helper method?
I like the idea. I was thinking about adding can be extracted (but will have to use out params to pass results). I can give it a try and see how it looks.
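A rough sketch of what such an extracted helper with out params could look like. This is a simplified toy model, not the JIT's real `GenTree` types: `Oper`, `Node`, and `TryGetContainedLocalAddr` are all hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for JIT node kinds (not the real GenTree types).
enum class Oper { LclVarAddr, LclFldAddr, Lea, Other };

struct Node
{
    Oper     oper;
    unsigned lclNum;    // local variable number (meaningful for local address nodes only)
    unsigned lclOffs;   // field offset within the local (used for LclFldAddr)
    bool     contained; // true when the parent instruction encodes this node directly
};

// Returns true and fills the out params when `addr` is a contained local address node
// (a whole-local address or a field address). Otherwise returns false and leaves the
// out params untouched.
bool TryGetContainedLocalAddr(const Node& addr, unsigned* lclNum, unsigned* offs)
{
    if (!addr.contained)
    {
        return false;
    }
    if ((addr.oper != Oper::LclVarAddr) && (addr.oper != Oper::LclFldAddr))
    {
        return false;
    }
    *lclNum = addr.lclNum;
    *offs   = (addr.oper == Oper::LclFldAddr) ? addr.lclOffs : 0;
    return true;
}
```

Each call site that currently repeats the pattern would then reduce to one call plus a branch on the returned flag, at the cost of the two out params mentioned above.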
CarolEidt
left a comment
Overall looks good, with a couple of questions & suggestions. I also like Andy's suggestion to extract the repeated code, if possible.
We are asserting that addrNode is not contained, and yet calling genConsumeAddress which handles a contained LEA. Is that so that it will handle a contained address even if we don't expect it? If so, a comment would be good. If not, it should be calling genConsumeReg().
Added a comment about that.
src/coreclr/src/jit/codegenxarch.cpp
Outdated
A comment here would be good - i.e. that this path is unexpected and untested, but should be correct in case it is encountered in a non-checked JIT.
Thanks, I am planning to write a test that hits this path; if I can't construct one, I will add a comment.
Added the repro cases and deleted the asserts.
src/coreclr/src/jit/emitxarch.cpp
Outdated
I believe that it could potentially be the case that we have no base or index, but in that case we should have a handle for the offset.
I will add a check `|| (indir->Offset() != 0)`. I think that case is unreachable (based on my analysis of `GenTreeIndir::Offset()`), but it is safer for now.
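To illustrate the shape of that extended check, here is a minimal sketch on a toy model of a memory operand. `MemOperand` and `HasAddressComponent` are hypothetical names; the real assert in `emitxarch.cpp` works on `GenTreeIndir` and may differ in detail.

```cpp
#include <cassert>

// Hypothetical model of a memory operand: optional base and index registers plus a
// constant offset. Not the real GenTreeIndir.
struct MemOperand
{
    bool hasBase;
    bool hasIndex;
    long offset;
};

// True when the operand has at least one component to form an address from.
// The last clause mirrors the `|| (indir->Offset() != 0)` check discussed above.
bool HasAddressComponent(const MemOperand& m)
{
    return m.hasBase || m.hasIndex || (m.offset != 0);
}
```

With this shape, an operand that has neither a base nor an index still passes the check as long as it carries a non-zero offset.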
ldloca.s V_0
ldflda float64 Runtime_39424/Container::Double
volatile. ldind.r8
were generating what we needed. I have disabled CSE, added more than 4 fields to avoid struct promotion and marked indir as volatile to avoid the optimizations that were hiding the issue.
The PR has been updated and is ready for the next round of review.
ping @dotnet/jit-contrib
AndyAyersMS
left a comment
Changes look good, though I don't know how to be confident that all the places that needed to be changed have been changed.
I estimate the risk as low. We had had support for contained I think the new asserts in
* Add a repro test.
* Small ref.
  Combine long chains of `addr->OperGet() == GT_* ||` with `GT_LCL_VAR_ADDR` using `OperIs` to simplify future changes.
* Add an assert that would fire in the repro test.
  `emitter::emitHandleMemOp` has special logic for contained `memBase`, but the last block does not expect a contained node. A contained node doesn't produce a register, so it is not correct to use the result of `GetRegNum()` from a contained node as a valid register. However, adding an assert to `GetRegNum()` that `!this->isContained` is a bigger task that is out of scope for this PR.
* Assert that `LCL_FLD_ADDR` is not contained in `genPutArgStk(Split)`.
  We have contained `LCL_VAR_ADDR` support there, but make sure that a contained `LCL_FLD_ADDR` can't reach it.
* Contain `GT_LCL_FLD_ADDR` under HW_INTRINSIC.
  This is an additional optimization that makes future changes simpler.
* Add contained checks.
  In all these places we expect `LCL_VAR_ADDR` to be contained. If we had gotten a `LCL_VAR_ADDR` that is not contained, we would have instantiated `LCL_VAR_ADDR` twice: in the register and in the parent instruction. The register value would have been unused.
* Support `FLD_ADDR` where `LCL_ADDR` is supported.
  However, fire an assert if we think that this path is unreachable for now.
* Delete asserts in the reachable blocks.
  We have coverage for these asserts in the following tests:
  hwintrinsic 478: Ssse3_ro
  instr 11645: Runtime_39403
  instr 1028: Aes_ro
  hwintrinsic 716: pmi of Microsoft.Diagnostics.Tracing.TraceEvent
* Review response.
* Add repro cases.
  Delete the remaining `assert(!"don't expect GT_LCL_FLD_ADDR");`.
* Use `GetLclOffs` from `LclVarCommon`.
* missed file.
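The `OperIs` refactoring from the "Small ref." commit can be illustrated with a small sketch. The `Node` type and the reduced `Oper` enum below are toy stand-ins written for this example (with `GT_OTHER` as a made-up placeholder), not the JIT's actual `GenTree` definitions.

```cpp
#include <cassert>

// Toy subset of operator kinds; GT_OTHER is a placeholder for "anything else".
enum Oper { GT_LCL_VAR_ADDR, GT_LCL_FLD_ADDR, GT_LEA, GT_OTHER };

struct Node
{
    Oper oper;

    Oper OperGet() const { return oper; }

    // Base case: compare against a single operator.
    bool OperIs(Oper o) const { return oper == o; }

    // Variadic case: true if the node's operator matches any of the listed ones.
    template <typename... T>
    bool OperIs(Oper o, T... rest) const
    {
        return (oper == o) || OperIs(rest...);
    }
};

// Before: a chain of comparisons that has to be extended by hand for each new operator.
bool IsLocalAddrChain(const Node& n)
{
    return (n.OperGet() == GT_LCL_VAR_ADDR) || (n.OperGet() == GT_LCL_FLD_ADDR);
}

// After: one call listing the accepted operators.
bool IsLocalAddr(const Node& n)
{
    return n.OperIs(GT_LCL_VAR_ADDR, GT_LCL_FLD_ADDR);
}
```

Both functions accept the same nodes; the second form only makes it cheaper to add another operator to the list later.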
This PR fixes #39403 introduced by #38316 (first two commits).
The safest solution would be to revert eefeb7e and ceec9c8 but it will lead to significant regressions, like
so after consulting with Carol I decided to fix the optimization rather than disable it.
The two bad things about the issue were:
so the main goal in the fix was to add as many relevant asserts as possible.
I recommend reviewing by commits.
Changes:

aaf2e99: Add a repro test.

69d33fb: Small ref.
Combine long chains of `addr->OperGet() == GT_* ||` with `GT_LCL_VAR_ADDR` using `OperIs` to simplify future changes.

f8d8cac: Add an assert that would fire in the repro test.
`emitter::emitHandleMemOp` has special logic for contained `memBase` but the last else does not expect a contained node. A contained node doesn't produce a register, so it is not correct to use the result of `GetRegNum()` from a contained node as a valid register.
However, adding an assert to `GetRegNum()` that `!this->isContained` is a bigger task that is out of scope for this PR.

578d0ed: Assert that `LCL_FLD_ADDR` is not contained in `genPutArgStk(Split)`.
We have contained `LCL_VAR_ADDR` support there, but make sure that a contained `LCL_FLD_ADDR` can't reach it.

cd9d0b3: Contain `GT_LCL_FLD_ADDR` under HW_INTRINSIC.
This is an additional optimization that makes future changes simpler.

7d68d3f: Add contained checks.
In all these places we expect `LCL_VAR_ADDR` to be contained.
If we had gotten a `LCL_VAR_ADDR` that is not contained, we would have instantiated `LCL_VAR_ADDR` twice: in the register and in the parent instruction.
The register value would have been unused.

1865d29: Support `FLD_ADDR` where `LCL_ADDR` is supported.
However, fire an assert if we think that this path is unreachable for now.

69ecffc: Delete asserts in the reachable blocks.
We have coverage for these asserts in the following tests:
hwintrinsic 478: Ssse3_ro
instr 11645: Runtime_39403
instr 1028: Aes_ro
hwintrinsic 716: pmi of Microsoft.Diagnostics.Tracing.TraceEvent
From HWIntrinsic support we have some improvements (x64 Pmi):
crossgen linuxnonjit:
I am planning to write repro cases for the rest of `assert(!"don't expect GT_LCL_FLD_ADDR")` before I finish this PR.