Fix extra register-dependency on mem-form of vcvtsd/s2ss #17560
|
@CarolEidt @AndyAyersMS PTAL |
|
@fiigii - thanks!
So, could this have actually led to incorrect code, depending on the contents of XMM0? Or is it the case that, since this is effectively producing a scalar result, it won't matter because those upper bits won't be used? We should consider this issue when addressing #14523, if not sooner. |
Right, the second register just provides the upper-bits that are never used by scalar programs. |
|
test OSX10.12 x64 Checked Innerloop Build and Test |
|
Can we merge this PR? |
|
https://github.com/dotnet/coreclr/issues/17544 is a 2.1 issue, but it's now closed via this PR. I assume this change will be brought through ask mode for 2.1? |
I am trying to figure out now whether it meets the bar. It would be good to see it in. |
|
I would think that if it indeed also fixes #17603, then it's a good candidate for 2.1. |
Fix https://github.com/dotnet/coreclr/issues/17544
Originally, #14274 attempted to optimize vcvtsd/s2ss/d by keeping the second and third registers the same to avoid an unnecessary register dependency. However, that does not work with the containment form. When the last operand is a memory address, RyuJIT cannot determine the second register and just generates the default value of the VEX.vvvv field (XMM0).
This PR reverts the change from #14274 to fix this performance regression, but we need to find a better solution for the containment form in the future.
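For readers unfamiliar with the encoding, here is a rough sketch of how the VEX.vvvv field selects the second source register, and why leaving it at its default names XMM0. It is only an illustration in Python, not RyuJIT emitter code: the helper names (`vex2_second_byte`, `decode_vvvv`) are hypothetical, and it assumes the 2-byte (0xC5) VEX form with the F2 prefix, as used by the scalar double-to-single conversion.

```python
# Illustration only: hypothetical helpers, not part of RyuJIT.
# Layout of the second byte of a 2-byte (0xC5) VEX prefix:
#   bit 7     : R, stored inverted
#   bits 6..3 : vvvv, stored inverted (ones' complement of the register number)
#   bit 2     : L (vector length, ignored by scalar instructions)
#   bits 1..0 : pp (implied legacy prefix: 00 none, 01 66, 10 F3, 11 F2)

PP_F2 = 0b11  # implied F2 prefix


def vex2_second_byte(vvvv_reg: int, pp: int, r_ext: bool = False, length: int = 0) -> int:
    """Build the second byte of a 2-byte VEX prefix."""
    if not 0 <= vvvv_reg <= 15:
        raise ValueError("register number must be in 0..15")
    r_bit = 0 if r_ext else 1          # inverted: 1 means "no extension"
    inv_vvvv = (~vvvv_reg) & 0xF       # inverted register number
    return (r_bit << 7) | (inv_vvvv << 3) | ((length & 1) << 2) | (pp & 0b11)


def decode_vvvv(second_byte: int) -> int:
    """Recover the register number named by vvvv."""
    return (~(second_byte >> 3)) & 0xF


# Second source is XMM1, the same register as the destination in this example.
same_reg = vex2_second_byte(vvvv_reg=1, pp=PP_F2)

# vvvv left as all ones: that bit pattern names XMM0.
default = vex2_second_byte(vvvv_reg=0, pp=PP_F2)

print(hex(same_reg), decode_vvvv(same_reg))
print(hex(default), decode_vvvv(default))
```

Because vvvv is stored inverted, an all-ones field decodes to register 0, so an instruction that reads vvvv ends up depending on XMM0 even though no code asked for it.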