Refactor integer cast codegen#19296
Conversation
ed42dfc to
9d7ed31
Compare
|
x64 crossgen diffs: x86 crossgen diffs: arm64 crossgen diffs: arm32 crossgen diffs: |
67942d2 to
42a1f69
Compare
|
I wasn't after CQ improvements with this but there are a couple of small ones. There is a trivial change that affects x64 - 64-to-32 bit narrowing casts were generating 64 bit MOVs when 32 bit would be sufficient. 32 bit MOVs do not need a REX prefix an that accounts for most of the 401 bytes improvement seen on x64. Then there are the overflow check codegen changes. Indiscriminate use of test was just making things worse, especially on ARM32: - movw r2, 0xd1ff
- movt r2, 0xd1ff
- tst r1, r2
+ cmp r1, 0x10000On ARM some max values required for overflow checking of small int type casts (32767 and 65535) cannot be encoded in the immediate operand but fortunately they can be replaced with power of 2 values (32768 and 65536) that can be encoded due to ARM's fancy imm shifting facility. This changes apply to xarch as well but there they don't have much of an impact on codesize due immediate operand encoding being far less restrictive. |
|
@mikedn Are you ready to have this reviewed? cc @dotnet/jit-contrib |
|
@BruceForstall Hmm, looks like I was, until it conflicted. |
|
@BruceForstall Conflict fixed |
CarolEidt
left a comment
There was a problem hiding this comment.
This LGTM. Very nice refactoring.
Could we get another review @dotnet/jit-contrib ?
The ARM version of
CodeGen::genIntToIntCastis completely different from the XARCH version, even thoughGT_CASTbehaves identically on both architectures and both ISAs have very similar instructions (e.g.movsx eax, alandsxtb w0, w0). This difference can cause arch specific bugs (e.g. #18063) and makes it more difficult to improve the codegen (e.g. #12676).Lowering::getCastDescriptionattempts to put some of the cast handling logic in a common place but it is only used by ARM codegen and anyway falls short of its goal - it provides a bunch of booleans that codegen has to figure out how to use properly (e.g.else if (castInfo.unsignedSource || castInfo.unsignedDest)).It also turns out that the overflow checking logic is unnecessarily complicated due to a mix of
testandcmpinstructions when most of the time justcmpwill suffice. This is especially bad on ARM32 where this translates into more complicated register requirements.This PR started as a bit of refactoring of
genIntToIntCastbut it turned into a complete rewrite. In the end the ARM and XARCH versions ended up being so similar that I reintroduced a form of "getCastDescription" that I initially removed. The net result is that the list of individual commits and the diff itself is kind of useless, the only thing that's probably worth looking at is the final result. Hopefully this will be compensated by the fact that the rather hairy cast handling logic is in one place and the actual codegen code has become trivial.