Fix genCodeForIndexAddr by mikedn · Pull Request #22528 · dotnet/coreclr

mikedn · 2019-02-11T20:52:30Z

This does some weird things - treats the array length as 64 bit when it's in fact 32 bit, fails to zero extend TYP_INT indices, creates new GT_IND/GT_LEA nodes out of thin air.

#20126 has a similar fix for ARM.

Sample generated code for a[i + 2]:

; before
       488B4510             mov      rax, gword ptr [rbp+10H]
       8B5518               mov      edx, dword ptr [rbp+18H]
       83C202               add      edx, 2
; 64 bit compare, typically works correctly due to the zero-extend to 64 bit model
; and because the array length field is followed by 4 bytes of padding that are
; normally 0
       483B5008             cmp      rdx, qword ptr [rax+8]
       730D                 jae      SHORT G_M55886_IG04
; use a TYP_INT value as 64 bit
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

; after
       488B4510             mov      rax, gword ptr [rbp+10H]
       8B5518               mov      edx, dword ptr [rbp+18H]
       83C202               add      edx, 2
       3B5008               cmp      edx, dword ptr [rax+8]
       730F                 jae      SHORT G_M55886_IG04
       8BD2                 mov      edx, edx
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

It's possible to construct contrived examples where the upper 32 bits are not zero: a[(int)checked(longVar + 2)]:

; before
       488B4510             mov      rax, gword ptr [rbp+10H]
       BA02000000           mov      edx, 2
       4863D2               movsxd   rdx, edx
       48035518             add      rdx, qword ptr [rbp+18H]
       7013                 jo       SHORT G_M55886_IG04
; index used as 64 bit, basically the cast to `int` was dropped
       483B5008             cmp      rdx, qword ptr [rax+8]
       7312                 jae      SHORT G_M55886_IG05
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

; after
       488B4510             mov      rax, gword ptr [rbp+10H]
       BA02000000           mov      edx, 2
       4863D2               movsxd   rdx, edx
       48035518             add      rdx, qword ptr [rbp+18H]
       7014                 jo       SHORT G_M55886_IG04
; 32 bit compare
       3B5008               cmp      edx, dword ptr [rax+8]
       7314                 jae      SHORT G_M55886_IG05
; zero extend to 64 bit
       8BD2                 mov      edx, edx
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

mikedn · 2019-02-11T20:59:40Z

FX minopts diff summary:

Total bytes of diff: -3308 (-0.01% of base)
    diff is an improvement.
Top file regressions by size (bytes):
        1019 : System.Private.DataContractSerialization.dasm (0.11% of base)
         929 : System.Data.Common.dasm (0.07% of base)
         421 : Microsoft.VisualBasic.dasm (0.14% of base)
         316 : System.Runtime.Numerics.dasm (0.34% of base)
         269 : Newtonsoft.Json.dasm (0.05% of base)
Top file improvements by size (bytes):
       -7452 : System.Private.CoreLib.dasm (-0.15% of base)
        -459 : NuGet.Frameworks.dasm (-0.40% of base)
        -369 : System.Text.RegularExpressions.dasm (-0.29% of base)
        -325 : System.Linq.Parallel.dasm (-0.04% of base)
        -244 : System.Net.HttpListener.dasm (-0.10% of base)
102 total files with size differences (18 improved, 84 regressed), 27 unchanged.
Top method regressions by size (bytes):
         204 ( 2.93% of base) : System.Private.DataContractSerialization.dasm - ArrayHelper`2:ReadArray(ref,ref,ref,int):ref:this (12 methods)
         141 ( 3.29% of base) : System.Drawing.Primitives.dasm - KnownColorTable:InitColorTable()
         120 ( 1.58% of base) : System.Private.CoreLib.dasm - DefaultBinder:BindToMethod(int,ref,byref,ref,ref,ref,byref):ref:this
          84 ( 1.37% of base) : System.Data.Common.dasm - FunctionNode:EvalFunction(int,ref,ref,int):ref:this
          82 ( 1.81% of base) : System.Private.Xml.dasm - CodeGenerator:.cctor()
Top method improvements by size (bytes):
        -737 (-2.61% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -652 (-4.32% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -614 (-2.08% of base) : System.Private.CoreLib.dasm - Enumerator:MoveNext():bool:this (141 methods)
        -480 (-2.77% of base) : System.Private.Xml.dasm - XsltLoader:.ctor():this
        -471 (-4.28% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
Top method regressions by size (percentage):
          17 ( 8.02% of base) : System.Linq.Parallel.dasm - SortHelper`2:Dispose():this
          15 ( 7.69% of base) : System.Private.DataContractSerialization.dasm - XmlBufferReader:GetChars(int,int,ref):int:this
          15 ( 7.21% of base) : System.Runtime.Numerics.dasm - BigIntegerCalculator:Compare(ref,ref):int
          12 ( 6.06% of base) : Microsoft.CSharp.dasm - MethodTypeInferrer:SetUnknownsToNotDependent():this
          22 ( 5.15% of base) : System.Private.CoreLib.dasm - TaskFactory:CommonCWAllLogic(ref):ref (2 methods)
Top method improvements by size (percentage):
         -20 (-10.58% of base) : System.Private.Xml.dasm - OutputScopeManager:LookupNamespace(ref):ref:this
         -23 (-9.24% of base) : System.Private.CoreLib.dasm - Hashtable:Clear():this
         -21 (-8.57% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetElementSibling(byref,byref,ref,ref):bool
         -19 (-8.23% of base) : System.Private.Xml.dasm - XmlWellFormedWriter:StartElementContent():this
         -32 (-8.10% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
5677 total methods with size differences (917 improved, 4760 regressed), 118640 unchanged.
1 files had text diffs but not size diffs.
System.Memory.dasm had 28 diffs

Improvements come from using 32 bit compares (that may no longer need a REX prefix) and use of 3 operand IMUL. Regressions come from the extra mov used to zero extend the index.

x86 has only improvements from the 3 operand IMUL:

Total bytes of diff: -23355 (-0.11% of base)
    diff is an improvement.
Top file improvements by size (bytes):
      -13704 : System.Private.CoreLib.dasm (-0.35% of base)
       -3228 : System.Private.Xml.dasm (-0.10% of base)
       -1199 : System.Data.Common.dasm (-0.12% of base)
        -678 : System.Linq.Parallel.dasm (-0.11% of base)
        -627 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.03% of base)
26 total files with size differences (26 improved, 0 regressed), 103 unchanged.
Top method improvements by size (bytes):
        -914 (-7.12% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -892 (-3.99% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -542 (-5.83% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
        -490 (-5.79% of base) : System.Private.CoreLib.dasm - Dictionary`2:FindEntry(ref):int:this (10 methods)
        -482 (-1.89% of base) : System.Private.CoreLib.dasm - Dictionary`2:.ctor(ref,ref):this (56 methods)
Top method improvements by size (percentage):
         -26 (-14.61% of base) : System.Private.Xml.dasm - XmlTextWriter:FindPrefix(ref):ref:this
         -23 (-13.69% of base) : System.Private.CoreLib.dasm - EventSource:AnyEventEnabled():bool:this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedExistanceList:MoveToFront(int):this
         -23 (-12.23% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetNonDescendant(byref,byref):bool
783 total methods with size differences (783 improved, 0 regressed), 123511 unchanged

This does some weird things - treats the array length as 64 bit when it's in fact 32 bit, fails to zero extend TYP_INT indices, creates new GT_IND/GT_LEA nodes out of thin air.

erozenfeld · 2019-02-12T07:02:05Z

@dotnet/jit-contrib

CarolEidt

LGTM - thanks!

CarolEidt · 2019-02-12T17:19:19Z

-            getEmitter()->emitInsLoadInd(ins_Load(TYP_INT), EA_4BYTE, arrLen.gtRegNum, &arrLen);
+            tmpReg = node->GetSingleTempReg();
+            getEmitter()->emitIns_R_AR(INS_mov, EA_4BYTE, tmpReg, baseReg, static_cast<int>(node->gtLenOffset));
+            getEmitter()->emitIns_R_R(INS_cmp, EA_8BYTE, indexReg, tmpReg);


Nice. I much prefer not modifying the register-related properties during codegen.

CarolEidt · 2019-02-12T17:45:17Z

test OSX10.12 x64 Checked Innerloop Build and Test

mikedn · 2019-02-12T18:20:31Z

+        // LEA needs 64-bit operands so we need to widen the index if it's TYP_INT.
+        // Since it's TYP_INT the upper 32 bits aren't used so we should be able
+        // to widen in place, without needing a temporary register.
+        getEmitter()->emitIns_R_R(INS_mov, EA_4BYTE, indexReg, indexReg);


I'm a bit concerned about this. Sure, it's a TYP_INT value so the upper 32 bits should not be used.

But what if LSRA in the future does something like assign a 64 bit register variable to a 32 bit use - a[(int)longVarReg] - and this code ends up zeroing out the upper 32 bits of longVarReg.

@CarolEidt Opinions ?

Ah, very good point. I'm believe that today that would be a cast and so we would only reuse the register if the source was a last use (and index would be a cast). To be conservative, one could assert that index is not a register candidate lclVar, or it is a TYP_INT, with a comment that, in future, if we make the cast "implicit" for this case, we need to use a temp if the lclVar is not a last use.

Looks like this kind of register "reinterpretation" already occurs, but in a different manner:

N014 ( 2, 3) [000007] ------------ t7 = * CAST long <- int REG rdx /--* t5 long +--* t7 long N016 ( 10, 11) [000008] ---X-------- t8 = * ADD long REG rax /--* t8 long N018 ( 14, 14) [000010] DA-X-------- * STORE_LCL_VAR long V02 tmp1 NA REG NA N020 ( 3, 4) [000018] -c--G--N---- t18 = CLS_VAR_ADDR byref Hnd=0xe8001558 REG NA ; V02 is long but here it is used as int and of course, the use gets the same register N022 ( 3, 2) [000013] C----------- t13 = LCL_VAR int V02 tmp1 rax REG rax N024 ( 1, 1) [000014] -c---------- t14 = CNS_INT int 3 REG NA /--* t13 int +--* t14 int N026 ( 8, 6) [000016] ------------ t16 = * MUL int REG rax

This happens even in minopts so it looks like I'll need to actually allocate and the temp register when the array index needs to be widened, it cannot be done in place.

Thanks for investigating this. This is rather troubling, but probably too fundamental to be easily changed.

Yes, IMO it would better to not have LCLVAR nodes with a type other than the variable itself but I'm not sure if it's possible to avoid all the CQ fallout from adding the required cast nodes. I would guess it should be possible to make those cast nodes contained, though it's probably a bit complicated.

But what bugs me the most is that this kind of reinterpretation occurs even in minopts mode.

mikedn · 2019-02-15T19:21:29Z

Added a commit to deal with the index widening issue. New x64 FX diff:

Total bytes of diff: 76 (0.00% of base)
    diff is a regression.
Top file regressions by size (bytes):
        1340 : System.Data.Common.dasm (0.11% of base)
        1160 : System.Private.DataContractSerialization.dasm (0.13% of base)
         646 : System.Private.Xml.dasm (0.02% of base)
         503 : Microsoft.VisualBasic.dasm (0.17% of base)
         371 : System.Runtime.Numerics.dasm (0.40% of base)
Top file improvements by size (bytes):
       -6525 : System.Private.CoreLib.dasm (-0.13% of base)
        -448 : NuGet.Frameworks.dasm (-0.39% of base)
        -268 : System.Text.RegularExpressions.dasm (-0.21% of base)
        -230 : System.Linq.Parallel.dasm (-0.03% of base)
        -206 : System.Net.HttpListener.dasm (-0.08% of base)
101 total files with size differences (15 improved, 86 regressed), 28 unchanged.
Top method regressions by size (bytes):
         204 ( 2.93% of base) : System.Private.DataContractSerialization.dasm - ArrayHelper`2:ReadArray(ref,ref,ref,int):ref:this (12 methods)
         141 ( 3.29% of base) : System.Drawing.Primitives.dasm - KnownColorTable:InitColorTable()
         138 ( 1.82% of base) : System.Private.CoreLib.dasm - DefaultBinder:BindToMethod(int,ref,byref,ref,ref,ref,byref):ref:this
          89 ( 0.64% of base) : System.Reflection.Metadata.dasm - MetadataReader:InitializeTableReaders(struct,ubyte,ref,ref):this
          84 ( 3.78% of base) : Microsoft.VisualBasic.dasm - IDOUtils:CreateFuncCallSiteAndInvoke(ref,ref,ref):ref
Top method improvements by size (bytes):
        -703 (-2.49% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -626 (-4.15% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -610 (-2.07% of base) : System.Private.CoreLib.dasm - Enumerator:MoveNext():bool:this (141 methods)
        -480 (-2.77% of base) : System.Private.Xml.dasm - XsltLoader:.ctor():this
        -447 (-4.06% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
Top method regressions by size (percentage):
          20 ( 9.62% of base) : System.Runtime.Numerics.dasm - BigIntegerCalculator:Compare(ref,ref):int
          17 ( 8.02% of base) : System.Linq.Parallel.dasm - SortHelper`2:Dispose():this
          15 ( 7.69% of base) : System.Private.DataContractSerialization.dasm - XmlBufferReader:GetChars(int,int,ref):int:this
          12 ( 6.06% of base) : Microsoft.CSharp.dasm - MethodTypeInferrer:SetUnknownsToNotDependent():this
          18 ( 5.92% of base) : System.Private.Xml.dasm - XmlSqlBinaryReader:GetInt64(int):long:this
Top method improvements by size (percentage):
         -20 (-10.58% of base) : System.Private.Xml.dasm - OutputScopeManager:LookupNamespace(ref):ref:this
         -23 (-9.24% of base) : System.Private.CoreLib.dasm - Hashtable:Clear():this
         -20 (-8.16% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetElementSibling(byref,byref,ref,ref):bool
         -32 (-8.10% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
         -32 (-8.10% of base) : Microsoft.VisualBasic.dasm - FixedExistanceList:MoveToFront(int):this
5678 total methods with size differences (914 improved, 4764 regressed), 118639 unchanged.
2 files had text diffs but not size diffs.
NuGet.Common.dasm had 51 diffs
System.Memory.dasm had 36 diffs

Worse than before. Oh well, it's minopts code.

mikedn · 2019-02-15T19:32:05Z

x86 FX diff improved:

Total bytes of diff: -24349 (-0.11% of base)
    diff is an improvement.
Top file improvements by size (bytes):
      -13946 : System.Private.CoreLib.dasm (-0.35% of base)
       -3361 : System.Private.Xml.dasm (-0.11% of base)
       -1321 : System.Data.Common.dasm (-0.13% of base)
        -693 : System.Linq.Parallel.dasm (-0.11% of base)
        -649 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.03% of base)
37 total files with size differences (37 improved, 0 regressed), 92 unchanged.
Top method improvements by size (bytes):
        -929 (-7.23% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -892 (-3.99% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -542 (-5.83% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
        -490 (-5.79% of base) : System.Private.CoreLib.dasm - Dictionary`2:FindEntry(ref):int:this (10 methods)
        -482 (-1.89% of base) : System.Private.CoreLib.dasm - Dictionary`2:.ctor(ref,ref):this (56 methods)
Top method improvements by size (percentage):
         -26 (-14.61% of base) : System.Private.Xml.dasm - XmlTextWriter:FindPrefix(ref):ref:this
         -23 (-13.69% of base) : System.Private.CoreLib.dasm - EventSource:AnyEventEnabled():bool:this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedExistanceList:MoveToFront(int):this
         -23 (-12.23% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetNonDescendant(byref,byref):bool
946 total methods with size differences (946 improved, 0 regressed), 123348 unchanged.

It turns out that LSRA build code was also messed up, it was allocating a temp register whenever the index type was TYP_I_IMPL. On x86 that would be... all the time.

CarolEidt · 2019-02-15T21:28:48Z

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test

mikedn · 2019-02-16T07:27:07Z

Meh, OSX and java.nio.channels.ClosedChannelException again...

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test

Fix genCodeForIndexAddr Commit migrated from dotnet/coreclr@e3d4b9c

mikedn force-pushed the idx-addr branch from 449b698 to c9f76b8 Compare February 11, 2019 21:18

Fix genCodeForIndexAddr

9b78e13

This does some weird things - treats the array length as 64 bit when it's in fact 32 bit, fails to zero extend TYP_INT indices, creates new GT_IND/GT_LEA nodes out of thin air.

mikedn force-pushed the idx-addr branch from c9f76b8 to 9b78e13 Compare February 12, 2019 06:38

CarolEidt approved these changes Feb 12, 2019

View reviewed changes

mikedn commented Feb 12, 2019

View reviewed changes

Widen index to a temporary register

091bc56

mikedn force-pushed the idx-addr branch from 5c50ee2 to 091bc56 Compare February 15, 2019 19:24

CarolEidt merged commit e3d4b9c into dotnet:master Feb 16, 2019

mikedn deleted the idx-addr branch March 9, 2019 20:33

CarolEidt mentioned this pull request May 28, 2019

Sort out ARM load/store instruction size issues #20126

Merged

BruceForstall added the area-CodeGen label Oct 4, 2019

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022

Merge pull request dotnet/coreclr#22528 from mikedn/idx-addr

b69cb02

Fix genCodeForIndexAddr Commit migrated from dotnet/coreclr@e3d4b9c

Conversation

mikedn commented Feb 11, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mikedn commented Feb 11, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

erozenfeld commented Feb 12, 2019

Uh oh!

CarolEidt left a comment

Choose a reason for hiding this comment

Uh oh!

CarolEidt Feb 12, 2019

Choose a reason for hiding this comment

Uh oh!

CarolEidt commented Feb 12, 2019

Uh oh!

mikedn Feb 12, 2019

Choose a reason for hiding this comment

Uh oh!

CarolEidt Feb 13, 2019

Choose a reason for hiding this comment

Uh oh!

mikedn Feb 14, 2019

Choose a reason for hiding this comment

Uh oh!

CarolEidt Feb 15, 2019

Choose a reason for hiding this comment

Uh oh!

mikedn Feb 15, 2019

Choose a reason for hiding this comment

Uh oh!

mikedn commented Feb 15, 2019

Uh oh!

mikedn commented Feb 15, 2019

Uh oh!

CarolEidt commented Feb 15, 2019

Uh oh!

mikedn commented Feb 16, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mikedn commented Feb 11, 2019 •

edited

Loading

mikedn commented Feb 11, 2019 •

edited

Loading