Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Fix genCodeForIndexAddr#22528

Merged
CarolEidt merged 2 commits into
dotnet:masterfrom
mikedn:idx-addr
Feb 16, 2019
Merged

Fix genCodeForIndexAddr#22528
CarolEidt merged 2 commits into
dotnet:masterfrom
mikedn:idx-addr

Conversation

@mikedn
Copy link
Copy Markdown

@mikedn mikedn commented Feb 11, 2019

This does some weird things - treats the array length as 64 bit when it's in fact 32 bit, fails to zero extend TYP_INT indices, creates new GT_IND/GT_LEA nodes out of thin air.

#20126 has a similar fix for ARM.

Sample generated code for a[i + 2]:

; before
       488B4510             mov      rax, gword ptr [rbp+10H]
       8B5518               mov      edx, dword ptr [rbp+18H]
       83C202               add      edx, 2
; 64 bit compare, typically works correctly due to the zero-extend to 64 bit model
; and because the array length field is followed by 4 bytes of padding that are
; normally 0
       483B5008             cmp      rdx, qword ptr [rax+8]
       730D                 jae      SHORT G_M55886_IG04
; use a TYP_INT value as 64 bit
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

; after
       488B4510             mov      rax, gword ptr [rbp+10H]
       8B5518               mov      edx, dword ptr [rbp+18H]
       83C202               add      edx, 2
       3B5008               cmp      edx, dword ptr [rax+8]
       730F                 jae      SHORT G_M55886_IG04
       8BD2                 mov      edx, edx
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

It's possible to construct contrived examples where the upper 32 bits are not zero: a[(int)checked(longVar + 2)]:

; before
       488B4510             mov      rax, gword ptr [rbp+10H]
       BA02000000           mov      edx, 2
       4863D2               movsxd   rdx, edx
       48035518             add      rdx, qword ptr [rbp+18H]
       7013                 jo       SHORT G_M55886_IG04
; index used as 64 bit, basically the cast to `int` was dropped
       483B5008             cmp      rdx, qword ptr [rax+8]
       7312                 jae      SHORT G_M55886_IG05
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

; after
       488B4510             mov      rax, gword ptr [rbp+10H]
       BA02000000           mov      edx, 2
       4863D2               movsxd   rdx, edx
       48035518             add      rdx, qword ptr [rbp+18H]
       7014                 jo       SHORT G_M55886_IG04
; 32 bit compare
       3B5008               cmp      edx, dword ptr [rax+8]
       7314                 jae      SHORT G_M55886_IG05
; zero extend to 64 bit
       8BD2                 mov      edx, edx
       488D449010           lea      rax, bword ptr [rax+4*rdx+16]

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Feb 11, 2019

FX minopts diff summary:

Total bytes of diff: -3308 (-0.01% of base)
    diff is an improvement.
Top file regressions by size (bytes):
        1019 : System.Private.DataContractSerialization.dasm (0.11% of base)
         929 : System.Data.Common.dasm (0.07% of base)
         421 : Microsoft.VisualBasic.dasm (0.14% of base)
         316 : System.Runtime.Numerics.dasm (0.34% of base)
         269 : Newtonsoft.Json.dasm (0.05% of base)
Top file improvements by size (bytes):
       -7452 : System.Private.CoreLib.dasm (-0.15% of base)
        -459 : NuGet.Frameworks.dasm (-0.40% of base)
        -369 : System.Text.RegularExpressions.dasm (-0.29% of base)
        -325 : System.Linq.Parallel.dasm (-0.04% of base)
        -244 : System.Net.HttpListener.dasm (-0.10% of base)
102 total files with size differences (18 improved, 84 regressed), 27 unchanged.
Top method regressions by size (bytes):
         204 ( 2.93% of base) : System.Private.DataContractSerialization.dasm - ArrayHelper`2:ReadArray(ref,ref,ref,int):ref:this (12 methods)
         141 ( 3.29% of base) : System.Drawing.Primitives.dasm - KnownColorTable:InitColorTable()
         120 ( 1.58% of base) : System.Private.CoreLib.dasm - DefaultBinder:BindToMethod(int,ref,byref,ref,ref,ref,byref):ref:this
          84 ( 1.37% of base) : System.Data.Common.dasm - FunctionNode:EvalFunction(int,ref,ref,int):ref:this
          82 ( 1.81% of base) : System.Private.Xml.dasm - CodeGenerator:.cctor()
Top method improvements by size (bytes):
        -737 (-2.61% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -652 (-4.32% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -614 (-2.08% of base) : System.Private.CoreLib.dasm - Enumerator:MoveNext():bool:this (141 methods)
        -480 (-2.77% of base) : System.Private.Xml.dasm - XsltLoader:.ctor():this
        -471 (-4.28% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
Top method regressions by size (percentage):
          17 ( 8.02% of base) : System.Linq.Parallel.dasm - SortHelper`2:Dispose():this
          15 ( 7.69% of base) : System.Private.DataContractSerialization.dasm - XmlBufferReader:GetChars(int,int,ref):int:this
          15 ( 7.21% of base) : System.Runtime.Numerics.dasm - BigIntegerCalculator:Compare(ref,ref):int
          12 ( 6.06% of base) : Microsoft.CSharp.dasm - MethodTypeInferrer:SetUnknownsToNotDependent():this
          22 ( 5.15% of base) : System.Private.CoreLib.dasm - TaskFactory:CommonCWAllLogic(ref):ref (2 methods)
Top method improvements by size (percentage):
         -20 (-10.58% of base) : System.Private.Xml.dasm - OutputScopeManager:LookupNamespace(ref):ref:this
         -23 (-9.24% of base) : System.Private.CoreLib.dasm - Hashtable:Clear():this
         -21 (-8.57% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetElementSibling(byref,byref,ref,ref):bool
         -19 (-8.23% of base) : System.Private.Xml.dasm - XmlWellFormedWriter:StartElementContent():this
         -32 (-8.10% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
5677 total methods with size differences (917 improved, 4760 regressed), 118640 unchanged.
1 files had text diffs but not size diffs.
System.Memory.dasm had 28 diffs

Improvements come from using 32 bit compares (that may no longer need a REX prefix) and use of 3 operand IMUL. Regressions come from the extra mov used to zero extend the index.

x86 has only improvements from the 3 operand IMUL:

Total bytes of diff: -23355 (-0.11% of base)
    diff is an improvement.
Top file improvements by size (bytes):
      -13704 : System.Private.CoreLib.dasm (-0.35% of base)
       -3228 : System.Private.Xml.dasm (-0.10% of base)
       -1199 : System.Data.Common.dasm (-0.12% of base)
        -678 : System.Linq.Parallel.dasm (-0.11% of base)
        -627 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.03% of base)
26 total files with size differences (26 improved, 0 regressed), 103 unchanged.
Top method improvements by size (bytes):
        -914 (-7.12% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -892 (-3.99% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -542 (-5.83% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
        -490 (-5.79% of base) : System.Private.CoreLib.dasm - Dictionary`2:FindEntry(ref):int:this (10 methods)
        -482 (-1.89% of base) : System.Private.CoreLib.dasm - Dictionary`2:.ctor(ref,ref):this (56 methods)
Top method improvements by size (percentage):
         -26 (-14.61% of base) : System.Private.Xml.dasm - XmlTextWriter:FindPrefix(ref):ref:this
         -23 (-13.69% of base) : System.Private.CoreLib.dasm - EventSource:AnyEventEnabled():bool:this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedExistanceList:MoveToFront(int):this
         -23 (-12.23% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetNonDescendant(byref,byref):bool
783 total methods with size differences (783 improved, 0 regressed), 123511 unchanged

This does some weird things - treats the array length as 64 bit when it's in fact 32 bit, fails to zero extend TYP_INT indices, creates new GT_IND/GT_LEA nodes out of thin air.
@erozenfeld
Copy link
Copy Markdown
Member

@dotnet/jit-contrib

Copy link
Copy Markdown

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks!

Comment thread src/jit/codegenxarch.cpp
getEmitter()->emitInsLoadInd(ins_Load(TYP_INT), EA_4BYTE, arrLen.gtRegNum, &arrLen);
tmpReg = node->GetSingleTempReg();
getEmitter()->emitIns_R_AR(INS_mov, EA_4BYTE, tmpReg, baseReg, static_cast<int>(node->gtLenOffset));
getEmitter()->emitIns_R_R(INS_cmp, EA_8BYTE, indexReg, tmpReg);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I much prefer not modifying the register-related properties during codegen.

@CarolEidt
Copy link
Copy Markdown

test OSX10.12 x64 Checked Innerloop Build and Test

Comment thread src/jit/codegenxarch.cpp Outdated
// LEA needs 64-bit operands so we need to widen the index if it's TYP_INT.
// Since it's TYP_INT the upper 32 bits aren't used so we should be able
// to widen in place, without needing a temporary register.
getEmitter()->emitIns_R_R(INS_mov, EA_4BYTE, indexReg, indexReg);
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned about this. Sure, it's a TYP_INT value so the upper 32 bits should not be used.

But what if LSRA in the future does something like assign a 64 bit register variable to a 32 bit use - a[(int)longVarReg] - and this code ends up zeroing out the upper 32 bits of longVarReg.

@CarolEidt Opinions ?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, very good point. I'm believe that today that would be a cast and so we would only reuse the register if the source was a last use (and index would be a cast). To be conservative, one could assert that index is not a register candidate lclVar, or it is a TYP_INT, with a comment that, in future, if we make the cast "implicit" for this case, we need to use a temp if the lclVar is not a last use.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this kind of register "reinterpretation" already occurs, but in a different manner:

N014 (  2,  3) [000007] ------------         t7 = *  CAST      long <- int REG rdx
                                                 /--*  t5     long   
                                                 +--*  t7     long   
N016 ( 10, 11) [000008] ---X--------         t8 = *  ADD       long   REG rax
                                                 /--*  t8     long   
N018 ( 14, 14) [000010] DA-X--------              *  STORE_LCL_VAR long   V02 tmp1          NA REG NA
N020 (  3,  4) [000018] -c--G--N----        t18 =    CLS_VAR_ADDR byref  Hnd=0xe8001558 REG NA
; V02 is long but here it is used as int and of course, the use gets the same register
N022 (  3,  2) [000013] C-----------        t13 =    LCL_VAR   int    V02 tmp1          rax REG rax
N024 (  1,  1) [000014] -c----------        t14 =    CNS_INT   int    3 REG NA
                                                 /--*  t13    int    
                                                 +--*  t14    int    
N026 (  8,  6) [000016] ------------        t16 = *  MUL       int    REG rax

This happens even in minopts so it looks like I'll need to actually allocate and the temp register when the array index needs to be widened, it cannot be done in place.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating this. This is rather troubling, but probably too fundamental to be easily changed.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, IMO it would better to not have LCLVAR nodes with a type other than the variable itself but I'm not sure if it's possible to avoid all the CQ fallout from adding the required cast nodes. I would guess it should be possible to make those cast nodes contained, though it's probably a bit complicated.

But what bugs me the most is that this kind of reinterpretation occurs even in minopts mode.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Feb 15, 2019

Added a commit to deal with the index widening issue. New x64 FX diff:

Total bytes of diff: 76 (0.00% of base)
    diff is a regression.
Top file regressions by size (bytes):
        1340 : System.Data.Common.dasm (0.11% of base)
        1160 : System.Private.DataContractSerialization.dasm (0.13% of base)
         646 : System.Private.Xml.dasm (0.02% of base)
         503 : Microsoft.VisualBasic.dasm (0.17% of base)
         371 : System.Runtime.Numerics.dasm (0.40% of base)
Top file improvements by size (bytes):
       -6525 : System.Private.CoreLib.dasm (-0.13% of base)
        -448 : NuGet.Frameworks.dasm (-0.39% of base)
        -268 : System.Text.RegularExpressions.dasm (-0.21% of base)
        -230 : System.Linq.Parallel.dasm (-0.03% of base)
        -206 : System.Net.HttpListener.dasm (-0.08% of base)
101 total files with size differences (15 improved, 86 regressed), 28 unchanged.
Top method regressions by size (bytes):
         204 ( 2.93% of base) : System.Private.DataContractSerialization.dasm - ArrayHelper`2:ReadArray(ref,ref,ref,int):ref:this (12 methods)
         141 ( 3.29% of base) : System.Drawing.Primitives.dasm - KnownColorTable:InitColorTable()
         138 ( 1.82% of base) : System.Private.CoreLib.dasm - DefaultBinder:BindToMethod(int,ref,byref,ref,ref,ref,byref):ref:this
          89 ( 0.64% of base) : System.Reflection.Metadata.dasm - MetadataReader:InitializeTableReaders(struct,ubyte,ref,ref):this
          84 ( 3.78% of base) : Microsoft.VisualBasic.dasm - IDOUtils:CreateFuncCallSiteAndInvoke(ref,ref,ref):ref
Top method improvements by size (bytes):
        -703 (-2.49% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -626 (-4.15% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -610 (-2.07% of base) : System.Private.CoreLib.dasm - Enumerator:MoveNext():bool:this (141 methods)
        -480 (-2.77% of base) : System.Private.Xml.dasm - XsltLoader:.ctor():this
        -447 (-4.06% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
Top method regressions by size (percentage):
          20 ( 9.62% of base) : System.Runtime.Numerics.dasm - BigIntegerCalculator:Compare(ref,ref):int
          17 ( 8.02% of base) : System.Linq.Parallel.dasm - SortHelper`2:Dispose():this
          15 ( 7.69% of base) : System.Private.DataContractSerialization.dasm - XmlBufferReader:GetChars(int,int,ref):int:this
          12 ( 6.06% of base) : Microsoft.CSharp.dasm - MethodTypeInferrer:SetUnknownsToNotDependent():this
          18 ( 5.92% of base) : System.Private.Xml.dasm - XmlSqlBinaryReader:GetInt64(int):long:this
Top method improvements by size (percentage):
         -20 (-10.58% of base) : System.Private.Xml.dasm - OutputScopeManager:LookupNamespace(ref):ref:this
         -23 (-9.24% of base) : System.Private.CoreLib.dasm - Hashtable:Clear():this
         -20 (-8.16% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetElementSibling(byref,byref,ref,ref):bool
         -32 (-8.10% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
         -32 (-8.10% of base) : Microsoft.VisualBasic.dasm - FixedExistanceList:MoveToFront(int):this
5678 total methods with size differences (914 improved, 4764 regressed), 118639 unchanged.
2 files had text diffs but not size diffs.
NuGet.Common.dasm had 51 diffs
System.Memory.dasm had 36 diffs

Worse than before. Oh well, it's minopts code.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Feb 15, 2019

x86 FX diff improved:

Total bytes of diff: -24349 (-0.11% of base)
    diff is an improvement.
Top file improvements by size (bytes):
      -13946 : System.Private.CoreLib.dasm (-0.35% of base)
       -3361 : System.Private.Xml.dasm (-0.11% of base)
       -1321 : System.Data.Common.dasm (-0.13% of base)
        -693 : System.Linq.Parallel.dasm (-0.11% of base)
        -649 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.03% of base)
37 total files with size differences (37 improved, 0 regressed), 92 unchanged.
Top method improvements by size (bytes):
        -929 (-7.23% of base) : System.Private.CoreLib.dasm - Dictionary`2:Resize(int,bool):this (26 methods)
        -892 (-3.99% of base) : System.Private.CoreLib.dasm - Dictionary`2:System.Collections.ICollection.CopyTo(ref,int):this (28 methods)
        -542 (-5.83% of base) : System.Private.CoreLib.dasm - Dictionary`2:CopyTo(ref,int):this (28 methods)
        -490 (-5.79% of base) : System.Private.CoreLib.dasm - Dictionary`2:FindEntry(ref):int:this (10 methods)
        -482 (-1.89% of base) : System.Private.CoreLib.dasm - Dictionary`2:.ctor(ref,ref):this (56 methods)
Top method improvements by size (percentage):
         -26 (-14.61% of base) : System.Private.Xml.dasm - XmlTextWriter:FindPrefix(ref):ref:this
         -23 (-13.69% of base) : System.Private.CoreLib.dasm - EventSource:AnyEventEnabled():bool:this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedList:MoveToFront(int):this
         -44 (-12.87% of base) : Microsoft.VisualBasic.dasm - FixedExistanceList:MoveToFront(int):this
         -23 (-12.23% of base) : System.Private.Xml.dasm - XPathNodeHelper:GetNonDescendant(byref,byref):bool
946 total methods with size differences (946 improved, 0 regressed), 123348 unchanged.

It turns out that LSRA build code was also messed up, it was allocating a temp register whenever the index type was TYP_I_IMPL. On x86 that would be... all the time.

@CarolEidt
Copy link
Copy Markdown

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Feb 16, 2019

Meh, OSX and java.nio.channels.ClosedChannelException again...

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test

@CarolEidt CarolEidt merged commit e3d4b9c into dotnet:master Feb 16, 2019
@mikedn mikedn deleted the idx-addr branch March 9, 2019 20:33
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants