Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Eliminate duplicate SSA number bookkeeping#15011

Merged
CarolEidt merged 2 commits into
dotnet:masterfrom
mikedn:ssa-mem-num
Aug 23, 2018
Merged

Eliminate duplicate SSA number bookkeeping#15011
CarolEidt merged 2 commits into
dotnet:masterfrom
mikedn:ssa-mem-num

Conversation

@mikedn
Copy link
Copy Markdown

@mikedn mikedn commented Nov 13, 2017

During SSA construction SsaRenameState keeps a definition count for each variable in an array. But each variable has a lvPerSsaData array that does almost the same thing, count definitions. "Almost" because lvPerSsaData is a JitExpandArray that tracks only the array size and not the number of elements actually stored in the array.

Replace JitExpandArray with purposely designed "array" that is in charge with allocating new SSA numbers and handles their intricacies - RESERVED_SSA_NUM, UNINIT_SSA_NUM and FIRST_SSA_NUM.

This also allows the removal of the allocator from the array. Allocating new SSA numbers happens only during SSA construction and it's reasonable to pass the allocator to AllocSsaNum rather than increase the size of PerSsaArray and LclVarDsc.

While working on this I realized that UNINIT_SSA_NUM isn't actually needed. SsaBuilder::RenameVariables starts by assigning SSA numbers to all variables that are live at the start of fgFirstBB so it's not possible to encounter an use without a def on the SSA rename stack. And if it happens, well, there's a bug somewhere and falling back to minopts via a noway_assert is the appropriate thing to do.

No jit diffs.

0.28% less instructions retired and 0.6% less used memory.

PIN data: https://1drv.ms/x/s!Av4baJYSo5pjgr15HNl1qFfUxIWI8g

MemStats diff: https://gist.github.com/mikedn/68c76ebca65ce1c26e9d145eee420eb3

@mikedn mikedn force-pushed the ssa-mem-num branch 2 times, most recently from c70ee9c to 349dbe5 Compare November 23, 2017 22:24
@RussKeldorph RussKeldorph added this to the Future milestone Apr 13, 2018
@mikedn
Copy link
Copy Markdown
Author

mikedn commented Apr 19, 2018

Test failure in x86 Release build appears to be unrelated:

D:\j\workspace\x86_release_w---80850631\bin\tests\Windows_NT.x86.Release\TestWrappers\JIT.HardwareIntrinsics\JIT.HardwareIntrinsics.XUnitWrapper.cs(6197): error : JIT_HardwareIntrinsics._X86_General_VectorRet_ro_VectorRet_ro_._X86_General_VectorRet_ro_VectorRet_ro_cmd [FAIL] [D:\j\workspace\x86_release_w---80850631\tests\runtest.proj]
        
  
  Return code:      1
  Raw output file:      D:\j\workspace\x86_release_w---80850631\bin\tests\Windows_NT.x86.Release\Reports\JIT.HardwareIntrinsics\X86\General\VectorRet_ro\VectorRet_ro.output.txt
  Raw output:
  BEGIN EXECUTION
         "D:\j\workspace\x86_release_w---80850631\tests\..\bin\tests\Windows_NT.x86.Release\Tests\Core_Root\corerun.exe" VectorRet_ro.exe 
        Expected: 100
        Actual: 1073741855
        END EXECUTION - FAILED
        FAILED
        Test Harness Exitcode is : 1
        
  To run the test:
  > set CORE_ROOT=D:\j\workspace\x86_release_w---80850631\bin\tests\Windows_NT.x86.Release\Tests\Core_Root
  > D:\j\workspace\x86_release_w---80850631\bin\tests\Windows_NT.x86.Release\JIT\HardwareIntrinsics\X86\General\VectorRet_ro\VectorRet_ro.cmd
  
        Expected: True
        Actual:   False
          D:\j\workspace\x86_release_w---80850631\bin\tests\Windows_NT.x86.Release\TestWrappers\JIT.HardwareIntrinsics\JIT.HardwareIntrinsics.XUnitWrapper.cs(6197,0): at JIT_HardwareIntrinsics._X86_General_VectorRet_ro_VectorRet_ro_._X86_General_VectorRet_ro_VectorRet_ro_cmd()

Exit code seems to match STATUS_WX86_BREAKPOINT, whatever that means.

Retrying.

@dotnet-bot test Windows_NT x86 Release Innerloop Build and Test

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Apr 19, 2018

@dotnet/jit-contrib Has anyone seen a test failing as above, with that specific exit code? I can't reproduce the failure locally and I don't get any jit-diffs. I thought that it's just some sort of glitch in CI but on retry it failed exactly the same way. Any suggestions how to figure out what's going on?

@CarolEidt
Copy link
Copy Markdown

The HardwareIntrinsics tests are sensitive to the target that they are run on. It may be that the CI machine that it's running on has a different level of hardware support. You could try setting the various COMPlus_Enable variables to false to see if it repros.
Also, most of the intrinsics will fail to compile during crossgen, so jit-diffs aren't very useful there.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Apr 20, 2018

@CarolEidt Thanks! I actually did diffs by running the test with COMPLUS_JitDisasm=* but your comment nudged me in the right direction - in my quest for diffs I accidentally ended up using the checked JIT even to run the tests. But this bug cannot be reproduced with a checked build.

The root cause is that compInit does not initialize Vector128ULongHandle and Vector256ULongHandle. My own changes moved things in memory such that in some cases Vector128ULongHandle happens to contain the handle of Vector128<float> and this results in a intrinsic node being imported with an unsupported base type. Intrinsic codegen ends up emitting an instruction with INS_invalid and that somehow generates bogus machine code, including an int 3 instruction that results in the mentioned exist status.

I can send a fix later today. IMO it should be ported to 2.1, I suppose this bug may happen even without my changes and then you'll be looking at some strange failures that are difficult to investigate.

@CarolEidt
Copy link
Copy Markdown

@mikedn - excellent! Thanks for tracking that down.

@mikedn mikedn force-pushed the ssa-mem-num branch 2 times, most recently from 88f1dab to 5573241 Compare April 25, 2018 15:44
@mikedn mikedn changed the title [WIP] Eliminate duplicate SSA number bookkeeping Eliminate duplicate SSA number bookkeeping Apr 25, 2018
@mikedn mikedn force-pushed the ssa-mem-num branch 3 times, most recently from 014b4a7 to 0282351 Compare May 21, 2018 15:42
@AndyAyersMS
Copy link
Copy Markdown
Member

Seems like we should tag this with jit throughput too.

@mikedn anything else we need here other than review signoff?

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Jul 19, 2018

@AndyAyersMS Yep, it's mainly about throughput, contributes to #15108. Though you could add a "CleanUpTheMess" label if you have one 😁. I'm not sure how the JIT ended up with bogus stuff like UNINIT_SSA_NUM...

I need to fix the SsaDefArray comment that still mentions UNINIT_SSA_NUM. Otherwise it's good for review.

mikedn added 2 commits July 19, 2018 07:29
During SSA construction SsaRenameState keeps a definition count for each variable in an array. But each variable has a lvPerSsaData array that does almost the same thing, count the definitions. "Almost" because lvPerSsaData is a JitExpandArray that tracks only the array size and not the number of elements actually stored in the array.

Replace JitExpandArray with purposely designed "array" that is in charge with allocating new SSA numbers  and handles their intricacies - RESERVED_SSA_NUM, UNINIT_SSA_NUM and FIRST_SSA_NUM.

This also allows the removal of the allocator from the array. Allocating new SSA numbers happens only during SSA construction and it's reasonable to pass the allocator to AllocSsaNum rather than increasing the size of PerSsaArray and LclVarDsc.
This is not used. It is generated by SsaRenameState::CountForUse when the lclvar stack is empty but that cannot happen - SsaBuilder::RenameVariables starts by assigning SSA numbers to all variables that are live at the start of fgFirstBB.

And if it happens, well, there's a bug somewhere and falling back to minopts via a noway_assert is the appropriate thing to do.
@mikedn
Copy link
Copy Markdown
Author

mikedn commented Jul 19, 2018

SsaDefArray comment fixed. 2 builds appear to have failed due to some kind of timeout:

https://ci.dot.net/job/dotnet_coreclr/job/master/job/x64_checked_osx10.12_corefx_innerloop_prtest/378/consoleText
https://ci.dot.net/job/dotnet_coreclr/job/master/job/x64_checked_ubuntu_innerloop_prtest/5452/consoleText

@dotnet-bot test OSX10.12 x64 Checked CoreFX Tests
test Ubuntu x64 Checked Innerloop Build and Test

@AndyAyersMS
Copy link
Copy Markdown
Member

Might as well retry one more time.

@dotnet-bot retest OSX10.12 x64 Checked CoreFX Tests
@dotnet-bot retest Ubuntu x64 Checked Innerloop Build and Test

Copy link
Copy Markdown
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, but perhaps we should have somebody else weigh in too...

@AndyAyersMS
Copy link
Copy Markdown
Member

Seems odd to me that we have a per-local dense array mapping ssa numbers to LclSsaVarDsc. Seems like this would be mostly wasted space. But maybe I don't understand what is going on well enough.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Jul 19, 2018

Seems odd to me that we have a per-local dense array mapping ssa numbers to LclSsaVarDsc.

You mean that perhaps there should be a single array of LclSsaVarDsc? I though about that a bit back then when I did this. Right now it's not clear if it would be better/worse/useful:

  • It's not clear what it would do throughput wise - large allocations are kind of worse for arena allocator as they may leave large unused holes at the end of arena pages. It could be that resizing the array would be more costly if it turns out that many vars have only one SSA def.
  • There are more changes required to make that happen. At the very least we need to do something about the VN code that assigns VNF_InitVal to parameters and initialized variables - it assumes that their SSA number is FIRST_SSA_NUM.
  • Right now it doesn't seem to be useful. It may be useful if the JIT starts doing more with SSA, like relying on it for register allocation. Then it probably look at SSA definition/numbers instead of lclvars and their numbers. And then probably copy propagation wouldn't have to maintain the SSA form as it does now. And who knows what else. But that's probably far in the future.
  • And if the JIT starts doing more with SSA it's not clear if this side representation of SSA defs is best. Maybe we should just get rid of lclvars and have only GT_DEF nodes that carry the basic block and VN pair that's now in LclSsaVarDsc.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Jul 22, 2018

Your comment got me thinking a bit more about using a single array for SSA definitions. It's actually pretty easy to implement on top of this, we just need a bit of care with array sizing. Using lvaTrackedCount * some small constant should do the job.

Using a single array has the advantage of making LclVarDsc 16 bytes smaller, on top of this PR that already removes 8 bytes. It also seems to reduce instructions retired by a further ~0.5%.

But it's a change that can go in a separate PR, this one can be considered as a step along the way. And will probably need #15351 as well to get rid of SSA numbers being assigned to non tracked lclvars.

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Aug 21, 2018

FWIW, it looks like it would be possible to replace the SSA number stored in LclVar nodes with a pointer, without increasing the node size. Then we wouldn't need the SSA def array anymore. But that starts to move into SSA redesign territory. And then, with a bit more work we may discover that we don't need even LclVar nodes. But that's far in the future, if ever.

@CarolEidt
Copy link
Copy Markdown

I was kind of hoping to see another, more "front-end" oriented JIT dev review this, but I've taken a look and it LGTM. I'm less familiar with the memory SSA than the lclVar SSA, but both sets of changes look reasonable to me.

These changes definitely improve the clarity - I might suggest (in future) that we change the "count" terminology (which has always thrown me).

Copy link
Copy Markdown

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good

@mikedn
Copy link
Copy Markdown
Author

mikedn commented Aug 23, 2018

I might suggest (in future) that we change the "count" terminology (which has always thrown me)

@CarolEidt Where? I already made many count -> ssaNum renames.

@CarolEidt
Copy link
Copy Markdown

We still refer to the ssa names / ssaNums as "counts" in places like CountForUse and CountForMemoryUse, which in that particular context seems particularly weird to me.

However, IMO we should merge this and consider doing that at another time.

@CarolEidt CarolEidt merged commit 4be947e into dotnet:master Aug 23, 2018
@mikedn
Copy link
Copy Markdown
Author

mikedn commented Aug 23, 2018

We still refer to the ssa names / ssaNums as "counts" in places like CountForUse and CountForMemoryUse, which in that particular context seems particularly weird to me.

Ah, those, I'm planning to rename as part of #15000 since that cleans up the rest of the rename stack code.

@CarolEidt
Copy link
Copy Markdown

I'm planning to rename as part of #15000 since that cleans up the rest of the rename stack code.

Awesome! A quick glance at that looks like it will be quite an improvement

@mikedn mikedn deleted the ssa-mem-num branch September 28, 2019 19:17
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants