Merged stores: Fix alignment-related issues and enable SIMD where possible#92939
EgorBo merged 15 commits into dotnet:main
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch

Issue Details

Merge e.g. two consecutive SIMD stores (e.g. 2x Vector256 into 1x Vector512). I am still trying to build a mental model for the "multiple scalar stores -> SIMD store" case (we currently don't do it).

* only if the target (e.g. a struct) is known not to contain GC handles

So far, it seems that x86/AMD64 doesn't officially offer any kind of atomicity guarantee (even per component).

Related issues: #76503, #51638
|
Note this is from |
|
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
|
@jakobbotsch @dotnet/jit-contrib PTAL. Diffs: regressions, as expected, because this change made the whole #92852 algorithm more conservative. The initial diffs were -400kb, so most wins are expected to remain (most base addresses are TYP_REF, as Jakob predicted). There are wins on ARM64 due to its better SIMD guarantees.
|
Improved diffs on arm64.
It seems there are more regressions on linux/windows x64. Do we know why?
|
These are reverted improvements from #92852: they turned out not to be legal (fortunately, most improvements remained).
|
x86 SPMI jobs failed with timeout/"no space left"; I'll check other runs.


Adjust the rules for when unaligned stores can be used for merged stores. Also enable 2xLONG/REF -> SIMD and 2xSIMD -> wider SIMD.
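To make the "2xLONG -> SIMD" case concrete, here is a minimal C++ sketch of what such a merge means at the memory level. It is an illustration only, not the JIT's implementation: `Pair128`, `store_separately` and `store_merged` are hypothetical names, and whether the single 16-byte copy actually becomes one vector store depends on the compiler and target.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 16-byte payload: two adjacent 8-byte fields, no padding.
struct Pair128
{
    std::uint64_t lo;
    std::uint64_t hi;
};
static_assert(sizeof(Pair128) == 16, "two 8-byte fields must pack to 16 bytes");

// What the program originally does: two independent 8-byte stores.
inline void store_separately(unsigned char* dst, std::uint64_t lo, std::uint64_t hi)
{
    assert(dst != nullptr);
    std::memcpy(dst, &lo, sizeof lo);
    std::memcpy(dst + 8, &hi, sizeof hi);
}

// The merged form: one 16-byte store covering both fields.
// A compiler may lower this to a single SIMD store where available.
inline void store_merged(unsigned char* dst, std::uint64_t lo, std::uint64_t hi)
{
    assert(dst != nullptr);
    Pair128 value{lo, hi};
    std::memcpy(dst, &value, sizeof value);
}
```

Both functions leave identical bytes in memory. The difference that matters for legality is what a concurrent reader may observe while the store is in flight, which is why the alignment and atomicity guarantees of the target decide whether the merge is allowed.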
Wider scalar primitives for naturally aligned data of primitives (>1B):
boundary?
SIMD for naturally aligned data of primitives (>1B):
* both Intel and AMD
** it's very unlikely that the JIT can currently assume 16-byte alignment anyhow
PS: Merged stores are conservatively disabled on LA64 and RISC-V
Per "Arm Architecture Reference Manual":
@tannergooding said that x64 with AVX promises atomicity for 16B accesses to 16B-aligned data; so far this seems to be the only thing x64 can guarantee us.
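The alignment reasoning above can be sketched with two small predicates. This is a hedged illustration in C++, not code from the JIT: `is_naturally_aligned` and `crosses_boundary` are hypothetical helpers, and the boundary size passed in (for example a cache-line size) is an assumption of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `size` is a power of two and `addr` is a multiple of `size`.
inline bool is_naturally_aligned(std::uintptr_t addr, std::size_t size)
{
    return size != 0 && (size & (size - 1)) == 0 && (addr & (size - 1)) == 0;
}

// True when the byte range [addr, addr + size) touches two different
// `boundary`-sized blocks, i.e. the access straddles a boundary.
inline bool crosses_boundary(std::uintptr_t addr, std::size_t size, std::size_t boundary)
{
    assert(boundary != 0);
    return size != 0 && (addr / boundary) != ((addr + size - 1) / boundary);
}
```

Note that two naturally aligned 8-byte stores at addresses 8 and 16 merge into a 16-byte store at address 8, which is not 16-byte aligned. Knowing the alignment of the individual stores is therefore not enough to claim the wider guarantee for the merged one.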
Related issues: #76503, #51638