Skip to content

Conversation

@jakobbotsch
Copy link
Member

Some minor diffs expected from increased VN precision around newly recognized loops, which leads to different CSEs.

Some minor diffs expected from increased VN precision around newly
recognized loops, which leads to different CSEs.
@ghost ghost assigned jakobbotsch Dec 7, 2023
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 7, 2023
@ghost
Copy link

ghost commented Dec 7, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Some minor diffs expected from increased VN precision around newly recognized loops, which leads to different CSEs.

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

area-CodeGen-coreclr

Milestone: -

@jakobbotsch
Copy link
Member Author

Hitting a loop alignment related assert I'll need to investigate:

ISSUE: <ASSERT> #567048 D:\a\_work\1\s\src\coreclr\jit\emit.cpp (4636) - Assertion failed '(jmpGroup->igFlags & IGF_HAS_ALIGN) == 0' in 'System.Security.Cryptography.Tests.RandomNumberGeneratorTests:VerifyDistribution[ushort](System.ReadOnlySpan`1[ushort],double)' during 'Generate code' (IL size 218; hash 0x53d500b4; Tier1)

@EgorBo
Copy link
Member

EgorBo commented Dec 7, 2023

Hitting a loop alignment related assert I'll need to investigate:

ISSUE: <ASSERT> #567048 D:\a\_work\1\s\src\coreclr\jit\emit.cpp (4636) - Assertion failed '(jmpGroup->igFlags & IGF_HAS_ALIGN) == 0' in 'System.Security.Cryptography.Tests.RandomNumberGeneratorTests:VerifyDistribution[ushort](System.ReadOnlySpan`1[ushort],double)' during 'Generate code' (IL size 218; hash 0x53d500b4; Tier1)

I think we have similar asserts in the outerloop PGO pipeline in Main, will file an issue

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Dec 8, 2023

It does seem to be related to the change here.
@kunalspathak I need some help to figure this out. We have this flow graph -- BB11 is the first block of a loop that we now want to align, so we mark BB10 to have alignment inserted. Before this change BB11 was not considered for alignment because we did not recognize its child loops, so the logic in AddContainsCallAllContainingLoops unmarked it for alignment. However, when we later generate BB08 we make BB08 a candidate for jump removal. We then generate BB10 without creating a new IG, so it becomes part of the same IG as BB08 and has IGF_HAS_ALIGN set. Finally, in emitRemoveJumpToNextInst we then hit the assert since the IG for BB08 and BB10 is marked for alignment and we also marked the jump from BB08 as a candidate for jump removal.

Any ideas? Do we need to force a new IG after an IG that is a candidate for jmp removal, or before a basic block that is marked for align insertion?

@kunalspathak
Copy link
Contributor

kunalspathak commented Dec 8, 2023

I think we have similar asserts in the outerloop PGO pipeline in Main, will file an issue

The one in #95755 seems to be different than what @jakobbotsch is getting and most likely happening due to recent changes in jump removal. I will take a look at both of these.

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Dec 8, 2023

The issue repros on main if you comment the following code:

#if FEATURE_LOOP_ALIGN
// If this is the inner most loop, reset the LOOP_ALIGN flag
// because a loop having call will not likely to benefit from
// alignment
if (!HasOldChildLoop(this, loop))
{
BasicBlock* top = loop->GetLexicallyTopMostBlock();
top->unmarkLoopAlign(this DEBUG_ARG("Loop with call"));
}
#endif

and then run linux-x64 libraries_Tests.run SPMI:

py .\src\coreclr\scripts\superpmi.py replay -f libraries_tests.run -target_os linux

@kunalspathak
Copy link
Contributor

Any ideas? Do we need to force a new IG after an IG that is a candidate for jmp removal, or before a basic block that is marked for align insertion?

We shouldn't force a new IG in such case because that defeats the purpose of placing the align instruction at that position. We are trying to hide it after jmp and if the jmp is removed, the align instruction will get executed. The assert actually was in wrong place, and I fixed it in #95845.

BTW, it occurred to me, I don't see that we mark the new loops detected in optFindNewLoops() for alignment. Is there a plan to do that?

optFindLoops();
m_dfsTree = fgComputeDfs();
optFindNewLoops();

@jakobbotsch
Copy link
Member Author

BTW, it occurred to me, I don't see that we mark the new loops detected in optFindNewLoops() for alignment. Is there a plan to do that?

Yes, but I am focusing on porting everything first, before enabling optimizations for new loops (which I will do in individual PRs after).
For loop alignment specifically there is #95836 first to move the candidate identification into placeLoopAlignmentInstructions first (reviving Bruce's old PR). That PR also switches to the new loop structures, but it currently still has a quirk that the loop was recognized by old loop finding, which will be removed separately. When we do that we may need some heuristic that the loop blocks are somewhat tightly packed together, since that's not guaranteed for new loops.

@jakobbotsch jakobbotsch marked this pull request as ready for review December 12, 2023 20:42
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @BruceForstall

Diffs. As mentioned above it's because of some increased precision in VN in some cases, leading to different CSEs.

@jakobbotsch jakobbotsch merged commit c7a51fd into dotnet:main Dec 13, 2023
@jakobbotsch jakobbotsch deleted the vn-remove-loop-quirks branch December 13, 2023 10:02
@github-actions github-actions bot locked and limited conversation to collaborators Jan 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants