JIT: Always compute loop iteration estimate in loop inversion if we have PGO data by amanasifkhalid · Pull Request #116104 · dotnet/runtime

amanasifkhalid · 2025-05-29T17:37:13Z

Ensure loop inversion always comes up with a loop iteration estimate better than BB_LOOP_WEIGHT_SCALE if we have PGO data.

Copilot

Pull Request Overview

This PR updates the loop inversion logic to skip inverting loops that are expected to iterate only a few times, based on profile weight data.

Simplifies the iteration count estimation by using the likely weight of the test block and the called count.
Removes the previous, more complex handling of profile weights and loop entry estimation.

amanasifkhalid · 2025-05-29T19:31:27Z

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs show large size decreases (with libraries_tests being the outlier), as well as some size increases from RBO being pessimized by less branch duplication. I'm not sure what the cutoff for inversion should be, so if these diffs seem too big, I can reduce it a bit.

It's worth noting that I'm not cutting off any loops when we don't have PGO data. Since inversion currently runs before optSetBlockWeights, I don't think I can make any assumptions about loop iteration counts here.

AndyAyersMS · 2025-05-29T21:19:41Z

I think this is a tricky one to get right.

If a loop has low average iteration count it can still have instances with high iteration counts.
If a loop has low iteration counts the method with the loop may be called frequently, or the loop may be inside another loop with high iteration counts, etc.
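The first point can be illustrated with a small worked example. This is a sketch only: `IterationSummary` and `Summarize` are hypothetical names, and the trip counts are made-up inputs, not measurements from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical summary of per-entry trip counts for one loop.
struct IterationSummary
{
    double average; // mean iterations per loop entry
    int    maximum; // largest single trip count observed
};

// Computes the mean and the maximum of the observed trip counts.
IterationSummary Summarize(const std::vector<int>& tripCounts)
{
    assert(!tripCounts.empty());
    const double total = std::accumulate(tripCounts.begin(), tripCounts.end(), 0.0);
    const int    maxCount = *std::max_element(tripCounts.begin(), tripCounts.end());
    return {total / tripCounts.size(), maxCount};
}
```

For 99 entries that iterate once and a single entry that iterates 101 times, the average is 2 while the maximum is 101, so an average-based cutoff would not see the hot instance.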

amanasifkhalid · 2025-05-29T21:59:47Z

> or the loop may be inside another loop with high iteration counts

In this case, wouldn't we compute a high iteration count for the nested loop too (assuming the parent loop doesn't conditionally execute the child loop)?

I agree that this approach isn't sensitive to the other cases you mentioned. The loop inversion diffs that inspired this change didn't necessarily involve loops with low iteration counts; rather, they were loops that are more likely to fall through than loop, or otherwise weren't likely to run more than once per method call. It feels a bit crude, but I could take a safer approach and skip loops that don't iterate at least twice on average -- in other words, it has to behave like a loop on average to be inverted.
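The link between "more likely to fall through than loop" and "fewer than two iterations on average" can be sketched as follows. The sketch assumes a simplified model in which every trip takes the back edge with the same independent likelihood p, so the expected number of iterations per loop entry is 1 / (1 - p). The function names are hypothetical and are not the JIT's.

```cpp
#include <cassert>

// Expected iterations per loop entry under the simplified model above.
// backEdgeLikelihood is the assumed probability of looping again after a trip.
double ExpectedIterations(double backEdgeLikelihood)
{
    assert(backEdgeLikelihood >= 0.0 && backEdgeLikelihood < 1.0);
    return 1.0 / (1.0 - backEdgeLikelihood);
}

// True when the loop "behaves like a loop on average", i.e. iterates
// at least twice per entry. Under this model that means p >= 0.5.
bool IteratesAtLeastTwiceOnAverage(double backEdgeLikelihood)
{
    return ExpectedIterations(backEdgeLikelihood) >= 2.0;
}
```

Under this model, a loop whose back edge is taken less than half the time falls below the two-iteration cutoff.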

AndyAyersMS · 2025-05-29T22:25:21Z

> > or the loop may be inside another loop with high iteration counts
>
> In this case, wouldn't we compute a high iteration count for the nested loop too (assuming the parent loop doesn't conditionally execute the child loop)?

Ah, I should have looked more closely. You are computing a method-entry relative count, not a loop-entry relative count... I just assumed "iteration count" meant the latter.

So yes what you are doing would handle the nested case ok.

I'd like to see what a size-based heuristic looks like. I think that is perhaps less prone to mis-estimating importance or potential benefit from inversion (?).
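To make the distinction between the two counts concrete, here is a minimal sketch. The `LoopProfile` struct, its fields, and both function names are hypothetical and are not taken from the JIT sources. The sketch only assumes that each count is the test block's weight divided by a reference weight.

```cpp
#include <cassert>

// Hypothetical profile weights for a single loop.
struct LoopProfile
{
    double methodEntryWeight; // how often the method was called
    double loopEntryWeight;   // how often control entered the loop from outside
    double testBlockWeight;   // how often the loop's test block executed
};

// Test-block executions per method call.
double MethodEntryRelativeCount(const LoopProfile& p)
{
    assert(p.methodEntryWeight > 0.0);
    return p.testBlockWeight / p.methodEntryWeight;
}

// Test-block executions per entry into the loop.
double LoopEntryRelativeCount(const LoopProfile& p)
{
    assert(p.loopEntryWeight > 0.0);
    return p.testBlockWeight / p.loopEntryWeight;
}
```

For a nested loop in a method called 10 times, entered 1000 times in total by its parent loop, with a test block weight of 3000, the method-entry relative count is 300 while the loop-entry relative count is 3. This is why a method-entry relative cutoff would not reject a short inner loop that sits inside a hot outer loop.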

amanasifkhalid · 2025-05-29T22:30:11Z

> I'd like to see what a size-based heuristic looks like.

I was thinking of reusing the size heuristic you added for loop cloning: If a loop is too big to likely benefit from cloning, then it's probably not tight enough to benefit from inversion. Does that seem like a reasonable starting point?

I don't think we can easily separate out the size heuristic change from #116017, since we need the loop data structures computed to easily compute the loop size. I can push a change to that PR with the size restriction and see how the diffs change.

AndyAyersMS · 2025-05-29T22:36:32Z

> > I'd like to see what a size-based heuristic looks like.
>
> I was thinking of reusing the size heuristic you added for loop cloning: If a loop is too big to likely benefit from cloning, then it's probably not tight enough to benefit from inversion. Does that seem like a reasonable starting point?

Sure, using the same size threshold seems reasonable.

amanasifkhalid · 2025-05-30T19:41:55Z

Based on my trial and error with different size limits for loop inversion (comment), I think we're unlikely to pursue a loop iteration heuristic for now. I'm going to remove the heuristic portion and just make this into a refactor of the loop iteration computation, so that we're at least always doing it.

amanasifkhalid · 2025-08-05T17:56:28Z

@AndyAyersMS I thought I'd revive this to cut down on my PR backlog. The only material change here is that we always try to estimate the loop iteration count when we have PGO data, even if the weights into the loop are inconsistent. Because we run profile repair right before loop inversion, we don't encounter inconsistency all that often. From what I've seen, most of the cases where block weights are still inconsistent are under OSR, which is known to trip up profile repair. Under OSR, we can assume the loop is very hot, so even if the loop iteration count loses some precision from the lack of profile consistency, I suspect the computed value is always more realistic than BB_LOOP_WEIGHT_SCALE (8).
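The behavior described above can be sketched as follows. This is not the code in optimizer.cpp: the function name, parameters, and the guard against a non-positive entry weight are assumptions made for illustration. The constant mirrors the value 8 stated for BB_LOOP_WEIGHT_SCALE in this thread.

```cpp
#include <cassert>

// Stand-in for BB_LOOP_WEIGHT_SCALE, using the value (8) given in this thread.
constexpr double kDefaultLoopIterations = 8.0;

// Hypothetical estimate of iterations per loop entry.
// With PGO data, always derive the estimate from profile weights, even if
// those weights may be inconsistent; otherwise fall back to the default.
double EstimateLoopIterations(bool hasPgoData, double testBlockWeight, double loopEntryWeight)
{
    assert(testBlockWeight >= 0.0);

    // Assumption: without PGO data, or with no usable entry weight,
    // the fixed default is the only estimate available.
    if (!hasPgoData || (loopEntryWeight <= 0.0))
    {
        return kDefaultLoopIterations;
    }

    return testBlockWeight / loopEntryWeight;
}
```

With PGO data, a test block weight of 100 and an entry weight of 10 give an estimate of 10 iterations; without PGO data the same inputs yield the default of 8.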

The diffs are small, and seem to be inflated by duplicate method contexts, according to the disasm summaries. Ex:

Top method regressions (bytes):
          44 (6.667% of base) : 45083.dasm - System.Text.Ascii:NarrowUtf16ToAscii_Intrinsified(ptr,ptr,nuint):nuint (Tier1)
          44 (6.667% of base) : 73098.dasm - System.Text.Ascii:NarrowUtf16ToAscii_Intrinsified(ptr,ptr,nuint):nuint (Tier1)
          44 (6.707% of base) : 46611.dasm - System.Text.Ascii:NarrowUtf16ToAscii_Intrinsified(ptr,ptr,nuint):nuint (Tier1)
          44 (6.667% of base) : 57315.dasm - System.Text.Ascii:NarrowUtf16ToAscii_Intrinsified(ptr,ptr,nuint):nuint (Tier1)
          44 (6.707% of base) : 41594.dasm - System.Text.Ascii:NarrowUtf16ToAscii_Intrinsified(ptr,ptr,nuint):nuint (Tier1)
          44 (6.707% of base) : 58719.dasm - System.Text.Ascii:NarrowUtf16ToAscii_Intrinsified(ptr,ptr,nuint):nuint (Tier1)
          44 (4.331% of base) : 41588.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (4.435% of base) : 45023.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (4.418% of base) : 45068.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (4.331% of base) : 58707.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (4.418% of base) : 57303.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (4.331% of base) : 46598.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (4.435% of base) : 67635.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (4.365% of base) : 73086.dasm - System.Text.Ascii:NarrowUtf16ToAscii(ptr,ptr,nuint):nuint (Tier1)
          44 (1.321% of base) : 41587.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ptr,int,ptr,int,byref,byref):int (Tier1)
          44 (1.327% of base) : 57302.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ptr,int,ptr,int,byref,byref):int (Tier1)
          44 (1.330% of base) : 45022.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ptr,int,ptr,int,byref,byref):int (Tier1)
          44 (1.330% of base) : 67634.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ptr,int,ptr,int,byref,byref):int (Tier1)
          44 (1.319% of base) : 46597.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ptr,int,ptr,int,byref,byref):int (Tier1)
          44 (1.329% of base) : 45067.dasm - System.Text.Unicode.Utf8Utility:TranscodeToUtf8(ptr,int,ptr,int,byref,byref):int (Tier1)

AndyAyersMS

Yes, let's take this one.

amanasifkhalid added 2 commits May 29, 2025 13:08

More precise loop iteration computation

672b2c0

Skip loops with low iteration counts

d8e96d6

Copilot AI review requested due to automatic review settings May 29, 2025 17:37

github-actions Bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) May 29, 2025

dotnet-policy-service Bot assigned amanasifkhalid May 29, 2025

Copilot AI reviewed May 29, 2025

Comment thread src/coreclr/jit/optimizer.cpp Outdated

Comments

464d8e4

Merge branch 'main' into loop-inversion-iteration-count

21503d1

build-analysis Bot mentioned this pull request May 30, 2025

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

3 tasks

Remove heuristic

482aec8

amanasifkhalid changed the title ~~JIT: Don't invert loops with low iterations counts~~ JIT: Always compute loop iteration estimate in loop inversion if we have PGO data May 30, 2025

amanasifkhalid added 6 commits May 30, 2025 15:52

Merge branch 'loop-inversion-iteration-count' of https://github.com/a…

9aee3aa

…manasifkhalid/runtime into loop-inversion-iteration-count

Merge branch 'main' into loop-inversion-iteration-count

c08588c

Use loop entry-relative iteration computation

0e0ba90

Merge from main

414295f

Cleanup

c7083b6

Remove dead code

80cb61f

build-analysis Bot mentioned this pull request Aug 4, 2025

Timeout in HostFactoryResolverTests.NoSpecialEntryPointPatternCanRunInParallel #114704

Open

AndyAyersMS approved these changes Aug 5, 2025

amanasifkhalid merged commit 67541ba into dotnet:main Aug 5, 2025
103 of 105 checks passed

amanasifkhalid deleted the loop-inversion-iteration-count branch August 5, 2025 18:19

DrewScoggins mentioned this pull request Aug 7, 2025

[Perf] Linux/arm64: 1 Regression on 8/5/2025 9:21:29 PM +00:00 #118494

Open

github-actions Bot locked and limited conversation to collaborators Sep 5, 2025

amanasifkhalid merged 11 commits into dotnet:main from amanasifkhalid:loop-inversion-iteration-count
