@jakobbotsch jakobbotsch commented Dec 12, 2023

Merge candidate identification and marking into a single lexical pass over the basic blocks. Use two bit vectors: one to track which loops we have already seen blocks of (so we know when we have found a loop's topmost block), and one to track which loops we have decided to align.
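
For readers who want a concrete picture of the single-pass idea, here is a minimal Python sketch. It is a simplified model and not the JIT's actual code: the `Block` type, `place_align`, `MIN_ALIGN_WEIGHT`, and the `ends_with_jmp` flag are all hypothetical names, the weight threshold is simply the `300` that appears in the jitdump in this thread, only the innermost loop of a block is tracked, and Python integers stand in for the bit vectors.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: threshold copied from the jitdump line "weight=400 >= 300".
MIN_ALIGN_WEIGHT = 300


@dataclass
class Block:
    name: str
    weight: int                   # scaled block weight as printed in the dump
    loop: Optional[int] = None    # index of the innermost containing loop, if any
    ends_with_jmp: bool = False   # unconditional jump that is not to the next block


def place_align(blocks):
    """Walk the blocks once in lexical order.

    Returns a dict mapping the block that carries the alignment padding
    to the loop-top block that the padding aligns.
    """
    seen_loops = 0      # bit i set: a block of loop i was already visited
    aligned_loops = 0   # bit i set: loop i was selected for alignment
    candidate = None    # cheapest jmp-terminated block outside aligned loops
    placed = {}

    for i, block in enumerate(blocks):
        if block.loop is not None and not (seen_loops >> block.loop) & 1:
            # First block of this loop in lexical order: its topmost block.
            seen_loops |= 1 << block.loop
            if block.weight >= MIN_ALIGN_WEIGHT and i > 0:
                aligned_loops |= 1 << block.loop
                # Prefer hiding the padding behind a jump; otherwise fall
                # back to the block just before the loop top.
                host = candidate if candidate is not None else blocks[i - 1]
                placed[host.name] = block.name
                candidate = None

        # Membership is checked per block, so interleaved layouts are fine.
        in_aligned = block.loop is not None and (aligned_loops >> block.loop) & 1
        if block.ends_with_jmp and not in_aligned:
            if candidate is None or block.weight < candidate.weight:
                candidate = block

    return placed


# Reduced block list modeled on the jitdump in this thread (hand-written).
DUMP_BLOCKS = [
    Block("BB22", 50, ends_with_jmp=True),
    Block("BB46", 50),
    Block("BB47", 400, loop=0),
    Block("BB48", 400, loop=0),
    Block("BB49", 50),
    Block("BB50", 50),
    Block("BB59", 200, loop=0, ends_with_jmp=True),
    Block("BB51", 50),
    Block("BB52", 50),
    Block("BB53", 400, loop=1),
    Block("BB54", 50),
]

print(place_align(DUMP_BLOCKS))
```

In this model BB59 is never picked as a host because the aligned-loops bit for its loop is already set when the walk reaches it, even though two non-loop blocks sit between it and the rest of the loop.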

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 12, 2023
@ghost ghost assigned jakobbotsch Dec 12, 2023
ghost commented Dec 12, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch jakobbotsch marked this pull request as ready for review January 4, 2024 14:37
jakobbotsch commented Jan 4, 2024

cc @dotnet/jit-contrib PTAL @kunalspathak

This produces a few minor diffs with alignment enabled. They happen because the old logic wasn't correct for tracking if a block was in an aligned loop in the presence of interleaved loops or loop blocks interleaved with non-loop blocks.
For example, here is a jitdump diff:

 ***************  (New) Natural loop graph
 L00 header: BB47
   Members (3): [BB47..BB48];BB59
   Entry: BB46 -> BB47
   Exit: BB47 -> BB51; BB48 -> BB49
   Back: BB59 -> BB47
 L01 header: BB53
   Members (1): BB53
   Entry: BB52 -> BB53
   Exit: BB53 -> BB54
   Back: BB53 -> BB53
 
-Aligning L00 that starts at BB47, weight=400 >= 300.
-Aligning L01 that starts at BB53, weight=400 >= 300.
-Inside placeLoopAlignInstructions for 2 loop candidates.
 BB22, bbWeight=50 ends with unconditional 'jmp' 
+Aligning L00 that starts at BB47, weight=400 >= 300.
 Marking BB22 that ends with unconditional jump with BBF_HAS_ALIGN for loop at BB47
-BB59, bbWeight=200 ends with unconditional 'jmp' 
-Marking BB59 that ends with unconditional jump with BBF_HAS_ALIGN for loop at BB53
+Aligning L01 that starts at BB53, weight=400 >= 300.
+Marking BB52 before the loop with BBF_HAS_ALIGN for loop at BB53
+Found 2 candidates for loop alignment
 
 *************** Finishing PHASE Place 'align' instructions
 Trees after Place 'align' instructions
 
 -----------------------------------------------------------------------------------------------------------------------------------------
 BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]         [flags]
 -----------------------------------------------------------------------------------------------------------------------------------------
 BB01 [0000]  1                             1       [000..00E)-> BB56 ( cond )                     i LIR hascall gcsafe 
 BB02 [0002]  1       BB01                  0.50    [00F..021)-> BB04 ( cond )                     i LIR gcsafe 
 BB03 [0003]  1       BB02                  0.50    [021..03A)        (return)                     i LIR jmp hascall gcsafe idxlen 
 BB04 [0004]  1       BB02                  0.50    [03A..045)-> BB25 ( cond )                     i LIR gcsafe 
 BB05 [0005]  1       BB04                  0.50    [045..056)-> BB23 ( cond )                     i LIR gcsafe 
 BB06 [0006]  1       BB05                  0.50    [056..066)-> BB14 ( cond )                     i LIR gcsafe idxlen 
 BB07 [0007]  1       BB06                  0.50    [000..08A)-> BB14 (always)                     i LIR gcsafe nullcheck q 
 BB14 [0008]  2       BB06,BB07             0.50    [08A..09A)-> BB22 ( cond )                     i LIR gcsafe idxlen 
 BB15 [0009]  1       BB14                  0.50    [000..0BE)-> BB22 (always)                     i LIR gcsafe nullcheck q 
 BB22 [0010]  2       BB14,BB15             0.50    [0BE..0ED)-> BB55 (always)                     i LIR gcsafe idxlen has-align 
 BB23 [0011]  1       BB05                  0.50    [0ED..0FB)-> BB25 ( cond )                     i LIR hascall gcsafe 
 BB24 [0012]  1       BB23                  0.50    [0FB..114)        (return)                     i LIR jmp hascall gcsafe idxlen 
 BB25 [0013]  2       BB04,BB23             0.50    [114..126)-> BB27 ( cond )                     i LIR gcsafe 
 BB26 [0014]  1       BB25                  0.50    [126..145)        (return)                     i LIR jmp hascall gcsafe idxlen 
 BB27 [0015]  1       BB25                  0.50    [145..155)-> BB35 ( cond )                     i LIR gcsafe idxlen 
 BB28 [0016]  1       BB27                  0.50    [000..179)-> BB35 (always)                     i LIR gcsafe nullcheck q 
 BB35 [0017]  2       BB27,BB28             0.50    [179..189)-> BB43 ( cond )                     i LIR gcsafe idxlen 
 BB36 [0018]  1       BB35                  0.50    [000..1AD)-> BB43 (always)                     i LIR gcsafe nullcheck q 
 BB43 [0019]  2       BB35,BB36             0.50    [1AD..1B7)-> BB46 ( cond )                     i LIR gcsafe 
 BB44 [0020]  1       BB43                  0.50    [1B7..1CA)-> BB46 ( cond )                     i LIR gcsafe idxlen 
 BB45 [0021]  1       BB44                  0.50    [1CA..200)-> BB56 (always)                     i LIR gcsafe idxlen 
 BB46 [0022]  2       BB43,BB44             0.50    [200..207)-> BB47 (always)                     i LIR gcsafe LoopPH q 
 BB47 [0023]  2       BB46,BB59             4     0 [207..20F)-> BB51 ( cond )                     i LIR Loop bwd bwd-target align loopheader 
 BB48 [0024]  1       BB47                  4     0 [20F..21E)-> BB59 ( cond )                     i LIR idxlen bwd bwd-src 
 BB49 [0025]  1       BB48                  0.50    [21E..222)-> BB51 ( cond )                     i LIR 
 BB50 [0026]  1       BB49                  0.50    [222..23B)        (return)                     i LIR jmp hascall idxlen 
-BB59 [0069]  1       BB48                  2       [???..???)-> BB47 (always)                     LIR internal bwd has-align 
+BB59 [0069]  1       BB48                  2       [???..???)-> BB47 (always)                     LIR internal bwd 
 BB51 [0027]  2       BB47,BB49             0.50    [23B..248)-> BB54 ( cond )                     i LIR 
-BB52 [0066]  1       BB51                  0.50    [248..???)-> BB53 (always)                     LIR internal idxlen LoopPH q 
+BB52 [0066]  1       BB51                  0.50    [248..???)-> BB53 (always)                     LIR internal idxlen LoopPH has-align q 
 BB53 [0028]  2       BB52,BB53             4     1 [248..27C)-> BB53 ( cond )                     i LIR Loop idxlen bwd bwd-target align loopheader 
 BB54 [0030]  2       BB51,BB53             0.50    [27C..292)-> BB55 (always)                     i LIR idxlen q 
 BB55 [0065]  2       BB22,BB54             0.50    [292..2A1)-> BB56 (always)                     i LIR idxlen q 
 BB56 [0063]  3       BB01,BB45,BB55        0.50    [???..???)        (return)                     LIR keep internal 
 BB57 [0067]  0                             0       [???..???)        (throw )                     i LIR rare keep internal 
 BB58 [0068]  0                             0       [???..???)        (throw )                     i LIR rare keep internal 
 -----------------------------------------------------------------------------------------------------------------------------------------

Notice that the base inserts alignment inside BB59, which is part of the previous loop, L00, that was also aligned. L00 has Members (3): [BB47..BB48];BB59, so its physical blocks are interleaved with two non-loop blocks (BB49 and BB50).
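
To make the interleaving concrete, the following small Python sketch lists the non-member blocks that sit lexically between a loop's first and last member. The helper name `interleaved_blocks` is hypothetical, and the block order and loop members are copied by hand from the dump above.

```python
def interleaved_blocks(lexical_order, members):
    """Return non-member blocks lying between the loop's first and last member."""
    positions = [i for i, name in enumerate(lexical_order) if name in members]
    if not positions:
        return []
    first, last = positions[0], positions[-1]
    return [name for name in lexical_order[first:last + 1] if name not in members]


# Lexical order of the relevant blocks, as printed in the block table.
LEXICAL_ORDER = ["BB46", "BB47", "BB48", "BB49", "BB50",
                 "BB59", "BB51", "BB52", "BB53", "BB54"]
L00 = {"BB47", "BB48", "BB59"}
L01 = {"BB53"}

print(interleaved_blocks(LEXICAL_ORDER, L00))
print(interleaved_blocks(LEXICAL_ORDER, L01))
```

A non-empty result for L00 means that a check which treats a loop as one contiguous lexical range would be unreliable for it, whereas a per-block membership test is not affected by the layout.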

@kunalspathak

> the old logic wasn't correct for tracking if a block was in an aligned loop in the presence of interleaved loops or loop blocks interleaved with non-loop blocks.

The old logic meaning the one that was added in #95836?

@kunalspathak kunalspathak left a comment

LGTM. Thanks!

@jakobbotsch

> The old logic meaning the one that was added in #95836?

No, the logic was preexisting. Presumably, before the new loop representation, most loops were contiguous by the time placeLoopAlignInstructions ran, and the logic worked fine in those cases.

@jakobbotsch jakobbotsch merged commit e565285 into dotnet:main Jan 4, 2024
@jakobbotsch jakobbotsch deleted the loop-alignment-cleanup branch January 4, 2024 20:05
@github-actions github-actions bot locked and limited conversation to collaborators Feb 4, 2024