naoyam
commented
Jul 25, 2024
  EXPECT_TRUE(tv->axis(0)->isBroadcast());
}

// Repro of unswitch predicate issue #681
Collaborator
Author
Just moved to test_indexing.cpp
5424eb8 to 40f8097
7943d93 to 4852041
Collaborator
Author
!build
zasdfgbnm
approved these changes
Jul 26, 2024
Base automatically changed from
idmodel_indexing_circular_buffer_predicate
to
main
July 27, 2024 01:21
naoyam
commented
Jul 27, 2024
  return loop_promotion_map_it->second;
}

// Check if unswitching a given for-loop actually matters. For example,
Collaborator
Author
@zasdfgbnm Forgot to add this code. I already merged this, but let me know if you have any concerns.
naoyam
added a commit
that referenced
this pull request
Nov 21, 2025
Enabling TensorIndexer for the transpose tests. Diff check revealed some mismatches, but I believe they are benign. As far as I can see, all of them are caused by slight differences related to issue #681. TensorIndexer's workaround is simpler but not as precise as the legacy indexer's, as indicated in #2689.

For example, here is an index for a predicate of `TransposeTest.FusionScheduleTransposeSimple`:

```
( ( ( ( blockIdx.x % ( ( ceilDiv(( (( (( getMetaData(T0) )).logical_size ))[2] ), 32) ) * ( ceilDiv(( (( (( getMetaData(T0) )).logical_size ))[1] ), 32) ) ) ) / ( ceilDiv(( (( (( getMetaData(T0) )).logical_size ))[1] ), 32) ) ) * 32 ) + (threadIdx.x * 4) % 32)
```

The above is generated by the legacy indexer. With TensorIndexer, it's generated as:

```
( ( ( ( blockIdx.x % ( ( ceilDiv(( (( (( getMetaData(T0) )).logical_size ))[2] ), 32) ) * ( ceilDiv(( (( (( getMetaData(T0) )).logical_size ))[1] ), 32) ) ) ) / ( ceilDiv(( (( (( getMetaData(T0) )).logical_size ))[1] ), 32) ) ) * 32 ) + 0 )
```

The difference is `(threadIdx.x * 4) % 32` vs. `0`. Note that this is a predicate for the condition at the start position, i.e., the indices are predicated with `>= 0`, and as such the index of TensorIndexer is more conservative, so the legacy indexer may have a higher chance of using the fast path created by the unswitch predicate. However, in practice, both predicates should always be true, so it shouldn't matter. I don't see any actual difference with stop predicates.
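To make the "more conservative" claim concrete, here is a minimal C++ sketch. It is not nvFuser code: `legacyStartIndex`, `tensorIndexerStartIndex`, `startPredicate`, and the scanned ranges are hypothetical, and `base` stands in for the shared `( ... ) * 32` term. `base` is allowed to go negative here only so the implication becomes visible; in the real kernel it is built from `blockIdx.x` and is never negative.

```cpp
#include <cstdint>

// Hypothetical stand-in for the legacy indexer's start index:
// the shared base term plus the per-thread offset.
int64_t legacyStartIndex(int64_t base, int64_t tid_x) {
  return base + (tid_x * 4) % 32;
}

// Hypothetical stand-in for TensorIndexer's start index:
// the per-thread offset is replaced with 0.
int64_t tensorIndexerStartIndex(int64_t base) {
  return base + 0;
}

// A start predicate checks that the index is non-negative.
bool startPredicate(int64_t index) {
  return index >= 0;
}

// Returns true if, over the scanned range, the TensorIndexer predicate
// passing always implies that the legacy predicate passes too.
bool tensorIndexerImpliesLegacy() {
  for (int64_t base = -64; base <= 64; ++base) {
    for (int64_t tid_x = 0; tid_x < 128; ++tid_x) {
      bool ti = startPredicate(tensorIndexerStartIndex(base));
      bool legacy = startPredicate(legacyStartIndex(base, tid_x));
      if (ti && !legacy) {
        return false;
      }
    }
  }
  return true;
}

// Counts cases where only the legacy predicate passes, i.e., where the
// legacy form would be the less conservative of the two.
int64_t countLegacyOnlyPasses() {
  int64_t count = 0;
  for (int64_t base = -64; base <= 64; ++base) {
    for (int64_t tid_x = 0; tid_x < 128; ++tid_x) {
      bool ti = startPredicate(tensorIndexerStartIndex(base));
      bool legacy = startPredicate(legacyStartIndex(base, tid_x));
      if (legacy && !ti) {
        ++count;
      }
    }
  }
  return count;
}
```

Because the per-thread offset is never negative, `base + 0 >= 0` implies `base + offset >= 0`, but not the other way around. With a non-negative `base`, both predicates hold for every thread, which matches the observation that the difference should not matter in practice.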
(Stacked on top of #2677)
The original issue is #681. It was addressed in #687.
This PR is NOT as comprehensive as #687, but my gut feeling is that this should be good enough, in particular since contig indexing would avoid backward traversals through merge in many cases.
I'll do a final, more comprehensive comparison with the legacy indexing once contig indexing is done.
Since the original PR and issue were reviewed by @zasdfgbnm, could you please review this too?