IdModel: Step 5 of the loop promotion analysis #2220

Merged
naoyam merged 5 commits into main from idmodel_step5
May 10, 2024


naoyam (Collaborator) commented May 8, 2024

This is the final step of the loop promotion analysis. The promotion map is almost complete after Step 3, but some partially inlined domains need one more round of propagation, which Steps 4 and 5 perform. Step 5 is mostly a repeat of Step 3.

This essentially concludes the loop promotion analysis, although a couple of issues were found while working on indexing (#2218). They will be addressed in follow-up PRs.

naoyam added the idmodel label May 8, 2024
naoyam force-pushed the idmodel_step5 branch 2 times, most recently from 4eecddf to aa2c9aa May 10, 2024 00:14
naoyam (Collaborator, Author) commented May 10, 2024

!build

  bool build_graphs,
- bool allow_self_mapping) {
+ bool allow_self_mapping)
+ : allow_self_mapping_(allow_self_mapping) {
naoyam (Collaborator, Author) commented:

I think this was just accidentally forgotten.
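The effect of a forgotten member initializer can be sketched with two toy classes. These are hypothetical stand-ins, not nvFuser's `IdModel`, and the sketch assumes the member has a default value of `false`, which the diff above does not confirm.

```cpp
#include <cassert>

// Hypothetical classes for illustration only; they are not nvFuser's IdModel.

// Variant that accepts the flag but never stores it, so the member keeps
// its default value regardless of what the caller passed.
class ModelWithoutInit {
 public:
  explicit ModelWithoutInit(bool allow_self_mapping) {
    (void)allow_self_mapping;  // the argument is silently dropped
  }
  bool allowsSelfMapping() const { return allow_self_mapping_; }

 private:
  bool allow_self_mapping_ = false;  // assumed default, not taken from the PR
};

// Variant with the member initializer, mirroring the shape of the fix.
class ModelWithInit {
 public:
  explicit ModelWithInit(bool allow_self_mapping)
      : allow_self_mapping_(allow_self_mapping) {}
  bool allowsSelfMapping() const { return allow_self_mapping_; }

 private:
  bool allow_self_mapping_ = false;
};
```

Under these assumptions, `ModelWithoutInit(true).allowsSelfMapping()` still returns `false`, while `ModelWithInit(true).allowsSelfMapping()` returns `true`.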


s5_loop_graph = idGraph(IdMappingMode::LOOP);
s5_loop_promotion_map =
updateValGroupIdMap(s5_loop_promotion_map, s5_loop_graph);
naoyam (Collaborator, Author) commented:

This whole function is mostly a copy of IdModel::buildLoopPromotionMap, modified to save intermediate results for validation. I'll clean up this part of the code after this PR.

naoyam changed the title from "[WIP] IdModel: Step 5 of the loop promotion analysis" to "IdModel: Step 5 of the loop promotion analysis" May 10, 2024
naoyam marked this pull request as ready for review May 10, 2024 06:22
naoyam requested a review from zasdfgbnm May 10, 2024 06:22
zasdfgbnm (Collaborator) commented:

Should this be added back?
#1968

naoyam (Collaborator, Author) commented May 10, 2024

> Should this be added back? #1968

Thanks. Added back.

Comment on lines +67 to +73
// LOOP mode is important to resolve inlined broadcasts. If we have something
// like: consumer[i0o, threadIdx.x{i0i}] = producer[i0o,
// threadIdx.y{i0i}](computeAt = 1) which can easily happen when using shared
// memory. Loop is actually defined for all iteration domains, and resembles
// groups of iter domains that are effectively inlined with each other.
// Therefore iter domains that are a common dependency of inlined leaf domains
// may be loop mapped together.
zasdfgbnm (Collaborator) commented:

I would be very interested in seeing when the loop promotion we have is strictly required, and when it is just one way to do things. For example, suppose I have this fusion:

T0[1, 4]
T1[3, 4] = T0[1, 4]

T1->reorder({{0, 1}});
T1->merge(0);
T1->split(2, inner=false);
propagate;
T0->inlineAt(1);

Then for this specific case, loop promotion is not necessary, because (i, j) in the leaf domain of T1 is ((i*6+j)%3, (i*6+j)/3) in T1's root domain. For T0, without loop promotion, (i, k) in the leaf domain is (0, i*2+k) in T0's root domain. According to Theorem 2.15.1 in https://github.com/NVIDIA/Fuser/blob/main/doc/math/integer-division.md, (i*6+j)/3 = i*2+j/3, which has a very similar mathematical form to i*2+k: with j in [0, 6), j/3 takes the values 0 and 1, the same range as k. This similarity tells us that each i takes the same slice of T0 and T1, so the program is valid even without loop promotion.
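The index argument can be checked by brute force. The sketch below is plain C++ and does not use nvFuser. It assumes leaf extents of [2, 6] for T1 and [2, 2] for T0, as implied by the formulas i*6+j and i*2+k, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <utility>

// Extents assumed from the example: after reorder, merge, and an outer split
// by 2, the leaf domain of T1 is [2, 6] and the leaf domain of T0 is [2, 2].
constexpr int kOuter = 2;
constexpr int kT1Inner = 6;
constexpr int kT0Inner = 2;

// Root-domain index of T1 for leaf index (i, j).
std::pair<int, int> t1RootIndex(int i, int j) {
  const int merged = i * kT1Inner + j;
  return {merged % 3, merged / 3};
}

// Root-domain index of T0 for leaf index (i, k); the broadcast axis is 0.
std::pair<int, int> t0RootIndex(int i, int k) {
  return {0, i * kT0Inner + k};
}

// Returns true if, for every outer index i, each column of T1 that is
// accessed equals i*2 + j/3 and is also a column T0 covers for that same i.
bool sameSlicePerOuterIndex() {
  for (int i = 0; i < kOuter; ++i) {
    for (int j = 0; j < kT1Inner; ++j) {
      const int col = t1RootIndex(i, j).second;
      if (col != i * 2 + j / 3) {
        return false;
      }
      bool found = false;
      for (int k = 0; k < kT0Inner; ++k) {
        if (t0RootIndex(i, k).second == col) {
          found = true;
        }
      }
      if (!found) {
        return false;
      }
    }
  }
  return true;
}
```

For these extents `sameSlicePerOuterIndex()` returns `true`: i = 0 touches columns 0 and 1 of both tensors, and i = 1 touches columns 2 and 3.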

naoyam (Collaborator, Author) commented:

That's very interesting. 🤯

naoyam merged commit b574238 into main May 10, 2024
naoyam deleted the idmodel_step5 branch May 10, 2024 17:25
naoyam added a commit that referenced this pull request May 21, 2024
No logic change. Mostly mechanical cleanup.

Replaced the test-specific IdModel subclass with a callback interface.
The callback interface makes it possible to save all the temporary results
needed for validation, so `buildLoopPromotionMap` is no longer duplicated.
(Related comment: #2220 (comment))

To introduce the callback interface, moved the loop promotion part out
of `IdModel` to its own builder class.
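
A callback interface of the kind described in this commit message might look roughly like the sketch below. Every name here (`StepCallback`, `PromotionBuilderSketch`, `RecordingCallback`, `postStep`) is hypothetical, the string-keyed map is only a placeholder for the real promotion map, and none of this is nvFuser's actual interface. The sketch only shows how a builder can report intermediate results to a test without the test subclassing or copying the builder.

```cpp
#include <cassert>
#include <map>
#include <string>

// All names below are hypothetical; this is not nvFuser's actual interface.
using PromotionMap = std::map<std::string, std::string>;

// Observer invoked after each analysis step; the default does nothing.
class StepCallback {
 public:
  virtual ~StepCallback() = default;
  virtual void postStep(int /*step*/, const PromotionMap& /*result*/) {}
};

// Builder that runs five steps and reports each intermediate result.
class PromotionBuilderSketch {
 public:
  explicit PromotionBuilderSketch(StepCallback* callback = nullptr)
      : callback_(callback) {}

  PromotionMap build() {
    PromotionMap map;
    for (int step = 1; step <= 5; ++step) {
      // Placeholder work: a real analysis would refine the map here.
      map["group" + std::to_string(step)] = "promotion" + std::to_string(step);
      if (callback_ != nullptr) {
        callback_->postStep(step, map);
      }
    }
    return map;
  }

 private:
  StepCallback* callback_;
};

// Test-side callback that snapshots every intermediate map for validation.
class RecordingCallback : public StepCallback {
 public:
  void postStep(int step, const PromotionMap& result) override {
    snapshots[step] = result;
  }
  std::map<int, PromotionMap> snapshots;
};
```

A test would construct a `RecordingCallback`, pass its address to the builder, call `build()`, and then inspect `snapshots` for each step. Production code passes no callback and pays only for a null check per step.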