!build
The failure of the JIT binary tests on H100 is due to a system problem:
!build
!build
zasdfgbnm
left a comment
Not done reading yet. Just posting some minor comment so far.
const ValGraph& iel_graph,
const std::unordered_map<ValGroup, IterDomain*>& iel_promotion_map,
const std::unordered_map<ValGroup, ValGroups>& exact_covered_ids,
const VectorOfUniqueEntries<IterDomain*>& terminal_loop_ids) {
Why do we only consider terminal ids? Is it because of forwarding and we only want to consider the most forwarded ID?
If an ID is not terminal, there is a consumer ID that is also in the same loop group, which in turn means the non-terminal ID should never be the representative ID of this loop group. So, yes, we should only look for the most forwarded IDs.
That said, I believe it is not required to consider only terminal IDs since the check around line 1278 should exclude non-terminal IDs anyway.
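The terminal-ID rule described above can be sketched with a toy graph. This is only an illustration under stated assumptions: IDs are plain integers, `consumers` is a hypothetical map from an ID to the IDs derived from it, and `terminalIds` is an invented helper, not an nvFuser function.

```cpp
#include <map>
#include <set>
#include <vector>

// Toy stand-ins (assumptions, not nvFuser types): an ID is an int, and
// ConsumerMap maps each ID to the IDs that are derived from it.
using Id = int;
using ConsumerMap = std::map<Id, std::vector<Id>>;

// Returns the IDs of `loop_group` that have no consumer inside the same
// group. Only these can be candidates for the representative ID.
std::set<Id> terminalIds(
    const std::set<Id>& loop_group,
    const ConsumerMap& consumers) {
  std::set<Id> terminal;
  for (Id id : loop_group) {
    bool has_consumer_in_group = false;
    auto it = consumers.find(id);
    if (it != consumers.end()) {
      for (Id consumer : it->second) {
        if (loop_group.count(consumer) > 0) {
          has_consumer_in_group = true;
          break;
        }
      }
    }
    if (!has_consumer_in_group) {
      terminal.insert(id);
    }
  }
  return terminal;
}
```

In a chain 0 -> 1 -> 2 -> 3 where the loop group is {0, 1, 2}, only ID 2 is terminal: its single consumer, 3, lies outside the group.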
Not done yet, but I believe I should post early, because I have a real worry #1830 (comment) to discuss.
tv2->getRootDomain().at(0),
tv3->getRootDomain().at(0),
tv3->getRootDomain().at(1),
getParentId(tv3->axis(0), 4),
nit: would it be more readable if we use the following?
- getParentId(tv3->axis(0), 4),
+ getChildId(tv3->getRootDomain().at(0), 1),
They are equivalent, just easier to find in the drawing. In general, I think we should make the second argument of getParentId or getChildId as small as possible.
Agree it'd be simpler, but since a replay may add a new expr using the same domain, traversing from parents to children may not always work.
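The asymmetry behind this reply can be sketched with a toy graph. The sketch assumes a simplified model in which every node has at most one parent but any number of uses; `Node`, `addUse`, `parentOf`, and `childOf` are hypothetical names for illustration and are not the test helpers from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy node (an assumption, not nvFuser's IterDomain): at most one parent,
// but any number of uses.
struct Node {
  Node* parent = nullptr;
  std::vector<Node*> uses;
};

// Records `child` as a use of `parent`, as a transformation or replay would.
void addUse(Node& parent, Node& child) {
  child.parent = &parent;
  parent.uses.push_back(&child);
}

// Walks n steps towards the root. The result is unique because each node
// has at most one parent in this model.
Node* parentOf(Node* id, int n) {
  for (int i = 0; i < n; ++i) {
    assert(id != nullptr && id->parent != nullptr);
    id = id->parent;
  }
  return id;
}

// Walks n steps away from the root, taking the use at `use_index` at each
// step. The result depends on which use is picked once a node has several.
Node* childOf(Node* id, int n, std::size_t use_index = 0) {
  for (int i = 0; i < n; ++i) {
    assert(id != nullptr && use_index < id->uses.size());
    id = id->uses[use_index];
  }
  return id;
}
```

If a replay adds a second use to the same node, `parentOf` still returns the same node from either child, while `childOf` needs an extra index to say which use is meant.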
getParentId(tv4->axis(0), 4)},
getParentId(tv4->axis(0), 4)},
nit: would it be more readable if we use the following?
- getParentId(tv4->axis(0), 4)},
- getParentId(tv4->axis(0), 4)},
+ getChildId(tv4->getRootDomain().at(0), 1)},
+ getChildId(tv4->getRootDomain().at(0), 1)},
getParentId(tv2->axis(0), 3),
getParentId(tv3->axis(0), 3),
getParentId(tv4->axis(0), 3)},
getParentId(tv4->axis(0), 3)},
nit: would it be more readable if we use:
- getParentId(tv2->axis(0), 3),
- getParentId(tv3->axis(0), 3),
- getParentId(tv4->axis(0), 3)},
- getParentId(tv4->axis(0), 3)},
+ getChildId(tv2->getRootDomain().at(0), 1),
+ getChildId(tv3->getRootDomain().at(2), 1),
+ getChildId(tv3->getRootDomain().at(2), 1)},
+ getChildId(tv3->getRootDomain().at(2), 1)},
zasdfgbnm
left a comment
Finished reading. Don't see any blocker.
!build
Thanks for the review @zasdfgbnm. I'll merge the PR once the tests are done. I'll update the doc as well.
This is the final step of the loop promotion analysis. The promotion map is almost completed at Step 3, but some partially inlined domains need one more propagation, which is done by Step 4 and Step 5. Step 5 is mostly just a repeat of Step 3. This basically concludes the loop promotion analysis, although there are a couple of issues that were found while working on indexing (#2218). Those issues will be addressed as further follow-up PRs.
- Step 1: #1650
- Step 2: #1777
- Step 3: #1830
- Step 4: #2003
Step 3 of the loop promotion analysis.
Stacked on top of #1777.
After this, there'll be PRs for steps 4 and 5. They should be relatively small PRs, as they mostly repeat steps 2 and 3.