Visit all Expr outputs during concretization #591

Merged: jacobhinkle merged 14 commits into main from concretize_all_expr_outputs on Jul 19, 2023
Collaborator

@jacobhinkle jacobhinkle commented Jul 14, 2023

This change ensures that concretization visits all outputs of active expressions. Even though unused outputs do not directly affect the result of the Fusion, they still must not have Symbolic IterDomains during scheduling. Note that this pre-empts the more complicated #420.

This fixes the immediate cause of #418. However, that test still fails due to another issue where scalars get lost during segmentation. The test included here only tests that concretization works properly; scheduling and execution will be tested in another PR.
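
To make the problem concrete, here is a minimal sketch of why a backward traversal from the fusion outputs can miss sibling outputs of a multi-output expression. `Val`, `Expr`, `concretizeFrom`, and `countSymbolicAfter` are hypothetical toy stand-ins, not nvFuser's classes or API, and the `symbolic` flag only stands in for "still has Symbolic IterDomains".

```cpp
#include <set>
#include <vector>

// Toy stand-ins for nvFuser's Val and Expr (hypothetical, heavily simplified).
struct Expr;
struct Val {
  Expr* definition = nullptr;  // expression producing this value, if any
  bool symbolic = true;        // stands in for "still has Symbolic IterDomains"
};
struct Expr {
  std::vector<Val*> inputs;
  std::vector<Val*> outputs;
};

// Clears the symbolic flag on every value reachable backward from `roots`.
// When include_siblings is true, every output of each visited definition is
// processed as well, even outputs that nothing consumes.
void concretizeFrom(const std::vector<Val*>& roots, bool include_siblings) {
  std::set<Val*> visited;
  std::vector<Val*> stack(roots.begin(), roots.end());
  while (!stack.empty()) {
    Val* v = stack.back();
    stack.pop_back();
    if (!visited.insert(v).second) {
      continue;  // already handled
    }
    v->symbolic = false;
    if (v->definition == nullptr) {
      continue;  // a fusion input: nothing further upstream
    }
    for (Val* in : v->definition->inputs) {
      stack.push_back(in);
    }
    if (include_siblings) {
      for (Val* out : v->definition->outputs) {
        stack.push_back(out);
      }
    }
  }
}

// Builds x -> welford -> {avg, var, n}, where only avg is a fusion output,
// and returns how many of the four values are still symbolic afterwards.
int countSymbolicAfter(bool include_siblings) {
  Val x, avg, var, n;
  Expr welford{{&x}, {&avg, &var, &n}};
  avg.definition = &welford;
  var.definition = &welford;
  n.definition = &welford;

  concretizeFrom({&avg}, include_siblings);

  std::vector<Val*> all{&x, &avg, &var, &n};
  int remaining = 0;
  for (Val* v : all) {
    if (v->symbolic) {
      ++remaining;
    }
  }
  return remaining;
}
```

In this toy graph, `countSymbolicAfter(false)` leaves the unused `var` and `n` outputs symbolic, while `countSymbolicAfter(true)` reaches all four values.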

This does not test execution, which is hitting a separate issue.
It would be better to address this in StmtSort/IterVisitor
@jacobhinkle jacobhinkle changed the title from "Concretize all expr outputs" to "Visit all Expr outputs during concretization" Jul 14, 2023
Comment on lines +462 to +475
if (tv->definition()) {
  for (auto outp :
       ir_utils::filterByType<TensorView>(tv->definition()->outputs())) {
    if (visited_tvs.find(outp) == visited_tvs.end()) {
      mutate(outp);
      // Record the sibling that was just mutated, not tv itself
      visited_tvs.insert(outp);
    }
  }
} else {
  if (visited_tvs.find(tv) == visited_tvs.end()) {
    mutate(tv);
    visited_tvs.insert(tv);
  }
}
Collaborator Author

This is awkward and inefficient, and this functionality should probably be implemented as an option to StmtSort::getStmts instead.

Collaborator Author

Pushed an alternative that adds a traverse_siblings option for IterVisitor and StmtSort.

@jacobhinkle jacobhinkle marked this pull request as ready for review July 14, 2023 21:12
@jacobhinkle jacobhinkle requested a review from naoyam July 14, 2023 21:13

  auto replay_exprs =
-     StmtSort::getExprs(tv->fusion(), {reference_id}, false, false);
+     StmtSort::getExprs(tv->fusion(), {reference_id}, false, false, false);
Collaborator Author

This was hitting the wrong overload and trying to convert {reference_id} to bool.

Collaborator

Which one are you referring to?

Collaborator

Oh, I realized there's a new overload in this PR

Collaborator

This makes me concerned as it's easy to make this mistake. How about changing the name to avoid the overload? Maybe getExprs and getExprsTo?

Collaborator Author

Great idea. I'll rename it.
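
A minimal sketch of the rename discussed in this thread, assuming toy types. `Val`, `Fusion`, `getVals`, `getValsTo`, and the two `count*` helpers are hypothetical and only mimic the shape of the `getExprs` / `getExprsTo` split; they are not the real StmtSort signatures. The pitfall being avoided is that, when both functions share one name, a call that passes a braced list such as `{reference_id}` in the second position can be matched against a `bool` parameter of the whole-fusion overload instead of the `to` vector.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins; the real StmtSort API is richer than this.
struct Val {
  int id;
};
struct Fusion {
  std::vector<Val*> vals;  // all values, already in topological order
};

// Whole-fusion traversal: every parameter after `fusion` is a bool flag.
std::vector<Val*> getVals(
    Fusion* fusion,
    bool /*traverse_all_paths*/ = false,
    bool /*traverse_attributes*/ = false,
    bool /*traverse_siblings*/ = false) {
  return fusion->vals;
}

// Targeted traversal under a distinct name, so a braced list of Vals can only
// ever bind to `to` and never to one of the bool flags above.
std::vector<Val*> getValsTo(
    Fusion* fusion,
    const std::vector<Val*>& to,
    bool /*traverse_all_paths*/ = false,
    bool /*traverse_attributes*/ = false,
    bool /*traverse_siblings*/ = false) {
  std::vector<Val*> result;
  // Toy behaviour: keep only the requested targets, in fusion order.
  for (Val* v : fusion->vals) {
    for (Val* target : to) {
      if (v == target) {
        result.push_back(v);
        break;
      }
    }
  }
  return result;
}

// Calls the targeted function with a braced list, as in the diff above.
std::size_t countTargeted() {
  Val a{0}, b{1}, c{2};
  Fusion fusion{{&a, &b, &c}};
  return getValsTo(&fusion, {&b}, false, false, false).size();
}

// Calls the whole-fusion function with flags only.
std::size_t countAll() {
  Val a{0}, b{1}, c{2};
  Fusion fusion{{&a, &b, &c}};
  return getVals(&fusion, false, false).size();
}
```

With separate names, dropping or adding a trailing flag changes only which defaults apply, not which function is selected.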

@jacobhinkle
Collaborator Author

!build

Comment on lines +232 to +246
if (traverse_siblings) {
  // Add unvisited siblings to next_stmts
  std::vector<Statement*> unvisited_sibs;
  for (auto next_val : ir_utils::filterByType<Val>(next_stmts)) {
    for (auto sib : ir_utils::siblingValsOf(next_val)) {
      if (visited.find(sib) == visited.end()) {
        // Push to separate vector so that we don't modify next_stmts
        // while looping
        unvisited_sibs.push_back(sib);
      }
    }
  }
  next_stmts.insert(
      next_stmts.end(), unvisited_sibs.begin(), unvisited_sibs.end());
}
Collaborator

A couple of questions.

  • Do we want to visit siblings even when they are in visited if traverse_all_paths is true?
  • More commonly, though: when we visit an input first and it has a sibling, that sibling would most likely not have been visited yet, whether or not it is orphaned. Wouldn't this change the traversal ordering? It would still be a topological order, but I'd be surprised if traverse_siblings changed the traversal order of a fusion with no orphaned siblings.
  • Can you please add unit tests just for this traversal with siblings?

Collaborator Author

  1. Yeah, we should add them even if visited here I believe.
  2. You're right that this flag changes the ordering. An alternative that would not change the ordering would be to add siblings to a vector called maybe_orphaned_siblings here, then after the while (!stmt_stack.empty()) {} loop we would loop over maybe_orphaned_siblings and call handle on those that are not found in visited. This would use some extra space to hold the vector of siblings, but would perform the same number of searches in visited.
  3. Yes. I will add some tests for the generic traversal.
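
The alternative in point 2 can be sketched roughly as follows, on a toy graph rather than nvFuser's IterVisitor. `Val`, `Expr`, `visit`, `sortVals`, and `demoOrder` are hypothetical names for illustration, and the recursive post-order here merely stands in for the real stack-based loop. Siblings are collected during the main traversal and handled only afterwards if they were never visited.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for nvFuser's Val and Expr (hypothetical, heavily simplified).
struct Expr;
struct Val {
  explicit Val(std::string n) : name(std::move(n)) {}
  std::string name;
  Expr* definition = nullptr;
};
struct Expr {
  std::vector<Val*> inputs;
  std::vector<Val*> outputs;
};

// Post-order visit: producers are appended to `order` before consumers.
// Siblings are only recorded here, never visited, so the main ordering is
// the same whether or not siblings are handled later.
void visit(
    Val* v,
    std::set<Val*>& visited,
    std::vector<Val*>& order,
    std::vector<Val*>& maybe_orphaned_siblings) {
  if (!visited.insert(v).second) {
    return;
  }
  if (v->definition != nullptr) {
    for (Val* in : v->definition->inputs) {
      visit(in, visited, order, maybe_orphaned_siblings);
    }
    for (Val* sib : v->definition->outputs) {
      if (sib != v) {
        maybe_orphaned_siblings.push_back(sib);
      }
    }
  }
  order.push_back(v);
}

std::vector<Val*> sortVals(const std::vector<Val*>& to, bool traverse_siblings) {
  std::set<Val*> visited;
  std::vector<Val*> order;
  std::vector<Val*> maybe_orphaned_siblings;
  for (Val* target : to) {
    visit(target, visited, order, maybe_orphaned_siblings);
  }
  if (traverse_siblings) {
    // Index loop: visit() may append to the vector while we iterate.
    // Already-visited siblings return immediately inside visit().
    for (std::size_t i = 0; i < maybe_orphaned_siblings.size(); ++i) {
      visit(maybe_orphaned_siblings[i], visited, order, maybe_orphaned_siblings);
    }
  }
  return order;
}

// Builds x -> welford -> {avg, var, n} and avg -> add -> y, sorts up to y,
// and returns the visited names joined by spaces.
std::string demoOrder(bool traverse_siblings) {
  Val x("x"), avg("avg"), var("var"), n("n"), y("y");
  Expr welford{{&x}, {&avg, &var, &n}};
  Expr add{{&avg}, {&y}};
  avg.definition = &welford;
  var.definition = &welford;
  n.definition = &welford;
  y.definition = &add;

  std::string joined;
  for (Val* v : sortVals({&y}, traverse_siblings)) {
    if (!joined.empty()) {
      joined += ' ';
    }
    joined += v->name;
  }
  return joined;
}
```

In this sketch the orphaned `var` and `n` are appended after the regular traversal, so the prefix of the ordering is identical with and without the flag. The result stays topological because a sibling shares its definition, and therefore its inputs, with a value that was already visited.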

Collaborator

Re 2: The alternative sounds simpler and better to me. The extra space would be negligible.

Collaborator Author

Pushed changes that address 1 and 2 above. Will add tests soon.

Collaborator Author

Pushed tests, and fixed a bug causing siblings of outputs to not be traversed.

jacobhinkle and others added 6 commits July 17, 2023 14:42

This prevents accidental use of the wrong overload, wherein a vector of Vals might be converted to bool.

This is still a topological ordering since these siblings have no active uses (by definition), and this method prevents changing the traversal ordering by changing this flag when there are no orphaned siblings in the Fusion.

@jacobhinkle
Collaborator Author

!build

Comment on lines +9333 to +9338
stmts = StmtSort::getStmtsTo(
    &fusion,
    {wf.n},
    /*traverse_all_paths*/ false,
    /*traverse_attributes*/ false,
    /*traverse_siblings*/ true);
Collaborator Author

@jacobhinkle jacobhinkle Jul 18, 2023

This test exposed a bug where siblings of outputs in to were not being traversed. Fixed now.

Collaborator

@naoyam naoyam left a comment

LGTM. Thanks for the fix and overall improvement.

@jacobhinkle jacobhinkle merged commit 766aa7b into main Jul 19, 2023
@jacobhinkle jacobhinkle deleted the concretize_all_expr_outputs branch July 19, 2023 01:54