Visit all Expr outputs during concretization by jacobhinkle · Pull Request #591 · NVIDIA/Fuser

jacobhinkle · 2023-07-14T19:20:19Z

This change ensures that concretization visits all outputs to active expressions. Even though unused outputs do not directly affect the result of the Fusion, they still must not have Symbolic IterDomains during scheduling. Note that this pre-empts the more complicated #420.

This fixes the immediate cause of #418. However, that test still fails due to another issue where scalars get lost during segmentation. The test included here only tests that concretization works properly; scheduling and execution will be tested in another PR.


jacobhinkle · 2023-07-14T19:21:45Z

csrc/dynamic_transform.cpp

+    if (tv->definition()) {
+      for (auto outp :
+           ir_utils::filterByType<TensorView>(tv->definition()->outputs())) {
+        if (visited_tvs.find(outp) == visited_tvs.end()) {
+          mutate(outp);
+          visited_tvs.insert(outp);
+        }
+      }
+    } else {
+      if (visited_tvs.find(tv) == visited_tvs.end()) {
+        mutate(tv);
+        visited_tvs.insert(tv);
+      }
+    }


This is awkward and inefficient, and this functionality should probably be implemented as an option to StmtSort::getStmts instead.

Pushed an alternative that adds a traverse_siblings option for IterVisitor and StmtSort.


jacobhinkle · 2023-07-14T22:23:50Z

csrc/scheduler/utils.cpp


  auto replay_exprs =
-      StmtSort::getExprs(tv->fusion(), {reference_id}, false, false);
+      StmtSort::getExprs(tv->fusion(), {reference_id}, false, false, false);


This was hitting the wrong overload and trying to convert {reference_id} to bool.

Which one are you referring to?

Oh, I realized there's a new overload in this PR

This makes me concerned as it's easy to make this mistake. How about changing the name to avoid the overload? Maybe getExprs and getExprsTo?

Great idea. I'll rename it.
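To illustrate the rename discussed above, here is a minimal sketch using toy types. ToyStmtSort, Val, and Fusion below are hypothetical stand-ins, not the real nvFuser classes, and the function bodies are placeholders. The hazard, as I understand overload resolution, is that a braced list holding a single pointer can convert to bool through a standard conversion, which ranks ahead of the user-defined conversion to std::vector. Giving the two entry points different names removes that competition.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the nvFuser types.
struct Val {
  std::string name;
};
struct Fusion {
  std::vector<Val*> vals;
};

struct ToyStmtSort {
  // Whole-fusion traversal: only bool flags follow the fusion pointer.
  static std::vector<Val*> getStmts(
      const Fusion* fusion,
      bool traverse_attributes = false,
      bool traverse_siblings = false) {
    assert(fusion != nullptr);
    (void)traverse_attributes; // flags are ignored in this toy
    (void)traverse_siblings;
    return fusion->vals;
  }

  // Targeted traversal: the distinct name means a braced list such as
  // {some_val} can only bind to the vector parameter.
  static std::vector<Val*> getStmtsTo(
      const Fusion* fusion,
      const std::vector<Val*>& to,
      bool traverse_attributes = false,
      bool traverse_siblings = false) {
    assert(fusion != nullptr);
    (void)fusion;
    (void)traverse_attributes;
    (void)traverse_siblings;
    return to; // toy behavior: echo the requested targets
  }
};
```

With this shape, a call like ToyStmtSort::getStmtsTo(&fusion, {&y}, false, false) has only one candidate, so adding or removing a trailing bool flag cannot silently redirect it to the whole-fusion variant.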

jacobhinkle · 2023-07-17T13:05:29Z

!build

naoyam · 2023-07-17T18:37:11Z

csrc/iter_visitor.cpp

+      if (traverse_siblings) {
+        // Add unvisited siblings to next_stmts
+        std::vector<Statement*> unvisited_sibs;
+        for (auto next_val : ir_utils::filterByType<Val>(next_stmts)) {
+          for (auto sib : ir_utils::siblingValsOf(next_val)) {
+            if (visited.find(sib) == visited.end()) {
+              // Push to separate vector so that we don't modify next_stmts
+              // while looping
+              unvisited_sibs.push_back(sib);
+            }
+          }
+        }
+        next_stmts.insert(
+            next_stmts.end(), unvisited_sibs.begin(), unvisited_sibs.end());
+      }


A couple of questions.

1. Do we want to visit siblings even when they are already in visited if traverse_all_paths is true?

2. More commonly, though, when we visit an input first and that input has a sibling, the sibling has most likely not been visited yet, whether or not it is orphaned. Wouldn't this change the traversal ordering? It would still be a topological order, but I'd be surprised if traverse_siblings changed the traversal order of a fusion with no orphaned siblings.

3. Can you please add unit tests just for this traversal with siblings?

1. Yeah, we should add them even if visited here I believe.

2. You're right that this flag changes the ordering. An alternative that would not change the ordering would be to add siblings to a vector called maybe_orphaned_siblings here, then after the while (!stmt_stack.empty()) {} loop we would loop over maybe_orphaned_siblings and call handle on those that are not found in visited. This would use some extra space to hold the vector of siblings, but would perform the same number of searches in visited.

3. Yes. I will add some tests for the generic traversal.

Re 2: The alternative sounds simpler and better to me. The extra space would be negligible.
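The deferred approach described above can be sketched on a toy graph. Everything here is an assumption made for illustration: Val, Expr, and visitOrder are hypothetical stand-ins rather than the IterVisitor implementation, and the traversal is a plain post-order walk from the requested outputs back to their inputs. The point it shows is that siblings are only recorded during the main loop and handled afterwards, so the order of the main traversal does not depend on the flag.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the nvFuser classes.
struct Expr;
struct Val {
  std::string name;
  Expr* definition = nullptr; // expression that produces this value, if any
};
struct Expr {
  std::vector<Val*> inputs;
  std::vector<Val*> outputs;
};

// Returns value names in a topological order that reaches every Val in `to`.
std::vector<std::string> visitOrder(
    const std::vector<Val*>& to,
    bool traverse_siblings) {
  struct Frame {
    Val* val;
    bool expanded; // true once the producers have been pushed
  };
  std::vector<std::string> order;
  std::unordered_set<const Val*> visited;
  std::vector<Val*> maybe_orphaned_siblings;
  std::vector<Frame> stack;
  for (auto it = to.rbegin(); it != to.rend(); ++it) {
    stack.push_back({*it, false});
  }
  while (!stack.empty()) {
    Frame f = stack.back();
    stack.pop_back();
    assert(f.val != nullptr);
    if (visited.count(f.val) != 0) {
      continue;
    }
    if (f.expanded) {
      // All producers were handled earlier, so this value can be emitted.
      visited.insert(f.val);
      order.push_back(f.val->name);
      continue;
    }
    stack.push_back({f.val, true});
    Expr* def = f.val->definition;
    if (def == nullptr) {
      continue;
    }
    for (auto it = def->inputs.rbegin(); it != def->inputs.rend(); ++it) {
      if (visited.count(*it) == 0) {
        stack.push_back({*it, false});
      }
    }
    if (traverse_siblings) {
      // Only record siblings here; handling them now would change the order.
      for (Val* out : def->outputs) {
        if (out != f.val) {
          maybe_orphaned_siblings.push_back(out);
        }
      }
    }
  }
  // Siblings never reached by the main loop have no active uses, so
  // appending them at the end keeps the order topological.
  for (Val* sib : maybe_orphaned_siblings) {
    if (visited.insert(sib).second) {
      order.push_back(sib->name);
    }
  }
  return order;
}
```

In this toy, a fusion whose multi-output expression has all outputs consumed yields the same order with the flag on or off, while an unused output is appended after the rest of the traversal.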

Pushed changes that address 1 and 2 above. Will add tests soon

Pushed tests, and fixed a bug causing siblings of outputs to not be traversed.


jacobhinkle · 2023-07-18T14:18:29Z

!build

jacobhinkle · 2023-07-18T16:56:01Z

test/test_gpu3.cpp

+  stmts = StmtSort::getStmtsTo(
+      &fusion,
+      {wf.n},
+      /*traverse_all_paths*/ false,
+      /*traverse_attributes*/ false,
+      /*traverse_siblings*/ true);


This test exposed a bug where siblings of the outputs passed in the "to" argument were not being traversed. Fixed now.

naoyam

LGTM. Thanks for the fix and overall improvement.

jacobhinkle added 2 commits July 14, 2023 15:08

Add test for concretization in repro of #418

7399d1d

This does not test execution, which is hitting a separate issue.

Explicitly mutate all expr outputs in concretization.

aa5e8d2

It would be better to address this in StmtSort/IterVisitor

jacobhinkle changed the title ~~Concretize all expr outputs~~ Visit all Expr outputs during concretization Jul 14, 2023

jacobhinkle commented Jul 14, 2023


Add traverse_siblings option to IterVisitor and StmtSort

cb48d7c

jacobhinkle marked this pull request as ready for review July 14, 2023 21:12

jacobhinkle requested a review from naoyam July 14, 2023 21:13

jacobhinkle added 2 commits July 17, 2023 08:16

Fix segfault from modifying vector while looping

d4987e6

Merge remote-tracking branch 'origin/main' into concretize_all_expr_o…

f6ed958

…utputs

jacobhinkle commented Jul 17, 2023


Merge branch 'main' into concretize_all_expr_outputs

796674e

naoyam reviewed Jul 17, 2023


jacobhinkle and others added 6 commits July 17, 2023 14:42

Rename overloads getExprsTo, getStmtsTo

cc388ad

This prevents accidental use of the wrong overload, wherein a vector of Vals might be converted to bool.

Handle orphaned siblings at end of loop

d19575d

This is still a topological ordering since these siblings have no active uses (by definition), and this method prevents changing the traversal ordering by changing this flag when there are no orphaned siblings in the Fusion.

Fix typo in traverseBetween

f12856d

Add IterVisitorTraverseSiblings_CUDA test

48157a1

Fix bug where siblings of outputs were not added

0cd1b25

Merge branch 'main' into concretize_all_expr_outputs

8c72e12

jacobhinkle commented Jul 18, 2023


naoyam approved these changes Jul 18, 2023


jacobhinkle added 2 commits July 18, 2023 19:15

Merge branch 'main' into concretize_all_expr_outputs

3400150

Merge branch 'main' into concretize_all_expr_outputs

58a1cee

jacobhinkle merged commit 766aa7b into main Jul 19, 2023

jacobhinkle deleted the concretize_all_expr_outputs branch July 19, 2023 01:54

jacobhinkle mentioned this pull request Jul 20, 2023

Concretize all dynamic outputs of downstream expressions #420

Closed
