-
Notifications
You must be signed in to change notification settings - Fork 79
Enable resize scheduler by default #3848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
404d6b6
1b8fc0f
861c187
12d0e85
643779d
4f28c44
5f851ea
d7c9af1
4cb7fbd
e1b1790
c79e234
c5cc1ba
cc4f1c2
4994d84
cd59f29
5dc95f4
b979fc5
339aac3
a08ce8a
1d50e73
10d37d5
d72668e
4f17092
6499911
d9fa74e
9f6a79f
5dfb2ba
8ee1966
2a56bbd
90e13e3
0c0d9b8
18f3265
4cca2f4
89022b0
2315b48
0e9bc84
015c730
e0b3955
ea5bc58
c1c33b0
0c5c6f8
501beb3
4e8b9ce
875a92d
6336c7c
7b972c5
f473ecb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -528,18 +528,15 @@ void scheduleLoopDomainsBy( | |
| } | ||
| } | ||
|
|
||
| // It should be either: all of the inputs found and none of the | ||
| // outputs found, or none of the inputs found and all of the | ||
| // outputs found. | ||
| // If all of the inputs are found, the tranform expr is replayed as | ||
| // a forward op. | ||
| Direction replay_dir_tv = Direction::Undefined; | ||
| if (replay_dir != Direction::Backward && | ||
| input_ids.size() == transform->inputs().size()) { | ||
| NVF_ERROR(output_ids.empty()); | ||
|
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I originally thought this should be always the case but that isn't actually case. In particular, there can be a
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nitpick: we should remove the comment above on line 525-527 then. |
||
| replay_dir_tv = Direction::Forward; | ||
| } else if ( | ||
| replay_dir != Direction::Forward && | ||
| output_ids.size() == transform->outputs().size()) { | ||
| NVF_ERROR(input_ids.empty()); | ||
| replay_dir_tv = Direction::Backward; | ||
| } else { | ||
| // Replay not possible since none of inputs nor outputs are connected with | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -76,7 +76,8 @@ void propagateResizeToInputs(Expr* resize_tensor_op) { | |
| continue; | ||
| } | ||
|
|
||
| scheduler_tools::scheduleLoopDomainsBy(tvs_to_schedule, resize); | ||
| scheduler_tools::scheduleLoopDomainsBy( | ||
| tvs_to_schedule, resize, Direction::Forward); | ||
|
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added the direction option to make it more explicit as it's always forward transformations. |
||
| } | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3285,6 +3285,16 @@ def fusion_func(fd: FusionDefinition) -> None: | |
| nvf_out, _ = self.exec_nvfuser(fusion_func, inputs, supports_segmentation=False) | ||
| # self.assertEqual(nvf_out[0], t24) | ||
|
|
||
| # This fusion takes a long time to segment and schedule | ||
|
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Without this, the python tests took about 2 hours to complete. This is a WAR @rdspring1 suggested. |
||
| # because of the resized extents, which seem to stress the | ||
| # expression simplifier a lot. Serializing this fusion would | ||
| # significantly increase the test time as it would be | ||
| # deserialized every time, which includes segmentation and | ||
| # scheduling. Ideally, we should optimize the expression | ||
| # simplifier, but for now resetting the cache should avoid the | ||
| # issue. | ||
| FusionCache.reset() | ||
|
|
||
| # Test that symbolic IterDomains can be concatenated | ||
| # https://github.com/NVIDIA/Fuser/issues/1554 | ||
| def test_cat_symbolic(self): | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remind me why we were using
largest_inputas vectorization reference in the first place?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to mention this, but this was an actually a bug. It should have been changed in #3955.