Skip to content

Dynamic Resize of IterDomains#258

Merged
jacobhinkle merged 66 commits intoNVIDIA:mainfrom
jacobhinkle:dynamic_resize
May 16, 2023
Merged

Dynamic Resize of IterDomains#258
jacobhinkle merged 66 commits intoNVIDIA:mainfrom
jacobhinkle:dynamic_resize

Conversation

@jacobhinkle
Copy link
Collaborator

Fixes #256.

This extends the concretization/dynamic Fusion machinery introduced in #24 to include Resize expressions, which is how ops like pad, cat, and slice are represented.

@jacobhinkle
Copy link
Collaborator Author

Adding tests and need to iterate, but I think this is a start @naoyam

@naoyam
Copy link
Collaborator

naoyam commented Apr 28, 2023

Looks good overall. Added some comments

This should probably be refactored and reused in the original op in case
the inputs are all constant at definition.
Resized IterDomains are now replaced, but I am hitting a scheduling
error:

```
exception with description "!replay_has_rfactor_inp INTERNAL ASSERT
FAILED at "/opt/pytorch/nvfuser/csrc/transform_iter.cpp":519, please
report a bug to PyTorch. Error during replay, a transformation was
called that conflicts with an rfactor call.
Exception raised from BestEffortReplay at
/opt/pytorch/nvfuser/csrc/transform_iter.cpp:519 (most recent call
first):
```
@jacobhinkle jacobhinkle marked this pull request as ready for review May 1, 2023 17:12
Comment on lines +940 to +941
// TODO: The following fails with a SIGFPE in innerReductionHeuristic
//{{3, 5}, {-3, -2}, false}, // output is zero-dimensional
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #264

This fixes a subtle bug where all C++ tests were passing, but for
example `test_cat` in `test_python_frontend.py` failed. The exact same
test would fail in Python that succeeded in C++. The reason is that in
the Python frontend, a `FusionExecutorCache` is initialized _before_
defining the fusion, i.e. with an empty `Fusion`. Doing the check for
dynamic transforms in the constructor then for `FusionExecutorCache`
lead to always finding a static (empty) transform. Instead, this commit
adds the `isDynamic()` method which will cache its results. It should
only be used somewhere inside `runFusionWithInputs` which indicates that
the definition of the `Fusion` is complete.
@jacobhinkle
Copy link
Collaborator Author

!build

// Test full negative shifts, so output doesn't overlap input
{{3, 5},
{-5, 2},
false}, // TODO: why doesn't this miss due to concretize to broadcast?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is expected in this case?

This change replaces uses of IterDomainBuilder which previously required
registerMutation. This is because replacing an IterDomain in a
TensorDomain requires creation of an entire new TensorDomain and
swapping it out in the enclosing TensorView. This change simplifies that
by introducing a protected method IterDomain::setIterType, and making
DynamicTransformConcretizer a friend of IterDomain. Then we can simply
change the iter_type and not require any changes to the TensorDomain or
TensorView. This also means we do not need to modify the resize()
command since we will not call it in order to create the concretized
IterDomain for resize().
This was a hold-over from when I was doing more complicated
replacements. Since we now just change the IterType to Broadcast or
Iteration during concretization, there is not additional complexity to
consider, so this function is not needed.
Copy link
Collaborator

@naoyam naoyam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support dynamic resize

2 participants