Layout propagation (Part 2) - Enable #1755
Merged
jjsjann123 merged 78 commits into main on Feb 21, 2024
Test failures are not caused by this code change and have been cleaned up.
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
…yout_propagation_pr_0
wujingyue approved these changes Feb 20, 2024
jjsjann123 added a commit that referenced this pull request Feb 20, 2024
Stacked PRs:
#1755 enabling layout propagation through runtime
#1792 propagation rule for broadcast
#1790 propagation rule for binary op
==== #1788 adding layout inference pass **_<- this one_**

What's in this PR:
inferenceAllocationOrder pass that works on an entire Fusion:
- It computes AllocationOrder on inputs by looking at each TensorView's allocation_domain and rfactor_domain;
- It uses a predefined rule (in AllocationOrderInferencer) to traverse and propagate AllocationOrder from inputs to the entire fusion;

Note that the pass itself doesn't mutate the fusion IR. It's just a utility function that suggests ways to specify allocation domain to be used by other optimization passes.

- [x] adding inferenceAllocationOrder pass function;
- [x] adding propagate rule for unary op;
- [x] adding cpp test to verify propagation rule;

Quick design doc: #1756

Future Work:
* expanding propagation rule to cover more operations;

---------

Co-authored-by: Jacob Hinkle <1454944+jacobhinkle@users.noreply.github.com>
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
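The input step and the unary rule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the nvFuser implementation: `Domain`, `computeAllocationOrder`, and `propagateUnary` are hypothetical names, axes are modeled as plain strings, and the sketch assumes an allocation order is the permutation that maps rfactor positions to allocation positions (entry `i` is the rfactor index of the `i`-th allocation axis).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins, not nvFuser types: an axis is just a name.
using Domain = std::vector<std::string>;
// Assumed convention: order[i] is the rfactor position of the i-th allocation axis.
using AllocationOrder = std::vector<int64_t>;

// Derive the allocation order of an input from its two domains.
AllocationOrder computeAllocationOrder(
    const Domain& rfactor,
    const Domain& allocation) {
  // This sketch only handles allocation domains that permute the rfactor domain.
  assert(rfactor.size() == allocation.size());
  AllocationOrder order;
  order.reserve(allocation.size());
  for (const std::string& axis : allocation) {
    auto it = std::find(rfactor.begin(), rfactor.end(), axis);
    assert(it != rfactor.end());
    order.push_back(static_cast<int64_t>(it - rfactor.begin()));
  }
  return order;
}

// Unary op: the output has the same axes as the input, so the recorded
// order is forwarded unchanged.
AllocationOrder propagateUnary(const AllocationOrder& input_order) {
  return input_order;
}
```

Under these assumptions, an rfactor domain `[N, C, H, W]` allocated as `[N, H, W, C]` (channels-last) yields the order `{0, 2, 3, 1}`, and a unary op on that tensor keeps the same order.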
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
!build
The CI failure is unrelated. Merging as-is.
jjsjann123 added a commit that referenced this pull request Feb 21, 2024
Stacked PRs:
#1755 enabling layout propagation through runtime
#1792 propagation rule for broadcast
==== #1790 propagation rule for binary op **_<- this one_**
#1788 adding layout inference pass

What's in this PR:
BinaryOp propagation tries to merge the allocation order of both inputs:
* when only one operand is a tensor, we just forward the recorded allocation order
* when both operands are tensors, we resolve it by:
  i. prioritizing the tensor with fewer broadcast iterdomains;
  ii. otherwise, propagating the allocation order of lhs.

Propagation rule for binary operation:
- [x] adding propagate rule for binary op;
- [x] handling two scalar;
- [x] handling intermediate tensors (factory tensor);
- [x] adding cpp test to verify propagation rule;

---------

Co-authored-by: Jacob Hinkle <1454944+jacobhinkle@users.noreply.github.com>
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
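The binary-op resolution described in this commit message might look like the sketch below. It is an assumption-laden illustration rather than the code in #1790: `Operand` and `propagateBinary` are hypothetical, a scalar is modeled with `is_tensor == false`, and the two-scalar case is assumed to produce an empty order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using AllocationOrder = std::vector<int64_t>;

// Hypothetical operand description, not an nvFuser type.
struct Operand {
  bool is_tensor;         // false models a scalar operand
  AllocationOrder order;  // meaningful only when is_tensor is true
  int64_t num_broadcast;  // number of broadcast iterdomains
};

AllocationOrder propagateBinary(const Operand& lhs, const Operand& rhs) {
  assert(lhs.num_broadcast >= 0 && rhs.num_broadcast >= 0);
  // Two scalars: nothing to propagate (assumed to mean an empty order).
  if (!lhs.is_tensor && !rhs.is_tensor) {
    return {};
  }
  // Exactly one tensor operand: forward its recorded order.
  if (!rhs.is_tensor) {
    return lhs.order;
  }
  if (!lhs.is_tensor) {
    return rhs.order;
  }
  // Both tensors: prefer the one with fewer broadcast iterdomains,
  // falling back to lhs on a tie.
  return rhs.num_broadcast < lhs.num_broadcast ? rhs.order : lhs.order;
}
```

Preferring the operand with fewer broadcast iterdomains presumably keeps the layout of the tensor that actually carries data along more dimensions; the lhs fallback simply makes ties deterministic.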
jjsjann123 added a commit that referenced this pull request Feb 21, 2024
Stacked PRs:
#1755 enabling layout propagation through runtime
==== #1792 propagation rule for broadcast **_<- this one_**
#1790 propagation rule for binary op
#1788 adding layout inference pass

What's in this PR:
BroadcastOp propagation tries to push all new broadcast iterdomains as outer dimensions for the output tensor.

- [x] adding propagate rule for broadcast op;
- [x] adding cpp test to verify propagation rule;

---------

Co-authored-by: Jacob Hinkle <1454944+jacobhinkle@users.noreply.github.com>
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
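One way to read the broadcast rule above is sketched below. This is an illustration under assumptions, not the code in #1792: `propagateBroadcast` is a hypothetical helper, `is_broadcast[i]` marks output axis `i` as newly introduced by the broadcast, and an allocation order lists axis indices from outermost to innermost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using AllocationOrder = std::vector<int64_t>;

// in_order: allocation order of the broadcast input (over input axes).
// is_broadcast: one flag per output axis, true for a new broadcast axis.
AllocationOrder propagateBroadcast(
    const AllocationOrder& in_order,
    const std::vector<bool>& is_broadcast) {
  std::vector<int64_t> in_to_out;  // input axis -> output axis
  AllocationOrder out_order;
  for (std::size_t i = 0; i < is_broadcast.size(); ++i) {
    if (is_broadcast[i]) {
      // New broadcast axes go first, i.e. outermost in the allocation.
      out_order.push_back(static_cast<int64_t>(i));
    } else {
      in_to_out.push_back(static_cast<int64_t>(i));
    }
  }
  assert(in_to_out.size() == in_order.size());
  // The remaining axes keep the relative order recorded on the input.
  for (int64_t axis : in_order) {
    out_order.push_back(in_to_out.at(static_cast<std::size_t>(axis)));
  }
  return out_order;
}
```

For an input order `{0, 2, 1}` and flags `{true, false, false, true, false}`, the sketch returns `{0, 3, 1, 4, 2}`: the two new broadcast axes (0 and 3) come first, followed by the input axes remapped to output positions 1, 4, 2.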
Stacked PRs:
==== #1755 enabling layout propagation through runtime <- this one
#1792 propagation rule for broadcast
#1790 propagation rule for binary op
#1788 adding layout inference pass
What's in this PR:
Enabling the MemoryFormat optimization pass at runtime. The pass runs as part of the pre_segment optimization pass.
Adding cpp test to verify optimization behavior
Quick design doc: #1756
TODOs: