Schedule loop domains such that reshape transforms are cancelled#3679
Merged
Conversation
naoyam (Author) commented:
!test
naoyam (Author) commented:
!test
naoyam commented on Jan 9, 2025:
      const std::vector<TensorView*>& tvs,
-     Expr* transform);
+     Expr* transform,
+     Direction replay_dir = Direction::Undefined);
naoyam (Author) replied:
Added an optional parameter to restrict the direction.
naoyam (Author) commented:
!test
jacobhinkle approved these changes on Jan 10, 2025, and left a comment:
LGTM. Just one small clarifying question.
// as t0, which could minimize strided accesses.
//
// This scheduling is not always feasible. Specifically, if a reshape
// output iter domain is resized, the loop domain needs to keep using
jacobhinkle:
Could you give an example of what the output would look like if some transforms can't be cancelled? For example, if we had a further tensor

// t4 = pad(t3) // [i1, i0*i2 + 2]

then we cannot cancel anything. Presumably, if there is a decomposition of the reshape where one part is cancellable and another part is not, we will cancel the part that is possible to cancel.
naoyam (Author) replied:
How about this test?
The reshape for tv1 is not cancelled as the following slice depends on it, whereas the tv3 reshape is cancelled.
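To make the distinction concrete, here is a toy sketch in plain Python (the `cancellable` helper is hypothetical and not part of nvFuser's API) of the rule being discussed: a reshape can be cancelled in the loop domain only if no downstream op, such as a slice or pad, resizes its output iter domains.

```python
# Toy model (hypothetical helper, not nvFuser's API): decide whether a
# reshape's loop-domain cancellation is legal, given the ops that consume
# the reshape's output.

def cancellable(ops_after_reshape):
    """A reshape can be cancelled only if no later op resizes
    (slices or pads) the reshape's output iter domains."""
    return not any(op in ("slice", "pad") for op in ops_after_reshape)

# tv1's reshape is followed by a slice that depends on it -> keep the reshape.
print(cancellable(["slice"]))  # False
# tv3's reshape has no resizing consumer -> cancel it.
print(cancellable([]))  # True
```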
naoyam (Author) commented:
!build
naoyam (Author) commented:
!build
naoyam added a commit that referenced this pull request on Jan 15, 2025:
Depends on #3674, #3675, #3679. Reorder tensors to align with the largest input. This should improve memory accesses by minimizing strides. Store throughput may be lowered, but it is generally more important to optimize load accesses. I do not have actual performance results for this change; I just remember this was effective in some cases while manually trying out different optimization strategies. We may eventually need to enable or disable this reordering with some heuristic.
naoyam added a commit that referenced this pull request on Jul 25, 2025:
…4823) I decided to disable the cancellation of reshape in the resize scheduler. It was originally added in #3679. It results in about a 10% perf regression in the RoPE benchmarks (http://nv/eO-). The optimization should be re-enabled, but rather than ad-hoc patching, I feel we should investigate fixing the root cause of the issue, which is cycles in the exact graph. Tracking issue: #4839
nsarka pushed a commit to nsarka/Fuser that referenced this pull request on Jul 28, 2025 (same message as the commit above, with NVIDIA#-prefixed references).
This PR adds a scheduling primitive, cancelReshapeInLoopDomains(TensorView* from_tv), which effectively cancels, in their loop domains, all reshape transforms appearing between from_tv and fusion outputs. Please see the comment for a motivating example. This could be used to remove the restriction on interfering reshapes in reduction/normalization fusions.
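As a rough illustration of the idea (a toy model only: the `Tensor` class and function names below are hypothetical, not nvFuser's actual API), cancelling a reshape in a consumer's loop domain means scheduling the consumer to iterate over the producer's pre-reshape iter domains, while its logical (root) domain still reflects the reshape output:

```python
# Toy model of "reshape cancellation" in loop domains.
# All names are hypothetical; this is not the nvFuser API.

class Tensor:
    def __init__(self, root_domains, producer=None):
        self.root = list(root_domains)  # logical domains, e.g. ["i0", "i1"]
        self.loop = list(root_domains)  # loop domain initially mirrors root
        self.producer = producer

def reshape_merge_all(tv, merged_name):
    # Model a reshape that merges all of tv's domains into one output domain.
    return Tensor([merged_name], producer=tv)

def cancel_reshape_in_loop_domain(tv):
    # "Cancel" the reshape for scheduling purposes: the consumer's loop
    # domain reuses the producer's (pre-reshape) iter domains, so the
    # generated loop nest matches the producer's layout.
    tv.loop = list(tv.producer.root)

t0 = Tensor(["i0", "i1"])
t1 = reshape_merge_all(t0, "i0*i1")   # t1 = reshape(t0) -> [i0*i1]
cancel_reshape_in_loop_domain(t1)
print(t1.root)  # ['i0*i1']    -- the logical shape is unchanged
print(t1.loop)  # ['i0', 'i1'] -- but t1 is iterated like t0
```

The point of the sketch is only the split between the two domains: the reshape still exists logically, but the loop structure no longer pays for it.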