
Matmul default scheduling #1775

Merged
Priya2698 merged 33 commits into main from pm/mma_default
Mar 1, 2024

Conversation

@Priya2698
Collaborator

@Priya2698 Priya2698 commented Feb 16, 2024

PR #1743 was reverted due to the following issues:

  1. Matmul scheduler does not support all architectures: This caused some tests to fail on V100. This is fixed by returning early in getMatmulHeuristics and only setting the required parameters.
  2. Errors in GpuLower::analysis: validateMma and PredicateElimination are modified to skip expressions and outputs marked for the expression evaluator.

@Priya2698
Collaborator Author

!build

[&fusion](Val* out) {
  return (fusion->getOutputAlias(out).type != AllocationType::Evaluate);
});
traverseTo(outs_requiring_codegen);
Collaborator Author


This removes outputs marked for EE from predicate elimination to avoid errors in lowering analysis.


void MatmulScheduler::schedule(Fusion* fusion) {
  FUSER_PERF_SCOPE("Schedule Matmul Fusion");
  // Skip scheduling if Matmul will be expression evaluated.
Collaborator Author


Moving this from scheduleMatmul to schedule.

auto params = std::make_shared<MatmulParams>();

// Set kernel index mode
params->cparams.index_type = runtime_info.getIndexType();
Collaborator Author


index_type is needed in compileFusion and needs to be set before returning. This avoids errors on architectures not supported by the matmul scheduler (see the getMmaOp function).

@Priya2698
Collaborator Author

!build

csrc/options.h Outdated
MemoryPromotion, //! Enable promotion of memory types for non-pointwise ops
StaticFusionCount, //! Enable using single static count in kernel name
WarnRegisterSpill, //! Enable warnings of register spill
MatmulExprEval, //! Enable ATen evaluation for Matmul
Collaborator


I think we need to make a louder statement here regarding the impact of this flag.

Enabling MatmulExprEval means that we are running the entire fusion containing a matmul with expression evaluation, not just the matmul portion.

@protonu protonu requested review from protonu and removed request for protonu February 22, 2024 20:39
@Priya2698
Collaborator Author

!build

1 similar comment
@Priya2698
Collaborator Author

!build

@Priya2698 Priya2698 marked this pull request as ready for review February 23, 2024 23:36
@Priya2698
Collaborator Author

!build

@Priya2698 Priya2698 changed the title from "[WIP] Matmul default scheduling" to "Matmul default scheduling" Feb 23, 2024
@@ -90,10 +90,14 @@ enum class AllocationType : int {
// For example, the tensor storing BatchNorm's running mean. The output EMA is
// updated in place.
InplaceUpdate,
Collaborator Author


Renaming NoAlias->New and InplaceUpdate->ReuseBuffer will be done in a separate PR to avoid extraneous changes in this PR.

@Priya2698
Collaborator Author

!build

Collaborator

@wujingyue wujingyue left a comment


LGTM with comments

Comment on lines +1017 to +1018
default:
  NVF_ERROR(false, "Unrecognized AllocationType.");
Collaborator


I doubt this is needed. Compilers today should be smart enough to figure out that the above cases cover all possible options.

Collaborator Author


Added this to avoid the compiler error `error: control reaches end of non-void function`, which fires even though the switch-case is completely handled.

@Priya2698
Collaborator Author

!build

@Priya2698 Priya2698 merged commit 5cdbfc5 into main Mar 1, 2024
@Priya2698 Priya2698 deleted the pm/mma_default branch March 1, 2024 05:53
@Priya2698 Priya2698 mentioned this pull request Mar 1, 2024
@Priya2698 Priya2698 mentioned this pull request Mar 25, 2024
Priya2698 added a commit that referenced this pull request Apr 2, 2024
Adds ATen evaluation for Matmul and Matmul + Bias. Based on PR #1921,
when evaluating a `castOp`, we _look back_ to see if there is a
preceding MmaOp and evaluate them together.

Issue #1775.
Priya2698 added a commit that referenced this pull request Apr 26, 2024
This PR resolves Issue #1812.
1. Removes the workarounds added to support ATen evaluation for matmuls
   in PR #1775 (in `predicate_elimination.cpp` and `validation.cpp`; see
   comments in the PR).
2. `FusionExecutor::fusion_` is initialized only when compilation is
   skipped, so only one of `lowered_` or `fusion_` is non-null.
3. `FusionKernelRuntime/FusionExecutorCache/FusionExecutor::isCompiled()`
   indicates whether compilation was attempted (either marked for EE or
   via nvFuser).

---------
Co-authored-by: Ryan Spring <rspring@nvidia.com>
