clean normalization_inner by liqiangxl · Pull Request #923 · NVIDIA/Fuser

liqiangxl · 2023-09-21T19:57:31Z

(1) Transformed private static functions in the InnerPersistentKernelScheduler class into utility functions within an anonymous namespace.
(2) Removed checks/calculations that are unrelated to the inner persistent scheduler.
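
As a rough illustration of the refactoring pattern in (1), the sketch below moves a helper out of a class and into an anonymous namespace. It is not the nvFuser code: `ToyTensor`, `haveSameReductionAxes`, and `ToyInnerScheduler` are hypothetical names invented for this example. It only shows that a helper with internal linkage needs no declaration in the header, while the public entry point remains a class member.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor that carries reduction axes.
// This is NOT the nvFuser TensorView type.
struct ToyTensor {
  std::vector<bool> is_reduction_axis;
};

namespace {

// Free helper with internal linkage: it is visible only in this translation
// unit, so it does not have to be declared in the scheduler's header.
// Precondition: reduction_tvs is not empty.
bool haveSameReductionAxes(const std::vector<ToyTensor>& reduction_tvs) {
  assert(!reduction_tvs.empty());
  for (std::size_t i = 1; i < reduction_tvs.size(); ++i) {
    if (reduction_tvs[i].is_reduction_axis !=
        reduction_tvs[0].is_reduction_axis) {
      return false;
    }
  }
  return true;
}

} // namespace

// Hypothetical scheduler class: only the public entry point stays a member.
class ToyInnerScheduler {
 public:
  static bool canSchedule(const std::vector<ToyTensor>& reduction_tvs) {
    return !reduction_tvs.empty() && haveSameReductionAxes(reduction_tvs);
  }
};
```

With this layout, changing the helper's signature touches only the .cpp file, which is one common motivation for preferring anonymous-namespace functions over private static members.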

liqiangxl · 2023-09-21T21:04:01Z

csrc/scheduler/normalization_inner.cpp

+bool checkReductionPattern(
+    Fusion* fusion,
+    const std::vector<TensorView*>& reduction_tvs) {
+  // Use root domain map to check the reduction ops have the same axes


(1) Transformed from a static function in the InnerPersistentKernelScheduler class into a utility function within an anonymous namespace.
(2) Removed checks that were unrelated to inner persistence.

liqiangxl · 2023-09-21T21:04:28Z

csrc/scheduler/normalization_inner.cpp

+    Fusion* fusion,
+    SchedulerRuntimeInfo& runtime_info,
+    HeuristicSummary* data_cache,
+    const std::vector<TensorView*>& reduction_tvs) {


(1) Transformed from a static function in the InnerPersistentKernelScheduler class into a utility function within an anonymous namespace.
(2) Removed checks that were unrelated to inner persistence.

liqiangxl · 2023-09-21T21:05:06Z

csrc/scheduler/normalization_inner.cpp


-  if (inner_reduction_tvs.empty() || !outer_reduction_tvs.empty()) {
+  if (reduction_type != reduction_scheduler_utils::ReductionType::Inner) {
    scheduler_debug_utils::canScheduleRejectReason(


Simplify the reduction type check using a utility function.
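
The equivalence between the old two-list condition and the new single comparison can be sketched as follows. Only `ReductionType::Inner` appears in the diff above; the other enumerators, `ToyReduction`, `classify`, and `oldStyleRejects` are assumptions made for this illustration and are not taken from reduction_scheduler_utils.

```cpp
#include <cassert>
#include <vector>

// Hypothetical enum for illustration; only Inner is confirmed by the diff.
enum class ReductionType { Inner, Outer, InnerOuter, None };

// Hypothetical stand-in for a reduction tensor.
struct ToyReduction {
  bool fastest_dim_reduction; // true when the innermost axis is reduced
};

// Classify a set of reductions by whether inner and/or outer ones occur.
ReductionType classify(const std::vector<ToyReduction>& reduction_tvs) {
  bool has_inner = false;
  bool has_outer = false;
  for (const ToyReduction& tv : reduction_tvs) {
    if (tv.fastest_dim_reduction) {
      has_inner = true;
    } else {
      has_outer = true;
    }
  }
  if (has_inner && has_outer) {
    return ReductionType::InnerOuter;
  }
  if (has_inner) {
    return ReductionType::Inner;
  }
  return has_outer ? ReductionType::Outer : ReductionType::None;
}

// Mirrors the shape of the removed condition:
// inner_reduction_tvs.empty() || !outer_reduction_tvs.empty()
bool oldStyleRejects(const std::vector<ToyReduction>& reduction_tvs) {
  int inner_count = 0;
  int outer_count = 0;
  for (const ToyReduction& tv : reduction_tvs) {
    (tv.fastest_dim_reduction ? inner_count : outer_count) += 1;
  }
  return inner_count == 0 || outer_count > 0;
}

// Both formulations reject exactly the same inputs.
void checkEquivalence(const std::vector<ToyReduction>& reduction_tvs) {
  assert(
      oldStyleRejects(reduction_tvs) ==
      (classify(reduction_tvs) != ReductionType::Inner));
}
```

Under these assumptions, the single comparison against `ReductionType::Inner` rejects the empty, outer-only, and mixed cases, exactly as the removed condition did.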

liqiangxl · 2023-09-21T21:06:05Z

csrc/scheduler/normalization_inner.cpp

-  // the iter domain of the persistent reduction.
-  if (!properties.fastest_dim_reduction &&
-      !(norm_per_sm >= warp_size / 2 ||
-        max_multi_reduction_factor >= warp_size)) {


Remove checks not used by the inner persistent scheduler.

liqiangxl · 2023-09-21T21:06:59Z

csrc/scheduler/normalization_inner.cpp

  // If the persistence requires over half the device don't do grid
  // persistence as we can't overlap the grid comms.
  if (required_sm_per_norm >
-      scheduler_utils::safeDiv(device_multiprocessor_count, 3)) {


Changed the divisor from 3 to 2 to match the code comment ("over half the device").
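
To see what the divisor change means in practice, here is a small sketch of the threshold comparison. `safeDivSketch` is a hypothetical stand-in that assumes `scheduler_utils::safeDiv` is integer division clamped to at least 1; that behavior, the function `exceedsGridPersistenceLimit`, and the SM counts used in the note below are assumptions for illustration, not values from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed behavior: integer division whose result is clamped to at least 1.
// Precondition: y is not zero.
std::int64_t safeDivSketch(std::int64_t x, std::int64_t y) {
  assert(y != 0);
  return std::max(x / y, std::int64_t{1});
}

// Returns true when one normalization needs more SMs than the allowed
// fraction of the device, i.e. when grid persistence would be rejected.
bool exceedsGridPersistenceLimit(
    std::int64_t required_sm_per_norm,
    std::int64_t device_multiprocessor_count,
    std::int64_t divisor) {
  return required_sm_per_norm >
      safeDivSketch(device_multiprocessor_count, divisor);
}
```

For a hypothetical device with 108 SMs, a divisor of 3 gives a limit of 36 and a divisor of 2 gives a limit of 54, so a normalization that requires 40 SMs is rejected with the old divisor and accepted with the new one.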

liqiangxl · 2023-09-21T21:07:31Z

csrc/scheduler/normalization_inner.cpp

-}
-
-bool InnerPersistentKernelScheduler::canScheduleRunTimeOuter(
-    Fusion* fusion,


liqiangxl · 2023-09-21T21:08:13Z

csrc/scheduler/normalization_inner.cpp

-  }
-
-  return rparams;
-}


Heuristics for outer and innerOuter are removed.

liqiangxl · 2023-09-21T21:37:58Z

!build

liqiangxl · 2023-09-21T23:42:52Z

!build

liqiangxl · 2023-09-22T11:07:29Z

!build

liqiangxl · 2023-09-22T15:47:19Z

!build

liqiangxl · 2023-09-22T15:48:57Z

!build

liqiangxl · 2023-09-23T13:26:51Z

!build

naoyam

LGTM. Just in case, please do the diff check. It's currently disabled in CI, so needs to be done manually.

liqiangxl · 2023-09-23T18:33:12Z

Previous HEAD position was 8facc54 Define TensorMap for CUDA 11 (#932)
Switched to branch 'llu/clean_inner_norm'
Your branch is up to date with 'origin/llu/clean_inner_norm'.

DIFF RESULT:

No difference found
Already on 'llu/clean_inner_norm'
Your branch is up to date with 'origin/llu/clean_inner_norm'.

naoyam · 2023-09-23T18:40:46Z

Thanks. Just please make sure to run the benchmarks as I don't think the script runs them by default.

liqiangxl · 2023-09-23T18:47:14Z

> Thanks. Just please make sure to run the benchmarks as I don't think the script runs them by default.

The CI runs the benchmarks. e.g. https://gitlab-master.nvidia.com/dl/pytorch/fuser-gh-mirror/-/jobs/69890090

naoyam · 2023-09-23T18:52:20Z

I think the diff check is not enabled at this moment in the CI.

liqiangxl · 2023-09-23T18:55:06Z

> I think the diff check is not enabled at this moment in the CI.

You are right. The diff check is not enabled in CI. But CI runs the benchmark Job jit_nvfuser_bench_jitfuture_TNVF and I ran the tools/compare_codegen.sh on my local node.

naoyam · 2023-09-23T19:41:58Z

>> I think the diff check is not enabled at this moment in the CI.

> You are right. The diff check is not enabled in CI. But CI runs the benchmark Job jit_nvfuser_bench_jitfuture_TNVF and I ran the tools/compare_codegen.sh on my local node.

What I wanted to make sure was to run the benchmarks with the diff script as it doesn't by default.

liqiangxl · 2023-09-24T19:12:39Z

>>> I think the diff check is not enabled at this moment in the CI.

>> You are right. The diff check is not enabled in CI. But CI runs the benchmark Job jit_nvfuser_bench_jitfuture_TNVF and I ran the tools/compare_codegen.sh on my local node.

> What I wanted to make sure was to run the benchmarks with the diff script as it doesn't by default.

Got it! I did this check in #928 using:

tools/compare_codegen.sh -- build/nvfuser_bench --benchmark_filter=NvFuserScheduler --benchmark_repetitions=1 --benchmark_min_time=0

No diff found.

similar to #923 (1) Transformed private static functions in the OuterPersistentKernelScheduler class into utility functions within an anonymous namespace. (2) Removed checks/calculations that are unrelated to outer persistent scheduler.

similar to #923 (1) Transformed private static functions in the InnerOuterPersistentKernelScheduler class into utility functions within an anonymous namespace. (2) Removed checks/calculations that are unrelated to inner_outer persistent scheduler.

liqiangxl added 2 commits September 21, 2023 12:35

clean normalization_inner

041cb2d

Merge branch 'main' into llu/clean_inner_norm

df3affa

liqiangxl commented Sep 21, 2023

comment and format

acab563

liqiangxl marked this pull request as ready for review September 21, 2023 21:38

liqiangxl requested a review from naoyam September 21, 2023 21:39

Merge branch 'main' into llu/clean_inner_norm

de6bce5

Merge branch 'main' into llu/clean_inner_norm

94684fa

liqiangxl marked this pull request as draft September 22, 2023 14:59

liqiangxl added 2 commits September 22, 2023 11:45

Merge branch 'main' into llu/clean_inner_norm

aceab64

Merge branch 'main' into llu/clean_inner_norm

7dba5d9

liqiangxl mentioned this pull request Sep 22, 2023

Revert "Enable the newly added OuterPersistent scheduler" #927

Closed

liqiangxl marked this pull request as ready for review September 22, 2023 19:43

Merge branch 'main' into llu/clean_inner_norm

5fdc5f9

naoyam approved these changes Sep 23, 2023

This was referenced Sep 23, 2023

clean normalization_outer #931

Merged

clean normalization_inner_outer #928

Merged

liqiangxl merged commit b4335f0 into main Sep 23, 2023

liqiangxl deleted the llu/clean_inner_norm branch September 23, 2023 18:32
