Conversation
```cpp
// constant folding
ASSERT_TRUE(simplifyExpr("ceilDiv( 5 , 3 ) * 5"_)->sameAs("10"_));

ASSERT_TRUE(simplifyExpr("1 * i"_)->sameAs("i"_));
```
I replaced some TORCH_CHECK here with ASSERT_TRUE.
```cpp
// Simplify dim_map_
for (auto& [k, v] : dim_map_) {
  v = simplifyExpr(v);
// Well, this isn't really correct, but we need this assumption to better
```
But doesn't this actually break if empty tensors are given?
I know we haven't been very strict with empty tensors, so it's highly likely other things would also break with empty tensors; in that sense this isn't a new issue.
Maybe we want to add a property to Kernel asserting there's no empty tensor? Similar to what we do with the index type, we could lower a fusion with an optional assertion (based on the actual input tensor sizes) that it is safe to assume no tensor is empty. That seems to work well enough, doesn't it?
Hmm, I don't know if anything really breaks today. If a tensor size is 0 and it appears in a parallel dim, this usually means we want to launch 0 blocks or 0 threads, which in turn usually means all of the fusion's outputs are empty tensors, and the lowering of that case is already skipped. What I am really worried about is: if some extent is `T0.size[0] + 1`, then an empty input does not imply an empty output. I don't think we currently have such a schedule, but it might become a pitfall in the future.
Regarding the non-empty-tensor property in Kernel, I am not sure. Isn't that a big project to implement? For example, with such a property you need to know whether to set it to true or false so that you can enable as many simplifications as possible. And doesn't that require modifying our cache system to handle 0 as a special number (just like how we treat 1 as broadcast)? If we see an empty tensor, we need to make sure we don't use the cache entry compiled for non-empty tensors.
Oh, I didn't pay close attention to the fact that this is only for vals that represent parallel dimensions. Yes, in that case I see no concern.
> Regarding the property for non-empty tensor in Kernel, I am not sure. But isn't that a big project to implement it? For example, if you have such a property, you need to know whether to set it as true or false so that you can enable more simplifications as much as possible. And doesn't that requires modifying our cache system to handle 0 as a special number (just like how we treated 1 as broadcast), and if I see an empty tensor, we need to make sure we don't use the cache for non-empty tensor?
I don't think that's too different from what we already do with the index type. Granted, we do have a couple of issues with it, but all we would need to do is assume tensors can be empty by default and allow the non-empty assumption only via an optional flag that would be part of CompileOptions. Not necessary for this PR, but I think this approach would fit well, following the same pattern as the index-type specialization.
Let's leave this PR as is and figure out our strategy when we actually see a bug with empty tensors. Matmul strongly relies on the assumption that tensors are not empty, but I don't see any other case where this matters. Still, I feel uneasy about the logic "if matmul, then assume non-empty, otherwise don't". Whether to assume non-empty should be based on the inputs, not on whether the fusion is a matmul.
```cpp
  }
}
Bool* result = nullptr;
for (auto ts : tensor_sizes) {
```
`tensor_sizes` can redundantly contain the same tensor-size named scalar multiple times. Are you assuming the redundancy will be eliminated by the expr simplification?
Thanks for catching this! They will not be eliminated by the expr simplifier, because the assumptions argument is not simplified. I don't think this will cause any issue, but we should still get rid of the redundancy. I will update this PR.
This PR is stacked on #105; it fixes #95. There are mainly two effective changes:
- In `lower_scalar_hoist.cpp`, add assumptions about parallel indices.
- In `predicate_compute.cpp`, check if the extent is the same as the parallel dimension.

Either change alone is sufficient to fix the bug, but I did both to be doubly safe. cc: @mmigdal-nv @drzejan2
Other changes in this PR:
- Add `assume`, which contains utilities to generate boolean expressions stating certain things we want to assume.
- Simplify `max(a, b) -> a` if `a >= b`, and `min(a, b) -> b` if `a >= b`.
- Constant folding for `ceilDiv`.
- Update `stupid_simple_compiler` in `test_expr_simplifier.cpp` to support function call.

This PR allows me to simplify the parallel dimension from

into