[Unity][Transform] Introduce data-dependent operation of reshape and its constant folding #14282
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot
please update via
```python
def test_op_tensor_to_shape():
    out_shape = run_cpu(
        TensorToShapeTest, "run_tensor_to_shape", tvm.nd.array(np.array([1, 2, 3]).astype("int64"))
    )
```
I think the demonstrated behavior is correct. We are converting the rank-1 tensor [1, 2, 3] into a shape (1, 2, 3). I think the problem here is that ndim means different things for tensors (where it means the rank) and shapes (where it means the number of dimensions denoted by the shape).
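A minimal illustration of that distinction using the Relax struct-info constructors (the variable names here are illustrative):

```python
from tvm import relax

# The rank-1 int64 tensor holding [1, 2, 3]: its ndim is the tensor's rank.
tensor_sinfo = relax.TensorStructInfo(shape=[3], dtype="int64")  # ndim == 1
# The shape (1, 2, 3) it converts to: its ndim counts the dimensions it denotes.
shape_sinfo = relax.ShapeStructInfo(ndim=3)                      # ndim == 3
```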
```c++
ICHECK(call->args.size() == 1);
ICHECK(call->args[0]->struct_info_.defined());
const auto* tsinfo = GetStructInfoAs<TensorStructInfoNode>(call->args[0]);
ICHECK(tsinfo && tsinfo->shape.defined());
```
I think this condition is more restrictive than necessary. It should be okay for the shape not to be known at compile time, returning ShapeStructInfo(ndim=-1) when the input rank/shape is unknown.
Edit: I guess you use the rank in vm_builtin_lower.cc, but I don't see why it has to be implemented that way. You could check all the necessary properties dynamically (inside the builtin).
Do you mean these lines?

```c++
// define symbolic variables
Array<PrimExpr> shape_var;
for (int i = 0; i < sinfo->ndim; i++) {
  shape_var.push_back(tir::Var("x", DataType::Int(64)));
}
```
Initially, I also wanted to support unknown rank, but it turned out to be trickier than I thought.
The problem is that these symbolic variables need to be inserted at compile time, so we need this info.
What exactly are the new shape vars needed for?
To MatchCast and put them in the ShapeExpr.

```c++
// define symbolic variables
Array<PrimExpr> shape_var;
for (int i = 0; i < sinfo->ndim; i++) {
  shape_var.push_back(tir::Var("x", DataType::Int(64)));
}
// bind symbolic variables to the shape tuple
relax::Var var("y", ShapeStructInfo(shape_var));
builder_->EmitNormalized(MatchCast(var, call, ShapeStructInfo(shape_var)));
return ShapeExpr(shape_var);
```
That much I understand; what I am more curious about is why we need to construct this shape expression.
Why does lowering require the dimensions to be bound to a variable? That doesn't seem like a restriction that needs to exist.
@slyubomirsky To pre-allocate memory for TIR functions? TE compute takes the output shape as `Array<PrimExpr>` in its first argument:

```c++
Tensor compute(Array<PrimExpr> shape, FCompute fcompute, std::string name, std::string tag,
               Map<String, ObjectRef> attrs);
```
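As a concrete illustration on the Python side (a minimal sketch; the names `n`, `A`, `B` are illustrative), `te.compute` needs every output dimension as a `PrimExpr` up front, which is why the lowering has to materialize symbolic variables at compile time:

```python
import tvm
from tvm import te

n = te.var("n")  # symbolic dimension, analogous to tir::Var("x") above
A = te.placeholder((n,), dtype="float32", name="A")
# The output shape must be known (at least symbolically) when compute is called.
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
```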
Great discussion, folks! Looks like the limitation for shape_expr comes from TE, interesting.
Hmmmm, I think we should put this on the list to fix later. Let's please note this limitation in a comment, since it is not obvious from looking at the code.
Added this in the comment. Would it be good to go now?
```c++
TVM_REGISTER_GLOBAL("vm.builtin.tensor_to_shape").set_body_typed([](NDArray data) {
  NDArray arr = data;
  // Copy the shape tensor to the host so its values can be read.
  if (data->device.device_type != kDLCPU) {
    arr = data.CopyTo(DLDevice{kDLCPU, 0});
  }
```
Does this copy fail if the NDArray is on another device? I'm a little hesitant to have an op that just will not work depending on the device.
Is it possible that we have a shape tensor that is not on the host device?
Not sure where this might come up. I wouldn't be surprised if, say, we import an ONNX model (this is where this use-case first came up) that we mean to run on GPU and every tensor (including those that stand for shapes) is stored on GPU.
Based on my current understanding, we want to keep shape-related computation on the host side since its result can feed into memory planning. Even for general GPU execution beyond this case, we run the output shape computation (inserted by the VMShapeLower pass) on the host side.
This PR extends existing
Hzfengsy left a comment
LGTM except for a naming issue
```c++
  Map<Expr, Expr> batch_norm_map_;
};

class OpDecomposer : public ExprMutator {
```
The name OpDecomposer is too generic since it's only for tensor_to_shape.
We can extend this class to add other composite ops like erf and attention, which will be upcoming soon.
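For a sense of that extension pattern, here is a hedged sketch using the Python ExprMutator bindings (the class name and the rewrite body are illustrative, not the PR's actual code):

```python
import tvm
from tvm import relax
from tvm.relax.expr_functor import mutator, PyExprMutator

@mutator
class CompositeOpDecomposer(PyExprMutator):
    """Rewrites composite ops (tensor_to_shape, and later erf, attention, ...)
    into lower-level constructs, mirroring the C++ OpDecomposer above."""

    def visit_call_(self, call: relax.Call) -> relax.Expr:
        call = self.visit_expr_post_order(call)
        if call.op == tvm.ir.Op.get("relax.tensor_to_shape"):
            # Emit MatchCast to fresh symbolic vars and return the ShapeExpr,
            # as in the C++ snippet discussed earlier in this thread.
            ...
        return call
```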
@sunggg Hi, actually we also do some changes in
@yongwww @slyubomirsky would you take another look?
@sunggg it would be good to rebase it via
yongwww left a comment
LGTM, thanks for addressing my concerns!
slyubomirsky left a comment
My concerns have been addressed, thank you
Thanks @sunggg for the thoughtful design and quick implementation!
…its constant folding (#14282)

* FEAT: Support data-dependent operation of reshape
* FEAT: Support constant folding with data-dependent reshape
* fix
* remove empty line
* reflect feedback
* Lift the lowering of tensor_to_shape from builtin to DecomposeCompositeOps pass
* fix and comment
* fix
* add comments
* reflect feedback
* add comment
* fix
PR apache/tvm#14282 refactors the pass `SimplifyNorm` to `DecomposeOps`. The last rebase leaves conflicts to be fixed, and this PR merges apache/tvm#14282 and #162 together. The changes mainly include:

- func pass -> module pass (because sometimes we don't want to simplify all functions in a module)
- Add a `mode` argument to indicate whether it is a training simplification or an eval simplification.
As discussed in the previous Unity open dev meeting, this PR implements the data-dependent operation of reshape.
Representative examples
This PR realizes the data-dependent operation of reshape as follows.
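A minimal sketch of the intended pattern (illustrative shapes and names; the exact TVMScript spelling may differ slightly from the PR's tests): the target shape arrives as a runtime tensor, is converted with `tensor_to_shape`, bound to symbolic dimensions via `match_cast`, and then fed to `reshape`.

```python
import tvm
from tvm.script import relax as R, tir as T

@tvm.script.ir_module
class Module:
    @R.function
    def main(x: R.Tensor((6,), "float32"), t: R.Tensor((2,), "int64")):
        m = T.int64()
        n = T.int64()
        # Convert the runtime tensor holding the target shape into a Shape value.
        s = R.tensor_to_shape(t)
        # Bind the (unknown) shape values to the symbolic vars m, n.
        s2 = R.match_cast(s, R.Shape([m, n]))
        # reshape now accepts a Var bound to a ShapeExpr (per this PR).
        y = R.reshape(x, s2)
        return y
```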
Also, `FoldConstant` is extended to support `tensor_to_shape`.

Summary of changes

- Introduce the builtin `tensor_to_shape`.
- `reshape` currently takes `ShapeExpr | Array[PrimExpr]`. This PR extends it to take a `Var`, but only when it is bound to a `ShapeExpr`.
- Extend the `FoldConstant` pass to support `tensor_to_shape`.
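For the constant-folding part, a hedged sketch of what the extension enables (module contents are illustrative): when the shape tensor is a compile-time constant, `FoldConstant` can replace the `tensor_to_shape` call with a concrete shape.

```python
import numpy as np
import tvm
from tvm import relax
from tvm.script import relax as R

@tvm.script.ir_module
class Before:
    @R.function
    def main(x: R.Tensor((6,), "float32")):
        # The shape tensor is a compile-time constant here.
        s = R.tensor_to_shape(R.const(np.array([2, 3], dtype="int64")))
        return s

# After folding, s should become the concrete shape (2, 3).
After = relax.transform.FoldConstant()(Before)
```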
03/18/2023 Update

It turned out that if we implement `tensor_to_shape` as a builtin, its lowering happens too late and we cannot lower the following `reshape`-like ops, since they need at least a symbolic shape to legalize. Therefore, `tensor_to_shape` should be lowered before we lower `reshape`. The new change extends the existing `SimplifyNormInference` pass to serve as a generic pass that decomposes composite operators like `tensor_to_shape`, `attention`, `erf`, etc.

Follow-up PRs