Contributor

hzfan commented Aug 2, 2021

Fix 2 minor bugs in:

  • reshape array of shape [1] to []
  • pytorch linear

cc @yzhliu @comaniac

  # 1 - weight
  bias = inputs[2]
- mm_out = self.matmul(inputs[:2], input_types[:2])
+ mm_out = self.matmul([inputs[0], _op.transpose(inputs[1], axes=(1, 0))], input_types[:2])
Member

We can use _op.nn.dense without transposing its second input. Also see https://discuss.tvm.apache.org/t/pytorch-unable-to-import-a-simple-torch-nn-linear-to-relay/10383/4
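
As a rough illustration of the suggestion above, the sketch below uses plain Python lists rather than Relay. It assumes the usual conventions: a linear-layer weight has shape (out_features, in_features), a plain matrix product needs that weight transposed, and a dense op in the `x @ w.T` convention consumes it as-is. The helper names `transpose`, `matmul`, and `dense` are hypothetical stand-ins, not TVM functions.

```python
def transpose(m):
    """Transpose a 2-D matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain 2-D matrix product: a is (m, k), b is (k, n)."""
    if len(a[0]) != len(b):
        raise ValueError(f"shape mismatch: inner dims {len(a[0])} vs {len(b)}")
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def dense(x, w):
    """Dense in the `x @ w.T` convention: x is (m, k), w is (n, k)."""
    return [[sum(p * q for p, q in zip(row, w_row)) for w_row in w] for row in x]


# A batch of 2 samples with 3 input features each.
x = [[1, 2, 3],
     [4, 5, 6]]
# Linear-style weight: (out_features=2, in_features=3).
w = [[1, 0, 1],
     [0, 1, 0]]

via_matmul = matmul(x, transpose(w))  # explicit transpose, as in the diff
via_dense = dense(x, w)               # no transpose needed
print(via_matmul, via_dense)
```

With this non-square weight, calling `matmul(x, w)` without the transpose raises a shape error. With a square weight it would return a result of the right shape with the wrong values, which is the harder case to notice.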

Contributor

Maybe I missed something, but does that mean the current linear layers converted from PyTorch are wrong?

Member

Yes, see my comment in the discuss thread above. Until PT 1.8, nn.Linear converts to aten::addmm, so this code path was never hit unless users explicitly called something that converts to aten::linear. Since PT 1.9, nn.Linear converts to aten::linear, so this becomes a major issue.

Contributor

Ah I see, so this bug doesn't explode until PT 1.9. Thanks for the explanation!

Contributor Author

@masahi My concern is that linear is supposed to support inputs with arbitrary ranks like 1d, 2d, 3d, 4d.... The different input dimensions are normalized in self.matmul. If _op.nn.dense is directly used, the usability of linear would be limited to inputs whose ranks are 2.

Contributor

How about we just make an exception: when the input rank is 2 then use dense; otherwise use matmul?
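
A minimal sketch of the rank-based dispatch proposed above, again in plain Python rather than Relay. The names `rank`, `dense_2d`, and `linear` are hypothetical, and the recursive branch is only a stand-in for the frontend's matmul normalization, whose actual implementation is not shown in this thread. The sketch checks only the rank of the data input and assumes the weight is 2-D.

```python
def rank(a):
    """Number of nested list levels, i.e. the tensor rank."""
    r = 0
    while isinstance(a, list):
        r += 1
        a = a[0]
    return r


def dense_2d(x, w):
    """2-D-only dense in the `x @ w.T` convention: x is (m, k), w is (n, k)."""
    return [[sum(p * q for p, q in zip(row, w_row)) for w_row in w] for row in x]


def linear(x, w):
    """Apply w (n, k) over the last axis of x, for x of rank >= 1."""
    r = rank(x)
    if r == 2:
        # The exception case: a rank-2 input goes straight to dense.
        return dense_2d(x, w)
    if r == 1:
        # Promote the vector to a 1-row matrix, then drop the row axis again.
        return dense_2d([x], w)[0]
    # Rank > 2: stand-in for the general fallback, recursing over the leading axis.
    return [linear(sub, w) for sub in x]


w = [[1, 0, 1],
     [0, 1, 0]]
x3d = [[[1, 2, 3], [4, 5, 6]],
       [[1, 0, 0], [0, 1, 0]]]
print(linear(x3d, w))
```

Keeping the rank-2 case on the dense path preserves whatever dense-specific handling exists for the common case, while higher-rank inputs still produce a result.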

Member

Doesn't op.nn.dense also support ND input?

data : tvm.relay.Expr
    The input data to the operator,
    of shape `(d_1, d_2, ..., d_n, units_in)`.

Member

Also, if we want to support ND input, please add more tests.

Contributor Author

@masahi Tests with 3d input are added. For now I use dense when both inputs are 2d and fall back to matmul otherwise.

I tried op.nn.dense in the ND scenario, but it fails at:

@dense_alter_layout.register(["cpu", "arm_cpu"])
def _alter_dense_layout(attrs, inputs, tinfos, out_type):
    target = tvm.target.Target.current(allow_none=False)
    dispatch_ctx = autotvm.task.DispatchContext.current
    data_tensor, weight_tensor = tinfos
    out_dtype = out_type.dtype
    M, K = get_const_tuple(data_tensor.shape)
    N, _ = get_const_tuple(weight_tensor.shape)

where data_tensor.shape is unpacked into a tuple of size 2. I also checked that in tests/python/relay/test_op_level1.py dense is tested numerically for 2d inputs only. Not sure if I missed anything.
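
The failure mode described above can be reproduced without TVM. The sketch below mirrors only the two unpacking lines from the quoted function; `dense_dims` is a hypothetical name and the shapes are ordinary tuples standing in for the result of `get_const_tuple`.

```python
def dense_dims(data_shape, weight_shape):
    """Mirror the two-element unpacking from the quoted layout-alter function."""
    M, K = data_shape      # only valid when data_shape has exactly 2 entries
    N, _ = weight_shape
    return M, K, N


# A 2-D data shape unpacks cleanly.
print(dense_dims((8, 16), (32, 16)))

# A 3-D data shape cannot be unpacked into (M, K) and raises ValueError.
try:
    dense_dims((4, 8, 16), (32, 16))
except ValueError as err:
    print("ND input rejected:", err)
```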

Member

Yeah, it looks like the relay doc is wrong and topi dense indeed only supports 2D; see #8412.

Member

@junrushao junrushao left a comment

I agree with Masa, and otherwise looks good to me :-)

Contributor Author

hzfan commented Aug 4, 2021

Any idea why ci always gets aborted in Build:CPU? Did I miss anything?

Contributor

comaniac commented Aug 4, 2021

> Any idea why ci always gets aborted in Build:CPU? Did I miss anything?

It was a timeout (4 hrs). Re-triggering CI may resolve this issue.

Member

masahi commented Aug 4, 2021

It should be fixed by #8658

@junrushao junrushao merged commit 874ea7a into apache:main Aug 5, 2021
@junrushao
Member

Thanks @hzfan @masahi @comaniac! It is merged :-)

mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Aug 11, 2021
dvhg pushed a commit to neo-ai/tvm that referenced this pull request Sep 8, 2021
apivovarov pushed a commit to neo-ai/tvm that referenced this pull request Sep 8, 2021
apivovarov pushed a commit to neo-ai/tvm that referenced this pull request Sep 8, 2021
apivovarov pushed a commit to neo-ai/tvm that referenced this pull request Sep 8, 2021
ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021