Conversation

@asparkhi (Contributor) commented Dec 8, 2021

This PR fixes the following issue: the MergeComposite callback function from the pattern table returns a non-boolean data type.

@u99127 left a comment

Would be good to have a test case for this.

Ramana

@asparkhi (Contributor, Author) commented Dec 8, 2021

True that. I clubbed it with another bug fix which I'm still working on. If it's okay with you, can I add that test later?

@u99127 commented Dec 8, 2021

We should be looking to add tests for bug fixes at the time the commits go in, to show what problem they are solving. Sometimes that is obvious, but in this case we don't have any coverage in the test suite for this failure.

Ideally, a PR should also fix one bug, not multiple bugs bundled together in an unclear fashion.

I'm not a committer or a reviewer in the TVM community, so others might have a different view.

@asparkhi (Contributor, Author) commented Dec 8, 2021

I agree with you on all the points there. Just wanted to avoid the long CI cycle. I will add the test.

```diff
 def check_qnn_binary_op(extract):
     """Check if multiply is supported by CMSIS-NN."""
-    return (
+    return bool(
```
@mshawcroft (Contributor) commented Dec 8, 2021:

I'm puzzled by this change. Specifically, I don't understand how inserting the bool() cast here changes the return type of the function; superficially, the two subclauses are str == str comparisons, each resulting in a bool, and applying `and` to two bools results in a bool. So what does the cast achieve?

The same logic applies to each of the previous bool casts inserted above...

What am I missing?

@asparkhi (Contributor, Author) replied Dec 8, 2021:

The problem originated in the checks for the Conv2D function, check_qnn_conv2d().

Here is how it looks:

```python
conv2d.attrs.out_dtype == "int32"
and conv2d.attrs.padding[2] == 0
and conv2d.attrs.padding[3] == 0
and conv2d_input.checked_type.dtype == "int8"
and conv2d_weight.checked_type.dtype == "int8"
and pattern.checked_type.dtype == "int8"
and bias_dtype == "int32"
and all([zp == 0 for zp in kernel_zp])
and (not is_depthwise or bias_add is not None)
```

The check conv2d.attrs.padding[2] == 0 returns an IntImm instead of a bool. So the fix could be casting just conv2d.attrs.padding[2] instead of casting the whole complex boolean expression. Just to be safe, I chose the latter for all the operators that involve accessing attributes.
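To see the failure mode concretely, here is a minimal, self-contained sketch (the IntImm class below is a stand-in for TVM's tir.IntImm, not TVM code). Python's `and` returns one of its operands rather than forcing a bool, so a single non-bool subclause leaks its type through the whole expression:

```python
class IntImm:
    """Stand-in for TVM's tir.IntImm: == yields another IntImm, not a bool."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Mimics the FFI-backed attribute comparison described above.
        return IntImm(int(self.value == int(other)))

    def __int__(self):
        return self.value

    def __bool__(self):
        return self.value != 0


padding = IntImm(0)

broken = ("int32" == "int32") and (padding == 0)    # `and` returns its RHS operand
fixed = ("int32" == "int32") and int(padding) == 0  # narrow cast restores a bool

print(type(broken).__name__)  # IntImm
print(type(fixed).__name__)   # bool
```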

@mshawcroft (Contributor) replied Dec 8, 2021:

Ah, OK, that's curious; presumably, once one of those subclauses has an integer type, the subsequent application of `and` just propagates the numerical type rather than giving a bool.

Thanks for the explanation.

This explanation makes sense for the conv2d case, where the padding comparison returns an IntImm, but presumably it does not apply to several of the other cases where bool() has been inserted, for instance the qnn_binary_op case that I commented on. In those cases it would probably be somewhat less confusing to the future reader of this code if we didn't insert the unnecessary bool() casts.

For the case(s) where == returns an IntImm instead of a bool, I wonder whether this is the actual bug, or whether the actual bug is that the underlying == machinery is broken and returns an inappropriate type. It is certainly not illegal to have == return an arbitrary object in Python, but there is a strong convention that __eq__ returns a bool; I'm thinking specifically of the principle of least astonishment.

If this really is the bug (and == really should be returning a non-bool object), then I'd suggest that, from the reader's perspective, inserting the bool() cast just around the subexpression that needs it, rather than around the entire expression, is less confusing. A comment as to why the seemingly spurious bool() cast is required might also be helpful.
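For reference, Python does indeed allow __eq__ to return an arbitrary object; NumPy is the best-known example, returning an elementwise array rather than a single bool:

```python
import numpy as np

# Array == array yields an elementwise boolean array, not a bool; this is
# why `a == b` cannot be used directly in an `if` without .any()/.all().
a = np.array([1, 2])
b = np.array([1, 3])
print(a == b)        # [ True False]
print(type(a == b))  # <class 'numpy.ndarray'>
```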

@asparkhi (Contributor, Author) replied:

Tagging @leandron in case he knows something about conv2d.attrs.padding[2] == 0 returning an IntImm instead of a bool. It's definitely surprising.

I'll update the code to cast just the subexpression, as you've suggested, since that makes the code more readable. Thanks @mshawcroft

@manupak (Contributor) commented:

Where do you see the IntImm?

I have seen Python-side bools become IntImms when accessed from C++ via TVM's FFIs. Is that the case here?
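A hedged illustration of this FFI round trip (requires a TVM installation; the exact returned class and dtype vary across TVM versions, so treat this as a sketch rather than a guaranteed output):

```python
import tvm

# tvm.runtime.convert maps Python scalars into TVM objects; a Python bool
# typically comes back as a TIR integer immediate rather than a Python bool.
converted = tvm.runtime.convert(True)
print(type(converted))              # e.g. <class 'tvm.tir.expr.IntImm'>
print(isinstance(converted, bool))  # False
```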

@manupak (Contributor) left a comment

Thanks! @ashutosh-arm

Sorry, I'm not fully following what is being fixed here. I think we need a test to get this PR in. I suppose it is the IntImm comparison?

I agree with @u99127 that we should break the padding refactor out of this PR and publish it as a separate PR concurrently. I don't think that will create a substantial delay, because we can review both PRs concurrently.

Then, if everything gets resolved by the end of the day, it will create a maximum delay of one day, which IMO is reasonable.

@asparkhi (Contributor, Author) commented Dec 9, 2021

@manupa-arm If you look at the latest changes I've pushed, they contain both the fix for the exact issue and the corresponding test.

The issue was that MergeComposite() failed because Conv2D's check function failed on cifar_10. By check function, I mean the callback registered with MergeComposite(). As you can see in the code snippet posted above in the comments, the IntImm comparison conv2d.attrs.padding[2] == 0 was causing the equality to return a non-bool value. As you pointed out, we are accessing C++ fields from Python here.

Regarding the test, @u99127 suggested adding a corresponding test, which I've pushed with my last commit. It is a negative test that introduces negative padding to trigger a failure in the check function.

So we now have both the fix and the test, as per the reviewers' comments.

@asparkhi (Contributor, Author) commented:

@u99127 @manupa-arm any other thoughts on this?

@manupak (Contributor) left a comment

Hi @ashutosh-arm

I did a pass and I am still a bit unclear on how the testing scope is increased to cover the bug. Is it because it checks for "VALID" padding as well? Earlier, the testing scope seems to have covered only "SAME" padding. Even then, the earlier version of the test seemed to generate values for all of pad_top, pad_left, pad_bottom and pad_right. So how was this not captured before?

Also, I found out that we don't have any TFLite-based single-operator E2E test. I think we should have that, in the absence of unit tests for the compilation pipeline at its different levels (i.e. graph partitioning).

@ekalda would you be able to take a look as well?

```diff
     conv2d.attrs.out_dtype == "int32"
-    and conv2d.attrs.padding[2] == 0
-    and conv2d.attrs.padding[3] == 0
+    and int(conv2d.attrs.padding[2]) == 0
```
@manupak (Contributor) commented:

I believe this is the actual fix, am I right? (The others are changes to test code.) What test in this PR would fail without this change?

I think we could have a simpler test to check whether graph partitioning works for a conv2d with padding values for top, left, bottom and right; see the sketch below.
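A rough sketch of such a test: build a standalone int8 qnn.conv2d with distinct top/left/bottom/right padding and run it through CMSIS-NN partitioning. The shapes, layouts, and quantization parameters here are illustrative assumptions, and a real offload check would also need the rest of the registered pattern (e.g. a following requantize), so this only shows the structure:

```python
import numpy as np
import tvm
from tvm import relay
from tvm.relay.op.contrib.cmsisnn import partition_for_cmsisnn

data = relay.var("data", shape=(1, 16, 16, 3), dtype="int8")
weight = relay.const(np.random.randint(-128, 128, (3, 3, 3, 8), dtype="int8"), "int8")
conv = relay.qnn.op.conv2d(
    data,
    weight,
    input_zero_point=relay.const(0, "int32"),
    kernel_zero_point=relay.const(0, "int32"),
    input_scale=relay.const(1.0, "float32"),
    kernel_scale=relay.const(1.0, "float32"),
    kernel_size=(3, 3),
    channels=8,
    padding=(1, 1, 2, 2),  # distinct top, left, bottom, right values
    data_layout="NHWC",
    kernel_layout="HWIO",
    out_dtype="int32",
)
mod = tvm.IRModule.from_expr(relay.Function([data], conv))

# Partition and inspect whether the conv2d ended up in a CMSIS-NN function.
partitioned = partition_for_cmsisnn(mod)
print(partitioned)
```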



```diff
-def get_same_padding(data, kernel, dilation, stride, cmsisnn_padding=True):
+def get_padding(data, kernel, dilation, stride, padding):
```
@manupak (Contributor) commented:

Is this change necessary to fix the bug here?

```python
kernel_w = kernel_shape[w_index]
a = relay.var("input", shape=shape, dtype=dtype)
p = (0, 0, 0, 0)
if padding == "SAME":
```
@manupak (Contributor) commented:

This seems orthogonal to the fix here.

```diff
-    pad_left, pad_right = (pad, 0) if cmsisnn_padding else (0, pad)
-    return [pad_top, pad_left, pad_bottom, pad_right]
+    pad_left, pad_right = (pad, 0) if padding == "SAME" else (0, pad)
+    return [pad_top, pad_left, pad_bottom, pad_right], True
```
@manupak (Contributor) commented:

Please add documentation of what this function returns. It used to return just the pad top, left, bottom and right values, which was simpler; augmenting that with a boolean is confusing.

I would recommend returning a named tuple here to indicate what the boolean corresponds to, along the lines of the sketch below.
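A small sketch of that shape (the field name needs_pad is my label, not a name from the PR):

```python
from collections import namedtuple

# Naming the boolean makes call sites self-describing: pad.needs_pad
# rather than an anonymous second return value called `result`.
Padding = namedtuple("Padding", ["top", "left", "bottom", "right", "needs_pad"])

def get_padding_sketch(pad_top, pad_left, pad_bottom, pad_right, padding_mode):
    # Hypothetical wrapper: the same four values as before, plus a named flag.
    return Padding(pad_top, pad_left, pad_bottom, pad_right, padding_mode == "SAME")

pad = get_padding_sketch(1, 1, 0, 0, "SAME")
if pad.needs_pad:
    print([pad.top, pad.left, pad.bottom, pad.right])  # [1, 1, 0, 0]
```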

```diff
-    pad_ = get_same_padding((shape[1], shape[2]), pool_size, dilation, strides)
+    dilation = (1, 1)
+    pad_, result = get_padding((shape[1], shape[2]), pool_size, dilation, strides, padding)
+    if result:
```
@manupak (Contributor) commented Dec 10, 2021:

I have added a comment below; it is a bit difficult to follow what "result" means here. I think a named tuple might help.

However, looking at this, I feel it is better to keep get_padding(..) returning just the pad top, bottom, left and right values, and to use the padding check here to create the nn.pad, rather than passing it into get_padding to return another boolean as 'result'.

```diff
     out = int(math.ceil(float(data[0]) / float(stride[0])))
     pad = max(0, (out - 1) * stride[0] + dilated_kernel_h - data[0])
-    pad_top, pad_bottom = (pad, 0) if cmsisnn_padding else (0, pad)
+    pad_top, pad_bottom = (pad, 0) if padding == "SAME" else (0, pad)
```
@manupak (Contributor) commented:

What pad mode does the else correspond to here?

@ekalda (Contributor) left a comment

I added some questions for clarification. In general, I agree with Manupa that it would be good to have some TFLite-based E2E tests as well if there aren't any; they might help us catch bugs early. But that's probably outside the scope of this PR :)


```diff
-def get_same_padding(data, kernel, dilation, stride, cmsisnn_padding=True):
+def get_padding(data, kernel, dilation, stride, padding):
     """Provides CMSIS-NN padding when output dim == input dim"""
```
@ekalda (Contributor) commented:

For my enlightenment, what is CMSIS-NN padding, and why is it provided only when output dim == input dim?

@asparkhi (Contributor, Author) replied:

CMSIS-NN padding is the same as TFLu padding. This function provides the "SAME" kind of padding defined by the TFLite / TF specs: when we expect the output to have the same dims as the input, the input to Conv2D / other ops needs to be padded first.
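A quick worked example of the arithmetic from the snippet quoted earlier (the numbers are illustrative): for a 16-wide input, a 3-wide kernel, stride 1, and dilation 1, keeping the output at width 16 requires 2 columns of total padding:

```python
import math

data, kernel, stride, dilation = 16, 3, 1, 1

dilated_kernel = (kernel - 1) * dilation + 1              # 3
out = int(math.ceil(float(data) / float(stride)))         # 16: SAME keeps the dim
pad = max(0, (out - 1) * stride + dilated_kernel - data)  # 2 total pad columns

# Sanity check: conv output width with that padding really is 16.
print((data + pad - dilated_kernel) // stride + 1)  # 16
```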



```diff
-def get_same_padding(data, kernel, dilation, stride, cmsisnn_padding=True):
+def get_padding(data, kernel, dilation, stride, padding):
```
@ekalda (Contributor) commented:

What does data represent here? If it is the shape of a tensor, then I think the parameter name should reflect that (or some documentation could help).

@asparkhi (Contributor, Author) replied:

Good point. I will take care of it.

Comment on lines 51 to 54:

```python
op = relay.nn.pad(
    op,
    pad_width=[(0, 0), (pad_[0], pad_[2]), (pad_[1], pad_[3]), (0, 0)],
```
@ekalda (Contributor) commented:

For enlightenment, why do we add a pad operator to the graph instead of using the padding attribute of pool2d?

@asparkhi (Contributor, Author) replied:

That troubled me as well at the time of implementing it :)
In CMSIS-NN, the padding is only used for calculating the appropriate start and end for the pooling kernel. I think this should work either way; I will try it out without the preceding padding op.

@asparkhi (Contributor, Author) commented:

We should definitely discuss having a TFLu E2E test with a wider audience.

@manupak (Contributor) commented Dec 10, 2021

Agreed! That part is out of scope for this PR; it was meant as a general comment :)

As we discussed offline, if we can simply use the get_<type>_padding() helpers, as well as an invalid case to exercise the non-offloading scenario, that should be good enough to get the PR in.

@asparkhi force-pushed the merge_composite_error branch from 0eb97a5 to 0541a0c on December 10, 2021 at 16:02.
@manupak (Contributor) left a comment

LGTM (modulo others' comments) now! Thanks for working with me on this.

@u99127 commented Dec 10, 2021

> LGTM (modulo others' comments) now! Thanks for working with me on this.

LGTM.

Thanks,
Ramana

@manupak merged commit a28a8bf into apache:main on Dec 11, 2021.
@manupak (Contributor) commented Dec 11, 2021

Thanks @ashutosh-arm @u99127 @mshawcroft @ekalda. This is merged now!

@asparkhi deleted the merge_composite_error branch on December 15, 2021.