[TOPI] Fix dtype legalize logic for CPU dot product instruction #12865

masahi · 2022-09-22T00:36:44Z

The logic in

python/tvm/topi/generic/conv2d.py, lines 480 to 499 in 52d6b59

    # How to convert data to int8
    # Original --> C = A (conv) B
    # A and B are int8
    #   C = (A + 128 - 128) (conv) B
    #   C = (A' conv B) - 128 (conv) B
    # where A' = A + 128
    # and 128 (conv) B is basically a reduce on CRS axis for weights.
    #
    # How to convert data to uint8
    #   C = (A - 128 + 128) (conv) B
    #   C = (A' conv B) + 128 (conv) B
    # where A' = A - 128
    if data_dtype == "int8":
        # shift data to int8
        before_shift = relay.add
        after_shift = relay.subtract
    else:
        # shift data to uint8
        before_shift = relay.subtract
        after_shift = relay.add

is supposed to legalize the input dtype to be able to apply target-specific intrinsics that only support one of int8 or uint8. For example, the x86 VNNI instruction only supports uint8 activation.

But the logic is incorrect (the two cases are flipped) and leads to incorrect results in the following cases:

* The input activation is int8, and we want to use the x86 VNNI intrinsic, which only supports uint8 activations.
* The input activation is uint8, and we want to use the ARM `sdot` intrinsic, which only supports int8 activations.

The first case also applies to the Hexagon vrmpy intrinsic. I found this bug while testing vrmpy conv2d on int8 input.
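The shift arithmetic can be checked numerically. Below is a minimal NumPy sketch that uses a plain dot product as a stand-in for the convolution. `legalized_dot` is a hypothetical helper, not TVM code, and it assumes that `data_dtype` names the target dtype, as the inline comments in the quoted snippet ("shift data to int8") suggest.

```python
import numpy as np


def legalized_dot(a, b, target_dtype):
    """Dot product of a and b after shifting a by 128 into target_dtype.

    Hypothetical helper for illustration; a 1-D dot product stands in for conv.
    """
    a32 = a.astype(np.int32)
    b32 = b.astype(np.int32)
    if target_dtype == "uint8":
        # int8 input, uint8 target: A' = A + 128, then subtract 128 * sum(B)
        shifted = (a32 + 128).astype(np.uint8)
        correction = -128 * int(b32.sum())
    else:
        # uint8 input, int8 target: A' = A - 128, then add 128 * sum(B)
        shifted = (a32 - 128).astype(np.int8)
        correction = 128 * int(b32.sum())
    return int(shifted.astype(np.int32) @ b32) + correction


rng = np.random.default_rng(0)
b = rng.integers(-128, 128, size=64, dtype=np.int8)
a_int8 = rng.integers(-128, 128, size=64, dtype=np.int8)
a_uint8 = rng.integers(0, 256, size=64, dtype=np.uint8)

ref_int8 = int(a_int8.astype(np.int32) @ b.astype(np.int32))
ref_uint8 = int(a_uint8.astype(np.int32) @ b.astype(np.int32))

# Both directions reproduce the unshifted result.
assert legalized_dot(a_int8, b, "uint8") == ref_int8
assert legalized_dot(a_uint8, b, "int8") == ref_uint8

# With the flipped pair, int8 data is shifted the wrong way for a uint8 target:
# every value becomes negative, so none of them is representable as uint8.
flipped = a_int8.astype(np.int32) - 128
assert flipped.max() < 0
```

Note that the flipped pair is still an algebraic identity. In this sketch the problem is only that the shifted activations no longer fit the dtype the intrinsic expects, so the cast to that dtype corrupts them.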

To test this on CI, we need to run on a Cascade Lake or ARM v8.2 (with dot product support) instance. I cannot find a way to detect such CPU features from a Python script. try / catch doesn't work because the error is raised from LLVM (LLVM ERROR: Do not know how to split the result of this operator), and I don't know how to catch it. So for now the test is skipped. Any suggestions on this? @areusch @driazati

cc @tkonolige @mbrookhart

tkonolige

Thanks @masahi for fixing my mistakes!

driazati · 2022-09-26T03:18:12Z

> To test this on CI

Could we have a short test that triggers the LLVM error that we run in a subprocess? That way we could just read the stdout of the process to get the C++ exception and detect the feature.
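A rough sketch of that idea, assuming the probe is a small Python snippet run in a fresh interpreter. `runs_cleanly` and `FAKE_CRASH` are hypothetical names. The fake snippet only imitates the LLVM failure by printing the message and exiting with a nonzero status; a real probe would instead build a tiny kernel that uses the intrinsic. Both output streams are captured, since the message may arrive on stderr rather than stdout.

```python
import subprocess
import sys


def runs_cleanly(snippet, timeout=60):
    """Run snippet in a child interpreter; return (ok, combined output)."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
        timeout=timeout,
    )
    # A hard LLVM failure kills the child, so check both the exit status
    # and the captured output for the error marker.
    ok = proc.returncode == 0 and "LLVM ERROR" not in proc.stdout
    return ok, proc.stdout


# Stand-in for a probe that would crash on a CPU without the instruction.
FAKE_CRASH = (
    "import os, sys\n"
    "sys.stderr.write('LLVM ERROR: Do not know how to split the result of this operator\\n')\n"
    "sys.stderr.flush()\n"
    "os._exit(1)\n"
)

crash_ok, crash_log = runs_cleanly(FAKE_CRASH)
fine_ok, _ = runs_cleanly("print('built')")
```

Because the crash happens in the child process, the parent test session survives and can use the result to decide whether to skip the test.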

…he#12865)

The logic in `python/tvm/topi/generic/conv2d.py#L480-L499` is supposed to legalize the input dtype to be able to apply target-specific intrinsics that only support one of int8 or uint8. For example, the x86 VNNI instruction only supports uint8 activation.

But the logic is incorrect (the two cases are flipped) and leads to incorrect results in the following cases:

* The input activation is int8, and we want to use the x86 VNNI intrinsic, which only supports uint8 activations.
* The input activation is uint8, and we want to use the ARM `sdot` intrinsic, which only supports int8 activations.

The first case also applies to the Hexagon `vrmpy` intrinsic. I found this bug while testing `vrmpy` conv2d on int8 input.

To test this on CI, we need to run on a Cascade Lake or ARM v8.2 (with dot product support) instance. I cannot find a way to detect such CPU features from a Python script. `try / catch` doesn't work because the error is raised from LLVM (`LLVM ERROR: Do not know how to split the result of this operator`), and I don't know how to catch it. So for now the test is skipped.

masahi added 4 commits September 22, 2022 08:53

[Int8] Fix dtype legalize logic for CPU dot product instruction

453ff39

update test

be081e7

skip test for now

3228a5b

black

316bb23

github-actions bot requested review from mbrookhart and tkonolige September 22, 2022 00:37

tkonolige approved these changes Sep 22, 2022


tkonolige merged commit 195ae72 into apache:main Sep 22, 2022

masahi mentioned this pull request Oct 4, 2022

[TEST] CPU feature detection for x86 and ARM dot product instructions #12980

Merged

leandron mentioned this pull request Feb 1, 2023

TVM v0.11.0 Release Candidate Notes #13899

Closed
