
Conversation

@huochaitiantang
Contributor

The TOPI schedule dense_int8.cuda uses dp4a, whose types are int8 * int8 = int32.

This PR ensures that quantized dense ops pick the more efficient schedule dense_int8.cuda instead of dense_small_batch.cuda or dense_large_batch.cuda.
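For context, the dp4a semantics the schedule relies on can be sketched in pure Python. This is a hypothetical helper for illustration only: the real dp4a is a single CUDA instruction operating on four signed 8-bit lanes packed into one 32-bit register, which is why accumulating into int32 is both safe and fast.

```python
def dp4a(a, b, c):
    # Emulate CUDA's dp4a: multiply four signed 8-bit lanes pairwise,
    # sum the products, and add a 32-bit accumulator.
    # Inputs are plain Python ints assumed to lie in the int8 range
    # [-128, 127]; the hardware instruction packs them into one word.
    assert len(a) == len(b) == 4
    for x in list(a) + list(b):
        assert -128 <= x <= 127
    return c + sum(x * y for x, y in zip(a, b))

# Worst-case magnitude per call is 4 * 128 * 128 = 65536, far inside
# the int32 range, so many calls can be chained along the reduction
# axis of a dense op before overflow becomes a concern.
acc = 0
acc = dp4a([127, 127, 127, 127], [-128, -128, -128, -128], acc)
print(acc)  # -65024
```

Note that the instruction is defined for signed lanes here; this matches the later decision in this PR to drop uint8 from the schedule.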

Member

@icemelon icemelon left a comment


Could you add a test case for dense uint8 in topi?

@huochaitiantang
Contributor Author

It seems that the schedule dense_int8.cuda does not support the uint8 dtype, so I removed it.
I also added an autotvm task extraction test case that covers relay graphs containing an int8 dense op.

Member

@icemelon icemelon left a comment


LGTM

@icemelon
Member

icemelon commented Mar 8, 2021

Thanks @huochaitiantang

@icemelon icemelon merged commit 8d1f5b2 into apache:main Mar 8, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
* [Relay] Fix relay op strategy for cuda dense int8

* Remove uint8 && Add autotvm task extraction test for relay graph that contains dense op (int8 * int8 -> int32)

* Reformat the code of test case
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request May 11, 2021
