vinx13 commented Sep 12, 2018

This PR adds a dp4a intrinsic to TOPI and refactors the gemm_int8 recipe. An int8 conv2d using dp4a will follow in the next PR.
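For context, dp4a on CUDA is a four-way dot product of 8-bit integers accumulated into a 32-bit integer. The following is a minimal pure-Python sketch of those semantics and of how an int8 reduction can be split into chunks of four. The names `dp4a_ref` and `dot_int8` are hypothetical helpers for illustration only; they are not the TOPI intrinsic from this PR, and 32-bit overflow wraparound is not modeled.

```python
def dp4a_ref(a, b, c=0):
    """Reference semantics: c + sum(a[i] * b[i]) over four int8 values."""
    assert len(a) == 4 and len(b) == 4
    for v in (*a, *b):
        # Each operand element must fit in a signed 8-bit integer.
        assert -128 <= v <= 127
    return c + sum(x * y for x, y in zip(a, b))


def dot_int8(x, y):
    """Dot product of two int8 sequences, reduced four elements at a time.

    Assumes the length is a multiple of 4 (hypothetical restriction
    chosen to keep the sketch short).
    """
    assert len(x) == len(y) and len(x) % 4 == 0
    acc = 0
    for k in range(0, len(x), 4):
        # One dp4a-style step consumes four elements from each operand.
        acc = dp4a_ref(x[k:k + 4], y[k:k + 4], acc)
    return acc
```

An int8 GEMM inner loop can be viewed as one such `dot_int8` per output element, which is why the reduction axis is typically split by a factor of four.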

cc @tqchen @merrymercy

import tvm


def _intrin_dp4a_reduce(x_scope, y_scope, z_scope):

Let us put it under cuda/tensor_intrin.py and rename it to dp4a.

vinx13 commented Sep 13, 2018

@tqchen I have renamed the file and renamed the function to dp4a, please review.


Parameters
----------
x_scope: The storage scope of buffer for lhs

Maybe it makes sense to default all the scopes to local.



tqchen self-assigned this Sep 13, 2018
tqchen added the "status: need update" label Sep 13, 2018
tqchen merged commit edf0967 into apache:master Sep 14, 2018
tqchen commented Sep 14, 2018

Thanks @vinx13, this is now merged!
