Improve MaskedSoftmax by oneDNN #20853
|
Hey @bgawrych , Thanks for submitting the PR
CI supported jobs: [sanity, website, centos-cpu, windows-gpu, miscellaneous, unix-gpu, clang, windows-cpu, edge, unix-cpu, centos-gpu] Note: |
166ea84 to
3fcbe11
Compare
3fcbe11 to
b4da62a
Compare
|
@mxnet-bot run ci [centos-cpu, unix-cpu] |
|
Jenkins CI successfully triggered : [unix-cpu, centos-cpu] |
|
@mxnet-bot run ci [windows-gpu] |
|
Jenkins CI successfully triggered : [windows-gpu] |
210146a to
ff37408
Compare
ff37408 to
a24a748
Compare
a24a748 to
45f631c
Compare
bartekkuncer
left a comment
LGTM
|
@mxnet-bot run ci [centos-gpu] |
|
Jenkins CI successfully triggered : [centos-gpu] |
Description
Use a few oneDNN primitives to mask the input and execute softmax.
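For readers unfamiliar with the operator, the mask-then-softmax idea can be sketched in plain Python. This is only an illustration of the semantics under stated assumptions, not the oneDNN code in this PR: the function name `masked_softmax`, the boolean-list mask layout, and the choice of `-inf` as the fill value are all assumptions made for the example.

```python
import math


def masked_softmax(row, mask):
    """Softmax over `row`, ignoring positions where `mask` is False.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    # Step 1: mask the input. Masked-out entries become -inf so that
    # exp() maps them to exactly 0 in the softmax below.
    masked = [x if keep else -math.inf for x, keep in zip(row, mask)]

    # A fully masked row has no valid entries; return zeros (assumed convention).
    peak = max(masked)
    if peak == -math.inf:
        return [0.0] * len(row)

    # Step 2: numerically stable softmax on the masked values.
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


probs = masked_softmax([1.0, 2.0, 3.0], [True, True, False])
print(probs)  # the last entry is 0.0; the first two sum to 1
```

Masked positions receive zero probability and the remaining entries are normalized among themselves, which is the behavior the masking step is meant to guarantee.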
Accuracy comparison on GluonNLP models (not affected) [C6i.16xlarge]:

Performance comparison (samples/s) [C6i.16xlarge]:

BERT Base can achieve up to 60% more samples/s with this change.