MLAS: Add MlasComputeSoftmax/MlasComputeExp#3846

Merged
pranavsharma merged 11 commits into master from tracysh/mlas_softmax
May 7, 2020
Contributor

@tracysh tracysh commented May 6, 2020

Description: This adds optimized routines to compute softmax/logsoftmax/exp.

Motivation and Context
This change adds:

  1. MlasComputeExp function to vectorize exp() over a buffer, used for the Exp op. The previous implementation used Eigen but was stuck with the instruction set used to build ORT. The new version supports SSE2/AVX2/AVX512F (as well as NEON/VSX).
  2. MlasComputeSoftmax function to vectorize softmax()/logsoftmax() for an NxD buffer. This is now used for the Softmax/LogSoftmax/Attention ops and is optimized for the same platforms as MlasComputeExp.
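
For readers unfamiliar with the computation, here is a minimal scalar sketch of what a row-wise softmax/logsoftmax over an NxD buffer works out to. It is not the MLAS code and not its signature: the name `ReferenceSoftmax` and its parameters are hypothetical, and the routines added in this PR are vectorized rather than written as plain loops like these.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference, NOT the MLAS implementation or its API.
// Computes softmax (or log-softmax) independently for each of the n rows
// of a row-major n x d buffer. Assumes input and output do not alias.
void ReferenceSoftmax(const float* input, float* output,
                      std::size_t n, std::size_t d, bool log_softmax) {
    assert(input != nullptr && output != nullptr);
    if (d == 0) {
        return;  // nothing to normalize
    }

    for (std::size_t row = 0; row < n; ++row) {
        const float* x = input + row * d;
        float* y = output + row * d;

        // Subtract the row maximum so exp() cannot overflow for large inputs.
        const float row_max = *std::max_element(x, x + d);

        // Exponentiate the shifted values and accumulate their sum.
        float sum = 0.0f;
        for (std::size_t i = 0; i < d; ++i) {
            y[i] = std::exp(x[i] - row_max);
            sum += y[i];
        }

        if (log_softmax) {
            // log(softmax(x_i)) = (x_i - max) - log(sum of shifted exps)
            const float log_sum = std::log(sum);
            for (std::size_t i = 0; i < d; ++i) {
                y[i] = x[i] - row_max - log_sum;
            }
        } else {
            // softmax(x_i) = exp(x_i - max) / sum
            for (std::size_t i = 0; i < d; ++i) {
                y[i] /= sum;
            }
        }
    }
}
```

Each row is independent of the others, so the rows can be split across threads, and the exp() loop is the part that benefits most from vectorization.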

This also restores threading to MlasComputeSoftmax for 1.3.

For an internal customer model, the changes reduced latency from 134ms to 124ms per inference.

@tracysh tracysh requested a review from a team as a code owner May 6, 2020 19:36
@tracysh tracysh requested review from snnn and yufenglee May 6, 2020 19:36
Contributor

@skottmckay skottmckay left a comment

:shipit:

Member

@yufenglee yufenglee left a comment

:shipit:

@pranavsharma pranavsharma merged commit cb554fb into master May 7, 2020
@pranavsharma pranavsharma deleted the tracysh/mlas_softmax branch May 7, 2020 21:02
4 participants