MLAS: tune softmax kernels for partial vectors by tracysh · Pull Request #3906 · microsoft/onnxruntime

tracysh · 2020-05-11T21:50:36Z

Description: The code paths that handle partial vectors in the softmax kernels were too slow: speed them up.

Motivation and Context
This addresses some of the slowness reported in #3892. The paths that handle partial 256-bit vectors used AVX instructions to load and store data, but these can be slower than handling the data one element at a time. In some microbenchmarks of MlasComputeSoftmax with D < 8, the updated sequences can be twice as fast.
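A minimal sketch of the idea, not the code from this PR: full 8-float blocks stand in for 256-bit vector operations, and the remaining `D % 8` elements are handled one at a time rather than through a partial vector load. The function name `ReduceMaximumSketch` and the use of plain loops in place of AVX intrinsics are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>

// Hypothetical sketch, not the MLAS implementation: returns the maximum of
// Input[0..D). Returns the lowest finite float when D == 0.
float ReduceMaximumSketch(const float* Input, size_t D)
{
    constexpr size_t kLanes = 8;  // number of floats in one 256-bit vector
    float Maximum = std::numeric_limits<float>::lowest();

    // Full blocks: each iteration stands in for one 256-bit vector operation.
    while (D >= kLanes) {
        for (size_t i = 0; i < kLanes; ++i) {
            Maximum = std::max(Maximum, Input[i]);
        }
        Input += kLanes;
        D -= kLanes;
    }

    // Partial vector: the remaining D < 8 elements are read one at a time
    // instead of going through a partial vector load.
    while (D > 0) {
        Maximum = std::max(Maximum, *Input++);
        D -= 1;
    }

    return Maximum;
}
```

For D < 8 the first loop never runs, so the whole reduction goes through the element-at-a-time path, which is the case the microbenchmarks above exercise.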

There is additional performance to be had from de-virtualizing the softmax kernel, but that will be done in a different PR.

Also, clean up pooling.cpp to use the new helpers for vector-wide max/sum reductions. The generated code is unchanged.
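For readers unfamiliar with the term, a vector-wide (horizontal) reduction collapses the lanes of one vector into a single scalar. The sketch below shows the usual halving pattern on an 8-lane array; `Float8`, `ReduceLanesSketch`, `ReduceMaximumLanesSketch`, and `ReduceSumLanesSketch` are hypothetical names, not the helpers added in this PR, and a real kernel would presumably use shuffle intrinsics rather than array indexing.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical stand-in for one 256-bit vector of eight floats.
using Float8 = std::array<float, 8>;

// Combine the upper half of the active lanes into the lower half until one
// lane remains: 8 -> 4 -> 2 -> 1.
template <typename Op>
float ReduceLanesSketch(Float8 v, Op op)
{
    for (size_t width = v.size() / 2; width > 0; width /= 2) {
        for (size_t i = 0; i < width; ++i) {
            v[i] = op(v[i], v[i + width]);
        }
    }
    return v[0];
}

// Maximum across all eight lanes.
inline float ReduceMaximumLanesSketch(const Float8& v)
{
    return ReduceLanesSketch(v, [](float a, float b) { return std::max(a, b); });
}

// Sum across all eight lanes.
inline float ReduceSumLanesSketch(const Float8& v)
{
    return ReduceLanesSketch(v, [](float a, float b) { return a + b; });
}
```

Sharing one reduction routine between max and sum is the kind of cleanup that lets callers such as pooling code drop hand-written shuffle sequences.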


tracysh added 2 commits May 11, 2020 14:40

use new reduction helpers

9a0976d

tune partial vector path

660ba2a

tracysh requested a review from a team as a code owner May 11, 2020 21:50

skottmckay approved these changes May 11, 2020

tracysh merged commit b12d35b into master May 12, 2020

tracysh deleted the tracysh/softmax_tune branch May 12, 2020 01:02

skottmckay mentioned this pull request May 12, 2020

Cherry pick PRs to release branch rel-1.3.0 #3911

Closed

stevenlix pushed a commit that referenced this pull request May 12, 2020

MLAS: tune softmax kernels for partial vectors (#3906)

5f91ac2

skottmckay added a commit that referenced this pull request Jun 9, 2020

Tune setting for when to use MlasComputeSoftmax due to changes in #3906.

20868c5

skottmckay mentioned this pull request Jun 9, 2020

Tune setting for when to use MlasComputeSoftmax due to changes in #3906. #4170

Merged

skottmckay added a commit that referenced this pull request Jun 24, 2020

Tune setting for when to use MlasComputeSoftmax due to changes in #3906. (#4170)

5dd3ebb

rayankrish pushed a commit to rayankrish/onnxruntime that referenced this pull request Jun 24, 2020

Tune setting for when to use MlasComputeSoftmax due to changes in microsoft#3906. (microsoft#4170)

8f1a65b
