LSTM performance tweak + cleanup #3868

Merged
pranavsharma merged 2 commits into master from tracysh/lstm_perf
May 8, 2020

Contributor

tracysh commented May 7, 2020

Description: Changes to the LSTM operator to improve the performance of an internal first party model. Also clean up the code after other recent changes.

Motivation and Context
For larger LSTMs, zero-filling some intermediate buffers took a noticeable chunk of time. The zeroing is not needed in these cases because the buffer is always overwritten by InitializeBuffers() or by the first ComputeGemm in the operator. For an internal first party model, this saved 4ms of single-threaded time (reducing the model from 124ms/inference to 120ms).
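To illustrate the idea (this is only a minimal sketch, not the code in this PR): if the first write into a buffer is a GEMM with beta == 0, the GEMM never reads the old contents, so a prior zero fill is wasted work. The names `NaiveGemm`, `AllocateUninitialized`, and `ComputeWithoutZeroFill` below are hypothetical stand-ins and do not correspond to the operator's actual helpers or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a GEMM call: C = A * B + beta * C.
// A is M x K, B is K x N, C is M x N, all row-major.
// When beta == 0, C is never read, so its prior contents do not matter.
void NaiveGemm(std::size_t M, std::size_t N, std::size_t K,
               const float* A, const float* B, float beta, float* C) {
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < K; ++k) {
        acc += A[i * K + k] * B[k * N + j];
      }
      C[i * N + j] = (beta == 0.0f) ? acc : acc + beta * C[i * N + j];
    }
  }
}

// Allocates storage without value-initializing it. `new float[count]`
// default-initializes, which performs no zero fill for float. By contrast,
// std::vector<float>(count) and std::make_unique<float[]>(count) both zero
// every element.
std::unique_ptr<float[]> AllocateUninitialized(std::size_t count) {
  return std::unique_ptr<float[]>(new float[count]);
}

// Computes A * B into a buffer that was never zeroed. This is safe only
// because the first (and only) write uses beta == 0 and covers every element.
std::vector<float> ComputeWithoutZeroFill(std::size_t M, std::size_t N,
                                          std::size_t K,
                                          const std::vector<float>& A,
                                          const std::vector<float>& B) {
  assert(A.size() == M * K && B.size() == K * N);
  std::unique_ptr<float[]> out = AllocateUninitialized(M * N);
  NaiveGemm(M, N, K, A.data(), B.data(), /*beta=*/0.0f, out.get());
  return std::vector<float>(out.get(), out.get() + M * N);
}
```

The saving only applies when every element is guaranteed to be written before it is read. A buffer that is accumulated into (beta != 0) or only partially written still needs its initial fill.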

Also clean up the #include files used in this module to simplify the code.

tracysh requested a review from a team as a code owner May 7, 2020 22:14
tracysh requested a review from skottmckay May 7, 2020 22:14
skottmckay previously approved these changes May 7, 2020
Contributor

skottmckay left a comment

:shipit:
