Revert Gather Grad optimization in PR 6381 targeted for ROCm #6880

Merged: suffiank merged 3 commits into master from sukha/revert-rocm-gather-grad-optimization on Mar 4, 2021

@suffiank (Contributor) commented Mar 3, 2021

For reasons not yet fully understood, PR 6381 increased the variance of the final loss observed in our BERT convergence test. Because that PR is only an optimization, we are reverting it for now and will revisit it later.

| Commit | Note | Final Loss | Sigma | Max-diff | % noise |
|---|---|---|---|---|---|
| 7f57317 | PR 6381 | 0.674179 | 0.0136674 | 0.02963 | 2.03% |
| 9a9e741 | comment optimization | 0.670375 | 0.00052346 | 0.001588 | 0.08% |

The final-loss figures are calculated across 10 runs of `run_convergence_test.py`.
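
For anyone reproducing these numbers, here is a minimal sketch of how the columns could be derived from per-run final losses. It is not taken from `run_convergence_test.py`. The function names are hypothetical, and the use of the sample standard deviation for Sigma and of max minus min for Max-diff are assumptions. The one relation the reported figures do support is that % noise equals Sigma divided by Final Loss, times 100.

```python
import statistics


def noise_percent(sigma, mean_loss):
    """Relative noise in percent: sigma as a fraction of the mean final loss."""
    return 100.0 * sigma / mean_loss


def summarize_final_losses(losses):
    """Summarize final-loss values collected from repeated convergence runs.

    Assumptions (not confirmed by the PR): Sigma is the sample standard
    deviation, and Max-diff is the spread between the largest and smallest run.
    """
    mean_loss = statistics.fmean(losses)
    sigma = statistics.stdev(losses)        # sample standard deviation (assumed)
    max_diff = max(losses) - min(losses)    # max minus min (assumed)
    return {
        "final_loss": mean_loss,
        "sigma": sigma,
        "max_diff": max_diff,
        "noise_pct": noise_percent(sigma, mean_loss),
    }


# Consistency check against the reported rows (Sigma and Final Loss from the table).
print(round(noise_percent(0.0136674, 0.674179), 2))   # 2.03
print(round(noise_percent(0.00052346, 0.670375), 2))  # 0.08
```

To use it on real data, pass the 10 final-loss values from the runs to `summarize_final_losses`.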

suffiank requested a review from a team as a code owner March 3, 2021 20:34
suffiank merged commit 7915b67 into master Mar 4, 2021
suffiank deleted the sukha/revert-rocm-gather-grad-optimization branch March 4, 2021 18:21
weixingzhang pushed a commit that referenced this pull request Aug 6, 2021
* revert gather_grad_impl.cu

* put stream changes back in

* restrict changes to commenting launch of optimized version