Are there any plans to sync with upstream?
For example, the most recent bug-fix PR for the fused softmax layer: https://github.com/NVIDIA/Megatron-LM/pull/133
and the corresponding commit: https://github.com/NVIDIA/Megatron-LM/commit/0be405263f28ab7aebaf3cbd80199b56fb6fe398
Thank you!