Introduce LayerNorm optimization from latest Apex #277
tjruwase merged 4 commits into deepspeedai:main
Conversation
@Quentin-Anthony, thanks for this PR. But we do need backwards-compatibility, so please add a version check.
Apex doesn't have versioning yet, so I added support to manually inspect the function and see if the memory_efficient argument is available. Hopefully in the future NVIDIA/apex#1648 gets merged and we can just check the Apex version.
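Roughly, the check amounts to probing the function signature (a minimal sketch, assuming the probe targets Apex's `FusedLayerNormAffineFunction`; the constant name is illustrative):

```python
# Minimal sketch of a capability probe for the memory_efficient argument added
# in NVIDIA/apex#1715. Apex exposes no package version to compare against, so
# we inspect the function signature instead.
import inspect

try:
    from apex.normalization.fused_layer_norm import FusedLayerNormAffineFunction
    # True only for Apex builds that include the memory_efficient argument.
    APEX_HAS_MEM_EFFICIENT_LN = (
        "memory_efficient"
        in inspect.signature(FusedLayerNormAffineFunction.forward).parameters
    )
except ImportError:
    APEX_HAS_MEM_EFFICIENT_LN = False
```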
Works for me. Thanks!
Hi, author of NVIDIA/apex#1715 here. Thanks for incorporating this into the repo (as the default)! This is very exciting. Moreover, I'm writing to let you know that https://github.com/Quentin-Anthony/Megatron-DeepSpeed-MS/blob/046319fecccfb8053ad3de5181e48f943ff14d27/megatron/model/fused_layer_norm.py#L96C18-L96C75 also has the same memory_efficient feature from that PR!
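For context, the linked line presumably passes the flag straight through to Apex; a minimal sketch (the wrapper function name is hypothetical, the Apex class comes from its public API):

```python
# Sketch of a forward pass that threads the memory_efficient flag from
# NVIDIA/apex#1715 down to Apex's fused layer-norm autograd function.
import torch
from apex.normalization.fused_layer_norm import FusedLayerNormAffineFunction

def fused_ln_forward(x: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor,
                     normalized_shape, eps: float, memory_efficient: bool = True):
    # Newer Apex accepts the trailing memory_efficient argument; older builds
    # only take the first five parameters.
    return FusedLayerNormAffineFunction.apply(
        x, weight, bias, normalized_shape, eps, memory_efficient)
```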
@RuiWang1998, thanks for the information. @Quentin-Anthony, do you have bandwidth to handle this?
Yep I'll take care of it |
Introduced in NVIDIA/apex#1715
My PR lets the user disable this LayerNorm optimization, but I suspect everyone will use it, so it's on by default.
This is not backwards-compatible with older Apex. Do you need a version check, or is this OK?
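As a rough illustration of the user-facing toggle (the argument name below is hypothetical and may not match the PR's actual flag):

```python
# Hypothetical command-line switch for the optimization: on by default, with
# an explicit opt-out for users who want the old LayerNorm path.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--no-mem-efficient-ln",
    action="store_false",
    dest="mem_efficient_ln",
    help="Disable the memory-efficient fused LayerNorm from newer Apex.",
)
args = parser.parse_args([])   # default invocation
assert args.mem_efficient_ln   # optimization enabled unless explicitly disabled
```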