Introduce LayerNorm optimization from latest Apex #277
tjruwase merged 4 commits into deepspeedai:main
Conversation
@Quentin-Anthony, thanks for this PR. But we do need backwards-compatibility, so please add a version check.
Apex doesn't have versioning yet, so I added support to manually inspect the function and see if the memory_efficient argument is available. Hopefully in the future NVIDIA/apex#1648 gets merged and we can just check the Apex version.
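Roughly, the check amounts to probing the function signature (a minimal sketch, assuming the probe targets Apex's `FusedLayerNormAffineFunction`; the constant name is illustrative):

```python
# Minimal sketch of a capability probe for the memory_efficient argument added
# in NVIDIA/apex#1715. Apex exposes no package version to compare against, so
# we inspect the function signature instead.
import inspect

try:
    from apex.normalization.fused_layer_norm import FusedLayerNormAffineFunction
    # True only for Apex builds that include the memory_efficient argument.
    APEX_HAS_MEM_EFFICIENT_LN = (
        "memory_efficient"
        in inspect.signature(FusedLayerNormAffineFunction.forward).parameters
    )
except ImportError:
    APEX_HAS_MEM_EFFICIENT_LN = False
```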
Works for me. Thanks!
Hi, author of NVIDIA/apex#1715 here. Thanks for incorporating this into the repo (as the default)! This is very exciting. Moreover, I'm writing to let you know that https://github.com/Quentin-Anthony/Megatron-DeepSpeed-MS/blob/046319fecccfb8053ad3de5181e48f943ff14d27/megatron/model/fused_layer_norm.py#L96C18-L96C75 also has the same memory_efficient feature from that PR!
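For context, the linked line presumably passes the flag straight through to Apex; a minimal sketch (the wrapper function name is hypothetical, the Apex class comes from its public API):

```python
# Sketch of a forward pass that threads the memory_efficient flag from
# NVIDIA/apex#1715 down to Apex's fused layer-norm autograd function.
import torch
from apex.normalization.fused_layer_norm import FusedLayerNormAffineFunction

def fused_ln_forward(x: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor,
                     normalized_shape, eps: float, memory_efficient: bool = True):
    # Newer Apex accepts the trailing memory_efficient argument; older builds
    # only take the first five parameters.
    return FusedLayerNormAffineFunction.apply(
        x, weight, bias, normalized_shape, eps, memory_efficient)
```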
@RuiWang1998, thanks for the information. @Quentin-Anthony, do you have bandwidth to handle this?
Yep I'll take care of it |
Introduced in NVIDIA/apex#1715
My PR lets the user disable this LayerNorm optimization, but I suspect everyone will use it, so it's on by default.
This is not backwards-compatible with older Apex. Do you need a version check, or is this OK?
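As a rough illustration of the user-facing toggle (the argument name below is hypothetical and may not match the PR's actual flag):

```python
# Hypothetical command-line switch for the optimization: on by default, with
# an explicit opt-out for users who want the old LayerNorm path.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--no-mem-efficient-ln",
    action="store_false",
    dest="mem_efficient_ln",
    help="Disable the memory-efficient fused LayerNorm from newer Apex.",
)
args = parser.parse_args([])   # default invocation
assert args.mem_efficient_ln   # optimization enabled unless explicitly disabled
```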