Distributed optimizer support for contiguous param buffer with FP8 params by timmoon10 · Pull Request #1749 · NVIDIA/apex

timmoon10 · 2023-11-14T00:19:22Z

#1723 added distopt infrastructure to support FP8 parameters in NeMo, but I found a bug when contiguous_param_buffer=True. In the non-FP8 case, the local shards of the updated params are views into the contiguous buffer: the Adam kernel outputs to the buffer, we do in-place all-gathers, and the params are ready for fprop. The FP8 case, however, needs a temporary buffer since the Adam kernel doesn't support FP8: the Adam kernel outputs to a temporary FP32 buffer and we then cast to FP8 into the contiguous param buffer.

Signed-off-by: Tim Moon <tmoon@nvidia.com>
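
To illustrate the two code paths described above, here is a minimal standard-library sketch, not the actual apex implementation. Everything in it is an assumption made for illustration: `toy_kernel` is plain SGD standing in for the Adam kernel, `toy_cast_to_uint8` is a scale-round-clamp standing in for a real FP8 cast, and `step_fp32`, `step_fp8`, `LR`, and `scale` are hypothetical names. All-gathers are omitted.

```python
from array import array

LR = 0.5  # hypothetical learning rate for the toy update


def toy_kernel(master, grads, out):
    """Stand-in for the Adam kernel (plain SGD here): writes FP32 results to `out`."""
    for i, (p, g) in enumerate(zip(master, grads)):
        master[i] = p - LR * g
        out[i] = master[i]


def toy_cast_to_uint8(values, scale):
    """Stand-in for an FP8 cast: scale, round, and clamp each value to one byte."""
    return bytes(max(0, min(255, round(v * scale))) for v in values)


def step_fp32(param_buffer, start, end, master, grads):
    # Non-FP8 case: the local shard is a view into the contiguous buffer,
    # so the kernel writes the updated values straight into it.
    shard_view = memoryview(param_buffer)[start:end]
    toy_kernel(master, grads, shard_view)


def step_fp8(param_buffer, start, end, master, grads, scale):
    # FP8 case: the kernel cannot produce 8-bit output, so it fills a temporary
    # FP32 buffer, and the result is then cast into the byte-typed buffer.
    tmp = array("f", [0.0] * (end - start))
    toy_kernel(master, grads, tmp)
    param_buffer[start:end] = toy_cast_to_uint8(tmp, scale)


# Contiguous FP32 buffer: the shard covering elements 1..2 is updated in place.
fp32_buffer = array("f", [1.0, 2.0, 3.0, 4.0])
step_fp32(fp32_buffer, 1, 3, master=[2.0, 3.0], grads=[1.0, 2.0])

# Contiguous byte buffer standing in for FP8 storage (viewed as uint8).
fp8_buffer = bytearray(4)
step_fp8(fp8_buffer, 1, 3, master=[2.0, 3.0], grads=[1.0, 2.0], scale=16.0)
```

In the sketch, only the FP8 path allocates `tmp`; the FP32 path never leaves the contiguous buffer, which is why the in-place all-gather can follow directly.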

crcrpar

Would it be feasible to add a test case?

timmoon10 · 2023-11-17T23:01:37Z

👍 Done

Debug distopt contiguous param buffers with uint8 param all-gathers

40500a3

Signed-off-by: Tim Moon <tmoon@nvidia.com>

timmoon10 mentioned this pull request Nov 14, 2023

Distributed optimizer support for experimental FP8 tensors NVIDIA-NeMo/NeMo#7885

Closed

8 tasks

crcrpar reviewed Nov 17, 2023

Add test

1696bd4

Signed-off-by: Tim Moon <tmoon@nvidia.com>

timmoon10 requested a review from crcrpar November 17, 2023 23:01

timmoon10 mentioned this pull request Nov 17, 2023

Add distopt support for FP8 params and BF16 optimizer state NVIDIA-NeMo/NeMo#7909

Merged

8 tasks

Avoid temporary buffer for param shard in optim step if possible

42d5d8d

Signed-off-by: Tim Moon <tmoon@nvidia.com>

crcrpar merged commit a2f6683 into NVIDIA:master Nov 20, 2023

timmoon10 mentioned this pull request Jan 23, 2024

Support scaled optimizer state in distributed Adam optimizer #1771

Merged

crcrpar merged 3 commits into NVIDIA:master from timmoon10:fp8-distopt-bugfix
