
[FEATURE]: support BF16 mixed precision training #3839

@ver217

Description


Describe the feature

BFloat16 may be useful when training LLMs. We should support BF16 mixed precision training.
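For context, BF16 keeps FP32's 8-bit exponent, so unlike FP16 it typically trains stably without loss scaling. A minimal PyTorch autocast sketch of what BF16 mixed precision looks like (illustrative only, not the ColossalAI API):

```python
import torch

# Minimal BF16 mixed precision step with PyTorch autocast.
# BF16 shares FP32's exponent range, so no GradScaler is needed
# (unlike FP16 mixed precision).
model = torch.nn.Linear(1024, 1024).cuda()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 1024, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    out = model(x)                    # matmul runs in BF16
    loss = out.float().pow(2).mean()  # reduce in FP32 for stability
loss.backward()                       # grads land in the param dtype (FP32 here)
opt.step()
opt.zero_grad()
```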

UPDATED: I found that AVX does not support loading BF16 directly, so BF16 support in the CPU Adam kernel may not be implemented in the near future.
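Since BF16 is just the upper 16 bits of an FP32 value, a CPU kernel without native BF16 loads would have to widen every value with a 16-bit left shift before it can use AVX FP32 instructions. A NumPy sketch of that bit trick, purely to illustrate the extra per-load work (not the actual kernel code):

```python
import numpy as np

def bf16_bits_to_fp32(bf16_bits: np.ndarray) -> np.ndarray:
    """Widen BF16 values (stored as uint16 bit patterns) to FP32.

    BF16 is the upper half of an FP32 word, so shifting the 16 bits
    into the top of a uint32 and reinterpreting yields the exact FP32
    value -- the extra step an AVX kernel would need on every load.
    """
    return (bf16_bits.astype(np.uint32) << 16).view(np.float32)

def fp32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
    """Truncate FP32 to BF16 bit patterns (round-toward-zero for
    brevity; a real kernel would round-to-nearest-even)."""
    return (x.view(np.uint32) >> 16).astype(np.uint16)

x = np.float32([1.5, -2.25, 3.1415])
roundtrip = bf16_bits_to_fp32(fp32_to_bf16_bits(x))
print(roundtrip)  # close to x, with roughly 2-3 decimal digits of precision
```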

Possible roadmap:

  • Fused Adam kernel: support BF16 gradients.
  • Low-level ZeRO: support BF16. Since it may use gradient accumulation, the accumulated gradients should be kept in FP32, so we have to expose a flag such as force_fp32_grad to control this (see the sketch after this list).
  • Gemini: support BF16.
  • CPU Adam kernel: support BF16 gradients.
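On the ZeRO point: repeatedly adding small BF16 gradients loses low-order bits, which is why the accumulation buffer should stay FP32. A hedged sketch of what the proposed force_fp32_grad flag would control; the flag name comes from this issue, but the loop shape here is an assumption, not the ColossalAI implementation:

```python
import torch

# Hypothetical illustration of FP32 gradient accumulation for BF16
# training; `force_fp32_grad` is the flag proposed above, not an
# existing ColossalAI option.
force_fp32_grad = True
accum_steps = 4

p = torch.randn(1024, dtype=torch.bfloat16, requires_grad=True)
acc_dtype = torch.float32 if force_fp32_grad else torch.bfloat16
grad_buf = torch.zeros(1024, dtype=acc_dtype)

for _ in range(accum_steps):
    loss = (p.float() ** 2).mean()       # stand-in for the real loss
    loss.backward()                      # p.grad is BF16 (matches the leaf)
    grad_buf.add_(p.grad.to(acc_dtype))  # upcast before accumulating
    p.grad = None

grad_buf.div_(accum_steps)
# the optimizer step would then consume grad_buf (e.g. via fused Adam)
```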
