[FEATURE]: [PyTorch] per-channel FP8 quantization #5873

@BurkeHulk

Description

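Implement per-channel scaling for FP8 quantization in PyTorch, i.e. one scale factor per channel instead of a single per-tensor scale.
Support the native PyTorch FP8 dtypes (e.g. `torch.float8_e4m3fn` and `torch.float8_e5m2`).
References: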
https://pytorch.org/docs/stable/tensors.html#id7
https://arxiv.org/pdf/2209.05433
