Implement per-channel scaling (in PyTorch) for FP8 quantization.
Support PyTorch native FP8 formats.
Refer to:
https://pytorch.org/docs/stable/tensors.html#id7
https://arxiv.org/pdf/2209.05433