Extended quantization layouts for ComfyUI, enabling loading and inference with models quantized by convert_to_quant.
| Format | Layout | `quant_format` | Status |
|---|---|---|---|
| FP8 (tensor-wise) | TensorCoreFP8Layout | `float8_e4m3fn` | Supported (ComfyUI built-in) |
| FP8 (row-wise) | RowWiseFP8Layout | `float8_e4m3fn_rowwise` | WIP |
| FP8 (block-wise) | BlockWiseFP8Layout | `float8_e4m3fn_blockwise` | WIP |
| INT8 (block-wise) | BlockWiseINT8Layout | `int8` | Supported |
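As a rough illustration of the block-wise INT8 scheme in the table above (a hedged sketch in NumPy, not the extension's actual implementation), each run of `block_size` consecutive weight values shares one scale, chosen so the block's maximum magnitude maps to 127:

```python
import numpy as np

def quantize_blockwise_int8(w, block_size=128):
    """Sketch of block-wise INT8 quantization: one float scale per block of
    `block_size` consecutive values. Assumes w.size is divisible by block_size."""
    blocks = w.astype(np.float32).reshape(-1, block_size)
    scale = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale)  # avoid div-by-zero on all-zero blocks
    q = np.clip(np.round(blocks / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_blockwise_int8(q, scale, shape):
    """Reconstruct an approximate float tensor from INT8 blocks and their scales."""
    return (q.astype(np.float32) * scale).reshape(shape)
```

With a block size of 128, a 4096x4096 weight stores 4096x4096 int8 values plus 131072 float scales, so the per-block scales add well under 1% overhead versus tensor-wise quantization while tracking local dynamic range much more closely.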
1. Clone into your ComfyUI `custom_nodes` directory:

   ```
   cd ComfyUI/custom_nodes
   git clone https://github.com/silveroxides/ComfyUI-QuantOps.git
   ```

2. (Optional) Install Triton for GPU-accelerated INT8 inference:

   ```
   # Activate your ComfyUI venv first!
   # Linux
   pip install triton
   # Windows
   pip install triton-windows
   ```
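Since Triton is optional, a quick way to confirm whether it is importable in your ComfyUI environment is a guarded import (a generic sketch — the extension's own detection and fallback behavior may differ):

```python
# Check whether Triton is importable; an INT8 path can fall back to
# plain PyTorch kernels when it is not (assumption, not extension-specific).
try:
    import triton  # noqa: F401
    HAS_TRITON = True
except ImportError:
    HAS_TRITON = False

print("Triton available:", HAS_TRITON)
```

Run this inside the same venv ComfyUI uses; a `False` result there simply means the INT8 kernels will not be GPU-accelerated.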
Use the QuantizedModelLoader node to load models created by convert_to_quant:

1. Quantize your model with `convert_to_quant`:

   ```
   convert_to_quant -i model.safetensors --int8 --comfy_quant --simple --block_size 128
   ```

2. Place the output in your ComfyUI `models/checkpoints` folder
Use the Load CLIP (Quantized) node for INT8-quantized text encoders:

1. Quantize your text encoder (CLIP, T5, etc.):

   ```
   convert_to_quant -i t5xxl.safetensors --int8 --comfy_quant --simple --block_size 128
   ```

2. Place the output in `ComfyUI/models/text_encoders/`
3. Select the appropriate type (e.g., `sd3` or `flux` for T5-XXL)
MIT License
- lyogavin for PR #10864 to ComfyUI.
- Clybius for inspiring me to take on quantization, and for his Learned-Rounding repository.