Integrate fp16/bf16 support to sdxl model loading #791
kohya-ss merged 2 commits into kohya-ss:dev from
Conversation
Thank you for this! This is very useful. I wonder if it is possible to load the Text Encoders as fp32, even with the full_fp16/bf16 option. If it is not possible, this feature could be enabled only when the lowram/lowvram option is specified.
Yes,
Thank you for the clarification, and sorry for the misunderstanding. I understand that the text encoders are loaded in fp16/bf16 only if the Diffusers format is used.
I've merged. This significantly reduces the peak memory usage. Thanks again! |
Could this have introduced a major bug? My SDXL LoRAs come out over-trained now with the same settings compared to before this change. I am actually testing the effect of this option right now.
Maybe related issue: #788
- `unet` and `vae` are cast to fp16/bf16 dtype if full fp16/bf16 is used.
- `text_encoders` remain in fp32 for caching outputs.

This should reduce the peak RAM/VRAM usage when `--full_fp16`/`--full_bf16` is enabled, during model loading on CPU/GPU. It also reduces RAM usage when loading a checkpoint from the `safetensors` format.
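The behavior described above can be sketched as follows. This is a minimal illustration with stand-in `nn.Linear` modules rather than the real SDXL components, and the `weight_dtype` selection is an assumption about how the `--full_fp16`/`--full_bf16` flags map to dtypes:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the SDXL components (not the real architectures).
unet = nn.Linear(4, 4)
vae = nn.Linear(4, 4)
text_encoder = nn.Linear(4, 4)

# What --full_fp16 would select; torch.bfloat16 for --full_bf16 (assumption).
weight_dtype = torch.float16

# Cast the U-Net and VAE at load time, halving their memory footprint
# during checkpoint loading instead of casting after a full fp32 load.
unet.to(dtype=weight_dtype)
vae.to(dtype=weight_dtype)

# The text encoders stay in fp32 so their cached outputs remain precise.
assert text_encoder.weight.dtype == torch.float32
```

Casting at load time is what lowers the *peak* usage: the fp32 copy of the U-Net and VAE weights never needs to exist alongside the half-precision one.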