Open
Labels: kind:chore (Internal improvements)
Description
A list of ideas to explore:
- Lazy transfers (so we don't load data into the GPU at once)
- FP16 on load
- FP16 policies on Axon
- Attention slicing (no longer applicable: "Remove attention slicing from docs", huggingface/diffusers#4487)
- Flash attention (JAX version) (see notes in "Refactor attention implementation", #300)
- DPM-Solver++ (more schedulers here, here, and in the comments below) (another PyTorch implementation)
- TokenMerging
- LCM+LoRA
- DeepCache (not applicable: "Reduce StableDiffusion memory usage", #147 (comment))
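
To make the first two ideas concrete (lazy transfers and FP16 on load), here is a minimal, library-agnostic sketch in plain Python. It is not Bumblebee or Axon code: the names `load_fp16_lazily` and `params` are hypothetical, and lists of floats stand in for FP32 tensors. Each parameter is converted to half precision one at a time, so the full FP32 copy never has to be materialized on the target device at once.

```python
import struct


def load_fp16_lazily(fp32_params):
    """Yield (name, fp16_bytes) one parameter at a time.

    `fp32_params` is a hypothetical stand-in for a checkpoint: a dict
    mapping parameter names to lists of floats. Each list is packed as
    little-endian IEEE 754 half-precision values (struct format "e"),
    so only one parameter is converted per iteration step.
    """
    for name, values in fp32_params.items():
        yield name, struct.pack(f"<{len(values)}e", *values)


def nbytes_fp32(fp32_params):
    # 4 bytes per element if every value were stored as float32.
    return sum(4 * len(values) for values in fp32_params.values())


def nbytes_fp16(packed_params):
    # 2 bytes per element after packing to half precision.
    return sum(len(blob) for blob in packed_params.values())


# Toy "checkpoint" with two parameters (illustrative values only).
params = {"kernel": [0.5, -1.25, 3.0, 0.1], "bias": [0.0, 2.0]}

# Consume the generator; a real loader would transfer each item to the
# accelerator inside this loop instead of collecting into a dict.
packed = dict(load_fp16_lazily(params))

print(nbytes_fp32(params), "bytes as FP32 vs", nbytes_fp16(packed), "bytes as FP16")
```

The trade-off is precision: values that are not exactly representable in half precision (such as `0.1`) are rounded on conversion, which is why a per-layer policy (the "FP16 policies on Axon" item) may be preferable to a blanket cast.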