Reduce StableDiffusion memory usage

A list of ideas to explore:

* [x] Lazy transfers (so we don't load all the data onto the GPU at once)
* [x] FP16 on load
* [x] FP16 policies on Axon
* [x] ~[Attention slicing](https://github.com/huggingface/diffusers/pull/366)~ (no longer applicable, see https://github.com/huggingface/diffusers/issues/4487)
* [x] ~[Flash attention](https://huggingface.co/docs/diffusers/optimization/fp16#memory-efficient-attention) ([JAX version](https://github.com/lucidrains/flash-attention-jax))~ (see notes in https://github.com/elixir-nx/bumblebee/pull/300)
* [ ] [DPM-Solver++](https://mobile.twitter.com/pcuenq/status/1590665645233881089) (more schedulers [here](https://github.com/ozanciga/diffusion-for-beginners), [here](https://stable-diffusion-art.com/samplers/#Samplers_overview), and in the comments below) ([another PyTorch implementation](https://github.com/lucidrains/denoising-diffusion-pytorch/pull/148/files))
* [ ] [Token Merging](https://arxiv.org/abs/2303.17604)
* [ ] LCM+LoRA
* [x] ~DeepCache~ (not applicable, see https://github.com/elixir-nx/bumblebee/issues/147#issuecomment-1963787773)
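
For a rough sense of what the FP16 items buy us, here is a minimal back-of-the-envelope sketch in Python. It only counts parameter storage (activations and intermediate buffers are ignored), and the parameter count is a hypothetical round number chosen for illustration, not a measured figure for any Stable Diffusion checkpoint.

```python
# Bytes needed to store one element of each floating-point type.
BYTES_PER_ELEMENT = {"f32": 4, "f16": 2}


def param_memory_mib(num_params: int, dtype: str) -> float:
    """Return the memory needed to hold `num_params` parameters, in MiB."""
    return num_params * BYTES_PER_ELEMENT[dtype] / (1024 ** 2)


# Hypothetical parameter count, used only to illustrate the ratio.
NUM_PARAMS = 1_000_000_000

for dtype in ("f32", "f16"):
    print(f"{dtype}: {param_memory_mib(NUM_PARAMS, dtype):.1f} MiB")
```

Whatever the real parameter count is, storing the parameters as f16 takes half the memory of f32, which is why loading in FP16 is one of the cheapest wins on the list.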


