Keep using float16 in ldm.modules.attention; ~12% speed up and less VRAM #494

mh-dm · 2022-09-10T23:09:17Z

Tested on an NVIDIA eGPU setup, so YMMV with the default half-precision math. When generating 512x512 images, speed went from 1.69it/s to 1.89it/s and max VRAM dropped from 4.44G to 3.37G. Measured after applying the separate PR #484 (Move model.half() before model.to(device)).
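As a quick sanity check on the "~12%" in the title, the relative changes can be recomputed from the figures quoted above. This is only an arithmetic sketch; the helper name `percent_change` is made up for illustration and is not part of the codebase.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the PR description (512x512 generation).
speed_gain = percent_change(1.69, 1.89)   # iterations per second
vram_change = percent_change(4.44, 3.37)  # max VRAM in G

print(f"speed: {speed_gain:+.1f}%")
print(f"max VRAM: {vram_change:+.1f}%")
```

With these numbers the speed-up comes out to about 11.8% and the max VRAM reduction to about 24.1%.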


tildebyte · 2022-09-11T01:51:50Z

If this is identical to #495, please close this one. Only bugfixes should target 'main'.

lstein closed this Sep 11, 2022

mh-dm deleted the float16 branch September 12, 2022 10:47
