
Support offloading feature in diffusers #3373

@jerryzh168

Description

  1. diffusers currently uses older torchao versions; we should update it to the stable 0.14.1 release (and use nightly for testing).

  2. Reproduce the errors and add the missing torchao support to unskip the following tests:

Note: right now the tests use int8 weight-only: https://github.com/huggingface/diffusers/blob/d5da453de56fe73e0cfd26204ccca441af568ca1/tests/quantization/torchao/test_torchao.py#L660. We haven't fully migrated that config yet, so we can start with the float8 weight-only config, which uses the new Float8Tensor design, and then expand to other quant types.
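To make the float8 weight-only starting point concrete, here is a minimal, dependency-free sketch of the scheme (this is not the torchao implementation — Float8Tensor also casts to actual 8-bit floats; this only models the per-tensor scaling into the e4m3 range, which is the part that determines numerics for offloading/serialization tests):

```python
# Conceptual sketch of float8 (e4m3-style) weight-only quantization:
# scale weights so the largest magnitude maps to the e4m3 max, then
# dequantize at matmul time. Plain Python, no torch/torchao required.

E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3


def quantize_weight_fp8(weights):
    """Per-tensor symmetric scale; real fp8 would also round each value."""
    amax = max(abs(w) for row in weights for w in row)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    q = [[w / scale for w in row] for row in weights]  # would be cast to fp8
    return q, scale


def dequantize(q, scale):
    """Recover high-precision weights before (or fused into) the matmul."""
    return [[w * scale for w in row] for row in q]


w = [[0.5, -2.0], [1.0, 4.0]]
q, s = quantize_weight_fp8(w)
w_back = dequantize(q, s)
```

Since this sketch skips the fp8 rounding step, the round trip is exact up to floating-point error; with real fp8 casting you would see quantization error bounded by the e4m3 precision at each scale bucket.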

https://github.com/huggingface/diffusers/blob/d5da453de56fe73e0cfd26204ccca441af568ca1/tests/quantization/torchao/test_torchao.py#L664

https://github.com/huggingface/diffusers/blob/d5da453de56fe73e0cfd26204ccca441af568ca1/tests/quantization/torchao/test_torchao.py#L673

int8 weight-only is still more important, but its migration is in progress: https://github.com/pytorch/ao/pull/3241/files. We can use float8 first to figure out what fix is needed, then apply the same fix to the migrated int8 tensor.
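For comparison, the int8 weight-only scheme being migrated differs mainly in using an integer grid with per-channel scales. A minimal sketch (again plain Python, not the torchao implementation — the per-output-channel symmetric layout shown here is an assumption about the common int8 weight-only recipe):

```python
# Minimal sketch of int8 weight-only quantization with a per-channel
# (per-output-row) symmetric scale and dequantization for the matmul.

def quantize_int8_per_channel(weights):
    """Each output row gets its own scale mapping max |w| to 127."""
    qweights, scales = [], []
    for row in weights:
        amax = max(abs(w) for w in row) or 1.0
        scale = amax / 127.0
        qrow = [max(-128, min(127, round(w / scale))) for w in row]
        qweights.append(qrow)
        scales.append(scale)
    return qweights, scales


def dequantize_int8(qweights, scales):
    """Undo the scaling; error is bounded by half a quantization step."""
    return [[q * s for q in row] for row, s in zip(qweights, scales)]


w = [[0.1, -0.5, 0.25], [2.0, -1.0, 0.5]]
qw, sc = quantize_int8_per_channel(w)
w_hat = dequantize_int8(qw, sc)
```

Unlike the float8 case, the `round()` here is lossy, so a fix validated on float8 (where scaling is the only transform) should be re-checked against the integer rounding path once the int8 migration lands.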

cc @sayakpaul


Labels

0.16 · integration (Issues related to integrations with other libraries, like huggingface, vllm, sglang, gemlite etc.) · triaged
