First Block Caching Infra for diffusers #941

Open
quic-amitraj wants to merge 4 commits into quic:main from quic-amitraj:wan2_non_uni_cache

@quic-amitraj quic-amitraj commented Apr 24, 2026

Wan-AI/Wan2.2-T2V-A14B-Diffusers

  1. Added support for non-unified Wan transformers, because the unified approach was failing at 720p. Enable it with the flag use_unified=False.
  2. Added first-block caching for the non-unified approach. Enable it with the flag enable_first_block_cache. Caching is available only with the non-unified approach.
  3. Added tests for both the non-unified path and caching.

Example:

pipeline = QEffWanPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B-Diffusers",
    use_unified=False,
    enable_first_block_cache=True,
    # Hidden-dimension downsampling used for first-block residual similarity check.
    first_block_cache_downsample_factor=4,
)

Result (90p/16TS/81f):

image
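
The comment on first_block_cache_downsample_factor says the hidden dimension is downsampled before the first-block residual similarity check. A minimal sketch of what such a check could look like is below. It is an illustration only, not the PR's implementation: the strided downsampling, the relative mean-absolute-difference metric, the helper names, and the 0.08 threshold are all assumptions.

```python
from typing import Optional, Sequence


def downsample(vec: Sequence[float], factor: int) -> list:
    """Keep every `factor`-th element along the hidden dimension (assumed strided scheme)."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return list(vec[::factor])


def relative_change(prev: Sequence[float], curr: Sequence[float], factor: int = 4) -> float:
    """Mean absolute difference between two residuals, relative to the
    mean absolute magnitude of the previous one, computed on downsampled copies."""
    p = downsample(prev, factor)
    c = downsample(curr, factor)
    if not p or len(p) != len(c):
        raise ValueError("residuals must be non-empty and have the same length")
    diff = sum(abs(a - b) for a, b in zip(p, c)) / len(p)
    scale = sum(abs(a) for a in p) / len(p)
    return diff / scale if scale > 0 else float("inf")


def can_reuse_cache(
    prev: Optional[Sequence[float]],
    curr: Sequence[float],
    factor: int = 4,
    threshold: float = 0.08,  # hypothetical value, chosen only for this sketch
) -> bool:
    """Return True when the first-block residual barely moved since the cached step."""
    if prev is None:
        return False  # nothing cached yet, so the full transformer must run
    return relative_change(prev, curr, factor) < threshold
```

With a factor of 4, only a quarter of the hidden values are compared, which makes the check cheaper at the cost of a coarser similarity estimate.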

black-forest-labs/FLUX.1-schnell

  1. Enabled first-block caching for FLUX.
  2. Enable caching with the flag enable_first_block_cache.
  3. Example:
pipeline = QEffFluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    enable_first_block_cache=True,
    # Hidden-dimension downsampling used for first-block residual similarity check.
    first_block_cache_downsample_factor=4,
)

  4. Result (height/width=256):
image
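
For readers unfamiliar with first-block caching, the control flow can be sketched on a toy model: run the first transformer block at every denoising step, and if its residual is close to the one recorded at the last full run, reuse the cached contribution of the remaining blocks instead of recomputing them. The class below is a hypothetical illustration on scalar "hidden states". The names, the threshold, and the choice to refresh the reference residual only on full runs are assumptions, not the code in this PR.

```python
from typing import Callable, List, Optional

Block = Callable[[float], float]


class FirstBlockCache:
    """Toy first-block cache operating on scalar hidden states."""

    def __init__(self, threshold: float = 0.05):
        self.threshold = threshold  # hypothetical relative tolerance
        self.ref_first_residual: Optional[float] = None
        self.cached_tail_residual: Optional[float] = None
        self.skipped_steps = 0

    def forward(self, x: float, blocks: List[Block]) -> float:
        first, tail = blocks[0], blocks[1:]
        h = first(x)                 # the first block always runs
        first_residual = h - x
        ref = self.ref_first_residual
        if (
            ref is not None
            and self.cached_tail_residual is not None
            and abs(first_residual - ref) <= self.threshold * abs(ref)
        ):
            # Residual is close to the reference: skip the remaining blocks.
            self.skipped_steps += 1
            return h + self.cached_tail_residual
        # Otherwise run the remaining blocks and refresh the cache.
        out = h
        for block in tail:
            out = block(out)
        self.ref_first_residual = first_residual
        self.cached_tail_residual = out - h
        return out


blocks = [lambda v: v + 1.0, lambda v: v * 2.0, lambda v: v + 3.0]
cache = FirstBlockCache(threshold=0.05)
step1 = cache.forward(1.0, blocks)   # full run, fills the cache
step2 = cache.forward(10.0, blocks)  # same first-block residual, tail is skipped
```

The second call returns an approximation (the cached tail residual added to the fresh first-block output), which is the accuracy-for-speed trade that this kind of caching makes.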

Signed-off-by: Amit Raj <amitraj@qti.qualcomm.com>
@quic-amitraj quic-amitraj self-assigned this Apr 24, 2026
@quic-amitraj quic-amitraj added the Diffusers label (use for PRs related to diffusers in efficient-transformers) Apr 24, 2026
@quic-amitraj quic-amitraj marked this pull request as ready for review April 24, 2026 08:38