Conversation
- Updated the pipeline structure to include `ZImageImg2ImgPipeline` alongside `ZImagePipeline`.
- Implemented the `ZImageImg2ImgPipeline` class for image-to-image transformations, including the necessary methods for encoding prompts, preparing latents, and denoising.
- Enhanced `auto_pipeline` to map the new `ZImageImg2ImgPipeline` for image generation tasks.
- Added unit tests for `ZImageImg2ImgPipeline` to ensure functionality and performance.
- Updated the dummy objects to include `ZImageImg2ImgPipeline` for testing purposes.
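For context on the "preparing latents" step: diffusers img2img pipelines conventionally skip the first part of the noise schedule based on a `strength` argument, then noise the encoded input image to the resulting start timestep. A minimal sketch of that bookkeeping, mirroring the common `get_timesteps` pattern (the exact helper names in this PR may differ):

```python
def img2img_start_step(num_inference_steps: int, strength: float) -> int:
    """Return the index of the first scheduler step an img2img run executes.

    strength=1.0 repaints from pure noise (all steps run);
    strength=0.0 keeps the input image (no steps run).
    """
    # Number of steps that will actually be executed.
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    # Skip the earliest (noisiest) steps we are not running.
    return max(num_inference_steps - init_timestep, 0)

# With an 8-step budget (Z-Image-Turbo's NFE count) and strength=0.6,
# denoising starts at step 4 and runs the last 4 steps.
```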
For some reason the VAE tiling test couldn't meet the 0.2 diff threshold, so my test raises it to 0.3; whether further investigation is warranted, I am not sure.
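For reference, a tiling parity test of this kind decodes the same latents with and without tiling and asserts the maximum absolute difference stays under the threshold. A self-contained sketch of the comparison (the arrays here are stand-ins for real decoded outputs):

```python
import numpy as np

def max_abs_diff(a, b) -> float:
    """Max absolute elementwise difference between two image arrays."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    return float(np.abs(a - b).max())

# Stand-in "decoded" outputs; a real test would decode identical latents
# once with VAE tiling enabled and once without.
full_decode = np.random.default_rng(0).random((1, 32, 32, 3))
tiled_decode = full_decode + 0.25  # pretend tiling introduced a 0.25 offset

diff = max_abs_diff(full_decode, tiled_decode)
assert diff < 0.3, f"tiled/non-tiled outputs diverged: {diff}"  # 0.2 was too tight here
```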
asomoza
left a comment
Thanks a lot again! For this one we should probably wait for the LoRA one to be merged. I left a few comments.
@asomoza I just thought: I have an inpainting PR lined up. Do you think keeping this one img2img-only and doing inpainting separately afterwards is the better approach, to keep the PR review easier? Or is it less work for you if I also merge it into this PR?
I prefer to keep them separate. I'm not really sure inpainting can be good with this model, so I want to test it first; maybe we can add something like differential diffusion as a switch to make it better.
- Add `# Copied from` annotations to `encode_prompt` and `_encode_prompt`
- Add `ZImagePipeline` to `auto_pipeline.py` for AutoPipeline support
@CalamitousFelicitousness you need to resolve the conflict before we can proceed.
Resolved conflict in `src/diffusers/pipelines/__init__.py` by:
- Accepting upstream's expanded Kandinsky5 pipelines
- Preserving the `ZImageImg2ImgPipeline` addition
@asomoza The conflict has been resolved.
Resolve conflict in `_toctree.yml`: use upstream's `z_image.md` naming convention and add the `ZImageImg2ImgPipeline` documentation.
@asomoza A gentle reminder, if you could review.
@sayakpaul do you have time to also review it, in case I missed something?
sayakpaul
left a comment
Just some nits! Thanks for this!
> Z-Image-Turbo is a distilled version of Z-Image that matches or exceeds leading competitors with only 8 NFEs (Number of Function Evaluations). It offers sub-second inference latency on enterprise-grade H800 GPUs and fits comfortably within 16 GB of VRAM on consumer devices. It excels in photorealistic image generation, bilingual text rendering (English & Chinese), and robust instruction adherence.
> ## Image-to-image
If the default example docstrings don't change, then we can remove this, as the `## ZImageImg2ImgPipeline` section below will render it.
```python
    t_scale=1000.0,
    axes_dims=[8, 4, 4],
    axes_lens=[256, 32, 32],
)
```
We can do something like:
https://github.com/huggingface/diffusers/blob/f67639b0bb54d3ccf7fc17157ba0b1e2e959ac5e/tests/pipelines/z_image/test_z_image.py#L104C9-L110C1
and fix the slices for `test_inference()`?
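The linked test compares a small corner slice of the generated image against hardcoded expected values. A generic sketch of that pattern (the slice position and tolerance follow common diffusers test style, not necessarily this PR's exact test):

```python
import numpy as np

def assert_image_slice(image: np.ndarray, expected_slice: np.ndarray, atol: float = 1e-3):
    """Compare a 3x3 corner of a (batch, H, W, C) image against expected values."""
    image_slice = image[0, -3:, -3:, -1].flatten()
    max_diff = np.abs(image_slice - expected_slice).max()
    assert max_diff <= atol, f"slice mismatch, max diff {max_diff}"

# Deterministic fake "generated" image for illustration:
image = np.linspace(0.0, 1.0, 1 * 8 * 8 * 3).reshape(1, 8, 8, 3)
expected = image[0, -3:, -3:, -1].flatten()
assert_image_slice(image, expected)  # passes: slices are identical
```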
What does this PR do?
This PR adds an img2img pipeline for Z-Image. A summary of the changes is below.
Closes issue #12752
Tested using a simple script with the prompt:

> a woman sitting in a dark room, oil painting style, impressionist, vibrant colors

LoRA functionality depends on my other PR #12750, so they will have to be merged sequentially. I did not think there was much point in leaving it out.
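A script along these lines could exercise the pipeline; everything beyond the prompt (checkpoint path, input image, `strength`, step count) is an assumption rather than the author's actual script, and the heavy part is guarded behind an env flag so the sketch is inert by default:

```python
import os

prompt = "a woman sitting in a dark room, oil painting style, impressionist, vibrant colors"
# Assumed call arguments; the merged pipeline may use different names/defaults.
call_kwargs = dict(prompt=prompt, strength=0.6, num_inference_steps=8)

if os.environ.get("RUN_Z_IMAGE"):  # opt-in: requires a GPU and this PR's branch
    import torch
    from diffusers import ZImageImg2ImgPipeline  # added by this PR
    from diffusers.utils import load_image

    pipe = ZImageImg2ImgPipeline.from_pretrained(
        "path/to/z-image-checkpoint",  # placeholder checkpoint id
        torch_dtype=torch.bfloat16,
    ).to("cuda")
    image = pipe(image=load_image("input.png"), **call_kwargs).images[0]
    image.save("z_image_img2img.png")
```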
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul @asomoza