RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

NeurIPS 2025 Spotlight ★

🔥🔥🔥 RepLDM is a training-free method for higher-resolution image generation, enabling 8K image generation! You can freely adjust the richness of colors and details in the generated image through attention guidance.

Boyuan Cao, Jiaxin Ye, Yujie Wei, and Hongming Shan*
(* Corresponding Author)
From Fudan University

📝 TODO List

SDXL Based
- Text to Image
  - [ √ ] RepLDM
  - [ √ ] FreeScale + AttentionGuidance
- +ControlNet
  - [ √ ] RepLDM
  - [___] FreeScale + AttentionGuidance
FLUX Based
- Text to Image
  - [___] RepLDM
SD3 Based
- Text to Image
  - [___] RepLDM
[___] Web UI

⚙️ Setup

Install Environment

conda create -n repldm python=3.9
conda activate repldm
pip install -e .

🚀 Quick Start

Quick start with Gradio

TODO

Text to image generation

TODO

📖 Overview of RepLDM

RepLDM enables the rapid synthesis of high-quality, high-resolution images without the need for further training.

It consists of two stages:

1. Synthesizing high-quality images at the training resolution using Attention Guidance.
2. Generating finer high-resolution images through pixel upsampling and a "diffusion-denoising" loop.
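The control flow of the two stages can be sketched roughly as below. This is only an illustration under stated assumptions: `generate_base`, `denoise`, `upsample_pixels`, `diffuse`, `two_stage_sketch`, and the parameters `factor`, `rounds`, and `alpha_bar` are hypothetical names, not the repository's API. `denoise` stands in for a pretrained latent diffusion model's sampler, and the number of loop rounds and the noise level are invented for the example.

```python
import numpy as np


def upsample_pixels(image, factor):
    """Nearest-neighbour upsampling of an (H, W, C) image by an integer factor."""
    return image.repeat(factor, axis=0).repeat(factor, axis=1)


def diffuse(x, alpha_bar, rng):
    """Standard forward-diffusion mix of signal and Gaussian noise."""
    noise = rng.standard_normal(x.shape)
    return np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * noise


def two_stage_sketch(generate_base, denoise, factor=2, rounds=2,
                     alpha_bar=0.7, seed=0):
    """Hypothetical outline: base image, pixel upsampling, diffuse/denoise loop."""
    rng = np.random.default_rng(seed)
    # Stage 1: an image at the training resolution (assumed to already
    # include Attention Guidance inside generate_base).
    image = generate_base()
    # Stage 2: upsample in pixel space, then repeatedly re-noise and denoise.
    image = upsample_pixels(image, factor)
    for _ in range(rounds):
        noisy = diffuse(image, alpha_bar, rng)
        image = denoise(noisy)  # placeholder for the diffusion model
    return image


# Toy usage with stand-in callables (no real model involved).
result = two_stage_sketch(
    generate_base=lambda: np.zeros((4, 4, 3)),
    denoise=lambda x: np.clip(x, 0.0, 1.0),
)
print(result.shape)
```

The stand-in `denoise` here merely clips values, so the output is not a meaningful image; the point is only the order of operations.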

Attention Guidance enables the generation of images with more vivid colors and richer details, as shown in the figure below.

Attention Guidance can be combined with plugins such as ControlNet to further improve visual quality, as illustrated in the figure below.

Attention Guidance allows users to freely adjust the level of detail and color richness in an image according to their preferences, simply by modifying the `attention guidance scale`, as shown in the figure below.

How does attention guidance work?

Attention Guidance computes layout-enhanced representations using a training-free self-attention (TFSA) mechanism and leverages them to strengthen layout consistency:

$\tilde{\boldsymbol{z}} = \gamma\mathrm{TFSA}(\boldsymbol{z})+(1-\gamma) \boldsymbol{z}, \quad \mathrm{TFSA}(\boldsymbol{z}) = \mathrm{f}^{-1}\left(\mathrm{Softmax}\left(\frac{\mathrm{f}(\boldsymbol{z}) \mathrm{f}(\boldsymbol{z})^{\mathrm{T}}}{\lambda}\right) \mathrm{f}(\boldsymbol{z})\right),$

where $\boldsymbol{z}$ is the latent representation, $\mathrm{f}$ denotes a reshape operation, and $\gamma$ and $\lambda$ are hyperparameters. In effect, Attention Guidance moves each denoising step closer to the final state, as illustrated in the figure below.
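The formula above can be written out in a few lines of NumPy. This is a minimal sketch, not the repository's implementation: the function names `tfsa` and `attention_guidance` are hypothetical, and the sketch assumes the latent has shape (C, H, W) and that the reshape $\mathrm{f}$ flattens spatial positions into tokens of shape (H·W, C), which the text above does not specify.

```python
import numpy as np


def tfsa(z, lam):
    """Training-free self-attention on a latent z of shape (C, H, W)."""
    c, h, w = z.shape
    tokens = z.reshape(c, h * w).T                 # f(z): (H*W, C), assumed layout
    scores = tokens @ tokens.T / lam               # f(z) f(z)^T / lambda
    scores -= scores.max(axis=-1, keepdims=True)   # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    attended = weights @ tokens                    # Softmax(...) f(z)
    return attended.T.reshape(c, h, w)             # f^{-1}: back to (C, H, W)


def attention_guidance(z, gamma, lam):
    """Blend the latent with its TFSA output: gamma*TFSA(z) + (1-gamma)*z."""
    return gamma * tfsa(z, lam) + (1.0 - gamma) * z


# Toy usage on a random latent; the gamma and lambda values are arbitrary.
rng = np.random.default_rng(0)
z = rng.standard_normal((4, 8, 8))
for gamma in (0.0, 0.5, 1.0):
    guided = attention_guidance(z, gamma=gamma, lam=1.0)
    print(gamma, guided.shape)
```

With $\gamma = 0$ the latent is returned unchanged, and with $\gamma = 1$ it is fully replaced by the TFSA output, so $\gamma$ acts as the blending strength between the two.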

🔬 On Research Comparison

The implementation on the main branch includes some modifications to the original version. To compare against the original method as reported in the paper, use the code on the base branch.

😉 Citation

@inproceedings{caorepldm,
  title={RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation},
  author={Cao, Boyuan and Ye, Jiaxin and Wei, Yujie and Shan, Hongming},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems}
}

