This repository was archived by the owner on Feb 7, 2025. It is now read-only.

Crop and pad in LatentDiffusionInferer #420

@virginiafdez

Description

To maximise the input size that can go through the VAE or VQ-VAE, we sometimes have to pick input shapes whose latent-space shapes cannot go through the LDM, because they are not a multiple of 2**(number of levels in the UNet).
One way around this is to pad the VAE latent before it goes into the diffusion model, then crop it back to its original shape after sampling, before passing it to the VAE / VQ-VAE decoder. A simple MONAI transform in the inferer would be enough.
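
To make the arithmetic concrete, here is a minimal sketch in plain Python of the pad-then-crop bookkeeping described above. The helper names (`divisible_pad_widths`, `crop_slices`), the symmetric padding choice, and the example latent shape are assumptions for illustration, not part of the inferer's API.

```python
from typing import List, Sequence, Tuple


def divisible_pad_widths(spatial_shape: Sequence[int], num_levels: int) -> List[Tuple[int, int]]:
    """Return (before, after) padding per spatial dim so that each padded
    size is a multiple of 2**num_levels. Hypothetical helper."""
    k = 2 ** num_levels
    widths = []
    for size in spatial_shape:
        total = (-size) % k          # amount needed to reach the next multiple of k
        before = total // 2          # split as evenly as possible
        widths.append((before, total - before))
    return widths


def crop_slices(spatial_shape: Sequence[int], widths: Sequence[Tuple[int, int]]) -> Tuple[slice, ...]:
    """Slices that undo the padding and recover the original latent shape."""
    return tuple(slice(before, before + size) for size, (before, _) in zip(spatial_shape, widths))


# Hypothetical latent spatial shape and a UNet with 3 levels (so k = 8).
latent_shape = (20, 28, 20)
widths = divisible_pad_widths(latent_shape, num_levels=3)
padded_shape = tuple(s + b + a for s, (b, a) in zip(latent_shape, widths))
slices = crop_slices(latent_shape, widths)
```

With these assumed values, the diffusion model would see a `padded_shape` latent, and indexing the sampled latent with `slices` on its spatial dimensions would restore `latent_shape` before decoding. In MONAI itself, existing transforms such as `DivisiblePad` and `CenterSpatialCrop` may cover the same two steps.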
