[Pipeline] Extending Stable Diffusion for generating videos #2432

@sayakpaul

Since Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation has been out for some time now, it'd be cool to officially support it in Diffusers 🧨

The best part is that the official repository (https://github.com/showlab/Tune-A-Video) itself already builds on top of Diffusers.

Architecture-wise, the main change is to inflate the 2D UNet so that it operates spatiotemporally across video frames; the repository implements this in its UNet3DConditionModel (see the sketch below).
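
For concreteness, here is a minimal sketch of the inflation idea, not the actual Tune-A-Video code: a pretrained 2D convolution is copied into a 3D convolution so that, at initialization, the video model reproduces the image model frame by frame. `inflate_conv2d_to_3d` is a hypothetical helper written for illustration only.

```python
# A minimal sketch of kernel inflation, not the actual Tune-A-Video code.
# `inflate_conv2d_to_3d` is a hypothetical helper for illustration only.
import torch
import torch.nn as nn


def inflate_conv2d_to_3d(conv2d: nn.Conv2d, time_kernel: int = 1) -> nn.Conv3d:
    """Copy a pretrained 2D conv into a 3D conv with a temporal axis.

    With time_kernel=1 the 3D conv applies the original 2D kernel to every
    frame independently, so the pretrained image weights are preserved.
    """
    conv3d = nn.Conv3d(
        conv2d.in_channels,
        conv2d.out_channels,
        kernel_size=(time_kernel, *conv2d.kernel_size),
        stride=(1, *conv2d.stride),
        padding=(time_kernel // 2, *conv2d.padding),
        bias=conv2d.bias is not None,
    )
    with torch.no_grad():
        # Put the 2D kernel in the center temporal slice; zero the rest,
        # so the inflated conv initially ignores neighboring frames.
        conv3d.weight.zero_()
        conv3d.weight[:, :, time_kernel // 2] = conv2d.weight
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d
```

In the paper, these inflated blocks are combined with spatio-temporal attention so that frames can exchange information during denoising.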

@zhangjiewu will it be possible to publish a few weights on the Hugging Face Hub for the community to try out quickly? Happy to help with the process :)
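
If it helps, publishing could be as simple as uploading the checkpoint folder with `huggingface_hub`. A sketch, where the repo id and local folder are placeholders rather than real artifacts:

```python
# A sketch of uploading a fine-tuned checkpoint with huggingface_hub.
# The repo id and local folder below are placeholders, not real artifacts.
from huggingface_hub import create_repo, upload_folder

repo_id = "showlab/tune-a-video-example"  # placeholder repo id
create_repo(repo_id, exist_ok=True)
upload_folder(
    repo_id=repo_id,
    folder_path="./outputs/tune-a-video-example",  # local checkpoint dir
    commit_message="Add Tune-A-Video checkpoint",
)
```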

We're more than happy to help if a community member wants to pick this up. As it will be the first end-to-end video pipeline in Diffusers, I'm very excited about it.
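
For reference, here is a sketch of what the user-facing API could look like, loosely following the official repository's `TuneAVideoPipeline` (it requires the repo's `tuneavideo` package); the base model, checkpoint path, and call arguments are illustrative, not a settled Diffusers interface:

```python
# A sketch of the prospective user-facing API, loosely following the
# official repo's TuneAVideoPipeline; paths and arguments are illustrative.
import torch
from tuneavideo.models.unet import UNet3DConditionModel
from tuneavideo.pipelines.pipeline_tuneavideo import TuneAVideoPipeline

checkpoint = "./outputs/tune-a-video-example"  # placeholder fine-tuned ckpt

# Load the inflated, fine-tuned UNet and plug it into a Stable Diffusion
# pipeline so the remaining components (VAE, text encoder) are reused.
unet = UNet3DConditionModel.from_pretrained(
    checkpoint, subfolder="unet", torch_dtype=torch.float16
)
pipe = TuneAVideoPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", unet=unet, torch_dtype=torch.float16
).to("cuda")

video = pipe(
    "a panda is surfing",
    video_length=8,
    num_inference_steps=50,
    guidance_scale=7.5,
).videos  # tensor of generated video frames
```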
