CoSeD

Contrastive Sequential-Diffusion Learning: Non-linear and Multi-Scene Instructional Video Synthesis

This is the official repository for Contrastive Sequential-Diffusion Learning: Non-linear and Multi-Scene Instructional Video Synthesis (WACV 2025).

Code Structure

9q3eu8vi/checkpoints - directory containing the available model weights

generate_only_latents.py - script for generating images using latents and performing inference with the SoftAttention model

latents_singleton.py - contains the Latents class, a singleton for managing latent vectors

sequence_predictor.py - contains the SoftAttention model and related functions for processing text and image embeddings

videos_softattention.py - script for generating videos from images using a diffusion pipeline and concatenating them into a single video
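
To illustrate the singleton pattern that latents_singleton.py is described as using, here is a minimal sketch. It is not the repository's actual code: the method names (`add`, `get`, `clear`), the string keys, and the list-of-floats latent type are assumptions made for illustration only. The idea is that every call to `Latents()` returns the same object, so the generation and inference scripts could share one store of latent vectors.

```python
from typing import Dict, List, Optional


class Latents:
    """Hypothetical singleton holding latent vectors by key (illustrative only)."""

    _instance: Optional["Latents"] = None
    _store: Dict[str, List[float]]

    def __new__(cls) -> "Latents":
        # Create the shared instance and its storage only on the first call;
        # later calls return the same object.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._store = {}
        return cls._instance

    def add(self, key: str, latent: List[float]) -> None:
        # Assumed API: register a latent vector under a string key.
        self._store[key] = latent

    def get(self, key: str) -> List[float]:
        # Assumed API: look up a previously stored latent vector.
        return self._store[key]

    def clear(self) -> None:
        # Drop all stored latents, e.g. between sequences.
        self._store.clear()


first = Latents()
second = Latents()
first.add("step_0", [0.1, 0.2, 0.3])

# Both names refer to the same object, so the latent is visible through either.
print(first is second)
print(second.get("step_0"))
```

In a real pipeline the stored values would more likely be tensors than Python lists; plain lists are used here only to keep the sketch dependency-free.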

Citation

If you find CoSeD useful for your research and applications, please cite using this BibTeX:

@misc{ramos2024contrastivesequentialdiffusionlearningnonlinear,
      title={Contrastive Sequential-Diffusion Learning: Non-linear and Multi-Scene Instructional Video Synthesis},
      author={Vasco Ramos and Yonatan Bitton and Michal Yarom and Idan Szpektor and Joao Magalhaes},
      year={2024},
      eprint={2407.11814},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.11814},
}
