Open
Description
I'm encountering a training regression after upgrading to DiffSynth 2.0. When I run the official training script example/wanvideo/model_training/full/Wan2.2-TI2V-5B.sh, the resulting model generates severely distorted outputs, particularly in the first few frames of the generated video. See the example output:
video_Wan2.2-TI2V-5B.mp4
However, when I downgrade the codebase back to v1.19 (and use the corresponding training script from that release), training succeeds and produces the expected results, with no such artifacts. I have compared the corresponding code in the two versions, but I cannot tell what causes the difference. I think this is likely a bug in 2.0. Can anyone help?
Testing environment: torch==2.5.1+cu12.4, torchvision==0.20.1