Fix LTX-2 image-to-video generation failure in two stages generation #13187
dg845 merged 4 commits into huggingface:main
sayakpaul
left a comment
Thanks! Could you also add a simple test case for this?
In LTX-2's two-stage image-to-video generation task, specifically after the upsampling step, a shape mismatch occurs between the `latents` and the `conditioning_mask`, which causes an error in function `_create_noised_state`. Fix it by creating the `conditioning_mask` based on the shape of the `latents`.
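The idea behind the fix can be illustrated with a minimal, shape-only sketch. This is not the diffusers implementation: the helper names, the `(batch, channels, frames, height, width)` layout, the channel count, and the 2x spatial upsampling factor are all assumptions chosen for illustration. It only shows why a mask sized before upsampling stops matching the latents, and why deriving the mask shape from the latents avoids the mismatch.

```python
# Hypothetical sketch: names, layout, and sizes are illustrative assumptions,
# not the actual LTX-2 pipeline code.


def build_conditioning_mask_shape(latents_shape):
    """Derive the mask shape from the latents: (B, C, F, H, W) -> (B, 1, F, H, W)."""
    batch, _channels, frames, height, width = latents_shape
    return (batch, 1, frames, height, width)


def shapes_broadcast(a, b):
    """True if two equal-rank shapes are broadcast-compatible (dims equal, or one is 1)."""
    return len(a) == len(b) and all(x == y or 1 in (x, y) for x, y in zip(a, b))


# Stage-1 latents, then the same latents after an assumed 2x spatial upsampling.
stage1_latents = (1, 128, 4, 8, 8)
stage2_latents = (1, 128, 4, 16, 16)

# A mask sized for stage 1 no longer lines up with the upsampled latents,
# while a mask derived from the current latents does.
stale_mask = build_conditioning_mask_shape(stage1_latents)
fresh_mask = build_conditioning_mask_shape(stage2_latents)

print(shapes_broadcast(stale_mask, stage2_latents))
print(shapes_broadcast(fresh_mask, stage2_latents))
```

Deriving the mask from the tensor it will be combined with, rather than from the originally requested resolution, keeps the two in sync regardless of how many upsampling steps run in between.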
dg845
left a comment
Thanks for the PR! I agree with #13187 (review) that adding a test case for this would be useful.
@bot /style
Style bot fixed some files and pushed the changes.
@sayakpaul @dg845 Hi, I pushed the unit test. Please take another look. Thanks!
    upsampler = LTX2LatentUpsamplerModel(
        in_channels=in_channels,
    )

Suggested change:

    upsampler = LTX2LatentUpsamplerModel(
        in_channels=in_channels,
        mid_channels=32,
        num_blocks_per_stage=1,
    )

Would it be possible to use a smaller latent upsampler so that the test_two_stages_inference_with_upsampler test is less heavy? Maybe something like the suggestion above?
nice catch! updated.
@bot /style
Style bot fixed some files and pushed the changes.
What does this PR do?
Fix a failure in LTX-2 two-stage image-to-video generation.
Sampling code for LTX-2 two-stage image-to-video generation.
It fails with the following error:
In LTX-2's two-stage image-to-video generation task, specifically after the upsampling step, a shape mismatch occurs between the `latents` and the `conditioning_mask`, which causes an error in the function `_create_noised_state`.
After applying this patch, the previously mentioned error is fixed.
ltx2_video2.mp4
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul
@DN6