fix(musicgen_melody): use DynamicCache instead of EncoderDecoderCache#45738
fix(musicgen_melody): use DynamicCache instead of EncoderDecoderCache#45738adityachoksi2512 wants to merge 1 commit intohuggingface:mainfrom
Conversation
9b910bc to
cc5566d
Compare
cc5566d to
efa9db1
Compare
|
[For maintainers] Suggested jobs to run (before merge) run-slow: musicgen_melody |
|
I noticed @voodoovampire removed the regression test with a note about deeper investigation. Happy to revisit the fix if the root cause turns out to be more complex - let me know if additional changes are needed. cc @ebezzam @eustlb |
|
Hey @adityachoksi2512 - just to clarify the situation: I independently created PR #45737 which includes: 1.Your cache fix (cherry-picked with full co-author credit to you) 2.A regression test that proves the audio conditioning bug still exists even with the cache fix I temporarily removed the test but have now restored it as @pytest.mark.xfail to document the expected behavior for future work. Your cache fix prevents crashes, which is valuable. But the regression test shows audio conditioning is still broken - two different audio inputs produce identical outputs. The root cause needs deeper investigation beyond just the cache type change. Both PRs address the same issue - maintainers will decide which to merge. Just wanted to make sure the full context is clear. |
|
Thanks for the clarification @voodoovampire. Good point on the deeper investigation - I'll defer to the maintainers on next steps. Happy to contribute further in whatever direction they suggest. cc @ebezzam @eustlb |
What does this PR do?
Fixes #45647
MusicgenMelody fuses
encoder_hidden_statesdirectly intoinputs_embedsand uses pure self-attention — it does not use cross-attention. Using
EncoderDecoderCachecaused audio conditioning to be silently ignoredduring generation, producing byte-identical output regardless of the audio input.
Root cause
Identified via
git bisect— the regression was introduced in #38635, whichrefactored the cache system. The decoder was incorrectly initialized with
EncoderDecoderCachewhen it only needs a plainDynamicCache.Fix
One line change in
MusicgenMelodyDecoder.forward():Testing
Existing test suite passes (133 passed, 62 skipped).
Regression test for this specific bug is in #45737.