Fix rope_deltas corruption in Qwen2.5VL during CFG generation #39756
notkisk wants to merge 29 commits into huggingface:main
Conversation
# Restore rope_deltas from kwargs if available (needed for CFG generation)
if "rope_deltas" in kwargs and kwargs["rope_deltas"] is not None:
    self.model.rope_deltas = kwargs["rope_deltas"]
Looks quite hacky to me.
What if we just pass rope_deltas to the model's forward instead of re-assigning the self attribute? As far as I remember, something similar was done in the first version and it had one test failing. We can just overwrite the test if it still fails.
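The contrast the reviewer describes can be sketched with toy classes. Every name below is hypothetical and only illustrates the pattern, not the actual Qwen2.5-VL code: mutating a module attribute lets the second CFG pass clobber the first pass's value, while threading rope_deltas through forward() keeps each pass independent.

```python
class StatefulModel:
    """(a) Shared mutable state: a later pass overwrites an earlier pass's value."""

    def __init__(self):
        self.rope_deltas = None

    def forward(self, prompt_len):
        # Stand-in for the real computation; the point is the attribute write.
        self.rope_deltas = prompt_len - 1
        return self.rope_deltas


class ParameterModel:
    """(b) Parameter-based: callers own the value, so passes cannot corrupt each other."""

    def forward(self, prompt_len, rope_deltas=None):
        # Use the caller-provided value when present, otherwise recompute.
        if rope_deltas is None:
            rope_deltas = prompt_len - 1
        return rope_deltas


stateful = StatefulModel()
stateful.forward(prompt_len=10)          # conditional pass stores 9
stateful.forward(prompt_len=4)           # unconditional pass overwrites it with 3
assert stateful.rope_deltas == 3         # the conditional pass's value is gone

clean = ParameterModel()
cond_deltas = clean.forward(prompt_len=10)
uncond_deltas = clean.forward(prompt_len=4)
assert (cond_deltas, uncond_deltas) == (9, 3)  # each pass keeps its own value
```

This is the shape of the refactor the thread converges on: the value travels through arguments and outputs rather than living on the module.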
Thanks for the feedback, I revisited it and did a tiny refactor. @zucchini-nlp please take a look and let me know.
zucchini-nlp
left a comment
Thanks! Can you add the same changes in qwen2-vl and add a small test as well?
# Only update model state if rope_deltas parameter was not provided
if rope_deltas is None:
    self.rope_deltas = calculated_rope_deltas
current_rope_deltas = calculated_rope_deltas
I think we can assign self.rope_deltas whenever this code block runs, without checking whether the user provided rope_deltas. The only case where this is possible is prefill, and we need to re-compute the correct self.rope_deltas in every prefill stage.
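A minimal sketch of the reviewer's suggestion, with illustrative names only (not the actual Qwen2.5-VL code): if the prefill path that computes rope_deltas runs at all, store the result unconditionally instead of gating the assignment on whether the caller passed rope_deltas.

```python
class PrefillSketch:
    def __init__(self):
        self.rope_deltas = None

    def forward(self, is_prefill, seq_len):
        if is_prefill:
            calculated_rope_deltas = seq_len - 1  # stand-in for the real computation
            # Reviewer's point: no `if rope_deltas is None` guard is needed here;
            # every prefill must refresh the cached value.
            self.rope_deltas = calculated_rope_deltas
        return self.rope_deltas


m = PrefillSketch()
m.forward(is_prefill=True, seq_len=8)
assert m.rope_deltas == 7
m.forward(is_prefill=True, seq_len=5)   # a new prefill overwrites the cache
assert m.rope_deltas == 4
```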
# Add rope_deltas parameter for clean CFG generation
if rope_deltas_param is not None:
    model_inputs["rope_deltas"] = rope_deltas_param
This will be passed over in the super() call, I think, so we don't need to manually pop and add it again. Can you check?
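The reviewer's hypothesis can be sketched as follows. The class names are illustrative stand-ins, not the real transformers classes: if the base prepare_inputs_for_generation already forwards unknown kwargs into model_inputs, the subclass does not need to handle rope_deltas by hand.

```python
class BaseGeneration:
    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        # Stand-in for the base behavior being assumed here:
        # unknown kwargs are carried through into model_inputs.
        model_inputs = {"input_ids": input_ids}
        model_inputs.update(kwargs)
        return model_inputs


class VLModel(BaseGeneration):
    def prepare_inputs_for_generation(self, input_ids, **kwargs):
        # No manual pop/re-add of rope_deltas: super() carries it over.
        return super().prepare_inputs_for_generation(input_ids, **kwargs)


model_inputs = VLModel().prepare_inputs_for_generation([1, 2, 3], rope_deltas=5)
assert model_inputs["rope_deltas"] == 5
```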
# Preserve rope_deltas for CFG and multi-pass generation
if hasattr(outputs, "rope_deltas") and outputs.rope_deltas is not None:
    model_kwargs["rope_deltas"] = outputs.rope_deltas
elif hasattr(self.model, "rope_deltas") and self.model.rope_deltas is not None:
    model_kwargs["rope_deltas"] = self.model.rope_deltas
Checking only the output is more reliable; let's not pass model attributes here.
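A sketch of the reviewer's preference, with illustrative names: when updating model kwargs, read rope_deltas only from the forward outputs, never from module attributes, so the kwargs always reflect the pass that actually produced them.

```python
def update_model_kwargs(outputs, model_kwargs):
    # Only trust what this forward pass returned; module state may belong
    # to a different pass in multi-pass (e.g. CFG) generation.
    rope_deltas = getattr(outputs, "rope_deltas", None)
    if rope_deltas is not None:
        model_kwargs["rope_deltas"] = rope_deltas
    return model_kwargs


class Outputs:
    rope_deltas = 7


class NoDeltas:
    pass


kwargs = update_model_kwargs(Outputs(), {})
assert kwargs["rope_deltas"] == 7
assert "rope_deltas" not in update_model_kwargs(NoDeltas(), {})
```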
This commit addresses an issue where the rope_deltas attribute in Qwen2.5VL models gets corrupted during Classifier-Free Guidance (CFG) generation due to shared mutable state between forward passes.

The problem occurred because:
1. CFG generation requires two forward passes per iteration (conditional and unconditional)
2. self.rope_deltas was modified during forward passes (line 643: self.rope_deltas = rope_deltas)
3. The second forward pass used corrupted rope_deltas, leading to incorrect position calculations

Changes made:
- Added _update_model_kwargs_for_generation method to preserve rope_deltas state
- Modified prepare_inputs_for_generation to restore rope_deltas from model kwargs
- Both changes implemented in modular_qwen2_5_vl.py and regenerated to modeling_qwen2_5_vl.py

This ensures that each forward pass in CFG generation has access to the correct rope_deltas state, preventing position embedding corruption and maintaining generation quality.
GLM4V inherits from Qwen2.5VL but doesn't need the rope_deltas CFG fix. Added override to call grandparent method instead of parent to skip the rope_deltas logic.
…ritance
- Implement proper state management for rope_deltas in Qwen2.5VL models during CFG generation
- Add _update_model_kwargs_for_generation to preserve rope_deltas state between forward passes
- Restore rope_deltas in prepare_inputs_for_generation to ensure correct position calculations
- Override rope_deltas logic in GLM4V to prevent unwanted inheritance from Qwen2.5VL
- Regenerate modular-derived files to maintain consistency

Fixes huggingface#39749
- Add missing imports for Any and ModelOutput type hints
- Remove trailing whitespace from blank line
- Maintain GLM4V rope_deltas override functionality
- Run ruff format to ensure consistent code style
- Maintain rope_deltas override functionality
- No functional changes, formatting only
This commit addresses the maintainer feedback to replace the "hacky" state mutation approach with a clean parameter-based solution for rope_deltas during CFG generation.

Changes made:
- Modified forward() to accept rope_deltas as an optional parameter
- Use the parameter when provided, fall back to model state when not provided
- Only update model state when the rope_deltas parameter is not provided
- Updated _update_model_kwargs_for_generation to prioritize outputs.rope_deltas
- Updated prepare_inputs_for_generation to pass rope_deltas as a parameter
- Eliminates state corruption during multi-pass CFG generation
- Maintains full backward compatibility with existing code

The fix transforms the approach from mutating model state during forward passes to a functional, parameter-based approach that prevents shared mutable state issues while preserving all existing functionality. Fixes rope_deltas corruption in Qwen2.5VL during CFG generation by addressing the root cause with clean architecture instead of state mutation workarounds.
The previous implementation used locals() to check whether current_rope_deltas was defined, but torch.compile cannot trace builtin functions like locals().

Changes made:
- Initialize current_rope_deltas at function start so it is always defined
- Remove the locals() check from output construction
- Maintain the same logical behavior while being compilation-friendly

Fixes torch._dynamo.exc.Unsupported: Failed to trace builtin operator 'locals' in test_generate_compile_model_forward and test_generate_compilation_all_outputs
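The compile-friendly pattern described in this commit can be reduced to a small sketch (names illustrative): instead of probing locals() to see whether a variable was assigned on some branch, initialize it up front and branch on its value, which dynamo can trace.

```python
def forward_traceable(is_prefill, seq_len):
    # Always defined up front; no locals() probe needed, so torch.compile
    # (which cannot trace the locals() builtin) can trace this function.
    current_rope_deltas = None
    if is_prefill:
        current_rope_deltas = seq_len - 1  # stand-in for the real computation
    # Output construction may now reference the variable unconditionally.
    return {"rope_deltas": current_rope_deltas}


assert forward_traceable(True, 6)["rope_deltas"] == 5
assert forward_traceable(False, 6)["rope_deltas"] is None
```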
…2.5-VL
- Simplify rope_deltas assignment logic in prefill stage
- Remove unnecessary conditional check for rope_deltas
- Update _update_model_kwargs_for_generation method implementation
- Add CFG generation support to Qwen2-VL model
- Add test_rope_deltas_preserved_for_cfg_generation test for both models

Fixes huggingface#39749
- Apply ruff formatting to modular_qwen2_5_vl.py
- Regenerate auto-generated modeling files from modular source
* Quick responses fix
* [serve] Fix responses API and add tests
* Remove typo
* Remove typo
* Tests

…ingface#39739)
* add fast image processor Janus, deepseek_vl, deepseek_vl_hybrid
* fix after review

…e_keypoint_matching` as a standard (huggingface#39830)
* fix: deprecate plot_keypoint_matching and make visualize_keypoint_matching for all Keypoint Matching models
* refactor: added copied from
* fix: make style
* fix: repo consistency
* fix: make style
* docs: added missing method in SuperGlue docs

…face#39851)
Allow TrackioCallback to work when pynvml is not installed

* remove dtensors, not explicit
* style
* fix test
* update
* as we broke saving try to fix
* output layouts should exit
* nit
* devicemesh exists if it was distributed
* use _device_mesh of self
* update
* fix
* nit
* update
* fix!
Co-authored-by: 3outeille <3outeille@users.noreply.github.com>

…uggingface#39875)
Improve `is_wandb_available` function to verify WandB installation by checking for a key attribute

…gface#39265)
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* try
* try
* try
* try
* try
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
zucchini-nlp
left a comment
One tiny last bit: can we update the test to actually run a CFG generation? That way we'll be sure CFG works and doesn't generate gibberish.
You can add a tiny test under IntegrationTests with a 3B model.
def test_rope_deltas_preserved_for_cfg_generation(self):
    """Test that rope_deltas are preserved during CFG generation to prevent corruption."""
    config, _ = self.model_tester.prepare_config_and_inputs_for_common()
    model = Qwen2_5_VLForConditionalGeneration(config).eval()

    # Create mock outputs with rope_deltas
    class MockOutput:
        def __init__(self):
            self.rope_deltas = torch.tensor([[1.0, 2.0, 3.0]])
            self.past_key_values = None

        def __contains__(self, key):
            return hasattr(self, key)

    mock_outputs = MockOutput()
    initial_kwargs = {"cache_position": torch.tensor([0, 1, 2])}
I meant a test that actually performs CFG generation with Qwen-VL, to make sure it runs correctly.
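A hypothetical shape for the end-to-end check the reviewer asks for, reduced here to a runnable toy with no checkpoint download. The real integration test would instead call model.generate(...) with CFG enabled on a small Qwen2.5-VL checkpoint and check the output is not gibberish; every class and number below is a stand-in for illustration only.

```python
class ToyCfgModel:
    """Parameter-based model: rope_deltas flows through arguments, not module state."""

    def forward(self, prompt_len, rope_deltas=None):
        if rope_deltas is None:
            rope_deltas = prompt_len - 1  # stand-in for the prefill computation
        return rope_deltas


def cfg_step(model, cond_len, uncond_len, model_kwargs):
    # Two forward passes per CFG step: conditional, then unconditional.
    cond = model.forward(cond_len, rope_deltas=model_kwargs.get("cond_rope_deltas"))
    uncond = model.forward(uncond_len, rope_deltas=model_kwargs.get("uncond_rope_deltas"))
    # Each pass's deltas are kept separately in the kwargs, so neither
    # pass can corrupt the other's value between steps.
    model_kwargs["cond_rope_deltas"] = cond
    model_kwargs["uncond_rope_deltas"] = uncond
    return cond, uncond


kwargs = {}
cond, uncond = cfg_step(ToyCfgModel(), cond_len=12, uncond_len=4, model_kwargs=kwargs)
assert (cond, uncond) == (11, 3)
# A second step reuses the preserved per-pass values instead of recomputing them.
cond2, uncond2 = cfg_step(ToyCfgModel(), cond_len=12, uncond_len=4, model_kwargs=kwargs)
assert (cond2, uncond2) == (11, 3)
```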
class Glm4vForConditionalGeneration(Qwen2_5_VLForConditionalGeneration):
    _checkpoint_conversion_mapping = {}

    # Override to prevent rope_deltas inheritance from Qwen2.5VL - GLM4V doesn't need this
Why not? GLM4V uses the same rope as Qwen-VL.
…pply_chat_template (huggingface#39494)
* added code for handling video object, as dictionary of frames and metadata, in chat template
* added new test where videos are passed as objects (dict of frames, metadata) in the chat template
* modified hardcoded video_len check that does not match with increased number of test cases
* modify hardcoded video_len check that fails with increased number of tests
* update documentation of multi-modal chat templating with extra information about including video object in chat template
* add array handling in load_video()
* temporary test video included
* skip testing smolvlm with videos that are list of frames
* update documentation & make fixup
* Address review comments
…otkisk/transformers into fix-qwen2-5-vl-rope-deltas-cfg
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, deepseek_vl, deepseek_vl_hybrid, efficientloftr, glm4v, janus, lightglue
Summary
Fixes #39749
This PR addresses a critical issue where the rope_deltas attribute in Qwen2.5VL models gets corrupted during Classifier-Free Guidance (CFG) generation due to shared mutable state between forward passes.

Problem: During CFG generation, the model performs two forward passes (conditional and unconditional). The rope_deltas state was being modified during the first pass and incorrectly reused in the second pass, leading to position embedding corruption and incorrect generation results.

Solution: Implemented proper state management by:
- Adding a _update_model_kwargs_for_generation method to preserve rope_deltas in model kwargs
- Updating prepare_inputs_for_generation to restore rope_deltas from kwargs when available
- Ensuring each forward pass has access to the correct rope_deltas state

Impact: Fixes CFG generation for Qwen2.5VL models, ensuring correct position calculations and generation quality.
Changes Made
- Updated src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py with the fix
- Regenerated src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py from the modular source
- Preserved rope_deltas during generation

Test Plan
- Verified that the rope_deltas state is properly preserved and restored

The fix is ready and properly references issue #39749 as requested. The branch contains all the necessary changes to resolve the rope_deltas corruption issue during CFG generation.