Fix rope_deltas corruption in Qwen2.5VL during CFG generation #39756

Draft

notkisk wants to merge 29 commits into huggingface:main from notkisk:fix-qwen2-5-vl-rope-deltas-cfg

Conversation

notkisk (Contributor) commented on Jul 29, 2025

Summary

Fixes #39749

This PR addresses a critical issue where the rope_deltas attribute in Qwen2.5VL models is corrupted during Classifier-Free Guidance (CFG) generation because of shared mutable state between forward passes.

Problem: During CFG generation, the model performs two forward passes per step (conditional and unconditional). rope_deltas was modified during the first pass and incorrectly reused in the second, corrupting the position embeddings and producing incorrect generation results.

Solution: Implemented proper state management (sketched after this list) by:

  • Adding _update_model_kwargs_for_generation method to preserve rope_deltas in model kwargs
  • Modifying prepare_inputs_for_generation to restore rope_deltas from kwargs when available
  • Ensuring each forward pass gets the correct rope_deltas state
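Roughly, the approach looks like this (a minimal sketch with simplified signatures; the real implementation lives in modular_qwen2_5_vl.py, and both methods belong to Qwen2_5_VLForConditionalGeneration):

```python
def _update_model_kwargs_for_generation(self, outputs, model_kwargs, **kwargs):
    model_kwargs = super()._update_model_kwargs_for_generation(outputs, model_kwargs, **kwargs)
    if getattr(outputs, "rope_deltas", None) is not None:
        # Preserve the deltas computed during this forward pass so the next
        # pass (e.g. the unconditional CFG pass) can restore them.
        model_kwargs["rope_deltas"] = outputs.rope_deltas
    return model_kwargs


def prepare_inputs_for_generation(self, input_ids, rope_deltas=None, **kwargs):
    model_inputs = super().prepare_inputs_for_generation(input_ids, **kwargs)
    if rope_deltas is not None:
        # Restore the preserved value instead of trusting self.model.rope_deltas,
        # which the other CFG pass may have overwritten in the meantime.
        model_inputs["rope_deltas"] = rope_deltas
    return model_inputs
```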

Impact: Fixes CFG generation for Qwen2.5VL models, ensuring correct position calculations and generation quality.

Changes Made

  • Modified src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py with the fix
  • Regenerated src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py from the modular source
  • Added comprehensive state management for rope_deltas during generation

Test Plan

  • Verified the fix addresses the original issue reproduction case
  • Tested CFG generation with multiple forward passes
  • Confirmed rope_deltas state is properly preserved and restored
  • Ran quality checks and style formatting
  • Ensured backward compatibility with existing functionality

The fix references issue #39749 as requested, and the branch contains all the changes needed to resolve the rope_deltas corruption during CFG generation.
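For reference, a reproduction sketch of the CFG path (checkpoint id, image path, and prompt are illustrative; any Qwen2.5-VL checkpoint exercises the same code path):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-3B-Instruct", torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")

messages = [
    {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "Describe the image."}]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
image = Image.open("sample.jpg")  # any local test image (illustrative path)
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

# guidance_scale > 1 makes generate() run a second, unconditional forward
# pass per step; before this fix, that pass clobbered rope_deltas.
out = model.generate(**inputs, guidance_scale=1.5, max_new_tokens=32)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```

Setting guidance_scale above 1 is what makes generate() interleave conditional and unconditional forward passes, which is the scenario that corrupted rope_deltas before this fix.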

notkisk force-pushed the fix-qwen2-5-vl-rope-deltas-cfg branch 4 times, most recently from b9c0d6a to b742be0 on July 29, 2025 at 14:36
notkisk (Contributor, Author) commented on Jul 29, 2025

@Rocketknight1 @gante

notkisk (Contributor, Author) commented on Jul 29, 2025

@sgugger @thomwolf

notkisk force-pushed the fix-qwen2-5-vl-rope-deltas-cfg branch from b742be0 to b21c0cf on July 30, 2025 at 11:37
Comment on lines +1538 to +1547:

```python
# Restore rope_deltas from kwargs if available (needed for CFG generation)
if "rope_deltas" in kwargs and kwargs["rope_deltas"] is not None:
    self.model.rope_deltas = kwargs["rope_deltas"]
```

Member:

Looks quite hacky to me.

What if we just pass rope_deltas to the model's forward instead of re-assigning the self attribute? AFAIR something similar was done in the first version and it had one failing test. We can just overwrite the test if it still fails.
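In other words, something like this toy contrast (illustrative only, not the real Qwen2.5-VL code):

```python
from typing import Optional

import torch


class TinyModel(torch.nn.Module):
    """Toy illustration of the two designs; not the real Qwen2.5-VL code."""

    def __init__(self):
        super().__init__()
        self.rope_deltas = None  # shared mutable state (the old design)

    def forward_stateful(self, seq_len: int) -> torch.Tensor:
        # Old design: stash deltas on self; a second CFG pass re-enters
        # here and overwrites the value the first pass still relies on.
        self.rope_deltas = torch.tensor([seq_len])
        return torch.arange(seq_len) + self.rope_deltas

    def forward_functional(self, seq_len: int, rope_deltas: Optional[torch.Tensor] = None):
        # Suggested design: deltas travel with the call, so the conditional
        # and unconditional passes cannot clobber each other's state.
        if rope_deltas is None:
            rope_deltas = torch.tensor([seq_len])
        return torch.arange(seq_len) + rope_deltas, rope_deltas
```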

notkisk (Contributor, Author) replied on Jul 31, 2025:

Thanks for the feedback, I revisited it and did a tiny refactor. @zucchini-nlp please take a look and let me know.

notkisk force-pushed the fix-qwen2-5-vl-rope-deltas-cfg branch from 144b3c9 to 724d81f on July 31, 2025 at 11:55
notkisk requested a review from zucchini-nlp on August 1, 2025 at 00:00
zucchini-nlp (Member) left a comment:

Thanks! Can you add the same changes in qwen2-vl and add a small test as well?

Comment on lines +1304 to +1317:

```python
# Only update model state if rope_deltas parameter was not provided
if rope_deltas is None:
    self.rope_deltas = calculated_rope_deltas
    current_rope_deltas = calculated_rope_deltas
```
Member:

I think we can assign self.rope_deltas whenever we run this code block; we don't need to check whether the user provided rope_deltas. The only case where this is possible is prefill, and we need to recompute the correct self.rope_deltas at every prefill stage.
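That is, something like this (illustrative sketch; get_rope_index is the existing helper, the standalone wrapper is hypothetical):

```python
def prefill_rope_deltas(model, input_ids, attention_mask=None):
    # Every prefill recomputes the deltas, so the unconditional assignment
    # is safe and the `rope_deltas is None` guard from the snippet above
    # can be dropped.
    position_ids, rope_deltas = model.get_rope_index(input_ids, attention_mask=attention_mask)
    model.rope_deltas = rope_deltas
    return position_ids, rope_deltas
```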

Comment on lines +1598 to +1601:

```python
# Add rope_deltas parameter for clean CFG generation
if rope_deltas_param is not None:
    model_inputs["rope_deltas"] = rope_deltas_param
```

Member:

This will be passed along in the super() call, I think, so we don't need to manually pop and re-add it. Can you check?

Comment on lines +1758 to +1762:

```python
# Preserve rope_deltas for CFG and multi-pass generation
if hasattr(outputs, "rope_deltas") and outputs.rope_deltas is not None:
    model_kwargs["rope_deltas"] = outputs.rope_deltas
elif hasattr(self.model, "rope_deltas") and self.model.rope_deltas is not None:
    model_kwargs["rope_deltas"] = self.model.rope_deltas
```
Member:

Checking only the output is more reliable; let's not read model attributes here.
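That is, keep only the outputs-based branch from the snippet above:

```python
# Trust only the forward outputs; never fall back to self.model.rope_deltas.
if getattr(outputs, "rope_deltas", None) is not None:
    model_kwargs["rope_deltas"] = outputs.rope_deltas
```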

notkisk added 8 commits August 1, 2025 15:22
This commit addresses an issue where the rope_deltas attribute in Qwen2.5VL
models gets corrupted during Classifier-Free Guidance (CFG) generation due
to shared mutable state between forward passes.

The problem occurred because:
1. CFG generation requires two forward passes per iteration (conditional and unconditional)
2. self.rope_deltas was modified during forward passes (line 643: self.rope_deltas = rope_deltas)
3. The second forward pass used corrupted rope_deltas, leading to incorrect position calculations

Changes made:
- Added _update_model_kwargs_for_generation method to preserve rope_deltas state
- Modified prepare_inputs_for_generation to restore rope_deltas from model kwargs
- Both changes implemented in modular_qwen2_5_vl.py and regenerated to modeling_qwen2_5_vl.py

This ensures that each forward pass in CFG generation has access to the correct
rope_deltas state, preventing position embedding corruption and maintaining
generation quality.
GLM4V inherits from Qwen2.5VL but doesn't need the rope_deltas CFG fix.
Added override to call grandparent method instead of parent to skip
the rope_deltas logic.
…ritance

- Implement proper state management for rope_deltas in Qwen2.5VL models during CFG generation
- Add _update_model_kwargs_for_generation to preserve rope_deltas state between forward passes
- Restore rope_deltas in prepare_inputs_for_generation to ensure correct position calculations
- Override rope_deltas logic in GLM4V to prevent unwanted inheritance from Qwen2.5VL
- Regenerate modular-derived files to maintain consistency

Fixes huggingface#39749
- Add missing imports for Any and ModelOutput type hints
- Remove trailing whitespace from blank line
- Maintain GLM4V rope_deltas override functionality
- Run ruff format to ensure consistent code style
- Maintain rope_deltas override functionality
- No functional changes, formatting only
This commit addresses the maintainer feedback to replace the "hacky" state
mutation approach with a clean parameter-based solution for rope_deltas
during CFG generation.

Changes made:
- Modified forward() to accept rope_deltas as optional parameter
- Use parameter when provided, fall back to model state when not provided
- Only update model state when rope_deltas parameter is not provided
- Updated _update_model_kwargs_for_generation to prioritize outputs.rope_deltas
- Updated prepare_inputs_for_generation to pass rope_deltas as parameter
- Eliminates state corruption during multi-pass CFG generation
- Maintains full backward compatibility with existing code

The fix transforms the approach from mutating model state during forward passes
to a functional parameter-based approach that prevents shared mutable state
issues while preserving all existing functionality.

Fixes rope_deltas corruption in Qwen2.5VL during CFG generation by addressing
the root cause with clean architecture instead of state mutation workarounds.
The previous implementation used locals() to check if current_rope_deltas
was defined, but torch.compile cannot trace builtin functions like locals().

Changes made:
- Initialize current_rope_deltas at function start to ensure it's always defined
- Remove locals() check from output construction
- Maintain same logical behavior while being compilation-friendly

Fixes torch._dynamo.exc.Unsupported: Failed to trace builtin operator 'locals'
in test_generate_compile_model_forward and test_generate_compilation_all_outputs
…2.5-VL

- Simplify rope_deltas assignment logic in prefill stage
- Remove unnecessary conditional check for rope_deltas
- Update _update_model_kwargs_for_generation method implementation
- Add CFG generation support to Qwen2-VL model
- Add test_rope_deltas_preserved_for_cfg_generation test for both models

Fixes huggingface#39749
notkisk force-pushed the fix-qwen2-5-vl-rope-deltas-cfg branch from 0ff06dd to e141e77 on August 1, 2025 at 15:23
- Apply ruff formatting to modular_qwen2_5_vl.py
- Regenerate auto-generated modeling files from modular source
notkisk requested a review from zucchini-nlp on August 1, 2025 at 15:56
LysandreJik and others added 8 commits August 1, 2025 18:06
* Quick responses fix

* [serve] Fix responses API and add tests

* Remove typo

* Remove typo

* Tests
…ingface#39739)

* add fast image processor Janus, deepseek_vl, deepseek_vl_hybrid

* fix after review
…e_keypoint_matching` as a standard (huggingface#39830)

* fix: deprecate plot_keypoint_matching and make visualize_keypoint_matching for all Keypoint Matching models

* refactor: added copied from

* fix: make style

* fix: repo consistency

* fix: make style

* docs: added missing method in SuperGlue docs
…face#39851)

Allow TrackioCallback to work when pynvml is not installed
* remove dtensors, not explicit

Co-authored-by: 3outeille <3outeille@users.noreply.github.com>

* style

* fix test

* update

* as we broke saving try to fix

* output layouts should exit

* nit

* devicemesh exists if it was distributed

* use _device_mesh of self

* update

* lol

* fix

* nit

* update

* fix!

* this???

* grumble grumble

* ?

* fuck me

---------

Co-authored-by: 3outeille <3outeille@users.noreply.github.com>
…uggingface#39875)

Improve `is_wandb_available` function to verify WandB installation by checking for a key attribute
…gface#39265)

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* try

* try

* try

* try

* try

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
zucchini-nlp (Member) left a comment:

One tiny last bit: can we update the test to actually run a CFG generation? That way we'll be sure CFG works and doesn't generate gibberish.

You can add a tiny test under IntegrationTests with a 3B model.

Comment on lines +442 to +457:

```python
def test_rope_deltas_preserved_for_cfg_generation(self):
    """Test that rope_deltas are preserved during CFG generation to prevent corruption."""
    config, _ = self.model_tester.prepare_config_and_inputs_for_common()
    model = Qwen2_5_VLForConditionalGeneration(config).eval()

    # Create mock outputs with rope_deltas
    class MockOutput:
        def __init__(self):
            self.rope_deltas = torch.tensor([[1.0, 2.0, 3.0]])
            self.past_key_values = None

        def __contains__(self, key):
            return hasattr(self, key)

    mock_outputs = MockOutput()
    initial_kwargs = {"cache_position": torch.tensor([0, 1, 2])}
```
Member:

I meant a test that does CFG generation with Qwen-VL, to make sure it runs correctly.
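Something along these lines (illustrative sketch; the fixture path, prompt, and assertion are placeholders, and the decorators follow the repo's slow-test conventions):

```python
import unittest

import torch
from PIL import Image

from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from transformers.testing_utils import require_torch_gpu, slow


@slow
@require_torch_gpu
class Qwen2_5_VLCfgIntegrationTest(unittest.TestCase):
    def test_cfg_generation_runs(self):
        model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
            "Qwen/Qwen2.5-VL-3B-Instruct", torch_dtype=torch.float16, device_map="auto"
        )
        processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")

        messages = [
            {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "Describe this image."}]}
        ]
        prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
        image = Image.open("./tests/fixtures/tests_samples/COCO/000000039769.png")
        inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

        # guidance_scale > 1 exercises the second, unconditional forward pass.
        out = model.generate(**inputs, guidance_scale=1.5, max_new_tokens=20, do_sample=False)
        text = processor.batch_decode(out, skip_special_tokens=True)[0]
        self.assertGreater(len(text), 0)  # sanity check: CFG ran and decoded non-empty text
```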

```python
class Glm4vForConditionalGeneration(Qwen2_5_VLForConditionalGeneration):
    _checkpoint_conversion_mapping = {}

    # Override to prevent rope_deltas inheritance from Qwen2.5VL - GLM4V doesn't need this
```
Member:

Why not? GLM4V uses the same RoPE as Qwen-VL.

akibjawad and others added 12 commits August 4, 2025 11:49
…pply_chat_template (huggingface#39494)

* added code for handling video object ,as dictionary of frames and metadata, in chat template

* added new test where videos are passed as objects (dict of frames, metadata) in the chat template

* modified hardcoded video_len check that does not match with increased number of tests cases.

* Modify hardcoded video_len check that fails with increased number of tests

* update documentation of multi-modal chat templating with extra information about including video object in chat template.

* add array handling in load_video()

* temporary test video inlcuded

* skip testing smolvlm with videos that are list of frames

* update documentation & make fixup

* Address review comments
github-actions (bot) commented on Aug 4, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, deepseek_vl, deepseek_vl_hybrid, efficientloftr, glm4v, janus, lightglue


Development

Successfully merging this pull request may close this issue:
Qwen2_5_VLForConditionalGeneration cfg forward twice error (#39749)

9 participants