Optimize rope_deltas propagation logic in Qwen2.5-VL #41176
Xqle wants to merge 4 commits into huggingface:main
…econd_per_grid_ts
- Forward rope_deltas from Qwen2_5_VLForConditionalGeneration to Qwen2_5_VLModel
- Update Qwen2_5_VLModel to accept rope_deltas and store it internally
- Refactor prepare_inputs_for_generation to unify rope_deltas handling
- Ensure that passing rope_deltas in forward() now correctly affects the position_ids calculation

This fixes an issue where passing rope_deltas directly to the model's forward() had no effect, which could lead to inconsistencies between pre-fill generation and manual forward calls.
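The gap described in this commit message can be illustrated with a minimal sketch. The classes below (`ToyInnerModel`, `ToyWrapperBefore`, `ToyWrapperAfter`) are hypothetical stand-ins, not the actual Qwen2.5-VL classes, and position ids are reduced to a plain list of integers:

```python
# Hypothetical toy classes; not the real transformers implementation.

class ToyInnerModel:
    """Stands in for the inner model that computes position ids."""

    def __init__(self):
        self.rope_deltas = None  # cached offset, normally set during pre-fill

    def forward(self, seq_len, rope_deltas=None):
        # Store an explicitly passed value so it is the one actually used.
        if rope_deltas is not None:
            self.rope_deltas = rope_deltas
        delta = self.rope_deltas if self.rope_deltas is not None else 0
        return [i + delta for i in range(seq_len)]


class ToyWrapperBefore:
    """Accepts rope_deltas but silently drops it (the reported behaviour)."""

    def __init__(self):
        self.model = ToyInnerModel()

    def forward(self, seq_len, rope_deltas=None):
        return self.model.forward(seq_len)  # rope_deltas is never forwarded


class ToyWrapperAfter:
    """Forwards rope_deltas to the inner model (the intended behaviour)."""

    def __init__(self):
        self.model = ToyInnerModel()

    def forward(self, seq_len, rope_deltas=None):
        return self.model.forward(seq_len, rope_deltas=rope_deltas)


before = ToyWrapperBefore().forward(3, rope_deltas=5)
after = ToyWrapperAfter().forward(3, rope_deltas=5)
print(before)  # [0, 1, 2]
print(after)   # [5, 6, 7]
```

In the toy "before" wrapper the caller's value changes nothing, which is the inconsistency between pre-fill generation and manual forward calls that the commit message describes.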
cc @zucchini-nlp for VLMs, @ArthurZucker for rope
[For maintainers] Suggested jobs to run (before merge)
run-slow: qwen2_5_vl
      or (past_key_values is None or past_key_values.get_seq_length() == 0)
  )
- if (prefill_compiled_stage or prefill_noncompiled_stage) or self.rope_deltas is None:
+ if (prefill_compiled_stage or prefill_noncompiled_stage) or rope_deltas is None:
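A minimal sketch of what the changed condition decides, assuming the two prefill flags are plain booleans computed earlier. The helper name `needs_rope_index` is hypothetical; in the diff the check is written inline:

```python
def needs_rope_index(prefill_compiled_stage, prefill_noncompiled_stage, rope_deltas):
    """Return True when position ids must be recomputed from scratch.

    Mirrors the changed line: recompute during pre-fill, or when no
    rope_deltas value is available for this call.
    """
    return (prefill_compiled_stage or prefill_noncompiled_stage) or rope_deltas is None


print(needs_rope_index(True, False, rope_deltas=4))    # True: pre-fill always recomputes
print(needs_rope_index(False, False, rope_deltas=None))  # True: nothing to reuse
print(needs_rope_index(False, False, rope_deltas=4))   # False: reuse the given delta
```

Checking the local `rope_deltas` rather than `self.rope_deltas` means that a value supplied by the caller, not only the cached one, determines which branch is taken.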
Isn't it enough to move above the line self.rope_deltas = rope_deltas?
molbap left a comment
Indeed rope_deltas is not propagated from forward, but I think we can catch it earlier and not touch the rest
Btw, could this influence #41180? If the rope_deltas state is never modified in the forward
zucchini-nlp left a comment
Hey @Xqle, thanks for the PR! Indeed, they are not being propagated to forward. However, the forward also does not use rope_deltas, so the current changes will not be enough to make it work.
We had a PR in the past, #39756, which was very close to how this should look for the Qwen-VL series. It got stale, so we could not merge it. I'd suggest making changes accordingly so that the deltas are actually used by the forward (please take a look at the comments under #39756) and adding a small test.
LMK if you have bandwidth to apply the changes, otherwise I will add it to my TODO so we don't forget adding it in the future :)
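One way the forward pass could consume the deltas during decoding, together with the kind of small test requested above, is sketched here. This is a simplified assumption in plain Python, not the code from #39756 or from transformers: `decode_position_ids` is a hypothetical helper, `cache_position_start` stands for the number of tokens already in the cache, and the three rows stand for the temporal, height, and width axes of the multimodal rope.

```python
def decode_position_ids(seq_len, cache_position_start, rope_deltas):
    """Build (3, batch, seq_len) position ids for text-only decode steps.

    rope_deltas holds one integer offset per batch element (an assumption
    made for this sketch).
    """
    per_batch = [
        [cache_position_start + delta + i for i in range(seq_len)]
        for delta in rope_deltas
    ]
    # Text tokens share the same position on all three rope axes.
    return [[row[:] for row in per_batch] for _ in range(3)]


def test_rope_deltas_shift_positions():
    base = decode_position_ids(1, cache_position_start=10, rope_deltas=[0, 0])
    shifted = decode_position_ids(1, cache_position_start=10, rope_deltas=[-4, 2])
    assert base[0] == [[10], [10]]
    assert shifted[0] == [[6], [12]]
    # All three axes agree for text-only decoding.
    assert shifted[0] == shifted[1] == shifted[2]


test_rope_deltas_shift_positions()
```

A real test would presumably call the model's forward() twice with different rope_deltas values and compare the resulting position ids or logits; the toy version only shows the shape of such a check.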
What does this PR do?
This PR fixes the propagation of rope_deltas in the Qwen2.5-VL model during generation.

Currently, the forward() method of Qwen2_5_VLForConditionalGeneration accepts rope_deltas as an argument, but the value is never passed to the underlying Qwen2_5_VLModel. As a result, users providing rope_deltas directly to forward() would see no effect.

Modifications
- Qwen2_5_VLForConditionalGeneration.forward() now passes rope_deltas to Qwen2_5_VLModel.forward().
- Qwen2_5_VLModel.forward() now accepts rope_deltas and updates its internal state accordingly.
- Updated prepare_inputs_for_generation() to store the calculated rope_deltas in model_inputs, aligning its handling with position_ids.
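The change to the input-preparation step can be pictured with a toy version. Everything here is an assumption for illustration: `toy_prepare_inputs` and `toy_get_rope_index` are hypothetical, only the text-only case is modelled, and the real prepare_inputs_for_generation() handles many more arguments.

```python
def toy_get_rope_index(input_ids):
    """Hypothetical stand-in for the rope index computation (text-only case)."""
    position_ids = list(range(len(input_ids)))
    # Delta between the next position and the raw sequence length.
    rope_deltas = (max(position_ids) + 1) - len(input_ids)
    return position_ids, rope_deltas


def toy_prepare_inputs(input_ids, is_prefill, rope_deltas=None):
    model_inputs = {
        "input_ids": input_ids,
        "position_ids": None,
        "rope_deltas": rope_deltas,
    }
    if is_prefill or rope_deltas is None:
        position_ids, computed = toy_get_rope_index(input_ids)
        model_inputs["position_ids"] = position_ids
        # Store the computed delta next to position_ids so both travel
        # to forward() through the same dictionary.
        model_inputs["rope_deltas"] = computed
    return model_inputs


prefill = toy_prepare_inputs([11, 12, 13], is_prefill=True)
decode = toy_prepare_inputs([14], is_prefill=False, rope_deltas=prefill["rope_deltas"])
print(prefill["rope_deltas"], decode["position_ids"])
```

In the decode step of this sketch, position_ids stays unset and the carried rope_deltas is what a forward pass would use to derive positions.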
Impact
Users can now pass rope_deltas during generation and have them correctly applied.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.