[qwen-vl] fix position ids#40490

Merged
zucchini-nlp merged 6 commits into huggingface:main from zucchini-nlp:qwen-vl-positions
Sep 1, 2025

Conversation

@zucchini-nlp
Member

What does this PR do?

Fixes #40136 and fixes #40154

The accuracy on lm-eval is restored. The issue was in the position ids, which weren't prepared correctly when generating. Whether packed or unpacked positions are used doesn't affect anything, contrary to what I thought at first.

I guess the exact values in the position ids don't affect generation results much, given that the slow tests weren't failing. We could add a test with a specific image/prompt where the outputs are significantly different, but I don't think it is necessary.
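To illustrate the kind of bug being fixed, here is a minimal, hypothetical sketch (not the actual transformers code) of preparing Qwen2-VL-style 3D "mrope" position ids during decoding. The invariant is that each newly generated text token must continue from the maximum position seen during prefill, with the same value on all three axes (temporal, height, width); `next_decode_position` is an illustrative helper name, not a real API.

```python
import torch

def next_decode_position(position_ids: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: given prefill position ids of shape
    (3, batch, seq_len), return the position ids for the next decoded
    token, shape (3, batch, 1). The new position is the max over all
    three rope axes plus one, broadcast identically to each axis."""
    next_pos = position_ids.max(dim=-1).values.amax(dim=0) + 1  # (batch,)
    return next_pos.view(1, -1, 1).expand(3, -1, 1)

# Toy prefill positions: the height axis saturates early (as it would
# inside an image grid), but decoding must resume from the global max.
prefill = torch.tensor([[[0, 1, 2, 3]],   # temporal axis
                        [[0, 1, 1, 3]],   # height axis
                        [[0, 1, 2, 3]]])  # width axis
out = next_decode_position(prefill)
print(out)  # all three axes continue at position 4
```

A bug of the kind this PR describes would be, for example, recomputing decode positions from the cache length alone and drifting away from this per-sequence offset, which silently degrades benchmark accuracy without making slow tests fail outright.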

@zucchini-nlp
Member Author

run-slow: qwen2_vl, qwen2_5_vl

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/qwen2_5_vl', 'models/qwen2_vl']
quantizations: [] ...

@zucchini-nlp zucchini-nlp changed the title from "[1qwen-vl] fix position ids" to "[qwen-vl] fix position ids" on Aug 27, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp
Member Author

run-slow: qwen2_5_vl, qwen2_vl

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/qwen2_5_vl', 'models/qwen2_vl']
quantizations: [] ...

@zucchini-nlp
Member Author

Probably worth a cc to @vasqu for FA2 and packed attention

@zucchini-nlp
Member Author

run-slow: qwen2_5_omni, qwen2_5_vl, qwen2_vl

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/qwen2_5_omni', 'models/qwen2_5_vl', 'models/qwen2_vl']
quantizations: [] ...

Contributor

@vasqu vasqu left a comment


Just nits and a question for my understanding

We could add a test with the specific image/prompt where the outputs are significantly different, but I don't think it is necessary

If it's not too much of a hassle, I'd appreciate it but no worries if not

Comment thread src/transformers/models/qwen2_5_omni/modeling_qwen2_5_omni.py
Comment thread src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
Comment thread src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py
Comment thread src/transformers/models/qwen2_5_vl/modular_qwen2_5_vl.py
Comment thread tests/test_modeling_common.py
@zucchini-nlp zucchini-nlp enabled auto-merge (squash) September 1, 2025 09:01
@github-actions
Contributor

github-actions Bot commented Sep 1, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen2_5_omni, qwen2_5_vl, qwen2_vl



Development

Successfully merging this pull request may close these issues.

Qwen2.5VL is broken!
Qwen2.5-VL-7B-Instruct: Significant accuracy regression on MMMU benchmark with transformers >=4.54.0

4 participants