[qwen2-vl] fix vision attention scaling #39043

Merged
ArthurZucker merged 1 commit into huggingface:main from zucchini-nlp:qwen2-fix
Jun 26, 2025

Conversation

@zucchini-nlp
Member

What does this PR do?

As per the title: after the refactor, the vision attention scaling was accidentally changed from 1/math.sqrt(head_dim) to math.sqrt(head_dim). This PR restores 1/math.sqrt(head_dim).
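To illustrate why the inverted factor matters, here is a minimal pure-Python sketch of scaled dot-product attention weights for a single query. It is not the Qwen2-VL code: `attention_weights`, `softmax`, and the toy `query`/`keys` values are hypothetical, and `head_dim = 64` is an arbitrary choice for the example. The two factors differ by a factor of `head_dim`, so with the wrong one the pre-softmax scores are `head_dim` times larger and the softmax becomes much more peaked.

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys, scaling):
    # Scaled dot product of one query against every key, then softmax.
    scores = [scaling * sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)


head_dim = 64  # arbitrary value chosen for this example
query = [0.1] * head_dim
keys = [[0.1] * head_dim, [0.12] * head_dim, [0.08] * head_dim]

correct_scaling = 1 / math.sqrt(head_dim)  # intended factor
wrong_scaling = math.sqrt(head_dim)        # factor after the accidental change

correct = attention_weights(query, keys, correct_scaling)
wrong = attention_weights(query, keys, wrong_scaling)

print("ratio of scaling factors:", wrong_scaling / correct_scaling)
print("weights with 1/sqrt(head_dim):", correct)
print("weights with sqrt(head_dim):  ", wrong)
```

With the intended factor the three toy keys receive nearly equal weight, while the inverted factor concentrates most of the weight on one key.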

@zucchini-nlp
Member Author

run-slow: qwen2_vl,qwen2_5_vl,qwen2_5_omni

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/qwen2_5_omni', 'models/qwen2_5_vl', 'models/qwen2_vl']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ydshieh ydshieh left a comment

Thanks! Since the CI is green and the fix is so simple, looks great to me!

@ydshieh
Collaborator

ydshieh commented Jun 26, 2025

Ah, @zucchini-nlp, there is also

tests/models/colqwen2/test_modeling_colqwen2.py::ColQwen2ModelIntegrationTest::test_model_integration_test

which has been failing since #38930, and it is still failing with this PR.

@ydshieh
Collaborator

ydshieh commented Jun 26, 2025

Previously, at this block

        # Check if the maximum scores per row are in the diagonal of the matrix score
        self.assertTrue((scores.argmax(axis=1) == torch.arange(len(ds), device=scores.device)).all())

in tests/models/colqwen2/test_modeling_colqwen2.py

we have

tensor([[7.0820, 6.6836, 7.5547],
        [8.1797, 9.3516, 8.0312],
        [7.6641, 8.3359, 8.9922]], dtype=torch.float16)

but after #38930 we get

tensor([[15.0703,  8.7422, 15.0312],
        [ 9.5078, 16.8906, 10.6250],
        [15.6484, 12.3984, 20.4688]], dtype=torch.float16)
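For readers who want to try the quoted check without torch, here is a small pure-Python sketch that applies the same "row maximum on the diagonal" test to the two matrices quoted above. The helper `diagonal_is_argmax` is a hypothetical stand-in for the `scores.argmax(axis=1) == torch.arange(...)` line, not code from the test file, and it makes no claim about which assertion of the integration test failed in CI.

```python
def diagonal_is_argmax(scores):
    """Return True when every row's largest score sits on the diagonal."""
    return all(
        max(range(len(row)), key=row.__getitem__) == i
        for i, row in enumerate(scores)
    )


# Values copied from the two tensors quoted in this comment.
before = [
    [7.0820, 6.6836, 7.5547],
    [8.1797, 9.3516, 8.0312],
    [7.6641, 8.3359, 8.9922],
]
after = [
    [15.0703, 8.7422, 15.0312],
    [9.5078, 16.8906, 10.6250],
    [15.6484, 12.3984, 20.4688],
]

print("before:", diagonal_is_argmax(before))
print("after: ", diagonal_is_argmax(after))
```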

@ydshieh ydshieh self-requested a review June 26, 2025 10:26
Member

@Cyrilvallez Cyrilvallez left a comment

Indeed, thanks!

Collaborator

@ArthurZucker ArthurZucker left a comment

Much needed, thanks for catching this!

@ArthurZucker ArthurZucker merged commit 44b2316 into huggingface:main Jun 26, 2025
14 of 15 checks passed
@zucchini-nlp
Member Author

@ydshieh the ColQwen test was already failing for me before the Qwen2-VL vision refactor. Maybe it started failing because of another PR?

@ydshieh
Collaborator

ydshieh commented Jun 30, 2025

@ydshieh the ColQwen test is failing for me even before the Qwen2-VL vision refactor, maybe it started failing due to other PR?

I will check again (inside the CI runner, it is that PR causing the problem). Did you check locally or inside an SSH runner?

@zucchini-nlp
Member Author

Only locally for now

@ydshieh
Collaborator

ydshieh commented Jun 30, 2025

Hmm, it is passing now on main, and even on the commit of this merged PR.

Either I made a mistake when checking back then, or something strange happened there. Sorry to bother you.

We are ✅

@ydshieh
Collaborator

ydshieh commented Jun 30, 2025

When I say it passes, I mean on A10.

zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025