[vlm] fix loading of retrieval VLMs by zucchini-nlp · Pull Request #39242 · huggingface/transformers

zucchini-nlp · 2025-07-07T05:20:44Z

What does this PR do?

As per the title: slow tests were reported internally as failing. We need to apply the same changes that were made to VLMs to the models that use a VLM in their architecture.

zucchini-nlp · 2025-07-07T05:20:57Z

run-slow: colpali, colqwen2

github-actions · 2025-07-07T05:22:19Z

This comment contains run-slow, running the specified jobs:

models: ['models/colpali', 'models/colqwen2']
quantizations: [] ...

HuggingFaceDocBuilderDev · 2025-07-07T05:33:47Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp · 2025-07-07T08:41:20Z

I wanted to use AutoModel, since we shouldn't be loading the lm-head for these models. But the Qwen2-based model was released after the refactor and currently works without any conversion_key_mapping, and I don't want us to add another key mapping just to use AutoModel instead of AutoModelForImageTextToText.
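For readers unfamiliar with the key-mapping idea mentioned above, here is a minimal sketch of regex-based checkpoint key renaming. It only illustrates the general technique; `LEGACY_KEY_MAPPING`, `rename_keys`, and every key name below are hypothetical and are not taken from the transformers code base.

```python
import re

# Hypothetical mapping: regex on a legacy checkpoint key -> replacement prefix.
# The patterns are illustrative only, not copied from the library.
LEGACY_KEY_MAPPING = {
    r"^language_model\.model\.": "model.language_model.",
    r"^vision_tower\.": "model.vision_tower.",
    r"^language_model\.lm_head\.": "lm_head.",
}


def rename_keys(state_dict, mapping):
    """Return a new dict where the first matching pattern is applied to each key."""
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for pattern, replacement in mapping.items():
            new_key, count = re.subn(pattern, replacement, key)
            if count:
                break  # stop at the first pattern that matched
        renamed[new_key] = value
    return renamed


# Toy "checkpoint": integers stand in for tensors.
legacy = {
    "language_model.model.embed_tokens.weight": 0,
    "language_model.lm_head.weight": 1,
    "vision_tower.patch_embed.weight": 2,
    "custom_text_proj.weight": 3,
}
converted = rename_keys(legacy, LEGACY_KEY_MAPPING)
```

Keys that match no pattern (here `custom_text_proj.weight`) pass through unchanged, which is why a checkpoint saved after a refactor would not need such a mapping at all.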

ydshieh

Looks like a reasonable fix to me (it seems to apply the same changes that were made to the VLMs).

@zucchini-nlp I observed that the two model tests started failing after two different PRs, but maybe they share the same cause, so their fix is identical here?

For context:

colpali tests are failing after

[VLM] Add base model without head (#37033)

And for colqwen2, it fails after

[qwen] refactor attentions for vision/audio (#38930)

and there is a fix, [qwen2-vl] fix vision attention scaling (#39043), but that one doesn't fix ColQwen

zucchini-nlp · 2025-07-08T06:55:08Z

Hmm, for me ColQwen wasn't failing, in the sense that the weights matched when loading. But the tensors aren't close enough, even after the model was released. I can check on the runners and see what the issue is.

ColQwen shouldn't have the same issue; it was released after the major refactor.
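As a rough illustration of what "not close enough" means for such a check, here is a small pure-Python sketch of an element-wise tolerance comparison. The values and tolerances are made up, and `math.isclose` is used as a stand-in; its tolerance formula is not identical to the one used by tensor libraries.

```python
import math


def max_abs_diff(actual, expected):
    """Largest element-wise absolute difference between two equal-length sequences."""
    return max(abs(a - e) for a, e in zip(actual, expected))


def all_close(actual, expected, rtol=1e-3, atol=1e-3):
    """True if the lengths match and every pair of elements is within tolerance."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rtol, abs_tol=atol)
        for a, e in zip(actual, expected)
    )


# Made-up numbers: the last element drifts by roughly 0.03.
expected = [0.5, -1.25, 2.0]
actual = [0.5004, -1.2495, 2.03]

strict = all_close(actual, expected)            # fails at the default tolerance
loose = all_close(actual, expected, atol=5e-2)  # passes once atol is relaxed
```

Reporting `max_abs_diff` alongside the pass/fail result helps distinguish a small numerical drift from a genuinely wrong weight mapping.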

ydshieh · 2025-07-08T08:34:12Z

Hi, sorry, I think my memory got mixed up.

#39043 (comment)

So that issue was already fixed, but my brain wasn't yet.

Cyrilvallez

Instead of adding general logic only for colpali, we should be able to add the correct names to tied_weight_keys directly, no?
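A minimal sketch of the suggestion above, assuming a wrapper model that stores the inner VLM under an attribute and declares the tied key names itself, with the attribute prefix already applied. The classes, the `vlm` prefix, and the `lm_head.weight` key are hypothetical stand-ins, not the actual ColPali implementation.

```python
class InnerVLM:
    # Hypothetical: parameter names the inner model ties to its input embeddings.
    tied_weight_keys = ["lm_head.weight"]


class RetrievalWrapper:
    """Toy wrapper that re-declares the inner model's tied keys under its own prefix."""

    def __init__(self, vlm, prefix="vlm"):
        self.vlm = vlm
        # Tolerate inner models that declare no tied keys (missing attribute or None).
        inner_keys = getattr(vlm, "tied_weight_keys", None) or []
        # The inner module lives at `self.vlm`, so its parameter names gain that prefix.
        self.tied_weight_keys = [f"{prefix}.{key}" for key in inner_keys]


wrapper = RetrievalWrapper(InnerVLM())
```

Declaring the prefixed names on the wrapper keeps the special case local to one model instead of adding a generic renaming step to shared loading code.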

github-actions · 2025-07-15T07:54:16Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: colpali, colqwen2

* fix vlm with retrieval
* we can't use AutoModel because new ColQwen was released after refactor
* no need for colqwen
* tied weight keys are necessary, if using IMageTextToText
* need to apply renaming in tied weights, only for ColPali
* overwrite tied keys in ColPali
* fix copies, modular can't handle if-statements

fix vlm with retrieval

aae8841

we can't use AutoModel because new ColQwen was released after refactor

e61be9f

zucchini-nlp requested review from Cyrilvallez and ydshieh and removed request for ydshieh July 7, 2025 08:38

no need for colqwen

c7050a1

ydshieh approved these changes Jul 7, 2025

zucchini-nlp added 4 commits July 8, 2025 11:30

tied weight keys are necessary, if using IMageTextToText

4170e2f

Merge remote-tracking branch 'upstream/main' into fix-rerieval-vlm

fa66afa

need to apply renaming in tied weights, only for ColPali

6b76b16

Merge branch 'main' into fix-rerieval-vlm

f8d6dcf

zucchini-nlp commented Jul 9, 2025

Comment thread src/transformers/modeling_utils.py Outdated

zucchini-nlp mentioned this pull request Jul 14, 2025

fix colpali mapping #39353

Open

Merge branch 'main' into fix-rerieval-vlm

be9f781

Cyrilvallez reviewed Jul 14, 2025

Comment thread src/transformers/modeling_utils.py Outdated

zucchini-nlp added 2 commits July 15, 2025 09:53

overwrite tied keys in ColPali

26fc242

Merge branch 'main' into fix-rerieval-vlm

b7c499b

fix copies, modular can't handle if-statements

d2eeb29

zucchini-nlp merged commit 9f41f67 into huggingface:main Jul 15, 2025
25 checks passed

zucchini-nlp merged 11 commits into huggingface:main from zucchini-nlp:fix-rerieval-vlm

4 participants
