Fix vlm weight mappings #45358

Merged
Cyrilvallez merged 9 commits into main from fix-conversion
Apr 10, 2026

Conversation

@Cyrilvallez (Member) commented Apr 10, 2026

What does this PR do?

Fix #45357 finally. This was not caught by the previous fix, as the model can still be reloaded correctly by from_pretrained, but the keys are wrongly serialized!

After a deeper look, I noticed @zucchini-nlp did not correctly copy the mappings in #44627... These mappings are EXTREMELY IMPORTANT and can very easily silently break loading and/or saving @zucchini-nlp - we cannot touch them without being 100% sure of the change. A lot was missed before.
In this case, most weights would load correctly, but would not be re-saved in the same format.
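To make the failure mode concrete, here is a hedged, self-contained sketch (the regex patterns are taken from this PR's diff, but the two functions are illustrative, not the actual transformers internals) of why a load-time rename needs an exact reverse at save time:

```python
import re

# Hypothetical one-way rename table: maps serialized checkpoint keys to
# in-memory module names when loading.
LOAD_RENAMES = [
    (r"^vision_tower", "model.vision_tower"),
    (r"^multi_modal_projector", "model.multi_modal_projector"),
]

def rename_on_load(key: str) -> str:
    # Apply every rename pattern to a serialized key.
    for pattern, target in LOAD_RENAMES:
        key = re.sub(pattern, target, key)
    return key

def rename_on_save(key: str) -> str:
    # The reverse direction: strip the prefix that was added on load.
    # If this reverse mapping is missing or wrong, the model loads fine
    # but re-serializes under different key names - exactly the bug here.
    for pattern, target in LOAD_RENAMES:
        key = re.sub(rf"^{re.escape(target)}", pattern.lstrip("^"), key)
    return key

original = "vision_tower.encoder.layers.0.attn.q_proj.weight"
loaded = rename_on_load(original)      # "model.vision_tower.encoder...."
assert rename_on_save(loaded) == original  # round trip preserves the format
```

The point is that the two directions must be exact inverses: any mapping touched only on the load path silently changes the serialized format on the next save_pretrained.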

For ref, I'm using this small snippet to check formats:

import transformers
from transformers import LlavaForConditionalGeneration, LlavaNextForConditionalGeneration
from safetensors.torch import load_file
from transformers.utils.hub import cached_file, cached_files
import json

# model_id = "llava-hf/llava-1.5-7b-hf"
model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
# model_id = "adept/fuyu-8b"
model_class = transformers.LlavaNextForConditionalGeneration

target_folder = "/raid/cyril/test_model"

# Download the shard index and every shard file of the original checkpoint
with open(cached_file(model_id, "model.safetensors.index.json")) as f:
    index = json.load(f)
model_files = set(index["weight_map"].values())
model_files = cached_files(model_id, model_files)

# Rebuild the original serialized state dict from the shards
original_state_dict = {}
for file in model_files:
    original_state_dict.update(load_file(file))

# Round-trip: load the model, then re-save it
model = model_class.from_pretrained(model_id)
model.save_pretrained(target_folder)

saved_weights = load_file(f"{target_folder}/model.safetensors")

# Keys present in the original checkpoint but missing from the re-saved one
not_in_saved = []
for k, v in original_state_dict.items():
    if k not in saved_weights:
        not_in_saved.append(k)
    else:
        assert (v == saved_weights[k]).all()

# Keys present in the re-saved checkpoint but missing from the original
not_in_original = []
for k, v in saved_weights.items():
    if k not in original_state_dict:
        not_in_original.append(k)
    else:
        assert (v == original_state_dict[k]).all()

print(f"The following are in original but not in saved: {not_in_saved}")
print(f"The following are saved but not in original: {not_in_original}")

# Sanity check: the re-saved checkpoint still loads
model = model_class.from_pretrained(target_folder)

@Cyrilvallez Cyrilvallez added the for patch Tag issues / labels that should be included in the next patch label Apr 10, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez Cyrilvallez changed the title Fix qwen3_5 key renaming Fix vlm weight mappings Apr 10, 2026
Comment on lines +89 to +90
WeightRenaming(source_patterns=r"^vision_tower", target_patterns="model.vision_tower"),
WeightRenaming(source_patterns=r"^multi_modal_projector", target_patterns="model.multi_modal_projector"),
Member

oh shit, another base-model-prefix. We really need a proper way to add and delete the prefix when saving the model

These were supposed to be grabbed from each model's base-model-prefix, which apparently worked only when loading 😓
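A sketch of the idea raised above - deriving both directions from a single base-model prefix so that saving is guaranteed to be the inverse of loading. Everything here is hypothetical (the prefix, the submodule names, and the helper functions are illustrative, not transformers API):

```python
# Assumed base-model prefix and the submodules that live under it.
BASE_PREFIX = "model"
PREFIXED_SUBMODULES = ("vision_tower", "multi_modal_projector")

def add_prefix(key: str) -> str:
    # Load direction: prepend the base prefix to bare submodule keys.
    head = key.split(".", 1)[0]
    if head in PREFIXED_SUBMODULES:
        return f"{BASE_PREFIX}.{key}"
    return key

def strip_prefix(key: str) -> str:
    # Save direction: remove the prefix only where add_prefix put it,
    # so strip_prefix(add_prefix(k)) == k for every key.
    rest = key[len(BASE_PREFIX) + 1:] if key.startswith(BASE_PREFIX + ".") else key
    if rest.split(".", 1)[0] in PREFIXED_SUBMODULES:
        return rest
    return key
```

Because both helpers read the same prefix and submodule list, there is no second table to keep in sync, which is the class of bug this PR fixes.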

Member

and a proper test would be nice for saving, as we currently just assume that test_reverse_mapping checks the serialized keys
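Such a saving test could be reduced to a pure key-set comparison, independent of actual weights. A minimal sketch (the helper name is hypothetical, not an existing transformers test):

```python
def check_roundtrip_keys(original_keys, saved_keys):
    """Fail if a load/save round trip changed the serialized key set."""
    missing = set(original_keys) - set(saved_keys)
    extra = set(saved_keys) - set(original_keys)
    assert not missing, f"keys lost on save: {sorted(missing)}"
    assert not extra, f"unexpected keys after save: {sorted(extra)}"
```

Fed with the key sets from the snippet in the PR description, this would have caught the regression: the re-saved checkpoint contained prefixed keys the original never had.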

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: aya_vision, cohere_asr, colpali, colqwen2, emu3, fuyu, gemma3, got_ocr2, gpt_oss, internvl, llava, llava_next, llava_next_video, llava_onevision, mistral3, mllama

@zucchini-nlp
Member

Also, I want to flag that this tone isn't okay with me. Please keep feedback constructive and focused on the code, not the person.

@Cyrilvallez
Member Author

Oh, very sorry about the tone, I did not mean to be mean or anything. I just wanted to flag that those parts are absolutely critical and not really tested (a meaningful test would require knowledge of the serialized hub weights, which is not really feasible with the small "dummy" models we use for tests). So we need to be extremely careful about them.

@Cyrilvallez
Member Author

Tested as much as I can on real weight formats, and saving/loading gives the same serialization format! Merging!

@Cyrilvallez Cyrilvallez merged commit f3a68c4 into main Apr 10, 2026
29 checks passed
@Cyrilvallez Cyrilvallez deleted the fix-conversion branch April 10, 2026 15:41
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
* fix

* style

* comment

* comment

* remove gemma3n - should never have been there

* fix much more.......

* skip test for base models

* revert unwanted changes

* fix

Labels

for patch Tag issues / labels that should be included in the next patch


Development

Successfully merging this pull request may close these issues.

[Regression] Qwen3.5 save_pretrained still saves incorrect visual encoder keys in 5.5.3

3 participants