Move some conversion mappings to PrefixChange #45567

Merged
Cyrilvallez merged 1 commit into main from conversion-prefix
Apr 22, 2026

Conversation

@Cyrilvallez
Member

@Cyrilvallez Cyrilvallez commented Apr 22, 2026

What does this PR do?

As per the title: this moves some conversion mappings to PrefixChange.
cc @vasqu @zucchini-nlp

I confirmed with the following script that weights survive a load/save round trip unchanged:

import transformers
from safetensors.torch import load_file
from transformers.utils.hub import cached_file, cached_files
import json
import torch

# model_id = "Qwen/Qwen3.5-0.8B"
# model_id = "google/gemma-3n-E4B"
model_id = "vidore/colqwen2-v1.0-hf"
model_class = transformers.ColQwen2ForRetrieval

target_folder = "/raid/cyril/test_model"

# Sharded checkpoints ship an index file; fall back to a single file otherwise
try:
    with open(cached_file(model_id, "model.safetensors.index.json")) as f:
        index = json.load(f)
    model_files = set(index["weight_map"].values())
    model_files = cached_files(model_id, model_files)
except OSError:
    model_files = cached_files(model_id, ["model.safetensors"])

original_state_dict = {}
for file in model_files:
    original_state_dict.update(load_file(file))
original_state_dict = {k: v.to(torch.bfloat16) for k, v in original_state_dict.items()}

model = model_class.from_pretrained(model_id, dtype=torch.bfloat16)
model.save_pretrained(target_folder)

# Note: assumes the re-saved checkpoint fits in a single model.safetensors file
saved_weights = load_file(f"{target_folder}/model.safetensors")

not_in_saved = []
for k, v in original_state_dict.items():
    if k not in saved_weights:
        not_in_saved.append(k)
    else:
        assert (v == saved_weights[k]).all()

not_in_original = []
for k, v in saved_weights.items():
    if k not in original_state_dict:
        not_in_original.append(k)
    else:
        assert (v == original_state_dict[k]).all()

print(f"The following are in original but not in saved: {not_in_saved}")
print(f"The following are saved but not in original: {not_in_original}")

# Finally, check that the re-saved checkpoint loads back without errors
model = model_class.from_pretrained(target_folder)
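
For anyone reusing this check on other models: the script above stops at the first mismatching tensor because of the `assert`. Below is a minimal sketch of the same key-and-value comparison that collects every difference instead. The helper name `diff_state_dicts` and its `equal` parameter are hypothetical (they are not part of transformers or safetensors), and the helper is written against plain mappings so it can be tried without downloading a model.

```python
from __future__ import annotations

import operator
from typing import Any, Callable, Mapping


def diff_state_dicts(
    original: Mapping[str, Any],
    saved: Mapping[str, Any],
    equal: Callable[[Any, Any], bool] = operator.eq,
) -> dict[str, list[str]]:
    """Report keys missing on either side and keys whose values differ.

    Hypothetical helper: `equal` must return a plain bool for two values.
    """
    # Keys present in the original checkpoint but absent after saving
    only_in_original = sorted(k for k in original if k not in saved)
    # Keys that appear only in the saved checkpoint
    only_in_saved = sorted(k for k in saved if k not in original)
    # Shared keys whose values are not equal according to `equal`
    mismatched = sorted(
        k for k in original if k in saved and not equal(original[k], saved[k])
    )
    return {
        "only_in_original": only_in_original,
        "only_in_saved": only_in_saved,
        "mismatched": mismatched,
    }


if __name__ == "__main__":
    # Toy data standing in for real tensors
    report = diff_state_dicts(
        {"a": [1, 2], "b": [3], "c": [5]},
        {"a": [1, 2], "b": [4], "d": [6]},
    )
    for name, keys in report.items():
        print(f"{name}: {keys}")
```

With real tensors, `==` returns a tensor rather than a bool, so the comparison would need something like `diff_state_dicts(original_state_dict, saved_weights, equal=torch.equal)`.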

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez Cyrilvallez merged commit 71d1a7b into main Apr 22, 2026
29 of 30 checks passed
@Cyrilvallez Cyrilvallez deleted the conversion-prefix branch April 22, 2026 06:51