Conversion for LLM class loading with VLM ckpt by zucchini-nlp · Pull Request #45314 · huggingface/transformers

zucchini-nlp · 2026-04-08T11:54:53Z

What does this PR do?

To be honest, the load-save-load round trip works for the model on the main branch, which is why the tests are not failing. The only problem is that the saved state dict is completely weird and incorrect. Something also goes wrong when loading with DeepSpeed, but I didn't really check that.

This works in from_pretrained only because we replace all matches with original_key.replace, so the whole language_model.language_model.language_model part is replaced.
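A minimal sketch of the masking effect described above, assuming a plain Python `str.replace` on the checkpoint key. The key names and the helpers `strip_all` and `strip_once` are hypothetical and are not taken from the PR diff or from the transformers conversion code; they only illustrate why a repeated `language_model.` prefix in a saved state dict can go unnoticed at load time.

```python
# Hypothetical keys for illustration only; not copied from a real checkpoint.
saved_key = "model.language_model.language_model.language_model.layers.0.mlp.up_proj.weight"
expected_key = "model.layers.0.mlp.up_proj.weight"

PREFIX = "language_model."


def strip_all(key: str) -> str:
    # str.replace without a count removes every occurrence of the prefix,
    # so a key with the prefix repeated three times still maps correctly.
    return key.replace(PREFIX, "")


def strip_once(key: str) -> str:
    # Limiting the replacement to one occurrence exposes the malformed key.
    return key.replace(PREFIX, "", 1)


if __name__ == "__main__":
    print(strip_all(saved_key) == expected_key)
    print(strip_once(saved_key))
```

Under these assumptions, a replace-all rename hides the duplicated prefix when loading, while the state dict written to disk still carries the malformed keys.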

github-actions · 2026-04-08T11:56:00Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma3n, qwen3_5, qwen3_5_moe

HuggingFaceDocBuilderDev · 2026-04-08T12:05:56Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp added 2 commits April 8, 2026 13:50

fix

e7a4031

unskip tests

421982e

zucchini-nlp requested a review from Cyrilvallez April 8, 2026 12:16

zucchini-nlp mentioned this pull request Apr 8, 2026

Qwen3.5: DeepSpeed ZeRO-3 fails to load weights for language_model #45313

Open

Cyrilvallez mentioned this pull request Apr 9, 2026

Fix conversion mappings for vlms #45340

Merged

zucchini-nlp closed this Apr 10, 2026

evalstate mentioned this pull request Apr 28, 2026

Cumulative defect fixes from recent Transformers PRs evalstate/transformers#41

Open

zucchini-nlp wants to merge 2 commits into huggingface:main from zucchini-nlp:conversion-text-only-lm
