Conversion for LLM class loading with VLM ckpt #45314

Closed
zucchini-nlp wants to merge 2 commits into huggingface:main from zucchini-nlp:conversion-text-only-lm

Member

zucchini-nlp commented Apr 8, 2026

What does this PR do?

Fixes #45216, #45310, and #45313.

To be honest, load-save-load works for the model on the main branch, which is why the tests are not failing. The only problem is that the saved state dict is completely weird and incorrect. Something also goes wrong when loading with DeepSpeed, but I didn't really check.

Loading with `from_pretrained` works only because we replace all matches with `original_key.replace`, so the whole `language_model.language_model.language_model` part is replaced.
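
A minimal sketch of why replacing every match can hide the problem at load time. The key name, the `language_model.` pattern, and both helper functions are hypothetical and chosen only for illustration; this is not the actual conversion code in transformers. It relies only on Python's built-in `str.replace`, which replaces all occurrences unless a count is passed.

```python
# Hypothetical key as it might appear in an incorrectly saved state dict,
# with the prefix duplicated (illustrative only, not taken from a real checkpoint).
saved_key = "model.language_model.language_model.language_model.layers.0.mlp.up_proj.weight"


def rename_all(key: str, old: str = "language_model.", new: str = "") -> str:
    """Replace every occurrence of `old`, as a plain str.replace call does."""
    return key.replace(old, new)


def rename_first(key: str, old: str = "language_model.", new: str = "") -> str:
    """Replace only the first occurrence of `old`, for comparison."""
    return key.replace(old, new, 1)


loaded_all = rename_all(saved_key)
loaded_first = rename_first(saved_key)

# Replacing all matches collapses the duplicated prefix, so loading still succeeds.
print(loaded_all)
# Replacing a single match leaves the duplication visible.
print(loaded_first)
```

Under these assumptions, the replace-all rename maps the malformed key to the same name a correct key would get, so a load-save-load round trip can pass even though the saved keys are wrong.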

Contributor

github-actions Bot commented Apr 8, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma3n, qwen3_5, qwen3_5_moe

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Development

Successfully merging this pull request may close these issues.

[Regression] Qwen3.5 saved checkpoint is not correct with save_pretrained API since version 5.4.0