fix: skip qwen3_5_text checkpoint remap for nested VL language_model #45256

Closed
zozo123 wants to merge 1 commit into huggingface:main from zozo123:fix/qwen3-5-save-pretrained-prefix

Conversation

zozo123 commented Apr 5, 2026

Summary

When saving a Qwen3.5 VL model via save_pretrained, the revert_weight_conversion step for qwen3_5_text rewrites the leading model. segment of each key. On composite VL models this rule also matches keys that already start with model.language_model., so the language_model prefix is duplicated in the saved safetensors keys.

Fixes #45216

Changes

  • In get_model_conversion_mapping(), detect when a qwen3_5_text submodule is the nested model.language_model trunk in a VL model
  • Skip the text remap for that submodule to prevent prefix duplication during revert_weight_conversion
  • For qwen3_5_moe_text inside MoE VL models, apply only qwen2_moe conversions
  • Add a regression test verifying that save_pretrained does not produce keys with triple-nested language_model segments
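
The prefix duplication and the effect of skipping the remap can be illustrated with a minimal sketch. The rule below is a hypothetical stand-in for the qwen3_5_text revert rule (a regex that rewrites a leading `model.`); the actual mapping in transformers may be expressed differently, and `revert_key` and `skip_text_remap` are names invented for this illustration only.

```python
import re

# Hypothetical stand-in for the text-model revert rule. It rewrites a leading
# "model." segment, which is harmless for a standalone text model but also
# matches keys that already start with "model.language_model.".
TEXT_REVERT_RULE = (re.compile(r"^model\."), "model.language_model.")


def revert_key(key: str, skip_text_remap: bool) -> str:
    """Apply the (assumed) revert rule unless the caller opts out."""
    if skip_text_remap:
        # Nested VL trunk: the key already carries the right prefix.
        return key
    pattern, replacement = TEXT_REVERT_RULE
    return pattern.sub(replacement, key, count=1)


vl_key = "model.language_model.layers.0.self_attn.q_proj.weight"

# Without the guard, the prefix is duplicated.
print(revert_key(vl_key, skip_text_remap=False))
# With the guard, the key is left untouched.
print(revert_key(vl_key, skip_text_remap=True))
```

The sketch only models the key rewriting; how the PR detects the nested trunk inside get_model_conversion_mapping() is not reproduced here.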

Testing

  • New test: test_save_pretrained_no_triple_nested_language_model_prefix in test_modeling_qwen3_5.py
  • Saves a Qwen3_5ForConditionalGeneration model and asserts no key contains language_model.language_model.language_model
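
The assertion described above amounts to a substring check over the saved key names. A rough sketch follows, assuming the keys have already been read into a plain list; the helper name `find_triple_nested_keys` and the sample keys are hypothetical and are not taken from the actual test file.

```python
# Marker the regression test looks for, per the description above.
TRIPLE_NESTED = "language_model.language_model.language_model"


def find_triple_nested_keys(keys):
    """Return every key that contains the triple-nested prefix."""
    return [key for key in keys if TRIPLE_NESTED in key]


# Hypothetical key names, for illustration only.
good_keys = [
    "model.language_model.layers.0.mlp.up_proj.weight",
    "model.visual.blocks.0.attn.qkv.weight",
]
bad_keys = good_keys + [
    "model.language_model.language_model.language_model.norm.weight",
]

print(find_triple_nested_keys(good_keys))
print(find_triple_nested_keys(bad_keys))
```

In a real test the key list would come from the checkpoint written by save_pretrained rather than from a hard-coded list.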

Built autonomously by islo.dev

When saving a Qwen3.5-VL model with save_pretrained, the
conversion_mapping incorrectly applied the qwen3_5_text weight
remap to the nested language_model submodule. This caused
save_pretrained to produce corrupted checkpoints for VL models.

Skip the qwen3_5_text remap when the model class is not
Qwen3_5TextForCausalLM (i.e. when it's a VL wrapper).

zozo123 force-pushed the fix/qwen3-5-save-pretrained-prefix branch from ec19bec to 4d4173e on April 6, 2026 12:10

github-actions Bot commented Apr 6, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen3_5

github-actions Bot commented Apr 6, 2026

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45256&sha=4d4173

Development

Successfully merging this pull request may close these issues.

[Regression] Qwen3.5 saved checkpoint is not correct with save_pretrained API since version 5.4.0
