
Fix incompatible weight names #1759

Open
mengniwang95 wants to merge 15 commits into main from mengni/fix_vlm

Conversation

@mengniwang95 mengniwang95 commented Apr 29, 2026

Description

Transformers automatically converts checkpoint key names when it loads a model, which causes a mismatch between the quantized weight names and the original weight names.

A W4A16 Qwen2.5-VL-7B-Instruct checkpoint generated with this PR can be loaded by both sglang and transformers.

Example for this issue (qwen2_5_vl):

The original weight names are:
"model.layers.0.mlp.down_proj.weight"
"visual.blocks.0.attn.proj.weight"
...

After AutoRound (AR) quantization, the saved weight names are:
"model.language_model.layers.0.mlp.down_proj.weight" (incorrect weight name)
"model.visual.blocks.0.attn.proj.weight" (incorrect weight name)
"model.layers.0.mlp.down_proj.qweight"
...

The block_name_to_quantize is:
"block_name_to_quantize": [
"model.language_model.layers",
"model.layers"
],

Because the weight names have changed, sglang cannot load the quantized weights.

With this PR, the weight names generated by AR are:
"model.layers.0.mlp.down_proj.qweight"
"visual.blocks.0.attn.proj.weight"
...

The block_name_to_quantize is:
"block_name_to_quantize": "model.layers"
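
The renaming described above can be reproduced in a few lines of Python. This is a minimal sketch, not AutoRound or Transformers code: the `CONVERSION_MAPPING` table and the `apply_conversion` helper are assumptions, modelled on the prefix-rewriting regexes that Transformers keeps in `_checkpoint_conversion_mapping`. The real patterns for a given model may differ.

```python
import re

# Assumed mapping (on-disk prefix pattern -> in-memory prefix), for illustration only.
CONVERSION_MAPPING = {
    r"^visual": "model.visual",
    r"^model(?!\.(language_model|visual))": "model.language_model",
}


def apply_conversion(name, mapping=CONVERSION_MAPPING):
    """Return the in-memory name for an on-disk key; the first matching rule wins."""
    for pattern, replacement in mapping.items():
        new_name, count = re.subn(pattern, replacement, name)
        if count:
            return new_name
    return name


original = [
    "model.layers.0.mlp.down_proj.weight",
    "visual.blocks.0.attn.proj.weight",
]
converted = [apply_conversion(key) for key in original]
for before, after in zip(original, converted):
    print(f"{before} -> {after}")
```

If the converted names are written to disk unchanged, a runtime that expects the original layout will look for `model.layers...` and not find it, which matches the loading failure reported for sglang.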

Type of Change

Bug fix

Related Issues

#982

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.
  • The CUDA CI has passed. You can trigger it by commenting /azp run Unit-Test-CUDA-AutoRound.

Co-authored-by: Copilot <copilot@github.com>
Copilot AI review requested due to automatic review settings April 29, 2026 03:32
Copilot AI left a comment

Pull request overview

Fixes weight-name mismatches caused by Transformers checkpoint key conversion (via _checkpoint_conversion_mapping), so exported quantized checkpoints/configs retain the expected “original” key names and can be loaded by downstream runtimes (e.g., sglang).

Changes:

  • Added utility helpers to apply/revert checkpoint conversion mappings for parameter/block names.
  • Applied checkpoint conversion mapping when deriving quant_block_list during inference-time model conversion.
  • Reverted checkpoint conversion mapping when saving shard tensors and when serializing to_quant_block_names into exported configs.
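
The revert step listed above could look roughly like the following. It is a sketch under stated assumptions: `revert_conversion` and `revert_block_names` are hypothetical names (the PR's actual helpers in `auto_round/utils/common.py` are not shown here), the mapping is an assumed example, and the code only handles simple prefix rules whose pattern is an anchored literal with an optional lookahead.

```python
import re

# Assumed conversion table (on-disk prefix pattern -> in-memory prefix).
CONVERSION_MAPPING = {
    r"^visual": "model.visual",
    r"^model(?!\.(language_model|visual))": "model.language_model",
}


def revert_conversion(name, mapping=CONVERSION_MAPPING):
    """Map an in-memory name back to its on-disk form (hypothetical helper)."""
    for pattern, converted_prefix in mapping.items():
        # Recover the literal on-disk prefix: drop the "^" anchor and any
        # lookahead group, e.g. "^model(?!...)" becomes "model".
        original_prefix = re.sub(r"\(.*\)", "", pattern.lstrip("^"))
        if name == converted_prefix or name.startswith(converted_prefix + "."):
            return original_prefix + name[len(converted_prefix):]
    return name


def revert_block_names(block_names, mapping=CONVERSION_MAPPING):
    """Revert each block prefix and drop duplicates while keeping order."""
    reverted = (revert_conversion(block, mapping) for block in block_names)
    return list(dict.fromkeys(reverted))


saved_key = revert_conversion("model.language_model.layers.0.mlp.down_proj.qweight")
blocks = revert_block_names(["model.language_model.layers", "model.layers"])
print(saved_key)  # model.layers.0.mlp.down_proj.qweight
print(blocks)     # ['model.layers']
```

Deduplicating after the revert is what collapses the two-entry block list from the bug report into the single `model.layers` prefix.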

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| auto_round/utils/common.py | Adds helpers to apply/revert checkpoint conversion regex mappings. |
| auto_round/inference/convert_model.py | Applies checkpoint conversion mapping when building quantization block prefixes. |
| auto_round/compressors_new/shard_writer.py | Reverts mapped names when writing shard tensor keys. |
| auto_round/compressors_new/base.py | Reverts mapped to_quant_block_names in exported serialization metadata. |
| auto_round/compressors/shard_writer.py | Reverts mapped names when writing shard tensor keys (old arch). |
| auto_round/compressors/base.py | Reverts mapped to_quant_block_names in exported serialization metadata (old arch). |
| auto_round/autoround.py | Attempts to merge extra_config values into constructor args (currently introduces a crash when extra_config=None). |
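
One way to avoid the `extra_config=None` crash noted for `auto_round/autoround.py` is to guard the merge explicitly. The sketch below is illustrative only: `merge_extra_config` is a hypothetical helper, `extra_config` is treated as a plain dict (the real object may be a config class), and the precedence rule (explicit keyword arguments win) is an assumption.

```python
def merge_extra_config(kwargs, extra_config=None):
    """Return a copy of kwargs with unset keys filled in from extra_config."""
    merged = dict(kwargs)
    if extra_config is None:
        # Nothing to merge; returning early avoids calling .items() on None.
        return merged
    for key, value in extra_config.items():
        # Skip unset values and never override an explicit keyword argument.
        if value is not None:
            merged.setdefault(key, value)
    return merged


print(merge_extra_config({"bits": 4}))
print(merge_extra_config({"bits": 4}, {"bits": 8, "group_size": 128, "sym": None}))
```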

Comment thread auto_round/compressors_new/shard_writer.py
Comment thread auto_round/autoround.py Outdated
Comment thread auto_round/compressors/shard_writer.py
mengniwang95 and others added 2 commits April 29, 2026 13:54
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@mengniwang95 mengniwang95 requested a review from n1ck-guo April 29, 2026 06:32
@mengniwang95
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mengniwang95 mengniwang95 requested review from xin3he and yiliu30 April 29, 2026 07:50
@mengniwang95
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mengniwang95
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

@yiliu30 yiliu30 left a comment

LGTM. It’d be great to have a few UTs to validate the model config.
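
A unit test along these lines could check the exported tensor names and block prefixes without loading a model. This is only a sketch: `find_name_problems` is a hypothetical helper, the converted prefixes are assumed from the qwen2_5_vl example in the description, and a real test would read the keys and config from the exported checkpoint.

```python
def find_name_problems(
    tensor_keys,
    block_names,
    converted_prefixes=("model.language_model.", "model.visual."),
):
    """List naming problems in an exported checkpoint (hypothetical check)."""
    problems = []
    for key in tensor_keys:
        # Saved tensors should use the original on-disk names.
        if key.startswith(converted_prefixes):
            problems.append(f"converted tensor name: {key}")
    for block in block_names:
        # Every quantized block prefix should match at least one saved tensor.
        if not any(key.startswith(block + ".") for key in tensor_keys):
            problems.append(f"block prefix matches no tensor: {block}")
    return problems


good = find_name_problems(
    ["model.layers.0.mlp.down_proj.qweight", "visual.blocks.0.attn.proj.weight"],
    ["model.layers"],
)
bad = find_name_problems(
    ["model.language_model.layers.0.mlp.down_proj.weight"],
    ["model.layers"],
)
print(good)
print(bad)
```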

@mengniwang95
Contributor Author

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
