Fix convert to original state dict for VLMs #38385
zucchini-nlp merged 6 commits into huggingface:main
Conversation
* fix convert to original state dict
* fix
* lint
* Update modeling_utils.py
@zucchini-nlp Thanks for merging. Could we have a unit test for this function?

I don't think we need a test, since it's fixed already and we probably won't touch that part anymore. We have a test for loading the checkpoint, and one that creates a new dummy model and loads, saves, loads, and saves it back. That is enough imo.
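A minimal sketch of that round-trip check, assuming a hypothetical pytest fixture `tiny_vlm` that builds a small randomly initialized model; the actual tests in the transformers suite are organized differently:

```python
def test_save_load_round_trip(tiny_vlm, tmp_path):
    # Key set of the freshly created dummy model.
    original_keys = set(tiny_vlm.state_dict().keys())

    # Save, reload, save again, reload again.
    tiny_vlm.save_pretrained(tmp_path / "first")
    reloaded = type(tiny_vlm).from_pretrained(tmp_path / "first")
    reloaded.save_pretrained(tmp_path / "second")
    reloaded_twice = type(tiny_vlm).from_pretrained(tmp_path / "second")

    # If the to/from original-format key conversion is consistent, the in-memory
    # key set survives the full save/load/save/load cycle unchanged.
    assert set(reloaded_twice.state_dict().keys()) == original_keys
```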
@zucchini-nlp Okay. By the way, do you think we should override the
### What does this PR do?

Fixes verl-project#1710

1. vLLM 0.9.0 does not support `limit_mm_per_prompt=None`; this parameter must be a `dict`.
2. Transformers 4.52.* changes the weight keys in the model state dict, causing mismatches with vLLM's weight loader.

See also: huggingface/transformers#38385, vllm-project/vllm#19054, vllm-project/vllm#19151

### Test

Run `bash examples/grpo_trainer/run_qwen2_5_vl-7b.sh`

### Checklist Before Submitting

- [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [ ] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs).
- [ ] New CI unit test(s) are added to cover the code path.
- [ ] Rely on existing unit tests on CI that covers the code path.
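For the first point, a minimal sketch of the dict form that vLLM expects; the model name is just an example matching the GRPO test script above:

```python
from vllm import LLM

# vLLM 0.9.0 rejects `limit_mm_per_prompt=None`; pass an explicit per-modality dict instead.
llm = LLM(
    model="Qwen/Qwen2.5-VL-7B-Instruct",  # example model, matching run_qwen2_5_vl-7b.sh
    limit_mm_per_prompt={"image": 1, "video": 1},
)
```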
### What does this PR do?
#37033 introduces the base models for all VLMs. The model weights are converted by remapping the original checkpoint keys according to the mapping defined in `src/transformers/models/qwen2_vl/modeling_qwen2_vl.py`, lines 1735 to 1738 at 701caef.

However, the previous implementation of Transformers could not properly convert the weights back, due to an existing bug (`replacement = re.sub(r"\(.*?\)", "", pattern)` should be `replacement = re.sub(r"\(.*?\)", "", replacement)`) and the lack of support for nested parentheses; see `src/transformers/modeling_utils.py`, lines 3644 to 3657 at 701caef.
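A minimal sketch of the reverse key mapping this targets, assuming a hypothetical two-entry conversion mapping (the real mapping and the conversion helper live in the model file and `modeling_utils.py` referenced above):

```python
import re

# Hypothetical conversion mapping in the spirit of the per-model mapping referenced above:
# original checkpoint prefix (as a regex) -> prefix used by the refactored model.
conversion_mapping = {
    r"^visual": "model.visual",
    r"^model(?!\.visual)": "model.language_model",
}

def revert_key(key: str) -> str:
    """Map a refactored key back to its original checkpoint name (illustrative only)."""
    for pattern, replacement in conversion_mapping.items():
        # Invert the mapping: search for the replacement prefix and substitute the original
        # pattern, with regex-only constructs (groups, look-aheads) stripped out.
        # The fix in this PR corrects which variable that cleanup is applied to.
        stripped = re.sub(r"\(.*?\)", "", pattern).lstrip("^")
        key = re.sub(rf"^{re.escape(replacement)}", stripped, key)
    return key

print(revert_key("model.visual.blocks.0.attn.qkv.weight"))             # visual.blocks.0.attn.qkv.weight
print(revert_key("model.language_model.layers.0.mlp.up_proj.weight"))  # model.layers.0.mlp.up_proj.weight
```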
We want to provide a more accurate weight conversion implementation to prevent issues in third-party apps (see hiyouga/LlamaFactory#8147).
### Before submitting

- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
### Who can review?

@ArthurZucker @zucchini-nlp