
[bugfix] fix torch_dtype#57

Merged
Jintao-Huang merged 4 commits into modelscope:main from Jintao-Huang:fix_torch_dtype
Apr 30, 2026

Conversation

@Jintao-Huang
Collaborator

No description provided.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a recursive set_torch_dtype method to ensure that data type configurations are correctly applied throughout nested HuggingFace configuration objects. Feedback suggests improving the efficiency and safety of attribute iteration by iterating over __dict__ instead of dir(), and extending the logic to handle nested configurations held inside collections such as lists or tuples.
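The extension the review asks for can be sketched as follows. This is a hedged illustration, not the PR's actual code: StubConfig is a hypothetical stand-in for transformers' PretrainedConfig (which the real method checks with isinstance), and the list/tuple branch is the reviewer's proposed addition.

```python
# Sketch of the reviewer's suggestion: iterate over __dict__ rather than
# dir(), and also recurse into nested configs stored in lists or tuples.
# StubConfig is a stand-in for transformers' PretrainedConfig (assumption).
class StubConfig:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

def set_torch_dtype(hf_config, torch_dtype):
    for key, value in hf_config.__dict__.items():
        if isinstance(value, StubConfig):
            # Recurse into a nested config object.
            set_torch_dtype(value, torch_dtype)
        elif isinstance(value, (list, tuple)):
            # Reviewer's extension: configs nested inside collections.
            for item in value:
                if isinstance(item, StubConfig):
                    set_torch_dtype(item, torch_dtype)
        elif key in {'torch_dtype', 'params_dtype', 'dtype'}:
            setattr(hf_config, key, torch_dtype)

# Hypothetical example: per-expert sub-configs kept in a list.
experts = [StubConfig(torch_dtype='float32'), StubConfig(dtype='float32')]
cfg = StubConfig(torch_dtype='float32', expert_configs=experts)
set_torch_dtype(cfg, 'bfloat16')
print(cfg.expert_configs[0].torch_dtype)  # bfloat16
```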

Comment thread src/mcore_bridge/model/mm_gpts/utils.py
@Jintao-Huang
Collaborator Author

/gemini review


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a recursive set_torch_dtype static method in HuggingFaceVit to ensure all nested Hugging Face configurations are initialized with the correct data type. Feedback suggests making this method more robust by handling None values and dictionary-based configurations to prevent potential AttributeError or missed updates.

Comment on lines +41 to +46
    def set_torch_dtype(hf_config, torch_dtype):
        for key, value in hf_config.__dict__.items():
            if isinstance(value, PretrainedConfig):
                HuggingFaceVit.set_torch_dtype(value, torch_dtype)
            elif key in {'torch_dtype', 'params_dtype', 'dtype'}:
                setattr(hf_config, key, torch_dtype)


medium

The set_torch_dtype method should be more robust to handle cases where hf_config is None or where nested configurations are stored as dictionaries instead of PretrainedConfig objects. In many Hugging Face models, sub-configs like vision_config can be dictionaries, and failing to update them might lead to initialization with incorrect dtypes.

Additionally, adding a None check prevents potential AttributeError if a nested attribute is missing or if the initial call is made with a None config.

Suggested change

    # before
    def set_torch_dtype(hf_config, torch_dtype):
        for key, value in hf_config.__dict__.items():
            if isinstance(value, PretrainedConfig):
                HuggingFaceVit.set_torch_dtype(value, torch_dtype)
            elif key in {'torch_dtype', 'params_dtype', 'dtype'}:
                setattr(hf_config, key, torch_dtype)

    # after
    def set_torch_dtype(hf_config, torch_dtype):
        if hf_config is None:
            return
        items = hf_config.items() if isinstance(hf_config, dict) else hf_config.__dict__.items()
        for key, value in items:
            if isinstance(value, (PretrainedConfig, dict)):
                HuggingFaceVit.set_torch_dtype(value, torch_dtype)
            elif key in {'torch_dtype', 'params_dtype', 'dtype'}:
                if isinstance(hf_config, dict):
                    hf_config[key] = torch_dtype
                else:
                    setattr(hf_config, key, torch_dtype)
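The suggested version can be exercised in isolation. The sketch below is illustrative only: StubConfig is a hypothetical stand-in for transformers' PretrainedConfig, and the dict-based vision_config mirrors the reviewer's point that Hugging Face sub-configs are sometimes plain dictionaries.

```python
# Self-contained version of the suggested recursive dtype setter, with a
# stand-in config class in place of transformers' PretrainedConfig
# (assumption: the real method dispatches on PretrainedConfig instances).
class StubConfig:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

def set_torch_dtype(hf_config, torch_dtype):
    if hf_config is None:
        return  # tolerate missing sub-configs
    # Dict-based sub-configs are iterated the same way as config objects.
    items = hf_config.items() if isinstance(hf_config, dict) else hf_config.__dict__.items()
    for key, value in list(items):
        if isinstance(value, (StubConfig, dict)):
            set_torch_dtype(value, torch_dtype)
        elif key in {'torch_dtype', 'params_dtype', 'dtype'}:
            if isinstance(hf_config, dict):
                hf_config[key] = torch_dtype
            else:
                setattr(hf_config, key, torch_dtype)

# Hypothetical example: a dict-based vision_config nested in an object config.
cfg = StubConfig(torch_dtype='float32',
                 vision_config={'torch_dtype': 'float32', 'depth': 32})
set_torch_dtype(cfg, 'bfloat16')
print(cfg.torch_dtype)                   # bfloat16
print(cfg.vision_config['torch_dtype'])  # bfloat16
```

Note the None guard: without it, a missing nested attribute passed into the initial call would raise an AttributeError on __dict__ access.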

@Jintao-Huang
Collaborator Author

modelscope/ms-swift#9253

@Jintao-Huang Jintao-Huang merged commit 3591e43 into modelscope:main Apr 30, 2026
1 check passed