Fix: adding pad_token_id in Qwen3VLTextConfig#43398

Merged
zucchini-nlp merged 6 commits into huggingface:main from vaibhav-research:fix_qwen3_pad_token_id_config
Jan 22, 2026

Conversation

@vaibhav-research
Contributor

What does this PR do?

This PR fixes an initialization error in Qwen3-VL text model construction when loading checkpoints whose nested text_config does not explicitly define pad_token_id.

There are two main issues:
• Qwen3VLTextModel.__init__ unconditionally accesses config.pad_token_id, but Qwen3VLTextConfig did not define this attribute.
• When loading via from_pretrained, this resulted in an AttributeError during model initialization.
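
The failure mode described above can be illustrated with a minimal sketch. It does not use the transformers library: `ToyTextConfig`, `ToyTextModel`, and `try_build` are hypothetical stand-ins, chosen only to show how an unconditional attribute read fails when the config class never defines the attribute.

```python
class ToyTextConfig:
    """Hypothetical stand-in for a text config that omits pad_token_id."""

    def __init__(self, vocab_size=32, hidden_size=8):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size


class ToyTextModel:
    """Hypothetical stand-in for a model whose __init__ reads config.pad_token_id."""

    def __init__(self, config):
        # Unconditional attribute access: raises AttributeError
        # if the config never defined pad_token_id.
        self.padding_idx = config.pad_token_id
        self.vocab_size = config.vocab_size


def try_build(config):
    """Return the AttributeError message raised while building, or None on success."""
    try:
        ToyTextModel(config)
    except AttributeError as err:
        return str(err)
    return None


error_message = try_build(ToyTextConfig())
print(error_message)
```

In this sketch the error surfaces at construction time, which matches the description of the error appearing during model initialization rather than during a forward pass.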

The fix adds pad_token_id to Qwen3VLTextConfig (defaulting to None) and forwards it through PreTrainedConfig, aligning the config schema with what the model expects.
A regression test is added to ensure Qwen3VLTextModel can be instantiated when pad_token_id is not provided.
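
A minimal sketch of this approach, under stated assumptions: `ToyBaseConfig`, `ToyTextConfig`, and `ToyTextModel` are hypothetical stand-ins rather than the transformers classes, and the toy base class is assumed to store any `pad_token_id` it receives. The test function mirrors the idea of the regression test (build the model without providing `pad_token_id`), not its actual code.

```python
class ToyBaseConfig:
    """Hypothetical base config; assumed here to store pad_token_id itself."""

    def __init__(self, pad_token_id=None, **kwargs):
        self.pad_token_id = pad_token_id
        # Keep any remaining keyword arguments as plain attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)


class ToyTextConfig(ToyBaseConfig):
    """Text config that declares pad_token_id (default None) and forwards it."""

    def __init__(self, vocab_size=32, pad_token_id=None, **kwargs):
        self.vocab_size = vocab_size
        super().__init__(pad_token_id=pad_token_id, **kwargs)


class ToyTextModel:
    """Model stand-in that reads config.pad_token_id unconditionally."""

    def __init__(self, config):
        self.padding_idx = config.pad_token_id


def test_model_builds_without_pad_token_id():
    # Regression-style check: construction must not raise when
    # pad_token_id is omitted, and the value falls back to None.
    model = ToyTextModel(ToyTextConfig())
    assert model.padding_idx is None


test_model_builds_without_pad_token_id()
```

Defaulting to None keeps checkpoints that do define a padding token unaffected, since an explicit value still overrides the default.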

Fixes #43393 #43334

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@Rocketknight1 @zucchini-nlp

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Comment thread tests/models/qwen3_vl/test_modeling_qwen3_vl.py Outdated
Comment thread src/transformers/models/qwen3_vl/modular_qwen3_vl.py Outdated
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen3_vl

@vaibhav-research
Contributor Author

@zucchini-nlp thanks for the review. I updated the PR based on your feedback: I reverted all changes in tests/models/qwen3_vl/test_modeling_qwen3_vl.py per your suggestion, updated Qwen3VLTextConfig to explicitly set self.pad_token_id = pad_token_id, and removed passing it to super().__init__.
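
The revised shape of the config change might look roughly like the sketch below. It is not the merged diff: `ToyBaseConfig` and `ToyTextConfig` are hypothetical stand-ins, and the toy base class is assumed to know nothing about `pad_token_id`, so the subclass has to set the attribute itself.

```python
class ToyBaseConfig:
    """Hypothetical base config with no special handling of pad_token_id."""

    def __init__(self, **kwargs):
        # Store leftover keyword arguments as plain attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)


class ToyTextConfig(ToyBaseConfig):
    """Text config that owns pad_token_id instead of forwarding it to the base."""

    def __init__(self, vocab_size=32, pad_token_id=None, **kwargs):
        self.vocab_size = vocab_size
        # Set explicitly on the instance; it is not passed to super().__init__.
        self.pad_token_id = pad_token_id
        super().__init__(**kwargs)


default_config = ToyTextConfig()
explicit_config = ToyTextConfig(pad_token_id=0)
print(default_config.pad_token_id, explicit_config.pad_token_id)
```

Setting the attribute in the subclass makes the config independent of whatever the base class chooses to do with the argument, which is one plausible reason to prefer this form.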

Member

@zucchini-nlp left a comment


Great, thanks for iterating. Merging!

@zucchini-nlp enabled auto-merge (squash) January 22, 2026 15:17
@zucchini-nlp merged commit 62236a0 into huggingface:main Jan 22, 2026
19 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

vaibhav-research added a commit to vaibhav-research/transformers that referenced this pull request Jan 22, 2026
* Fix: adding pad_token_id in Qwen3VLTextConfig

* Fix: adding pad_token_id in Qwen3VLTextConfig

* updated the docstring with pad_token_id

* updated the docstring with pad_token_id

* added test nested in Qwen3VLModelTest for missing pad_token_id

* Updated pad_token_id config and removed the tests
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
@zucchini-nlp added the "for patch" label (Tag issues / labels that should be included in the next patch) Jan 29, 2026

Development

Successfully merging this pull request may close these issues.

Qwen3-VL checkpoints don't have pad_token