Look for the pad_token_id in the right place for Llama4 by Rocketknight1 · Pull Request #43539 · huggingface/transformers

Rocketknight1 · 2026-01-27T18:13:11Z

Llama4 looks for pad_token_id on self.config in some cases, but I think it actually lives on self.config.text_config. This PR should fix things! There was a similar issue with Qwen3, but thankfully I couldn't find any other affected models.

Fixes #43525

Rocketknight1 · 2026-01-27T18:15:44Z

cc @zucchini-nlp

HuggingFaceDocBuilderDev · 2026-01-27T18:21:38Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp

Thanks for a quick fix!

Can you also check whether the models reported in #43334 (comment) actually need a fix? Qwen3-VL-MoE is definitely faulty and has no PAD in its text config. I am not sure about the other models.

I want us to fix those pad issues all at once if possible

zucchini-nlp · 2026-01-28T08:41:33Z

-        self.pad_token_id = self.config.pad_token_id if self.config.pad_token_id is not None else -1
+        if hasattr(self.config, "pad_token_id"):
+            self.pad_token_id = self.config.pad_token_id
+        else:
+            self.pad_token_id = self.config.text_config.pad_token_id or -1


I think in the case of Llama4, we need to fix the modeling code so that it reads pad_token_id = self.config.text_config.pad_token_id. The special tokens usually live inside the text config.
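To make the lookup order in this comment concrete, here is a minimal sketch, not the code merged in this PR. The helper name `resolve_pad_token_id` and the `SimpleNamespace` stand-ins for config objects are hypothetical and exist only for illustration. It prefers the text config, falls back to the root config, and uses an explicit `is not None` check so that a legitimate pad token id of 0 is not replaced by the default (which `pad_token_id or -1` would do).

```python
from types import SimpleNamespace


def resolve_pad_token_id(config, default=-1):
    """Hypothetical helper: look on config.text_config first, then on config itself."""
    text_config = getattr(config, "text_config", None)
    for candidate in (text_config, config):
        if candidate is None:
            continue
        pad_token_id = getattr(candidate, "pad_token_id", None)
        # Compare against None explicitly: 0 is a valid token id but is falsy.
        if pad_token_id is not None:
            return pad_token_id
    return default


# Stand-in objects, not real transformers configs.
vlm_config = SimpleNamespace(text_config=SimpleNamespace(pad_token_id=0))
text_only_config = SimpleNamespace(pad_token_id=2)
no_pad_config = SimpleNamespace(text_config=SimpleNamespace())

print(resolve_pad_token_id(vlm_config))
print(resolve_pad_token_id(text_only_config))
print(resolve_pad_token_id(no_pad_config))
```

With these stand-ins, the composite config resolves through its text config, the text-only config resolves from the root, and the config with no pad token anywhere gets the default.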

zucchini-nlp · 2026-01-28T09:30:58Z

I ran a tiny test and got 16 models failing, might be worth checking these ones? 👀

FAILED tests/models/esm/test_modeling_esm.py::EsmModelTest::test_attention_outputs - TypeError: ne() received an invalid combination of arguments - got (NoneType), but expected one of:
FAILED tests/models/exaone4/test_modeling_exaone4.py::Exaone4ModelTest::test_attention_outputs - AttributeError: 'Exaone4Config' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/glm46v/test_modeling_glm46v.py::Glm46VModelTest::test_attention_outputs - AttributeError: 'Glm4vTextConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/glm4v/test_modeling_glm4v.py::Glm4vModelTest::test_attention_outputs - AttributeError: 'Glm4vTextConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/glm_image/test_modeling_glm_image.py::GlmImageModelTest::test_attention_outputs - AttributeError: 'GlmImageTextConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/glm_ocr/test_modeling_glm_ocr.py::GlmOcrModelTest::test_attention_outputs - AttributeError: 'GlmOcrTextConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/gpt_bigcode/test_modeling_gpt_bigcode.py::GPTBigCodeModelTest::test_attention_outputs - AttributeError: 'GPTBigCodeConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/gpt_bigcode/test_modeling_gpt_bigcode.py::GPTBigCodeMHAModelTest::test_attention_outputs - AttributeError: 'GPTBigCodeConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/gpt_neox/test_modeling_gpt_neox.py::GPTNeoXModelTest::test_attention_outputs - AttributeError: 'GPTNeoXConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/gptj/test_modeling_gptj.py::GPTJModelTest::test_attention_outputs - AttributeError: 'GPTJConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/jetmoe/test_modeling_jetmoe.py::JetMoeModelTest::test_attention_outputs - AttributeError: 'JetMoeConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/mpt/test_modeling_mpt.py::MptModelTest::test_attention_outputs - AttributeError: 'MptConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/phi/test_modeling_phi.py::PhiModelTest::test_attention_outputs - AttributeError: 'PhiConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/qwen3_vl_moe/test_modeling_qwen3_vl_moe.py::Qwen3VLMoeModelTest::test_attention_outputs - AttributeError: 'Qwen3VLMoeTextConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/stablelm/test_modeling_stablelm.py::StableLmModelTest::test_attention_outputs - AttributeError: 'StableLmConfig' object has no attribute 'pad_token_id'. Did you mean: 'bos_token_id'?
FAILED tests/models/tvp/test_modeling_tvp.py::TVPModelTest::test_attention_outputs - AttributeError: 'TvpConfig' object has no attribute 'pad_token_id'
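The list above shows two different symptoms: most entries raise AttributeError because the attribute is absent, while the ESM entry raises a TypeError, presumably because the attribute exists but is None. A small triage sketch for telling these apart follows. The function name `pad_token_status` and the stand-in objects are hypothetical and not part of transformers.

```python
from types import SimpleNamespace


def pad_token_status(config):
    """Hypothetical triage helper: classify how pad_token_id is (not) defined."""
    if not hasattr(config, "pad_token_id"):
        return "missing"  # attribute access would raise AttributeError
    if config.pad_token_id is None:
        return "none"  # attribute exists, but downstream comparisons may fail
    return "set"


# Stand-in objects, not real transformers configs.
absent = SimpleNamespace(bos_token_id=1)
unset = SimpleNamespace(pad_token_id=None)
defined = SimpleNamespace(pad_token_id=0)

for name, cfg in [("absent", absent), ("unset", unset), ("defined", defined)]:
    print(name, pad_token_status(cfg))
```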

Rocketknight1 · 2026-02-06T18:31:50Z

Hey @zucchini-nlp, sorry for the delay while I chased CI issues! I think this is actually okay, and we don't need to fix other models. This only applies to VLMs where pad_token_id may be on the root config or the text_config, but the other cases of that were fixed here and here. In the other cases in your list, I think those are raw text LMs which probably don't have this issue, since they don't have text_config, right?

zucchini-nlp

Oh yeah, we also have llama4! Sure, let's merge it. I wonder why it wasn't fixed before with the other batch of models haha

github-actions · 2026-02-09T12:53:55Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: llama4

Rocketknight1 marked this pull request as ready for review January 27, 2026 18:13

Rocketknight1 mentioned this pull request Jan 27, 2026

AttributeError: 'Llama4Config' object has no attribute 'pad_token_id' #43525

Closed

zucchini-nlp reviewed Jan 28, 2026

zucchini-nlp mentioned this pull request Jan 29, 2026

missing pad_token_idx in StableLmConfig after 5.0 update #43572

Closed

Rocketknight1 force-pushed the fix_llama4_pad_token_id branch from 2b7ad24 to bba5c45 Compare February 6, 2026 18:28

zucchini-nlp approved these changes Feb 9, 2026

Look for the pad_token_id in the right place for Llama4

7eb5dda

Rocketknight1 force-pushed the fix_llama4_pad_token_id branch from bba5c45 to 7eb5dda Compare February 9, 2026 12:52

Rocketknight1 merged commit 9e4a8c4 into main Feb 9, 2026
26 checks passed

Rocketknight1 deleted the fix_llama4_pad_token_id branch February 9, 2026 17:24

tomaarsen mentioned this pull request Feb 10, 2026

[Bugfix] Extract pad_token_id from text config for Llama-4 #43497

Closed

jiosephlee pushed a commit to jiosephlee/transformers_latest that referenced this pull request Feb 11, 2026

Look for the pad_token_id in the right place for Llama4 (huggingface#…

5542cf2

…43539)

Rocketknight1 merged 1 commit into main from fix_llama4_pad_token_id
