Look for the pad_token_id in the right place for Llama4 #43539
Rocketknight1 merged 1 commit into `main`
Conversation
zucchini-nlp
left a comment
Thanks for the quick fix!
Can you also check whether the models reported in #43334 (comment) actually need a fix or not? Qwen3-VL-MoE is definitely faulty and has no PAD token in its text config. I'm not sure about the other models.
I want us to fix all of these pad issues at once if possible.
```diff
- self.pad_token_id = self.config.pad_token_id if self.config.pad_token_id is not None else -1
+ if hasattr(self.config, "pad_token_id"):
+     self.pad_token_id = self.config.pad_token_id
+ else:
+     self.pad_token_id = self.config.text_config.pad_token_id or -1
```
I think in the case of Llama4, we need to fix the modeling code to obtain it from `pad_token_id = self.config.text_config.pad_token_id`. Usually the special tokens live inside a text config.
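The lookup pattern under discussion can be sketched as a small helper. This is a minimal illustration (not the actual transformers code; `resolve_pad_token_id` is a hypothetical name) of falling back from the top-level config to a nested text config:

```python
# Sketch of resolving pad_token_id with a fallback to a nested text_config,
# using SimpleNamespace objects to stand in for model config classes.
from types import SimpleNamespace

def resolve_pad_token_id(config, default=-1):
    """Return pad_token_id from the top-level config if present,
    otherwise from config.text_config, otherwise a default."""
    pad = getattr(config, "pad_token_id", None)
    if pad is None:
        text_config = getattr(config, "text_config", None)
        pad = getattr(text_config, "pad_token_id", None)
    return pad if pad is not None else default

# Composite VLM-style config: the pad token lives on the nested text config
vlm_config = SimpleNamespace(text_config=SimpleNamespace(pad_token_id=0))
print(resolve_pad_token_id(vlm_config))  # → 0

# Plain text config: the pad token is on the top level
text_only = SimpleNamespace(pad_token_id=2)
print(resolve_pad_token_id(text_only))  # → 2
```

One design note: checking `is not None` (rather than the `or -1` idiom in the diff) keeps a legitimate pad token id of `0` from being silently replaced by the default.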
I ran a tiny test and got 16 models failing; might be worth checking these ones? 👀
Force-pushed 2b7ad24 to bba5c45
Hey @zucchini-nlp, sorry for the delay while I chased CI issues! I think this is actually okay, and we don't need to fix other models. This only applies to VLMs where […]
zucchini-nlp
left a comment
Oh yeah, we also have Llama4! Sure, let's merge it. I wonder why it wasn't fixed before with the other batch of models, haha
Force-pushed bba5c45 to 7eb5dda
[For maintainers] Suggested jobs to run (before merge): run-slow: llama4
Llama4 looks for `pad_token_id` on `self.config` in some cases, but I think it actually lives on `self.config.text_config`. This PR should fix things! There was a similar issue with Qwen3, but thankfully I couldn't find any other affected models.

Fixes #43525