fix: raise clear error when tokenizer config uses v5 list format on older versions #45566

Closed
armorbreak001 wants to merge 1 commit into huggingface:main from armorbreak001:fix/gemma4-tokenizer-list-format-error

@armorbreak001

Summary

When loading a model whose tokenizer_config.json uses the v5-style list-based extra_special_tokens format on transformers < 5.0, _set_model_specific_special_tokens() crashes with a misleading:

AttributeError: 'list' object has no attribute 'keys'

This sends users down the wrong debugging path (patching config files, reinstalling packages) when the real issue is a version mismatch.
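The failure mode can be reproduced outside the library. The snippet below is a minimal sketch, not the transformers source: `collect_token_names` is a hypothetical stand-in for any code that assumes `extra_special_tokens` is a dict, and the token names and values are made-up placeholders.

```python
# Hypothetical stand-in for code that assumes a dict-shaped extra_special_tokens.
def collect_token_names(special_tokens):
    return list(special_tokens.keys())


# Placeholder data, not taken from any real tokenizer_config.json.
dict_style = {"image_token": "<image>"}   # mapping of attribute name to token
list_style = ["<image>", "<audio>"]       # list-based format

print(collect_token_names(dict_style))

try:
    collect_token_names(list_style)
except AttributeError as exc:
    # The message names the list type, not the version mismatch behind it.
    print(f"AttributeError: {exc}")
```

Nothing in the message points at the installed transformers version, which is why it is easy to misdiagnose.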

Fix

Add an isinstance(special_tokens, list) guard at the top of _set_model_specific_special_tokens() that raises a clear ValueError telling users to upgrade to transformers >= 5.0.0.
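A rough sketch of what such a guard could look like, written as a standalone function rather than the actual method. The function name, the error wording, and the dict return value are assumptions for illustration; the real change lives inside `_set_model_specific_special_tokens()` and may differ.

```python
# Hypothetical standalone version of the guard; not the library's method.
def set_model_specific_special_tokens(special_tokens):
    # Reject the list-based format early, before any dict-only call
    # such as .keys() can fail with an unrelated-looking AttributeError.
    if isinstance(special_tokens, list):
        raise ValueError(
            "extra_special_tokens is a list, which is the v5-style format. "
            "Upgrade to transformers >= 5.0.0 to load this tokenizer."
        )
    # Dict input continues down the existing path (simplified here to a copy).
    return dict(special_tokens)


# Placeholder values for illustration only.
print(set_model_specific_special_tokens({"image_token": "<image>"}))

try:
    set_model_specific_special_tokens(["<image>"])
except ValueError as exc:
    print(f"ValueError: {exc}")
```

Checking the type at the top of the function keeps the dict path unchanged and turns the version mismatch into an actionable message.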

Fixes #45376

…lder versions

When loading a model with a v5-style extra_special_tokens (list format)
on transformers < 5.0, _set_model_specific_special_tokens crashes with
a misleading AttributeError: 'list' object has no attribute 'keys'.
Add an early type check that raises a clear, actionable ValueError
telling users to upgrade.
@github-actions
Contributor

This PR was flagged by our automated quality checks. If you're a genuine
contributor, please reply here and a maintainer will review your PR.

Common reasons for flagging:

  • New GitHub account
  • Unusually high number of repository forks in a 24-hour window

We appreciate your contribution and apologize if this is a false positive!
