throw error when conversion required #45078

Merged
itazap merged 11 commits into main from tok_auto_fallback
Apr 20, 2026
Conversation

@itazap
Collaborator

@itazap itazap commented Mar 27, 2026

fixes fallback #44993

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@itazap itazap requested a review from ArthurZucker March 30, 2026 20:58
@itazap itazap force-pushed the tok_auto_fallback branch from a91275e to b65fe63 Compare March 30, 2026 21:01
@itazap itazap enabled auto-merge March 30, 2026 21:02
Collaborator

@ArthurZucker ArthurZucker left a comment

Nice to have, thanks for fixing.

):
# new model, but we ignore it unless the model type is the same
tokenizer_class = tokenizer_class_from_name(tokenizer_config_class)
if tokenizer_class is not None and tokenizer_class.__name__ not in ("TokenizersBackend", "PythonBackend"):
Collaborator

IDK if at this point PreTrainedTokenizerFast could still be here (remote code?), so maybe include it here?

Collaborator Author

yes added
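
For readers following the thread, here is a minimal sketch of what the extended check could look like once `PreTrainedTokenizerFast` is added to the list of names. It is an illustration only: the helper name `is_model_specific` and the constant `GENERIC_TOKENIZER_NAMES` are hypothetical, the stand-in classes are not the real transformers classes, and the merged code may be structured differently.

```python
# Names treated as generic backends rather than model-specific tokenizers.
# The first two appear in the quoted diff; the third follows the review suggestion.
GENERIC_TOKENIZER_NAMES = ("TokenizersBackend", "PythonBackend", "PreTrainedTokenizerFast")


def is_model_specific(tokenizer_class):
    """Return True only for a resolved class whose name is not a generic backend."""
    return tokenizer_class is not None and tokenizer_class.__name__ not in GENERIC_TOKENIZER_NAMES


# Stand-in classes for illustration only (hypothetical, not imported from transformers).
class PreTrainedTokenizerFast:
    pass


class LlamaTokenizer:
    pass
```

With this check, a class that resolves to `PreTrainedTokenizerFast` (for example through remote code) is treated the same way as the two generic backends, while a model-specific class such as the stand-in `LlamaTokenizer` is not.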

return TokenizersBackend.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
except Exception as e:
logger.debug(f"Failed to use TokenizersBackend: {e}")
return TokenizersBackend.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
Collaborator

I'm guessing here, we let it fail?
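
A small sketch of the difference the reviewer seems to be asking about: swallowing the exception with a debug log versus letting it propagate to the caller. Everything here is a stand-in. `ConversionRequiredError`, `load_backend`, `load_swallowing` and `load_propagating` are hypothetical names, and the extract does not show which of the quoted lines is the old or the new version.

```python
import logging

logger = logging.getLogger(__name__)


class ConversionRequiredError(RuntimeError):
    """Hypothetical error type, not a transformers class."""


def load_backend(path):
    # Stand-in for a loader that cannot build a tokenizer without a conversion step.
    raise ConversionRequiredError(f"{path} requires conversion")


def load_swallowing(path):
    # Catches everything, logs at debug level, and returns None:
    # the caller never learns why loading failed.
    try:
        return load_backend(path)
    except Exception as e:
        logger.debug(f"Failed to use backend: {e}")
        return None


def load_propagating(path):
    # No try/except: the original error reaches the caller ("we let it fail").
    return load_backend(path)
```

Under these assumptions, `load_swallowing("some/path")` returns `None` quietly, whereas `load_propagating("some/path")` raises `ConversionRequiredError`, which matches the intent in the PR title of throwing an error when conversion is required.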

finally:
os.chdir(prev_dir)

@require_sentencepiece
Collaborator

ty! Can we add one with sentencepiece mocked as not available?

Collaborator Author

done!
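
The test added in the PR is not shown in this extract. As a rough sketch of one way to simulate sentencepiece being unavailable, the standard library lets you map a module name to `None` in `sys.modules`, which makes any import of that name raise `ImportError`. The helper names below are hypothetical and the real test may instead patch an availability helper inside transformers.

```python
import importlib
import sys
from unittest import mock


def sentencepiece_importable():
    """Return True if `import sentencepiece` would succeed right now."""
    try:
        importlib.import_module("sentencepiece")
    except ImportError:
        return False
    return True


def check_with_sentencepiece_hidden():
    # Mapping the name to None in sys.modules forces the import to fail,
    # whether or not the package is actually installed.
    # patch.dict restores sys.modules when the block exits.
    with mock.patch.dict(sys.modules, {"sentencepiece": None}):
        return sentencepiece_importable()
```

Inside the patched block the import fails even on a machine where sentencepiece is installed, so a test can assert on the error path without uninstalling anything.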

@itazap itazap added this pull request to the merge queue Mar 31, 2026
@ArthurZucker ArthurZucker removed this pull request from the merge queue due to a manual request Mar 31, 2026
@itazap itazap requested a review from ArthurZucker April 8, 2026 13:21
@itazap itazap added this pull request to the merge queue Apr 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 16, 2026
@itazap itazap added this pull request to the merge queue Apr 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 16, 2026
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto

@itazap itazap added this pull request to the merge queue Apr 20, 2026
Merged via the queue into main with commit cd5bcad Apr 20, 2026
29 checks passed
@itazap itazap deleted the tok_auto_fallback branch April 20, 2026 11:05
lvliang-intel pushed a commit to lvliang-intel/transformers that referenced this pull request Apr 21, 2026
* throw error when conversion required

* fix

* typo

* comment

* add test without sentencepiece and add PreTrainedTokenizerFast to class list
artem-spector pushed a commit to artem-spector/transformers that referenced this pull request Apr 21, 2026
* throw error when conversion required

* fix

* typo

* comment

* add test without sentencepiece and add PreTrainedTokenizerFast to class list
3 participants