Fix AutoProcessor.from_pretrained silently dropping hub kwargs #44710
Merged
Cyrilvallez merged 2 commits into huggingface:main on Mar 25, 2026
The previous code used inspect.signature(cached_file).parameters to filter kwargs before passing them to cached_file(). However, since cached_file() is defined with **kwargs in its signature, only 'path_or_repo_id', 'filename', and 'kwargs' were visible as parameter names. This meant user-supplied hub kwargs like force_download, cache_dir, token, revision, etc. were silently dropped and never forwarded.
Replace the inspect.signature approach with an explicit tuple of known hub parameter names that cached_file actually accepts (via cached_files). This matches how other auto classes like AutoTokenizer handle the same situation.
Fixes huggingface#44704
Signed-off-by: Yufeng He <40085740+he-yufeng@users.noreply.github.com>
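The introspection behavior described in the commit message can be reproduced without transformers. The sketch below is illustrative only: cached_file_stub is a hypothetical stand-in that merely shares the signature shape the message describes (two named parameters plus **kwargs), and the user_kwargs values are made up.

```python
import inspect


def cached_file_stub(path_or_repo_id, filename, **kwargs):
    """Hypothetical stand-in with the signature shape described for cached_file."""
    return kwargs


# Introspection reports the two named parameters and the catch-all itself,
# not the keyword names that the catch-all would accept at call time.
visible = list(inspect.signature(cached_file_stub).parameters)

# Made-up user input, using hub kwarg names mentioned in the commit message.
user_kwargs = {"force_download": True, "cache_dir": "/tmp/hf", "revision": "main"}

# Filtering against the visible names discards every hub kwarg.
filtered = {k: v for k, v in user_kwargs.items() if k in visible}

print(visible)
print(filtered)
```

Because none of the hub kwarg names appear in the signature, the filtered dictionary ends up empty and nothing is forwarded.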
Contributor (Author): Bump: the hub kwargs are still being silently dropped. Happy to adjust if there's feedback.
Member: cc @Cyrilvallez, since I think this was last touched in #36033.
Cyrilvallez (Member) approved these changes on Mar 25, 2026 and left a comment:
Indeed! Nice find @he-yufeng, important to update!
Contributor: [For maintainers] Suggested jobs to run (before merge): run-slow: auto
Contributor: View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44710&sha=7d9e04
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request on Mar 27, 2026:
…ngface#44710) * Fix AutoProcessor.from_pretrained silently dropping hub kwargs The previous code used inspect.signature(cached_file).parameters to filter kwargs before passing them to cached_file(). However, since cached_file() is defined with **kwargs in its signature, only 'path_or_repo_id', 'filename', and 'kwargs' were visible as parameter names. This meant user-supplied hub kwargs like force_download, cache_dir, token, revision, etc. were silently dropped and never forwarded. Replace the inspect.signature approach with an explicit tuple of known hub parameter names that cached_file actually accepts (via cached_files). This matches how other auto classes like AutoTokenizer handle the same situation. Fixes huggingface#44704 Signed-off-by: Yufeng He <40085740+he-yufeng@users.noreply.github.com> * narrow it a bit --------- Signed-off-by: Yufeng He <40085740+he-yufeng@users.noreply.github.com> Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request on Mar 30, 2026, with the same commit message as the Mar 27 commit above.
What does this PR do?
Fixes AutoProcessor.from_pretrained silently dropping hub kwargs like force_download, cache_dir, token, revision, etc.
The bug
The existing code on line ~300 filters kwargs using inspect.signature(cached_file).parameters. But cached_file() is defined with **kwargs in its signature, so inspect.signature only sees three parameter names: path_or_repo_id, filename, and kwargs. Hub parameters like force_download, cache_dir, token, etc. are never matched, and get silently dropped before reaching the cached_file calls.
The fix
Replace the inspect.signature filtering with an explicit tuple of the hub parameter names that cached_file actually accepts (via cached_files). This is consistent with how other auto classes like AutoTokenizer handle the same situation: they pass hub kwargs explicitly by name rather than trying to introspect the signature.
Also removes the now-unused import inspect.
Fixes #44704
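The explicit-tuple approach described under "The fix" can be sketched as follows. This is a minimal illustration, not the merged code: HUB_KWARG_NAMES, select_hub_kwargs, and cached_file_stub are hypothetical names, the tuple lists only the four parameters named in this description (the PR defines its own list), and the repository and file names are placeholders.

```python
# Hypothetical tuple for illustration; it contains only the hub parameter
# names mentioned in the PR description, not the PR's actual list.
HUB_KWARG_NAMES = ("cache_dir", "force_download", "token", "revision")


def select_hub_kwargs(kwargs, names=HUB_KWARG_NAMES):
    """Return only the entries whose key is a known hub parameter name."""
    return {name: kwargs[name] for name in names if name in kwargs}


def cached_file_stub(path_or_repo_id, filename, **kwargs):
    """Hypothetical stand-in for cached_file that echoes the kwargs it received."""
    return kwargs


# Made-up user input: two hub kwargs plus one unrelated processor kwarg.
user_kwargs = {"force_download": True, "revision": "main", "padding": True}

# Placeholder repo and file names; only the hub kwargs are forwarded.
received = cached_file_stub("some/repo", "config.json", **select_hub_kwargs(user_kwargs))

print(received)
```

Selecting by an explicit list of names keeps working when the callee collects its options through **kwargs, at the cost of having to update the list whenever the accepted hub parameters change.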
Who can review?
@ArthurZucker @Rocketknight1