Fix maybe_autocast crashing on meta device tensors#44984
Merged
Cyrilvallez merged 1 commit into huggingface:main on Mar 25, 2026
Conversation
`torch.is_autocast_enabled("meta")` raises a RuntimeError because
torch does not support autocast for the meta device. This breaks any
code that runs a forward pass on meta tensors (e.g. nnsight's `.scan()`
for tracing without materializing weights).
Since autocast is meaningless on meta tensors, return `nullcontext()`
early when `device_type == "meta"`.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor
Author
@scienceetonnante would be nice if this could be merged and included in the next transformers version for nnterp! This came up in @gsarti's mech interp class with this nnterp snippet:

```python
model = StandardizedTransformer(
    "meta-llama/Llama-3.1-70B",
    device_map="auto",
    remote=True)
```

I'll also push a fix for nnterp to fall back to remote execution renaming check if scan fails, but that's a bit annoying.
Member
This LGTM, but I should check with the people who touched this code recently; cc @Cyrilvallez @hmellor for final review!
Cyrilvallez
approved these changes
Mar 25, 2026
Member
Cyrilvallez
left a comment
Weird usage pattern but alright!
Contributor
Author
Thanks for the quick review and merge!
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request on Mar 27, 2026
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request on Mar 30, 2026
What does this PR do?

`maybe_autocast` calls `torch.is_autocast_enabled(device_type)`, which raises a `RuntimeError` when `device_type` is `"meta"`. This breaks any code that runs a forward pass on meta tensors; for example, nnsight's `.scan()` context, which traces the computational graph using meta tensors without materializing weights.

The error path is:

`LlamaRotaryEmbedding.forward` → `maybe_autocast(device_type="meta", enabled=False)` → `torch.is_autocast_enabled("meta")` → 💥

Since autocast is meaningless on meta tensors (they don't compute anything), this PR returns `nullcontext()` early when `device_type == "meta"`.

This affects all 20+ model files that use `maybe_autocast` (via RoPE or directly), so fixing it at the source in `maybe_autocast` is preferable to patching each callsite.

Reproduction
Fix
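A minimal sketch of the guard, not the exact transformers implementation; the helper name and parameters follow the PR description, and the real autocast branch is elided:

```python
from contextlib import nullcontext

def maybe_autocast(device_type=None, enabled=True, **kwargs):
    """Sketch of the fixed helper (hypothetical signature)."""
    if device_type == "meta":
        # Autocast is meaningless on meta tensors (no computation happens),
        # and torch.is_autocast_enabled("meta") raises RuntimeError, so
        # return a no-op context before any torch autocast query runs.
        return nullcontext()
    # ... the real helper would consult torch.is_autocast_enabled(device_type)
    # and return a torch.autocast(...) context manager here ...
    return nullcontext()

# Forward passes on meta tensors now get a harmless no-op context:
with maybe_autocast(device_type="meta", enabled=False):
    pass
```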
Note on AI usage
🤖 Generated with Claude Code
👨 reviewed by the human:
This bug came up in a mech interp class; I maintain the nnterp library that was used and ran into this issue. Claude helped me trace the issue back to transformers, and the proposed fix solves our issue and is clean and minimal.