Fix maybe_autocast crashing on meta device tensors#44984

Merged
Cyrilvallez merged 1 commit into huggingface:main from Butanium:fix/maybe-autocast-meta-device
Mar 25, 2026

Conversation

Contributor

@Butanium Butanium commented Mar 25, 2026

What does this PR do?

`maybe_autocast` calls `torch.is_autocast_enabled(device_type)`, which raises a RuntimeError when `device_type` is `"meta"`:

```
RuntimeError: unknown device type for autocast in get_autocast_dispatch_key_from_device_type
```

This breaks any code that runs a forward pass on meta tensors — for example, nnsight's .scan() context, which traces the computational graph using meta tensors without materializing weights.

The error path is: `LlamaRotaryEmbedding.forward` → `maybe_autocast(device_type="meta", enabled=False)` → `torch.is_autocast_enabled("meta")` → 💥

Since autocast is meaningless on meta tensors (they don't compute anything), this PR returns nullcontext() early when device_type == "meta".

This affects all 20+ model files that use maybe_autocast (via RoPE or directly), so fixing it at the source in maybe_autocast is preferable to patching each callsite.

Reproduction

```python
import torch
torch.is_autocast_enabled("meta")  # RuntimeError

# With nnsight / any meta-tensor forward pass:
from nnsight import LanguageModel
model = LanguageModel("meta-llama/Llama-3.1-70B")
with model.scan("test"):  # crashes in RoPE forward
    pass
```

Fix

```python
def maybe_autocast(device_type, dtype=None, enabled=True, cache_enabled=None):
    if device_type == "meta":
        return nullcontext()
    ...
```

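For illustration, the guard can be sketched as a self-contained function. This is a torch-free sketch, not the actual `transformers` implementation (which wraps `torch.autocast` and additionally consults `torch.is_autocast_enabled`); the deferred `torch` import is an assumption made here so the meta path runs without torch installed:

```python
from contextlib import nullcontext

def maybe_autocast(device_type, dtype=None, enabled=True, cache_enabled=None):
    """Sketch of the patched helper: skip autocast entirely on the meta device."""
    # Meta tensors carry only shape/dtype metadata and never run real kernels,
    # so autocast has nothing to do; bail out before torch.is_autocast_enabled
    # can raise "unknown device type for autocast".
    if device_type == "meta":
        return nullcontext()
    import torch  # deferred so the meta path needs no torch (sketch only)
    return torch.autocast(device_type=device_type, dtype=dtype,
                          enabled=enabled, cache_enabled=cache_enabled)

# The meta path now yields a harmless no-op context manager:
with maybe_autocast("meta", enabled=False):
    pass  # previously: RuntimeError from torch.is_autocast_enabled("meta")
```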
Note on AI usage

🤖 Generated with Claude Code
👨 reviewed by the human:
This bug came up in a mech interp class; I maintain the nnterp library that was used there and encountered this issue. Claude helped me trace the issue back to transformers, and the proposed fix resolves our issue and is clean and minimal.

`torch.is_autocast_enabled("meta")` raises a RuntimeError because
torch does not support autocast for the meta device. This breaks any
code that runs a forward pass on meta tensors (e.g. nnsight's `.scan()`
for tracing without materializing weights).

Since autocast is meaningless on meta tensors, return `nullcontext()`
early when `device_type == "meta"`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Butanium
Contributor Author

@scienceetonnante it would be nice if this could be merged and included in the next transformers release for nnterp! This came up in @gsarti's mech interp class with this nnterp snippet:

```python
model = StandardizedTransformer(
    "meta-llama/Llama-3.1-70B",
    device_map="auto",
    remote=True)
```

I'll also push a fix for nnterp to fall back to remote execution for the renaming check if scan fails, but that's a bit annoying.

@Rocketknight1
Member

This LGTM but I should check with the people who touched this code recently - cc @Cyrilvallez @hmellor for final review!

Member

@hmellor hmellor left a comment

Seems reasonable

Member

@Cyrilvallez Cyrilvallez left a comment

Weird usage pattern but alright!

@Cyrilvallez Cyrilvallez merged commit c17877c into huggingface:main Mar 25, 2026
18 of 20 checks passed
@Butanium
Contributor Author

Thanks for the quick review and merge!

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Mar 27, 2026
NielsRogge pushed a commit to NielsRogge/transformers that referenced this pull request Mar 30, 2026
4 participants