[AMD CI] Fix torch.compile/export failures on AMD CI due to untraceable set.__contains__ #45282

Merged
Abdennacer-Badaoui merged 8 commits into huggingface:main from Abdennacer-Badaoui:fix-amd-ci
Apr 13, 2026

@Abdennacer-Badaoui
Member

_register_model_output_pytree_node was calling set.__contains__ during TorchDynamo tracing, which is unsupported in PyTorch 2.8.0 (ROCm). I added an early return when torch.compiler.is_compiling() is True, since the pytree nodes are already registered by the preceding eager run.
This should fix a bunch of new failures in AMD CI.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@tarekziade left a comment

Thanks a lot, do you think there's a way to isolate the behavior in a small test?

@Abdennacer-Badaoui
Member Author

I added a test for that. Let me know what you think :)
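
One way to isolate this behavior in a small test, without depending on a specific PyTorch build, is to substitute a set whose membership check raises. This is a hedged sketch and not the test added in the PR: `ExplodingSet`, `register_node`, and the explicit `compiling` flag are hypothetical stand-ins for the untraceable set lookup and the guarded function.

```python
import unittest


class ExplodingSet(set):
    """Stand-in for a set whose membership test cannot be traced."""

    def __contains__(self, item):
        raise RuntimeError("set.__contains__ reached during tracing")


def register_node(cls, registry, compiling):
    """Hypothetical guarded registration: return before touching the set."""
    if compiling:
        return
    if cls in registry:
        return
    registry.add(cls)


class EarlyReturnTest(unittest.TestCase):
    def test_skips_membership_check_when_compiling(self):
        registry = ExplodingSet()
        register_node(int, registry, compiling=True)  # must not raise
        self.assertEqual(len(registry), 0)

    def test_membership_check_runs_in_eager(self):
        # Without the guard the lookup is reached, which is the failure
        # mode the regression test is meant to catch.
        with self.assertRaises(RuntimeError):
            register_node(int, ExplodingSet(), compiling=False)
```

Saved to a file, the tests can be run with `python -m unittest`. The second test confirms that the stand-in set really does fail when the guard is bypassed, so the first test is not passing vacuously.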

Comment thread tests/utils/test_generic.py Outdated
# Regression test: on AMD CI (PyTorch 2.8.0+rocm), `set.__contains__` is not
# traceable by TorchDynamo. `_register_model_output_pytree_node` must return
# early when called inside a compiled context, before touching the set.
from dataclasses import dataclass
Collaborator

Any reason to use late imports for those?

Member Author

@Abdennacer-Badaoui Abdennacer-Badaoui Apr 7, 2026

No good reason actually. I'll move them to the top.

Member Author

I will keep _register_model_output_pytree_node as the only late import. It's a private internal symbol, so keeping it local is reasonable, I think.

Collaborator

@tarekziade left a comment

Thanks for the test. LGTM, just the nit.

@Abdennacer-Badaoui
Member Author

Who should we tag here for a force merge?

@Abdennacer-Badaoui Abdennacer-Badaoui added this pull request to the merge queue Apr 9, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 9, 2026
@Abdennacer-Badaoui Abdennacer-Badaoui added this pull request to the merge queue Apr 13, 2026
Merged via the queue into huggingface:main with commit 87a69a3 Apr 13, 2026
28 checks passed
@Abdennacer-Badaoui Abdennacer-Badaoui deleted the fix-amd-ci branch April 13, 2026 15:59
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
…le set.__contains__ (huggingface#45282)

* fix torch.compile/export failures on amd

* test

* move imports

3 participants