Avoid hard failure for gpt-oss GGUF architecture by falling back to g… #43757

Open
TheSanjBot wants to merge 4 commits into huggingface:main from TheSanjBot:gguf-gpt-oss-support
@TheSanjBot

What does this PR do?

This PR avoids a hard failure when loading GGUF models that declare the
gpt-oss architecture.

Currently, such models raise a ValueError during GGUF config loading.
This change maps gpt-oss to the closest supported architecture
(gpt-neox) and emits a clear warning to communicate current limitations.

The goal is to allow GGUF checkpoints using gpt-oss to be loaded without
crashing, enabling downstream tools (e.g. vLLM) to proceed.

Notes / Limitations

  • This does not implement full GPT-OSS support.
  • MoE layers are not supported and inference correctness is not guaranteed.
  • This is a best-effort fallback to avoid hard failure, not a claim of correctness.
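
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SUPPORTED_GGUF_ARCHITECTURES`, `ARCHITECTURE_FALLBACKS`, and `resolve_gguf_architecture` are hypothetical names, and the set of supported architectures is only an assumed example.

```python
import warnings

# Hypothetical names for illustration only; these are not the actual
# transformers internals, and the supported set is an assumed example.
SUPPORTED_GGUF_ARCHITECTURES = {"llama", "gpt-neox"}
ARCHITECTURE_FALLBACKS = {"gpt-oss": "gpt-neox"}


def resolve_gguf_architecture(declared: str) -> str:
    """Return a supported architecture name for a GGUF checkpoint.

    Supported architectures are returned unchanged. Architectures with a
    registered fallback are mapped to it and a warning is emitted. Anything
    else still raises ValueError, as before.
    """
    if declared in SUPPORTED_GGUF_ARCHITECTURES:
        return declared

    fallback = ARCHITECTURE_FALLBACKS.get(declared)
    if fallback is None:
        raise ValueError(f"GGUF architecture '{declared}' is not supported.")

    # Best-effort fallback: loading proceeds, but correctness is not guaranteed.
    warnings.warn(
        f"GGUF architecture '{declared}' is not fully supported; "
        f"falling back to '{fallback}'. MoE layers are not handled and "
        "inference results may be incorrect.",
        UserWarning,
        stacklevel=2,
    )
    return fallback
```

With this shape, `resolve_gguf_architecture("gpt-oss")` returns `"gpt-neox"` and emits a `UserWarning`, while an architecture with no fallback still raises `ValueError`.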

Fixes #43366

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

No tests were added as GGUF integration tests require large binary artifacts
and this change only affects architecture handling and error prevention.

Who can review?

cc @SunMarc @Rocketknight1

@TheSanjBot
Author

CI failure seems unrelated to this PR.

The failing test (test_sample_generate_dict_output in GLM Image) is marked as FLAKY
and fails inside modeling_glm_image.py, which is not touched here.

Could you please rerun CI?

@github-actions
Contributor

github-actions Bot commented Feb 5, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: grounding_dino

pull Bot pushed a commit to itsbrex/transformers that referenced this pull request Apr 22, 2026
…upersedes huggingface#43757) latest (huggingface#45506)

* Add GPT-OSS GGUF support with YaRN rope scaling reconstruction

* Add GGUF loading test suite for GPT‑OSS

* docs: add GGUF loading section to gpt_oss.md

* fix: correct import of GptOssTensorProcessor in test; remove from model __all__

* Finalize GPT‑OSS GGUF support: move test, adjust config reconstruction

* fixed docs not closing example bracket

* Fix lint: remove trailing whitespace

* Fix tensor construction consistency

* reverting to original docs

Development

Successfully merging this pull request may close these issues.

GGUF model with architecture gpt-oss support
