Add full GGUF loading support for GPT‑OSS (fixes #43366, supersedes #43757) latest by sirzechs66 · Pull Request #45500 · huggingface/transformers

sirzechs66 · 2026-04-18T02:16:41Z

What does this PR do?

This PR adds full GGUF loading support for GPT‑OSS models (20B/120B). It lets Transformers (and, through it, vLLM) load GPT‑OSS GGUF files directly instead of falling back to an incorrect architecture. The changes include:

Architecture registration in GGUF mappings.
A custom GptOssTensorProcessor to handle MoE expert splitting and gate/up interleaving.
Reconstruction of nested rope_scaling (YaRN) from flat GGUF metadata.
Tests: a fast registration test and a slow integration test that uses a real 20B GGUF file.
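
The gate/up interleaving step in the list above can be pictured with a small NumPy sketch. This is not the PR's GptOssTensorProcessor. It assumes the GGUF file stores separate gate and up expert tensors stacked along a leading expert axis, and that the Transformers side expects one fused tensor with gate values in the even columns and up values in the odd columns. The function names `interleave_gate_up` and `split_experts` and the toy shapes are hypothetical.

```python
import numpy as np


def interleave_gate_up(gate: np.ndarray, up: np.ndarray) -> np.ndarray:
    """Fuse gate and up tensors by interleaving them along the last axis.

    Assumed layout (illustrative, not taken from the PR):
      gate, up: (num_experts, hidden, inter)
      result:   (num_experts, hidden, 2 * inter), gate at even columns,
                up at odd columns.
    """
    if gate.shape != up.shape:
        raise ValueError(f"shape mismatch: {gate.shape} vs {up.shape}")
    fused = np.empty(gate.shape[:-1] + (2 * gate.shape[-1],), dtype=gate.dtype)
    fused[..., 0::2] = gate  # even columns carry the gate projection
    fused[..., 1::2] = up    # odd columns carry the up projection
    return fused


def split_experts(stacked: np.ndarray) -> list:
    """Split a stacked (num_experts, ...) tensor into one array per expert."""
    return [stacked[i] for i in range(stacked.shape[0])]


# Toy example: 2 experts, hidden size 2, intermediate size 3.
gate = np.arange(12).reshape(2, 2, 3)
up = gate + 100
fused = interleave_gate_up(gate, up)
per_expert = split_experts(fused)
```

Slicing the fused tensor with `[..., 0::2]` and `[..., 1::2]` recovers the two inputs, which makes a convenient round-trip check when validating a converted checkpoint.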

Fixes #43366, supersedes #43757.
Related vLLM issue: vllm-project/vllm#22353
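
For the rope_scaling reconstruction mentioned in the change list above, a minimal sketch of turning flat metadata into a nested dict could look like the following. The key names follow the common GGUF `{arch}.rope.scaling.*` naming, but they are assumptions here, as are the helper name `rebuild_rope_scaling` and the sample values. The PR's actual implementation may read different keys or fill in additional YaRN fields.

```python
from typing import Any, Dict, Optional


def rebuild_rope_scaling(
    flat: Dict[str, Any], arch: str = "gpt-oss"
) -> Optional[Dict[str, Any]]:
    """Build a nested rope_scaling dict from flat GGUF-style metadata.

    Returns None when no scaling type is present or it is "none".
    Key names are assumed for illustration, not taken from the PR.
    """
    prefix = f"{arch}.rope.scaling."
    scaling_type = flat.get(prefix + "type")
    if scaling_type is None or scaling_type == "none":
        return None

    nested: Dict[str, Any] = {"rope_type": scaling_type}
    if prefix + "factor" in flat:
        nested["factor"] = float(flat[prefix + "factor"])
    if prefix + "original_context_length" in flat:
        # Renamed to the field name used in Transformers-style configs.
        nested["original_max_position_embeddings"] = int(
            flat[prefix + "original_context_length"]
        )
    return nested


# Illustrative metadata only; the values are placeholders.
example_metadata = {
    "gpt-oss.rope.scaling.type": "yarn",
    "gpt-oss.rope.scaling.factor": 32.0,
    "gpt-oss.rope.scaling.original_context_length": 4096,
}
rope_scaling = rebuild_rope_scaling(example_metadata)
```

Returning None for a missing or "none" scaling type keeps the config free of an empty rope_scaling entry, so models without scaled RoPE are unaffected.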

Code Agent Policy

I confirm that this is not a pure code agent PR.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). – Not applicable, it adds a feature.
Did you read the contributor guideline, Pull Request section? – Yes.
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. – Yes: issue #43366 ("GGUF model with architecture gpt-oss support"), with discussion in its comments.
Did you make sure to update the documentation with your changes? – Yes.
Did you write any new necessary tests? – Yes, in test/quantization/test_ggml.py

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Tagging:

@SunMarc (original issue tagger, quantization)
@Cyrilvallez (model loading)
@ArthurZucker (tokenizers, model structure)


github-actions · 2026-04-18T02:17:45Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: ggml

github-actions · 2026-04-18T02:36:10Z

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45500&sha=483e8f

sirzechs66 added 6 commits April 18, 2026 07:32

Add GPT-OSS GGUF support with YaRN rope scaling reconstruction

7dc6f04

Add GGUF loading test suite for GPT‑OSS

e332a44

docs: add GGUF loading section to gpt_oss.md

cb45f7f

fix: correct import of GptOssTensorProcessor in test; remove from model __all__

d00c3c3

Finalize GPT‑OSS GGUF support: move test, adjust config reconstruction

16fec5f

fixed docs not closing example bracket

483e8fe

evalstate mentioned this pull request Apr 21, 2026

[mergeability] Cluster cluster-43366-4: merged 1 PRs evalstate/transformers#2

Closed

sirzechs66 closed this Apr 22, 2026

evalstate mentioned this pull request Apr 28, 2026

Cumulative defect fixes from recent Transformers PRs evalstate/transformers#41

Open

sirzechs66 wants to merge 6 commits into huggingface:main from sirzechs66:gpt-oss-support
