Add full GGUF loading support for GPT‑OSS (fixes #43366, supersedes #43757) latest by sirzechs66 · Pull Request #45506 · huggingface/transformers

sirzechs66 · 2026-04-18T08:43:19Z

What does this PR do?

This PR adds full GGUF loading support for GPT‑OSS models (20B/120B). It lets Transformers (and, as a consequence, vLLM) load GPT‑OSS GGUF files directly instead of falling back to an incorrect architecture. The changes include:

Architecture registration in GGUF mappings.
A custom GptOssTensorProcessor to handle MoE expert splitting and gate/up interleaving.
Reconstruction of nested rope_scaling (YaRN) from flat GGUF metadata.
Tests: fast registration test + slow integration test using a real 20B GGUF file.
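
As a rough illustration of two of the steps listed above, the sketch below rebuilds a nested rope_scaling dict from flat metadata and interleaves separate gate and up expert weights into one tensor. It is not the PR's code: the flat key names, the helper names `rebuild_rope_scaling` and `interleave_gate_up`, and the interleaved layout (gate at even positions, up at odd positions along the last axis) are all assumptions made for the example.

```python
import numpy as np

# Assumed mapping from flat "<arch>.rope.scaling.*" metadata suffixes to the
# nested rope_scaling field names. The key names used in the PR may differ.
FLAT_TO_NESTED = {
    "rope.scaling.type": "rope_type",
    "rope.scaling.factor": "factor",
    "rope.scaling.original_context_length": "original_max_position_embeddings",
}


def rebuild_rope_scaling(flat_metadata, arch="gpt-oss"):
    """Collect flat rope-scaling entries into one nested dict (hypothetical helper)."""
    nested = {}
    for suffix, target in FLAT_TO_NESTED.items():
        key = f"{arch}.{suffix}"
        if key in flat_metadata:
            nested[target] = flat_metadata[key]
    # Return None when the file carries no rope-scaling metadata at all.
    return nested or None


def interleave_gate_up(gate, up):
    """Interleave two same-shaped arrays along the last axis (hypothetical helper).

    The result satisfies out[..., 0::2] == gate and out[..., 1::2] == up.
    """
    if gate.shape != up.shape:
        raise ValueError("gate and up must have the same shape")
    stacked = np.stack([gate, up], axis=-1)  # shape (..., n, 2)
    return stacked.reshape(*gate.shape[:-1], 2 * gate.shape[-1])


# Illustrative values only; they are not read from a real GGUF file.
metadata = {
    "gpt-oss.rope.scaling.type": "yarn",
    "gpt-oss.rope.scaling.factor": 32.0,
    "gpt-oss.rope.scaling.original_context_length": 4096,
}
rope_scaling = rebuild_rope_scaling(metadata)

gate = np.array([[1, 2], [3, 4]])
up = np.array([[10, 20], [30, 40]])
gate_up = interleave_gate_up(gate, up)
```

Stacking on a new trailing axis and then flattening it keeps each gate value next to its matching up value, so no explicit loop over columns is needed.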

Fixes #43366, supersedes #43757.
Related vLLM issue: vllm-project/vllm#22353

Code Agent Policy

I confirm that this is not a pure code agent PR.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). – Not applicable, it adds a feature.
Did you read the contributor guideline, Pull Request section? – Yes.
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. – Issue GGUF model with architecture gpt-oss support #43366, discussion in comments.
Did you make sure to update the documentation with your changes? – Yes.
Did you write any new necessary tests? – Yes, in test/quantization/test_ggml.py

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Tagging:

@SunMarc (original issue tagger, quantization)
@Cyrilvallez (model loading)
@ArthurZucker (tokenizers, model structure)

sirzechs66 · 2026-04-18T09:01:44Z

@SunMarc please review this

sirzechs66 · 2026-04-19T06:16:12Z

@ArthurZucker, @Rocketknight1 please review and merge these changes; everything is tested and ready.

sirzechs66 · 2026-04-20T08:12:23Z

@SunMarc please review; it is ready to be merged.

SunMarc

Thanks! Just a nit.

github-actions · 2026-04-20T16:07:57Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: ggml

HuggingFaceDocBuilderDev · 2026-04-22T06:38:44Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sirzechs66 added 6 commits April 18, 2026 14:05

Add GPT-OSS GGUF support with YaRN rope scaling reconstruction

69f669e

Add GGUF loading test suite for GPT‑OSS

073b3d3

docs: add GGUF loading section to gpt_oss.md

8c33e37

fix: correct import of GptOssTensorProcessor in test; remove from model __all__

53b7efb

Finalize GPT‑OSS GGUF support: move test, adjust config reconstruction

fcde5f8

fixed docs not closing example bracket

c6945b3

sirzechs66 marked this pull request as ready for review April 18, 2026 08:58

github-actions Bot requested review from ArthurZucker and Rocketknight1 April 18, 2026 08:58

sirzechs66 mentioned this pull request Apr 18, 2026

Add full GGUF loading support for GPT‑OSS (fixes #43366, supersedes #43757) #45118

Closed

6 tasks

Fix lint: remove trailing whitespace

af5ad57

Fix tensor construction consistency

a3b7e4e

SunMarc approved these changes Apr 20, 2026

Comment thread docs/source/en/model_doc/gpt_oss.md Outdated

reverting to original docs

e08938e

evalstate mentioned this pull request Apr 21, 2026

[mergeability] Cluster cluster-43366-4: merged 1 PRs evalstate/transformers#2

Closed

ArthurZucker approved these changes Apr 22, 2026

ArthurZucker enabled auto-merge April 22, 2026 06:28

ArthurZucker added this pull request to the merge queue Apr 22, 2026

Merged via the queue into huggingface:main with commit 5e23868 Apr 22, 2026
28 checks passed

evalstate mentioned this pull request Apr 28, 2026

Cumulative defect fixes from recent Transformers PRs evalstate/transformers#41

Open

ArthurZucker merged 9 commits into huggingface:main from sirzechs66:gpt-oss-support-fix

4 participants
