Add full GGUF loading support for GPT‑OSS (fixes #43366) by sirzechs66 · Pull Request #45116 · huggingface/transformers

sirzechs66 · 2026-03-30T12:18:41Z

What does this PR do?

This PR adds full GGUF loading support for GPT‑OSS models (20B/120B). It allows Transformers (and consequently vLLM) to directly load GPT‑OSS GGUF files without falling back to a wrong architecture. The changes include:

Architecture registration in GGUF mappings.
A custom GptOssTensorProcessor to handle MoE expert splitting and gate/up interleaving.
Reconstruction of nested rope_scaling (YaRN) from flat GGUF metadata.
Tests: fast registration test + slow integration test using a real 20B GGUF file.
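To make two of the items above concrete, here is a minimal sketch of what rebuilding a nested rope_scaling dict and interleaving gate/up values could look like. It is not the code from this PR. The flat key names follow the GGUF `<arch>.rope.scaling.*` naming convention but are assumptions here, `rebuild_rope_scaling` and `interleave_gate_up` are hypothetical helpers, the metadata values are illustrative, and plain Python lists stand in for real tensors.

```python
from __future__ import annotations

from typing import Any

# Assumed flat metadata keys (GGUF-style "<arch>.rope.scaling.*") mapped to the
# nested field names used in a rope_scaling dict. The exact keys read by the PR
# are not shown in this thread.
FLAT_TO_NESTED = {
    "gpt-oss.rope.scaling.type": "rope_type",
    "gpt-oss.rope.scaling.factor": "factor",
    "gpt-oss.rope.scaling.original_context_length": "original_max_position_embeddings",
}


def rebuild_rope_scaling(metadata: dict[str, Any]) -> dict[str, Any] | None:
    """Collect flat rope-scaling fields into one nested dict, or None if absent."""
    nested = {
        nested_key: metadata[flat_key]
        for flat_key, nested_key in FLAT_TO_NESTED.items()
        if flat_key in metadata
    }
    return nested or None


def interleave_gate_up(gate: list[float], up: list[float]) -> list[float]:
    """Fuse two equal-length rows as [g0, u0, g1, u1, ...].

    Assumes the fused layout alternates gate and up values along one axis.
    """
    if len(gate) != len(up):
        raise ValueError("gate and up rows must have the same length")
    return [value for pair in zip(gate, up) for value in pair]


# Illustrative metadata; unrelated keys are ignored by the helper.
flat_metadata = {
    "gpt-oss.rope.scaling.type": "yarn",
    "gpt-oss.rope.scaling.factor": 32.0,
    "gpt-oss.rope.scaling.original_context_length": 4096,
    "gpt-oss.block_count": 24,
}
rope_scaling = rebuild_rope_scaling(flat_metadata)
fused_row = interleave_gate_up([1.0, 2.0], [10.0, 20.0])
```

In a real tensor processor the interleaving would be done per expert on whole weight tensors (for example with strided assignment) rather than on Python lists. The list version only shows the target ordering.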

Fixes #43366, supersedes #43757.
Related vLLM issue: vllm-project/vllm#22353

Code Agent Policy

I confirm that this is not a pure code agent PR.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). – Not applicable, it adds a feature.
Did you read the contributor guideline, Pull Request section? – Yes.
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case. – Issue GGUF model with architecture gpt-oss support #43366, discussion in comments.
Did you make sure to update the documentation with your changes? – Yes!
Did you write any new necessary tests? – Yes, GptOssGgufLoadingTest in tests/models/gpt_oss/test_modeling_gpt_oss.py.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Tagging:

@SunMarc (original issue tagger, quantization)
@Cyrilvallez (model loading)
@ArthurZucker (tokenizers, model structure)

github-actions · 2026-03-30T12:50:08Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: gpt_oss

github-actions · 2026-03-30T13:02:23Z

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45116&sha=6e2e65

sirzechs66 · 2026-03-30T16:50:24Z

I thought the CI conflicts were caused by something on my end, my apologies. Please refer to PR #45118 instead. It still has some issues, but I think they come from other, unmodified files. Any help is appreciated.

sirzechs66 added 2 commits March 30, 2026 15:12

Add GPT-OSS GGUF support with YaRN rope scaling reconstruction

a77d70b

Add GGUF loading test suite for GPT‑OSS

f9ba4a8

sirzechs66 marked this pull request as draft March 30, 2026 12:20

docs: add GGUF loading section to gpt_oss.md

5d4515b

sirzechs66 force-pushed the fix-transformers-issue branch from 91529e7 to 5d4515b Compare March 30, 2026 12:48

Merge branch 'main' into fix-transformers-issue

6e2e655

sirzechs66 marked this pull request as ready for review March 30, 2026 13:03

sirzechs66 marked this pull request as draft March 30, 2026 13:03

github-actions Bot requested review from ArthurZucker and Rocketknight1 March 30, 2026 13:03

sirzechs66 closed this Mar 30, 2026

sirzechs66 deleted the fix-transformers-issue branch March 30, 2026 13:04

sirzechs66 wants to merge 4 commits into huggingface:main from sirzechs66:fix-transformers-issue
