feat[vLLM × v5]: Add vLLM compatibility for audio models by harshaljanjani · Pull Request #45326 · huggingface/transformers

harshaljanjani · 2026-04-08T18:28:35Z

What does this PR do?

→ This PR introduces compat fixes across several audio models to ensure they can be loaded and used by a companion vLLM PR. These changes are deliberate and are blocking this vLLM PR which adds audio backend compatibility to vLLM. Once this PR is merged, the other PR will be marked ready for review!
→ Outlining the design choices of one PR without context from the other didn't make much sense to me, so I wrote a doc that outlines both sets of changes together and explains their deliberate nature, amongst other valuable things!
→ The v5 tracker doesn’t mention the audio backend, but it is certainly a significant gap that needs to be addressed. After this is merged, I'll open an issue tracker for the Transformers audio backend work in vLLM so the efforts can stay organized.

Please refer to the document for the reasoning behind these changes in context with the vLLM PR!
Document: v5 x vLLM Audio Backend Support Document

Related Issues:

→ Current v5 tracker: vllm-project/vllm#38379
→ vllm-project/vllm#38902

@vasqu @ArthurZucker

Code Agent Policy

I confirm that this is not a pure code agent PR.

Before submitting

Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.

harshaljanjani · 2026-04-09T08:37:50Z

The CI failures are unrelated to this PR (the GraniteSpeech failure is likely a pre-existing issue as I documented).

Rocketknight1 · 2026-04-09T13:35:32Z

cc @hmellor as well

ArthurZucker

LGTM but can you make sure it's tested ! ?

eustlb

Amazing to see it's working out of the box! 🔥 How did you test?
Edit: Okay I see it's there

github-actions · 2026-04-14T15:30:14Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: audioflamingo3, auto, glmasr, granite_speech, musicflamingo, vibevoice_acoustic_tokenizer, vibevoice_asr

harshaljanjani · 2026-04-14T15:45:08Z

> LGTM but can you make sure it's tested ! ?

@ArthurZucker I re-ran the tests for the changed models (Granite Speech, Audio Flamingo 3, GLM-ASR and VibeVoice-ASR) to verify that they pass, and they do, except for the one failure I mentioned previously in the document, which isn't related to this change: Granite Speech is missing a skip that AudioFlamingo3 and Voxtral already have. It doesn't hurt to add it, so I've added it within the scope of this PR, and the CI is green after that. (I'm not sure about the local issue on my side when fetching the URL, since the URL is actually valid; it didn't happen in the last run.)

Before the necessary skip (GraniteSpeechForConditionalGenerationModelTest::test_inputs_embeds_matches_input_ids):

After the necessary skip (all model tests):

RUN_SLOW=1 pytest -q -o log_cli=false tests/models/granite_speech/test_modeling_granite_speech.py tests/models/audioflamingo3/test_modeling_audioflamingo3.py tests/models/glmasr/test_modeling_glmasr.py tests/models/vibevoice_asr/test_modeling_vibevoice_asr.py
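
As a rough illustration of the kind of skip described above, here is a minimal sketch using only Python's standard `unittest` module. The mixin, the test class and the skip reason are hypothetical stand-ins and are not the actual Transformers test code; only the test method name is taken from this thread.

```python
import unittest


class HypotheticalModelTesterMixin:
    """Hypothetical stand-in for a shared test mixin (not the real Transformers class)."""

    def test_inputs_embeds_matches_input_ids(self):
        # Placeholder for a shared check that a given model may not be able to satisfy.
        raise AssertionError("shared check does not apply to this model")


class HypotheticalGraniteSpeechTest(HypotheticalModelTesterMixin, unittest.TestCase):
    # Override the inherited test and mark it as skipped; the reason is illustrative only.
    @unittest.skip("Hypothetical reason: shared check is not applicable to this model")
    def test_inputs_embeds_matches_input_ids(self):
        pass


def run_suite():
    """Run the hypothetical test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HypotheticalGraniteSpeechTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With the override in place, the inherited check is reported as skipped rather than failed, which is the general effect a per-model skip is meant to have.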

> Amazing to see it's working out of the box! 🔥 How did you test?
> Edit: Okay I see it's there

@eustlb Yupp it's tested :)
I think you'd also find this gist valuable: it covers how I benchmarked the models and documents the quirks I encountered. I'll mark the vLLM PR ready for review once this is out of the way.

harshaljanjani · 2026-04-20T04:11:10Z

Good day @ArthurZucker @eustlb, just checking in to see if there have been any updates so that the vLLM PR can be unblocked :)

ArthurZucker

Let's go, I'd be happy if we can see some stuff that can be taken from the vLLM PR to here to help standardize! 🤗

HuggingFaceDocBuilderDev · 2026-04-20T09:22:41Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

harshaljanjani · 2026-04-20T14:34:21Z

Got kicked out of the merge queue 😓

…#45326) * chore: Add vLLM compat for audio models * fix: Fix ci/circleci: check_repository_consistency * nit: Skip incompatible test

chore: Add vLLM compat for audio models

61971d2

harshaljanjani mentioned this pull request Apr 8, 2026

feat[vLLM × v5]: Add audio support for the Transformers backend vllm-project/vllm#39330

Open

7 tasks

fix: Fix ci/circleci: check_repository_consistency

6c3b855

Merge branch 'main' into feat/audio-vllm-attention-backend

992c229

ArthurZucker reviewed Apr 14, 2026

eustlb reviewed Apr 14, 2026

nit: Skip incompatible test

5c1e328

harshaljanjani requested a review from ArthurZucker April 14, 2026 15:45

ArthurZucker approved these changes Apr 20, 2026

ArthurZucker enabled auto-merge April 20, 2026 09:12

ArthurZucker added this pull request to the merge queue Apr 20, 2026

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 20, 2026

ArthurZucker merged commit a6dab9f into huggingface:main Apr 21, 2026
28 checks passed

harshaljanjani deleted the feat/audio-vllm-attention-backend branch April 21, 2026 07:11

evalstate mentioned this pull request Apr 28, 2026

Cumulative defect fixes from recent Transformers PRs evalstate/transformers#41

Open

ArthurZucker merged 4 commits into huggingface:main from harshaljanjani:feat/audio-vllm-attention-backend

5 participants
