Remove many output_attentions and other traced outputs on 100+ models #43590
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

This comment contains models: ["models/roberta", "models/roberta_prelayernorm", "models/roc_bert", "models/sam", "models/speech_to_text", "models/splinter", "models/stablelm", "models/time_series_transformer", "models/timesfm2_5", "models/timm_wrapper", "models/video_llama_3", "models/video_llava", "models/videomae"]
Model CI Report: ❌ 5 new failed tests from this PR 😭
run-slow: vipllava,vit_mae,vit_msn,vitpose_backbone,vivit,vjepa2,voxtral_realtime,xglm,xlm_roberta,xlm_roberta_xl,xlstm,xmod,yolos,zamba,qwen2_5_omni,qwen2_audio,qwen2_vl,qwen3_5

This comment contains models: ["models/qwen2_5_omni", "models/qwen2_audio", "models/qwen2_vl", "models/qwen3_5", "models/vipllava", "models/vit_mae", "models/vit_msn", "models/vitpose_backbone", "models/vivit", "models/vjepa2", "models/voxtral_realtime", "models/xglm", "models/xlm_roberta", "models/xlm_roberta_xl", "models/xlstm", "models/xmod", "models/yolos", "models/zamba"]

run-slow: qwen3_5_moe,qwen3_omni_moe,qwen3_vl,qwen3_vl_moe

This comment contains models: ["models/qwen3_5_moe", "models/qwen3_omni_moe", "models/qwen3_vl", "models/qwen3_vl_moe"]
[For maintainers] Suggested jobs to run (before merge) run-slow: aimv2, align, altclip, apertus, aria, audio_spectrogram_transformer, audioflamingo3, autoformer, aya_vision, bamba, bart, beit, bert, bert_generation, big_bird, bigbird_pegasus |
The test failure analysis could not be completed. Please check the workflow run for details.
Force merging: the failing test is flaky, and this PR is also important for a model addition (and several other refactors).
What does this PR do?
In model additions, we often see old standards not using `check_model_inputs` / `can_return_tuple`, and it's often a first review comment / something that can slip through. Doing a wide scan to try to remove all occurrences systematically.

Background
Every model used to manually resolve `output_attentions`, `output_hidden_states`, and `return_dict` in each `forward`, then collect intermediate outputs in a loop, then convert to tuple at the end. That's ~30 lines of boilerplate per model, reimplemented everywhere with subtle inconsistencies; a sketch of the old pattern follows.
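A minimal sketch of that old pattern (class and attribute names like `MyOldModel` / `embed_tokens` are hypothetical stand-ins, not taken from any specific model in this PR):

```python
from transformers import PreTrainedModel
from transformers.modeling_outputs import BaseModelOutput


class MyOldModel(PreTrainedModel):  # hypothetical pre-refactor base model
    def forward(self, input_ids, output_attentions=None, output_hidden_states=None, return_dict=None):
        # Manual config fallbacks, duplicated in every model.
        output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
        output_hidden_states = (
            output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
        )
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        all_hidden_states = () if output_hidden_states else None
        all_attentions = () if output_attentions else None

        hidden_states = self.embed_tokens(input_ids)
        for layer in self.layers:
            # Per-iteration collection, repeated in every layer loop.
            if output_hidden_states:
                all_hidden_states += (hidden_states,)
            hidden_states, attn_weights = layer(hidden_states, output_attentions=output_attentions)
            if output_attentions:
                all_attentions += (attn_weights,)
        if output_hidden_states:
            all_hidden_states += (hidden_states,)

        # Manual tuple conversion when return_dict is False.
        if not return_dict:
            return tuple(v for v in (hidden_states, all_hidden_states, all_attentions) if v is not None)
        return BaseModelOutput(
            last_hidden_state=hidden_states,
            hidden_states=all_hidden_states,
            attentions=all_attentions,
        )
```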
Two decorators now handle this:

- `@capture_outputs` goes on the base model `forward` (the one with the layer loop). It reads `output_attentions` / `output_hidden_states` from kwargs or config, installs hooks on modules listed in `_can_record_outputs`, collects intermediate outputs automatically, injects them into the `ModelOutput`, and handles `return_dict`. The model just needs to declare which module classes produce which outputs (e.g. `_can_record_outputs = {"hidden_states": DecoderLayer, "attentions": Attention}`).
- `@can_return_tuple` goes on wrapper forwards (`ForCausalLM`, `ForSequenceClassification`, VLM wrappers) that only need `return_dict` conversion. Wrapper models should not use `@capture_outputs`, to avoid nested hook chains. A sketch of both is shown after this list.
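A minimal sketch of the new pattern, under stated assumptions: the decorator import path is a guess (check the actual diff), and `MyModel`, `MyForCausalLM`, `DecoderLayer`, and `Attention` are illustrative names, not verbatim from the PR:

```python
from typing import Unpack

import torch.nn as nn
from transformers import PreTrainedModel
from transformers.modeling_outputs import BaseModelOutput, CausalLMOutput
# Import path assumed; the decorators may live elsewhere in transformers.utils.
from transformers.utils.generic import TransformersKwargs, can_return_tuple, capture_outputs


class Attention(nn.Module): ...      # stub standing in for a real attention module
class DecoderLayer(nn.Module): ...   # stub standing in for a real decoder layer


class MyModel(PreTrainedModel):
    # Map output names to the module classes that produce them; the decorator
    # installs hooks on instances of these classes and records their outputs.
    _can_record_outputs = {
        "hidden_states": DecoderLayer,
        "attentions": Attention,
    }

    @capture_outputs
    def forward(self, input_ids, **kwargs: Unpack[TransformersKwargs]) -> BaseModelOutput:
        # __init__ / submodule construction elided for brevity.
        hidden_states = self.embed_tokens(input_ids)
        for layer in self.layers:
            hidden_states = layer(hidden_states)  # no per-iteration collection
        # hidden_states/attentions tuples are injected into the output, and
        # return_dict is handled, by the decorator.
        return BaseModelOutput(last_hidden_state=hidden_states)


class MyForCausalLM(PreTrainedModel):
    @can_return_tuple  # wrapper only needs return_dict conversion, no hooks
    def forward(self, input_ids, **kwargs: Unpack[TransformersKwargs]) -> CausalLMOutput:
        outputs: BaseModelOutput = self.model(input_ids, **kwargs)
        logits = self.lm_head(outputs.last_hidden_state)
        return CausalLMOutput(logits=logits, hidden_states=outputs.hidden_states)
```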
What changes per model

- `output_attentions`, `output_hidden_states`, `return_dict` dropped from forward signatures, replaced by `**kwargs: Unpack[TransformersKwargs]`
- `all_hidden_states += (hidden_states,)` collection loops removed
- Attention modules keep returning `(attn_output, attn_weights)`; the `if not output_attentions: attn_weights = None` guard is removed, since hooks capture directly from the module output (see the sketch below)
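To make the last bullet concrete, a hedged before/after sketch of a single attention module; `_attend` is a hypothetical helper standing in for the real attention computation:

```python
from typing import Unpack

import torch.nn as nn
from transformers.utils.generic import TransformersKwargs  # import path assumed


class OldAttention(nn.Module):
    def forward(self, hidden_states, output_attentions=False):
        attn_output, attn_weights = self._attend(hidden_states)  # _attend: hypothetical helper
        if not output_attentions:
            attn_weights = None  # guard removed by this PR
        return attn_output, attn_weights


class NewAttention(nn.Module):
    # Always return (attn_output, attn_weights); the capture_outputs hook on
    # the base model records the weights only when attentions were requested.
    def forward(self, hidden_states, **kwargs: Unpack[TransformersKwargs]):
        attn_output, attn_weights = self._attend(hidden_states)  # _attend: hypothetical helper
        return attn_output, attn_weights
```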