
Migrate wav2vec2, wav2vec2_conformer, and wav2vec2_bert to standardized output collection decorators#44114

Open
23atharvaS wants to merge 12 commits into huggingface:main from 23atharvaS:wav2vec2-output-capturing-migration

Conversation


@23atharvaS 23atharvaS commented Feb 17, 2026

Summary

This PR migrates the wav2vec2 family to the standardized output-capturing interface (@capture_outputs + @can_return_tuple) and includes the follow-up compatibility fixes required to make the full CI run green.

What changed

Core migration (wav2vec2, wav2vec2_conformer, wav2vec2_bert)

  • Added _can_record_outputs on:
    • Wav2Vec2PreTrainedModel
    • Wav2Vec2ConformerPreTrainedModel
    • Wav2Vec2BertPreTrainedModel
  • Added @capture_outputs on base model forwards:
    • Wav2Vec2Model.forward
    • Wav2Vec2ConformerModel.forward (via modular -> generated)
    • Wav2Vec2BertModel.forward (via modular -> generated)
  • Added @can_return_tuple on wrapper forwards (CTC / sequence classification / audio frame classification / xvector / pretraining where applicable).
  • Removed manual hidden-state/attention collection loops and legacy output plumbing in migrated paths.
  • Updated encoder layer return flow to align with hook-based output capture.
  • In the conformer/bert modular encoders, ensured that only output_attentions is forwarded to self-attention (preventing unrelated kwargs from propagating too deeply into the attention call).
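To make the migration concrete, here is a minimal, self-contained sketch of the pattern described above: a class-level `_can_record_outputs` declaration plus a `capture_outputs`-style decorator that collects intermediate values during `forward`, so callers no longer write manual hidden-state/attention collection loops. The names `capture_outputs` and `_can_record_outputs` come from this PR; the mechanics below (a simple recorder object instead of real module hooks) are an illustrative assumption, not the actual transformers implementation.

```python
from functools import wraps


class Recorder:
    """Collects values that layers emit while a forward pass runs."""

    def __init__(self):
        self.records = {}

    def record(self, key, value):
        self.records.setdefault(key, []).append(value)


def capture_outputs(forward):
    """Attach everything the class declares in ``_can_record_outputs``
    to the forward result, without manual collection loops."""

    @wraps(forward)
    def wrapper(self, *args, **kwargs):
        recorder = Recorder()
        self._recorder = recorder  # layers record through this handle
        try:
            result = forward(self, *args, **kwargs)
        finally:
            self._recorder = None
        for key in type(self)._can_record_outputs:
            result[key] = tuple(recorder.records.get(key, ()))
        return result

    return wrapper


class TinyModel:
    # Declares which intermediate outputs the model can capture,
    # mirroring the ``_can_record_outputs`` attribute added in this PR.
    _can_record_outputs = ("hidden_states", "attentions")

    @capture_outputs
    def forward(self, x):
        for layer_idx in range(2):
            x = x + 1  # stand-in for a real encoder layer
            self._recorder.record("hidden_states", x)
            self._recorder.record("attentions", layer_idx)
        return {"last_hidden_state": x}


out = TinyModel().forward(0)
```

With this shape, wrapper heads (CTC, classification, xvector) only need `@can_return_tuple`-style handling of the result object; the capture itself lives in one place on the base model.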

Follow-up compatibility fixes (from CI)

  • Restored config-driven forwarding of output_attentions / output_hidden_states in Wav2Vec2Model.forward so config-only output requests are honored.
  • Synced the same behavior across related wav2vec2-derived models through modular/generated updates:
    • hubert
    • sew
    • unispeech
    • unispeech_sat
  • Fixed SEWEncoderLayer output contract to match encoder expectations.
  • Updated UniSpeechForPreTraining / UniSpeechSatForPreTraining to propagate config-driven output flags.
  • Updated XcodecModel to accept and thread capture-output kwargs (output_attentions, output_hidden_states) through its semantic backbone call path, fixing test_capture_outputs_decorator.
  • Kept attention return compatibility where required by downstream copied implementations.
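The config-driven forwarding that the CI fixes restore follows the long-standing transformers convention: an explicit kwarg wins, otherwise the flag stored on the model config takes effect, so config-only output requests are still honored. The sketch below is a generic illustration of that fallback (the class and field names are placeholders, not the wav2vec2 or xcodec code):

```python
from dataclasses import dataclass


@dataclass
class Config:
    output_attentions: bool = False
    output_hidden_states: bool = False


class Model:
    def __init__(self, config):
        self.config = config

    def forward(self, x, output_attentions=None, output_hidden_states=None):
        # Explicit kwargs win; otherwise fall back to the config flags.
        output_attentions = (
            output_attentions
            if output_attentions is not None
            else self.config.output_attentions
        )
        output_hidden_states = (
            output_hidden_states
            if output_hidden_states is not None
            else self.config.output_hidden_states
        )
        # A backbone call would thread these kwargs through, as the
        # XcodecModel fix does for its semantic backbone path.
        return {
            "attns_requested": output_attentions,
            "hidden_requested": output_hidden_states,
        }


model = Model(Config(output_attentions=True))
config_only = model.forward(0)               # no explicit kwargs
overridden = model.forward(0, output_attentions=False)
```

Here `config_only` reflects the config flag while `overridden` shows an explicit kwarg taking precedence, which is exactly the contract `test_capture_outputs_decorator` exercises.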

Files touched (high-level)

  • src/transformers/models/wav2vec2/modeling_wav2vec2.py
  • src/transformers/models/wav2vec2_conformer/modular_wav2vec2_conformer.py
  • src/transformers/models/wav2vec2_conformer/modeling_wav2vec2_conformer.py (regenerated)
  • src/transformers/models/wav2vec2_bert/modular_wav2vec2_bert.py
  • src/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py (regenerated)
  • src/transformers/models/hubert/modular_hubert.py
  • src/transformers/models/hubert/modeling_hubert.py (regenerated)
  • src/transformers/models/sew/modular_sew.py
  • src/transformers/models/sew/modeling_sew.py (regenerated)
  • src/transformers/models/unispeech/modular_unispeech.py
  • src/transformers/models/unispeech/modeling_unispeech.py (regenerated)
  • src/transformers/models/unispeech_sat/modular_unispeech_sat.py
  • src/transformers/models/unispeech_sat/modeling_unispeech_sat.py (regenerated)
  • src/transformers/models/xcodec/modeling_xcodec.py

Validation

Regeneration / consistency

  • python utils/check_modular_conversion.py --fix_and_overwrite
  • python utils/check_modular_conversion.py
  • python utils/check_copies.py --fix_and_overwrite (when needed by CI drift)

Code quality

  • python -m ruff check ... on touched files (clean)

Focused output-interface tests

  • python -m pytest tests/models/wav2vec2/test_modeling_wav2vec2.py tests/models/wav2vec2_conformer/test_modeling_wav2vec2_conformer.py tests/models/wav2vec2_bert/test_modeling_wav2vec2_bert.py -k "capture_outputs_decorator or attention_outputs or hidden_states_output or model_outputs_equivalence" -q
  • Result: pass

Targeted regression reruns for CI failures

  • hubert: test_attention_outputs (regular + robust) — pass
  • sew: test_attention_outputs — pass
  • unispeech: test_attention_outputs (robust) — pass
  • unispeech_sat: test_attention_outputs (regular + robust) — pass
  • xcodec: test_capture_outputs_decorator — pass
  • xcodec: test_model_forward_default_config_values — pass
  • glm_ocr: test_generate_with_and_without_position_ids — pass

CI status

  • tests_torch and all major CircleCI shards pass.
  • Remaining merge blockers are maintainer workflow approval / required review, not code failures.

Notes

  • On Windows, test_save_load may fail with a safetensors file-lock issue (os error 1224), which appears environment-specific and unrelated to this migration logic.

@23atharvaS
Author

23atharvaS commented Feb 18, 2026

This PR is still in the testing phase on CI.

@23atharvaS
Author

The PR has passed all of the CircleCI tests.

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: data2vec, hubert, patchtsmixer, patchtst, sew, sew_d, unispeech, unispeech_sat, wav2vec2, wav2vec2_bert, wav2vec2_conformer, wavlm, xcodec
