Migrate wav2vec2, wav2vec2_conformer, and wav2vec2_bert to standardized output collection decorators #44114
Open
23atharvaS wants to merge 12 commits into huggingface:main from
…turing decorators
Author
|
This PR is still in the testing phase on CI |
Author
|
The PR has passed all of the CircleCI tests |
Contributor
|
[For maintainers] Suggested jobs to run (before merge) run-slow: data2vec, hubert, patchtsmixer, patchtst, sew, sew_d, unispeech, unispeech_sat, wav2vec2, wav2vec2_bert, wav2vec2_conformer, wavlm, xcodec |
This was referenced Apr 29, 2026
Summary
This PR migrates the wav2vec2 family to the standardized output-capturing interface (@capture_outputs + @can_return_tuple) and includes follow-up compatibility fixes required to make full CI green.

What changed

Core migration (wav2vec2, wav2vec2_conformer, wav2vec2_bert)
- _can_record_outputs on:
  - Wav2Vec2PreTrainedModel
  - Wav2Vec2ConformerPreTrainedModel
  - Wav2Vec2BertPreTrainedModel
- @capture_outputs on base model forwards:
  - Wav2Vec2Model.forward
  - Wav2Vec2ConformerModel.forward (via modular -> generated)
  - Wav2Vec2BertModel.forward (via modular -> generated)
- @can_return_tuple on wrapper forwards (CTC / sequence classification / audio frame classification / xvector / pretraining where applicable).
- output_attentions is forwarded to self-attention (preventing unrelated kwargs from propagating too deep).

Follow-up compatibility fixes (from CI)
- output_attentions / output_hidden_states in Wav2Vec2Model.forward so config-only output requests are honored.
- hubert, sew, unispeech, unispeech_sat
- SEWEncoderLayer output contract to match encoder expectations.
- UniSpeechForPreTraining / UniSpeechSatForPreTraining to propagate config-driven output flags.
- XcodecModel to accept and thread capture-output kwargs (output_attentions, output_hidden_states) through its semantic backbone call path, fixing test_capture_outputs_decorator.

Files touched (high-level)
- src/transformers/models/wav2vec2/modeling_wav2vec2.py
- src/transformers/models/wav2vec2_conformer/modular_wav2vec2_conformer.py
- src/transformers/models/wav2vec2_conformer/modeling_wav2vec2_conformer.py (regenerated)
- src/transformers/models/wav2vec2_bert/modular_wav2vec2_bert.py
- src/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py (regenerated)
- src/transformers/models/hubert/modular_hubert.py
- src/transformers/models/hubert/modeling_hubert.py (regenerated)
- src/transformers/models/sew/modular_sew.py
- src/transformers/models/sew/modeling_sew.py (regenerated)
- src/transformers/models/unispeech/modular_unispeech.py
- src/transformers/models/unispeech/modeling_unispeech.py (regenerated)
- src/transformers/models/unispeech_sat/modular_unispeech_sat.py
- src/transformers/models/unispeech_sat/modeling_unispeech_sat.py (regenerated)
- src/transformers/models/xcodec/modeling_xcodec.py

Validation

Regeneration / consistency
- python utils/check_modular_conversion.py --fix_and_overwrite
- python utils/check_modular_conversion.py
- python utils/check_copies.py --fix_and_overwrite (when needed by CI drift)

Code quality
- python -m ruff check ... on touched files (clean)

Focused output-interface tests
- python -m pytest tests/models/wav2vec2/test_modeling_wav2vec2.py tests/models/wav2vec2_conformer/test_modeling_wav2vec2_conformer.py tests/models/wav2vec2_bert/test_modeling_wav2vec2_bert.py -k "capture_outputs_decorator or attention_outputs or hidden_states_output or model_outputs_equivalence" -q

Targeted regression reruns for CI failures
- hubert: test_attention_outputs (regular + robust) - pass
- sew: test_attention_outputs - pass
- unispeech: test_attention_outputs (robust) - pass
- unispeech_sat: test_attention_outputs (regular + robust) - pass
- xcodec: test_capture_outputs_decorator - pass
- xcodec: test_model_forward_default_config_values - pass
- glm_ocr: test_generate_with_and_without_position_ids - pass

CI status
- tests_torch and all major CircleCI shards pass.

Notes
- test_save_load may fail with a safetensors file-lock issue (os error 1224), which appears environment-specific and unrelated to this migration logic.
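
For readers unfamiliar with the pattern, the output-capturing idea described under "Core migration" and the config-driven flag handling from the follow-up fixes can be sketched roughly as below. This is a toy, dependency-free illustration, not the transformers implementation: ToyConfig, ToyOutput, ToyModel, make_layer, toy_capture_outputs, and toy_can_return_tuple are all hypothetical names invented here, and stacking both decorators on a single forward is a simplification (the PR applies @capture_outputs to base models and @can_return_tuple to wrapper forwards).

```python
from dataclasses import dataclass, fields
from functools import wraps
from typing import Optional


@dataclass
class ToyConfig:  # hypothetical stand-in for a model config
    output_hidden_states: bool = False
    output_attentions: bool = False
    return_dict: bool = True


@dataclass
class ToyOutput:  # hypothetical stand-in for a structured model output
    last_hidden_state: list
    hidden_states: Optional[tuple] = None
    attentions: Optional[tuple] = None

    def to_tuple(self):
        # Tuple form keeps only the fields that were actually populated.
        values = (getattr(self, f.name) for f in fields(self))
        return tuple(v for v in values if v is not None)


def toy_can_return_tuple(forward):
    """Toy analogue: convert the structured output to a tuple on request."""
    @wraps(forward)
    def wrapper(self, *args, return_dict=None, **kwargs):
        if return_dict is None:
            return_dict = self.config.return_dict
        output = forward(self, *args, **kwargs)
        return output if return_dict else output.to_tuple()
    return wrapper


def toy_capture_outputs(forward):
    """Toy analogue: record per-layer outputs outside the forward body."""
    @wraps(forward)
    def wrapper(self, inputs, output_hidden_states=None, output_attentions=None):
        # Explicit kwargs win; otherwise fall back to the config, so that
        # config-only output requests are honored.
        if output_hidden_states is None:
            output_hidden_states = self.config.output_hidden_states
        if output_attentions is None:
            output_attentions = self.config.output_attentions
        hidden_log, attn_log = [], []

        def recording(layer):
            def call(hidden):
                new_hidden, attn = layer(hidden)
                hidden_log.append(new_hidden)
                attn_log.append(attn)
                return new_hidden, attn
            return call

        original_layers = self.layers
        self.layers = [recording(layer) for layer in original_layers]
        try:
            output = forward(self, inputs)
        finally:
            self.layers = original_layers  # always undo the temporary wrapping
        if output_hidden_states:
            output.hidden_states = tuple(hidden_log)
        if output_attentions:
            output.attentions = tuple(attn_log)
        return output
    return wrapper


def make_layer(scale):
    """Toy layer: scales the input and returns uniform fake attention weights."""
    def layer(hidden):
        attn = [1.0 / len(hidden)] * len(hidden)
        return [scale * h for h in hidden], attn
    return layer


class ToyModel:
    def __init__(self, config, scales=(2.0, 3.0)):
        self.config = config
        self.layers = [make_layer(s) for s in scales]

    @toy_can_return_tuple
    @toy_capture_outputs
    def forward(self, inputs):
        # The forward body contains no output-collection bookkeeping.
        hidden = list(inputs)
        for layer in self.layers:
            hidden, _ = layer(hidden)
        return ToyOutput(last_hidden_state=hidden)
```

In this sketch, ToyModel(ToyConfig(output_attentions=True)).forward([1.0, 2.0]) populates attentions without any explicit keyword argument, and passing return_dict=False yields a plain tuple. The point of the design is that the forward body stays free of per-flag branching, which is what the migration aims for in the real models.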