
[GPT2] Refactor output tracing to use capture_outputs/can_return_tuple decorators #44059

Open
lakprigan wants to merge 1 commit into huggingface:main from lakprigan:refactor-gpt2-output-tracing

Conversation

@lakprigan

Summary

Migrates GPT2 to the standardized output collection interface as part of #43979.

  • Added _can_record_outputs to GPT2PreTrainedModel (including cross_attentions via OutputRecorder targeting the crossattention submodule)
  • Added @capture_outputs on GPT2Model.forward()
  • Added @can_return_tuple on all wrapper model forwards (GPT2LMHeadModel, GPT2DoubleHeadsModel, GPT2ForSequenceClassification, GPT2ForTokenClassification, GPT2ForQuestionAnswering)
  • Removed manual output_attentions, output_hidden_states, and return_dict handling from all forward methods
  • GPT2Block.forward() now returns a single torch.Tensor instead of a tuple
  • GPT2Attention.forward() always returns (attn_output, attn_weights) — hooks capture weights when needed

Net reduction: ~89 lines of boilerplate removed (44 insertions, 133 deletions).
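
To illustrate the mechanism these decorators rely on, here is a small, self-contained sketch of hook-based output capture. It is a conceptual toy, not the transformers implementation: the real capture_outputs decorator, _can_record_outputs contract, and OutputRecorder class live in the library and have richer signatures (field/index selection, cross-attention targeting via submodule names, etc.). It only shows the core idea of recording per-layer tensors with forward hooks instead of threading output_attentions through every forward().

```python
# Conceptual sketch only, NOT the transformers implementation.
import functools

import torch
from torch import nn


class ToyAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # Always compute and return the weights; callers may ignore them.
        weights = torch.softmax(x @ x.transpose(-1, -2), dim=-1)
        return self.proj(weights @ x), weights


def capture_outputs(forward):
    """Register temporary hooks on submodules listed in _can_record_outputs."""
    @functools.wraps(forward)
    def wrapper(self, *args, **kwargs):
        records = {name: [] for name in self._can_record_outputs}
        handles = []
        for name, target_cls in self._can_record_outputs.items():
            for module in self.modules():
                if isinstance(module, target_cls):
                    def hook(_mod, _inp, out, _name=name):
                        records[_name].append(out[1])  # grab the attn weights
                    handles.append(module.register_forward_hook(hook))
        try:
            hidden = forward(self, *args, **kwargs)
        finally:
            for handle in handles:
                handle.remove()
        return {"last_hidden_state": hidden,
                "attentions": tuple(records["attentions"])}
    return wrapper


class ToyModel(nn.Module):
    # Maps output field names to the submodule class whose output feeds them.
    _can_record_outputs = {"attentions": ToyAttention}

    def __init__(self, dim=8, n_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(ToyAttention(dim) for _ in range(n_layers))

    @capture_outputs
    def forward(self, x):
        for layer in self.layers:
            x, _ = layer(x)  # per-layer weights are collected by the hooks
        return x


out = ToyModel()(torch.randn(1, 4, 8))
print(out["last_hidden_state"].shape, len(out["attentions"]))  # (1, 4, 8), 2
```

In the actual PR, the wrapper models additionally use @can_return_tuple (sketched after the References section) so that callers passing return_dict=False still receive plain tuples.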

Testing

All 136 non-compile GPT2 tests pass. The 2 torch.compile test failures (test_generate_compilation_all_outputs, test_generate_compile_model_forward_fullgraph) are pre-existing environment-specific issues (arm64 torch inductor) and also fail on unmodified main.

References

Used llama, nllb_moe, and t5gemma as reference implementations for the cross-attention OutputRecorder pattern.
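
As a companion to the sketch above, here is roughly what the return_dict handling removed by @can_return_tuple standardizes. Again a toy, assumption-laden sketch: the real decorator in transformers also consults the model config for defaults and works with the full ModelOutput classes.

```python
# Toy sketch of a can_return_tuple-style decorator; not the transformers code.
import functools
from dataclasses import dataclass, fields
from typing import Optional

import torch


@dataclass
class ToyCausalLMOutput:
    loss: Optional[torch.Tensor]
    logits: torch.Tensor

    def to_tuple(self):
        # Drop None fields, mirroring the spirit of ModelOutput.to_tuple().
        return tuple(v for v in (getattr(self, f.name) for f in fields(self))
                     if v is not None)


def can_return_tuple(forward):
    """Return the dataclass as-is, or as a plain tuple when return_dict=False."""
    @functools.wraps(forward)
    def wrapper(self, *args, return_dict=True, **kwargs):
        output = forward(self, *args, **kwargs)
        return output if return_dict else output.to_tuple()
    return wrapper


class ToyLMHead:
    @can_return_tuple
    def forward(self, logits):
        # The forward body no longer branches on return_dict at all.
        return ToyCausalLMOutput(loss=None, logits=logits)


head = ToyLMHead()
print(type(head.forward(torch.zeros(1, 2))).__name__)            # ToyCausalLMOutput
print(type(head.forward(torch.zeros(1, 2), return_dict=False)))  # <class 'tuple'>
```

Centralizing this in one decorator is what lets the five GPT2 wrapper heads drop their per-forward return_dict branches.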

[GPT2] Refactor output tracing to use capture_outputs/can_return_tuple decorators

Part of huggingface#43979. Migrates GPT2 to the standardized output collection
interface, removing ~89 lines of manual output_attentions,
output_hidden_states, and return_dict boilerplate.

Changes:
- Add _can_record_outputs to GPT2PreTrainedModel (including
  cross_attentions via OutputRecorder)
- Add @capture_outputs on GPT2Model.forward()
- Add @can_return_tuple on all wrapper model forwards
- GPT2Block returns a single tensor instead of a tuple
- GPT2Attention always returns (attn_output, attn_weights)
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: gpt2

@lakprigan
Author

@molbap This GPT2 refactor is ready for review per #43979. All 136 non-compile tests pass; the 2 torch.compile failures are pre-existing on main (arm64 inductor).
