Refactor GPT-Neo output tracing to use capture_outputs/can_return_tuple #44018

Open
yashbora9 wants to merge 2 commits into huggingface:main from yashbora9:refactor-gpt-neo-output-tracing
@yashbora9

Summary

  • Migrates gpt_neo to the standardized output collection interface as part of #43979 ("Call to contributions: refactor output tracing in transformers")
  • Adds @capture_outputs decorator on GPTNeoModel.forward (base model)
  • Adds @can_return_tuple decorator on all wrapper model forwards (ForCausalLM, ForSequenceClassification, ForTokenClassification, ForQuestionAnswering)
  • Adds _can_record_outputs mapping on GPTNeoPreTrainedModel pointing to GPTNeoBlock (hidden_states) and GPTNeoAttention (attentions)
  • Removes manual output_attentions, output_hidden_states, return_dict parameter handling and collection loops
  • GPTNeoBlock now returns hidden_states directly instead of a (hidden_states, attn_weights) tuple
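
For reviewers unfamiliar with the pattern, the hook-based collection described in the bullets above can be sketched in plain Python. This is a toy illustration only, not the transformers implementation: `ToyModule`, `ToyAttention`, `ToyBlock`, `ToyModel` and `toy_capture_outputs` are hypothetical names, and the real `@capture_outputs` decorator may differ in how it registers hooks and which outputs it records.

```python
import functools


class ToyModule:
    """Minimal stand-in for a layer with forward hooks (hypothetical, not torch.nn.Module)."""

    def __init__(self):
        self._hooks = []

    def register_forward_hook(self, hook):
        self._hooks.append(hook)
        # Return a callable that unregisters the hook again.
        return lambda: self._hooks.remove(hook)

    def __call__(self, *args, **kwargs):
        out = self.forward(*args, **kwargs)
        for hook in list(self._hooks):
            hook(self, out)
        return out


class ToyAttention(ToyModule):
    def forward(self, x):
        weights = [1.0 / len(x)] * len(x)  # uniform "attention weights"
        mixed = sum(w * v for w, v in zip(weights, x))
        return [mixed] * len(x), weights  # always returns (output, weights)


class ToyBlock(ToyModule):
    def __init__(self):
        super().__init__()
        self.attn = ToyAttention()

    def forward(self, x):
        attn_out, _ = self.attn(x)
        return [a + b for a, b in zip(x, attn_out)]  # hidden states only, no tuple


def toy_capture_outputs(forward):
    """Collect per-layer outputs through hooks instead of manual loops in forward."""

    @functools.wraps(forward)
    def wrapper(self, x, output_attentions=False, output_hidden_states=False):
        hidden, attns, removers = [], [], []
        for block in self.blocks:
            if output_hidden_states:
                removers.append(block.register_forward_hook(lambda m, out: hidden.append(out)))
            if output_attentions:
                removers.append(block.attn.register_forward_hook(lambda m, out: attns.append(out[1])))
        try:
            result = forward(self, x)
        finally:
            for remove in removers:  # never leave hooks behind
                remove()
        if output_hidden_states:
            result["hidden_states"] = tuple(hidden)
        if output_attentions:
            result["attentions"] = tuple(attns)
        return result

    return wrapper


class ToyModel:
    def __init__(self, num_layers=2):
        self.blocks = [ToyBlock() for _ in range(num_layers)]

    @toy_capture_outputs
    def forward(self, x):
        # The forward body no longer knows about output_attentions / output_hidden_states.
        for block in self.blocks:
            x = block(x)
        return {"last_hidden_state": x}
```

Calling `ToyModel().forward([1.0, 3.0], output_attentions=True)` adds an `attentions` entry without the forward body ever touching the flag, which is the property this refactor relies on.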

Test plan

  • Run existing GPT-Neo model tests (tests/models/gpt_neo/)
  • Verify output_attentions=True still returns attention weights via hooks
  • Verify output_hidden_states=True still returns per-layer hidden states via hooks
  • Verify return_dict=False still returns tuples via @can_return_tuple
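
The `return_dict=False` check in the last item can be pictured with a small stand-alone sketch. It assumes a decorator that converts a structured output to a tuple on request; `ToyOutput`, `toy_can_return_tuple` and `ToyHeadModel` are hypothetical names, and the real `@can_return_tuple` may resolve `return_dict` differently (for example from the model config).

```python
import functools
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ToyOutput:
    logits: list
    hidden_states: Optional[tuple] = None
    attentions: Optional[tuple] = None

    def to_tuple(self):
        # Keep only the fields that were actually populated, in declaration order.
        values = (getattr(self, f.name) for f in fields(self))
        return tuple(v for v in values if v is not None)


def toy_can_return_tuple(forward):
    """Strip return_dict from the call and convert the output afterwards."""

    @functools.wraps(forward)
    def wrapper(self, *args, return_dict=True, **kwargs):
        output = forward(self, *args, **kwargs)
        return output if return_dict else output.to_tuple()

    return wrapper


class ToyHeadModel:
    @toy_can_return_tuple
    def forward(self, x, output_hidden_states=False):
        # The forward body always builds the structured output.
        logits = [2 * v for v in x]
        hidden = (x, logits) if output_hidden_states else None
        return ToyOutput(logits=logits, hidden_states=hidden)
```

With this shape, the wrapper forwards never branch on `return_dict` themselves; a test only needs to compare the tuple form against the fields of the structured output.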

🤖 Generated with Claude Code

Yash Bora and others added 2 commits February 15, 2026 20:33
…tors

Migrates GPT-Neo to the standardized output collection interface as part of huggingface#43979.
Removes manual output_attentions/output_hidden_states/return_dict handling
in favor of hook-based output capturing via decorators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…moval

- Added @merge_with_config_defaults to resolve use_cache from config
- Removed output_attentions=True from test_local_attn_probs (attention
  layer always returns weights, no longer accepts this param)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: gpt_neo
