Refactor GPT-J to use standardized output tracing (#43979) #44066

Open
jayavelubalaji-ai wants to merge 3 commits into huggingface:main from jayavelubalaji-ai:43979/refactor-gptj-output-tracing

Conversation

@jayavelubalaji-ai

Migrate GPT-J from manual boilerplate output collection to the new decorator-based output tracing system:

  • Add _can_record_outputs to GPTJPreTrainedModel
  • Add @capture_outputs and @merge_with_config_defaults to GPTJModel.forward
  • Add @can_return_tuple to GPTJForCausalLM, GPTJForSequenceClassification, and GPTJForQuestionAnswering forwards
  • Simplify GPTJBlock.forward to return hidden_states directly
  • Remove output_attentions, output_hidden_states, return_dict params from signatures (now handled by decorators)
  • Propagate changes to CodeGen via # Copied from annotation

Net reduction of ~70 lines of boilerplate code.
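The `return_dict` handling that `@can_return_tuple` centralizes can be pictured with a toy sketch. This is not the transformers implementation; `ToyCausalLMOutput` and this minimal decorator are invented here purely to illustrate the pattern the PR describes (the forward always builds a structured output, and the decorator downgrades it to a tuple on request):

```python
from dataclasses import dataclass, fields
from functools import wraps
from typing import Optional

def can_return_tuple(forward):
    """Toy sketch: the wrapped forward always returns a structured output
    object; the decorator converts it to a plain tuple when the caller
    passes return_dict=False, so the forward body never branches on it."""
    @wraps(forward)
    def wrapper(*args, return_dict=True, **kwargs):
        output = forward(*args, **kwargs)
        return output if return_dict else output.to_tuple()
    return wrapper

@dataclass
class ToyCausalLMOutput:
    # Hypothetical stand-in for a ModelOutput-style dataclass.
    logits: Optional[int] = None
    loss: Optional[float] = None

    def to_tuple(self):
        # Keep only the fields that were actually populated.
        return tuple(v for f in fields(self) if (v := getattr(self, f.name)) is not None)

@can_return_tuple
def forward(input_ids):
    # The body builds the rich output unconditionally.
    return ToyCausalLMOutput(logits=input_ids * 2)
```

Calling `forward(3)` yields the dataclass, while `forward(3, return_dict=False)` yields `(6,)`; the `return_dict` parameter disappears from every forward signature because the decorator owns it.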

What does this PR do?

This PR migrates the GPT-J model from manual boilerplate output collection to the new standardized decorator-based output tracing system introduced in #43979.

Before: Each forward method manually resolved output_attentions, output_hidden_states, and return_dict from config defaults, maintained accumulator lists (all_hidden_states, all_self_attentions), and conditionally appended outputs in the decoder loop.

After: Decorators (@capture_outputs, @can_return_tuple, @merge_with_config_defaults) and PyTorch forward hooks handle all output collection automatically. GPTJBlock.forward returns only hidden_states (a single tensor) instead of a tuple, and wrapper model forwards use attribute access on the output object.
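The accumulator-free decoder loop can be sketched in miniature. The `Block`/`Model` classes and the decorator below are toy stand-ins invented for illustration; the real system uses PyTorch forward hooks rather than the temporary monkey-patching shown here, but the division of labor is the same: the loop stays clean and the wrapper records intermediate states.

```python
from functools import wraps

class Block:
    """Stand-in for the refactored GPTJBlock: forward returns the new
    hidden state directly, not a (hidden_states, attentions) tuple."""
    def forward(self, hidden_states):
        return hidden_states + 1

def capture_outputs(forward):
    """Toy capture decorator: wrap each block's forward for the duration
    of the call and record every intermediate hidden state, so the model
    forward carries no all_hidden_states accumulator list."""
    @wraps(forward)
    def wrapper(self, hidden_states, output_hidden_states=False):
        recorded, originals = [], []
        if output_hidden_states:
            for block in self.blocks:
                originals.append((block, block.forward))
                def hooked(h, _orig=block.forward):
                    out = _orig(h)
                    recorded.append(out)  # side channel, like a forward hook
                    return out
                block.forward = hooked
        try:
            last_hidden_state = forward(self, hidden_states)
        finally:
            for block, orig in originals:  # always restore the originals
                block.forward = orig
        return last_hidden_state, tuple(recorded) or None
    return wrapper

class Model:
    def __init__(self, num_blocks=3):
        self.blocks = [Block() for _ in range(num_blocks)]

    @capture_outputs
    def forward(self, hidden_states):
        # No `if output_hidden_states:` branches anywhere in the loop.
        for block in self.blocks:
            hidden_states = block.forward(hidden_states)
        return hidden_states
```

With three blocks, `Model().forward(0, output_hidden_states=True)` returns the final state `3` alongside the recorded intermediates `(1, 2, 3)`, while a plain call records nothing.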

Changes to CodeGenBlock were auto-propagated via make fix-repo through the existing # Copied from transformers.models.gptj.modeling_gptj.GPTJBlock annotation.

Tests: all 107 GPT-J model tests pass; 139 are skipped as GPU/Hub-dependent, which is expected on a CPU-only machine.

Fixes #43979 (partial: GPT-J model only)

jayavelubalaji-ai and others added 3 commits February 17, 2026 00:08
Migrate GPT-J from manual boilerplate output collection to the new decorator-based output tracing system
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: codegen, gptj

@jayavelubalaji-ai
Author

Hi @ArthurZucker, could you please review this?


Development

Successfully merging this pull request may close these issues.

Call to contributions: refactor output tracing in transformers
