[Auto] Merge output tracing refactors across selected models (cluster-43979-11): merged 5 of 10 PRs#17

Closed
evalstate wants to merge 18 commits into main from merge-cluster-cluster-43979-11-20260423223633

Conversation

@evalstate
Owner

Automated cluster merge for cluster-43979-11 against main.

Merged PRs:

Skipped PRs:

Failed PRs:

Notes:

Next steps:

  • Optionally run make style or make fix-repo to normalize imports and keep generated code consistent before any further handoff.
  • Run targeted tests for the merged model families: ResNet/RegNet/RT-DETR ResNet, MobileNetV2, DeBERTa v2, EfficientNet, GPT-J/CodeGen (see the sketch after this list).
  • If more of the cluster is to be salvaged, SpeechT5 would need a genuine manual forward-port against the current cache_position-based decoder code.
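
A minimal sketch of how those targeted test runs could be driven from Python. The directory names assume the usual `tests/models/<name>/` layout in huggingface/transformers and are not taken from this PR; verify them against your checkout.

```python
# Hedged sketch: invoke pytest on the test suites for the merged model
# families. Paths assume the standard tests/models/<name>/ layout in
# huggingface/transformers; adjust if your checkout differs.
import pytest

MERGED_FAMILIES = [
    "resnet", "regnet", "rt_detr",   # RT-DETR ResNet assumed to live under rt_detr
    "mobilenet_v2", "deberta_v2",
    "efficientnet", "gptj", "codegen",
]

if __name__ == "__main__":
    targets = [f"tests/models/{name}" for name in MERGED_FAMILIES]
    raise SystemExit(pytest.main(targets + ["-q"]))
```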

Paritosh Dwivedi and others added 18 commits February 15, 2026 12:56
…tuple

Migrate the GPT-J model to use the new standardized output collection
decorators, replacing manual accumulation of hidden states and attention
weights with hook-based capturing.

Changes:
- Add `_can_record_outputs` to `GPTJPreTrainedModel` mapping hidden_states
  to GPTJBlock and attentions to GPTJAttention
- Add `@capture_outputs` and `@merge_with_config_defaults` to
  `GPTJModel.forward()`
- Add `@can_return_tuple` to all task head models (ForCausalLM,
  ForSequenceClassification, ForQuestionAnswering)
- Remove `output_attentions`, `output_hidden_states`, and `return_dict`
  parameters from all forward signatures
- Remove manual accumulator loops and return_dict branching
- Simplify GPTJBlock to return plain `torch.Tensor` instead of tuple
- Update attention forward signatures to always return
  `(attn_output, attn_weights)` without conditional logic

Resolves huggingface#43979
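
For reference, here is a self-contained sketch of the hook-based capture this commit describes. The `capture_outputs` decorator and `_can_record_outputs` attribute names come from the message above, but the implementation below is illustrative, not the transformers source.

```python
# Minimal sketch of hook-based output capturing: the decorator reads the
# class-level _can_record_outputs table, installs forward hooks on matching
# submodules, and attaches the recorded streams to the model output, so the
# forward body needs no manual accumulator loops.
import functools

import torch
import torch.nn as nn


def capture_outputs(forward):
    @functools.wraps(forward)
    def wrapper(self, *args, **kwargs):
        records = {key: [] for key in self._can_record_outputs}
        handles = []
        for key, module_cls in self._can_record_outputs.items():
            for module in self.modules():
                if isinstance(module, module_cls):
                    handles.append(module.register_forward_hook(
                        lambda mod, inputs, output, key=key: records[key].append(output)
                    ))
        try:
            output = forward(self, *args, **kwargs)
        finally:
            for handle in handles:
                handle.remove()
        output.update({key: tuple(vals) for key, vals in records.items()})
        return output
    return wrapper


class Block(nn.Module):  # stand-in for GPTJBlock: returns a plain Tensor
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.lin(x))


class TinyModel(nn.Module):  # stand-in for GPTJModel
    _can_record_outputs = {"hidden_states": Block}

    def __init__(self, dim=8, n_layers=2):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(n_layers))

    @capture_outputs
    def forward(self, x):
        for block in self.blocks:
            x = block(x)  # no manual accumulation: hooks do the recording
        return {"last_hidden_state": x}


out = TinyModel()(torch.randn(1, 4, 8))
assert len(out["hidden_states"]) == 2  # one entry per Block forward pass
```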
The CodeGenBlock is a documented copy of GPTJBlock. This syncs it
to match the updated signature after removing the output_attentions
parameter and simplifying the return type to a plain torch.Tensor.

Generated via `python utils/check_copies.py --fix_and_overwrite`.
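
For context, the "Copied from" mechanism works via a marker comment that `utils/check_copies.py` scans; with `--fix_and_overwrite` it rewrites the copy to match its source, applying the stated rename. An illustrative (not verbatim) excerpt:

```python
import torch.nn as nn


# Copied from transformers.models.gptj.modeling_gptj.GPTJBlock with GPTJ->CodeGen
class CodeGenBlock(nn.Module):
    # check_copies.py keeps this body in sync with GPTJBlock, applying the
    # GPTJ->CodeGen rename; hand edits that diverge from the source are
    # flagged (or overwritten) by the consistency check.
    ...
```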
The previous commit auto-synced CodeGenBlock.forward() with the
refactored GPTJBlock, but CodeGenModel still passes output_attentions
to CodeGenBlock and expects a tuple return. Since the CodeGen model
has not been refactored to use the new decorators yet, restore
CodeGenBlock's original forward() signature and remove the
'# Copied from' directive to decouple it from GPTJBlock until
CodeGen gets its own output tracing refactor.
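
A simplified, self-contained stand-in (not the actual CodeGen source) for the restored contract: the block keeps the conditional tuple return that the un-refactored CodeGenModel still expects.

```python
import torch
import torch.nn as nn


class CodeGenBlockLike(nn.Module):
    """Stand-in keeping the pre-refactor tuple-returning contract."""

    def __init__(self, dim=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=2, batch_first=True)

    def forward(self, hidden_states, output_attentions=False):
        attn_output, attn_weights = self.attn(hidden_states, hidden_states, hidden_states)
        outputs = (attn_output + hidden_states,)  # residual connection
        if output_attentions:
            outputs = outputs + (attn_weights,)
        return outputs  # tuple, as the un-refactored caller expects


hidden = torch.randn(1, 4, 8)
out = CodeGenBlockLike()(hidden, output_attentions=True)
assert len(out) == 2  # (hidden_states, attn_weights)
```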
…23633

# Conflicts:
#	src/transformers/models/regnet/modeling_regnet.py
#	src/transformers/models/resnet/modeling_resnet.py
…23633

# Conflicts:
#	src/transformers/models/mobilenet_v2/modeling_mobilenet_v2.py
…23633

# Conflicts:
#	src/transformers/models/deberta_v2/modeling_deberta_v2.py
…23633

# Conflicts:
#	src/transformers/models/efficientnet/modeling_efficientnet.py
…23633

# Conflicts:
#	src/transformers/models/gptj/modeling_gptj.py

evalstate closed this on Apr 24, 2026