
[Refactor] Migrate MPT to standardized output tracing decorators #44071

Open
ArivunidhiA wants to merge 1 commit into huggingface:main from ArivunidhiA:refactor/mpt-output-tracing-43979

@ArivunidhiA

What does this PR do?

Refactors the MPT model to use the new standardized output collection interface as part of #43979.

Changes:

  • Added _can_record_outputs to MptPreTrainedModel, mapping hidden_states → MptBlock and attentions → MptAttention
  • Added @merge_with_config_defaults and @capture_outputs to MptModel.forward
  • Added @can_return_tuple to MptForCausalLM, MptForSequenceClassification, MptForTokenClassification, and MptForQuestionAnswering
  • Removed manual output_attentions/output_hidden_states/return_dict parameter resolution and collection loops from MptModel.forward
  • Simplified MptBlock.forward to return only hidden_states (attentions captured via hooks)
  • Removed manual return_dict handling from all wrapper forwards
  • Updated wrapper forwards to access outputs via named attributes (e.g., .last_hidden_state) instead of index
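
For readers unfamiliar with the pattern, here is a rough, dependency-free sketch of the idea behind the changes above: a class-level mapping says which submodule produces which output, and a decorator installs temporary hooks instead of the forward pass collecting outputs by hand. Everything in it (`ToyModule`, `ToyAttention`, `ToyBlock`, `ToyModel`, `toy_capture_outputs`) is hypothetical and is not the transformers implementation; in particular, this toy records only block outputs as hidden states, which may differ from what the real decorator collects.

```python
import functools


class ToyModule:
    """Hypothetical stand-in for a module that supports forward hooks."""

    def __init__(self):
        self._hooks = []

    def __call__(self, *args, **kwargs):
        out = self.forward(*args, **kwargs)
        for hook in list(self._hooks):
            hook(self, out)
        return out

    def modules(self):
        """Yield this module and every nested ToyModule, depth first."""
        yield self
        for value in vars(self).values():
            children = value if isinstance(value, list) else [value]
            for child in children:
                if isinstance(child, ToyModule):
                    yield from child.modules()


class ToyAttention(ToyModule):
    def forward(self, x):
        # Returns (attention output, "attention weights"); both are toy values.
        return x * 2, -x


class ToyBlock(ToyModule):
    def __init__(self):
        super().__init__()
        self.attn = ToyAttention()

    def forward(self, x):
        attn_out, _ = self.attn(x)
        # The block returns only hidden states; weights are captured by hooks.
        return attn_out + 1


def toy_capture_outputs(forward):
    """Collect requested outputs via temporary hooks (assumed design)."""

    @functools.wraps(forward)
    def wrapper(self, x, output_hidden_states=False, output_attentions=False):
        wanted = {"hidden_states": output_hidden_states, "attentions": output_attentions}
        collected = {key: [] for key, on in wanted.items() if on}
        installed = []
        for key, (cls, index) in self._can_record_outputs.items():
            if key not in collected:
                continue
            for module in self.modules():
                if isinstance(module, cls):
                    def hook(_module, out, key=key, index=index):
                        collected[key].append(out if index is None else out[index])
                    module._hooks.append(hook)
                    installed.append((module, hook))
        try:
            result = forward(self, x)
        finally:
            # Always remove the hooks so later calls are unaffected.
            for module, hook in installed:
                module._hooks.remove(hook)
        result.update({key: tuple(values) for key, values in collected.items()})
        return result

    return wrapper


class ToyModel(ToyModule):
    # output name -> (module class to hook, index into its output or None)
    _can_record_outputs = {
        "hidden_states": (ToyBlock, None),
        "attentions": (ToyAttention, 1),
    }

    def __init__(self):
        super().__init__()
        self.blocks = [ToyBlock(), ToyBlock()]

    @toy_capture_outputs
    def forward(self, x):
        # No collection loop here: the forward pass only computes the result.
        for block in self.blocks:
            x = block(x)
        return {"last_hidden_state": x}


model = ToyModel()
out = model(1, output_hidden_states=True, output_attentions=True)
plain = model(1)  # nothing requested, so only last_hidden_state is returned
```

Note that the forward pass itself never mentions `output_attentions` or `output_hidden_states`; that is the simplification this PR aims for in MptModel.forward and MptBlock.forward.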

Testing:

All 111 MPT tests pass locally.

Fixes part of #43979

Migrate MPT model to use @capture_outputs and @can_return_tuple
decorators as part of huggingface#43979.

- Add _can_record_outputs to MptPreTrainedModel
- Add @merge_with_config_defaults and @capture_outputs to MptModel.forward
- Add @can_return_tuple to MptForCausalLM, MptForSequenceClassification,
  MptForTokenClassification, and MptForQuestionAnswering
- Remove manual output_attentions/output_hidden_states/return_dict handling
- Simplify MptBlock.forward to return only hidden_states
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: mpt
