
mllama outputs refactor #39643

Merged
itazap merged 9 commits into main from mllama_new_outputs on Jul 28, 2025
Conversation

@itazap (Collaborator) commented on Jul 24, 2025

Refactor using the latest outputs merge.
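
For context, here is a minimal, self-contained toy sketch of the hook-based recording pattern this refactor moves to. It is a hypothetical simplification, not transformers' actual implementation: the real machinery is `OutputRecorder` and `@check_model_inputs` (used in the diff below), whose exact signatures may differ.

```python
# Toy re-implementation of the recording idea (hypothetical, simplified).
import torch
from torch import nn

class TinyAttention(nn.Module):
    def forward(self, hidden_states):
        # By convention, index 0 of the return tuple is hidden states
        # and index 1 is attention weights.
        attn_weights = torch.softmax(
            hidden_states @ hidden_states.transpose(-1, -2), dim=-1
        )
        return attn_weights @ hidden_states, attn_weights

class TinyModel(nn.Module):
    # Declarative map: which submodule classes produce which recorded
    # outputs, and at which position of their return tuple.
    _can_record_outputs = {
        "hidden_states": [(TinyAttention, 0)],
        "attentions": [(TinyAttention, 1)],
    }

    def __init__(self, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList(TinyAttention() for _ in range(num_layers))

    def forward(self, hidden_states, record=False):
        records = {name: [] for name in self._can_record_outputs} if record else None
        hooks = []
        if record:
            # Attach forward hooks so submodule outputs are recorded
            # transparently, instead of threading output_* flags through
            # every layer's signature and return value.
            for name, targets in self._can_record_outputs.items():
                for cls, index in targets:
                    for module in self.modules():
                        if isinstance(module, cls):
                            hooks.append(module.register_forward_hook(
                                lambda m, args, out, n=name, i=index: records[n].append(out[i])
                            ))
        for layer in self.layers:
            hidden_states, _ = layer(hidden_states)
        for handle in hooks:
            handle.remove()
        return hidden_states, records

out, recorded = TinyModel()(torch.randn(1, 4, 8), record=True)
print({k: len(v) for k, v in recorded.items()})  # {'hidden_states': 2, 'attentions': 2}
```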

@HuggingFaceDocBuilderDev commented:
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@itazap force-pushed the mllama_new_outputs branch from e0beb91 to d409f10 on July 28, 2025 08:07
@ArthurZucker (Collaborator) left a comment:

Nice cleanup!

Five outdated comment threads on src/transformers/models/mllama/modeling_mllama.py
Comment on lines +766 to +770
"hidden_states": [
OutputRecorder(MllamaTextSelfAttention, index=0),
OutputRecorder(MllamaTextCrossAttention, index=0),
],
"attentions": [
OutputRecorder(MllamaTextSelfAttention, index=1, layer_name="self_attn"),
OutputRecorder(MllamaTextSelfAttention, index=1, layer_name="cross_attn"),
OutputRecorder(MllamaTextCrossAttention, index=1, layer_name="cross_attn"),
],
Collaborator commented:

Index can be optional; by default, hidden states are at index 0 and attentions at index 1.

Collaborator (Author) replied:

Updated `hidden_states` to `"hidden_states": [MllamaSelfAttentionDecoderLayer, MllamaCrossAttentionDecoderLayer]`, but `attentions` still uses explicit recorders on the attention layers (see the sketch below).
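
For reference, a sketch of what the updated mapping looks like after this change. This is illustrative only: the `hidden_states` entry follows the comment above, while the `attentions` entries are carried over from the earlier snippet and may not match the merged code exactly.

```python
_can_record_outputs = {
    # Plain layer classes suffice here: hidden states sit at the default index 0.
    "hidden_states": [MllamaSelfAttentionDecoderLayer, MllamaCrossAttentionDecoderLayer],
    # Attention weights sit at index 1 and need layer_name disambiguation,
    # so explicit recorders are kept.
    "attentions": [
        OutputRecorder(MllamaTextSelfAttention, index=1, layer_name="self_attn"),
        OutputRecorder(MllamaTextSelfAttention, index=1, layer_name="cross_attn"),
        OutputRecorder(MllamaTextCrossAttention, index=1, layer_name="cross_attn"),
    ],
}
```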

Comment thread src/transformers/models/mllama/modeling_mllama.py Outdated
@itazap force-pushed the mllama_new_outputs branch from af0e94c to 3bd4c0a on July 28, 2025 12:20
@itazap requested a review from ArthurZucker on July 28, 2025 12:41
@itazap force-pushed the mllama_new_outputs branch from f7811d4 to 4ebcc20 on July 28, 2025 12:42
@ArthurZucker (Collaborator) left a comment:

The decoder layer should only need to return hidden states, never the attention weights, no?
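
A toy illustration of the suggestion, assuming hook-based recorders (hypothetical class names, not the actual mllama code): the attention module still returns its weights at index 1 for a recorder to pick up, but the decoder layer itself returns only hidden states.

```python
import torch
from torch import nn

class SelfAttention(nn.Module):
    # Still returns (hidden_states, attn_weights); a recorder can read index 1.
    def forward(self, hidden_states):
        attn_weights = torch.softmax(
            hidden_states @ hidden_states.transpose(-1, -2), dim=-1
        )
        return attn_weights @ hidden_states, attn_weights

class DecoderLayer(nn.Module):
    # Returns only hidden states: attention weights are no longer threaded
    # through the layer's return value.
    def __init__(self, dim=8):
        super().__init__()
        self.self_attn = SelfAttention()
        self.mlp = nn.Linear(dim, dim)

    def forward(self, hidden_states):
        hidden_states, _ = self.self_attn(hidden_states)
        return self.mlp(hidden_states)  # a single tensor, not a tuple
```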

Four more outdated comment threads on src/transformers/models/mllama/modeling_mllama.py
@github-actions (Contributor) commented:

[For maintainers] Suggested jobs to run (before merge)

run-slow: mllama

@itazap force-pushed the mllama_new_outputs branch from b71d470 to fa870ff on July 28, 2025 13:38
@itazap merged commit da823fc into main on Jul 28, 2025 (20 checks passed)
@itazap deleted the mllama_new_outputs branch on July 28, 2025 13:59
@itazap restored the mllama_new_outputs branch on July 30, 2025 10:10
zaristei pushed a commit to zaristei/transformers that referenced this pull request on Sep 9, 2025:
* mllama outputs refactor

* forgot kwargs

* fix output

* add can_record_outputs

* correct @check_model_inputs placement

* ruff and copies

* rebase

* feedback

* only return hidden_states

---------

Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-161-153.ec2.internal>
Co-authored-by: ita.zaporozhets@huggingface.co <ita_zaporozhets@ip-26-0-162-14.ec2.internal>
zaristei pushed the same commit to zaristei/transformers six more times on Sep 9, 2025, with an identical commit message each time.

Labels: none yet · Projects: none yet · 3 participants