Fix GPT2 with cross attention #39754

Merged
zucchini-nlp merged 5 commits into huggingface:main from zucchini-nlp:gpt2-cross-atnn-fix
Jul 29, 2025

Conversation

@zucchini-nlp
Member

What does this PR do?

Fixes #39746

@zucchini-nlp
Member Author

run-slow: gpt2, vision_encoder_decoder

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/gpt2', 'models/vision_encoder_decoder']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker added the "for patch" label (Tag issues / labels that should be included in the next patch) on Jul 29, 2025

Comment thread on tests/test_modeling_common.py (outdated)

inputs_dict["output_attentions"] = True
config.output_hidden_states = False
config._attn_implementation = "eager"
Collaborator

we should use config.set_....

@zucchini-nlp
Member Author

run-slow: decision_transformer, gpt2, vision_encoder_decoder

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/decision_transformer', 'models/gpt2', 'models/vision_encoder_decoder']
quantizations: [] ...

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: decision_transformer, gpt2, vision_encoder_decoder

@zucchini-nlp
Member Author

run-slow: decision_transformer, gpt2, vision_encoder_decoder

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ['models/decision_transformer', 'models/gpt2', 'models/vision_encoder_decoder']
quantizations: [] ...

zucchini-nlp merged commit ccb2e0e into huggingface:main on Jul 29, 2025
25 of 26 checks passed
ArthurZucker pushed a commit that referenced this pull request Jul 29, 2025
* fix

* use new mask API

* style

* fix copies and attention tests

* fix head pruning tests
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* fix

* use new mask API

* style

* fix copies and attention tests

* fix head pruning tests
snorkelopstesting1-a11y pushed a commit to snorkel-marlin-repos/huggingface_transformers_pr_39754_98691a15-1ced-432f-b45c-05d8b3775262 that referenced this pull request Oct 11, 2025
Original PR #39754 by zucchini-nlp
Original: huggingface/transformers#39754
snorkelopstesting1-a11y added a commit to snorkel-marlin-repos/huggingface_transformers_pr_39754_98691a15-1ced-432f-b45c-05d8b3775262 that referenced this pull request Oct 11, 2025
snorkelopstesting1-a11y pushed a commit to snorkel-marlin-repos/huggingface_transformers_pr_39754_44d67845-f462-43d8-ade1-6ef6cd744afd that referenced this pull request Oct 11, 2025
Original PR #39754 by zucchini-nlp
Original: huggingface/transformers#39754
snorkelopstesting1-a11y added a commit to snorkel-marlin-repos/huggingface_transformers_pr_39754_44d67845-f462-43d8-ade1-6ef6cd744afd that referenced this pull request Oct 11, 2025
Labels

for patch: Tag issues / labels that should be included in the next patch

Development

Successfully merging this pull request may close these issues.

encoder decoder model compile failed after refactor cache

3 participants