Fix Inefficient GELU implementation in GPT2 #40059

Open
null-pointer-access wants to merge 1 commit into huggingface:main from null-pointer-access:fix-gpt2-gelu-performance

@null-pointer-access
Contributor

What does this PR do?

Fixes #39073 by using the fused GELU instead of the custom implementation in GPT2. This can improve end-to-end efficiency by up to 12%.
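
For reference, here is a minimal sketch of the two formulas involved. It assumes `NewGELUActivation` is the tanh approximation of GELU and `GELUActivation` is the erf-based definition, as the class names in the commit message suggest. It uses only the Python standard library rather than the library's tensor code, so it illustrates the numerics and not the speedup. The names `gelu_exact` and `gelu_tanh` are illustrative.

```python
import math


def gelu_exact(x: float) -> float:
    """Erf-based GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (assumed to match the custom implementation)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Compare the two on a grid over [-6, 6] with step 0.01.
xs = [i / 100.0 for i in range(-600, 601)]
max_abs_diff = max(abs(gelu_exact(x) - gelu_tanh(x)) for x in xs)
print(f"max |exact - tanh| on [-6, 6]: {max_abs_diff:.2e}")
```

The two curves are close but not identical, so under these assumptions the switch changes outputs slightly as well as changing speed.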

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

cc @ArthurZucker

… (uses NewGELUActivation) to gelu (uses GELUActivation)
@github-actions
Contributor

github-actions Bot commented Aug 9, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: gpt2

Collaborator

@ArthurZucker left a comment

BTW I am not sure this change is worth it given that it will not change anything for models already pushed online to the hub!
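
A minimal sketch of why that would be the case, assuming published checkpoints store the activation name explicitly in their saved config, so that a changed library default only applies when the field is absent. `LIBRARY_DEFAULTS` and `resolve_config` are hypothetical names for illustration, not the library's API.

```python
# Hypothetical new default after this PR (assumption for illustration).
LIBRARY_DEFAULTS = {"activation_function": "gelu"}


def resolve_config(saved: dict) -> dict:
    """Merge a saved config over the defaults; saved values take precedence."""
    return {**LIBRARY_DEFAULTS, **saved}


# A checkpoint that already records its activation keeps using it.
published = {"activation_function": "gelu_new"}
# A config created from scratch picks up the new default.
fresh = {}

print(resolve_config(published)["activation_function"])
print(resolve_config(fresh)["activation_function"])
```

Under that assumption, only configs created without an explicit activation would see the faster path.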

Development

Successfully merging this pull request may close these issues.

Inefficient default GELU implementation in GPT2

3 participants