support skip atten in export #16104
Conversation
@jackzhxng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D88399533.
Summary:
Support export for llama model variants with attention layer skipping. We only need to specify the attention skip pattern in the `layer_types` field of config.json, e.g.:
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"skip_attention",
"skip_attention",
"skip_attention"
]
Differential Revision: D88399533
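Below is a minimal, hypothetical sketch of how such a `layer_types` list from config.json could drive per-layer attention selection. The `FullAttention`/`SkipAttention` classes and the `build_layers` helper are placeholders invented for illustration, not the actual ExecuTorch APIs; the real wiring lives under examples/models/llama.

```python
# Illustrative sketch only: reads "layer_types" from config.json and picks a
# per-layer attention module. Class and helper names are hypothetical.
import json
from typing import List

import torch
import torch.nn as nn


class FullAttention(nn.Module):
    """Placeholder standing in for a real self-attention block."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x  # real attention math would go here


class SkipAttention(nn.Module):
    """Placeholder for a block that bypasses the attention computation."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x  # pass hidden states through unchanged


def build_layers(layer_types: List[str]) -> nn.ModuleList:
    layers = []
    for layer_type in layer_types:
        if layer_type == "skip_attention":
            layers.append(SkipAttention())
        else:
            layers.append(FullAttention())
    return nn.ModuleList(layers)


# Assumes a config.json like the example above, containing "layer_types".
with open("config.json") as f:
    config = json.load(f)

layers = build_layers(config["layer_types"])
```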
Force-pushed dca1f2c to 375d689.
Force-pushed 375d689 to 72fd468.
)
elif (
    model_args.layer_types
    and model_args.layer_types[layer_id] == "skip_attention"
Is 'skip_attention' standard?
Yes, it is a standard name if we want to call https://github.com/pytorch/executorch/blob/main/examples/models/llama/attention.py#L525
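For context on the check in the diff above, here is a minimal sketch, assuming a hypothetical `ModelArgs`-style dataclass, of how a layer's `layer_types` entry selects the skip variant. The names `ModelArgs` and `attention_kind` are illustrative only and not taken from the ExecuTorch codebase.

```python
# Illustrative sketch of the branching shown in the diff: when layer_types is
# present and a layer's entry is "skip_attention", the skip variant is chosen.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelArgs:
    n_layers: int = 6
    layer_types: Optional[List[str]] = None


def attention_kind(model_args: ModelArgs, layer_id: int) -> str:
    if (
        model_args.layer_types
        and model_args.layer_types[layer_id] == "skip_attention"
    ):
        return "skip"
    return "full"


args = ModelArgs(layer_types=["full_attention"] * 3 + ["skip_attention"] * 3)
print([attention_kind(args, i) for i in range(args.n_layers)])
# ['full', 'full', 'full', 'skip', 'skip', 'skip']
```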