Error handling for non-sm80/sm90 GPUs when using fused attention #393
Merged: ksivaman merged 6 commits into NVIDIA:main, Aug 25, 2023
Conversation
Force-pushed from 8b81606 to 1cc7ee8
Collaborator (Author): /te-ci
Signed-off-by: Reese Wang <rewang@nvidia.com>
Force-pushed from 1cc7ee8 to 1a3f728
Collaborator (Author): /te-ci
Collaborator (Author): @timmoon10 @ksivaman The CI reports "no space left on device" when initializing the container; could you take a look? Thanks
timmoon10 approved these changes, Aug 21, 2023
timmoon10 (Collaborator) left a comment: LGTM
@ksivaman @ptrendx This is relevant to our discussion on the common headers at #382 (comment).
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com> Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
ksivaman approved these changes, Aug 21, 2023
Collaborator: BTW, no reason to worry about the GitHub CI failures as long as the GitLab tests all pass. The provided nodes don't seem to be beefy enough to handle the current PyTorch container, and I'm still thinking about workarounds.
cyanguwa reviewed, Aug 21, 2023
Signed-off-by: Reese Wang <rewang@nvidia.com>
Force-pushed from 4acce81 to f195aa4
Collaborator (Author): /te-ci
Collaborator (Author): /te-ci
timmoon10 reviewed, Aug 23, 2023
cyanguwa reviewed, Aug 23, 2023
Signed-off-by: Reese Wang <rewang@nvidia.com>
Signed-off-by: Reese Wang <rewang@nvidia.com>
Collaborator (Author): /te-ci
janekb04 pushed a commit to janekb04/TransformerEngine that referenced this pull request, Sep 1, 2023
…A#393)

* Fused attention kernel only supports sm80 and sm90 (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Update transformer_engine/jax/csrc/modules.cpp (Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>; Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>)
* arbitrary fused kernel supports sm86/sm89 after 8.9.3 (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Skip sm70 (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Forward is_fused_attn_kernel_available to cpp backend (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Remove cpp is_fused_attn_available API (Signed-off-by: Reese Wang <rewang@nvidia.com>)

Signed-off-by: Reese Wang <rewang@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Jan Bielak <jbielak@nvidia.com>
RuiWang1998 pushed a commit to RuiWang1998/TransformerEngine that referenced this pull request, Sep 11, 2023
…A#393)

* Fused attention kernel only supports sm80 and sm90 (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Update transformer_engine/jax/csrc/modules.cpp (Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>; Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>)
* arbitrary fused kernel supports sm86/sm89 after 8.9.3 (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Skip sm70 (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Forward is_fused_attn_kernel_available to cpp backend (Signed-off-by: Reese Wang <rewang@nvidia.com>)
* Remove cpp is_fused_attn_available API (Signed-off-by: Reese Wang <rewang@nvidia.com>)

Signed-off-by: Reese Wang <rewang@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Rui Wang <rui@helixon.com>
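The error handling these commits add can be sketched as a small guard that fails fast on unsupported GPUs. This is a hypothetical Python helper for illustration only: the actual check lives in the C++ backend (transformer_engine/jax/csrc/modules.cpp), and the function name, compute-capability encoding (e.g. `80` for sm80), and error type here are assumptions, not TransformerEngine's real API.

```python
def assert_fused_attn_supported(compute_capability: int) -> None:
    """Raise a clear error on GPUs the max-512-seqlen fused attention
    kernel does not support (per this PR: only sm80 and sm90).

    Hypothetical sketch; the real check is done in the C++ backend.
    """
    if compute_capability not in (80, 90):
        raise NotImplementedError(
            f"Fused attention is not supported on sm{compute_capability}; "
            "only sm80 and sm90 are supported by the max-512-seqlen kernel.")
```

For example, sm70 (skipped in the PR's tests) would raise, while sm80 and sm90 pass silently.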
The cuDNN max-512-seqlen fused kernel only supports sm80 and sm90, and the arbitrary-seqlen kernel requires cuDNN 8.9.3 to support all compute capabilities >= 80.
`is_fused_attn_available` API for different attention setups
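The support matrix described above can be sketched as a single predicate. A minimal sketch, assuming a particular encoding: the name echoes the PR's `is_fused_attn_kernel_available`, but the signature, integer compute-capability values (e.g. `86` for sm86), and cuDNN version tuples are illustrative assumptions, not TransformerEngine's actual API.

```python
def is_fused_attn_kernel_available(compute_capability: int,
                                   cudnn_version: tuple,
                                   max_seqlen: int) -> bool:
    """Sketch of the availability rule in this PR:

    - max-512-seqlen fused kernel: sm80 and sm90 only
    - arbitrary-seqlen fused kernel: sm80/sm90, plus any CC >= 80
      (e.g. sm86/sm89) when cuDNN >= 8.9.3
    """
    if max_seqlen <= 512:
        return compute_capability in (80, 90)
    if cudnn_version >= (8, 9, 3):
        return compute_capability >= 80
    return compute_capability in (80, 90)
```

Under these assumptions, sm86 with cuDNN 8.9.3 qualifies for the arbitrary-seqlen kernel but not the max-512-seqlen one, and sm70 never qualifies.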