Conversation
Pipeline 9489089.
Avoid split-k kernels on Ada. Signed-off-by: Tim Moon <tmoon@nvidia.com>
Running on an L40, I found that the JAX FP8 GEMM tests on integer matrices were failing. It seems cuBLAS chooses a split-k kernel that prevents us from getting bit-wise identical results, although the output is still within the expected FP8 error. I've changed the matrix dims to help cuBLAS pick a friendlier kernel. Pipeline 9504876.
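For reference, the test pattern at issue looks roughly like the sketch below. This is not the actual TE test; `fp8_gemm` is a hypothetical stand-in for the FP8 GEMM under test. The point is that integer-valued inputs are exact in FP8, so bit-wise agreement with a plain float32 reference is the expected outcome unless the kernel changes how the K reduction is accumulated, as a split-k kernel can.

```python
# Sketch only, not the actual TE test; `fp8_gemm` is a hypothetical placeholder.
import jax
import jax.numpy as jnp

def fp8_gemm(a, b):
    # Placeholder for the FP8 GEMM under test.
    return a @ b

key_a, key_b = jax.random.split(jax.random.PRNGKey(0))
# Integer-valued entries are exactly representable in FP8, so the result is
# normally expected to match a float32 reference bit-wise.
a = jax.random.randint(key_a, (64, 32), -2, 3).astype(jnp.float32)
b = jax.random.randint(key_b, (32, 128), -2, 3).astype(jnp.float32)

ref = a @ b           # reference GEMM in float32
out = fp8_gemm(a, b)  # FP8 GEMM under test
# Exact agreement is expected unless cuBLAS selects a split-k kernel, whose
# K-dimension reduction can round differently; adjusting the matrix dims
# steers it toward a non-split-k kernel and restores bit-wise agreement.
assert jnp.array_equal(out, ref)
```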
Pipeline 9617488 is green.
Force-pushed from 6526b44 to b79b163.
I've tweaked the PyTorch and JAX fused attention tests so we check whether there's a supported backend (relatedly, #403 adds some PyTorch attention tests and #411 adds backend detection logic to Paddle). We should hold off on merging until those are in.
This PR is now good to go, pending
Force-pushed from b06201d to 0a864fc.
cyanguwa left a comment:
Just that one comment, otherwise looks good!
Review suggestion from @cyanguwa Signed-off-by: Tim Moon <tmoon@nvidia.com>
Tests passed in pipeline 10211932.
This applies the changes from #393 to the PyTorch and Paddle tests. In particular, tests involving cuDNN fused attention now run only on compute capabilities 8.0 and 9.0.
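A rough sketch of that gating, assuming pytest-style tests and a plain `torch.cuda.get_device_capability()` check; the helper name `_fused_attn_supported` is illustrative and the actual skip logic in the tests may differ:

```python
# Sketch of the compute-capability gating described above, not the exact test code.
import pytest
import torch

def _fused_attn_supported() -> bool:
    # Hypothetical helper: cuDNN fused attention is only exercised on
    # sm80 (compute capability 8.0) and sm90 (compute capability 9.0).
    if not torch.cuda.is_available():
        return False
    return torch.cuda.get_device_capability() in [(8, 0), (9, 0)]

@pytest.mark.skipif(
    not _fused_attn_supported(),
    reason="cuDNN fused attention requires compute capability 8.0 or 9.0",
)
def test_dot_product_attention_fused():
    ...  # fused attention test body goes here
```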