pelesh
left a comment
This is great work! My main suggestion is to have functionality tests for CUDA and HIP backends implemented and added to this PR. There are a few minor issues I also mentioned in this PR.

Ascent build fails with the following error message:

You need CUDA architecture 7.0 (or 70) to build this. Otherwise, atomicAdd is defined only for floats but not for doubles. The default is 5.2 for some reason.
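For context on the double-precision issue: when a target has no native atomic add for `double`, the usual workaround is a compare-and-swap retry loop. The sketch below is a host-side C++ analogue of that pattern built on `std::atomic<double>`. It is not code from this PR, and the name `atomicAddDouble` is hypothetical.

```cpp
#include <atomic>

// Hypothetical helper, not part of ReSolve: atomically adds `value` to
// `target` and returns the value held before the addition, mirroring the
// return convention of CUDA's atomicAdd.
inline double atomicAddDouble(std::atomic<double>& target, double value)
{
  double observed = target.load();
  // compare_exchange_weak stores observed + value only if target still
  // holds `observed`; on failure it refreshes `observed` and we retry.
  while (!target.compare_exchange_weak(observed, observed + value)) {
  }
  return observed;
}
```

Building for an architecture that provides the native double-precision instruction avoids the retry loop entirely, which is why setting the CUDA architecture explicitly is the simpler fix here.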

I don't know why the PNNL tests aren't running, but the ORNL tests are now passing @ryandanehy

My latest commit has a working version of the CUDA 11+ compatible SpSV triangular solve in the ILU0 class. I also applied Slaven's dependency comments to 2 files.
cameronrutherford
left a comment
Looking good!

@kswirydo to get this to run the PNNL tests, the PR conflicts need to be fixed (probably best to do it through a rebase). Then the PNNL tests should start running.
Implementations for AMD and NVIDIA GPUs
8eb6d19 to 0769ba4
It seems all we needed was to rebase to
Co-authored-by: pelesh <peless@ornl.gov>
Remaining items before this can be merged:
- Add functionality tests for randomized (F)GMRES on AMD and NVIDIA GPUs
- Use STL math on the host (this will take care of a big chunk of warnings)
- Fix other warnings
- Remove the `roctx.h` header from `LinSolverDirectRocSparseILU0.hpp`.
Almost there ;)
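As an illustration of the "STL math on the host" item, the sketch below uses the `std::` overloads from `<cmath>`, which select the `double` versions directly, so no implicit conversion is involved. It assumes nothing about the actual ReSolve sources; the helper names `vectorNorm2` and `vectorNormInf` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host helper: Euclidean norm computed with std::sqrt.
inline double vectorNorm2(const std::vector<double>& x)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sum += x[i] * x[i];
  }
  return std::sqrt(sum);
}

// Hypothetical host helper: largest absolute entry. std::abs resolves to
// the double overload here, unlike an unqualified abs, which can resolve
// to the integer version and truncate its argument.
inline double vectorNormInf(const std::vector<double>& x)
{
  double result = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    result = std::max(result, std::abs(x[i]));
  }
  return result;
}
```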
Co-authored-by: pelesh <peless@ornl.gov>
d6959e4 to 0ac0955

I fixed our little mishap with rebasing. All that needs to be done is:

GitHub shouldn't allow files that big. pre-commit can also help detect when such large files are added, and custom thresholds can be configured. cc @ryandanehy
pelesh
left a comment
All review issues have been addressed. Merge once CI tests pass.
Working randomized GMRES with preconditioner. You can make it into FGMRES by setting `flexible` to 1.
Caveat: the CUDA version uses an ancient version of the triangular solve, `cusparseDcsrsv2` (I am using an ILU0 preconditioner that needs a triangular solve); despite my best efforts, I was not able to get the more modern version, `cusparseSpSV`, to generate correct results. Hence, the code generates correct results but uses the older version of the triangular solve, so it compiles with tons of warnings about deprecated CUDA functions. I followed this example from the CUDA samples to no avail. See previous commits.
What has been added:
To test, run `runResolveOneMatrix` with a matrix of your choice (I tested with `vas_stokes_1M` from SuiteSparse).
Also fixes #84
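For reference, the triangular solves that the ILU0 preconditioner needs amount to one forward and one backward substitution. The sketch below is a sequential host-side version. It assumes the L and U factors are stored together in a single CSR matrix, with L having an implicit unit diagonal. The `CsrMatrix` struct and `applyIlu0` function are hypothetical names, and this is not the cuSPARSE code path used in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: combined ILU0 factors in CSR format. Entries
// left of the diagonal belong to L (unit diagonal implied); the diagonal
// and entries right of it belong to U.
struct CsrMatrix {
  std::size_t n;
  std::vector<std::size_t> rowPtr;
  std::vector<std::size_t> colIdx;
  std::vector<double> val;
};

// Returns x = U^{-1} L^{-1} b, i.e. the preconditioner applied to b.
inline std::vector<double> applyIlu0(const CsrMatrix& lu,
                                     const std::vector<double>& b)
{
  assert(lu.rowPtr.size() == lu.n + 1 && b.size() == lu.n);
  std::vector<double> x(b);

  // Forward substitution: solve L y = b, overwriting x with y.
  for (std::size_t i = 0; i < lu.n; ++i) {
    for (std::size_t k = lu.rowPtr[i]; k < lu.rowPtr[i + 1]; ++k) {
      const std::size_t j = lu.colIdx[k];
      if (j < i) x[i] -= lu.val[k] * x[j];
    }
  }

  // Backward substitution: solve U x = y, from the last row upward.
  for (std::size_t i = lu.n; i-- > 0;) {
    double diag = 0.0;
    for (std::size_t k = lu.rowPtr[i]; k < lu.rowPtr[i + 1]; ++k) {
      const std::size_t j = lu.colIdx[k];
      if (j > i) {
        x[i] -= lu.val[k] * x[j];
      } else if (j == i) {
        diag = lu.val[k];
      }
    }
    assert(diag != 0.0);  // ILU0 must provide a nonzero pivot in each row
    x[i] /= diag;
  }
  return x;
}
```

A host version like this can serve as a reference when checking the output of a GPU triangular solve on a small matrix.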