Randomized solver (both cuda and hip) by kswirydo · Pull Request #83 · ORNL/ReSolve

kswirydo · 2023-11-28T05:12:52Z

Working randomized GMRES with preconditioner. You can make it into FGMRES by setting flexible to 1.

Caveat: the CUDA version uses the legacy triangular solve cusparseDcsrsv2 (I am using an ILU0 preconditioner, which needs a triangular solve); despite my best efforts, I was not able to get the more modern version, cusparseSpSV, to generate correct results. Hence the code produces correct results, but because it uses the older triangular solve it compiles with tons of warnings about deprecated CUDA functions. I followed this example from the CUDA samples to no avail. See previous commits.

What has been added:

Two randomized classes (one for fast Walsh-Hadamard transform based sketching and one for Count-Sketch based sketching), with CUDA and HIP kernels for both.
A randomized solver that deploys Theta (sketching).
CUDA- and HIP-based examples.
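
For readers unfamiliar with Count-Sketch, the idea can be illustrated on the host as below. This is a minimal sketch of the general technique, not the ReSolve implementation: the names `CountSketch`, `makeCountSketch`, and `applyCountSketch` are hypothetical and do not appear in this PR. Each input index is assigned a random bucket and a random sign, and the sketched vector accumulates the signed entries per bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper type (not part of ReSolve): stores the random bucket
// and random sign assigned to each input index.
struct CountSketch {
  std::size_t k;                    // sketch size (number of buckets)
  std::vector<std::size_t> bucket;  // bucket[i] lies in [0, k)
  std::vector<int> sign;            // sign[i] is +1 or -1
};

// Draw a random Count-Sketch for input vectors of length n.
CountSketch makeCountSketch(std::size_t n, std::size_t k, unsigned seed)
{
  assert(k > 0);
  std::mt19937 gen(seed);
  std::uniform_int_distribution<std::size_t> pickBucket(0, k - 1);
  std::bernoulli_distribution coin(0.5);

  CountSketch s{k, std::vector<std::size_t>(n), std::vector<int>(n)};
  for (std::size_t i = 0; i < n; ++i) {
    s.bucket[i] = pickBucket(gen);
    s.sign[i]   = coin(gen) ? 1 : -1;
  }
  return s;
}

// Apply the sketch: y[bucket[i]] += sign[i] * x[i] for every index i.
std::vector<double> applyCountSketch(const CountSketch& s,
                                     const std::vector<double>& x)
{
  assert(x.size() == s.bucket.size());
  std::vector<double> y(s.k, 0.0);
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[s.bucket[i]] += s.sign[i] * x[i];
  }
  return y;
}
```

Several input indices can land in the same bucket, so a parallel GPU version of this scatter-add generally needs atomic additions or an equivalent reduction.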

To test, run runResolveOneMatrix with a matrix of your choice (I tested with vas_stokes_1M from SuiteSparse).

Also fixes #84

pelesh

This is great work! My main suggestion is to have functionality tests for CUDA and HIP backends implemented and added to this PR. There are a few minor issues I also mentioned in this PR.

pelesh · 2023-11-28T17:08:19Z

The Ascent build fails with the following error message:

Building CUDA object resolve/cuda/CMakeFiles/resolve_backend_cuda.dir/MemoryUtils.cu.o
/gpfs/wolf/proj-shared/csc359/ci/489235/resolve/cuda/cudaKernels.cu(157): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (double *, double)
1 error detected in the compilation of "/gpfs/wolf/proj-shared/csc359/ci/489235/resolve/cuda/cudaKernels.cu".
make[2]: *** [resolve/cuda/CMakeFiles/resolve_backend_cuda.dir/build.make:76: resolve/cuda/CMakeFiles/resolve_backend_cuda.dir/cudaKernels.cu.o] Error 1

kswirydo · 2023-11-28T18:15:34Z

You need CUDA architecture 7.0 (or 70) to build this. Otherwise atomicAdd is defined only for floats, not for doubles. The default is 5.2 somehow.

cameronrutherford · 2023-11-28T19:26:47Z

I don't know why PNNL tests aren't running, but now ORNL tests are passing @ryandanehy

kswirydo · 2023-11-29T06:27:42Z

My latest commit has a working version of the CUDA 11+ compatible SpSV triangular solve in the ILU0 class. I also applied Slaven's dependency comments to 2 files.

cameronrutherford

Looking good!

ryandanehy · 2023-11-29T16:43:35Z

@kswirydo to get this to run PNNL tests, the PR conflicts need to be fixed (probably best to do it through a rebase). Then PNNL tests should start running.

pelesh · 2023-11-29T17:02:21Z

> @kswirydo to get this to run PNNL tests, the PR conflicts need to be fixed (probably best to do it through a rebase). Then PNNL tests should start running.

It seems all we needed was to rebase to develop, not sure why.

pelesh

Remaining items before this can be merged:

Add functionality tests for randomized (F)GMRES on AMD and NVIDIA GPUs
Use STL math on the host (this will take care of a big chunk of warnings)
Fix other warnings
Remove roctx.h header from LinSolverDirectRocSparseILU0.hpp.

Almost there ;)

pelesh · 2023-11-30T19:11:10Z

I fixed our little mishap with rebasing. All that remains is to:

Generate test matrices inside the functionality tests. This is the lesser evil compared to keeping a 170 MB test matrix in the repo.
Re-add the randomized solver tests to the CMake build and testing.
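
One way a functionality test could build its matrix in code instead of reading a large file is sketched below. This is an assumption for illustration only, not the generator used in this PR: the `CsrMatrix` struct and the functions `makeLaplacian1D` and `csrMatVec` are hypothetical names, and a 1D Laplacian is chosen simply because its entries are known in closed form. Setting the right-hand side to `A * ones` gives a system whose exact solution is the vector of ones, so the solver result can be checked without a reference file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container (not a ReSolve class).
struct CsrMatrix {
  std::size_t n;                    // number of rows and columns
  std::vector<std::size_t> rowPtr;  // size n + 1
  std::vector<std::size_t> colIdx;  // size nnz
  std::vector<double> val;          // size nnz
};

// Build the n x n tridiagonal matrix with 2 on the diagonal and -1 on the
// first sub- and super-diagonals, stored in CSR format.
CsrMatrix makeLaplacian1D(std::size_t n)
{
  assert(n > 0);
  CsrMatrix A;
  A.n = n;
  A.rowPtr.reserve(n + 1);
  A.rowPtr.push_back(0);
  for (std::size_t i = 0; i < n; ++i) {
    if (i > 0) {
      A.colIdx.push_back(i - 1);
      A.val.push_back(-1.0);
    }
    A.colIdx.push_back(i);
    A.val.push_back(2.0);
    if (i + 1 < n) {
      A.colIdx.push_back(i + 1);
      A.val.push_back(-1.0);
    }
    A.rowPtr.push_back(A.colIdx.size());
  }
  return A;
}

// Compute y = A * x; used here to manufacture a right-hand side with a
// known solution.
std::vector<double> csrMatVec(const CsrMatrix& A, const std::vector<double>& x)
{
  assert(x.size() == A.n);
  std::vector<double> y(A.n, 0.0);
  for (std::size_t i = 0; i < A.n; ++i) {
    for (std::size_t p = A.rowPtr[i]; p < A.rowPtr[i + 1]; ++p) {
      y[i] += A.val[p] * x[A.colIdx[p]];
    }
  }
  return y;
}
```

A generated matrix like this costs a few lines of test code and no repository space; whether it exercises the preconditioner as well as a real application matrix does is a separate question.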

cameronrutherford · 2023-11-30T20:45:36Z

> I fixed our little mishap with rebasing. All that remains is to:
>
> Generate test matrices inside the functionality tests. This is the lesser evil compared to keeping a 170 MB test matrix in the repo.
>
> Re-add the randomized solver tests to the CMake build and testing.

GitHub shouldn't allow files that big. pre-commit can also help detect when these large files are added, and custom thresholds can be configured cc @ryandanehy

pelesh

All review issues have been addressed. Merge once CI tests pass.

kswirydo requested review from cameronrutherford and pelesh November 28, 2023 05:12

pelesh requested changes Nov 28, 2023

pelesh mentioned this pull request Nov 28, 2023

Setting performance tuning parameters for GPU kernels #85

Open

pelesh added the enhancement New feature or request label Nov 28, 2023

pelesh added this to the First Release milestone Nov 28, 2023

pelesh mentioned this pull request Nov 29, 2023

Abstract GPU kernels from the algorithm code #87

Closed

cameronrutherford requested changes Nov 29, 2023

This was referenced Nov 29, 2023

Consider using smart pointers in ReSolve #88

Open

Consider using const in ReSolve code more aggressively #89

Open

kswirydo mentioned this pull request Nov 29, 2023

CMake has in built list operands - use them in CmakeLists. #90

Closed

Randomized (F)GMRES solver with ILU0 preconditioner

0769ba4

Implementations for AMD and NVIDIA GPUs

pelesh force-pushed the rand-gmres-dev2 branch from 8eb6d19 to 0769ba4 Compare November 29, 2023 16:53

pelesh reviewed Nov 29, 2023

Comment thread resolve/LinSolverDirectRocSparseILU0.hpp Outdated

pelesh mentioned this pull request Nov 29, 2023

Install only needed headers #91

Closed

pelesh reviewed Nov 29, 2023

Comment thread resolve/LinSolverDirectRocSparseILU0.cpp Outdated

Update resolve/LinSolverDirectRocSparseILU0.hpp

0315b5b

Co-authored-by: pelesh <peless@ornl.gov>

pelesh requested changes Nov 29, 2023

Update resolve/LinSolverDirectRocSparseILU0.cpp

76c24fd

Co-authored-by: pelesh <peless@ornl.gov>

pelesh mentioned this pull request Nov 30, 2023

Precommit addition to the repo #82

Closed

pelesh and others added 3 commits November 29, 2023 20:13

Fix warnings in rand-gmres-dev2 branch. (#93)

36bad31

fixing the failing rocm rand solver

03d3a66

Fix bugs, add functionality tests.

0ac0955

pelesh force-pushed the rand-gmres-dev2 branch from d6959e4 to 0ac0955 Compare November 30, 2023 19:07

kswirydo and others added 3 commits December 1, 2023 21:30

working uda rand solver test

927433e

working rand HIP test

d217b5b

Fix CMake issue that was tripping functionality tests.

7fa67ad

pelesh approved these changes Dec 2, 2023

pelesh self-assigned this Dec 2, 2023

pelesh merged commit 4c009e6 into develop Dec 2, 2023

pelesh deleted the rand-gmres-dev2 branch December 11, 2023 16:57
