Updates for CUDA versions >= 12 by The9Cat · Pull Request #87 · EXP-code/EXP

The9Cat · 2024-10-10T15:22:26Z

Summary of updates

As of CUDA 12.4, the header-only version of NVTX (nvtx3) is the default. The header for the previous version of NVTX is still provided but appears to conflict with nvtx3, so preprocessor flags are used to select the compatible version.
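A minimal sketch of what flag-based header selection could look like. The macro names `SKETCH_USE_NVTX3` and `SKETCH_USE_LEGACY_NVTX`, the `ScopedRange` class, and `nvtx_backend()` are hypothetical and are not taken from the EXP sources; only the two header paths and the `nvtxRangePushA`/`nvtxRangePop` calls are real NVTX names. With neither flag defined, the wrapper compiles to a no-op, so the sketch builds without a CUDA toolkit.

```cpp
#include <string>

// Hypothetical flags, assumed to be set by the build system (e.g. -DSKETCH_USE_NVTX3).
// Only one NVTX header is ever included, so the two versions cannot clash.
#if defined(SKETCH_USE_NVTX3)
#  include <nvtx3/nvToolsExt.h>   // header-only NVTX shipped with recent CUDA toolkits
#  define SKETCH_NVTX_ENABLED 1
#elif defined(SKETCH_USE_LEGACY_NVTX)
#  include <nvToolsExt.h>         // previous NVTX header (needs the NVTX library at link time)
#  define SKETCH_NVTX_ENABLED 1
#else
#  define SKETCH_NVTX_ENABLED 0   // no profiling support: ranges become no-ops
#endif

// Reports which branch was selected at compile time.
inline const char* nvtx_backend() {
#if defined(SKETCH_USE_NVTX3)
  return "nvtx3";
#elif defined(SKETCH_USE_LEGACY_NVTX)
  return "legacy";
#else
  return "none";
#endif
}

// RAII wrapper: pushes a named range on construction, pops it on destruction.
class ScopedRange {
public:
  explicit ScopedRange(const char* name) {
#if SKETCH_NVTX_ENABLED
    nvtxRangePushA(name);
#else
    (void)name;  // nothing to do without NVTX
#endif
  }
  ~ScopedRange() {
#if SKETCH_NVTX_ENABLED
    nvtxRangePop();
#endif
  }
  ScopedRange(const ScopedRange&) = delete;
  ScopedRange& operator=(const ScopedRange&) = delete;
};
```

Callers would only ever see `ScopedRange`, so the choice between NVTX versions stays confined to this one header.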
As of CUDA 12.6, the thrust and cub ABIs encode the NVIDIA architecture, so code compiled through nvcc is inconsistent with the same code compiled with the host compiler alone. The solution here is to compile all code referencing thrust types through nvcc. Many thanks to Georgia Stuart, who contributed the CMakeLists.txt recipes for doing this!
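A toy model of why mixing compilers breaks here. It does not use Thrust's actual macros or namespace names; `sketch`, `abi_cuda_sm80`, `abi_host`, and `vector_like` are all made up for illustration. The assumption is simply that the library places its types in an inline namespace whose name depends on how the translation unit is compiled, so identical source text yields different mangled symbols.

```cpp
#include <string>
#include <typeinfo>

// Stand-in for an ABI tag that depends on the compile mode.
// __CUDACC__ is defined when nvcc is driving the compile.
#if defined(__CUDACC__)
#  define SKETCH_ABI_NS abi_cuda_sm80   // hypothetical architecture-tagged name
#else
#  define SKETCH_ABI_NS abi_host        // hypothetical host-only name
#endif

namespace sketch {
inline namespace SKETCH_ABI_NS {
// User code spells this sketch::vector_like in both modes,
// but the fully qualified (and mangled) name contains the ABI tag.
struct vector_like {
  int size = 0;
};
}  // inline namespace
}  // namespace sketch

// Returns the implementation-defined type name, which includes the ABI tag
// on common compilers.
inline std::string mangled_vector_name() {
  return typeid(sketch::vector_like).name();
}
```

A function taking `sketch::vector_like` and compiled by the host compiler would therefore not link against a caller compiled by nvcc, which is the kind of mismatch that routing every thrust-referencing source file through nvcc avoids.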

michael-petersen

This compiles for me on CUDA 12.6 and runs a basic regression simulation. I have not tried 11.x again to make sure we are backwards compatible. Has anyone else tried this?

michael-petersen · 2024-10-10T18:45:48Z

src/Cylinder.cc

This is a nice maintenance change, forced by the new cuda compile strategy, but I think it also makes the codebase simpler.

Right. The goal here was to get the implementation out of the header where it required NVTX-specific structures.

michael-petersen · 2024-10-10T18:46:05Z

src/ExternalForce.H

I assume this is still for testing?

Looks like cruft to me.

michael-petersen · 2024-10-10T18:46:45Z

src/Orient.H

Are we planning on stylistically removing this everywhere? Fine if so and I'll start when I touch any other code.

Not really stylistic. With most compilers, an include written with "" first searches the directory of the including file and only falls back to the system paths if it finds no match there. Using <> starts the search with the system headers. NVTX.H should always be local. So I would say: use "" when we really mean the local directory.

michael-petersen · 2024-10-10T18:47:43Z

src/global.H

Is there a reason extern int cudaGlobalDevice; now comes earlier, or just convenience?

Yes, cudaGlobalDevice should be declared independently of whether nvcc is guiding the compile. It's not about coming earlier but about getting it out of the __NVCC__ block.
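A small sketch of the layout described in this reply. Only the name `cudaGlobalDevice` and the `__NVCC__` guard come from the thread; the initial value of -1 and the helper `device_selected()` are assumptions added for illustration, and in a real project the definition would live in a single source file rather than next to the declaration.

```cpp
// Declared unconditionally, so translation units built by the host compiler
// and those built by nvcc both see the same symbol.
extern int cudaGlobalDevice;

#if defined(__NVCC__)
// Declarations that genuinely need CUDA types would remain inside this guard.
#endif

// Definition (normally in exactly one .cc file).
// Assumption: -1 is used here as a "no device selected" placeholder.
int cudaGlobalDevice = -1;

// Hypothetical helper: true once a non-negative device index has been assigned.
inline bool device_selected() {
  return cudaGlobalDevice >= 0;
}
```

Host-only code can then read or set `cudaGlobalDevice` without including any CUDA headers.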

michael-petersen · 2024-10-10T18:48:51Z

utils/PhaseSpace/CMakeLists.txt

Just checking -- is this true for all cuda >12? Or just 12.4 and up?

From the release notes, it's true for all of CUDA 12. It could be that some of this was deprecated for a few point releases before finally not working at 12.4.

michael-petersen · 2024-10-10T18:50:31Z

src/cudaCylinder.cu

Clean fix here, thanks!

Yeah, that's old cruft from emacs.

michael-petersen · 2024-10-10T18:56:57Z

Somewhat related, GPU runners are now available (see here), so I could imagine working this compile into CI so that we have some advance notice of issues.

michael-petersen

I've been working with this on CUDA 12.6, and everything seems to be working as expected. Anything left to do before merge?

michael-petersen · 2024-10-21T09:45:13Z

src/ExternalForce.H

  //! Finish and clean-up (caching data necessary for restart)
  virtual void finish() {}

+  // #if HAVE_LIBCUDA==1


Suggested change

// #if HAVE_LIBCUDA==1

The9Cat · 2024-10-23T19:07:30Z

I think we can merge this after quickly checking that a compile with CUDA 11.x still works. I'm a little nervous that this PR is really a stop-gap measure. In the end, we should probably move the whole compile to C++20 modules to avoid this nvcc morass. But that's a big job.

The9Cat · 2024-10-23T21:00:23Z

Okay, compiles on CUDA < 12. It's a wrap.

Martin D. Weinberg added 3 commits October 10, 2024 11:04

Implement that CUDA 16 change in NVTX for nvtx3; run CUDA-aware sourc…

d006e86

…e code through nvcc for consistence with the thrust and cub ABI changes

Remove a few unused parameters from gendisk source code [no ci]

b8ac63f

Put Sphere.cc in nvcc compile list to prevent issues with parameter p…

d24ae4f

…assing

michael-petersen reviewed Oct 10, 2024

georgiastuart and others added 3 commits October 13, 2024 23:03

Rework cuda file handling

876a923

Signed-off-by: Georgia Stuart <gstuart@umass.edu>

Merge pull request #89 from georgiastuart/fix-cuda-16

a35a90c

Rework cuda file handling for >= 12

Need at least CMake 3.25 to get the cuda::nvtx3 target

92c1f8b

michael-petersen approved these changes Oct 23, 2024

The9Cat merged commit 181b8d1 into main Oct 23, 2024

The9Cat deleted the fix-cuda-16 branch October 23, 2024 21:00

3 participants