Conversation
…e code through nvcc for consistence with the thrust and cub ABI changes
michael-petersen
left a comment
There was a problem hiding this comment.
This compiles for me on cuda 12.6 and runs a basic regression simulation. I have not tried on 11.x again to make sure we are backwards compatible. Anyone else tried this?
There was a problem hiding this comment.
This is a nice maintenance change, forced by the new cuda compile strategy, but I think it also makes the codebase simpler.
There was a problem hiding this comment.
Right. The goal here was to get the implementation out of the header where it required NVTX-specific structures.
There was a problem hiding this comment.
I assume this is still for testing?
There was a problem hiding this comment.
Are we planning on stylistically removing this everywhere? Fine if so and I'll start when I touch any other code.
There was a problem hiding this comment.
Not really stylistic. In most compilers, using the "" first checks your local directory, and if it doesn't find a match then moves on to check the system paths. Using <> starts the search with system headers. NVTX.H should always be local. So I would say: use "" when we really mean local directory.
There was a problem hiding this comment.
Is there a reason extern int cudaGlobalDevice; now comes earlier, or just convenience?
There was a problem hiding this comment.
Yes, cudaGlobalDevice should be defined independently of whether nvcc is guiding the compile. It's not about earlier but getting it out of the __NVCC__ block.
There was a problem hiding this comment.
Just checking -- is this true for all cuda >12? Or just 12.4 and up?
There was a problem hiding this comment.
From the release notes, it's true for all of CUDA 12. It could be that some of this was deprecated for a few point releases before finally not working at 12.4.
There was a problem hiding this comment.
Clean fix here, thanks!
There was a problem hiding this comment.
Yeah, that's old cruft from emacs.
|
Somewhat related, GPU runners are now available (see here), so I could imagine working this compile into CI so that we have some advance notice of issues. |
Signed-off-by: Georgia Stuart <gstuart@umass.edu>
Rework cuda file handling for >= 12
michael-petersen
left a comment
There was a problem hiding this comment.
I've been working with this on cuda 12.6, everything seems to be working as expected. Anything left to do before merge?
| //! Finish and clean-up (caching data necessary for restart) | ||
| virtual void finish() {} | ||
|
|
||
| // #if HAVE_LIBCUDA==1 |
There was a problem hiding this comment.
| // #if HAVE_LIBCUDA==1 |
|
I think we can merge this after quickly checking that a compile with CUDA 11.x still works. I'm a little nervous that this PR is really a stop-gap measure. In the end, we should probably move the whole compile to C++-20 modules to avoid this |
|
Okay, compiles on CUDA < 12. It's a wrap. |
Summary of updates
nvcc. Many thanks to Georgia Stuart who contributed theCMakeLists.txtrecipes for doing this!