Use cusparsespmv_preprocess() now that Raft implements it. by vitor1001 · Pull Request #120 · NVIDIA/cuopt

vitor1001 · 2025-06-24T10:51:23Z

Description

This removes a temporary hack.

Checklist

I am familiar with the Contributing Guidelines.
Testing
- New or existing tests cover these changes
- Added tests
- Created an issue to follow-up
- NA
Documentation
- The documentation is up to date with these changes
- Added new documentation
- NA

copy-pr-bot · 2025-06-24T10:51:26Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

rgsl888prabhu · 2025-06-24T15:15:29Z

/ok to test c77e264

copy-pr-bot · 2025-06-24T15:15:33Z

/ok to test c77e264

@rgsl888prabhu, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

rgsl888prabhu · 2025-06-24T15:27:10Z

/ok to test 457269b

rgsl888prabhu · 2025-06-24T20:48:57Z

/ok to test 5ce59ef

Kh4ster · 2025-06-25T09:08:33Z

Please run https://github.com/NVIDIA/cuopt/blob/branch-25.08/ci/check_style.sh (or just the pre-commit part) so that the check-style job passes.

vitor1001 · 2025-06-25T09:22:23Z

> Please run https://github.com/NVIDIA/cuopt/blob/branch-25.08/ci/check_style.sh (or just the pre-commit part) so that the check-style job passes.

Thanks for the tip, done!

vitor1001 · 2025-07-01T06:38:33Z

Hi @Kh4ster! Can you have a look at this PR? Anything else needed from my side?

Kh4ster · 2025-07-03T13:05:23Z

@vitor1001
Please tell me if I'm mistaken, but AFAIK you haven't addressed my two concerns regarding the resize and the custom mechanism in preprocess.

vitor1001 · 2025-07-03T13:46:14Z

@Kh4ster

> Please tell me if I'm mistaken, but AFAIK you haven't addressed my two concerns regarding the resize and the custom mechanism in preprocess.

Can you elaborate? I didn't find any comment about those. The only comment I saw was about running check_style.

Kh4ster · 2025-07-04T13:10:58Z

@vitor1001 Sorry, maybe it's on my side then. I have shared a screenshot with the review comments I left. Don't you see those if you go up the thread or in the Files section?

vitor1001 · 2025-07-04T13:29:47Z

> @vitor1001 Sorry, maybe it's on my side then. I have shared a screenshot with the review comments I left. Don't you see those if you go up the thread or in the Files section?

@Kh4ster I confirm those are not visible. Maybe there is a "finish review" or "publish comments" button?

About the comment: Raft doesn't call dlopen() explicitly (link). That said, I guess that if someone has a similar runtime/header mismatch, Raft will be broken, so maybe a better fix is to have Raft's CMake check for the mismatch and ask the user to fix the environment? Such workarounds sound dangerous; is this one still necessary?

Kh4ster · 2025-07-10T09:30:08Z

@rgsl888prabhu regarding the comments I put on the PR as review, do you know why external contributors would not see those?

Kh4ster · 2025-07-10T09:34:56Z

@vitor1001 sorry for the delay in the response I was on PTO.

I don't know why you can't see my review comment but this is an issue, we will try to understand why.

Regarding dlopen:
I agree that the result is "dangerous", but this was the only way we had to make it work at the time.
@rgsl888prabhu can you confirm that, because we rely on RAPIDS CI, we may have a mismatch between the compile-time CUDA version and the runtime CUDA version? Or can you confirm that we do not support anything < 12.4 anymore?

Regarding my last comment, @vitor1001, let me put it here as a screenshot:

vitor1001 · 2025-07-10T12:07:05Z

> Regarding my last comment, @vitor1001, let me put it here as a screenshot:

@Kh4ster this is a fair point. That said, I think the Raft API does make sense, since it forces the buffer to have the right alignment.

Should I do something like in this line:

  buffer_transpose.resize((buffer_size_transpose + sizeof(f_t) - 1) / sizeof(f_t), handle_ptr->get_stream());

?
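For readers following the sizing question, here is a minimal sketch of the rounding in the line above, using only the standard library. It assumes `f_t` is `double` and uses a made-up byte count of 1001 in place of the value cuSPARSE would report; `elements_for_bytes` and `padded_bytes` are hypothetical helper names, not part of cuOpt, RAFT, or RMM.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: f_t is double, as in most builds.
using f_t = double;

// Elements of type T needed to hold at least `bytes` bytes (ceiling division).
template <typename T>
constexpr std::size_t elements_for_bytes(std::size_t bytes)
{
  return (bytes + sizeof(T) - 1) / sizeof(T);
}

// Bytes actually allocated once the byte count is rounded up to whole f_t elements.
inline std::size_t padded_bytes(std::size_t bytes)
{
  const std::size_t n = elements_for_bytes<f_t>(bytes);
  assert(n * sizeof(f_t) >= bytes);  // the rounded buffer is never too small
  return n * sizeof(f_t);
}
```

With an 8-byte `f_t`, a 1001-byte request becomes 126 elements (1008 bytes), so the padding is at most `sizeof(f_t) - 1` bytes. Passing 1001 directly as the element count would instead allocate 8008 bytes, which is the over-allocation raised in the review.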

rgsl888prabhu · 2025-07-10T14:24:01Z

> @vitor1001 Sorry, maybe it's on my side then. I have shared a screenshot with the review comments I left. Don't you see those if you go up the thread or in the Files section?

> @Kh4ster I confirm those are not visible. Maybe there is a "finish review" or "publish comments" button?

> About the comment: Raft doesn't call dlopen() explicitly (link). That said, I guess that if someone has a similar runtime/header mismatch, Raft will be broken, so maybe a better fix is to have Raft's CMake check for the mismatch and ask the user to fix the environment? Such workarounds sound dangerous; is this one still necessary?

I agree. Those comments are pending and need to be submitted; that's why you can see them but we can't.

Kh4ster · 2025-07-11T13:02:04Z

@vitor1001 your point is valid. I thought RMM would always return aligned data but I'm not 100% sure, let me check with them.

rgsl888prabhu · 2025-07-11T14:34:32Z

Yes, we build on CUDA 12.9, but we test on CUDA > 12.4 and a few other options, since we can't test the complete matrix.
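The build/runtime gap described here can be sketched as a simple version gate. This is only an illustration under stated assumptions: the 12.4 threshold is taken from the question earlier in the thread, and `cuda_version` and `can_use_preprocess` are hypothetical helpers rather than cuOpt or RAFT functions. The encoding (1000 * major + 10 * minor) is the one CUDA uses for `CUDART_VERSION`.

```cpp
#include <cassert>

// CUDA encodes versions as 1000 * major + 10 * minor (for example, 12.4 -> 12040).
inline int cuda_version(int major, int minor)
{
  assert(major > 0 && minor >= 0 && minor < 100);
  return 1000 * major + 10 * minor;
}

// Hypothetical helper. Assumption from the thread: the preprocess entry point
// needs CUDA >= 12.4. The call is only safe when both the headers used at
// build time and the library loaded at run time are recent enough.
inline bool can_use_preprocess(int compiled_version, int runtime_version)
{
  const int min_version = cuda_version(12, 4);
  return compiled_version >= min_version && runtime_version >= min_version;
}
```

In real code, `compiled_version` would typically come from the `CUDART_VERSION` macro and `runtime_version` from `cudaRuntimeGetVersion()`. A binary built against 12.9 but run against an older runtime is the case that the dynamic lookup discussed above guards against.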

Kh4ster · 2025-07-15T08:48:08Z

@vitor1001

- Runtime mismatch: I think we should keep this logic, at least for now, even though I agree that it's unsafe: we need to make sure we don't break the CI in the future. That being said, we can use the RAFT wrapper.
- Alignment: I couldn't get an exact answer from the RMM team regarding allocating with uint8_t, but from what I could find it is / will indeed be aligned, so I think we should keep it this way to save memory. If you want to be extra sure, you can add an assert to check that, after the resize, .data() is indeed aligned.
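The assert suggested above could look roughly like the following sketch. `is_aligned` is a hypothetical helper written for illustration; the thread does not state which alignment cuSPARSE requires for external buffers, so the value passed in is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if `ptr` is aligned to `alignment` bytes.
// `alignment` must be a non-zero power of two.
inline bool is_aligned(const void* ptr, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}
```

Right after the resize, the check would then be something like `assert(is_aligned(buffer_transpose.data(), alignment));`, which keeps the `uint8_t` storage while still catching a misaligned allocation in debug builds.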

rgsl888prabhu · 2025-07-22T15:50:43Z

/ok to test 38b202f

Kh4ster · 2025-07-23T09:44:42Z

/ok to test

copy-pr-bot · 2025-07-23T09:44:44Z

/ok to test

@Kh4ster, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

Kh4ster · 2025-07-23T09:45:51Z

/ok to test 2871288

Kh4ster · 2025-07-23T12:49:08Z

/merge

Kh4ster · 2025-06-25T09:06:16Z

-}
-#endif
-
 // This cstr is used in pdhg


If we use RAFT cusparsespmv_preprocess will the above logic be maintained?

Kh4ster · 2025-06-25T09:07:22Z

-  rmm::device_uvector<uint8_t> buffer_non_transpose;
-  rmm::device_uvector<uint8_t> buffer_transpose;
+  rmm::device_uvector<f_t> buffer_non_transpose;
+  rmm::device_uvector<f_t> buffer_transpose;


Since we use resize on those, I think it will over-allocate if we use f_t (double in most cases) instead of uint8_t, or am I mistaken?

This removes a temporary hack. Authors: - Vitor Sessak (https://github.com/vitor1001) - Ramakrishnap (https://github.com/rgsl888prabhu) - Nicolas Blin (https://github.com/Kh4ster) Approvers: - Nicolas Blin (https://github.com/Kh4ster) URL: #120

vitor1001 requested a review from a team as a code owner June 24, 2025 10:51

vitor1001 requested review from aliceb-nv and kaatish June 24, 2025 10:51

rgsl888prabhu assigned vitor1001 Jun 24, 2025

rgsl888prabhu added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels Jun 24, 2025

rgsl888prabhu added this to the 25.08 milestone Jun 24, 2025

Kh4ster requested review from Kh4ster and removed request for aliceb-nv and kaatish June 25, 2025 09:07

Kh4ster added the pdlp label Jun 25, 2025

Use cusparsespmv_preprocess() now that Raft implements it.

0e42574

vitor1001 force-pushed the cusparse_fix branch from 5ce59ef to 0e42574 June 25, 2025 09:17

vitor1001 and others added 4 commits June 25, 2025 13:40

Merge branch 'branch-25.08' into cusparse_fix

23c566d

Merge branch 'branch-25.08' into cusparse_fix

760ad88

Merge branch 'branch-25.08' into cusparse_fix

187a68a

Merge branch 'branch-25.08' into cusparse_fix

064fb17

vitor1001 added 2 commits July 2, 2025 08:31

Merge branch 'branch-25.08' into cusparse_fix

a29907d

Merge branch 'branch-25.08' into cusparse_fix

a8e0e4f

Merge branch 'branch-25.08' into cusparse_fix

30409f4

vitor1001 removed their assignment Jul 8, 2025

Merge branch 'branch-25.08' into cusparse_fix

9ecfc2c

Merge branch 'branch-25.08' into cusparse_fix

38b202f

Kh4ster and others added 2 commits July 23, 2025 11:52

Merge branch 'branch-25.08' into cusparse_fix

89c9846

put back dynamic logic and uint8_t

2871288

Kh4ster approved these changes Jul 23, 2025

rapids-bot bot merged commit c5bcf44 into NVIDIA:branch-25.08 Jul 23, 2025
142 of 144 checks passed

4 participants