Use cusparsespmv_preprocess() now that Raft implements it. #120

Merged
rapids-bot[bot] merged 12 commits into NVIDIA:branch-25.08 from vitor1001:cusparse_fix on Jul 23, 2025

@vitor1001
Contributor

Description

This removes a temporary hack.

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@vitor1001 requested a review from a team as a code owner June 24, 2025 10:51
@vitor1001 requested review from aliceb-nv and kaatish June 24, 2025 10:51
@copy-pr-bot

copy-pr-bot bot commented Jun 24, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@rgsl888prabhu added the non-breaking (Introduces a non-breaking change) and improvement (Improves an existing functionality) labels Jun 24, 2025
@rgsl888prabhu added this to the 25.08 milestone Jun 24, 2025
@rgsl888prabhu
Collaborator

/ok to test c77e264

@copy-pr-bot

copy-pr-bot bot commented Jun 24, 2025

/ok to test c77e264

@rgsl888prabhu, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@rgsl888prabhu
Collaborator

/ok to test 457269b

@rgsl888prabhu
Collaborator

/ok to test 5ce59ef

@Kh4ster requested review from Kh4ster and removed request for aliceb-nv and kaatish June 25, 2025 09:07
@Kh4ster added the pdlp label Jun 25, 2025
@Kh4ster
Contributor

Kh4ster commented Jun 25, 2025

Please run https://github.com/NVIDIA/cuopt/blob/branch-25.08/ci/check_style.sh (or just the pre-commit part) to pass check-style.

@vitor1001
Contributor Author

> Please run https://github.com/NVIDIA/cuopt/blob/branch-25.08/ci/check_style.sh (or just the pre-commit part) to pass check-style.

Thanks for the tip, done!

@vitor1001
Contributor Author

Hi @Kh4ster! Can you have a look at this PR? Anything else needed from my side?

@Kh4ster
Contributor

Kh4ster commented Jul 3, 2025

@vitor1001
Please tell me if I'm mistaken, but AFAIK you didn't address my two concerns regarding the resize and the custom mechanism in preprocess.

@vitor1001
Contributor Author

@Kh4ster

> Please tell me if I'm mistaken, but AFAIK you didn't address my two concerns regarding the resize and the custom mechanism in preprocess.

Can you elaborate? I didn't find any comment about those. The only comment I saw was about running check_style.

@Kh4ster
Contributor

Kh4ster commented Jul 4, 2025

@vitor1001 Sorry, maybe it's on my side then. I have shared a screenshot with the review comments I left. Don't you see those if you go up the thread or in the files section?
[Screenshot of the review comments]

@vitor1001
Contributor Author

> @vitor1001 Sorry, maybe it's on my side then. I have shared a screenshot with the review comments I left. Don't you see those if you go up the thread or in the files section?

@Kh4ster I confirm those are not visible. Maybe there is a "finish review" or "publish comments" button?

About the comment: Raft doesn't call dlopen() explicitly (link). That said, I guess that if one has a similar runtime/header mismatch, Raft will be broken, and maybe a better fix is to have Raft's CMake check for the mismatch and ask the user to fix the environment? Such workarounds sound dangerous; is this one still necessary?

@vitor1001 removed their assignment Jul 8, 2025
@Kh4ster
Contributor

Kh4ster commented Jul 10, 2025

@rgsl888prabhu regarding the comments I put on the PR as review, do you know why external contributors would not see those?

@Kh4ster
Contributor

Kh4ster commented Jul 10, 2025

@vitor1001 sorry for the delayed response, I was on PTO.

I don't know why you can't see my review comments, but this is an issue and we will try to understand why.

Regarding dlopen:
I agree that the result is "dangerous", but this was the only way we had to make it work at the time.
@rgsl888prabhu can you confirm that, because we rely on RAPIDS CI, we may have a mismatch between the compile-time CUDA version and the runtime CUDA version? Or can you confirm that we do not support anything < 12.4 anymore?

Regarding my last comment @vitor1001, let me put it here as a screenshot:
[Screenshot of the review comment]

@vitor1001
Contributor Author

> Regarding my last comment @vitor1001, let me put it here as a screenshot:

@Kh4ster this is a fair point. That said, I think the Raft API does make sense, since it forces the buffer to have the right alignment.

Should I do something like in this line:

  buffer_transpose.resize((buffer_size_transpose + sizeof(f_t) - 1) / sizeof(f_t), handle_ptr->get_stream());

?
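
A quick way to sanity-check that rounding is the host-only sketch below. It assumes `f_t` is `double`, and `elements_for_bytes` is a hypothetical helper name that does not appear in the PR; the sample sizes are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical helper (not from the PR): smallest number of elements of size
// `elem_size` whose combined storage covers at least `bytes` bytes.
// This is the same ceiling division as in the resize() call quoted above.
constexpr std::size_t elements_for_bytes(std::size_t bytes, std::size_t elem_size)
{
  return (bytes + elem_size - 1) / elem_size;
}

// Assumption for illustration only: f_t is double.
using f_t = double;

// An exact multiple needs no padding; anything else rounds up by one element.
static_assert(elements_for_bytes(0, sizeof(f_t)) == 0, "empty request");
static_assert(elements_for_bytes(2 * sizeof(f_t), sizeof(f_t)) == 2, "exact multiple");
static_assert(elements_for_bytes(2 * sizeof(f_t) + 1, sizeof(f_t)) == 3, "rounds up");
```

With this rounding, the typed buffer wastes at most `sizeof(f_t) - 1` bytes compared with a byte-typed buffer of the requested size.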

@rgsl888prabhu
Collaborator

> > @vitor1001 Sorry, maybe it's on my side then. I have shared a screenshot with the review comments I left. Don't you see those if you go up the thread or in the files section?
>
> @Kh4ster I confirm those are not visible. Maybe there is a "finish review" or "publish comments" button?
>
> About the comment: Raft doesn't call dlopen() explicitly (link). That said, I guess that if one has a similar runtime/header mismatch, Raft will be broken, and maybe a better fix is to have Raft's CMake check for the mismatch and ask the user to fix the environment? Such workarounds sound dangerous; is this one still necessary?

I agree, those comments are pending and need to be submitted, that's why you can see them but not us.

@Kh4ster
Contributor

Kh4ster commented Jul 11, 2025

@vitor1001 your point is valid. I thought RMM would always return aligned data but I'm not 100% sure, let me check with them.

@rgsl888prabhu
Collaborator

Yes, we build on CUDA 12.9, but we test on CUDA > 12.4 and a few other options, since we can't test the complete matrix.

@Kh4ster
Contributor

Kh4ster commented Jul 15, 2025

@vitor1001

  1. Runtime mismatch: I think we should keep this logic, at least for now, even though I agree that it's unsafe: we need to make sure we don't break the CI in the future. That being said, we can use the RAFT wrapper.
  2. Alignment: I couldn't get an exact answer from the RMM team regarding allocating with uint8_t, but from what I could find it is / will be indeed aligned, so I think we should keep it this way to save memory. If you want to be extra sure, you can add an assert to check that .data() is indeed aligned after the resize.
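
The alignment assert suggested in point 2 could look roughly like the following host-only sketch. The helper names `is_aligned_address` and `is_aligned` are hypothetical, and the required alignment is left as a parameter because the thread does not state which value applies.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the PR): true when `address` is a multiple of
// `alignment`. `alignment` must be non-zero.
constexpr bool is_aligned_address(std::uintptr_t address, std::size_t alignment)
{
  return address % alignment == 0;
}

// Pointer form, the one an assert on a buffer's .data() would call.
inline bool is_aligned(const void* ptr, std::size_t alignment)
{
  return is_aligned_address(reinterpret_cast<std::uintptr_t>(ptr), alignment);
}
```

After the resize, a check along the lines of `assert(is_aligned(buffer_transpose.data(), alignof(f_t)));` would catch a misaligned allocation in debug builds. Using `alignof(f_t)` as the requirement is an assumption here; the value to check against should come from the documentation of the API that consumes the buffer.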

@rgsl888prabhu
Collaborator

/ok to test 38b202f

@Kh4ster
Contributor

Kh4ster commented Jul 23, 2025

/ok to test

@copy-pr-bot

copy-pr-bot bot commented Jul 23, 2025

/ok to test

@Kh4ster, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@Kh4ster
Contributor

Kh4ster commented Jul 23, 2025

/ok to test 2871288

@Kh4ster
Contributor

Kh4ster commented Jul 23, 2025

/merge

}
#endif

// This cstr is used in pdhg
Contributor

If we use RAFT cusparsespmv_preprocess will the above logic be maintained?

-rmm::device_uvector<uint8_t> buffer_non_transpose;
-rmm::device_uvector<uint8_t> buffer_transpose;
+rmm::device_uvector<f_t> buffer_non_transpose;
+rmm::device_uvector<f_t> buffer_transpose;
Contributor

Since we use resize on those, I think it will overallocate if we use f_t (double in most cases) instead of uint8_t, or am I mistaken?
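
To put numbers on this concern, here is a small host-only sketch. It assumes `f_t` is `double` and that a typed buffer resized to `count` elements is backed by `count * sizeof(T)` bytes; `bytes_after_resize` is a hypothetical name, and the 1024-byte request is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the PR): bytes backing a buffer of element
// type T after it has been resized to `count` elements.
template <typename T>
constexpr std::size_t bytes_after_resize(std::size_t count)
{
  return count * sizeof(T);
}

// Illustrative workspace size in bytes, as a library might report it.
constexpr std::size_t requested_bytes = 1024;

// Passing the byte count straight to resize() is exact for a uint8_t buffer...
static_assert(bytes_after_resize<std::uint8_t>(requested_bytes) == requested_bytes,
              "byte-typed buffer matches the request");
// ...but multiplies the allocation by sizeof(double) for a double buffer.
static_assert(bytes_after_resize<double>(requested_bytes) == requested_bytes * sizeof(double),
              "double-typed buffer overallocates by a factor of sizeof(double)");
```

So the overallocation only appears if the byte count is used unchanged as the element count; converting it to an element count first avoids the blow-up.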

@rapids-bot rapids-bot bot merged commit c5bcf44 into NVIDIA:branch-25.08 Jul 23, 2025
142 of 144 checks passed
aliceb-nv pushed a commit that referenced this pull request Sep 22, 2025
jieyibi pushed a commit to yining043/cuopt that referenced this pull request Mar 26, 2026