Add missing nvtx ranges and cuda checks #42

Closed
hlinsen wants to merge 3 commits into NVIDIA:branch-25.05 from hlinsen:add-ranges

hlinsen (Contributor) commented May 28, 2025

This PR adds:

  • More comprehensive profiling: more methods are now wrapped in NVTX ranges. Bottlenecks shift depending on instance properties and are hard to track.
  • Missing CUDA error checks after kernel calls.
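
For readers unfamiliar with the two patterns named above, here is a minimal sketch of what a scoped profiling range and a post-launch error check can look like. It is not taken from this PR's diff. The names `scoped_range`, `CHECK_LAST_LAUNCH`, `solver_step`, and everything in the `fake` namespace are hypothetical, and the `fake` functions are stand-ins so the sketch builds without the CUDA toolkit. In real code the push/pop calls would typically be `nvtxRangePushA` / `nvtxRangePop`, and the check would query `cudaPeekAtLastError` (or `cudaGetLastError`).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-ins (assumptions, not real APIs) that mimic the shape of the NVTX
// push/pop calls and of a CUDA "peek at last error" query.
namespace fake {
inline std::vector<std::string>& open_ranges() {
  static std::vector<std::string> ranges;
  return ranges;
}
inline void range_push(const char* name) { open_ranges().emplace_back(name); }
inline void range_pop() {
  assert(!open_ranges().empty());  // a pop must match an earlier push
  open_ranges().pop_back();
}
inline int& pending_error() {
  static int err = 0;  // 0 plays the role of cudaSuccess
  return err;
}
inline int peek_at_last_error() { return pending_error(); }
}  // namespace fake

// RAII guard: opens a named range on construction and closes it on scope
// exit, including when an exception unwinds the stack.
class scoped_range {
 public:
  explicit scoped_range(const char* name) { fake::range_push(name); }
  ~scoped_range() { fake::range_pop(); }
  scoped_range(const scoped_range&) = delete;
  scoped_range& operator=(const scoped_range&) = delete;
};

// Throws if the preceding (simulated) kernel launch recorded an error.
inline void check_last_launch(const char* file, int line) {
  const int err = fake::peek_at_last_error();
  if (err != 0) {
    throw std::runtime_error(std::string(file) + ":" + std::to_string(line) +
                             ": kernel launch failed with code " +
                             std::to_string(err));
  }
}
#define CHECK_LAST_LAUNCH() check_last_launch(__FILE__, __LINE__)

// Hypothetical method instrumented with both patterns. Returns the range
// nesting depth observed while the method body runs.
inline std::size_t solver_step(bool simulate_failure) {
  scoped_range range("solver_step");
  // A real implementation would launch a kernel here, e.g.
  // my_kernel<<<grid, block, 0, stream>>>(...);
  if (simulate_failure) { fake::pending_error() = 1; }
  const std::size_t depth = fake::open_ranges().size();
  CHECK_LAST_LAUNCH();  // surface launch errors right after the launch
  return depth;
}
```

Tying the range to a scope means early returns and exceptions cannot leave a range open, which would otherwise distort the profiler timeline. Checking right after each launch attributes a failure to the kernel that caused it instead of to a later, unrelated call.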

hlinsen requested a review from a team as a code owner May 28, 2025 05:09
hlinsen requested review from Kh4ster and rg20 May 28, 2025 05:09
hlinsen added the non-breaking and improvement labels May 28, 2025
hlinsen changed the base branch from branch-25.05 to branch-25.08 May 28, 2025 15:37
hlinsen requested review from a team as code owners May 28, 2025 15:37
hlinsen requested review from rgsl888prabhu and removed request for a team May 28, 2025 15:37
anandhkb added this to the 25.08 milestone Jul 1, 2025
rgsl888prabhu (Collaborator) commented

@rg20 @Kh4ster May I get your review on this PR?

rgsl888prabhu changed the base branch from branch-25.08 to branch-25.05 July 16, 2025 20:37

Labels

improvement: Improves an existing functionality
non-breaking: Introduces a non-breaking change

3 participants