-
Notifications
You must be signed in to change notification settings - Fork 77
Add handle cache for AMD platform #698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces handle caching for the AMD platform to prevent reaching IPC handle limits when multiple processes open the same handles. NVIDIA GPUs handle reference counting internally, so this optimization is AMD-specific. The implementation uses a thread-safe cache with weak pointers to automatically reuse and release handles.
Key Changes:
- Adds custom hash and equality operators for
cudaIpcMemHandle_tto enable use instd::unordered_map - Implements
getPeerMemoryHandle()function with AMD-specific caching using weak pointers and mutex protection - Refactors
RegisteredMemory::Implto usestd::shared_ptrfor automatic IPC handle lifetime management
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/registered_memory.cc | Adds hash/equality operators for cudaIpcMemHandle_t, implements getPeerMemoryHandle with AMD-specific caching, updates constructor to use cached handles, and removes manual IPC handle cleanup from destructor |
| src/include/registered_memory.hpp | Adds peerHandle field to RegisteredMemory::Impl for managing IPC handle lifetime via shared_ptr |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@Binyang2014 I've opened a new pull request, #699, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
@Binyang2014 I've opened a new pull request, #700, to work on those changes. Once the pull request is ready, I'll request review from you. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- [x] Move hash specialization and equality operator from std/global namespace to custom namespace - [x] Update unordered_map to use custom hash and equality as template parameters - [x] Add noexcept to equality operator - [x] Verify the changes build correctly - [x] Run code review and security checks <!-- START COPILOT CODING AGENT TIPS --> --- ✨ Let Copilot coding agent [set things up for you](https://github.com/microsoft/mscclpp/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot) — coding agent works faster and does higher quality work when set up for your repo. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Binyang2014 <9415966+Binyang2014@users.noreply.github.com> Co-authored-by: Binyang Li <binyli@microsoft.com>
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@Binyang2014 I've opened a new pull request, #710, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 3 pipeline(s). |
Introduce handle cache for AMD platform.
Avoid reaching handle limitation if we open too much IPC handles
For nvidia, we don't need this feature since nvidia will count the handle reference internally and reuse the same handle if already be opened