[Perf] Streams 2: Add AMDGPU/HIP stream support #408
Open
hughperkins wants to merge 61 commits into hp/streams-quadrantsic-1-cuda-streams from hp/streams-quadrantsic-2-amdgpu-cpu
Commits
7bd18ca Add AMDGPU/HIP stream support and async memory operations (hughperkins)
b133bd7 Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)
7555ec5 Move AMDGPU mem_free_async before transfers sync to match CUDA ordering (hughperkins)
c12d23e Convert AMDGPU sync memcpy_host_to_device to async on active_stream (hughperkins)
1673a38 Document ROCm >= 5.4 requirement for hipMallocAsync/hipFreeAsync (hughperkins)
60d015b Relax concurrency test threshold and log timings (hughperkins)
c4be4ff Add handle==0 guard to AMDGPU stream_synchronize and make stream_ thr… (hughperkins)
b28e7c6 Revert "Relax concurrency test threshold and log timings" (hughperkins)
3970abc Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
31fffbf Apply clang-format (hughperkins)
1056bb4 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
798f87a Exclude flaky test_perf_dispatch_python from Metal and Vulkan (hughperkins)
22c5524 Merge origin/hp/streams-quadrantsic-1-cuda-streams, resolve conflict … (hughperkins)
f42d4eb Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)
2238969 [Doc] Update streams doc with AMDGPU support (hughperkins)
228150a Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)
e368b4d Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)
958c247 Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)
aff950d Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)
84715de Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
8efd51f Address review comments: fix AMDGPU stream issues (hughperkins)
34e9fa6 Use HIP_STREAM_NON_BLOCKING for AMDGPU stream_create to mirror CUDA path (hughperkins)
675542a Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
162239e Use active stream for AMDGPU adstack metadata copies in publish_adsta… (hughperkins)
9334efd Add make_current() to all AMDGPU stream/event Program methods (hughperkins)
c7eed44 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
1fba4f5 Use async DtoH on active_stream for AMDGPU resolve_num_threads readback (hughperkins)
0af8e19 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
f89bde0 Sync active_stream unconditionally at end of AMDGPU launch_llvm_kernel (hughperkins)
ef3b95b Use async DtoH on active_stream for sizer stride readback (hughperkins)
64a389d Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
7f0f299 Fix end-of-launcher sync: conditional + dealloc race on AMDGPU (hughperkins)
5e8d198 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
84806cf Fix NULL-stream DtoH races in synchronize() and allocate_llvm_runtime… (hughperkins)
05dcb4d Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
ae1c932 Reflow comments and docstring to 120-char line width (hughperkins)
3ef0340 Use context/device synchronize in synchronize() to drain all streams (hughperkins)
3a81a46 Use synchronous mem_free in dealloc_memory pool branch (hughperkins)
02ac865 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
3499bbc Thread active_stream through AMDGPU profiler event_record and sync (hughperkins)
ce2fc6b Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
117a71f Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
8f71c91 Merge base branch: drop autodiff stream changes per new policy (hughperkins)
b030e4c Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
6e49c52 Restore context_pointer free comment in AMDGPU kernel launcher (hughperkins)
176e7d3 Merge base branch: add AMDGPU support to extracted program_stream.cpp (hughperkins)
c1562f2 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
1c81322 Fix clang-format in program_stream.h (hughperkins)
91fae3f Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
d3317f5 Fix AMDGPU branches in StreamManager: use arch_ member instead of com… (hughperkins)
33f2a04 Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)
b7eb63a Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
52a3be1 Merge branch 'hp/streams-quadrantsic-2-amdgpu-cpu' of github.com:Gene… (hughperkins)
4cef21b Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
4711160 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
b4450f7 Fix clang-format in export_stream.cpp (hughperkins)
93cd166 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
e8d9cf0 Allow synchronizing the default AMDGPU stream (handle 0) (hughperkins)
3f5a868 Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
392b19a Merge remote-tracking branch 'origin/hp/streams-quadrantsic-1-cuda-st… (hughperkins)
f67e7fd Merge branch 'hp/streams-quadrantsic-1-cuda-streams' into hp/streams-… (hughperkins)