Skip to content

[SYCL][Graph] Support for native-command#16871

Merged
sommerlukas merged 7 commits intointel:syclfrom
reble:graph_native_enqueue
Mar 17, 2025
Merged

[SYCL][Graph] Support for native-command#16871
sommerlukas merged 7 commits intointel:syclfrom
reble:graph_native_enqueue

Conversation

@EwanC
Copy link
Copy Markdown
Contributor

@EwanC EwanC commented Feb 3, 2025

Support sycl_ext_codeplay_enqueue_native_command with SYCL-Graph for all of L0, CUDA, HIP, and OpenCL backends.

Introduces interop_handle::ext_oneapi_get_native_graph<backend>() to give the user access to the native graph object which native commands can be appended to. Implemented using new UR command-buffer entry-points urCommandBufferAppendNativeCommandExp and urCommandBufferGetNativeHandleExp.

To use CUDA as an example, code using ext_codeplay_enqueue_native_command eagerly can be updated from:

 CGH.ext_codeplay_enqueue_native_command([=](interop_handle IH) {
       auto NativeStream = IH.get_native_queue<cuda>();
       myNativeLibraryCall(NativeStream);
}

To

CGH.ext_codeplay_enqueue_native_command([=](interop_handle IH) {
     if (IH.ext_oneapi_has_graph())  {
       auto NativeGraph = IH.ext_oneapi_get_native_graph<cuda>();
       auto NativeStream = IH.get_native_queue<cuda>();

       // Start capture stream calls into graph
       cuStreamBeginCaptureToGraph(NativeStream, NativeGraph, nullptr,
                                             nullptr, 0,
                                             CU_STREAM_CAPTURE_MODE_GLOBAL);

       myNativeLibraryCall(NativeStream);

       // Stop capturing stream calls into graph
       cuStreamEndCapture(NativeStream, &NativeGraph);
     } else {
       auto NativeStream = IH.get_native_queue<cuda>();
       myNativeLibraryCall(NativeStream );
    }
}

Example of how this integration could work in GROMACS https://gitlab.com/gromacs/gromacs/-/merge_requests/4954

@EwanC EwanC force-pushed the graph_native_enqueue branch from 4fa2a2f to 2ead142 Compare February 3, 2025 16:07
@EwanC EwanC force-pushed the graph_native_enqueue branch from 2ead142 to 80269a8 Compare February 4, 2025 16:49
@EwanC EwanC force-pushed the graph_native_enqueue branch from 80269a8 to d40e0d1 Compare February 4, 2025 17:30
@EwanC EwanC force-pushed the graph_native_enqueue branch from d40e0d1 to 4aeba98 Compare February 4, 2025 21:17
@EwanC EwanC force-pushed the graph_native_enqueue branch 2 times, most recently from e4498ce to 7d12878 Compare February 5, 2025 09:46
@EwanC EwanC force-pushed the graph_native_enqueue branch from 7d12878 to 764aebc Compare February 5, 2025 13:21
@EwanC EwanC force-pushed the graph_native_enqueue branch from 764aebc to 4e62db9 Compare February 5, 2025 14:13
@EwanC EwanC force-pushed the graph_native_enqueue branch 3 times, most recently from 047eb0e to c07c67a Compare February 12, 2025 09:45
@EwanC EwanC changed the title DRAFT - [SYCL][Graph] Support for native-command [SYCL][Graph] Support for native-command Feb 12, 2025
@EwanC EwanC force-pushed the graph_native_enqueue branch 2 times, most recently from f499268 to 19bb3dc Compare February 19, 2025 09:53
Copy link
Copy Markdown
Contributor

@npmiller npmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CUDA/HIP LGTM

Copy link
Copy Markdown
Contributor

@Pennycook Pennycook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specification changes LGTM. Thanks for adding the other examples!

Copy link
Copy Markdown
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few small comments but otherwise LGTM!

Comment thread sycl/doc/extensions/supported/sycl_ext_oneapi_backend_level_zero.md Outdated
Comment thread sycl/source/detail/scheduler/commands.cpp Outdated
Copy link
Copy Markdown
Contributor

@martygrant martygrant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UR changes LGTM. The new CTS test checks the new urCommandBufferGetNativeHandleExp function, but AppendNativeCommandExp was also added, should there be a test for this too?

Copy link
Copy Markdown
Contributor

@kbenzie kbenzie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UR LGTM

Comment thread sycl/source/detail/queue_impl.hpp Outdated
Comment thread unified-runtime/scripts/core/EXP-COMMAND-BUFFER.rst Outdated
Comment thread sycl/test-e2e/EnqueueNativeCommand/Graph/cuda_multiple_native_commands.cpp Outdated
Comment thread unified-runtime/source/adapters/level_zero/v2/api.cpp Outdated
@EwanC
Copy link
Copy Markdown
Contributor Author

EwanC commented Mar 13, 2025

UR changes LGTM. The new CTS test checks the new urCommandBufferGetNativeHandleExp function, but AppendNativeCommandExp was also added, should there be a test for this too?

Good question, I briefly looked into this but the amount of work to get testing setup for all the L0,CUDA,HIP, OpenCL adapters was a fair bit due to the need for backend specific code, so I stuck with using E2E tests for verification. I agree this is something we should do, so I created #17448 as follow-on work for adding UR CTS testing.

Copy link
Copy Markdown
Contributor

@keyradical keyradical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment thread sycl/test-e2e/EnqueueNativeCommand/Graph/cuda_explicit_usm.cpp Outdated
Comment thread sycl/include/sycl/interop_handle.hpp Outdated
Comment thread unified-runtime/source/adapters/level_zero/v2/command_buffer.cpp Outdated
Copy link
Copy Markdown
Contributor

@hvdijk hvdijk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NativeCPU looks fine to mark this as unsupported.

Comment thread sycl/include/sycl/interop_handle.hpp Outdated
Copy link
Copy Markdown

@AerialMantis AerialMantis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Ewan Crawford added 7 commits March 14, 2025 20:06
Support [sycl_ext_codeplay_enqueue_native_command](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) with SYCL-Graph.

Introduces `interop_handle::ext_codeplay_get_native_graph<backend>()` to
give the user access to the native graph object which native commands
can be appended to.

To use CUDA as an example, code using `ext_codeplay_enqueue_native_command`
eagerly can be updated from:

```cpp
 CGH.ext_codeplay_enqueue_native_command([=](interop_handle IH) {
       auto NativeStream = IH.get_native_queue<cuda>();
       myNativeLibraryCall(NativeStream);
}
```

To

```cpp
CGH.ext_codeplay_enqueue_native_command([=](interop_handle IH) {
     if (IH.ext_codeplay_has_graph())  {
       auto NativeGraph = IH.ext_codeplay_get_native_graph<cuda>();
       auto NativeStream = IH.get_native_queue<cuda>();

       // Start capture stream calls into graph
       cuStreamBeginCaptureToGraph(NativeStream, NativeGraph, nullptr,
                                             nullptr, 0,
                                             CU_STREAM_CAPTURE_MODE_GLOBAL);

       myNativeLibraryCall(NativeStream);

       // Stop capturing stream calls into graph
       cuStreamEndCapture(NativeStream, &NativeGraph);
     } else {
       auto NativeStream = IH.get_native_queue<cuda>();
       myNativeLibraryCall(NativeStream );
    }
}
```

Example of how this integration could work in GROMACS https://gitlab.com/gromacs/gromacs/-/merge_requests/4954
@EwanC
Copy link
Copy Markdown
Contributor Author

EwanC commented Mar 17, 2025

@intel/llvm-gatekeepers This is ready to merge, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.