-
Notifications
You must be signed in to change notification settings - Fork 69
Initial OpenCL-OpenGL interop overview #15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| @@ -0,0 +1,47 @@ | |||
| # OpenCL-OpenGL interop | |||
|
|
|||
| Both OpenCL and OpenGL have specific extensions targeting resource sharing and synchronizing between the two runtimes. Doing so one may omit fetching data from the device, only to send it immediately back resulting in significant performane gains. Because the way the two APIs work, there are few thing to keep in mind when designing applications that intend interoperating. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Both OpenCL and OpenGL have specific extensions targeting resource sharing and synchronizing between the two runtimes. Doing so one may omit fetching data from the device, only to send it immediately back resulting in significant performane gains. Because the way the two APIs work, there are few thing to keep in mind when designing applications that intend interoperating. | |
| Both OpenCL and OpenGL have specific extensions targeting resource sharing and synchronizing between the two runtimes. Doing so one may omit fetching data from the device, only to send it immediately back resulting in significant performance gains. Because the way the two APIs work, there are few thing to keep in mind when designing applications that intend interoperating. |
|
|
||
| The core of the OpenGL API has remained backward compatible with itself all the way back to it's initial incarnations. This feature of OpenGL imposes some restrictions on how interoperability can be setup. | ||
|
|
||
| In layman's terms, OpenCL is the "smarter" API, OpenGL does some part of init unaware of OpenCL, or even before any OpenCL API function has been invoked. Once all the shared resources (buffers and textures) were created in OpenGL, _only then_ is the OpenCL interop context even created. While OpenGL created resources as normal, OpenCL (and only OpenCL) has special functions which take `GLuint` as input to designate which exact OpenGL resource is bing given a corresponding OpenCL handle. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't sound quit right. Although some information about the OpenGL context / etc. does need to be provided when an OpenCL interop context is created, the shared resources themselves don't necessarily need to be created before creating the OpenCL interop context. See my CL-GL sharing sample as an example:
https://github.com/bashbaug/SimpleOpenCLSamples/tree/master/samples/opengl/00_juliagl
It creates the OpenGL context first, then the OpenCL context (using the OpenGL context), then the shared texture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, typo: "bing"
|
|
||
| Implicit sync from the code's perspective resembles that of the previous approach when one does not sync, just flushes the queues instead of finishing them. (Flushing a queue in OpenGL does not involve the OpenGL server.) | ||
|
|
||
| _(Note: If in a loop one is calling GL-CL-GL-CL... commands in succession, one blocking sany somewhere will still be required, otherwise such loops on the host may spin faster than rendering and compute commands are processed on the device, leading to spilling the limit of commands in the queues. Blocking can both be done on OpenGL sync objects or OpenCL events.)_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check this sentence - there's a typo here ("one blocking sany somewhere") but I'm not quite sure how to fix it.
|
|
||
| When `cl_khr_gl_event` is supported but the context cannot be made current on the thread enqueueing OpenCL commands, one may still sync faster than invoking `glFinish()`/`clFinish()`. Because the OpenCL runtime cannot directly observe the OpenGL context, some channel of information need be made explicit for syncing to occur. As the name suggests, this extension involves events, specifically one is able to create an OpenCL event from an OpenGL sync object. | ||
|
|
||
| By mapping a sync object that is enqueue after a render command using some shared resource to an OpenCL event, one can use such events in the call to `clEnqueueAcquireGLObjects` in the event wait list. That way `glFinish()` may be omitted, as OpenCL can explicitly wait on certain parts of the rendering queue to complete. Note than using only this, `clFinish()` strictly speaking is still required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is clFinish still required? Could a clFlush be used instead?
|
|
||
| ## How is it different than using OpenGL compute shaders? | ||
|
|
||
| OpenGL compute shaders are slightly more restricted than OpenCL compute kernels. This is also reflected in the duality of the intermediate formats they can be compiled to. When using SPIR-V as an intermediate representation (IR), compute shaders are compiled to the graphics flavor of SIPR-V, which must exhibit structured control flow and must not use pointer arithmetic. These two cannot arise when using GLSL or other traditional shading languages. OpenCL C, being a C-derivate is far more liberal in the expressable language constructs than shading languages and as such requires a more feature complete intermediate representation, the so called compute flavor of SPIR-V. Different compiler infrastructure is required behind the scenes to process these two types of workloads, irrespective of ingesting IR or compiling from source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a bit of a distraction to discuss the IL formats here. I wonder if we can make this a bit simpler. To me, the key points are:
- OpenCL kernels are (generally) more capable than GLSL compute shaders.
- OpenCL kernels use a familiar C and C++ syntax compared to GLSL compute shaders.
For (2) we could link to the C++ for OpenCL section of this guide as an example.
This PR adds a new doc giving a high-level overview of how OpenCL-OpenGL interop should be setup and how it works.