Make RTC compatible with CUDA enhanced compatibility #19364
|
Hey @ptrendx , Thanks for submitting the PR
CI supported jobs: [sanity, windows-cpu, centos-gpu, miscellaneous, website, unix-gpu, edge, windows-gpu, centos-cpu, clang, unix-cpu] Note: |
|
Note: this does not touch the legacy RTC part (https://mxnet.apache.org/versions/1.6/api/python/docs/api/mxnet/rtc/index.html) - what is the plan for it @szha? |
|
I think we need to continue to support mx.rtc |
|
Are there any people using it? The interface is not great: the CudaModule from there is not even an operator, so it can't be used in a model. We could reuse the recent RTC stuff to make it a much better experience (and actually potentially pretty useful). That said, this PR does not touch that functionality (because the compilation options there are set by the user). I could make it so if you specify the proper option ( |
|
Agreed. In 2.0 we can change the interface. |
|
@mxnet-bot run ci [centos-cpu, centos-gpu, edge, miscellaneous] |
|
Jenkins CI successfully triggered : [edge, centos-cpu, miscellaneous, centos-gpu] |
|
@mxnet-bot run ci [centos-cpu, unix-gpu, edge, website] |
|
Jenkins CI successfully triggered : [edge, website, unix-gpu, centos-cpu] |
DickJC123
left a comment
For others interested in understanding better the motivation behind this PR, I suggest https://docs.nvidia.com/deploy/pdf/CUDA_Compatibility.pdf . One paragraph worth repeating from that doc is:
To use other CUDA APIs introduced in a minor release (that require a new
driver), one would have to implement fallbacks or fail gracefully. This situation
is not different from what is available today where developers use macros to
compile out features based on CUDA versions. Users should refer to the CUDA
headers and documentation for new CUDA APIs introduced in a release.
Thus, it's fair to use an 11.1 feature that is supported by both 11.1 and 11.0 kernel-mode drivers. Before using an 11.1 feature that requires an 11.1 kernel-mode driver, one should check dynamically for that feature's presence at runtime, as suggested in the document section 3.2 "Handling New CUDA Features." This is particularly important to pay attention to while the upstream CI testing has no enhanced-compatibility build.
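The runtime check described above can be sketched roughly as follows. This is only an illustration, not code from the PR: the helper names (`DecodeCudaVersion`, `DriverSupports`, `RunsUnderEnhancedCompatibility`) are hypothetical, and the version integers are assumed to come from `CUDA_VERSION` at build time and from `cudaDriverGetVersion()` at run time, both of which use the `1000 * major + 10 * minor` encoding. The sketch deliberately avoids calling into CUDA so it builds anywhere.

```cpp
#include <cassert>

// CUDA encodes versions as 1000 * major + 10 * minor, e.g. 11.1 -> 11010.
struct CudaVersion {
  int major_version;
  int minor_version;
};

// Hypothetical helper: split an encoded version into its parts.
inline CudaVersion DecodeCudaVersion(int encoded) {
  assert(encoded >= 0);
  return CudaVersion{encoded / 1000, (encoded % 1000) / 10};
}

// Hypothetical helper: a feature that needs a given driver release may be
// used only if the installed driver is at least that release. In real code
// driver_version would be filled in by cudaDriverGetVersion().
inline bool DriverSupports(int driver_version, int required_version) {
  return driver_version >= required_version;
}

// Hypothetical helper: true when the binary was built with a newer minor
// release than the installed driver, within the same major version. This is
// the situation in which fallbacks or graceful failure are needed.
inline bool RunsUnderEnhancedCompatibility(int toolkit_version,
                                           int driver_version) {
  const CudaVersion toolkit = DecodeCudaVersion(toolkit_version);
  const CudaVersion driver = DecodeCudaVersion(driver_version);
  return toolkit.major_version == driver.major_version &&
         driver.minor_version < toolkit.minor_version;
}
```

With these helpers, a call site could guard an 11.1-only code path with `DriverSupports(driver_version, 11010)` and take the fallback otherwise.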
const auto getSize = use_cubin ? nvrtcGetCUBINSize : nvrtcGetPTXSize;
const auto getFunc = use_cubin ? nvrtcGetCUBIN : nvrtcGetPTX;
FWIW, while nvrtcGetCUBINSize() and nvrtcGetCUBIN() are not yet in the nvrtc docs, their use is described in https://docs.nvidia.com/deploy/cuda-compatibility/
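For readers unfamiliar with the pattern in the two lines under review, here is a minimal, self-contained sketch of how a size getter and a code getter can be chosen together and then used in the usual two-step "query size, then fill buffer" sequence. The `Fake*` functions and types are hypothetical stand-ins that only mimic the `(program, out)` shape of the NVRTC getters so the example builds without CUDA; they are not the real API, and `GetCompiledCode` here is an illustration rather than the function from the PR.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-ins for nvrtcProgram / nvrtcResult (0 means success).
using FakeProgram = int;
using FakeResult = int;

static const char kFakePtx[] = "// fake ptx";
static const char kFakeCubin[] = "FAKE-CUBIN";

// Stand-ins with matching signatures, so either pair can be picked at run time.
FakeResult FakeGetPTXSize(FakeProgram, std::size_t* size) {
  *size = sizeof(kFakePtx);  // size includes the trailing NUL
  return 0;
}
FakeResult FakeGetPTX(FakeProgram, char* out) {
  std::memcpy(out, kFakePtx, sizeof(kFakePtx));
  return 0;
}
FakeResult FakeGetCUBINSize(FakeProgram, std::size_t* size) {
  *size = sizeof(kFakeCubin);
  return 0;
}
FakeResult FakeGetCUBIN(FakeProgram, char* out) {
  std::memcpy(out, kFakeCubin, sizeof(kFakeCubin));
  return 0;
}

// Select both getters with the same flag, then query the size and fetch the
// code. Returns an empty string if either step reports an error.
std::string GetCompiledCode(FakeProgram program, bool use_cubin) {
  const auto get_size = use_cubin ? FakeGetCUBINSize : FakeGetPTXSize;
  const auto get_code = use_cubin ? FakeGetCUBIN : FakeGetPTX;

  std::size_t size = 0;
  if (get_size(program, &size) != 0) return std::string();
  std::string code(size, '\0');
  if (get_code(program, &code[0]) != 0) return std::string();
  return code;
}
```

Because both getters in a pair share one signature, the conditional expression yields a plain function pointer and the rest of the function does not need to know which format was requested.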
|
@mxnet-bot run ci [unix-cpu] |
|
Jenkins CI successfully triggered : [unix-cpu] |
* Guard RTC better
* Use nvrtcGetCUBIN
* Fix lint
* Enable cubin loading in legacy rtc path
* Fixes from review
Description
Starting with CUDA 11.1, it is possible to run programs compiled with a newer CUDA toolkit on an older driver without the compat library, as long as the major version is the same (e.g. a program built with CUDA 11.1 works with a CUDA 11.0 driver). This requires a few changes to the way the NVRTC API is used, however, which this PR addresses.
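One of the changes this implies is on the compiler-option side: NVRTC accepts either a virtual architecture (`compute_XX`, producing PTX) or, from CUDA 11.1, a real architecture (`sm_XX`, producing a cubin) in `--gpu-architecture`. A cubin avoids handing PTX from a newer toolkit to an older driver's JIT. The sketch below shows how such an option string might be assembled; `MakeArchOption` and its parameters are hypothetical names for illustration, not the code in this PR.

```cpp
#include <string>

// Hypothetical helper: build the NVRTC architecture option for a device with
// compute capability sm_major.sm_minor. With use_cubin the real architecture
// (sm_XX) is requested, otherwise the virtual one (compute_XX).
std::string MakeArchOption(int sm_major, int sm_minor, bool use_cubin) {
  const std::string target = use_cubin ? "sm_" : "compute_";
  return "--gpu-architecture=" + target + std::to_string(sm_major) +
         std::to_string(sm_minor);
}
```

For a compute capability 7.0 device, `MakeArchOption(7, 0, true)` yields `--gpu-architecture=sm_70`, and passing `false` yields `--gpu-architecture=compute_70`.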