Add Nvfatbin Bindings #1467
#
# SPDX-License-Identifier: LicenseRef-NVIDIA-SOFTWARE-LICENSE
#
# This code was automatically generated with version 13.0.0. Do not modify it directly.
I think we need to bump this to 13.1.0 and we should get the nvFatbinAddTileIR function as well.

###############################################################################
# Wrapper functions
###############################################################################
It seems we're also missing nvFatbinGetErrorString.
    global __py_nvfatbin_init

    cdef void* handle = NULL
For double-checked locking, I believe you missed the following check here. Without it, the init function always has to acquire the lock just to see whether the object was already initialized.
    if __py_nvfatbin_init:
        return 0
    return 0

    # Load function
    global __nvFatbinCreate
Consider making this data-driven to avoid repeating the same pattern.
    handle = None

    def resolve_symbol(sym_name):
        global handle
        sym = dlsym(RTLD_DEFAULT, sym_name)
        if sym is None:  # or == NULL depending on your binding
            if handle is None:
                handle = load_library()
            sym = dlsym(handle, sym_name)
        return sym  # optionally raise if still None

    _SYMBOLS = [
        "nvFatbinCreate",
        "nvFatbinDestroy",
        "nvFatbinAddPTX",
        "nvFatbinAddCubin",
        "nvFatbinAddLTOIR",
        "nvFatbinAddReloc",
        "nvFatbinSize",
        "nvFatbinGet",
        "nvFatbinVersion",
    ]

    for name in _SYMBOLS:
        globals()["__" + name] = resolve_symbol(name)
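The suggestion above relies on `dlsym`, which is not callable from plain Python, so the following is a runnable approximation rather than the binding code itself. It assumes the two lookups are passed in as callables; `resolve_all`, `open_library` and the fake address tables are hypothetical names used purely for illustration.

```python
from typing import Callable, Dict, Iterable, Optional

Lookup = Callable[[str], Optional[int]]


def resolve_all(
    names: Iterable[str],
    lookup_global: Lookup,
    open_library: Callable[[], Lookup],
) -> Dict[str, Optional[int]]:
    """Resolve every name, opening the library at most once and only if needed."""
    library_lookup: Optional[Lookup] = None
    table: Dict[str, Optional[int]] = {}
    for name in names:
        address = lookup_global(name)          # analogous to dlsym(RTLD_DEFAULT, ...)
        if address is None:
            if library_lookup is None:
                library_lookup = open_library()  # lazy, one-time load
            address = library_lookup(name)     # analogous to dlsym(handle, ...)
        table[name] = address
    return table


# Fake symbol tables standing in for the process and the shared library.
SYMBOLS = ["nvFatbinCreate", "nvFatbinDestroy", "nvFatbinSize"]
already_loaded = {"nvFatbinCreate": 0x1000}
library_exports = {"nvFatbinDestroy": 0x2000, "nvFatbinSize": 0x3000}
open_calls = []


def open_library() -> Lookup:
    open_calls.append(1)
    return library_exports.get


table = resolve_all(SYMBOLS, already_loaded.get, open_library)
```

Driving the loop from a list means adding a new entry point is a one-line change to the table rather than another copy of the lookup block.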
FYI @rparolin: all of this is generated code except for the tests. There is an internal MR targeting the codegen for this.
with open(tmpdir / "object.cu", "w") as f:
    f.write(empty_cplusplus_kernel)

subprocess.check_output(["nvcc", "-arch", arch, "-o", str(tmpdir / "object.o"), str(tmpdir / "object.cu")])
We probably need a check here that skips the test if NVCC is not available, or is present but not usable (e.g. there is no host compiler). For example, at the test stage in CI we don't have access to NVCC (same in the numba-cuda repo).
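One possible shape for such a check is sketched below, using only the standard library. The helper name `nvcc_usable` and the probe kernel are assumptions made for illustration, not part of this PR; the probe compiles a trivial kernel with `-c` so that a missing host compiler shows up as a failed compile.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def nvcc_usable(compiler: str = "nvcc", timeout: float = 60.0) -> bool:
    """Return True only if the compiler is on PATH and can build a trivial kernel."""
    if shutil.which(compiler) is None:
        return False  # not installed, e.g. at the CI test stage
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "probe.cu"
        source.write_text("__global__ void probe() {}\n")
        try:
            result = subprocess.run(
                [compiler, "-c", "-o", str(Path(tmp) / "probe.o"), str(source)],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except (OSError, subprocess.TimeoutExpired):
            return False  # found on PATH but could not be run to completion
    return result.returncode == 0
```

In the test module, this could then gate the affected tests with something like `pytest.mark.skipif(not nvcc_usable(), reason="nvcc is not usable")`.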
Description
closes #156
Checklist