62 changes: 17 additions & 45 deletions cuda_bindings/tests/test_nvvm.py
@@ -4,21 +4,12 @@

 import binascii
 import re
-import textwrap
 from contextlib import contextmanager

 import pytest
 from cuda.bindings import nvvm

-MINIMAL_NVVMIR_FIXTURE_PARAMS = ["txt", "bitcode_static"]
-try:
-    import llvmlite.binding as llvmlite_binding  # Optional test dependency.
-except ImportError:
-    llvmlite_binding = None
-else:
-    MINIMAL_NVVMIR_FIXTURE_PARAMS.append("bitcode_dynamic")
-
-MINIMAL_NVVMIR_TXT = b"""\
+MINIMAL_NVVMIR_TXT_TEMPLATE = b"""\
 target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-i128:128:128-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
 target triple = "nvptx64-nvidia-cuda"
@@ -130,43 +121,24 @@
         "6e673e0000000000",
 }

-MINIMAL_NVVMIR_CACHE = {}
-

-@pytest.fixture(params=MINIMAL_NVVMIR_FIXTURE_PARAMS)
+@pytest.fixture(params=("txt", "bitcode_static"))
 def minimal_nvvmir(request):
-    for pass_counter in range(2):
-        nvvmir = MINIMAL_NVVMIR_CACHE.get(request.param, -1)
-        if nvvmir != -1:
-            if nvvmir is None:
-                pytest.skip(f"UNAVAILABLE: {request.param}")
-            return nvvmir
-        if pass_counter:
-            raise AssertionError("This code path is meant to be unreachable.")
-        # Build cache entries, then try again (above).
-        major, minor, debug_major, debug_minor = nvvm.ir_version()
-        txt = MINIMAL_NVVMIR_TXT % (major, debug_major)
-        if llvmlite_binding is None:
-            bitcode_dynamic = None
-        else:
-            bitcode_dynamic = llvmlite_binding.parse_assembly(txt.decode()).as_bitcode()
-        bitcode_static = MINIMAL_NVVMIR_BITCODE_STATIC.get((major, debug_major))
-        if bitcode_static is not None:
-            bitcode_static = binascii.unhexlify(bitcode_static)
-        MINIMAL_NVVMIR_CACHE["txt"] = txt
-        MINIMAL_NVVMIR_CACHE["bitcode_dynamic"] = bitcode_dynamic
-        MINIMAL_NVVMIR_CACHE["bitcode_static"] = bitcode_static
-        if bitcode_static is None:
-            if bitcode_dynamic is None:
-                raise RuntimeError("Please `pip install llvmlite` to generate `bitcode_static` (see PR #443)")
-            bitcode_hex = binascii.hexlify(bitcode_dynamic).decode("ascii")
-            print("\n\nMINIMAL_NVVMIR_BITCODE_STATIC = { # PLEASE ADD TO test_nvvm.py")
-            print(f" ({major}, {debug_major}): # (major, debug_major)")
-            lines = textwrap.wrap(bitcode_hex, width=80)
-            for line in lines[:-1]:
-                print(f' "{line}"')
-            print(f' "{lines[-1]}",')
-            print("}\n", flush=True)
+    major, minor, debug_major, debug_minor = nvvm.ir_version()
+
+    if request.param == "txt":
+        return MINIMAL_NVVMIR_TXT_TEMPLATE % (major, debug_major)
+
+    bitcode_static_binascii = MINIMAL_NVVMIR_BITCODE_STATIC.get((major, debug_major))
+    if bitcode_static_binascii:
+        return binascii.unhexlify(bitcode_static_binascii)
+    raise RuntimeError(
+        "Static bitcode for NVVM IR version "
+        f"{major}.{debug_major} is not available in this test.\n"
+        "Maintainers: Please run the helper script to generate it and add the "
+        "output to the MINIMAL_NVVMIR_BITCODE_STATIC dict:\n"
Contributor

I don't follow why we can't make llvmlite a required testing dependency and always generate the bitcode and eliminate the static testing fixture altogether.

Then we can avoid a lot of this complexity.

Contributor

This is mutually exclusive with my JSON file suggestion.

Collaborator Author

> I don't follow why we can't make llvmlite a required testing dependency and always generate the bitcode and eliminate the static testing fixture altogether.

Ah, that's the whole point of this PR: remove llvmlite even as an optional test dependency.

@leofang please correct me if I'm wrong, but I got the impression you were skeptical of the llvmlite dependency all along.

Originally I wanted to keep things simple and make llvmlite a required dependency.

Before we had the recent llvmlite v0.45 related breakage (see #988 and numba/llvmlite#1297), I had it on my list to simplify the test_nvvm.py code, with llvmlite as a hard dependency.

But after the breakage, and seeing how @rparolin stumbled even over the optional dependency last week (see team chat), I came to think it's more trouble than it's worth to have it even as an optional dependency.

With this PR, the cuda_bindings unit tests will be fully isolated from llvmlite, so we won't stumble that much in the future.

Probably only every couple of years will someone have to dust off the helper script to add a new entry to test_nvvm.py. Estimated effort: 30-60 minutes, depending on the background of whoever takes this on.

Contributor

Won't we still need llvmlite eventually to regenerate the bitcode?

Collaborator Author

Yes, but only "offline", in the sense that the routine cuda_bindings unit tests (CI) don't need it.

Based on the few months of related experience that I have:

  • I expect that test_nvvm.py will only break if we're testing new CTK releases.
  • Fixing up test_nvvm.py will be just one of potentially several adjustments we have to make for new CTK versions.
  • Chances of test_nvvm.py breakages due to minor-version CTK releases (roughly every couple of months) are probably small.
  • Chances of test_nvvm.py breakages due to major-version CTK releases (every couple of years) are significant.

Making llvmlite a required dependency means that we'll be living at the bleeding edge: random breakages at random times (from NVIDIA's viewpoint) that need immediate attention.

With the static bitcode inputs, we'll only need to tend to the test when there are CTK changes (release schedule controlled by NVIDIA).


For completeness: There are other ways to convert the txt version to bitcode. I don't remember the details, but I believe there are LLVM command-line tools that could be used instead. What's most suitable might change in the future, but I'm guessing that as long as numba-cuda uses llvmlite, it'll be a good choice.
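As an illustration of the "other ways" mentioned above, here is a minimal sketch that shells out to LLVM's `llvm-as` assembler instead of using llvmlite. The function names are hypothetical and not part of this PR. It is an untested assumption that bitcode produced by whatever LLVM version happens to be installed would be accepted by a given libNVVM release.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def llvm_as_command(ll_path, bc_path, tool="llvm-as"):
    """Build the argv for assembling textual IR (.ll) into bitcode (.bc)."""
    return [tool, str(ll_path), "-o", str(bc_path)]


def assemble_with_llvm_as(ir_text: bytes, tool="llvm-as") -> bytes:
    """Assemble textual IR into bitcode by running llvm-as on temp files.

    Assumption: the installed LLVM emits bitcode that libNVVM can read.
    """
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} not found on PATH")
    with tempfile.TemporaryDirectory() as tmp:
        ll_path = Path(tmp) / "input.ll"
        bc_path = Path(tmp) / "output.bc"
        ll_path.write_bytes(ir_text)
        subprocess.run(llvm_as_command(ll_path, bc_path, tool), check=True)
        return bc_path.read_bytes()
```

Like the llvmlite route, this would only be needed offline when a new static entry has to be generated, not in routine test runs.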

+        " ../../toolshed/build_static_bitcode_input.py"
+    )


@pytest.fixture(params=[nvvm.compile_program, nvvm.verify_program])
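
The lookup performed by the new `minimal_nvvmir` fixture can be tried in isolation, without a CUDA installation. The sketch below is a simplified stand-in: `BITCODE_STATIC_HEX`, `lookup_static_bitcode`, and the version tuple are hypothetical, and the hex payload is only the 4-byte LLVM bitcode magic number, not a real module.

```python
import binascii

# Hypothetical stand-in for MINIMAL_NVVMIR_BITCODE_STATIC. The payload is just
# the LLVM bitcode magic ("BC" 0xC0 0xDE), not a usable NVVM IR module.
BITCODE_STATIC_HEX = {
    (2, 3): "4243c0de",  # keyed by (major, debug_major)
}


def lookup_static_bitcode(ir_version):
    """Return the checked-in bitcode for a 4-tuple IR version, or raise."""
    major, _minor, debug_major, _debug_minor = ir_version
    hex_text = BITCODE_STATIC_HEX.get((major, debug_major))
    if hex_text:
        return binascii.unhexlify(hex_text)
    raise RuntimeError(
        f"Static bitcode for NVVM IR version {major}.{debug_major} is not available."
    )
```

Only the major and debug-major components take part in the key, so a minor-version bump alone does not invalidate an existing entry.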
53 changes: 53 additions & 0 deletions toolshed/build_static_bitcode_input.py
@@ -0,0 +1,53 @@
+#!/usr/bin/env python3
+
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES.
+# SPDX-License-Identifier: LicenseRef-NVIDIA-SOFTWARE-LICENSE
+
+"""
+Helper to produce static bitcode input for test_nvvm.py.
+
+Usage:
+    python toolshed/build_static_bitcode_input.py
+
+It will print a ready-to-paste MINIMAL_NVVMIR_BITCODE_STATIC entry for the
+current NVVM IR version detected at runtime.
+"""
+
+import binascii
+import os
+import sys
+import textwrap
+
+import llvmlite.binding  # HINT: pip install llvmlite
+from cuda.bindings import nvvm
+
+
+def get_minimal_nvvmir_txt_template():
Contributor

Can't you just check in a JSON file that contains the bitcode? Why do we need the sys.path munging and print-generated dictionary?

Collaborator Author

@rwgk, Sep 29, 2025

Where would you put the JSON file?

What I had in mind: avoid creating extra artifacts.
But that's something I could do if you prefer.

Re printing out the generated dictionary: I don't want to over-engineer a helper script that's used only rarely (possibly only every couple years). The main goal is to archive "how it worked last time", so that future maintainers don't have to start from scratch. (Back in February it took me more than a couple hours to figure out the approach.)

+    cuda_bindings_tests_dir = os.path.normpath("cuda_bindings/tests")
+    assert os.path.isdir(cuda_bindings_tests_dir), (
+        "Please run this helper script from the cuda-python top-level directory."
+    )
+    sys.path.insert(0, os.path.abspath(cuda_bindings_tests_dir))
+    import test_nvvm
+
+    return test_nvvm.MINIMAL_NVVMIR_TXT_TEMPLATE
+
+
+def main():
+    major, _minor, debug_major, _debug_minor = nvvm.ir_version()
+    txt = get_minimal_nvvmir_txt_template() % (major, debug_major)
+    bitcode_dynamic = llvmlite.binding.parse_assembly(txt.decode()).as_bitcode()
+    bitcode_hex = binascii.hexlify(bitcode_dynamic).decode("ascii")
+    print("\n\nMINIMAL_NVVMIR_BITCODE_STATIC = { # PLEASE ADD TO test_nvvm.py")
+    print(f" ({major}, {debug_major}): # (major, debug_major)")
+    lines = textwrap.wrap(bitcode_hex, width=80)
+    for line in lines[:-1]:
+        print(f' "{line}"')
+    print(f' "{lines[-1]}",')
+    print("}\n", flush=True)
+    print()
+
+
+if __name__ == "__main__":
+    assert len(sys.argv) == 1, "This helper script does not take any arguments."
+    main()
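
The helper's output relies on Python joining adjacent string literals, so the 80-column hex lines it prints decode back to the original bytes once pasted into the dict. The sketch below checks that round trip with dummy data. `format_hex_literal` is a hypothetical name, and the payload is arbitrary bytes rather than real bitcode.

```python
import ast
import binascii
import textwrap


def format_hex_literal(payload: bytes, width: int = 80) -> str:
    """Render bytes as a parenthesized run of adjacent hex string literals."""
    hex_text = binascii.hexlify(payload).decode("ascii")
    # The hex text has no whitespace, so wrap() cuts it into width-sized chunks.
    lines = textwrap.wrap(hex_text, width=width)
    return "(\n" + "\n".join(f'    "{line}"' for line in lines) + "\n)"


payload = bytes(range(256))  # dummy data standing in for real bitcode
literal = format_hex_literal(payload)

# Adjacent string literals are concatenated when the expression is parsed,
# so evaluating the literal yields the full hex string again.
round_tripped = binascii.unhexlify(ast.literal_eval(literal))
assert round_tripped == payload
```

The sketch assumes a non-empty payload; an empty one would produce no lines to print.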