[Enhancement][CUDA] Support `nvidia-cuda-nvcc` as `nvcc` #1528

clouds56 · 2025-12-24T19:27:32Z

Sorry, the previous PR #1527 was closed by mistake and my branch was lost, so I have prepared a new PR.

Summary by CodeRabbit

Improvements
- More reliable CUDA detection: now also recognizes CUDA provided via NVIDIA PyPI packages and applies a unified validation step with platform-specific fallbacks to better locate CUDA installations.
New Features
- Added an optional "nvcc" install group to simplify installing nvcc and related helper packages.


github-actions · 2025-12-24T19:27:41Z

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run `pre-commit run --all-files` in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

coderabbitai · 2025-12-24T19:27:42Z

Walkthrough

Adds CUDA_HOME discovery from the nvidia-cuda-nvcc PyPI package via importlib.metadata file inspection, a helper to read package versions, refactors CUDA detection to consolidate validation, and adds an nvcc optional-dependency group in pyproject.toml.
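
For readers unfamiliar with the mechanism, the package-file inspection named in the walkthrough might look roughly like the sketch below. This is not the PR's actual code: the function name `cuda_home_from_distribution` and the `exe_names` parameter are hypothetical, and taking the grandparent of the `nvcc` binary as CUDA_HOME is an assumption based on the sequence diagram in this thread. Only `importlib.metadata.files`, `PackagePath.locate()`, and `PackageNotFoundError` are standard-library APIs.

```python
import importlib.metadata
import os


def cuda_home_from_distribution(dist_name="nvidia-cuda-nvcc",
                                exe_names=("nvcc", "nvcc.exe")):
    """Hypothetical helper: derive a CUDA_HOME candidate from an installed wheel."""
    try:
        # files() returns None when the distribution has no file record,
        # hence the `or []` fallback.
        files = importlib.metadata.files(dist_name) or []
    except importlib.metadata.PackageNotFoundError:
        # The distribution is not installed at all.
        return None
    for entry in files:
        if entry.name in exe_names:
            nvcc_path = str(entry.locate())  # absolute path on disk
            # <cuda_home>/bin/nvcc -> <cuda_home>
            return os.path.dirname(os.path.dirname(nvcc_path))
    return None
```

The returned path is only a candidate; it would still need the same existence check as every other guess.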

Changes

Cohort / File(s)	Summary
CUDA Home Discovery (env changes) `tilelang/env.py`	Adds `importlib.metadata` usage and helper `_get_package_version(pkg: str) -> str
Optional dependency group `pyproject.toml`	Adds new optional-dependencies group `nvcc` including `nvidia-cuda-nvcc>=13.0.48` and `nvidia-cuda-cccl>=13.0.50`.

Sequence Diagram(s)

sequenceDiagram
  participant Caller as Caller
  participant Env as tilelang.env._find_cuda_home
  participant PATH as System PATH (nvcc)
  participant Package as importlib.metadata.files("nvidia-cuda-nvcc")
  participant FS as Filesystem

  Caller->>Env: request CUDA_HOME
  Env->>Env: check CUDA_HOME env var & known locations (Guesses `#1`)
  Env->>PATH: is nvcc on PATH? (Guess `#2`)
  alt nvcc on PATH
    PATH-->>Env: nvcc path
    Env->>FS: resolve grandparent -> candidate cuda_home
  else nvcc not on PATH
    Env->>Package: inspect package files (Guess `#3`)
    alt package provides nvcc
      Package-->>Env: nvcc file path
      Env->>FS: resolve grandparent -> candidate cuda_home
    else
      Env->>FS: fallback checks for standard CUDA installs (Guess `#4`)
    end
  end
  Env->>FS: validate candidate cuda_home (unified step)
  alt valid
    Env-->>Caller: CUDA_HOME (path)
  else invalid
    Env-->>Caller: "" (empty)
  end
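
The flow in the diagram (try the guesses in order, then validate once) can be sketched as below. It is a simplified illustration under stated assumptions, not the code in `tilelang/env.py`: `pick_candidate`, `find_cuda_home`, `guess_from_env`, and `guess_from_path` are hypothetical names, and reading `CUDA_PATH` in addition to `CUDA_HOME` is an assumption.

```python
import os
import shutil


def guess_from_env():
    # Assumption: either variable may point at the toolkit root.
    return os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH") or None


def guess_from_path():
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    # <cuda_home>/bin/nvcc -> <cuda_home>
    return os.path.dirname(os.path.dirname(nvcc))


def pick_candidate(guesses):
    """Return the first non-None result from an ordered list of guess functions."""
    for guess in guesses:
        candidate = guess()
        if candidate is not None:
            return candidate
    return None


def find_cuda_home(guesses):
    """Unified validation: one existence check, whichever guess produced the path."""
    candidate = pick_candidate(guesses)
    if candidate is None or not os.path.exists(candidate):
        return ""  # the empty string signals "not found"
    return candidate
```

A call such as `find_cuda_home([guess_from_env, guess_from_path])` would cover the first two guesses; further guesses can be appended to the list without touching the validation step.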

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

[Bugfix] Add NVIDIA HPC SDK support in CUDA detection (#974) #976: Also modifies _find_cuda_home detection and CUDA/NVIDIA package-based fallbacks.

Suggested reviewers

LeiWang1999

Pre-merge checks

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main enhancement: adding support for the nvidia-cuda-nvcc PyPI package as an alternative CUDA detection source.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 605867b and 2818cd9.

📒 Files selected for processing (1)

pyproject.toml

🚧 Files skipped from review as they are similar to previous changes (1)

pyproject.toml

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with Nightly-CUDA-13.0
GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with Nightly-CUDA-13.0
GitHub Check: Build SDist

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

tilelang/env.py (1)
91-93: Non-deterministic CUDA version selection when multiple versions are installed.

glob.glob() returns paths in arbitrary filesystem order. If multiple CUDA versions are installed (e.g., v11.8, v12.0, v12.4), selecting cuda_homes[0] gives unpredictable results across runs or machines.
🔎 Proposed fix to prefer the latest CUDA version
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if cuda_homes:
+                # Sort to prefer the latest version (e.g., v12.4 over v11.8)
+                cuda_homes.sort(reverse=True)
+                cuda_home = cuda_homes[0]
+            else:
+                cuda_home = ""
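
One caveat about sorting the path strings directly: the comparison is lexicographic, so `v9.2` would rank above `v12.4`. A version-aware key avoids that. The sketch below is an illustration rather than code from the PR; `cuda_version_key` and `latest_cuda_home` are hypothetical names, and it assumes directory names of the form `v<major>.<minor>`, matching the glob pattern above.

```python
import ntpath
import re


def cuda_version_key(path):
    """Parse '.../CUDA/v12.4' into (12, 4); names that do not match sort first."""
    # ntpath.basename accepts both '/' and '\\' separators on any platform.
    name = ntpath.basename(path.rstrip("/\\"))
    match = re.fullmatch(r"v(\d+)\.(\d+)", name)
    if match is None:
        return (-1, -1)
    return (int(match.group(1)), int(match.group(2)))


def latest_cuda_home(cuda_homes):
    """Pick the highest version numerically; '' when the list is empty."""
    return max(cuda_homes, key=cuda_version_key, default="")
```

With `["…/CUDA/v9.2", "…/CUDA/v12.4", "…/CUDA/v11.8"]`, this selects the `v12.4` entry, whereas a plain string sort would end on `v9.2`.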

📥 Commits

Reviewing files that changed from the base of the PR and between d140415 and e801a01.

📒 Files selected for processing (1)

tilelang/env.py

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: clouds56
Repo: tile-ai/tilelang PR: 1527
File: tilelang/env.py:0-0
Timestamp: 2025-12-24T17:20:27.444Z
Learning: The nvidia-cuda-nvcc PyPI package installs to `nvidia/cu13/bin/` (for CUDA 13), `nvidia/cu12/bin/` (for CUDA 12), and `nvidia/cu11/bin/` (for CUDA 11) in the site-packages directory, not to `nvidia/cuda_nvcc/bin/`. These paths should be used when detecting CUDA installations from PyPI packages in tilelang/env.py.

Applied to files:

tilelang/env.py

🔇 Additional comments (2)

tilelang/env.py (2)

69-87: LGTM - PyPI package detection correctly implements nvidia-cuda-nvcc paths.

The candidate paths (nvidia/cu13/bin/, nvidia/cu12/bin/, nvidia/cu11/bin/) correctly match the nvidia-cuda-nvcc PyPI package installation structure. The priority order prefers newer CUDA versions, and the fallback to nvidia/cuda_nvcc/bin/ is a reasonable defensive addition. Based on learnings from the previous PR.

98-99: Verify HPC SDK fallback path is intentional.

The path /opt/nvidia/hpc_sdk/Linux_x86_64 is the HPC SDK root, not a valid CUDA_HOME. The actual CUDA installation within HPC SDK is typically nested at /opt/nvidia/hpc_sdk/Linux_x86_64/<version>/cuda/<cuda_version>.

This differs from Guess #2's handling (lines 62-64), which navigates from the nvcc binary to find the correct SDK path. If this fallback is reached, it's unlikely to point to a usable CUDA installation—though the validation at lines 101-103 will safely reject it.

Is this a best-effort guess that relies on the subsequent validation, or should this path be removed/adjusted for consistency?

oraluben

This idea LGTM. With this PR, tilelang can work with a plain torch installation (with nvidia-cuda-nvcc installed), without requiring the CUDA toolchain.

I think we can also put nvidia-cuda-nvcc in an optional dependency group.

tilelang/env.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

tilelang/env.py (2)
69-78: Move import to module level for consistency.

The importlib.util import is placed inside the function. Consider moving it to the top of the file alongside other imports for better consistency and readability.
🔎 Proposed refactor

At the top of the file, add the import:
from __future__ import annotations
import sys
import os
import pathlib
import logging
import shutil
import glob
+import importlib.util
from dataclasses import dataclass
Then remove it from inside the function:
     if cuda_home is None:
         # Guess #3
         # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-        import importlib.util
-
         for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:
82-84: Consider sorting CUDA versions when multiple installations exist.

The Windows fallback uses glob.glob which returns matches in arbitrary filesystem order. If multiple CUDA versions are installed, the selected version may be unpredictable. Consider sorting the results to select the latest version consistently.
🔎 Proposed fix to select the latest CUDA version
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if len(cuda_homes) == 0:
+                cuda_home = ""
+            else:
+                # Sort to get the latest version (e.g., v12.8 comes after v11.8)
+                cuda_homes.sort(reverse=True)
+                cuda_home = cuda_homes[0]

📥 Commits

Reviewing files that changed from the base of the PR and between e801a01 and 5d53b22.

📒 Files selected for processing (2)

pyproject.toml
tilelang/env.py

🔇 Additional comments (4)

tilelang/env.py (3)

86-90: LGTM!

The Linux/macOS fallback logic correctly checks standard CUDA installation paths in a reasonable priority order.

92-96: LGTM!

The path validation logic correctly ensures that only existing CUDA installations are returned, with proper fallback to an empty string for backward compatibility.

74-78: The CUDA_HOME detection from PyPI packages at lines 74-78 is correct. The spec.submodule_search_locations[0] returns the nvidia/cu13 (or cu12/cu11) directory, which is the proper CUDA_HOME location containing the bin/ subdirectory with nvcc. No changes needed.

pyproject.toml (1)

55-59: The specified versions for both optional dependencies are valid and available on PyPI:

nvidia-cuda-nvcc 13.0.48 ✓ exists (released Aug 4, 2025)

nvidia-cuda-cccl 13.0.50 ✓ exists (released Aug 4, 2025)

No known security vulnerabilities were found for these versions.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

tilelang/env.py (2)
93-95: Consider simplifying the validation logic.

The validation can be streamlined since checking cuda_home is None and then setting it to None is redundant.
🔎 Proposed simplification
-        # Validate found path
-        if cuda_home is None or not os.path.exists(cuda_home):
-            cuda_home = None
+        # Validate found path exists
+        if cuda_home and not os.path.exists(cuda_home):
+            cuda_home = None
72-72: Consider moving the import to the module level for better clarity.

While importing importlib.util inside the function works, placing it at the module level (lines 1-8) would improve code organization and make dependencies more visible.
🔎 Proposed change

At the top of the file (after line 7):
 import shutil
 import glob
+import importlib.util
 from dataclasses import dataclass
Then remove the import from line 72:
     if cuda_home is None:
         # Guess #3
         # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-        import importlib.util
 
         if importlib.util.find_spec("nvidia") is not None:

📥 Commits

Reviewing files that changed from the base of the PR and between 5d53b22 and dfe5550.

📒 Files selected for processing (1)

tilelang/env.py

🔇 Additional comments (1)

tilelang/env.py (1)

69-79: The fix correctly prevents exceptions when nvidia packages are not installed.

The guard clause at line 74 (if importlib.util.find_spec("nvidia") is not None:) successfully prevents accessing nvidia submodules when the parent package is missing. Testing confirms that importlib.util.find_spec("nvidia") returns None without raising an exception when the package is not installed, and the conditional structure ensures submodule searches (lines 75-79) never execute in this scenario.
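
The reason the guard matters is that `importlib.util.find_spec` imports the parent package when given a dotted name, so asking for `nvidia.cu13` without `nvidia` installed raises `ModuleNotFoundError` instead of returning `None`. A minimal sketch of the guarded lookup, assuming the `cu13`/`cu12`/`cu11` layout from the learnings above (the function name `find_pypi_cuda_home` and its `parent` parameter are hypothetical, not taken from the PR):

```python
import importlib.util


def find_pypi_cuda_home(parent="nvidia", submodules=("cu13", "cu12", "cu11")):
    """Hypothetical helper: return the first installed `<parent>.<sub>` directory."""
    # Check the parent first; find_spec on a top-level name returns None
    # for a missing package instead of raising.
    if importlib.util.find_spec(parent) is None:
        return None
    for sub in submodules:
        spec = importlib.util.find_spec(f"{parent}.{sub}")
        if spec is not None and spec.submodule_search_locations:
            # For a package, the first search location is its directory on disk.
            return list(spec.submodule_search_locations)[0]
    return None
```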

tilelang/env.py

oraluben · 2025-12-26T03:23:42Z

with this PR, tilelang can work with a plain torch installation (with nvidia-cuda-nvcc installed), without requiring cuda toolchain.

Would you mind making this work in a bare container (e.g. `docker run -ti --rm --gpus all ubuntu`, then installing only nvcc and torch via pip inside it)? Currently I get the following error in that scenario:

(venv) root@8025c5faee4e:/# python /t/examples/gemm/example_gemm.py 
/venv/lib/python3.12/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled.
We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
/venv/lib/python3.12/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled.
We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
2025-12-26 03:21:42  [TileLang:tilelang.jit.kernel:INFO]: TileLang begins to compile kernel `gemm` with `out_idx=[-1]`
Traceback (most recent call last):
  File "/t/examples/gemm/example_gemm.py", line 67, in <module>
    main()
  File "/t/examples/gemm/example_gemm.py", line 30, in main
    kernel = matmul(1024, 1024, 1024, 128, 128, 32)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 423, in __call__
    kernel = self.compile(*args, **kwargs, **tune_params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 355, in compile
    kernel_result = compile(
                    ^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 99, in compile
    return cached(
           ^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/cache/__init__.py", line 30, in cached
    return _kernel_cache_instance.cached(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/cache/kernel_cache.py", line 236, in cached
    kernel = JITKernel(
             ^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/kernel.py", line 137, in __init__
    adapter = self._compile_and_create_adapter(func, out_idx)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/kernel.py", line 242, in _compile_and_create_adapter
    artifact = tilelang.lower(
               ^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 275, in lower
    codegen_mod = device_codegen(device_mod, target) if enable_device_compile else device_codegen_without_compile(device_mod, target)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 198, in device_codegen
    device_mod = tvm.ffi.get_global_func(global_func)(device_mod, target)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python/tvm_ffi/cython/function.pxi", line 923, in tvm_ffi.core.Function.__call__
  File "<unknown>", line 0, in tvm::codegen::BuildTileLangCUDA(tvm::IRModule, tvm::Target)
  File "python/tvm_ffi/cython/function.pxi", line 1077, in tvm_ffi.core.tvm_ffi_callback
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 114, in tilelang_callback_cuda_compile
    ptx = nvcc.compile_cuda(

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 77, in compile_cuda
    cmd = [get_nvcc_compiler()]

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 592, in get_nvcc_compiler
    return os.path.join(find_cuda_path(), "bin", "nvcc")

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 275, in find_cuda_path
    raise RuntimeError(

RuntimeError: Failed to automatically detect CUDA installation. Please set the CUDA_HOME environment variable manually (e.g., export CUDA_HOME=/usr/local/cuda).

Here's my workaround for autodetection failure:

diff --git a/examples/gemm/example_gemm.py b/examples/gemm/example_gemm.py
index dfa43112..c945d8eb 100644
--- a/examples/gemm/example_gemm.py
+++ b/examples/gemm/example_gemm.py
@@ -2,7 +2,7 @@ import tilelang
 import tilelang.language as T
 
 
-@tilelang.jit(out_idx=[-1])
+@tilelang.jit(out_idx=[-1], target='cuda')
 def matmul(M, N, K, block_M, block_N, block_K, dtype=T.float16, accum_dtype=T.float32):
     @T.prim_func
     def gemm(

clouds56 · 2025-12-26T19:34:14Z

@oraluben which Dockerfile are you using?
You could manually install nvidia-cuda-nvcc in the Dockerfile via `pip install nvidia-cuda-nvcc nvidia-cuda-cccl`, `uv add nvidia-cuda-nvcc nvidia-cuda-cccl`, `uv add "cuda-toolkit[nvcc,cccl]"`, or `uv add tilelang --optional nvcc`.

oraluben · 2025-12-27T05:42:14Z

@oraluben which Dockerfile are you using? You could manually install nvidia-cuda-nvcc in the Dockerfile, via pip install nvidia-cuda-nvcc nvidia-cuda-cccl or uv add nvidia-cuda-nvcc nvidia-cuda-cccl, or uv add "cuda-toolkit[nvcc,cccl]", or uv add tilelang --optional nvcc

I ran into the error with nvidia-cuda-nvcc installed.

clouds56 · 2025-12-28T00:10:57Z

Sorry, I am having trouble setting up a Docker environment with libcuda.so.1 to reproduce this (my machines either cannot run Docker or have no GPU). Could you run the following in your container?

python -c "import nvidia.cu13; print('1: done')"
python -c "import tilelang; print('2:', repr(tilelang.env.CUDA_HOME))"
python -c "import os; print('3:', os.environ.get('CUDA_HOME', '<not present>'))"
python -c "import os; print('4:', os.environ.get('CUDA_PATH', '<not present>'))"

One possibility: your CUDA_HOME may be accidentally set to an empty string, in which case it would not pass the `if cuda_home is None` check.
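
To illustrate the empty-string case: `os.environ.get` returns `""` rather than `None` for a variable that is set but empty, so an `is None` test treats it as a real value. A small sketch of one way to normalise this, with the hypothetical helper name `cuda_home_from_env` and the assumption that `CUDA_PATH` is consulted as a second choice:

```python
import os


def cuda_home_from_env(environ=os.environ):
    """Hypothetical helper: treat unset and empty variables the same way."""
    # `or` skips both None (unset) and "" (set but empty).
    value = environ.get("CUDA_HOME") or environ.get("CUDA_PATH")
    return value or None
```

With this normalisation, a later `if cuda_home is None:` check falls through to the remaining guesses even when `CUDA_HOME=""` is exported.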

oraluben

I've validated that this PR works as expected, i.e., tilelang works from a raw ubuntu container with `docker run --gpus` and `pip install 'tilelang[nvcc]'` (to be precise, with the wheel's file name in place of `tilelang`). cc @LeiWang1999

With some comments.

tilelang/env.py

pyproject.toml

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tilelang/env.py (1)
81-92: Windows CUDA detection remains non-deterministic when multiple versions are installed.

As flagged in the previous review, glob.glob() does not guarantee a deterministic ordering of results. When multiple CUDA installations exist, the code picks an arbitrary match, which could lead to inconsistent behavior across runs or environments.
🔎 Proposed fix to select the latest CUDA version
     if cuda_home is None:
         # Guess #4
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if cuda_homes:
+                # Sort to get the latest version (e.g., v12.3 over v11.8)
+                cuda_home = sorted(cuda_homes)[-1]
+            else:
+                cuda_home = None
         else:

📥 Commits

Reviewing files that changed from the base of the PR and between f299e7b and 05caa19.

📒 Files selected for processing (1)

tilelang/env.py

tilelang/env.py

oraluben · 2026-01-05T00:28:10Z

tilelang/env.py

            else:
                cuda_home = os.path.dirname(os.path.dirname(nvcc_path))

+    if cuda_home is None:


This `if cuda_home is None:` duplicates the previous `if`; please merge it (and Guess 4 below) into the previous one. That would also remove the unnecessary indentation changes in Guess 4 and keep the diff minimal.

NO. Please check twice before you comment.
Not every single line is unnecessary. It's easy to figure out.

If you insist it is unnecessary, please use "Add a suggestion" to show the code you think is correct.

tilelang/env.py

Co-authored-by: Yichen Yan <oraluben@outlook.com>

coderabbitai

Actionable comments posted: 1

Fix all issues with AI Agents 🤖

In @tilelang/env.py:
- Around line 77-86: The code currently uses `assert False` in the branch where
`_get_package_version("nvidia-cuda-nvcc")` is present but no `nvcc` file is
found; replace that `assert False, "`nvidia-cuda-nvcc` installed but no `nvcc`
found"` with an explicit `raise AssertionError("nvidia-cuda-nvcc installed but
no nvcc found")` so the failure cannot be skipped under Python -O optimizations;
update the branch in the loop that inspects
`importlib.metadata.files("nvidia-cuda-nvcc")` and ensure `cuda_home` remains
untouched when raising the AssertionError.
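
The difference between the two forms can be checked in-process with the built-in `compile`, whose `optimize=1` argument mirrors `python -O`. The sketch below is only a demonstration of that behaviour; `raises_assertion` is a hypothetical helper name and the message strings are placeholders.

```python
def raises_assertion(source, optimize):
    """Compile `source` at the given optimization level and report whether
    running it raises AssertionError."""
    code = compile(source, "<sketch>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return True
    return False


# optimize=0 keeps assert statements; optimize=1 strips them, like `python -O`.
kept = raises_assertion("assert False, 'no nvcc found'", optimize=0)
stripped = raises_assertion("assert False, 'no nvcc found'", optimize=1)
explicit = raises_assertion("raise AssertionError('no nvcc found')", optimize=1)
```

Here `kept` and `explicit` are `True` while `stripped` is `False`, which is why the explicit `raise` is the safer form for a check that must always run.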

🧹 Nitpick comments (1)

tilelang/env.py (1)
90-92: Consider sorting CUDA versions on Windows for deterministic selection.

When multiple CUDA versions are installed, glob.glob() doesn't guarantee ordering, which could lead to inconsistent behavior. Sorting the results and selecting the latest version would make the detection more predictable.
🔎 Proposed enhancement
             if sys.platform == "win32":
                 cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-                cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+                if cuda_homes:
+                    # Sort to deterministically select the latest CUDA version
+                    cuda_home = sorted(cuda_homes)[-1]
+                else:
+                    cuda_home = None
             else:
Note: This is a pre-existing issue and not introduced by this PR.

📥 Commits

Reviewing files that changed from the base of the PR and between 9625ba6 and de575e8.

📒 Files selected for processing (1)

tilelang/env.py

🪛 Ruff (0.14.10)

tilelang/env.py

86-86: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)

🔇 Additional comments (4)

tilelang/env.py (4)

2-2: LGTM!

The importlib.metadata import is necessary for the new package version detection and file enumeration functionality.

48-52: LGTM!

The helper function cleanly encapsulates package installation checks and avoids repetitive exception handling at call sites.

88-98: LGTM!

The refactoring correctly structures the fallback logic to only run when nvcc is neither in PATH nor available via the nvidia-cuda-nvcc package. The control flow is clear and maintains the original behavior.

100-102: LGTM!

The consolidated validation logic is cleaner and ensures all detection paths are validated consistently before returning.

tilelang/env.py

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

tilelang/env.py (1)
90-92: Fix Windows CUDA detection non-determinism.

When multiple CUDA versions are installed, glob.glob() returns results in an arbitrary order, causing cuda_homes[0] to select a non-deterministic version. This can lead to inconsistent behavior across runs or environments.

This issue was flagged in previous reviews and remains unaddressed.
🔎 Proposed fix to deterministically select the latest CUDA version
             if sys.platform == "win32":
                 cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-                cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+                if cuda_homes:
+                    # Sort to deterministically pick the latest version (e.g., v12.3 over v11.8)
+                    cuda_home = sorted(cuda_homes)[-1]
+                else:
+                    cuda_home = None
             else:

🧹 Nitpick comments (1)

tilelang/env.py (1)
77-86: LGTM! nvidia-cuda-nvcc detection is well-implemented.

The logic correctly detects CUDA installations from the nvidia-cuda-nvcc PyPI package by inspecting package files and deriving CUDA_HOME from the nvcc binary location. The or [] fallback safely handles the case where files() returns None.

However, consider extracting the exception message to a module-level constant to address the Ruff TRY003 hint and improve maintainability:
🔎 Optional refactor to address TRY003

At the top of the file (after logger definition):
_NVCC_NOT_FOUND_ERROR = "`nvidia-cuda-nvcc` installed but no `nvcc` found"
Then update line 86:
             else:
-                raise AssertionError("`nvidia-cuda-nvcc` installed but no `nvcc` found")
+                raise AssertionError(_NVCC_NOT_FOUND_ERROR)

📥 Commits

Reviewing files that changed from the base of the PR and between de575e8 and 605867b.

📒 Files selected for processing (1)

tilelang/env.py

🪛 Ruff (0.14.10)

tilelang/env.py

86-86: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with Nightly-CUDA-13.1
GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with Nightly-CUDA-13.1
GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
GitHub Check: Build SDist

🔇 Additional comments (2)

tilelang/env.py (2)

48-52: LGTM! Clean package version detection helper.

The implementation correctly handles the `PackageNotFoundError` exception and returns `None` when the package is not installed, making it a reliable check for package presence.
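
A helper with the behavior described here might look like the sketch below. The body is an assumption based on the review text; the name `get_package_version` is illustrative and the actual helper in `tilelang/env.py` may differ.

```python
import importlib.metadata
from typing import Optional


def get_package_version(pkg: str) -> Optional[str]:
    """Return the installed version of `pkg`, or None if it is not installed."""
    try:
        return importlib.metadata.version(pkg)
    except importlib.metadata.PackageNotFoundError:
        return None
```

Because the function never raises for a missing distribution, a caller can use `get_package_version("nvidia-cuda-nvcc") is not None` as a presence check.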

100-104: LGTM! Consolidated validation improves code structure.

The refactored validation logic consolidates path checking in a single location after all detection attempts, reducing duplication and improving maintainability. The final return ensures a consistent string return type (empty string for failure cases).
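
The consolidation pattern described above can be sketched as: gather candidates from each detection step, validate them in one place, and always return a string. The name `resolve_cuda_home` and the `os.path.isdir` check are assumptions for the example; the PR's actual validation may use a different test.

```python
import os
from typing import Iterable, Optional


def resolve_cuda_home(candidates: Iterable[Optional[str]]) -> str:
    """Return the first candidate that is an existing directory, else ""."""
    for candidate in candidates:
        # None and "" are both treated as "not present".
        if candidate and os.path.isdir(candidate):
            return candidate
    # Single failure path keeps the return type a plain string.
    return ""
```

Each detector (environment variable, `nvcc` on `PATH`, PyPI package, platform default) then only needs to produce a candidate, and the existence check lives in one function.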

oraluben

LGTM

add detection of cuda_home from package nvidia-cuda-nvcc

e801a01

coderabbitai bot reviewed Dec 24, 2025

oraluben requested changes Dec 25, 2025

use importlib.util.find_spec

5d53b22

coderabbitai bot reviewed Dec 25, 2025

fix raise if nvidia not installed

dfe5550

coderabbitai bot reviewed Dec 25, 2025

treat empty CUDA_HOME non present

f299e7b

oraluben requested changes Dec 30, 2025

oraluben changed the title from ~~add detection of cuda_home from package nvidia-cuda-nvcc~~ to [Enhancement][CUDA] Support `nvidia-cuda-nvcc` as `nvcc` on Dec 30, 2025

fix comment

05caa19

coderabbitai bot reviewed Jan 4, 2026

oraluben requested changes Jan 5, 2026

oraluben reviewed Jan 5, 2026

Update tilelang/env.py

9625ba6

oraluben reviewed Jan 5, 2026

clouds56 and others added 2 commits January 5, 2026 16:28

using suggestions

de575e8

Co-authored-by: Yichen Yan <oraluben@outlook.com>

fix lint

605867b

coderabbitai bot reviewed Jan 5, 2026

coderabbitai bot reviewed Jan 5, 2026

oraluben previously approved these changes Jan 5, 2026

oraluben requested review from LeiWang1999 and XuehaiPan January 5, 2026 08:55

Merge branch 'main' into cuda_home

2818cd9

clouds56 dismissed oraluben’s stale review via 2818cd9 January 6, 2026 07:45

LeiWang1999 approved these changes Jan 7, 2026

oraluben merged commit 1bcce8b into tile-ai:main Jan 7, 2026
13 checks passed
