
@clouds56
Contributor

@clouds56 clouds56 commented Dec 24, 2025

Sorry, the previous PR #1527 was closed by mistake and my branch was lost as well, so I have prepared a new PR.

Summary by CodeRabbit

  • Improvements

    • More reliable CUDA detection: now also recognizes CUDA provided via NVIDIA PyPI packages and applies a unified validation step with platform-specific fallbacks to better locate CUDA installations.
  • New Features

    • Added an optional "nvcc" install group to simplify installing nvcc and related helper packages.


@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai
Contributor

coderabbitai bot commented Dec 24, 2025

📝 Walkthrough

Walkthrough

Adds CUDA_HOME discovery from the nvidia-cuda-nvcc PyPI package via importlib.metadata file inspection, a helper to read package versions, refactors CUDA detection to consolidate validation, and adds an nvcc optional-dependency group in pyproject.toml.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CUDA Home Discovery (env changes): tilelang/env.py | Adds importlib.metadata usage and helper `_get_package_version(pkg: str) -> str |
| Optional dependency group: pyproject.toml | Adds new optional-dependencies group nvcc including nvidia-cuda-nvcc>=13.0.48 and nvidia-cuda-cccl>=13.0.50. |

Sequence Diagram(s)

sequenceDiagram
  participant Caller as Caller
  participant Env as tilelang.env._find_cuda_home
  participant PATH as System PATH (nvcc)
  participant Package as importlib.metadata.files("nvidia-cuda-nvcc")
  participant FS as Filesystem

  Caller->>Env: request CUDA_HOME
  Env->>Env: check CUDA_HOME env var & known locations (Guesses `#1`)
  Env->>PATH: is nvcc on PATH? (Guess `#2`)
  alt nvcc on PATH
    PATH-->>Env: nvcc path
    Env->>FS: resolve grandparent -> candidate cuda_home
  else nvcc not on PATH
    Env->>Package: inspect package files (Guess `#3`)
    alt package provides nvcc
      Package-->>Env: nvcc file path
      Env->>FS: resolve grandparent -> candidate cuda_home
    else
      Env->>FS: fallback checks for standard CUDA installs (Guess `#4`)
    end
  end
  Env->>FS: validate candidate cuda_home (unified step)
  alt valid
    Env-->>Caller: CUDA_HOME (path)
  else invalid
    Env-->>Caller: "" (empty)
  end
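The flow in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in tilelang/env.py: the function name find_cuda_home_sketch is hypothetical, Guess 1 is reduced to the CUDA_HOME and CUDA_PATH environment variables, and Guess 4 is reduced to a single conventional Linux path.

```python
import os
import shutil
from importlib import metadata
from typing import Optional


def find_cuda_home_sketch() -> str:
    """Return a CUDA home guess, or "" when nothing valid is found."""
    # Guess 1: explicit environment variables (empty strings count as unset).
    candidate: Optional[str] = (
        os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH") or None
    )

    if candidate is None:
        # Guess 2: nvcc on PATH; CUDA home is the grandparent of .../bin/nvcc.
        nvcc = shutil.which("nvcc")
        if nvcc is not None:
            candidate = os.path.dirname(os.path.dirname(nvcc))

    if candidate is None:
        # Guess 3: nvcc shipped inside the nvidia-cuda-nvcc wheel.
        try:
            files = metadata.files("nvidia-cuda-nvcc") or []  # may be None
        except metadata.PackageNotFoundError:
            files = []
        for entry in files:
            if entry.name in ("nvcc", "nvcc.exe"):
                candidate = str(entry.locate().parent.parent)
                break

    if candidate is None and os.path.exists("/usr/local/cuda"):
        # Guess 4: a conventional install location (assumed, Linux only).
        candidate = "/usr/local/cuda"

    # Unified validation: only return a path that actually exists.
    return candidate if candidate and os.path.exists(candidate) else ""
```

Validating once at the end, rather than inside each guess, keeps every branch subject to the same existence check.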

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • LeiWang1999

Poem

🐰 I sniffed the packages, followed nvcc's trail,

From package files and PATH I wagged my tail.
I hopped through guesses, checked each cozy nook,
Found CUDA's home with one determined look.
Hop, grab a carrot — the build's on track! 🥕

Pre-merge checks

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main enhancement: adding support for the nvidia-cuda-nvcc PyPI package as an alternative CUDA detection source. |

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 605867b and 2818cd9.

📒 Files selected for processing (1)
  • pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with Nightly-CUDA-13.0
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with Nightly-CUDA-13.0
  • GitHub Check: Build SDist


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tilelang/env.py (1)

91-93: Non-deterministic CUDA version selection when multiple versions are installed.

glob.glob() returns paths in arbitrary filesystem order. If multiple CUDA versions are installed (e.g., v11.8, v12.0, v12.4), selecting cuda_homes[0] gives unpredictable results across runs or machines.

🔎 Proposed fix to prefer the latest CUDA version
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if cuda_homes:
+                # Sort to prefer the latest version (e.g., v12.4 over v11.8)
+                cuda_homes.sort(reverse=True)
+                cuda_home = cuda_homes[0]
+            else:
+                cuda_home = ""
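One caveat with a plain string sort is that it compares version directories lexicographically, so a single-digit major version such as v9.2 sorts after v12.4. A version-aware key avoids this. The sketch below is an illustration only; the helper names version_key and pick_latest and the shortened example paths are hypothetical and not part of the PR.

```python
import os
import re
from typing import List, Optional, Tuple


def version_key(path: str) -> Tuple[int, ...]:
    """Turn '.../CUDA/v12.4' into (12, 4); unparsable names sort first."""
    name = os.path.basename(path.rstrip("/\\"))
    match = re.fullmatch(r"v(\d+)\.(\d+)", name)
    return tuple(int(part) for part in match.groups()) if match else (-1,)


def pick_latest(cuda_homes: List[str]) -> Optional[str]:
    """Return the highest-versioned directory, or None for an empty list."""
    return max(cuda_homes, key=version_key) if cuda_homes else None


homes = ["C:/CUDA/v9.2", "C:/CUDA/v12.4", "C:/CUDA/v11.8"]
print(sorted(homes)[-1])   # C:/CUDA/v9.2  (lexicographic order)
print(pick_latest(homes))  # C:/CUDA/v12.4 (numeric order)
```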
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between d140415 and e801a01.

📒 Files selected for processing (1)
  • tilelang/env.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: clouds56
Repo: tile-ai/tilelang PR: 1527
File: tilelang/env.py:0-0
Timestamp: 2025-12-24T17:20:27.444Z
Learning: The nvidia-cuda-nvcc PyPI package installs to `nvidia/cu13/bin/` (for CUDA 13), `nvidia/cu12/bin/` (for CUDA 12), and `nvidia/cu11/bin/` (for CUDA 11) in the site-packages directory, not to `nvidia/cuda_nvcc/bin/`. These paths should be used when detecting CUDA installations from PyPI packages in tilelang/env.py.
📚 Learning: 2025-12-24T17:20:27.444Z
Learnt from: clouds56
Repo: tile-ai/tilelang PR: 1527
File: tilelang/env.py:0-0
Timestamp: 2025-12-24T17:20:27.444Z
Learning: The nvidia-cuda-nvcc PyPI package installs to `nvidia/cu13/bin/` (for CUDA 13), `nvidia/cu12/bin/` (for CUDA 12), and `nvidia/cu11/bin/` (for CUDA 11) in the site-packages directory, not to `nvidia/cuda_nvcc/bin/`. These paths should be used when detecting CUDA installations from PyPI packages in tilelang/env.py.

Applied to files:

  • tilelang/env.py
🔇 Additional comments (2)
tilelang/env.py (2)

69-87: LGTM - PyPI package detection correctly implements nvidia-cuda-nvcc paths.

The candidate paths (nvidia/cu13/bin/, nvidia/cu12/bin/, nvidia/cu11/bin/) correctly match the nvidia-cuda-nvcc PyPI package installation structure. The priority order prefers newer CUDA versions, and the fallback to nvidia/cuda_nvcc/bin/ is a reasonable defensive addition. Based on learnings from the previous PR.


98-99: Verify HPC SDK fallback path is intentional.

The path /opt/nvidia/hpc_sdk/Linux_x86_64 is the HPC SDK root, not a valid CUDA_HOME. The actual CUDA installation within HPC SDK is typically nested at /opt/nvidia/hpc_sdk/Linux_x86_64/<version>/cuda/<cuda_version>.

This differs from Guess #2's handling (lines 62-64), which navigates from the nvcc binary to find the correct SDK path. If this fallback is reached, it's unlikely to point to a usable CUDA installation—though the validation at lines 101-103 will safely reject it.

Is this a best-effort guess that relies on the subsequent validation, or should this path be removed/adjusted for consistency?
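If the fallback were adjusted rather than removed, one option would be to search the nested layout described above. The following is only a sketch under that assumption: the function name hpc_sdk_cuda_candidates is hypothetical, and the <version>/cuda/<cuda_version> layout is taken from the comment above rather than verified against a real HPC SDK installation.

```python
import glob
import os
from typing import List


def hpc_sdk_cuda_candidates(
    root: str = "/opt/nvidia/hpc_sdk/Linux_x86_64",
) -> List[str]:
    """List <root>/<version>/cuda/<cuda_version> directories that contain bin/nvcc."""
    pattern = os.path.join(root, "*", "cuda", "*")
    return sorted(
        path
        for path in glob.glob(pattern)
        if os.path.isfile(os.path.join(path, "bin", "nvcc"))
    )
```

Requiring bin/nvcc inside each candidate means the SDK root itself is never returned as a CUDA home.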

Collaborator

@oraluben oraluben left a comment

The idea LGTM. With this PR, tilelang can work with a plain torch installation (with nvidia-cuda-nvcc installed), without requiring the CUDA toolchain.

I think we can also add nvidia-cuda-nvcc as an optional dependency.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
tilelang/env.py (2)

69-78: Move import to module level for consistency.

The importlib.util import is placed inside the function. Consider moving it to the top of the file alongside other imports for better consistency and readability.

🔎 Proposed refactor

At the top of the file, add the import:

from __future__ import annotations
import sys
import os
import pathlib
import logging
import shutil
import glob
+import importlib.util
from dataclasses import dataclass

Then remove it from inside the function:

     if cuda_home is None:
         # Guess #3
         # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-        import importlib.util
-
         for submodule in ["cu13", "cu12", "cu11", "cuda_nvcc"]:

82-84: Consider sorting CUDA versions when multiple installations exist.

The Windows fallback uses glob.glob which returns matches in arbitrary filesystem order. If multiple CUDA versions are installed, the selected version may be unpredictable. Consider sorting the results to select the latest version consistently.

🔎 Proposed fix to select the latest CUDA version
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if len(cuda_homes) == 0:
+                cuda_home = ""
+            else:
+                # Sort to get the latest version (e.g., v12.8 comes after v11.8)
+                cuda_homes.sort(reverse=True)
+                cuda_home = cuda_homes[0]
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between e801a01 and 5d53b22.

📒 Files selected for processing (2)
  • pyproject.toml
  • tilelang/env.py
🔇 Additional comments (4)
tilelang/env.py (3)

86-90: LGTM!

The Linux/macOS fallback logic correctly checks standard CUDA installation paths in a reasonable priority order.


92-96: LGTM!

The path validation logic correctly ensures that only existing CUDA installations are returned, with proper fallback to an empty string for backward compatibility.


74-78: The CUDA_HOME detection from PyPI packages at lines 74-78 is correct. The spec.submodule_search_locations[0] returns the nvidia/cu13 (or cu12/cu11) directory, which is the proper CUDA_HOME location containing the bin/ subdirectory with nvcc. No changes needed.

pyproject.toml (1)

55-59: The specified versions for both optional dependencies are valid and available on PyPI:

  • nvidia-cuda-nvcc 13.0.48 ✓ exists (released Aug 4, 2025)
  • nvidia-cuda-cccl 13.0.50 ✓ exists (released Aug 4, 2025)

No known security vulnerabilities were found for these versions.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
tilelang/env.py (2)

93-95: Consider simplifying the validation logic.

The validation can be streamlined since checking cuda_home is None and then setting it to None is redundant.

🔎 Proposed simplification
-        # Validate found path
-        if cuda_home is None or not os.path.exists(cuda_home):
-            cuda_home = None
+        # Validate found path exists
+        if cuda_home and not os.path.exists(cuda_home):
+            cuda_home = None
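The two conditions are not fully equivalent when cuda_home is an empty string, which the Windows branch can produce. A small sketch makes the difference visible; the function names validate_original and validate_proposed are hypothetical wrappers around the two conditions quoted above.

```python
import os
from typing import Optional


def validate_original(cuda_home: Optional[str]) -> Optional[str]:
    # Original condition: None, "" and missing paths all become None.
    if cuda_home is None or not os.path.exists(cuda_home):
        cuda_home = None
    return cuda_home


def validate_proposed(cuda_home: Optional[str]) -> Optional[str]:
    # Proposed condition: "" is falsy, so it skips the check and stays "".
    if cuda_home and not os.path.exists(cuda_home):
        cuda_home = None
    return cuda_home


print(repr(validate_original("")))  # None
print(repr(validate_proposed("")))  # ''
```

Whether that difference matters depends on how callers treat "" versus None.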

72-72: Consider moving the import to the module level for better clarity.

While importing importlib.util inside the function works, placing it at the module level (lines 1-8) would improve code organization and make dependencies more visible.

🔎 Proposed change

At the top of the file (after line 7):

 import shutil
 import glob
+import importlib.util
 from dataclasses import dataclass

Then remove the import from line 72:

     if cuda_home is None:
         # Guess #3
         # from pypi package nvidia-cuda-nvcc, nvidia-cuda-nvcc-cu12, etc.
-        import importlib.util
 
         if importlib.util.find_spec("nvidia") is not None:
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 5d53b22 and dfe5550.

📒 Files selected for processing (1)
  • tilelang/env.py
🔇 Additional comments (1)
tilelang/env.py (1)

69-79: The fix correctly prevents exceptions when nvidia packages are not installed.

The guard clause at line 74 (if importlib.util.find_spec("nvidia") is not None:) successfully prevents accessing nvidia submodules when the parent package is missing. Testing confirms that importlib.util.find_spec("nvidia") returns None without raising an exception when the package is not installed, and the conditional structure ensures submodule searches (lines 75-79) never execute in this scenario.
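The reason the guard matters is that importlib.util.find_spec imports the parent package when given a dotted name, and raises ModuleNotFoundError if that parent is missing. A minimal sketch of the guarded lookup follows; the helper name find_submodule_dir is hypothetical and the code is not taken from the PR.

```python
import importlib.util
from typing import Optional


def find_submodule_dir(parent: str, child: str) -> Optional[str]:
    """Return the first search location of parent.child, or None if absent."""
    # Check the parent first: find_spec("parent.child") would import the
    # parent and raise ModuleNotFoundError when it is not installed.
    if importlib.util.find_spec(parent) is None:
        return None
    spec = importlib.util.find_spec(f"{parent}.{child}")
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]
```

For example, find_submodule_dir("nvidia", "cu13") would return None on a machine without any nvidia packages instead of raising.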

@oraluben
Collaborator

oraluben commented Dec 26, 2025

with this PR, tilelang can work with a plain torch installation (with nvidia-cuda-nvcc installed), without requiring cuda toolchain.

Would you mind making this work (e.g. docker run -ti --rm --gpus all ubuntu, then installing just nvcc and torch via pip inside the container)? Currently I get the following error in that scenario:

(venv) root@8025c5faee4e:/# python /t/examples/gemm/example_gemm.py 
/venv/lib/python3.12/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled.
We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
/venv/lib/python3.12/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled.
We recommend installing via `pip install torch-c-dlpack-ext`
  warnings.warn(
2025-12-26 03:21:42  [TileLang:tilelang.jit.kernel:INFO]: TileLang begins to compile kernel `gemm` with `out_idx=[-1]`
Traceback (most recent call last):
  File "/t/examples/gemm/example_gemm.py", line 67, in <module>
    main()
  File "/t/examples/gemm/example_gemm.py", line 30, in main
    kernel = matmul(1024, 1024, 1024, 128, 128, 32)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 423, in __call__
    kernel = self.compile(*args, **kwargs, **tune_params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 355, in compile
    kernel_result = compile(
                    ^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/__init__.py", line 99, in compile
    return cached(
           ^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/cache/__init__.py", line 30, in cached
    return _kernel_cache_instance.cached(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/cache/kernel_cache.py", line 236, in cached
    kernel = JITKernel(
             ^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/kernel.py", line 137, in __init__
    adapter = self._compile_and_create_adapter(func, out_idx)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/jit/kernel.py", line 242, in _compile_and_create_adapter
    artifact = tilelang.lower(
               ^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 275, in lower
    codegen_mod = device_codegen(device_mod, target) if enable_device_compile else device_codegen_without_compile(device_mod, target)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 198, in device_codegen
    device_mod = tvm.ffi.get_global_func(global_func)(device_mod, target)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python/tvm_ffi/cython/function.pxi", line 923, in tvm_ffi.core.Function.__call__
  File "<unknown>", line 0, in tvm::codegen::BuildTileLangCUDA(tvm::IRModule, tvm::Target)
  File "python/tvm_ffi/cython/function.pxi", line 1077, in tvm_ffi.core.tvm_ffi_callback
  File "/venv/lib/python3.12/site-packages/tilelang/engine/lower.py", line 114, in tilelang_callback_cuda_compile
    ptx = nvcc.compile_cuda(

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 77, in compile_cuda
    cmd = [get_nvcc_compiler()]

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 592, in get_nvcc_compiler
    return os.path.join(find_cuda_path(), "bin", "nvcc")

  File "/venv/lib/python3.12/site-packages/tilelang/contrib/nvcc.py", line 275, in find_cuda_path
    raise RuntimeError(

RuntimeError: Failed to automatically detect CUDA installation. Please set the CUDA_HOME environment variable manually (e.g., export CUDA_HOME=/usr/local/cuda).

Here's my workaround for autodetection failure:

diff --git a/examples/gemm/example_gemm.py b/examples/gemm/example_gemm.py
index dfa43112..c945d8eb 100644
--- a/examples/gemm/example_gemm.py
+++ b/examples/gemm/example_gemm.py
@@ -2,7 +2,7 @@ import tilelang
 import tilelang.language as T
 
 
-@tilelang.jit(out_idx=[-1])
+@tilelang.jit(out_idx=[-1], target='cuda')
 def matmul(M, N, K, block_M, block_N, block_K, dtype=T.float16, accum_dtype=T.float32):
     @T.prim_func
     def gemm(

@clouds56
Contributor Author

clouds56 commented Dec 26, 2025

@oraluben which Dockerfile are you using?
You could manually install nvidia-cuda-nvcc in the Dockerfile, via pip install nvidia-cuda-nvcc nvidia-cuda-cccl or uv add nvidia-cuda-nvcc nvidia-cuda-cccl, or uv add "cuda-toolkit[nvcc,cccl]", or uv add tilelang --optional nvcc

@oraluben
Collaborator

@oraluben which Dockerfile are you using? You could manually install nvidia-cuda-nvcc in the Dockerfile, via pip install nvidia-cuda-nvcc nvidia-cuda-cccl or uv add nvidia-cuda-nvcc nvidia-cuda-cccl, or uv add "cuda-toolkit[nvcc,cccl]", or uv add tilelang --optional nvcc

I ran into the error with nvidia-cuda-nvcc installed.

@clouds56
Contributor Author

clouds56 commented Dec 28, 2025

Sorry, I am having trouble setting up a Docker environment with libcuda.so.1 to reproduce this (my machines either cannot run Docker or have no GPU). Could you run the following in your container?

python -c "import nvidia.cu13; print('1: done')"
python -c "import tilelang; print('2:', repr(tilelang.env.CUDA_HOME))"
python -c "import os; print('3:', os.environ.get('CUDA_HOME', '<not present>'))"
python -c "import os; print('4:', os.environ.get('CUDA_PATH', '<not present>'))"

One idea: you might have CUDA_HOME accidentally set to an empty string, in which case it would not pass the if cuda_home is None check.
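To illustrate the suspicion: an environment variable that is set but empty comes back as "", not None, so an is None test treats it as a real value. The sketch below shows one way to normalize this; the helper name read_cuda_home is hypothetical and the code is not the PR's implementation.

```python
import os
from typing import Mapping, Optional


def read_cuda_home(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return CUDA_HOME or CUDA_PATH, mapping unset and empty values to None."""
    return environ.get("CUDA_HOME") or environ.get("CUDA_PATH") or None


empty_env = {"CUDA_HOME": ""}
print(empty_env.get("CUDA_HOME") is None)  # False: "" is not None
print(read_cuda_home(empty_env) is None)   # True after normalization
```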

Collaborator

@oraluben oraluben left a comment

I've validated that this PR works as expected, i.e., tilelang works from a raw ubuntu container with docker run --gpus and pip install 'tilelang[nvcc]' (it's the wheel name, not tilelang to be precise). cc @LeiWang1999

With some comments.

@oraluben oraluben changed the title add detection of cuda_home from package nvidia-cuda-nvcc [Enhancement][CUDA] Support nvidia-cuda-nvcc as nvcc Dec 30, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
tilelang/env.py (1)

81-92: Windows CUDA detection remains non-deterministic when multiple versions are installed.

As flagged in the previous review, glob.glob() does not guarantee a deterministic ordering of results. When multiple CUDA installations exist, the code picks an arbitrary match, which could lead to inconsistent behavior across runs or environments.

🔎 Proposed fix to select the latest CUDA version
     if cuda_home is None:
         # Guess #4
         if sys.platform == "win32":
             cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-            cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+            if cuda_homes:
+                # Sort to get the latest version (e.g., v12.3 over v11.8)
+                cuda_home = sorted(cuda_homes)[-1]
+            else:
+                cuda_home = None
         else:
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between f299e7b and 05caa19.

📒 Files selected for processing (1)
  • tilelang/env.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with Nightly-CUDA-13.1
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with Nightly-CUDA-13.1
  • GitHub Check: Build SDist

tilelang/env.py Outdated
else:
cuda_home = os.path.dirname(os.path.dirname(nvcc_path))

if cuda_home is None:
Collaborator

This if cuda_home is None: duplicates the previous if; please merge it (and Guess 4 below) into the previous one. That would also remove the unnecessary indentation changes in Guess 4 and keep the diff minimal.

Contributor Author

No. Please check twice before you comment.
Not every line here is unnecessary. It's easy to figure out.

If you insist it is unnecessary, please use "Add a suggestion" to show the code you think is correct.

clouds56 and others added 2 commits January 5, 2026 16:28
Co-authored-by: Yichen Yan <oraluben@outlook.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @tilelang/env.py:
- Around line 77-86: The code currently uses `assert False` in the branch where
`_get_package_version("nvidia-cuda-nvcc")` is present but no `nvcc` file is
found; replace that `assert False, "`nvidia-cuda-nvcc` installed but no `nvcc`
found"` with an explicit `raise AssertionError("nvidia-cuda-nvcc installed but
no nvcc found")` so the failure cannot be skipped under Python -O optimizations;
update the branch in the loop that inspects
`importlib.metadata.files("nvidia-cuda-nvcc")` and ensure `cuda_home` remains
untouched when raising the AssertionError.
🧹 Nitpick comments (1)
tilelang/env.py (1)

90-92: Consider sorting CUDA versions on Windows for deterministic selection.

When multiple CUDA versions are installed, glob.glob() doesn't guarantee ordering, which could lead to inconsistent behavior. Sorting the results and selecting the latest version would make the detection more predictable.

🔎 Proposed enhancement
             if sys.platform == "win32":
                 cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-                cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+                if cuda_homes:
+                    # Sort to deterministically select the latest CUDA version
+                    cuda_home = sorted(cuda_homes)[-1]
+                else:
+                    cuda_home = None
             else:

Note: This is a pre-existing issue and not introduced by this PR.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 9625ba6 and de575e8.

📒 Files selected for processing (1)
  • tilelang/env.py
🧰 Additional context used
🪛 Ruff (0.14.10)
tilelang/env.py

86-86: Do not assert False (python -O removes these calls), raise AssertionError()

Replace assert False

(B011)
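The behavior behind B011 is that Python removes assert statements when run with -O, so assert False silently becomes a no-op, while an explicit raise always executes. A minimal sketch, with the hypothetical function name require_nvcc:

```python
def require_nvcc(found: bool) -> None:
    """Fail loudly when the package is installed but ships no nvcc binary."""
    if not found:
        # An explicit raise survives `python -O`; `assert False, "..."`
        # would be stripped and execution would continue.
        raise AssertionError("nvidia-cuda-nvcc installed but no nvcc found")
```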

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with Nightly-CUDA-13.1
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with Nightly-CUDA-13.1
  • GitHub Check: Build SDist
🔇 Additional comments (4)
tilelang/env.py (4)

2-2: LGTM!

The importlib.metadata import is necessary for the new package version detection and file enumeration functionality.


48-52: LGTM!

The helper function cleanly encapsulates package installation checks and avoids repetitive exception handling at call sites.
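A helper of this kind might look like the sketch below. The body is an assumption based on the description above; the actual helper in tilelang/env.py is named _get_package_version and may differ, including in its return type.

```python
from importlib import metadata
from typing import Optional


def get_package_version(pkg: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(pkg)
    except metadata.PackageNotFoundError:
        # Not installed: report None so callers can use a simple truth test.
        return None
```

Call sites can then write if get_package_version("nvidia-cuda-nvcc"): without their own try/except.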


88-98: LGTM!

The refactoring correctly structures the fallback logic to only run when nvcc is neither in PATH nor available via the nvidia-cuda-nvcc package. The control flow is clear and maintains the original behavior.


100-102: LGTM!

The consolidated validation logic is cleaner and ensures all detection paths are validated consistently before returning.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
tilelang/env.py (1)

90-92: Fix Windows CUDA detection non-determinism.

When multiple CUDA versions are installed, glob.glob() returns results in an arbitrary order, causing cuda_homes[0] to select a non-deterministic version. This can lead to inconsistent behavior across runs or environments.

This issue was flagged in previous reviews and remains unaddressed.

🔎 Proposed fix to deterministically select the latest CUDA version
             if sys.platform == "win32":
                 cuda_homes = glob.glob("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v*.*")
-                cuda_home = "" if len(cuda_homes) == 0 else cuda_homes[0]
+                if cuda_homes:
+                    # Sort to deterministically pick the latest version (e.g., v12.3 over v11.8)
+                    cuda_home = sorted(cuda_homes)[-1]
+                else:
+                    cuda_home = None
             else:
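
One caveat with the proposed fix: a plain `sorted()` compares the paths as strings, so `v9.2` would sort after `v12.3`, and `v12.10` before `v12.3`. Below is a minimal sketch of a version-aware selection, assuming the directory names end in `vMAJOR.MINOR`. `pick_latest_cuda_home` is a hypothetical helper name, not part of tilelang, and it returns an empty string rather than `None` on the assumption that callers expect a string.

```python
import re


def pick_latest_cuda_home(candidates):
    """Return the path whose trailing vMAJOR.MINOR is highest, or "" if none match."""
    versioned = []
    for path in candidates:
        # Normalize separators so Windows-style paths are handled too.
        normalized = path.replace("\\", "/").rstrip("/")
        match = re.search(r"v(\d+)\.(\d+)$", normalized)
        if match:
            version = (int(match.group(1)), int(match.group(2)))
            versioned.append((version, path))
    # Compare numerically by (major, minor); the default covers the empty case.
    return max(versioned, default=((), ""))[1]
```

In the Windows branch this could be fed the result of the existing `glob.glob(...)` call, which keeps the selection independent of the order in which the filesystem lists the directories.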
🧹 Nitpick comments (1)
tilelang/env.py (1)

77-86: LGTM! nvidia-cuda-nvcc detection is well-implemented.

The logic correctly detects CUDA installations from the nvidia-cuda-nvcc PyPI package by inspecting package files and deriving CUDA_HOME from the nvcc binary location. The or [] fallback safely handles the case where files() returns None.

However, consider extracting the exception message to a module-level constant to address the Ruff TRY003 hint and improve maintainability:

🔎 Optional refactor to address TRY003

At the top of the file (after logger definition):

_NVCC_NOT_FOUND_ERROR = "`nvidia-cuda-nvcc` installed but no `nvcc` found"

Then update line 86:

             else:
-                raise AssertionError("`nvidia-cuda-nvcc` installed but no `nvcc` found")
+                raise AssertionError(_NVCC_NOT_FOUND_ERROR)
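
For readers following the detection logic discussed in this comment, here is a minimal sketch of deriving a CUDA home from the files recorded for the `nvidia-cuda-nvcc` distribution. It is not the code from the PR: `find_nvcc_home` is a hypothetical name, and the rule "take the parent of the `bin` directory that contains `nvcc`" is an assumption based on the description above. Unlike the PR, the sketch returns `None` instead of raising when the package is installed but no `nvcc` is listed.

```python
import importlib.metadata
from pathlib import Path


def find_nvcc_home(pkg="nvidia-cuda-nvcc"):
    """Return the directory above bin/nvcc for an installed distribution, else None."""
    try:
        # files() may return None when the file list is unavailable.
        files = importlib.metadata.files(pkg) or []
    except importlib.metadata.PackageNotFoundError:
        return None
    for entry in files:
        # Each entry is a path relative to the installation root.
        if entry.name in ("nvcc", "nvcc.exe") and entry.parent.name == "bin":
            # locate() resolves the entry to a path on disk.
            return str(Path(entry.locate()).parent.parent)
    return None
```

Matching on the `bin` directory name rather than a fixed prefix avoids hard-coding `nvidia/cu13`, `nvidia/cu12`, or `nvidia/cu11`.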
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de575e8 and 605867b.

📒 Files selected for processing (1)
  • tilelang/env.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: clouds56
Repo: tile-ai/tilelang PR: 1527
File: tilelang/env.py:0-0
Timestamp: 2025-12-24T17:20:32.819Z
Learning: The nvidia-cuda-nvcc PyPI package installs to `nvidia/cu13/bin/` (for CUDA 13), `nvidia/cu12/bin/` (for CUDA 12), and `nvidia/cu11/bin/` (for CUDA 11) in the site-packages directory, not to `nvidia/cuda_nvcc/bin/`. These paths should be used when detecting CUDA installations from PyPI packages in tilelang/env.py.
📚 Learning: 2025-12-24T17:20:32.819Z

Applied to files:

  • tilelang/env.py
🪛 Ruff (0.14.10)
tilelang/env.py

86-86: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)
  • GitHub Check: Test for Python 3.12 with Nightly-ROCm-7.1 (on self-hosted-amd)
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with Nightly-CUDA-13.1
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with CUDA-12.8
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-24.04-arm with Nightly-CUDA-13.1
  • GitHub Check: Build wheels for Python 3.9 on macos-latest with Metal
  • GitHub Check: Build wheels for Python 3.9 on ubuntu-latest with CUDA-12.8
  • GitHub Check: Build SDist
🔇 Additional comments (2)
tilelang/env.py (2)

48-52: LGTM! Clean package version detection helper.

The implementation correctly handles the PackageNotFoundError exception and returns None when the package is not installed, making it a reliable check for package presence.
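
A helper of the kind described here might look like the sketch below. The name `get_package_version` is illustrative; the PR's own helper is `_get_package_version`, and its body may differ.

```python
import importlib.metadata


def get_package_version(pkg):
    """Return the installed version string of pkg, or None if it is not installed."""
    try:
        return importlib.metadata.version(pkg)
    except importlib.metadata.PackageNotFoundError:
        # Treat a missing distribution as "not installed" instead of an error.
        return None
```

Call sites can then test `get_package_version("nvidia-cuda-nvcc") is not None` without their own `try`/`except`.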


100-104: LGTM! Consolidated validation improves code structure.

The refactored validation logic consolidates path checking in a single location after all detection attempts, reducing duplication and improving maintainability. The final return ensures a consistent string return type (empty string for failure cases).
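
As a rough illustration of a single validation step of this kind, the sketch below normalizes any detection result to a string. `validate_cuda_home` is a hypothetical name, and checking with `os.path.isdir` is an assumption; the PR may validate differently.

```python
import os


def validate_cuda_home(cuda_home):
    """Return cuda_home if it is an existing directory, otherwise ""."""
    if cuda_home and os.path.isdir(cuda_home):
        return cuda_home
    # None, "", and non-existent paths all collapse to the empty string.
    return ""
```

With a final step like this, a detection branch that yields `None` and one that yields `""` behave the same for callers.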

oraluben previously approved these changes Jan 5, 2026
Collaborator

@oraluben oraluben left a comment

LGTM

@oraluben oraluben merged commit 1bcce8b into tile-ai:main Jan 7, 2026
13 checks passed

3 participants