cuTile Python is a programming language for NVIDIA GPUs. The official documentation can be found on docs.nvidia.com, or built from source located in the docs folder.

# This example uses CuPy, which can be installed via `pip install cupy-cuda13x`.
# Make sure CUDA Toolkit 13.1+ is installed: https://developer.nvidia.com/cuda-downloads
import cuda.tile as ct
import cupy

TILE_SIZE = 16


# cuTile kernel for adding two dense vectors. It runs in parallel on the GPU.
@ct.kernel
def vector_add_kernel(a, b, result):
    block_id = ct.bid(0)
    a_tile = ct.load(a, index=(block_id,), shape=(TILE_SIZE,))
    b_tile = ct.load(b, index=(block_id,), shape=(TILE_SIZE,))
    result_tile = a_tile + b_tile
    ct.store(result, index=(block_id,), tile=result_tile)


# Host-side function that launches the above kernel.
def vector_add(a: cupy.ndarray, b: cupy.ndarray, result: cupy.ndarray):
    assert a.shape == b.shape == result.shape
    grid = (ct.cdiv(a.shape[0], TILE_SIZE), 1, 1)
    ct.launch(cupy.cuda.get_current_stream(), grid, vector_add_kernel, (a, b, result))


import numpy as np


def test_vector_add():
    a = cupy.random.uniform(-5, 5, 128)
    b = cupy.random.uniform(-5, 5, 128)
    result = cupy.zeros_like(a)
    vector_add(a, b, result)
    a_np = cupy.asnumpy(a)
    b_np = cupy.asnumpy(b)
    result_np = cupy.asnumpy(result)
    expected = a_np + b_np
    np.testing.assert_array_almost_equal(result_np, expected)


test_vector_add()

cuTile Python generates kernels based on Tile IR, which requires NVIDIA Driver r580 or later to run.
Furthermore, in the 13.1 release the tileiras compiler supports only Blackwell GPUs; this
restriction will be removed in upcoming versions.
Check out the prerequisites for the full list of requirements.
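
The driver requirement above can be checked before any kernel is launched. The following is a minimal sketch and not part of cuTile: it assumes that "r580" refers to driver versions whose major number is 580 or higher, and that nvidia-smi is available on the PATH. The helper names (parse_driver_major, driver_is_new_enough, query_driver_version) are hypothetical.

```python
import subprocess

MIN_DRIVER_MAJOR = 580  # assumption: "r580" means driver major version 580


def parse_driver_major(version: str) -> int:
    """Return the major component of a driver version string such as '580.65.06'."""
    return int(version.strip().split(".")[0])


def driver_is_new_enough(version: str, minimum: int = MIN_DRIVER_MAJOR) -> bool:
    """Compare the driver's major version against the assumed minimum."""
    return parse_driver_major(version) >= minimum


def query_driver_version() -> str:
    """Ask nvidia-smi for the driver version reported for the first GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return out.splitlines()[0].strip()
```

On a machine with an NVIDIA GPU, driver_is_new_enough(query_driver_version()) would then report whether the installed driver meets the assumed threshold. The sketch does not check the GPU architecture.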
cuTile Python is published on PyPI under the
cuda-tile package name and can be installed with pip:
pip install cuda-tile
Currently, CUDA Toolkit 13.1+ is required and must be installed separately.
On a Debian-based system, if you wish to avoid installing the full CUDA Toolkit,
use apt-get install cuda-tileiras-13.1 cuda-compiler-13.1 instead of apt-get install cuda-toolkit-13.1.
cuTile is written mostly in Python, but includes a C++ extension which needs to be built. You will need:
- A C++17-capable compiler, such as GNU C++ or MSVC;
- CMake 3.18+;
- GNU Make on Linux or msbuild on Windows;
- Python 3.10+ with development headers (the venv module is recommended but optional);
- CUDA Toolkit 13.1+.
On an Ubuntu system, the first four dependencies can be installed with APT:
sudo apt-get update && sudo apt-get install build-essential cmake python3-dev python3-venv
The CMakeLists.txt script will also automatically download
the DLPack dependency from GitHub.
If you wish to disable this behavior and provide your own copy of DLPack,
set the CUDA_TILE_CMAKE_DLPACK_PATH environment variable to a local path
to the DLPack source tree.
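
As a sketch of how this override could be scripted, the snippet below validates a local DLPack checkout and prepares the environment for the build. It assumes the source tree follows the upstream DLPack layout, with the header at include/dlpack/dlpack.h; the helper name dlpack_build_env is hypothetical and not part of cuTile.

```python
import os
from pathlib import Path


def dlpack_build_env(dlpack_root: str) -> dict:
    """Return a copy of os.environ with CUDA_TILE_CMAKE_DLPACK_PATH set.

    Raises FileNotFoundError if the directory does not look like a DLPack
    source tree (assumption: upstream layout, include/dlpack/dlpack.h).
    """
    root = Path(dlpack_root).expanduser().resolve()
    header = root / "include" / "dlpack" / "dlpack.h"
    if not header.is_file():
        raise FileNotFoundError(f"no DLPack header at {header}")
    env = dict(os.environ)  # copy, so the current process is left untouched
    env["CUDA_TILE_CMAKE_DLPACK_PATH"] = str(root)
    return env
```

The returned mapping can be passed as the env argument of subprocess.run when invoking the build from the source root, which keeps the override local to that one command.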
Unless you are already using a Python virtual environment, it is recommended to create one in order to avoid installing cuTile globally:
python3 -m venv env
source env/bin/activate
Once the build dependencies are in place, the simplest way to build cuTile is to install it in editable mode by running the following command in the source root directory:
pip install -e .
This will create the build directory and invoke the CMake-based build process.
In editable mode, the compiled extension module will be placed in the build directory,
and then a symbolic link to it will be created in the source directory.
This ensures that the pip install -e . command above is needed only once. After changing
the C++ code, the extension can be recompiled with make -C build,
which is much faster. This logic is defined in setup.py.
cuTile uses the pytest framework for testing. Tests have extra dependencies, such as PyTorch, which can be installed with
pip install -r test/requirements.txt
The tests are located in the test/ directory. To run a specific test file,
for example test_copy.py, use the following command:
pytest test/test_copy.py
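
For orientation, a test in this style could compare a kernel's output against a CPU reference. The sketch below is hypothetical and is not taken from the test/ directory: it emulates the tile decomposition of the vector-add example in pure Python (one loop iteration per block ID, TILE_SIZE elements per tile) so that it runs without a GPU. Clipping the last tile to the vector length is a choice made for this emulation and says nothing about how cuTile itself handles partial tiles.

```python
import random

TILE_SIZE = 16


def cdiv(n: int, d: int) -> int:
    """Ceiling division: the number of tiles needed to cover n elements."""
    return (n + d - 1) // d


def vector_add_tiled_reference(a, b, tile_size=TILE_SIZE):
    """CPU emulation: handle one tile per block ID, as the launch grid would."""
    assert len(a) == len(b)
    result = [0.0] * len(a)
    for block_id in range(cdiv(len(a), tile_size)):
        start = block_id * tile_size
        stop = min(start + tile_size, len(a))  # clip the last, partial tile
        for i in range(start, stop):
            result[i] = a[i] + b[i]
    return result


def test_vector_add_tiled_reference():
    rng = random.Random(0)  # fixed seed keeps the test deterministic
    a = [rng.uniform(-5, 5) for _ in range(128)]
    b = [rng.uniform(-5, 5) for _ in range(128)]
    expected = [x + y for x, y in zip(a, b)]
    assert vector_add_tiled_reference(a, b) == expected
```

Saved under a file name matching pytest's default test_*.py pattern, the test function would be collected and run like any other test file.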
Copyright © 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
cuTile-Python is licensed under the Apache 2.0 license. See the LICENSES folder for the full license text.