
fix(setup): auto-detect NVIDIA GPU on Linux and install PyTorch cu128#560

Open
smendola wants to merge 4 commits into jamiepine:main from smendola:linux-gpu-support

Conversation


@smendola smendola commented Apr 25, 2026

Problem

`just setup` on Linux with an NVIDIA GPU installs CPU-only PyTorch. The default torch wheel on PyPI satisfies requirements.txt without ever consulting the PyTorch CUDA index, so even users with a capable GPU end up with no CUDA acceleration.

Fix

Detect nvidia-smi at setup time and install torch==2.7.0+cu128 / torchaudio==2.7.0+cu128 via a constraints file from https://download.pytorch.org/whl/cu128.

Why cu128 and not cu130? Driver 570 — the default shipped with Ubuntu 24.04's current NVIDIA packages — supports CUDA 12.8 max. cu130 requires driver 576+. Pinning cu128 means this works for any NVIDIA GPU on driver 520+.
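As a rough Python sketch of the detection logic just described (the real implementation is a shell recipe in the justfile; the function name and the constraints-file path here are illustrative only):

```python
import shutil
import sys


def torch_install_args(platform: str, has_nvidia_smi: bool) -> list[str]:
    """Pick pip arguments for the torch install, per the logic above.

    CUDA wheels are pinned through a constraints file so that resolving
    requirements.txt cannot silently substitute the CPU-only wheel.
    """
    if platform.startswith("linux") and has_nvidia_smi:
        # constraints-cu128.txt would contain:
        #   torch==2.7.0+cu128
        #   torchaudio==2.7.0+cu128
        return [
            "install",
            "--extra-index-url", "https://download.pytorch.org/whl/cu128",
            "-r", "backend/requirements.txt",
            "-c", "constraints-cu128.txt",
        ]
    # macOS and Linux without a working nvidia-smi keep the default path.
    return ["install", "-r", "backend/requirements.txt"]


# In practice the inputs would come from the environment:
args = torch_install_args(sys.platform, shutil.which("nvidia-smi") is not None)
```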

No impact on other platforms:

  • macOS (Apple Silicon or Intel): unchanged, takes the existing path
  • Linux without nvidia-smi: falls through to the same CPU install as before
  • Windows: the [unix] block doesn't run; Windows already has its own GPU detection

--no-deps is also added to the Qwen3-TTS install to prevent it from overriding the pinned torch version.

Test

On a Linux machine with an NVIDIA GPU:

python3.12 -m venv backend/venv
just setup
backend/venv/bin/python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
# → 2.7.0+cu128
# → True

Summary by CodeRabbit

  • New Features

    • Platform-specific CUDA packages and artifacts for Windows and Linux.
    • System-aware PyTorch installation: CUDA 12.8 for Linux GPUs, default for Apple Silicon, CPU-only otherwise.
  • Chores

    • Release pipeline updated to generate, name, and upload platform-tagged CUDA artifacts.
    • Installer tasks adjusted to respect pinned torch versions.

`just setup` on Linux with an NVIDIA GPU previously installed CPU-only
torch because PyPI's default torch wheel satisfies requirements.txt
without consulting the PyTorch CUDA index.

This change detects `nvidia-smi` at setup time and pins
torch==2.7.0+cu128 / torchaudio==2.7.0+cu128 via a constraints file,
pulling from https://download.pytorch.org/whl/cu128. cu128 is chosen
over cu130 because driver 570 (the default on Ubuntu 24.04 with current
NVIDIA packages) supports CUDA 12.8 max; cu130 requires driver 576+.

macOS and non-NVIDIA Linux paths are unchanged. Qwen3-TTS gains
--no-deps to prevent it from overriding the pinned torch version.

The CUDA backend download previously used a single archive name
(voicebox-server-cuda.tar.gz) for all platforms. Only a Windows binary
was published in releases, so Linux users silently downloaded and
extracted a Windows .exe, which was unusable.

Changes:
- backend/services/cuda.py: derive archive names from sys.platform so
  Linux downloads voicebox-server-cuda-linux-x86_64.tar.gz and Windows
  downloads voicebox-server-cuda-windows-x86_64.tar.gz (same for
  cuda-libs archives)
- scripts/package_cuda.py: add required --platform argument that stamps
  the platform identifier into all output archive and checksum filenames
- .github/workflows/release.yml: pass --platform windows-x86_64 to the
  existing Windows job; add build-cuda-linux job on ubuntu-latest that
  installs torch cu128, builds with PyInstaller, packages with
  --platform linux-x86_64, and uploads the Linux archives to the release

No GPU is required in CI — PyInstaller bundles the CUDA .so libraries
from the pip-installed nvidia-* packages at build time.
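The name derivation described in the first bullet can be sketched like this (illustrative helper; it mirrors the win32-or-else split, and the version string is inferred from the archive names in this PR):

```python
import sys

CUDA_LIBS_VERSION = "cu128-v1"  # assumed from the archive names in this PR


def release_archive_names(platform: str = sys.platform) -> tuple[str, str]:
    """Derive platform-suffixed release archive names, per the commit above."""
    plat = "windows-x86_64" if platform == "win32" else "linux-x86_64"
    server_archive = f"voicebox-server-cuda-{plat}.tar.gz"
    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{plat}.tar.gz"
    return server_archive, libs_archive
```

Note that anything other than win32 maps to linux-x86_64; the review comments below flag exactly this gap for macOS and aarch64 Linux.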
Contributor

coderabbitai Bot commented Apr 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c43ffd1e-a7cf-4793-9c65-842b49756696

📥 Commits

Reviewing files that changed from the base of the PR and between 4c3cb97 and 4715a6a.

📒 Files selected for processing (1)
  • .gitignore
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

📝 Walkthrough


Adds platform-specific CUDA packaging and CI builds: renames artifacts with platform suffixes, threads platform through packaging and manifests, adds a Linux CUDA build job, and updates the backend to download platform-suffixed release archives and the local build/install tooling for OS/GPU-aware PyTorch installs.

Changes

  • CI/CD Workflow (.github/workflows/release.yml): Makes Windows CUDA packaging platform-specific (--platform windows-x86_64) and adds a new build-cuda-linux job that builds with CUDA 12.8 on Linux, packages the server + CUDA libs with --platform linux-x86_64, and uploads platform-suffixed archives and the onedir artifact.
  • Packaging Script (scripts/package_cuda.py): Adds a required --platform argument, includes the platform in generated archive filenames and .sha256 sidecars, adds "platform" to the cuda-libs.json manifest, and updates the package(...) signature to accept a platform.
  • Backend CUDA Service (backend/services/cuda.py): Derives the platform (_plat) from sys.platform and appends it to the server core and CUDA libs tarball names when downloading release assets.
  • Build Tooling / Local dev (justfile): Implements OS/GPU-aware PyTorch install logic: Linux with nvidia-smi installs CUDA 12.8-pinned wheels via the cu128 index; Apple Silicon uses default requirements; others use CPU-only requirements. Adds --no-deps when installing Qwen3-TTS.
  • Misc (.gitignore): Adds the generated tauri/src-tauri/gen/schemas/linux-schema.json to the ignore list.

Sequence Diagram(s)

sequenceDiagram
  participant CI as CI Job
  participant Pack as package_cuda.py
  participant GH as GitHub Releases
  participant Backend as backend/services/cuda.py
  participant Local as Local/Runtime

  CI->>Pack: build onedir server (Windows/Linux) + cuda libs
  CI->>Pack: package(..., platform)
  Pack->>GH: upload artifacts (voicebox-server-cuda-{platform}.tar.gz, cuda-libs-{ver}-{platform}.tar.gz) + .sha256 + cuda-libs.json
  Local->>GH: request platform-specific assets
  GH->>Backend: serve platform-suffixed archives
  Backend->>Local: download server core + cuda-libs-{ver}-{platform}.tar.gz

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped through builds and folders wide,
Windows here and Linux by my side.
Archives stamped with platforms fair,
Manifests know just where they fare.
Tiny paws packed CUDA's pride — hooray!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 33.33%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): the title accurately summarizes the main objective of the changeset: adding NVIDIA GPU auto-detection on Linux and installing CUDA-enabled PyTorch cu128. It is concise, specific, and directly reflects the primary purpose across multiple files (justfile, scripts, workflows, and backend services).
  • Linked Issues check (✅ Passed): check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/package_cuda.py (1)

218-223: ⚠️ Potential issue | 🟡 Minor

Stale --torch-compat help text.

The default value on line 221 is ">=2.7.0,<2.11.0" but the help string on line 222 still advertises >=2.6.0,<2.11.0. Update the help to match the actual default (this also matches the value the workflow now passes for both Windows and Linux jobs).

📝 Suggested fix
     parser.add_argument(
         "--torch-compat",
         type=str,
         default=">=2.7.0,<2.11.0",
-        help="Torch version compatibility range (default: >=2.6.0,<2.11.0)",
+        help="Torch version compatibility range (default: >=2.7.0,<2.11.0)",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/package_cuda.py` around lines 218 - 223, Update the help text for the
--torch-compat argument to match the actual default value; in the
parser.add_argument call for "--torch-compat" change the help string from
referencing ">=2.6.0,<2.11.0" to ">=2.7.0,<2.11.0" so the help accurately
reflects the default default=">=2.7.0,<2.11.0" used by the script and CI.
🧹 Nitpick comments (2)
justfile (1)

47-64: Tempfile leak on pip failure; misleading branch message on macOS Intel.

Two minor issues in the new GPU-detection block:

  1. With set -euo pipefail, if {{ pip }} install fails the rm -f "$_constraints" on line 57 never runs and the temp file stays in $TMPDIR. A trap on EXIT ensures cleanup on either path.
  2. The else branch on lines 61–63 also fires on macOS Intel (which is intentional per the PR description) but its message is "No NVIDIA GPU detected — using CPU-only PyTorch.", which reads oddly on a Mac.
♻️ Suggested cleanup
     if [ "$(uname)" = "Linux" ] && command -v nvidia-smi &>/dev/null && nvidia-smi &>/dev/null; then
         echo "NVIDIA GPU detected — installing PyTorch with CUDA 12.8 (cu128)..."
         _constraints=$(mktemp)
+        trap 'rm -f "$_constraints"' EXIT
         printf 'torch==2.7.0+cu128\ntorchaudio==2.7.0+cu128\n' > "$_constraints"
         {{ pip }} install \
             --extra-index-url https://download.pytorch.org/whl/cu128 \
             -r {{ backend_dir }}/requirements.txt \
             -c "$_constraints"
-        rm -f "$_constraints"
     elif [ "$(uname -m)" = "arm64" ] && [ "$(uname)" = "Darwin" ]; then
         echo "Apple Silicon detected — using default PyTorch (MLX path)..."
         {{ pip }} install -r {{ backend_dir }}/requirements.txt
     else
-        echo "No NVIDIA GPU detected — using CPU-only PyTorch."
+        echo "No supported GPU detected — using CPU-only PyTorch."
         {{ pip }} install -r {{ backend_dir }}/requirements.txt
     fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@justfile` around lines 47 - 64, Add a trap to ensure the temporary constraint
file (variable _constraints) is always removed even if {{ pip }} install fails:
create the temp file as before, register a trap 'on EXIT' to rm -f
"$_constraints" (and clear the trap after successful removal), and keep the
existing rm -f "$_constraints" for normal flow; additionally, change the final
branch log message (the echo in the else branch) to avoid Mac-specific wording
(e.g., "No NVIDIA GPU detected — using CPU-only PyTorch.") so it doesn't read
oddly on Intel macOS—use a neutral message like "No NVIDIA GPU detected —
installing CPU-only PyTorch." and leave the arm64/Darwin branch message
unchanged.
backend/services/cuda.py (1)

344-346: Local cuda-libs.json diverges from the manifest published by package_cuda.py.

scripts/package_cuda.py writes a manifest with version, platform, torch_compat, archive, and sha256 (see scripts/package_cuda.py lines 172–178), but here we only persist {"version": CUDA_LIBS_VERSION}. It's safe today because get_installed_cuda_libs_version() only reads version, but you lose the integrity/compat info if you ever want to validate at runtime. Consider either downloading the published cuda-libs.json alongside the archive, or mirroring the same fields locally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/cuda.py` around lines 344 - 346, The local cuda-libs.json
currently only writes {"version": CUDA_LIBS_VERSION} which diverges from the
published manifest format in scripts/package_cuda.py (which includes version,
platform, torch_compat, archive, sha256); update the code that builds the
manifest (the block that calls get_cuda_libs_manifest_path().write_text) to
mirror the published fields — include "platform", "torch_compat", "archive" and
compute the archive "sha256" (or download the published manifest from the same
source as package_cuda.py and write that out) while preserving "version" (use
CUDA_LIBS_VERSION); ensure the produced manifest shape matches the manifest
produced in scripts/package_cuda.py so runtime validation and integrity checks
can use the same fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/release.yml:
- Around line 374-381: The Linux CUDA release job adds a step that installs
Qwen3-TTS (the pip command "pip install --no-deps
git+https://github.com/QwenLM/Qwen3-TTS.git") but the Windows CUDA job
(build-cuda-windows) does not, causing inconsistent artifacts; to fix, either
add an equivalent "Install Qwen3-TTS" step with the same pip command to the
build-cuda-windows job before its "Build CUDA server binary (onedir)" step, or
remove the install step from the Linux CUDA job so both CUDA jobs expose the
same model set—update the workflow so both jobs contain identical model-install
steps.
- Around line 392-403: The Linux release job currently omits the cuda-libs.json
manifest and may write into release-assets/ even on non-tag runs; update the
"Upload archives to GitHub Release" step to either upload a platform-suffixed
manifest (e.g. cuda-libs-cu128-v1-linux-x86_64.json) to match the Windows upload
or remove the manifest from the release altogether if you don't need it
(package_cuda.py still writes plain cuda-libs.json so if you keep it you must
rename it per-platform before upload to avoid name collisions); also make the
"Package into…" step or the upload step conditional so artifacts in
release-assets/ are only produced/uploaded on tag-based releases (or ensure you
clean/skip packaging on non-tag runs) to avoid silently discarding artifacts.
- Around line 342-381: The build-cuda-linux job is missing the disk cleanup step
used in the main release job, causing potential "No space left on device" errors
during heavy CUDA wheel installs and PyInstaller bundling; add a new step
immediately after the actions/checkout@v4 step in the build-cuda-linux job that
runs the same free-up-disk-space commands as the release job (the cleanup run
that clears apt caches, removes unused packages/snap/docker caches and purges
/var/lib/apt/lists, /var/cache/apt, /tmp and /var/tmp to reclaim ~25GB) so that
the subsequent steps (Install Python dependencies, Install PyTorch with CUDA
12.8, Build CUDA server binary) have sufficient disk space.

In `@backend/services/cuda.py`:
- Around line 289-291: The platform detection using _plat incorrectly defaults
all non-win32 to "linux-x86_64"; update the logic around the _plat,
server_archive and libs_archive construction in backend/services/cuda.py to
detect sys.platform and platform.machine(): set _plat to "windows-x86_64" for
"win32", "darwin-x86_64" or "darwin-arm64" (or "darwin-x86_64"/"darwin-arm64"
variants) for macOS, and "linux-aarch64" for aarch64/arm64 Linux and
"linux-x86_64" for x86_64 Linux; if the combination is unsupported, raise a
clear exception (or return an explicit error) before constructing
server_archive/libs_archive (reference symbols: _plat, server_archive,
libs_archive, CUDA_LIBS_VERSION) and ensure the same validation is used by both
the auto-update startup flow and the /backend/download-cuda endpoint so wrong
archives are never requested.

In `@justfile`:
- Around line 75-76: The justfile installs Qwen3-TTS with --no-deps which skips
its runtime dependencies; update backend/requirements.txt to include einops,
gradio, onnxruntime, and sox so a fresh backend install won't hit ImportError,
or alternatively remove the --no-deps flag in the justfile pip install of
Qwen3-TTS and resolve any torch pin conflicts if they arise; modify either the
justfile entry referencing Qwen3-TTS or the backend/requirements.txt to reflect
this change.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ccf46e00-cce6-4bc0-ba0c-4a7cbfbb5d0e

📥 Commits

Reviewing files that changed from the base of the PR and between ed2eec5 and 4c3cb97.

📒 Files selected for processing (4)
  • .github/workflows/release.yml
  • backend/services/cuda.py
  • justfile
  • scripts/package_cuda.py

Comment on lines +342 to +381
  build-cuda-linux:
    runs-on: ubuntu-latest
    permissions:
      contents: write

    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip"

      - name: Install Python dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pyinstaller
          pip install -r backend/requirements.txt
          pip install --no-deps chatterbox-tts
          pip install --no-deps hume-tada

      - name: Install PyTorch with CUDA 12.8
        run: |
          pip install torch==2.7.0+cu128 torchaudio==2.7.0+cu128 \
            --index-url https://download.pytorch.org/whl/cu128 \
            --force-reinstall --no-deps

      - name: Verify CUDA libraries are present
        run: |
          python -c "import torch; print(f'torch: {torch.__version__}'); print(f'CUDA version: {torch.version.cuda}')"

      - name: Install Qwen3-TTS
        run: pip install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git

      - name: Build CUDA server binary (onedir)
        working-directory: backend
        env:
          TORCH_CUDA_ARCH_LIST: "8.0;8.6;8.9;9.0;12.0+PTX"
        run: python build_binary.py --cuda

⚠️ Potential issue | 🟠 Major

build-cuda-linux is missing the "Free up disk space" step used elsewhere on Ubuntu, so it is likely to run out of disk.

The existing release job goes out of its way to free ~25 GB on Ubuntu before building (lines 40–55), with a comment that this is what tripped the March 2026 Linux release attempts. The new CUDA build is strictly heavier (CUDA torch wheels are ~2 GB each, plus PyInstaller bundling all the NVIDIA .so files), and it runs without that cleanup on the same ubuntu-latest image. Expect "No space left on device" in PyInstaller or during tar packaging.

🛠️ Suggested addition (after `actions/checkout@v4`)
     steps:
       - uses: actions/checkout@v4

+      - name: Free up disk space
+        uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be
+        with:
+          tool-cache: false
+          android: true
+          dotnet: true
+          haskell: true
+          large-packages: false
+          swap-storage: true
+
       - name: Setup Python
         uses: actions/setup-python@v5

Comment on lines +374 to +381
      - name: Install Qwen3-TTS
        run: pip install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git

      - name: Build CUDA server binary (onedir)
        working-directory: backend
        env:
          TORCH_CUDA_ARCH_LIST: "8.0;8.6;8.9;9.0;12.0+PTX"
        run: python build_binary.py --cuda

⚠️ Potential issue | 🟠 Major

Feature parity gap: Linux CUDA build installs Qwen3-TTS, Windows CUDA build does not.

The Windows CUDA job (lines 283–309) never installs git+https://github.com/QwenLM/Qwen3-TTS.git, but the new Linux job does (line 375). After this PR, the same release tag will ship a CUDA backend that supports Qwen3-TTS on Linux but not on Windows, which is surprising for a per-platform packaging change. Either add the same step to build-cuda-windows, or drop it from the Linux job, so both CUDA artifacts expose the same model set.


Comment on lines +392 to +403
      - name: Upload archives to GitHub Release
        if: startsWith(github.ref, 'refs/tags/')
        uses: softprops/action-gh-release@v2
        with:
          files: |
            release-assets/voicebox-server-cuda-linux-x86_64.tar.gz
            release-assets/voicebox-server-cuda-linux-x86_64.tar.gz.sha256
            release-assets/cuda-libs-cu128-v1-linux-x86_64.tar.gz
            release-assets/cuda-libs-cu128-v1-linux-x86_64.tar.gz.sha256
          draft: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

⚠️ Potential issue | 🟡 Minor

cuda-libs.json is not uploaded for Linux; release-assets/ may not exist on non-tag pushes either.

Two things to clean up here:

  1. The Windows job (line 330) uploads release-assets/cuda-libs.json to the release, but this Linux step does not. As a result the published cuda-libs.json will only ever describe the Windows artifact. If you want both, the manifest filename itself needs to be platform-suffixed (e.g. cuda-libs-cu128-v1-linux-x86_64.json), since package_cuda.py always writes plain cuda-libs.json and a release can't hold two assets with the same name. Otherwise consider dropping the manifest from the release uploads entirely (the runtime in backend/services/cuda.py doesn't fetch it anyway).
  2. The Package into… step on lines 383–390 always runs and writes into release-assets/, but the Upload archives step is gated on startsWith(github.ref, 'refs/tags/'). On non-tag runs (e.g. workflow_dispatch), packaging still succeeds but the archives are silently discarded. Same shape as the Windows job today — calling it out so it's an intentional choice.

Comment thread backend/services/cuda.py
Comment on lines +289 to +291
    _plat = "windows-x86_64" if sys.platform == "win32" else "linux-x86_64"
    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find call sites for download_cuda_binary / check_and_update_cuda_binary and look for platform gating.
rg -nP --type=py -C3 '\b(download_cuda_binary|check_and_update_cuda_binary)\s*\('
echo
echo "=== platform/sys.platform gates around CUDA ==="
rg -nP --type=py -C2 'cuda' -g '!**/tests/**' | rg -nP '(sys\.platform|platform\.machine|is_cuda_active|VOICEBOX_BACKEND_VARIANT)' | head -60

Repository: jamiepine/voicebox

Length of output: 3554


🏁 Script executed:

# Check the context around app.py:249 where check_and_update_cuda_binary is called
sed -n '235,265p' backend/app.py | cat -n

echo "=== Check backend/services/cuda.py lines 50-120 for platform logic ==="
sed -n '50,120p' backend/services/cuda.py | cat -n

echo "=== Check routes/cuda.py for any gating ==="
cat -n backend/routes/cuda.py | head -60

Repository: jamiepine/voicebox

Length of output: 6632


🏁 Script executed:

# Check how backend type is determined and if CUDA is conditionally enabled
sed -n '240,255p' backend/app.py | cat -n

echo "=== Check get_backend_type and related logic ==="
rg -A10 -B3 'def get_backend_type\(\)' --type=py

echo "=== Check if check_and_update_cuda_binary is conditionally called ==="
rg -B15 'check_and_update_cuda_binary' backend/app.py | head -40

Repository: jamiepine/voicebox

Length of output: 2563


_plat falls back to linux-x86_64 for any non-win32 platform — wrong on macOS and ARM Linux.

sys.platform == "win32" distinguishes only Windows from "everything else". As written:

  • On macOS, sys.platform == "darwin" → _plat = "linux-x86_64". The background auto-update task on startup will fetch and try to extract a Linux x86_64 tarball.
  • On ARM64 Linux (Jetson, aarch64 servers), sys.platform == "linux" regardless of arch → same Linux x86_64 archive is downloaded, and the binary won't run.

Both code paths are reachable: the auto-update runs unconditionally on startup (app.py:249), and the HTTP endpoint /backend/download-cuda has no platform validation. Detect unsupported platforms early and raise a clear error rather than silently fetch the wrong archive.

🛡️ Suggested guard
-    base_url = f"{GITHUB_RELEASES_URL}/{version}"
-    _plat = "windows-x86_64" if sys.platform == "win32" else "linux-x86_64"
-    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
-    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
+    base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    import platform as _platform
+    machine = _platform.machine().lower()
+    if sys.platform == "win32" and machine in ("amd64", "x86_64"):
+        _plat = "windows-x86_64"
+    elif sys.platform.startswith("linux") and machine in ("x86_64", "amd64"):
+        _plat = "linux-x86_64"
+    else:
+        raise RuntimeError(
+            f"CUDA backend is not available for {sys.platform}/{machine}; "
+            "supported targets are windows-x86_64 and linux-x86_64."
+        )
+    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
+    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/cuda.py` around lines 289 - 291, The platform detection
using _plat incorrectly defaults all non-win32 to "linux-x86_64"; update the
logic around the _plat, server_archive and libs_archive construction in
backend/services/cuda.py to detect sys.platform and platform.machine(): set
_plat to "windows-x86_64" for "win32", "darwin-x86_64" or "darwin-arm64" (or
"darwin-x86_64"/"darwin-arm64" variants) for macOS, and "linux-aarch64" for
aarch64/arm64 Linux and "linux-x86_64" for x86_64 Linux; if the combination is
unsupported, raise a clear exception (or return an explicit error) before
constructing server_archive/libs_archive (reference symbols: _plat,
server_archive, libs_archive, CUDA_LIBS_VERSION) and ensure the same validation
is used by both the auto-update startup flow and the /backend/download-cuda
endpoint so wrong archives are never requested.
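The detection logic in the suggested change can be factored into a small, testable helper. This is a sketch only — the function name `cuda_platform_tag` is hypothetical, and it covers just the two targets the suggestion supports (raising for everything else rather than silently defaulting to `linux-x86_64`):

```python
def cuda_platform_tag(plat: str, machine: str) -> str:
    """Map (sys.platform, platform.machine()) onto a CUDA artifact tag.

    Hypothetical helper mirroring the suggested change: unsupported
    OS/arch combinations raise instead of falling through to linux-x86_64.
    """
    machine = machine.lower()
    if plat == "win32" and machine in ("amd64", "x86_64"):
        return "windows-x86_64"
    if plat.startswith("linux") and machine in ("x86_64", "amd64"):
        return "linux-x86_64"
    raise RuntimeError(
        f"CUDA backend is not available for {plat}/{machine}; "
        "supported targets are windows-x86_64 and linux-x86_64."
    )

# Call sites would pass sys.platform and platform.machine(); for example:
print(cuda_platform_tag("win32", "AMD64"))   # → windows-x86_64
print(cuda_platform_tag("linux", "x86_64"))  # → linux-x86_64
```

Keeping it as a pure function of two strings makes the darwin/aarch64 cases trivial to cover in a unit test, which the inline version in `download_cuda_backend` does not allow.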

Comment thread: justfile
Comment on lines +75 to +76
# --no-deps prevents Qwen3-TTS from overriding the pinned torch version
{{ pip }} install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect Qwen3-TTS install_requires to compare against backend/requirements.txt.
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1
echo "=== Qwen3-TTS pyproject/setup ==="
fd -t f -E '*.lock' . "$tmp/qwen3" -e toml -e cfg -e py | xargs grep -nE 'install_requires|dependencies|requires' 2>/dev/null | head -80
echo
echo "=== backend/requirements.txt ==="
cat backend/requirements.txt

Repository: jamiepine/voicebox

Length of output: 4574


🏁 Script executed:

#!/bin/bash
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1

echo "=== Qwen3-TTS dependencies (full) ==="
export tmp
python3 <<'PYEOF'
import os
import toml

# $tmp is not expanded inside a single-quoted heredoc; read it from the env
with open(os.path.join(os.environ['tmp'], 'qwen3', 'pyproject.toml')) as f:
    data = toml.load(f)
for dep in data.get('project', {}).get('dependencies', []):
    print(dep)
PYEOF

echo
echo "=== Checking each Qwen3-TTS dep against backend/requirements.txt ==="
export tmp
python3 <<'PYEOF'
import os
import re
import toml

# $tmp is not expanded inside a single-quoted heredoc; read it from the env
with open(os.path.join(os.environ['tmp'], 'qwen3', 'pyproject.toml')) as f:
    qwen_deps = toml.load(f).get('project', {}).get('dependencies', [])

with open('backend/requirements.txt') as f:
    backend_reqs = f.read()

print("Qwen3-TTS dependencies vs backend/requirements.txt:")
for dep in qwen_deps:
    # Extract the package name (drop version specifiers, extras, markers)
    pkg_name = re.split(r'[<>=\[\]@;]', dep)[0].strip().lower()
    if pkg_name and not re.search(rf'\b{re.escape(pkg_name)}\b', backend_reqs, re.IGNORECASE):
        print(f"  MISSING: {dep}")
    else:
        print(f"  OK: {dep}")
PYEOF

Repository: jamiepine/voicebox

Length of output: 235


🏁 Script executed:

#!/bin/bash
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1

echo "=== Qwen3-TTS dependencies (from pyproject.toml) ==="
sed -n '/^\s*dependencies\s*=/,/^\s*\]/p' "$tmp/qwen3/pyproject.toml" | grep -oE '"[^"]+"' | tr -d '"' | sort

echo
echo "=== backend/requirements.txt packages ==="
grep -v '^#' backend/requirements.txt | grep -v '^\s*$' | sed 's/[><=@\[].*//' | tr '[:upper:]' '[:lower:]' | sort

echo
echo "=== Comparison: Missing deps ==="
export tmp
python3 <<'PYEOF'
import os
import re
import subprocess

# Pull the dependencies array out of pyproject.toml (no toml parser needed)
pyproject = os.path.join(os.environ['tmp'], 'qwen3', 'pyproject.toml')
result = subprocess.run(
    ['sed', '-n', r'/^\s*dependencies\s*=/,/^\s*\]/p', pyproject],
    capture_output=True, text=True,
)
deps = re.findall(r'"([^"]+)"', result.stdout)

with open('backend/requirements.txt') as f:
    backend_content = f.read().lower()

print("Qwen3-TTS runtime dependencies:")
for dep in deps:
    pkg_name = re.split(r'[<>=\[\]@;]', dep)[0].strip().lower()
    status = "✓" if re.search(rf'\b{re.escape(pkg_name)}\b', backend_content) else "✗ MISSING"
    print(f"  {status} {dep}")
PYEOF

Repository: jamiepine/voicebox

Length of output: 724


🏁 Script executed:

#!/bin/bash
echo "=== Qwen3-TTS dependencies vs backend/requirements.txt ==="
cat <<'EOF' | python3
qwen_deps = [
    "accelerate==1.12.0",
    "einops",
    "gradio",
    "librosa",
    "onnxruntime",
    "soundfile",
    "sox",
    "torchaudio",
    "transformers==4.57.3"
]

with open('backend/requirements.txt') as f:
    backend_content = f.read().lower()

print("Package status in backend/requirements.txt:\n")
missing = []
for dep in qwen_deps:
    pkg_name = dep.split('==')[0].split('[')[0].strip().lower()
    if pkg_name in backend_content:
        print(f"  ✓ {pkg_name}")
    else:
        print(f"  ✗ {pkg_name} [MISSING]")
        missing.append(pkg_name)

print(f"\nMissing: {', '.join(missing) if missing else 'None'}")
EOF

Repository: jamiepine/voicebox

Length of output: 368


Add missing Qwen3-TTS runtime dependencies to backend/requirements.txt.

Installing Qwen3-TTS with --no-deps skips its declared dependencies. Verification shows 4 packages are missing from backend/requirements.txt: einops, gradio, onnxruntime, and sox.

These will cause ImportError at runtime on a fresh setup. Add them to backend/requirements.txt or remove the --no-deps flag and let transitive installation handle them (noting the torch pin may need adjustment if conflicts arise).
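A fresh install can be sanity-checked for exactly this failure mode without importing anything. The snippet below is a sketch (not part of the PR), and it assumes the import names match the package names, which holds for these four:

```python
import importlib.util

def check_deps(mods):
    """Return {module: importable?} using find_spec, without importing."""
    return {m: importlib.util.find_spec(m) is not None for m in mods}

# The four Qwen3-TTS runtime deps the verification found missing from
# backend/requirements.txt; run this inside backend/venv after `just setup`.
for mod, ok in check_deps(["einops", "gradio", "onnxruntime", "sox"]).items():
    print(("OK     " if ok else "MISSING"), mod)
```

Any `MISSING` line here corresponds to a latent ImportError the first time Qwen3-TTS touches that module.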

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@justfile` around lines 75 - 76, The justfile installs Qwen3-TTS with
--no-deps which skips its runtime dependencies; update backend/requirements.txt
to include einops, gradio, onnxruntime, and sox so a fresh backend install won't
hit ImportError, or alternatively remove the --no-deps flag in the justfile pip
install of Qwen3-TTS and resolve any torch pin conflicts if they arise; modify
either the justfile entry referencing Qwen3-TTS or the backend/requirements.txt
to reflect this change.
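A third option, if adding the four packages to backend/requirements.txt is undesirable: drop `--no-deps` and instead pass pip a constraints file that holds torch at the pinned CUDA wheels while the rest of Qwen3-TTS's dependencies resolve normally. A minimal sketch (the file path and exact pins are illustrative, not the PR's actual files):

```python
# Write a pip constraints file pinning the CUDA torch wheels; pip will honor
# these pins for any transitive resolution that touches torch/torchaudio.
from pathlib import Path

constraints = Path("/tmp/torch-constraints.txt")
constraints.write_text("torch==2.7.0+cu128\ntorchaudio==2.7.0+cu128\n")

# The install step would then be (run separately, not in this sketch):
#   pip install -c /tmp/torch-constraints.txt \
#       --extra-index-url https://download.pytorch.org/whl/cu128 \
#       git+https://github.com/QwenLM/Qwen3-TTS.git
print(constraints.read_text().count("+cu128"))  # → 2
```

This keeps the torch pin intact without hand-maintaining a copy of Qwen3-TTS's dependency list in backend/requirements.txt.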

tauri/src-tauri/gen/schemas/linux-schema.json is auto-generated by
Tauri on Linux during dev/build and is not meant to be committed.