
fix(setup): auto-detect NVIDIA GPU on Linux and install PyTorch cu128#560

Open
smendola wants to merge 4 commits into jamiepine:main from smendola:linux-gpu-support

Conversation


@smendola smendola commented Apr 25, 2026

Problem

`just setup` on Linux with an NVIDIA GPU installs CPU-only PyTorch. The default torch wheel on PyPI satisfies requirements.txt without ever consulting the PyTorch CUDA index, so even users with a capable GPU end up with no CUDA acceleration.

Fix

Detect nvidia-smi at setup time and install torch==2.7.0+cu128 / torchaudio==2.7.0+cu128 via a constraints file from https://download.pytorch.org/whl/cu128.

Why cu128 and not cu130? Driver 570 — the default shipped with Ubuntu 24.04's current NVIDIA packages — supports CUDA 12.8 max. cu130 requires driver 576+. Pinning cu128 means this works for any NVIDIA GPU on driver 520+.
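As a rough Python sketch of the detection logic just described (the real implementation is a shell recipe in the justfile; the function name and the constraints-file path here are illustrative only):

```python
import shutil
import sys


def torch_install_args(platform: str, has_nvidia_smi: bool) -> list[str]:
    """Pick pip arguments for the torch install, per the logic above.

    CUDA wheels are pinned through a constraints file so that resolving
    requirements.txt cannot silently substitute the CPU-only wheel.
    """
    if platform.startswith("linux") and has_nvidia_smi:
        # constraints-cu128.txt would contain:
        #   torch==2.7.0+cu128
        #   torchaudio==2.7.0+cu128
        return [
            "install",
            "--extra-index-url", "https://download.pytorch.org/whl/cu128",
            "-r", "backend/requirements.txt",
            "-c", "constraints-cu128.txt",
        ]
    # macOS and Linux without a working nvidia-smi keep the default path.
    return ["install", "-r", "backend/requirements.txt"]


# In practice the inputs would come from the environment:
args = torch_install_args(sys.platform, shutil.which("nvidia-smi") is not None)
```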

No impact on other platforms:

  • macOS (Apple Silicon or Intel): unchanged, takes the existing path
  • Linux without nvidia-smi: falls through to the same CPU install as before
  • Windows: the [unix] block doesn't run; Windows already has its own GPU detection

--no-deps is also added to the Qwen3-TTS install to prevent it from overriding the pinned torch version.

Test

On a Linux machine with an NVIDIA GPU:

python3.12 -m venv backend/venv
just setup
backend/venv/bin/python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
# → 2.7.0+cu128
# → True

Summary by CodeRabbit

  • New Features

    • Platform-specific CUDA packages and artifacts for Windows and Linux.
    • System-aware PyTorch installation: CUDA 12.8 for Linux GPUs, default for Apple Silicon, CPU-only otherwise.
  • Chores

    • Release pipeline updated to generate, name, and upload platform-tagged CUDA artifacts.
    • Installer tasks adjusted to respect pinned torch versions.

`just setup` on Linux with an NVIDIA GPU previously installed CPU-only
torch because PyPI's default torch wheel satisfies requirements.txt
without consulting the PyTorch CUDA index.

This change detects `nvidia-smi` at setup time and pins
torch==2.7.0+cu128 / torchaudio==2.7.0+cu128 via a constraints file,
pulling from https://download.pytorch.org/whl/cu128. cu128 is chosen
over cu130 because driver 570 (the default on Ubuntu 24.04 with current
NVIDIA packages) supports CUDA 12.8 max; cu130 requires driver 576+.

macOS and non-NVIDIA Linux paths are unchanged. Qwen3-TTS gains
--no-deps to prevent it from overriding the pinned torch version.

The CUDA backend download previously used a single archive name
(voicebox-server-cuda.tar.gz) for all platforms. Only a Windows binary
was published in releases, so Linux users silently downloaded and
extracted a Windows .exe, which was unusable.

Changes:
- backend/services/cuda.py: derive archive names from sys.platform so
  Linux downloads voicebox-server-cuda-linux-x86_64.tar.gz and Windows
  downloads voicebox-server-cuda-windows-x86_64.tar.gz (same for
  cuda-libs archives)
- scripts/package_cuda.py: add required --platform argument that stamps
  the platform identifier into all output archive and checksum filenames
- .github/workflows/release.yml: pass --platform windows-x86_64 to the
  existing Windows job; add build-cuda-linux job on ubuntu-latest that
  installs torch cu128, builds with PyInstaller, packages with
  --platform linux-x86_64, and uploads the Linux archives to the release

No GPU is required in CI — PyInstaller bundles the CUDA .so libraries
from the pip-installed nvidia-* packages at build time.
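The name derivation described in the first bullet can be sketched like this (illustrative helper; it mirrors the win32-or-else split, and the version string is inferred from the archive names in this PR):

```python
import sys

CUDA_LIBS_VERSION = "cu128-v1"  # assumed from the archive names in this PR


def release_archive_names(platform: str = sys.platform) -> tuple[str, str]:
    """Derive platform-suffixed release archive names, per the commit above."""
    plat = "windows-x86_64" if platform == "win32" else "linux-x86_64"
    server_archive = f"voicebox-server-cuda-{plat}.tar.gz"
    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{plat}.tar.gz"
    return server_archive, libs_archive
```

Note that anything other than win32 maps to linux-x86_64; the review comments below flag exactly this gap for macOS and aarch64 Linux.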
Contributor

coderabbitai Bot commented Apr 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c43ffd1e-a7cf-4793-9c65-842b49756696

📥 Commits

Reviewing files that changed from the base of the PR and between 4c3cb97 and 4715a6a.

📒 Files selected for processing (1)
  • .gitignore
✅ Files skipped from review due to trivial changes (1)
  • .gitignore

📝 Walkthrough


Adds platform-specific CUDA packaging and CI builds: renames artifacts with platform suffixes, threads platform through packaging and manifests, adds a Linux CUDA build job, and updates the backend to download platform-suffixed release archives and the local build/install tooling for OS/GPU-aware PyTorch installs.

Changes

  • CI/CD Workflow (.github/workflows/release.yml): Makes Windows CUDA packaging platform-specific (--platform windows-x86_64) and adds a new build-cuda-linux job that builds with CUDA 12.8 on Linux, packages the server + CUDA libs with --platform linux-x86_64, and uploads platform-suffixed archives and the onedir artifact.
  • Packaging Script (scripts/package_cuda.py): Adds a required --platform argument, includes the platform in generated archive filenames and .sha256 sidecars, adds "platform" to the cuda-libs.json manifest, and updates the package(...) signature to accept a platform.
  • Backend CUDA Service (backend/services/cuda.py): Derives the platform (_plat) from sys.platform and appends it to the server core and CUDA libs tarball names when downloading release assets.
  • Build Tooling / Local dev (justfile): Implements OS/GPU-aware PyTorch install logic: Linux with nvidia-smi installs CUDA 12.8-pinned wheels via the cu128 index; Apple Silicon uses default requirements; others use CPU-only requirements. Adds --no-deps when installing Qwen3-TTS.
  • Misc (.gitignore): Adds the generated tauri/src-tauri/gen/schemas/linux-schema.json to the ignore list.

Sequence Diagram(s)

sequenceDiagram
  participant CI as CI Job
  participant Pack as package_cuda.py
  participant GH as GitHub Releases
  participant Backend as backend/services/cuda.py
  participant Local as Local/Runtime

  CI->>Pack: build onedir server (Windows/Linux) + cuda libs
  CI->>Pack: package(..., platform)
  Pack->>GH: upload artifacts (voicebox-server-cuda-{platform}.tar.gz, cuda-libs-{ver}-{platform}.tar.gz) + .sha256 + cuda-libs.json
  Local->>GH: request platform-specific assets
  GH->>Backend: serve platform-suffixed archives
  Backend->>Local: download server core + cuda-libs-{ver}-{platform}.tar.gz

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped through builds and folders wide,
Windows here and Linux by my side.
Archives stamped with platforms fair,
Manifests know just where they fare.
Tiny paws packed CUDA's pride — hooray!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 33.33%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): the title accurately summarizes the main objective of the changeset: adding NVIDIA GPU auto-detection on Linux and installing CUDA-enabled PyTorch cu128. It is concise, specific, and directly reflects the primary purpose across multiple files (justfile, scripts, workflows, and backend services).
  • Linked Issues check (✅ Passed): check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/package_cuda.py (1)

218-223: ⚠️ Potential issue | 🟡 Minor

Stale --torch-compat help text.

The default value on line 221 is ">=2.7.0,<2.11.0" but the help string on line 222 still advertises >=2.6.0,<2.11.0. Update the help to match the actual default (this also matches the value the workflow now passes for both Windows and Linux jobs).

📝 Suggested fix
     parser.add_argument(
         "--torch-compat",
         type=str,
         default=">=2.7.0,<2.11.0",
-        help="Torch version compatibility range (default: >=2.6.0,<2.11.0)",
+        help="Torch version compatibility range (default: >=2.7.0,<2.11.0)",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/package_cuda.py` around lines 218 - 223, Update the help text for the
--torch-compat argument to match the actual default value; in the
parser.add_argument call for "--torch-compat" change the help string from
referencing ">=2.6.0,<2.11.0" to ">=2.7.0,<2.11.0" so the help accurately
reflects the default default=">=2.7.0,<2.11.0" used by the script and CI.
🧹 Nitpick comments (2)
justfile (1)

47-64: Tempfile leak on pip failure; misleading branch message on macOS Intel.

Two minor issues in the new GPU-detection block:

  1. With set -euo pipefail, if {{ pip }} install fails the rm -f "$_constraints" on line 57 never runs and the temp file stays in $TMPDIR. A trap on EXIT ensures cleanup on either path.
  2. The else branch on lines 61–63 also fires on macOS Intel (which is intentional per the PR description) but its message is "No NVIDIA GPU detected — using CPU-only PyTorch.", which reads oddly on a Mac.
♻️ Suggested cleanup
     if [ "$(uname)" = "Linux" ] && command -v nvidia-smi &>/dev/null && nvidia-smi &>/dev/null; then
         echo "NVIDIA GPU detected — installing PyTorch with CUDA 12.8 (cu128)..."
         _constraints=$(mktemp)
+        trap 'rm -f "$_constraints"' EXIT
         printf 'torch==2.7.0+cu128\ntorchaudio==2.7.0+cu128\n' > "$_constraints"
         {{ pip }} install \
             --extra-index-url https://download.pytorch.org/whl/cu128 \
             -r {{ backend_dir }}/requirements.txt \
             -c "$_constraints"
-        rm -f "$_constraints"
     elif [ "$(uname -m)" = "arm64" ] && [ "$(uname)" = "Darwin" ]; then
         echo "Apple Silicon detected — using default PyTorch (MLX path)..."
         {{ pip }} install -r {{ backend_dir }}/requirements.txt
     else
-        echo "No NVIDIA GPU detected — using CPU-only PyTorch."
+        echo "No supported GPU detected — using CPU-only PyTorch."
         {{ pip }} install -r {{ backend_dir }}/requirements.txt
     fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@justfile` around lines 47 - 64, Add a trap to ensure the temporary constraint
file (variable _constraints) is always removed even if {{ pip }} install fails:
create the temp file as before, register a trap 'on EXIT' to rm -f
"$_constraints" (and clear the trap after successful removal), and keep the
existing rm -f "$_constraints" for normal flow; additionally, change the final
branch log message (the echo in the else branch) to avoid Mac-specific wording
(e.g., "No NVIDIA GPU detected — using CPU-only PyTorch.") so it doesn't read
oddly on Intel macOS—use a neutral message like "No NVIDIA GPU detected —
installing CPU-only PyTorch." and leave the arm64/Darwin branch message
unchanged.
backend/services/cuda.py (1)

344-346: Local cuda-libs.json diverges from the manifest published by package_cuda.py.

scripts/package_cuda.py writes a manifest with version, platform, torch_compat, archive, and sha256 (see scripts/package_cuda.py lines 172–178), but here we only persist {"version": CUDA_LIBS_VERSION}. It's safe today because get_installed_cuda_libs_version() only reads version, but you lose the integrity/compat info if you ever want to validate at runtime. Consider either downloading the published cuda-libs.json alongside the archive, or mirroring the same fields locally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/cuda.py` around lines 344 - 346, The local cuda-libs.json
currently only writes {"version": CUDA_LIBS_VERSION} which diverges from the
published manifest format in scripts/package_cuda.py (which includes version,
platform, torch_compat, archive, sha256); update the code that builds the
manifest (the block that calls get_cuda_libs_manifest_path().write_text) to
mirror the published fields — include "platform", "torch_compat", "archive" and
compute the archive "sha256" (or download the published manifest from the same
source as package_cuda.py and write that out) while preserving "version" (use
CUDA_LIBS_VERSION); ensure the produced manifest shape matches the manifest
produced in scripts/package_cuda.py so runtime validation and integrity checks
can use the same fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/release.yml:
- Around line 374-381: The Linux CUDA release job adds a step that installs
Qwen3-TTS (the pip command "pip install --no-deps
git+https://github.com/QwenLM/Qwen3-TTS.git") but the Windows CUDA job
(build-cuda-windows) does not, causing inconsistent artifacts; to fix, either
add an equivalent "Install Qwen3-TTS" step with the same pip command to the
build-cuda-windows job before its "Build CUDA server binary (onedir)" step, or
remove the install step from the Linux CUDA job so both CUDA jobs expose the
same model set—update the workflow so both jobs contain identical model-install
steps.
- Around line 392-403: The Linux release job currently omits the cuda-libs.json
manifest and may write into release-assets/ even on non-tag runs; update the
"Upload archives to GitHub Release" step to either upload a platform-suffixed
manifest (e.g. cuda-libs-cu128-v1-linux-x86_64.json) to match the Windows upload
or remove the manifest from the release altogether if you don't need it
(package_cuda.py still writes plain cuda-libs.json so if you keep it you must
rename it per-platform before upload to avoid name collisions); also make the
"Package into…" step or the upload step conditional so artifacts in
release-assets/ are only produced/uploaded on tag-based releases (or ensure you
clean/skip packaging on non-tag runs) to avoid silently discarding artifacts.
- Around line 342-381: The build-cuda-linux job is missing the disk cleanup step
used in the main release job, causing potential "No space left on device" errors
during heavy CUDA wheel installs and PyInstaller bundling; add a new step
immediately after the actions/checkout@v4 step in the build-cuda-linux job that
runs the same free-up-disk-space commands as the release job (the cleanup run
that clears apt caches, removes unused packages/snap/docker caches and purges
/var/lib/apt/lists, /var/cache/apt, /tmp and /var/tmp to reclaim ~25GB) so that
the subsequent steps (Install Python dependencies, Install PyTorch with CUDA
12.8, Build CUDA server binary) have sufficient disk space.

In `@backend/services/cuda.py`:
- Around line 289-291: The platform detection using _plat incorrectly defaults
all non-win32 to "linux-x86_64"; update the logic around the _plat,
server_archive and libs_archive construction in backend/services/cuda.py to
detect sys.platform and platform.machine(): set _plat to "windows-x86_64" for
"win32", "darwin-x86_64" or "darwin-arm64" (or "darwin-x86_64"/"darwin-arm64"
variants) for macOS, and "linux-aarch64" for aarch64/arm64 Linux and
"linux-x86_64" for x86_64 Linux; if the combination is unsupported, raise a
clear exception (or return an explicit error) before constructing
server_archive/libs_archive (reference symbols: _plat, server_archive,
libs_archive, CUDA_LIBS_VERSION) and ensure the same validation is used by both
the auto-update startup flow and the /backend/download-cuda endpoint so wrong
archives are never requested.

In `@justfile`:
- Around line 75-76: The justfile installs Qwen3-TTS with --no-deps which skips
its runtime dependencies; update backend/requirements.txt to include einops,
gradio, onnxruntime, and sox so a fresh backend install won't hit ImportError,
or alternatively remove the --no-deps flag in the justfile pip install of
Qwen3-TTS and resolve any torch pin conflicts if they arise; modify either the
justfile entry referencing Qwen3-TTS or the backend/requirements.txt to reflect
this change.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ccf46e00-cce6-4bc0-ba0c-4a7cbfbb5d0e

📥 Commits

Reviewing files that changed from the base of the PR and between ed2eec5 and 4c3cb97.

📒 Files selected for processing (4)
  • .github/workflows/release.yml
  • backend/services/cuda.py
  • justfile
  • scripts/package_cuda.py

Comment on lines +342 to +381
  build-cuda-linux:
    runs-on: ubuntu-latest
    permissions:
      contents: write

    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip"

      - name: Install Python dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pyinstaller
          pip install -r backend/requirements.txt
          pip install --no-deps chatterbox-tts
          pip install --no-deps hume-tada

      - name: Install PyTorch with CUDA 12.8
        run: |
          pip install torch==2.7.0+cu128 torchaudio==2.7.0+cu128 \
            --index-url https://download.pytorch.org/whl/cu128 \
            --force-reinstall --no-deps

      - name: Verify CUDA libraries are present
        run: |
          python -c "import torch; print(f'torch: {torch.__version__}'); print(f'CUDA version: {torch.version.cuda}')"

      - name: Install Qwen3-TTS
        run: pip install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git

      - name: Build CUDA server binary (onedir)
        working-directory: backend
        env:
          TORCH_CUDA_ARCH_LIST: "8.0;8.6;8.9;9.0;12.0+PTX"
        run: python build_binary.py --cuda

⚠️ Potential issue | 🟠 Major

build-cuda-linux is missing the "Free up disk space" step used elsewhere on Ubuntu, so it is likely to run out of disk.

The existing release job goes out of its way to free ~25 GB on Ubuntu before building (lines 40–55), with a comment that this is what tripped the March 2026 Linux release attempts. The new CUDA build is strictly heavier (CUDA torch wheels are ~2 GB each, plus PyInstaller bundling all the NVIDIA .so files), and it runs without that cleanup on the same ubuntu-latest image. Expect "No space left on device" in PyInstaller or during tar packaging.

🛠️ Suggested addition (after `actions/checkout@v4`)
     steps:
       - uses: actions/checkout@v4

+      - name: Free up disk space
+        uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be
+        with:
+          tool-cache: false
+          android: true
+          dotnet: true
+          haskell: true
+          large-packages: false
+          swap-storage: true
+
       - name: Setup Python
         uses: actions/setup-python@v5

Comment on lines +374 to +381
      - name: Install Qwen3-TTS
        run: pip install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git

      - name: Build CUDA server binary (onedir)
        working-directory: backend
        env:
          TORCH_CUDA_ARCH_LIST: "8.0;8.6;8.9;9.0;12.0+PTX"
        run: python build_binary.py --cuda

⚠️ Potential issue | 🟠 Major

Feature parity gap: Linux CUDA build installs Qwen3-TTS, Windows CUDA build does not.

The Windows CUDA job (lines 283–309) never installs git+https://github.com/QwenLM/Qwen3-TTS.git, but the new Linux job does (line 375). After this PR, the same release tag will ship a CUDA backend that supports Qwen3-TTS on Linux but not on Windows, which is surprising for a per-platform packaging change. Either add the same step to build-cuda-windows, or drop it from the Linux job, so both CUDA artifacts expose the same model set.


Comment on lines +392 to +403
      - name: Upload archives to GitHub Release
        if: startsWith(github.ref, 'refs/tags/')
        uses: softprops/action-gh-release@v2
        with:
          files: |
            release-assets/voicebox-server-cuda-linux-x86_64.tar.gz
            release-assets/voicebox-server-cuda-linux-x86_64.tar.gz.sha256
            release-assets/cuda-libs-cu128-v1-linux-x86_64.tar.gz
            release-assets/cuda-libs-cu128-v1-linux-x86_64.tar.gz.sha256
          draft: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

⚠️ Potential issue | 🟡 Minor

cuda-libs.json is not uploaded for Linux; release-assets/ may not exist on non-tag pushes either.

Two things to clean up here:

  1. The Windows job (line 330) uploads release-assets/cuda-libs.json to the release, but this Linux step does not. As a result the published cuda-libs.json will only ever describe the Windows artifact. If you want both, the manifest filename itself needs to be platform-suffixed (e.g. cuda-libs-cu128-v1-linux-x86_64.json), since package_cuda.py always writes plain cuda-libs.json and a release can't hold two assets with the same name. Otherwise consider dropping the manifest from the release uploads entirely (the runtime in backend/services/cuda.py doesn't fetch it anyway).
  2. The Package into… step on lines 383–390 always runs and writes into release-assets/, but the Upload archives step is gated on startsWith(github.ref, 'refs/tags/'). On non-tag runs (e.g. workflow_dispatch), packaging still succeeds but the archives are silently discarded. Same shape as the Windows job today — calling it out so it's an intentional choice.

Comment thread backend/services/cuda.py
Comment on lines +289 to +291
    _plat = "windows-x86_64" if sys.platform == "win32" else "linux-x86_64"
    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find call sites for download_cuda_binary / check_and_update_cuda_binary and look for platform gating.
rg -nP --type=py -C3 '\b(download_cuda_binary|check_and_update_cuda_binary)\s*\('
echo
echo "=== platform/sys.platform gates around CUDA ==="
rg -nP --type=py -C2 'cuda' -g '!**/tests/**' | rg -nP '(sys\.platform|platform\.machine|is_cuda_active|VOICEBOX_BACKEND_VARIANT)' | head -60

Repository: jamiepine/voicebox

Length of output: 3554


🏁 Script executed:

# Check the context around app.py:249 where check_and_update_cuda_binary is called
sed -n '235,265p' backend/app.py | cat -n

echo "=== Check backend/services/cuda.py lines 50-120 for platform logic ==="
sed -n '50,120p' backend/services/cuda.py | cat -n

echo "=== Check routes/cuda.py for any gating ==="
cat -n backend/routes/cuda.py | head -60

Repository: jamiepine/voicebox

Length of output: 6632


🏁 Script executed:

# Check how backend type is determined and if CUDA is conditionally enabled
sed -n '240,255p' backend/app.py | cat -n

echo "=== Check get_backend_type and related logic ==="
rg -A10 -B3 'def get_backend_type\(\)' --type=py

echo "=== Check if check_and_update_cuda_binary is conditionally called ==="
rg -B15 'check_and_update_cuda_binary' backend/app.py | head -40

Repository: jamiepine/voicebox

Length of output: 2563


_plat falls back to linux-x86_64 for any non-win32 platform — wrong on macOS and ARM Linux.

sys.platform == "win32" distinguishes only Windows from "everything else". As written:

  • On macOS, sys.platform == "darwin" → _plat = "linux-x86_64". The background auto-update task on startup will fetch and try to extract a Linux x86_64 tarball.
  • On ARM64 Linux (Jetson, aarch64 servers), sys.platform == "linux" regardless of arch → same Linux x86_64 archive is downloaded, and the binary won't run.

Both code paths are reachable: the auto-update runs unconditionally on startup (app.py:249), and the HTTP endpoint /backend/download-cuda has no platform validation. Detect unsupported platforms early and raise a clear error rather than silently fetch the wrong archive.

🛡️ Suggested guard
-    base_url = f"{GITHUB_RELEASES_URL}/{version}"
-    _plat = "windows-x86_64" if sys.platform == "win32" else "linux-x86_64"
-    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
-    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
+    base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    import platform as _platform
+    machine = _platform.machine().lower()
+    if sys.platform == "win32" and machine in ("amd64", "x86_64"):
+        _plat = "windows-x86_64"
+    elif sys.platform.startswith("linux") and machine in ("x86_64", "amd64"):
+        _plat = "linux-x86_64"
+    else:
+        raise RuntimeError(
+            f"CUDA backend is not available for {sys.platform}/{machine}; "
+            "supported targets are windows-x86_64 and linux-x86_64."
+        )
+    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
+    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/cuda.py` around lines 289 - 291, The platform detection
using _plat incorrectly defaults all non-win32 to "linux-x86_64"; update the
logic around the _plat, server_archive and libs_archive construction in
backend/services/cuda.py to detect sys.platform and platform.machine(): set
_plat to "windows-x86_64" for "win32", "darwin-x86_64" or "darwin-arm64" (or
"darwin-x86_64"/"darwin-arm64" variants) for macOS, and "linux-aarch64" for
aarch64/arm64 Linux and "linux-x86_64" for x86_64 Linux; if the combination is
unsupported, raise a clear exception (or return an explicit error) before
constructing server_archive/libs_archive (reference symbols: _plat,
server_archive, libs_archive, CUDA_LIBS_VERSION) and ensure the same validation
is used by both the auto-update startup flow and the /backend/download-cuda
endpoint so wrong archives are never requested.
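The detection logic in the suggested change can be factored into a small, testable helper. This is a sketch only — the function name `cuda_platform_tag` is hypothetical, and it covers just the two targets the suggestion supports (raising for everything else rather than silently defaulting to `linux-x86_64`):

```python
def cuda_platform_tag(plat: str, machine: str) -> str:
    """Map (sys.platform, platform.machine()) onto a CUDA artifact tag.

    Hypothetical helper mirroring the suggested change: unsupported
    OS/arch combinations raise instead of falling through to linux-x86_64.
    """
    machine = machine.lower()
    if plat == "win32" and machine in ("amd64", "x86_64"):
        return "windows-x86_64"
    if plat.startswith("linux") and machine in ("x86_64", "amd64"):
        return "linux-x86_64"
    raise RuntimeError(
        f"CUDA backend is not available for {plat}/{machine}; "
        "supported targets are windows-x86_64 and linux-x86_64."
    )

# Call sites would pass sys.platform and platform.machine(); for example:
print(cuda_platform_tag("win32", "AMD64"))   # → windows-x86_64
print(cuda_platform_tag("linux", "x86_64"))  # → linux-x86_64
```

Keeping it as a pure function of two strings makes the darwin/aarch64 cases trivial to cover in a unit test, which the inline version in `download_cuda_backend` does not allow.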

Comment thread: justfile
Comment on lines +75 to +76
# --no-deps prevents Qwen3-TTS from overriding the pinned torch version
{{ pip }} install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Inspect Qwen3-TTS install_requires to compare against backend/requirements.txt.
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1
echo "=== Qwen3-TTS pyproject/setup ==="
fd -t f -E '*.lock' . "$tmp/qwen3" -e toml -e cfg -e py | xargs grep -nE 'install_requires|dependencies|requires' 2>/dev/null | head -80
echo
echo "=== backend/requirements.txt ==="
cat backend/requirements.txt

Repository: jamiepine/voicebox

Length of output: 4574


🏁 Script executed:

#!/bin/bash
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1

echo "=== Qwen3-TTS dependencies (full) ==="
export tmp
python3 <<'PYEOF'
import os
import toml

# $tmp is not expanded inside a single-quoted heredoc; read it from the env
with open(os.path.join(os.environ['tmp'], 'qwen3', 'pyproject.toml')) as f:
    data = toml.load(f)
for dep in data.get('project', {}).get('dependencies', []):
    print(dep)
PYEOF

echo
echo "=== Checking each Qwen3-TTS dep against backend/requirements.txt ==="
export tmp
python3 <<'PYEOF'
import os
import re
import toml

# $tmp is not expanded inside a single-quoted heredoc; read it from the env
with open(os.path.join(os.environ['tmp'], 'qwen3', 'pyproject.toml')) as f:
    qwen_deps = toml.load(f).get('project', {}).get('dependencies', [])

with open('backend/requirements.txt') as f:
    backend_reqs = f.read()

print("Qwen3-TTS dependencies vs backend/requirements.txt:")
for dep in qwen_deps:
    # Extract the package name (drop version specifiers, extras, markers)
    pkg_name = re.split(r'[<>=\[\]@;]', dep)[0].strip().lower()
    if pkg_name and not re.search(rf'\b{re.escape(pkg_name)}\b', backend_reqs, re.IGNORECASE):
        print(f"  MISSING: {dep}")
    else:
        print(f"  OK: {dep}")
PYEOF

Repository: jamiepine/voicebox

Length of output: 235


🏁 Script executed:

#!/bin/bash
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1

echo "=== Qwen3-TTS dependencies (from pyproject.toml) ==="
sed -n '/^\s*dependencies\s*=/,/^\s*\]/p' "$tmp/qwen3/pyproject.toml" | grep -oE '"[^"]+"' | tr -d '"' | sort

echo
echo "=== backend/requirements.txt packages ==="
grep -v '^#' backend/requirements.txt | grep -v '^\s*$' | sed 's/[><=@\[].*//' | tr '[:upper:]' '[:lower:]' | sort

echo
echo "=== Comparison: Missing deps ==="
export tmp
python3 <<'PYEOF'
import os
import re
import subprocess

# Pull the dependencies array out of pyproject.toml (no toml parser needed)
pyproject = os.path.join(os.environ['tmp'], 'qwen3', 'pyproject.toml')
result = subprocess.run(
    ['sed', '-n', r'/^\s*dependencies\s*=/,/^\s*\]/p', pyproject],
    capture_output=True, text=True,
)
deps = re.findall(r'"([^"]+)"', result.stdout)

with open('backend/requirements.txt') as f:
    backend_content = f.read().lower()

print("Qwen3-TTS runtime dependencies:")
for dep in deps:
    pkg_name = re.split(r'[<>=\[\]@;]', dep)[0].strip().lower()
    status = "✓" if re.search(rf'\b{re.escape(pkg_name)}\b', backend_content) else "✗ MISSING"
    print(f"  {status} {dep}")
PYEOF

Repository: jamiepine/voicebox

Length of output: 724


🏁 Script executed:

#!/bin/bash
echo "=== Qwen3-TTS dependencies vs backend/requirements.txt ==="
cat <<'EOF' | python3
qwen_deps = [
    "accelerate==1.12.0",
    "einops",
    "gradio",
    "librosa",
    "onnxruntime",
    "soundfile",
    "sox",
    "torchaudio",
    "transformers==4.57.3"
]

with open('backend/requirements.txt') as f:
    backend_content = f.read().lower()

print("Package status in backend/requirements.txt:\n")
missing = []
for dep in qwen_deps:
    pkg_name = dep.split('==')[0].split('[')[0].strip().lower()
    if pkg_name in backend_content:
        print(f"  ✓ {pkg_name}")
    else:
        print(f"  ✗ {pkg_name} [MISSING]")
        missing.append(pkg_name)

print(f"\nMissing: {', '.join(missing) if missing else 'None'}")
EOF

Repository: jamiepine/voicebox

Length of output: 368


Add missing Qwen3-TTS runtime dependencies to backend/requirements.txt.

Installing Qwen3-TTS with --no-deps skips its declared dependencies. Verification shows 4 packages are missing from backend/requirements.txt: einops, gradio, onnxruntime, and sox.

These will cause ImportError at runtime on a fresh setup. Add them to backend/requirements.txt or remove the --no-deps flag and let transitive installation handle them (noting the torch pin may need adjustment if conflicts arise).
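A fresh install can be sanity-checked for exactly this failure mode without importing anything. The snippet below is a sketch (not part of the PR), and it assumes the import names match the package names, which holds for these four:

```python
import importlib.util

def check_deps(mods):
    """Return {module: importable?} using find_spec, without importing."""
    return {m: importlib.util.find_spec(m) is not None for m in mods}

# The four Qwen3-TTS runtime deps the verification found missing from
# backend/requirements.txt; run this inside backend/venv after `just setup`.
for mod, ok in check_deps(["einops", "gradio", "onnxruntime", "sox"]).items():
    print(("OK     " if ok else "MISSING"), mod)
```

Any `MISSING` line here corresponds to a latent ImportError the first time Qwen3-TTS touches that module.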

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@justfile` around lines 75 - 76, The justfile installs Qwen3-TTS with
--no-deps which skips its runtime dependencies; update backend/requirements.txt
to include einops, gradio, onnxruntime, and sox so a fresh backend install won't
hit ImportError, or alternatively remove the --no-deps flag in the justfile pip
install of Qwen3-TTS and resolve any torch pin conflicts if they arise; modify
either the justfile entry referencing Qwen3-TTS or the backend/requirements.txt
to reflect this change.
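A third option, if adding the four packages to backend/requirements.txt is undesirable: drop `--no-deps` and instead pass pip a constraints file that holds torch at the pinned CUDA wheels while the rest of Qwen3-TTS's dependencies resolve normally. A minimal sketch (the file path and exact pins are illustrative, not the PR's actual files):

```python
# Write a pip constraints file pinning the CUDA torch wheels; pip will honor
# these pins for any transitive resolution that touches torch/torchaudio.
from pathlib import Path

constraints = Path("/tmp/torch-constraints.txt")
constraints.write_text("torch==2.7.0+cu128\ntorchaudio==2.7.0+cu128\n")

# The install step would then be (run separately, not in this sketch):
#   pip install -c /tmp/torch-constraints.txt \
#       --extra-index-url https://download.pytorch.org/whl/cu128 \
#       git+https://github.com/QwenLM/Qwen3-TTS.git
print(constraints.read_text().count("+cu128"))  # → 2
```

This keeps the torch pin intact without hand-maintaining a copy of Qwen3-TTS's dependency list in backend/requirements.txt.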

tauri/src-tauri/gen/schemas/linux-schema.json is auto-generated by
Tauri on Linux during dev/build and is not meant to be committed.