fix(setup): auto-detect NVIDIA GPU on Linux and install PyTorch cu128 #560

smendola wants to merge 4 commits into jamiepine:main
Conversation
`just setup` on Linux with an NVIDIA GPU previously installed CPU-only torch because PyPI's default torch wheel satisfies requirements.txt without consulting the PyTorch CUDA index. This change detects `nvidia-smi` at setup time and pins torch==2.7.0+cu128 / torchaudio==2.7.0+cu128 via a constraints file, pulling from https://download.pytorch.org/whl/cu128. cu128 is chosen over cu130 because driver 570 (the default on Ubuntu 24.04 with current NVIDIA packages) supports CUDA 12.8 max; cu130 requires driver 576+. macOS and non-NVIDIA Linux paths are unchanged. Qwen3-TTS gains --no-deps to prevent it from overriding the pinned torch version.
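The detection-and-pin logic described above can be sketched as follows. This is an illustrative Python rendering of what the justfile does in shell; the function names and the `constraints.txt` filename are hypothetical, while the index URL and pins come from this PR.

```python
import shutil
import subprocess

TORCH_CUDA_INDEX = "https://download.pytorch.org/whl/cu128"

def nvidia_gpu_present() -> bool:
    """True when nvidia-smi exists on PATH and runs successfully (working driver)."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        return subprocess.run([exe], capture_output=True).returncode == 0
    except OSError:
        return False

def pip_install_args(os_name: str, gpu: bool) -> list[str]:
    """Extra pip arguments for the torch install, keyed on OS and GPU presence."""
    if os_name == "Linux" and gpu:
        # Constraints pin the CUDA builds; the extra index supplies the +cu128 wheels.
        return ["--extra-index-url", TORCH_CUDA_INDEX, "-c", "constraints.txt"]
    return []  # macOS and non-NVIDIA Linux: default PyPI wheels, as before
```

The key design point is that a constraints file (unlike a plain `pip install torch==…`) pins the version even when torch is pulled in transitively by `requirements.txt`.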
…nload

The CUDA backend download previously used a single archive name (voicebox-server-cuda.tar.gz) for all platforms. Only a Windows binary was published in releases, so Linux users silently downloaded and extracted a Windows .exe, which was unusable.

Changes:

- backend/services/cuda.py: derive archive names from sys.platform so Linux downloads voicebox-server-cuda-linux-x86_64.tar.gz and Windows downloads voicebox-server-cuda-windows-x86_64.tar.gz (same for the cuda-libs archives)
- scripts/package_cuda.py: add a required --platform argument that stamps the platform identifier into all output archive and checksum filenames
- .github/workflows/release.yml: pass --platform windows-x86_64 to the existing Windows job; add a build-cuda-linux job on ubuntu-latest that installs torch cu128, builds with PyInstaller, packages with --platform linux-x86_64, and uploads the Linux archives to the release

No GPU is required in CI — PyInstaller bundles the CUDA .so libraries from the pip-installed nvidia-* packages at build time.
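The platform-suffixed naming scheme this commit introduces can be sketched as two small pure functions. The archive names and the `cu128-v1` libs version are taken from the asset names in this PR; the helper functions themselves are illustrative, not the actual code in `backend/services/cuda.py`.

```python
CUDA_LIBS_VERSION = "cu128-v1"  # matches the cuda-libs archive names in this PR

def platform_id(sys_platform: str) -> str:
    """Map a sys.platform value to the identifier used by the release assets."""
    return "windows-x86_64" if sys_platform == "win32" else "linux-x86_64"

def cuda_archive_names(plat: str) -> tuple[str, str]:
    """Derive the server and cuda-libs archive names for a platform identifier."""
    server = f"voicebox-server-cuda-{plat}.tar.gz"
    libs = f"cuda-libs-{CUDA_LIBS_VERSION}-{plat}.tar.gz"
    return server, libs
```

Note that `platform_id` reproduces the PR's two-way split (and so inherits the macOS/ARM gap a review comment below calls out); a stricter mapping would reject unsupported platforms.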
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough

Adds platform-specific CUDA packaging and CI builds: renames artifacts with platform suffixes, threads the platform identifier through packaging and manifests, adds a Linux CUDA build job, and updates the backend to download platform-suffixed release archives and the local build/install tooling for OS/GPU-aware PyTorch installs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CI as CI Job
    participant Pack as package_cuda.py
    participant GH as GitHub Releases
    participant Backend as backend/services/cuda.py
    participant Local as Local/Runtime
    CI->>Pack: build onedir server (Windows/Linux) + cuda libs
    CI->>Pack: package(..., platform)
    Pack->>GH: upload artifacts (voicebox-server-cuda-{platform}.tar.gz, cuda-libs-{ver}-{platform}.tar.gz) + .sha256 + cuda-libs.json
    Local->>GH: request platform-specific assets
    GH->>Backend: serve platform-suffixed archives
    Backend->>Local: download server core + cuda-libs-{ver}-{platform}.tar.gz
```
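The download-and-verify leg of this flow can be sketched as below. It assumes the published `.sha256` assets use the common `sha256sum` output format (`<hex digest>  <filename>`) — an assumption based on the asset names, not something this PR states — and the helper names are illustrative.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex digest of an archive's bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_archive(archive_bytes: bytes, sha256_file_text: str) -> bool:
    """Check downloaded archive bytes against a sha256sum-style checksum file."""
    expected = sha256_file_text.split()[0].strip().lower()
    return sha256_hex(archive_bytes) == expected
```

A runtime that downloads both the archive and its `.sha256` asset could refuse to extract when this check fails, instead of unpacking a corrupted or wrong-platform tarball.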
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/package_cuda.py (1)
218-223: ⚠️ Potential issue | 🟡 Minor

Stale `--torch-compat` help text.

The default value on line 221 is `">=2.7.0,<2.11.0"` but the help string on line 222 still advertises `>=2.6.0,<2.11.0`. Update the help to match the actual default (this also matches the value the workflow now passes for both Windows and Linux jobs).

📝 Suggested fix

```diff
 parser.add_argument(
     "--torch-compat",
     type=str,
     default=">=2.7.0,<2.11.0",
-    help="Torch version compatibility range (default: >=2.6.0,<2.11.0)",
+    help="Torch version compatibility range (default: >=2.7.0,<2.11.0)",
 )
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@scripts/package_cuda.py` around lines 218 - 223, update the help text for the --torch-compat argument to match the actual default value; in the parser.add_argument call for "--torch-compat" change the help string from referencing ">=2.6.0,<2.11.0" to ">=2.7.0,<2.11.0" so the help accurately reflects the default default=">=2.7.0,<2.11.0" used by the script and CI.
🧹 Nitpick comments (2)
justfile (1)
47-64: Tempfile leak on pip failure; misleading branch message on macOS Intel.

Two minor issues in the new GPU-detection block:

- With `set -euo pipefail`, if `{{ pip }} install` fails, the `rm -f "$_constraints"` on line 57 never runs and the temp file stays in `$TMPDIR`. A `trap` on EXIT ensures cleanup on either path.
- The `else` branch on lines 61–63 also fires on macOS Intel (which is intentional per the PR description), but its message is "No NVIDIA GPU detected — using CPU-only PyTorch.", which reads oddly on a Mac.

♻️ Suggested cleanup

```diff
 if [ "$(uname)" = "Linux" ] && command -v nvidia-smi &>/dev/null && nvidia-smi &>/dev/null; then
     echo "NVIDIA GPU detected — installing PyTorch with CUDA 12.8 (cu128)..."
     _constraints=$(mktemp)
+    trap 'rm -f "$_constraints"' EXIT
     printf 'torch==2.7.0+cu128\ntorchaudio==2.7.0+cu128\n' > "$_constraints"
     {{ pip }} install \
         --extra-index-url https://download.pytorch.org/whl/cu128 \
         -r {{ backend_dir }}/requirements.txt \
         -c "$_constraints"
-    rm -f "$_constraints"
 elif [ "$(uname -m)" = "arm64" ] && [ "$(uname)" = "Darwin" ]; then
     echo "Apple Silicon detected — using default PyTorch (MLX path)..."
     {{ pip }} install -r {{ backend_dir }}/requirements.txt
 else
-    echo "No NVIDIA GPU detected — using CPU-only PyTorch."
+    echo "No supported GPU detected — using CPU-only PyTorch."
     {{ pip }} install -r {{ backend_dir }}/requirements.txt
 fi
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@justfile` around lines 47 - 64, add a trap to ensure the temporary constraint file (variable _constraints) is always removed even if {{ pip }} install fails: create the temp file as before, register a trap on EXIT to rm -f "$_constraints" (and clear the trap after successful removal), and keep the existing rm -f "$_constraints" for normal flow; additionally, change the final branch log message (the echo in the else branch) to avoid Mac-specific wording so it doesn't read oddly on Intel macOS — use a neutral message like "No NVIDIA GPU detected — installing CPU-only PyTorch." and leave the arm64/Darwin branch message unchanged.

backend/services/cuda.py (1)
344-346: Local `cuda-libs.json` diverges from the manifest published by `package_cuda.py`.

`scripts/package_cuda.py` writes a manifest with `version`, `platform`, `torch_compat`, `archive`, and `sha256` (see `scripts/package_cuda.py` lines 172–178), but here we only persist `{"version": CUDA_LIBS_VERSION}`. It's safe today because `get_installed_cuda_libs_version()` only reads `version`, but you lose the integrity/compat info if you ever want to validate at runtime. Consider either downloading the published `cuda-libs.json` alongside the archive, or mirroring the same fields locally.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/services/cuda.py` around lines 344 - 346, the local cuda-libs.json currently only writes {"version": CUDA_LIBS_VERSION} which diverges from the published manifest format in scripts/package_cuda.py (which includes version, platform, torch_compat, archive, sha256); update the code that builds the manifest (the block that calls get_cuda_libs_manifest_path().write_text) to mirror the published fields — include "platform", "torch_compat", "archive" and compute the archive "sha256" (or download the published manifest from the same source as package_cuda.py and write that out) while preserving "version" (use CUDA_LIBS_VERSION); ensure the produced manifest shape matches the manifest produced in scripts/package_cuda.py so runtime validation and integrity checks can use the same fields.
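One way to mirror the published fields locally, per the suggestion above. The field names follow the manifest format the comment attributes to `scripts/package_cuda.py`; the helper functions themselves are a sketch, not the project's actual code.

```python
import hashlib
import json

def build_manifest(version: str, platform: str, torch_compat: str,
                   archive_name: str, archive_bytes: bytes) -> dict:
    """Assemble a local manifest mirroring the published cuda-libs.json fields."""
    return {
        "version": version,
        "platform": platform,
        "torch_compat": torch_compat,
        "archive": archive_name,
        "sha256": hashlib.sha256(archive_bytes).hexdigest(),
    }

def manifest_text(manifest: dict) -> str:
    """Serialized form, suitable for writing at the local manifest path."""
    return json.dumps(manifest, indent=2)
```

Keeping the local shape identical to the published one means a future runtime check can compare the two documents field-for-field.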
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ccf46e00-cce6-4bc0-ba0c-4a7cbfbb5d0e
📒 Files selected for processing (4)
.github/workflows/release.yml
backend/services/cuda.py
justfile
scripts/package_cuda.py
```yaml
build-cuda-linux:
  runs-on: ubuntu-latest
  permissions:
    contents: write

  steps:
    - uses: actions/checkout@v4

    - name: Setup Python
      uses: actions/setup-python@v5
      with:
        python-version: "3.12"
        cache: "pip"

    - name: Install Python dependencies
      run: |
        python -m pip install --upgrade pip
        pip install pyinstaller
        pip install -r backend/requirements.txt
        pip install --no-deps chatterbox-tts
        pip install --no-deps hume-tada

    - name: Install PyTorch with CUDA 12.8
      run: |
        pip install torch==2.7.0+cu128 torchaudio==2.7.0+cu128 \
          --index-url https://download.pytorch.org/whl/cu128 \
          --force-reinstall --no-deps

    - name: Verify CUDA libraries are present
      run: |
        python -c "import torch; print(f'torch: {torch.__version__}'); print(f'CUDA version: {torch.version.cuda}')"

    - name: Install Qwen3-TTS
      run: pip install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git

    - name: Build CUDA server binary (onedir)
      working-directory: backend
      env:
        TORCH_CUDA_ARCH_LIST: "8.0;8.6;8.9;9.0;12.0+PTX"
      run: python build_binary.py --cuda
```
build-cuda-linux is missing the "Free up disk space" step used elsewhere on Ubuntu — it will likely run out of disk.
The existing release job goes out of its way to free ~25 GB on Ubuntu before building (lines 40–55), with a comment that this is what tripped the March 2026 Linux release attempts. The new CUDA build is strictly heavier (CUDA torch wheels are ~2 GB each, plus PyInstaller bundling all the NVIDIA .so files), and it runs without that cleanup on the same ubuntu-latest image. Expect "No space left on device" in PyInstaller or during tar packaging.
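Independent of the cleanup step, a cheap preflight check could fail the job early instead of dying mid-PyInstaller. A sketch — the 25 GB figure comes from the comment above; the helper itself is illustrative, not part of the repo:

```python
import shutil

def enough_disk(path: str = "/", required_gb: float = 25.0) -> bool:
    """True when the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3

# Example: fail fast at the top of a build script.
# if not enough_disk("/", 25):
#     raise SystemExit("Not enough free disk for the CUDA build; clean up first.")
```

This doesn't replace the cleanup step, but it turns a late, confusing "No space left on device" into an immediate, legible failure.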
🛠️ Suggested addition (after `actions/checkout@v4`)

```diff
 steps:
   - uses: actions/checkout@v4

+  - name: Free up disk space
+    uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be
+    with:
+      tool-cache: false
+      android: true
+      dotnet: true
+      haskell: true
+      large-packages: false
+      swap-storage: true
+
   - name: Setup Python
     uses: actions/setup-python@v5
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/release.yml around lines 342 - 381, The build-cuda-linux
job is missing the disk cleanup step used in the main release job, causing
potential "No space left on device" errors during heavy CUDA wheel installs and
PyInstaller bundling; add a new step immediately after the actions/checkout@v4
step in the build-cuda-linux job that runs the same free-up-disk-space commands
as the release job (the cleanup run that clears apt caches, removes unused
packages/snap/docker caches and purges /var/lib/apt/lists, /var/cache/apt, /tmp
and /var/tmp to reclaim ~25GB) so that the subsequent steps (Install Python
dependencies, Install PyTorch with CUDA 12.8, Build CUDA server binary) have
sufficient disk space.
```yaml
- name: Install Qwen3-TTS
  run: pip install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git

- name: Build CUDA server binary (onedir)
  working-directory: backend
  env:
    TORCH_CUDA_ARCH_LIST: "8.0;8.6;8.9;9.0;12.0+PTX"
  run: python build_binary.py --cuda
```
Feature parity gap: Linux CUDA build installs Qwen3-TTS, Windows CUDA build does not.
The Windows CUDA job (lines 283–309) never installs git+https://github.com/QwenLM/Qwen3-TTS.git, but the new Linux job does (line 375). After this PR, the same release tag will ship a CUDA backend that supports Qwen3-TTS on Linux but not on Windows, which is surprising for a per-platform packaging change. Either add the same step to build-cuda-windows, or drop it from the Linux job, so both CUDA artifacts expose the same model set.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/release.yml around lines 374 - 381, The Linux CUDA release
job adds a step that installs Qwen3-TTS (the pip command "pip install --no-deps
git+https://github.com/QwenLM/Qwen3-TTS.git") but the Windows CUDA job
(build-cuda-windows) does not, causing inconsistent artifacts; to fix, either
add an equivalent "Install Qwen3-TTS" step with the same pip command to the
build-cuda-windows job before its "Build CUDA server binary (onedir)" step, or
remove the install step from the Linux CUDA job so both CUDA jobs expose the
same model set—update the workflow so both jobs contain identical model-install
steps.
```yaml
- name: Upload archives to GitHub Release
  if: startsWith(github.ref, 'refs/tags/')
  uses: softprops/action-gh-release@v2
  with:
    files: |
      release-assets/voicebox-server-cuda-linux-x86_64.tar.gz
      release-assets/voicebox-server-cuda-linux-x86_64.tar.gz.sha256
      release-assets/cuda-libs-cu128-v1-linux-x86_64.tar.gz
      release-assets/cuda-libs-cu128-v1-linux-x86_64.tar.gz.sha256
    draft: true
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
cuda-libs.json is not uploaded for Linux; release-assets/ may not exist on non-tag pushes either.
Two things to clean up here:

- The Windows job (line 330) uploads `release-assets/cuda-libs.json` to the release, but this Linux step does not. As a result the published `cuda-libs.json` will only ever describe the Windows artifact. If you want both, the manifest filename itself needs to be platform-suffixed (e.g. `cuda-libs-cu128-v1-linux-x86_64.json`), since `package_cuda.py` always writes plain `cuda-libs.json` and a release can't hold two assets with the same name. Otherwise consider dropping the manifest from the release uploads entirely (the runtime in `backend/services/cuda.py` doesn't fetch it anyway).
- The `Package into…` step on lines 383–390 always runs and writes into `release-assets/`, but the `Upload archives` step is gated on `startsWith(github.ref, 'refs/tags/')`. On non-tag runs (e.g. `workflow_dispatch`), packaging still succeeds but the archives are silently discarded. Same shape as the Windows job today — calling it out so it's an intentional choice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/release.yml around lines 392 - 403, The Linux release job
currently omits the cuda-libs.json manifest and may write into release-assets/
even on non-tag runs; update the "Upload archives to GitHub Release" step to
either upload a platform-suffixed manifest (e.g.
cuda-libs-cu128-v1-linux-x86_64.json) to match the Windows upload or remove the
manifest from the release altogether if you don't need it (package_cuda.py still
writes plain cuda-libs.json so if you keep it you must rename it per-platform
before upload to avoid name collisions); also make the "Package into…" step or
the upload step conditional so artifacts in release-assets/ are only
produced/uploaded on tag-based releases (or ensure you clean/skip packaging on
non-tag runs) to avoid silently discarding artifacts.
```python
_plat = "windows-x86_64" if sys.platform == "win32" else "linux-x86_64"
server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Find call sites for download_cuda_binary / check_and_update_cuda_binary and look for platform gating.
rg -nP --type=py -C3 '\b(download_cuda_binary|check_and_update_cuda_binary)\s*\('
echo
echo "=== platform/sys.platform gates around CUDA ==="
rg -nP --type=py -C2 'cuda' -g '!**/tests/**' | rg -nP '(sys\.platform|platform\.machine|is_cuda_active|VOICEBOX_BACKEND_VARIANT)' | head -60
```

Repository: jamiepine/voicebox
Length of output: 3554
🏁 Script executed:

```bash
# Check the context around app.py:249 where check_and_update_cuda_binary is called
sed -n '235,265p' backend/app.py | cat -n
echo "=== Check backend/services/cuda.py lines 50-120 for platform logic ==="
sed -n '50,120p' backend/services/cuda.py | cat -n
echo "=== Check routes/cuda.py for any gating ==="
cat -n backend/routes/cuda.py | head -60
```

Repository: jamiepine/voicebox
Length of output: 6632
🏁 Script executed:

```bash
# Check how backend type is determined and if CUDA is conditionally enabled
sed -n '240,255p' backend/app.py | cat -n
echo "=== Check get_backend_type and related logic ==="
rg -A10 -B3 'def get_backend_type\(\)' --type=py
echo "=== Check if check_and_update_cuda_binary is conditionally called ==="
rg -B15 'check_and_update_cuda_binary' backend/app.py | head -40
```

Repository: jamiepine/voicebox
Length of output: 2563
_plat falls back to linux-x86_64 for any non-win32 platform — wrong on macOS and ARM Linux.
`sys.platform == "win32"` distinguishes only Windows from "everything else". As written:

- On macOS, `sys.platform == "darwin"` → `_plat = "linux-x86_64"`. The background auto-update task on startup will fetch and try to extract a Linux x86_64 tarball.
- On ARM64 Linux (Jetson, aarch64 servers), `sys.platform == "linux"` regardless of arch → the same Linux x86_64 archive is downloaded, and the binary won't run.
Both code paths are reachable: the auto-update runs unconditionally on startup (app.py:249), and the HTTP endpoint /backend/download-cuda has no platform validation. Detect unsupported platforms early and raise a clear error rather than silently fetch the wrong archive.
🛡️ Suggested guard

```diff
-    base_url = f"{GITHUB_RELEASES_URL}/{version}"
-    _plat = "windows-x86_64" if sys.platform == "win32" else "linux-x86_64"
-    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
-    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
+    base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    import platform as _platform
+    machine = _platform.machine().lower()
+    if sys.platform == "win32" and machine in ("amd64", "x86_64"):
+        _plat = "windows-x86_64"
+    elif sys.platform.startswith("linux") and machine in ("x86_64", "amd64"):
+        _plat = "linux-x86_64"
+    else:
+        raise RuntimeError(
+            f"CUDA backend is not available for {sys.platform}/{machine}; "
+            "supported targets are windows-x86_64 and linux-x86_64."
+        )
+    server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
+    libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
```
+ libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
base_url = f"{GITHUB_RELEASES_URL}/{version}"
import platform as _platform
machine = _platform.machine().lower()
if sys.platform == "win32" and machine in ("amd64", "x86_64"):
    _plat = "windows-x86_64"
elif sys.platform.startswith("linux") and machine in ("x86_64", "amd64"):
    _plat = "linux-x86_64"
else:
    raise RuntimeError(
        f"CUDA backend is not available for {sys.platform}/{machine}; "
        "supported targets are windows-x86_64 and linux-x86_64."
    )
server_archive = f"voicebox-server-cuda-{_plat}.tar.gz"
libs_archive = f"cuda-libs-{CUDA_LIBS_VERSION}-{_plat}.tar.gz"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/services/cuda.py` around lines 289 - 291, The platform detection
using _plat incorrectly defaults all non-win32 to "linux-x86_64"; update the
logic around the _plat, server_archive and libs_archive construction in
backend/services/cuda.py to detect sys.platform and platform.machine(): set
_plat to "windows-x86_64" for "win32", "darwin-x86_64" or "darwin-arm64" (or
"darwin-x86_64"/"darwin-arm64" variants) for macOS, and "linux-aarch64" for
aarch64/arm64 Linux and "linux-x86_64" for x86_64 Linux; if the combination is
unsupported, raise a clear exception (or return an explicit error) before
constructing server_archive/libs_archive (reference symbols: _plat,
server_archive, libs_archive, CUDA_LIBS_VERSION) and ensure the same validation
is used by both the auto-update startup flow and the /backend/download-cuda
endpoint so wrong archives are never requested.
```just
# --no-deps prevents Qwen3-TTS from overriding the pinned torch version
{{ pip }} install --no-deps git+https://github.com/QwenLM/Qwen3-TTS.git
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Inspect Qwen3-TTS install_requires to compare against backend/requirements.txt.
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1
echo "=== Qwen3-TTS pyproject/setup ==="
fd -t f -E '*.lock' . "$tmp/qwen3" -e toml -e cfg -e py | xargs grep -nE 'install_requires|dependencies|requires' 2>/dev/null | head -80
echo
echo "=== backend/requirements.txt ==="
cat backend/requirements.txt
```

Repository: jamiepine/voicebox
Length of output: 4574
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1
echo "=== Qwen3-TTS dependencies (full) ==="
python3 <<'PYEOF'
import toml
with open('$tmp/qwen3/pyproject.toml') as f:
    data = toml.load(f)
deps = data.get('project', {}).get('dependencies', [])
for dep in deps:
    print(dep)
PYEOF
echo
echo "=== Checking each Qwen3-TTS dep against backend/requirements.txt ==="
python3 <<'PYEOF'
import toml
import re
with open('$tmp/qwen3/pyproject.toml') as f:
    data = toml.load(f)
qwen_deps = data.get('project', {}).get('dependencies', [])
with open('backend/requirements.txt') as f:
    backend_reqs = f.read()
print("Qwen3-TTS dependencies not found in backend/requirements.txt:")
for dep in qwen_deps:
    # Extract package name (before version specifiers, brackets, etc.)
    pkg_name = re.split(r'[<>=\[\]@;]', dep)[0].strip().lower()
    if pkg_name and not re.search(rf'\b{re.escape(pkg_name)}\b', backend_reqs, re.IGNORECASE):
        print(f"  MISSING: {dep}")
    else:
        print(f"  OK: {dep}")
PYEOF
```

Repository: jamiepine/voicebox
Length of output: 235
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
tmp=$(mktemp -d)
git clone --depth=1 https://github.com/QwenLM/Qwen3-TTS.git "$tmp/qwen3" >/dev/null 2>&1
echo "=== Qwen3-TTS dependencies (from pyproject.toml) ==="
sed -n '/^\s*dependencies\s*=/,/^\s*\]/p' "$tmp/qwen3/pyproject.toml" | grep -oE '"[^"]+"' | tr -d '"' | sort
echo
echo "=== backend/requirements.txt packages ==="
grep -v '^#' backend/requirements.txt | grep -v '^\s*$' | sed 's/[><=@\[].*//' | tr '[:upper:]' '[:lower:]' | sort
echo
echo "=== Comparison: Missing deps ==="
python3 <<'PYEOF'
import subprocess
import re
with open('/tmp/qwen3_info.txt', 'w') as f:
    tmp = '/tmp/qwen3_clone'
    result = subprocess.run(['sed', '-n', '/^\s*dependencies\s*=/,/^\s*\]/p', f'{tmp}/pyproject.toml'],
                            capture_output=True, text=True, cwd='/')
    f.write(result.stdout)
with open('/tmp/qwen3_info.txt') as f:
    content = f.read()
deps = re.findall(r'"([^"]+)"', content)
with open('backend/requirements.txt') as f:
    backend_content = f.read().lower()
print("Qwen3-TTS runtime dependencies:")
for dep in deps:
    pkg_name = re.split(r'[<>=\[\]@;]', dep)[0].strip().lower()
    found = bool(re.search(rf'\b{re.escape(pkg_name)}\b', backend_content))
    status = "✓" if found else "✗ MISSING"
    print(f"  {status} {dep}")
PYEOF
```

Repository: jamiepine/voicebox
Length of output: 724
🏁 Script executed:

```bash
#!/bin/bash
echo "=== Qwen3-TTS dependencies vs backend/requirements.txt ==="
cat <<'EOF' | python3
qwen_deps = [
    "accelerate==1.12.0",
    "einops",
    "gradio",
    "librosa",
    "onnxruntime",
    "soundfile",
    "sox",
    "torchaudio",
    "transformers==4.57.3"
]
with open('backend/requirements.txt') as f:
    backend_content = f.read().lower()
print("Package status in backend/requirements.txt:\n")
missing = []
for dep in qwen_deps:
    pkg_name = dep.split('==')[0].split('[')[0].strip().lower()
    if pkg_name in backend_content:
        print(f"  ✓ {pkg_name}")
    else:
        print(f"  ✗ {pkg_name} [MISSING]")
        missing.append(pkg_name)
print(f"\nMissing: {', '.join(missing) if missing else 'None'}")
EOF
```

Repository: jamiepine/voicebox
Length of output: 368
Add missing Qwen3-TTS runtime dependencies to backend/requirements.txt.
Installing Qwen3-TTS with --no-deps skips its declared dependencies. Verification shows 4 packages are missing from backend/requirements.txt: einops, gradio, onnxruntime, and sox.
These will cause ImportError at runtime on a fresh setup. Add them to backend/requirements.txt or remove the --no-deps flag and let transitive installation handle them (noting the torch pin may need adjustment if conflicts arise).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@justfile` around lines 75 - 76, The justfile installs Qwen3-TTS with
--no-deps which skips its runtime dependencies; update backend/requirements.txt
to include einops, gradio, onnxruntime, and sox so a fresh backend install won't
hit ImportError, or alternatively remove the --no-deps flag in the justfile pip
install of Qwen3-TTS and resolve any torch pin conflicts if they arise; modify
either the justfile entry referencing Qwen3-TTS or the backend/requirements.txt
to reflect this change.
tauri/src-tauri/gen/schemas/linux-schema.json is auto-generated by Tauri on Linux during dev/build and is not meant to be committed.
This reverts commit 4715a6a.
Problem
`just setup` on Linux with an NVIDIA GPU installs CPU-only PyTorch. The default torch wheel on PyPI satisfies `requirements.txt` without ever consulting the PyTorch CUDA index, so even users with a capable GPU end up with no CUDA acceleration.

Fix
Detect `nvidia-smi` at setup time and install `torch==2.7.0+cu128` / `torchaudio==2.7.0+cu128` via a constraints file from `https://download.pytorch.org/whl/cu128`.

Why cu128 and not cu130? Driver 570 — the default shipped with Ubuntu 24.04's current NVIDIA packages — supports CUDA 12.8 max. cu130 requires driver 576+. Pinning cu128 means this works for any NVIDIA GPU on driver 520+.
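The driver gating described above can be sketched as a pure function. The thresholds are the figures quoted in this description (cu130 needs driver 576+, cu128 works from roughly driver 520+) — treat them as the author's numbers, not authoritative NVIDIA documentation — and note the PR itself always pins cu128 when a GPU is present; the cu130 branch here just generalizes the rationale.

```python
def parse_driver_major(version_string: str) -> int:
    """Extract the major component from a driver version like '570.86.15'."""
    return int(version_string.split(".")[0])

def pick_cuda_variant(driver_major: int) -> str:
    """Choose a torch wheel variant from the NVIDIA driver major version."""
    if driver_major >= 576:
        return "cu130"   # new enough for CUDA 13.0 wheels
    if driver_major >= 520:
        return "cu128"   # e.g. driver 570, Ubuntu 24.04's default
    return "cpu"         # driver too old for the pinned CUDA builds
```

The driver version itself could come from `nvidia-smi --query-gpu=driver_version --format=csv,noheader` or similar.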
No impact on other platforms:

- Linux or macOS without `nvidia-smi`: falls through to the same CPU install as before
- Windows: the `[unix]` block doesn't run; Windows already has its own GPU detection

`--no-deps` is also added to the Qwen3-TTS install to prevent it from overriding the pinned torch version.

Test
On a Linux machine with an NVIDIA GPU: