Upgrade CUDA backend from cu126 to cu128, fix GPU settings UI #316
Conversation
Caution: Review failed. The pull request is closed.
Walkthrough

This PR updates CUDA toolkit support from version 12.6 to 12.8 across build workflows, backend services, and packaging scripts. It also bumps PyTorch from 2.1.0 to 2.7.0 and refactors the GPU acceleration UI to better handle active CUDA states with improved status indicators.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed, ❌ 2 failed (warnings).
Upgrade CUDA toolkit from 12.6 (cu126) to 12.8 (cu128) for proper RTX 50-series (Blackwell) GPU support. Users with RTX 5070/5080/5090 were reporting CUDA detection failures with cu126. Also fix the GPU Acceleration settings panel where the 'Switch to CPU Backend' button was unreachable — it was inside a conditional block that required !isCurrentlyCuda, making it impossible to switch back to CPU once running on CUDA. Closes #315
2925355 to fc5ed1f
Actionable comments posted: 3
🧹 Nitpick comments (11)
backend/backends/cosyvoice_backend.py (6)
98-98: Use a `def` instead of assigning a lambda. Per PEP 8 / Ruff E731, named functions should use `def` statements.

💅 Suggested fix

```diff
- _noop = lambda *a, **kw: None
+ def _noop(*a, **kw):
+     pass
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/cosyvoice_backend.py` at line 98, Replace the assigned lambda _noop = lambda *a, **kw: None with a proper named function using def; locate the symbol _noop in cosyvoice_backend.py and change it to a def _noop(*a, **kw): return None so it follows PEP 8 / Ruff E731 and is clearer to readers and tooling.
183-185: Parameter `format` shadows a Python builtin. While this matches `torchaudio.load`'s signature, consider renaming to `format_` with a trailing underscore to avoid shadowing the builtin.

💅 Suggested fix

```diff
-    def _sf_load(uri, frame_offset=0, num_frames=-1, normalize=True,
-                 channels_first=True, format=None, buffer_size=4096,
-                 backend=None):
+    def _sf_load(uri, frame_offset=0, num_frames=-1, normalize=True,
+                 channels_first=True, format_=None, buffer_size=4096,
+                 backend=None):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/cosyvoice_backend.py` around lines 183 - 185, The parameter name format in function _sf_load shadows the Python builtin; rename it to format_ (update the function signature for _sf_load and every internal reference to format -> format_) and update any callers to pass format_ (or accept both names by adding **kwargs handling and mapping format -> format_ if backward compatibility is required). Also update the function docstring/parameter list to reflect format_ and run tests to ensure no call sites were missed (search for "_sf_load(" and references to the format parameter).
425-426: Consider logging a warning for empty audio output. When no audio chunks are produced, returning 1 second of silence is a reasonable fallback, but this may indicate an upstream issue worth logging.

💡 Suggested improvement

```diff
     if not audio_chunks:
+        logger.warning("CosyVoice produced no audio chunks for text: %s", text[:60])
         return np.zeros(COSYVOICE_SAMPLE_RATE, dtype=np.float32), COSYVOICE_SAMPLE_RATE
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/cosyvoice_backend.py` around lines 425 - 426, The code returns 1 second of silence when audio_chunks is empty but doesn't log this event; add a warning log before the return to record that no audio was produced (e.g., use logger.warning or self.logger.warning depending on the surrounding scope) and include context such as the function name and that COSYVOICE_SAMPLE_RATE silent buffer is being returned (reference audio_chunks and COSYVOICE_SAMPLE_RATE to locate the code). Ensure the log is emitted only when audio_chunks is falsy and then return the same np.zeros(...) as before.
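The fallback-plus-warning pattern can be sketched in isolation with the standard library (a hedged stand-in: plain lists replace `np.zeros`, and `SAMPLE_RATE` stands in for `COSYVOICE_SAMPLE_RATE`; `finalize` is an illustrative name, not the backend's actual function):

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("cosyvoice")

SAMPLE_RATE = 24000  # stand-in for COSYVOICE_SAMPLE_RATE

def finalize(audio_chunks: list[list[float]], text: str) -> tuple[list[float], int]:
    """Concatenate chunks; warn and return 1 second of silence when none exist."""
    if not audio_chunks:
        # Emit the diagnostic only on the empty-output path, then keep the
        # original silent-buffer fallback behaviour.
        logger.warning("produced no audio chunks for text: %s", text[:60])
        return [0.0] * SAMPLE_RATE, SAMPLE_RATE
    merged = [sample for chunk in audio_chunks for sample in chunk]
    return merged, SAMPLE_RATE

silence, sr = finalize([], "hello world")  # logs a warning
assert len(silence) == SAMPLE_RATE and sr == SAMPLE_RATE
```

The caller's contract is unchanged either way; only observability improves.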
23-23: Use modern type hints for Python 3.9+. `typing.List` and `typing.Tuple` are deprecated. Since the project targets Python 3.12, use the built-in `list` and `tuple` directly.

💅 Suggested fix

```diff
-from typing import ClassVar, List, Optional, Tuple
+from typing import ClassVar, Optional
```

Then update usages throughout the file:

- `List[str]` → `list[str]`
- `Tuple[dict, bool]` → `tuple[dict, bool]`
- `Tuple[np.ndarray, str]` → `tuple[np.ndarray, str]`
- `Tuple[np.ndarray, int]` → `tuple[np.ndarray, int]`

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/cosyvoice_backend.py` at line 23, Replace deprecated typing aliases with built-in generics for Python 3.9+: remove typing.List and typing.Tuple usage in the import line and type hints in this module; keep ClassVar and Optional if still needed. Specifically, update the import from "from typing import ClassVar, List, Optional, Tuple" to drop List and Tuple, and change all occurrences like "List[str]" → "list[str]", "Tuple[dict, bool]" → "tuple[dict, bool]", "Tuple[np.ndarray, str]" → "tuple[np.ndarray, str]", and "Tuple[np.ndarray, int]" → "tuple[np.ndarray, int]" (apply similar conversions elsewhere in functions/methods such as any function that returns tuple[...] or accepts list[...] types).
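For context, PEP 585 built-in generics are drop-in replacements at annotation sites — a standalone sketch with illustrative functions, not the backend's code:

```python
# From Python 3.9 on, list and tuple subscript directly; typing.List and
# typing.Tuple add nothing and are deprecated aliases.
def split_variants(raw: str) -> list[str]:
    return raw.split(",")

def grid_shape(rows: list[list[int]]) -> tuple[int, int]:
    # Returns (row count, column count of the first row), (0, 0) when empty.
    return len(rows), len(rows[0]) if rows else 0

assert split_variants("v2,v3") == ["v2", "v3"]
assert grid_shape([[1, 2, 3], [4, 5, 6]]) == (2, 3)
```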
372-372: Defensive load is good, but consider the variant fallback. If `_variant` is `None` (model unloaded), this defaults to `"v2"`. This could be surprising if the user intended to use v3. Consider requiring an explicit `model_size` parameter or raising an error if the model is not loaded.

Note: The retrieved learning indicates model reload/unload race conditions are a pre-existing design issue tracked for future follow-up.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/cosyvoice_backend.py` at line 372, The call uses a silent fallback to "v2" when self._variant is None which can misselect the model; update the code around the invocation of load_model (the call at await self.load_model(self._variant or "v2")) to avoid implicit fallback: either require an explicit model_size/model_variant parameter and pass it to load_model, or check if self._variant is None and raise a clear error (e.g., ValueError) instructing the caller to load the desired variant first; touch the load_model caller site and any public API that will pass model_size so the selection is explicit and no silent "v2" default remains.
274-298: Thread-safety concern with global `torch.load` patch. Patching `torch.load` globally affects all threads. While `_model_load_lock` serializes model loading within this backend, other backends or code paths could call `torch.load` during this window and unexpectedly receive `map_location="cpu"`.

This is a pre-existing pattern (also used in `chatterbox_backend.py`), but worth noting for future refactoring.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/cosyvoice_backend.py` around lines 274 - 298, The global patch of torch.load (_orig_torch_load/_patched_load) is not thread-safe; instead, make the patch local to the import of cosyvoice.cli.cosyvoice by temporarily replacing the torch entry in sys.modules with a small wrapper module that exposes the same attributes but a patched load (only while importing CosyVoice2/CosyVoice3), then restore sys.modules; locate the block that checks device == "cpu" and currently sets torch.load and change it to: create a shallow wrapper module or clone of the real torch with a patched load, inject it into sys.modules["torch"], import the desired class (CosyVoice2 or CosyVoice3) and instantiate model, and finally restore the original sys.modules["torch"]; keep references to the existing symbols _orig_torch_load/_patched_load, torch.load, variant, CosyVoice2/CosyVoice3 and _model_load_lock when implementing.

Dockerfile (1)
42-42: Consider pinning CosyVoice to a specific commit for reproducible builds. The clone fetches the latest commit from the default branch, which could lead to build inconsistencies over time if the upstream CosyVoice repository introduces breaking changes.
💡 Suggested improvement
```diff
-RUN git clone --recursive --depth 1 https://github.com/FunAudioLLM/CosyVoice.git /build/CosyVoice
+# Pin to a known-working commit for reproducible builds
+RUN git clone --recursive https://github.com/FunAudioLLM/CosyVoice.git /build/CosyVoice && \
+    cd /build/CosyVoice && git checkout <COMMIT_SHA>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Dockerfile` at line 42, The Dockerfile's RUN git clone line currently pulls the latest default branch; change it to pin CosyVoice to a specific commit or tag to ensure reproducible builds by cloning and then checking out a known commit/hash (or cloning the specific tag/branch with --branch and --single-branch) instead of the floating default branch; update the RUN step that references "git clone --recursive --depth 1 https://github.com/FunAudioLLM/CosyVoice.git /build/CosyVoice" so it checks out a provided commit SHA or tag immediately after clone (or clones the tag directly) and document the chosen commit SHA/tag in the Dockerfile comment.

.github/workflows/release.yml (1)
66-66: Same reproducibility concern applies here.Consider pinning CosyVoice to a specific commit to ensure release builds are reproducible.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/release.yml at line 66, The git clone command fetching CosyVoice is not pinned and makes releases non-reproducible; update the clone step that currently targets https://github.com/FunAudioLLM/CosyVoice to clone a specific commit/SHA instead of the default branch—fetch the repo (as now), then checkout a fixed commit SHA (or use the repository URL with the exact commit reference) for CosyVoice so every release uses the same immutable version and rerun CI to verify.

app/src/lib/hooks/useGenerationForm.ts (1)
18-22: Add cross-field validation between `engine` and `modelSize`. The current schema allows invalid pairs (e.g., `engine: 'luxtts'` with `modelSize: 'v3'`). A `superRefine` check would prevent bad combinations before submission.

Possible schema refinement
```diff
-const generationSchema = z.object({
+const generationSchema = z
+  .object({
   text: z.string().min(1, '').max(50000),
   language: z.enum(LANGUAGE_CODES as [LanguageCode, ...LanguageCode[]]),
   seed: z.number().int().optional(),
   modelSize: z.enum(['1.7B', '0.6B', '1B', '3B', 'v2', 'v3']).optional(),
   instruct: z.string().max(500).optional(),
   engine: z
     .enum(['qwen', 'luxtts', 'chatterbox', 'chatterbox_turbo', 'tada', 'cosyvoice'])
     .optional(),
-});
+  })
+  .superRefine((data, ctx) => {
+    const engine = data.engine ?? 'qwen';
+    const size = data.modelSize;
+    if (!size) return;
+
+    const allowed: Record<string, string[]> = {
+      qwen: ['1.7B', '0.6B'],
+      tada: ['1B', '3B'],
+      cosyvoice: ['v2', 'v3'],
+      luxtts: [],
+      chatterbox: [],
+      chatterbox_turbo: [],
+    };
+
+    if (!allowed[engine]?.includes(size)) {
+      ctx.addIssue({
+        code: z.ZodIssueCode.custom,
+        path: ['modelSize'],
+        message: 'Invalid model size for selected engine',
+      });
+    }
+  });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/lib/hooks/useGenerationForm.ts` around lines 18 - 22, The Zod schema in useGenerationForm (the schema object that defines modelSize, instruct, engine) permits invalid engine/modelSize combinations; add a schema.superRefine that checks the engine and modelSize pair and calls ctx.addIssue when the combination is invalid (e.g., forbid luxtts with v2/v3, or enforce that certain engines only accept numeric sizes like '0.6B','1B', etc.). Locate the schema variable in useGenerationForm.ts and implement the cross-field rules inside superRefine using the ctx.addIssue API with a descriptive message and path ['modelSize'] or ['engine'] so the form shows the validation error. Ensure optional values are handled (skip check when one is undefined) and keep the logic centralized in one refinement function.

app/src/components/Generation/EngineModelSelector.tsx (1)
77-81: Consider extracting a shared language-compatibility fallback. This fallback block is repeated across multiple engine branches. A small helper (e.g., `ensureSupportedLanguage(form, engine)`) would reduce drift when engine language support changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 77 - 81, Extract the repeated language-compatibility fallback into a reusable helper (e.g., ensureSupportedLanguage(form, engine)) that calls getLanguageOptionsForEngine(engine), compares form.getValues('language') against available options, and sets form.setValue('language', available[0]?.value ?? 'en') when current language is unsupported; replace the inline block in EngineModelSelector (currently using getLanguageOptionsForEngine('cosyvoice') and form) with a call to ensureSupportedLanguage(form, 'cosyvoice') so all engine branches use the same centralized logic.

app/src/components/ServerSettings/GpuAcceleration.tsx (1)
247-280: Consider: Non-Tauri users running CUDA see no management options. When `isCurrentlyCuda` is true but `platform.metadata.isTauri` is false (e.g., the user manually started the CUDA backend and accesses it via browser), neither the "Switch to CPU" section (lines 247-277) nor the download/manage section (lines 280-379) renders. Users would see only the GPU status with no actions available.

This is likely acceptable since restart functionality requires Tauri, but you might consider adding a brief informational message for this edge case explaining that backend management requires the desktop app.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/ServerSettings/GpuAcceleration.tsx` around lines 247 - 280, When isCurrentlyCuda is true but platform.metadata.isTauri is false (user started CUDA backend outside Tauri) there are no management controls shown; add a small informational UI branch for this edge case so users know backend management requires the desktop app. Specifically, inside the condition that currently checks isCurrentlyCuda && platform.metadata.isTauri, add an else-if or adjacent block for isCurrentlyCuda && !platform.metadata.isTauri that renders a brief message (using the same styling as other notices) explaining that switching to CPU or restarting requires the Tauri desktop app and that no in-browser controls are available; reference the existing symbols isCurrentlyCuda, platform.metadata.isTauri, and handleSwitchToCpu (to clarify why the button is unavailable) and surface any existing error via the error variable as in the other branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Around line 74-77: The code currently trusts value.split(':') and writes
modelSize into form state; validate the parsed modelSize before calling
form.setValue. After const [, modelSize] = value.split(':'), check that
modelSize is one of the allowed values ('v2' or 'v3') and only then call
form.setValue('modelSize', modelSize as 'v2' | 'v3'); if it is invalid or
missing, handle it safely (e.g. skip setting, set a safe default, or early
return) and avoid propagating an invalid value to form state; update the block
around value.split, modelSize, form.setValue('engine', 'cosyvoice'), and
form.setValue('modelSize', ...) accordingly and consider logging or triggering
validation for malformed inputs.
In `@app/src/lib/hooks/useGenerationForm.ts`:
- Around line 88-92: The qwen branch builds model IDs and display labels using
data.modelSize which can be undefined (yielding "qwen-tts-undefined" vs a
display like "Qwen TTS 0.6B"); fix by defaulting modelSize to '1.7B' when
constructing the model id and the display label: use (data.modelSize ?? '1.7B')
wherever model id is computed (the ternary that sets modelId for engine ===
'qwen') and wherever the human-facing label is generated (the block that outputs
"Qwen TTS ...", lines around the display text), and make the same change in the
other occurrence referenced (the second block at lines ~104-110) so both model
selection and display remain consistent.
In `@scripts/package_cuda.py`:
- Around line 217-218: The help text for the CLI option defining the torch
compatibility range is stale: update the help string for the argument (the
parser.add_argument call that sets default=">=2.7.0,<2.11.0" for the
--torch-compat option in scripts/package_cuda.py) so it matches the new default
(change the displayed range from ">=2.6.0,<2.11.0" to ">=2.7.0,<2.11.0"); ensure
the --torch-compat help message and the default value are consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9bacccaa-4a86-4d26-8f63-a5a0eee63fda
📒 Files selected for processing (19)
- .github/workflows/release.yml
- .gitignore
- Dockerfile
- app/src/components/Generation/EngineModelSelector.tsx
- app/src/components/ServerSettings/GpuAcceleration.tsx
- app/src/components/ServerSettings/ModelManagement.tsx
- app/src/lib/api/types.ts
- app/src/lib/constants/languages.ts
- app/src/lib/hooks/useGenerationForm.ts
- backend/backends/__init__.py
- backend/backends/cosyvoice_backend.py
- backend/build_binary.py
- backend/models.py
- backend/requirements.txt
- backend/server.py
- backend/services/cuda.py
- docs/content/docs/developer/building.mdx
- justfile
- scripts/package_cuda.py
```python
    default=">=2.7.0,<2.11.0",
    help="Torch version compatibility range (default: >=2.6.0,<2.11.0)",
```
Help text shows stale default value.
The --torch-compat default was updated to >=2.7.0,<2.11.0 on Line 217, but the help text on Line 218 still shows >=2.6.0,<2.11.0.
Proposed fix

```diff
 parser.add_argument(
     "--torch-compat",
     type=str,
     default=">=2.7.0,<2.11.0",
-    help="Torch version compatibility range (default: >=2.6.0,<2.11.0)",
+    help="Torch version compatibility range (default: >=2.7.0,<2.11.0)",
 )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
parser.add_argument(
    "--torch-compat",
    type=str,
    default=">=2.7.0,<2.11.0",
    help="Torch version compatibility range (default: >=2.7.0,<2.11.0)",
)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/package_cuda.py` around lines 217 - 218, The help text for the CLI
option defining the torch compatibility range is stale: update the help string
for the argument (the parser.add_argument call that sets
default=">=2.7.0,<2.11.0" for the --torch-compat option in
scripts/package_cuda.py) so it matches the new default (change the displayed
range from ">=2.6.0,<2.11.0" to ">=2.7.0,<2.11.0"); ensure the --torch-compat
help message and the default value are consistent.
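One way to prevent this kind of default/help drift is to interpolate a single constant into both places (a hedged sketch; `TORCH_COMPAT_DEFAULT` is an illustrative name, not the script's actual constant):

```python
import argparse

# Single source of truth: the help text can no longer go stale when the
# default changes, because both read the same constant.
TORCH_COMPAT_DEFAULT = ">=2.7.0,<2.11.0"

parser = argparse.ArgumentParser()
parser.add_argument(
    "--torch-compat",
    type=str,
    default=TORCH_COMPAT_DEFAULT,
    help=f"Torch version compatibility range (default: {TORCH_COMPAT_DEFAULT})",
)

args = parser.parse_args([])
assert args.torch_compat == ">=2.7.0,<2.11.0"
assert TORCH_COMPAT_DEFAULT in parser.format_help()
```

argparse's `%(default)s` substitution in help strings achieves the same effect without an f-string.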
NVIDIA GeForce RTX 5060 Ti here

INFO: 127.0.0.1:61924 - "GET /history?limit=20 HTTP/1.1" 200 OK

So, I cannot generate audio using the GPU. nvidia-smi:

Thanks in advance
Same issue
Summary
- Upgrade the CUDA backend from cu126 to cu128 for RTX 50-series (Blackwell) support.
- Fix the GPU Acceleration settings panel (the "Switch to CPU Backend" button sat behind a `!isCurrentlyCuda` guard, making it impossible to switch back).
- Bump the torch minimum to `>=2.7.0` (required for cu128 compatibility).

Changes
CUDA cu126 → cu128 (8 locations)
- `backend/services/cuda.py` — `CUDA_LIBS_VERSION` constant
- `.github/workflows/release.yml` — PyTorch install, packaging args, release asset filenames
- `scripts/package_cuda.py` — CLI defaults and docstring
- `backend/build_binary.py` — CUDA torch restore URL
- `justfile` — dev setup PyTorch install
- `backend/requirements.txt` — torch minimum version
- `docs/content/docs/developer/building.mdx` — developer docs

GPU Settings UI fix
- `app/src/components/ServerSettings/GpuAcceleration.tsx` — moved "Switch to CPU Backend" into its own top-level conditional block that renders when `isCurrentlyCuda` is true, so users can actually switch back to CPU

Related issues
CUDA/GPU compatibility (#314, #313, #310, #301) — issues with CUDA 13.1, DirectML/AMD, RTX 5070, and CUDA errors during generation.
Closes #315