Hotfix: probe torch.cuda via subprocess + opaque install log #25
Merged
cryptopoly merged 1 commit into main on May 1, 2026
Two regressions reported from the v0.7.2 Windows smoke test (RTX 4090).

1. GPU bundle install fails with PermissionError on torch DLLs.

PR #22's _snapshot_torch_cuda probed the GPU by importing torch directly in the backend process. On Windows that loads torch/lib/*.dll (asmjit, cublas, cudnn, ...) into the process handle table, which then makes pip's --target install fail with

    PermissionError: [WinError 5] Access is denied: '...\extras\cp312\site-packages\torch\lib\asmjit.dll'

when shutil.rmtree tries to swap the existing torch wheel. DiffusersImageEngine.probe() already documents this exact trap and explicitly avoids importing torch; _snapshot_torch_cuda was undoing that protection.

Fix: spawn a short-lived Python subprocess that imports torch, prints {gpu_name, total, used} as JSON to stdout, and exits. The OS releases the DLL handles on process exit, so the next Install GPU runtime click can rmtree + replace torch in place. Prefer the embedded sidecar Python (CHAOSENGINE_EMBED_PYTHON_BIN) so the subprocess sees the same site-packages as the backend; fall back to sys.executable when the env var is not set.

Also skip the probe entirely on macOS: Apple Silicon has no torch.cuda, and the unified-memory path in _snapshot_macos owns that case.

2. InstallLogPanel still appears to overlap Prompt + Recent Outputs.

The previous rgba(0, 0, 0, 0.22) background let the sibling panel headers bleed through whenever the install log was visually adjacent to them, which read as 'overlap' even when the layout wasn't actually intersecting. Switch to var(--surface) for a fully opaque card background, and add 'contain: layout' so the panel's growth during a long torch download cannot leak into sibling grid rows.

Tests

- tests/test_gpu_detection.py rewritten to mock subprocess.run instead of sys.modules['torch']. Adds an explicit assertion that the probe never imports torch in the main process, so if anyone reverts to an in-process import, that test catches it.
- All existing tests still pass.
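
For readers who want the shape of the fix in item 1, here is a minimal sketch of a subprocess-based probe. It is not the code in backend_service/helpers/gpu.py: the function names, the probe script, the 15-second timeout, and the choice to report memory in bytes via torch.cuda.mem_get_info are all assumptions made for illustration. Only the CHAOSENGINE_EMBED_PYTHON_BIN variable, the sys.executable fallback, the macOS skip, and the {gpu_name, total, used} JSON keys come from the description above.

```python
import json
import os
import subprocess
import sys

# Hypothetical probe body: runs in the child process, so torch's DLLs are
# only ever loaded there and are released when the child exits.
_PROBE_SCRIPT = """
import json, torch
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info(0)  # bytes (assumed unit)
    print(json.dumps({
        "gpu_name": torch.cuda.get_device_name(0),
        "total": total,
        "used": total - free,
    }))
"""


def resolve_python_executable(environ=None):
    """Prefer the embedded sidecar Python, else the current interpreter."""
    environ = os.environ if environ is None else environ
    return environ.get("CHAOSENGINE_EMBED_PYTHON_BIN") or sys.executable


def snapshot_torch_cuda(timeout=15.0):
    """Return {gpu_name, total, used} or None, never importing torch here."""
    if sys.platform == "darwin":
        return None  # no torch.cuda on Apple Silicon; handled elsewhere
    try:
        result = subprocess.run(
            [resolve_python_executable(), "-c", _PROBE_SCRIPT],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None  # torch missing, no CUDA device, or the child crashed
    try:
        # Parse only the last line in case torch printed warnings first.
        return json.loads(lines[-1])
    except json.JSONDecodeError:
        return None
```

Every failure path returns None so that a missing or broken torch install degrades to "no GPU info" instead of raising inside the backend.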
Summary
Two regressions surfaced when smoke-testing the v0.7.2 rebuild on Windows / RTX 4090 — both blockers for shipping.
#1 — torch DLL lock prevents GPU bundle install
PR #22 added `_snapshot_torch_cuda`, which did `import torch` in the backend process. On Windows that pins `torch/lib/*.dll` (asmjit, cublas, cudnn, ...) into the process handle table. The next click on Install GPU runtime runs `pip install --target`, which calls `shutil.rmtree` on the existing torch dir, hits the locked DLLs, and crashes:

```
PermissionError: [WinError 5] Access is denied:
'...\extras\cp312\site-packages\torch\lib\asmjit.dll'
```
`DiffusersImageEngine.probe()` already documents this exact trap (it deliberately uses `find_spec` instead of importing torch). PR #22 was undoing that protection.

Fix: spawn a short-lived Python subprocess that imports torch, prints `{gpu_name, total, used}` as JSON, and exits. The OS releases the DLL handles on subprocess exit so the next install can swap torch in place. Prefer the embedded sidecar Python (`CHAOSENGINE_EMBED_PYTHON_BIN`); fall back to `sys.executable`.

Skip the probe entirely on macOS: Apple Silicon has no `torch.cuda`, and `_snapshot_macos` owns that path.

#2 — Install log appears to overlap Prompt + Recent Outputs
PR #23's `position: relative; z-index: 5` won the stacking battle, but the install-log-panel kept its translucent `rgba(0, 0, 0, 0.22)` background, so the Prompt + Recent Outputs panel headers bled through visually whenever the install log was adjacent to them. This reads as "overlap" even when the layout doesn't actually intersect.

Fix: switch the background to `var(--surface)` for a fully opaque card, and add `contain: layout` so the panel's growth during a long torch download can't leak into sibling grid rows.

Changes
- `backend_service/helpers/gpu.py`: `_snapshot_torch_cuda` now spawns a Python subprocess; new `_resolve_python_executable` picks the embedded sidecar Python first; macOS short-circuits to `None`.
- `tests/test_gpu_detection.py`: rewritten to mock `subprocess.run` instead of `sys.modules['torch']`. Adds an explicit assertion that the probe never imports torch in the main process, so if anyone reverts to an in-process import, this test catches it.
- `src/styles.css`: `.install-log-panel` gets `background: var(--surface)` + `contain: layout`.

Test plan

- `.venv/bin/python -m pytest tests/test_gpu.py tests/test_gpu_detection.py tests/test_inference.py tests/test_setup_routes.py -q`: all pass (30 in test_gpu* alone, 1 expected skip)
- `_snapshot_macos` (untouched).
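
The regression guard described for `tests/test_gpu_detection.py` could look roughly like the sketch below. It does not reproduce the real test file: `probe_gpu_via_subprocess` is a hypothetical stand-in for `_snapshot_torch_cuda`, and the fake GPU payload is made up. The idea it illustrates is the one from the description: mock `subprocess.run`, then assert that `torch` never shows up in the main process's `sys.modules`.

```python
import json
import subprocess
import sys
from unittest import mock


def probe_gpu_via_subprocess():
    """Hypothetical stand-in for the probe: torch runs only in a child."""
    result = subprocess.run(
        [sys.executable, "-c", "import torch  # real probe body elided"],
        capture_output=True,
        text=True,
        timeout=15,
        check=False,
    )
    if result.returncode != 0:
        return None
    return json.loads(result.stdout)


def test_probe_never_imports_torch_in_main_process():
    payload = {"gpu_name": "Fake GPU", "total": 24, "used": 2}
    fake = subprocess.CompletedProcess(
        args=[], returncode=0, stdout=json.dumps(payload), stderr=""
    )
    # patch.dict restores sys.modules on exit, so dropping torch is safe.
    with mock.patch.dict(sys.modules):
        sys.modules.pop("torch", None)
        with mock.patch("subprocess.run", return_value=fake) as run:
            assert probe_gpu_via_subprocess() == payload
            run.assert_called_once()
        # An in-process `import torch` would have repopulated this entry.
        assert "torch" not in sys.modules
```

Because `subprocess.run` is mocked, the test passes on machines without torch or a GPU, and it fails if someone reintroduces an in-process import.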