Feature/video gen #3

Merged

cryptopoly merged 55 commits into main from feature/video-gen on Apr 20, 2026

Conversation

@cryptopoly (Owner)

No description provided.

Pulls brand block, nav list, and footer out of App.tsx into
src/components/Sidebar.tsx. Identical JSX and filtering (Conversion tab
still hidden on non-Darwin). Prep work for upcoming sidebar grouping
and video-gen menu additions.

Reorganises the primary nav into collapsible groups with monochrome
stroke icons:

- Standalone: Dashboard, Chat, Server, Logs, Settings
- Grouped:    Models (Discover, My Models)
              Images (Discover, My Models, Studio, Gallery)
              Benchmarks (Run, History)
              Tools (Conversion, Fine-Tuning, Prompts, Plugins)

TabId remains flat so programmatic setActiveTab calls from
useImageState and elsewhere keep working. Adds a useSidebarPrefs
hook backed by localStorage that remembers collapsed groups, sidebar
mode (for Phase 3), and the last-clicked child per group.
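
A sketch of what the localStorage-backed prefs hook could look like (the
storage key, shape, and fallback behaviour here are assumptions, not the
shipped code):

```ts
// Hypothetical sketch of a localStorage-backed sidebar prefs hook.
import { useEffect, useState } from "react";

export interface SidebarPrefs {
  mode: "collapsible" | "tabs";
  collapsedGroups: string[];
  lastChildByGroup: Record<string, string>;
}

const STORAGE_KEY = "sidebar-prefs"; // assumed key
const DEFAULTS: SidebarPrefs = { mode: "collapsible", collapsedGroups: [], lastChildByGroup: {} };

export function useSidebarPrefs(): [SidebarPrefs, (next: SidebarPrefs) => void] {
  const [prefs, setPrefs] = useState<SidebarPrefs>(() => {
    try {
      const raw = localStorage.getItem(STORAGE_KEY);
      return raw ? { ...DEFAULTS, ...JSON.parse(raw) } : DEFAULTS;
    } catch {
      return DEFAULTS; // unreadable or corrupted storage falls back to defaults
    }
  });
  useEffect(() => {
    // Persist on every change so a reload restores the same layout.
    localStorage.setItem(STORAGE_KEY, JSON.stringify(prefs));
  }, [prefs]);
  return [prefs, setPrefs];
}
```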

Groups auto-expand while one of their children is active so users
never lose sight of where they are. Styling drops the standalone
outline boxes and the children's vertical guide line for a cleaner,
more unified look.

Adds four invariant tests in sidebarGroups.test.ts to prevent future
drift: each group's defaultChild exists, every tab.group references
a known group, non-placeholder groups have children, and grouped
tabs declare a shortLabel.
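
One of those invariants, roughly as it might read in vitest (the group
metadata below is a stand-in, not the real sidebarGroups.ts):

```ts
// Hypothetical invariant test; the SIDEBAR_GROUPS sample is illustrative.
import { describe, expect, it } from "vitest";

interface SidebarGroup { id: string; defaultChild: string; children: string[] }

const SIDEBAR_GROUPS: SidebarGroup[] = [
  { id: "models", defaultChild: "my-models", children: ["my-models", "discover"] },
  { id: "benchmarks", defaultChild: "run", children: ["run", "history"] },
];

describe("sidebar group invariants", () => {
  it("every group's defaultChild is one of its own children", () => {
    for (const group of SIDEBAR_GROUPS) {
      expect(group.children).toContain(group.defaultChild);
    }
  });
});
```
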
Adds a second sidebar style alongside the collapsible one, chosen via
a new Settings > Appearance panel:

- Collapsible (default): group headers expand/collapse children inline
- Tabs: groups render as a single compact sidebar button; clicking one
  jumps to its last-used child (or defaultChild on first visit), and a
  horizontal SubtabBar above the workspace content shows sibling tabs.

useSidebarPrefs is now lifted to App so Sidebar, SubtabBar, and the
Settings panel share the same persisted state (mode, collapsedGroups,
lastChildByGroup). Sidebar no longer calls the hook internally; it
receives prefs as props.

SubtabBar only renders when mode is "tabs" and the active tab belongs
to a group. Conversion is filtered out on non-Darwin so the bar stays
consistent with the sidebar.

Within the Models and Images groups, list the locally-installed tab
("My Models" / "Image Models") before the Discover tab. Since the
group's defaultChild maps to whichever tab is first in each group's
natural order via sidebarGroups.ts metadata, this makes "what's on
this machine" the landing view instead of "search and download".

Search remains one click away as the second child.

Adds a Video group to the sidebar alongside Models and Images, with
four child tabs mirroring the Images layout — My Models, Discover,
Studio, Gallery. Routing is wired end-to-end (TabId union, tabs.ts,
sidebarGroups.ts, App.tsx) and every tab renders a shared
VideoPlaceholderTab component.

The placeholder surfaces the planned first-wave engines (LTX-Video,
Wan 2.2, HunyuanVideo, Mochi 1) so users understand what's coming
and what's on the roadmap. No engine code ships — this is purely
UX shell, ready for the video runtime to slot in behind it.

Test updated: the previous "video group may be empty" exception is
gone; every declared group must now have at least one child tab.

Backs the Video tab UX with a minimal, contract-locked API surface so
the frontend can start consuming real data ahead of the runtime:

- backend_service/catalog/video_models.py — first-wave engine catalog
  (LTX-Video, Wan 2.2, HunyuanVideo, Mochi 1) with size, resolution,
  duration defaults, and task support. Mirrors the image catalog shape
  so the frontend can reuse rendering code.
- backend_service/routes/video.py — FastAPI router with:
  * GET  /api/video/catalog   -> planned engines (populated)
  * GET  /api/video/runtime   -> "not available" status (mirrors image
    runtime shape for future drop-in)
  * GET  /api/video/library   -> empty list
  * GET  /api/video/outputs   -> empty list
  * POST /api/video/generate  -> 501
  * POST /api/video/preload   -> 501
  * POST /api/video/download  -> 501
- tests/test_video_routes.py — 9 contract tests covering catalog shape,
  runtime flags, library/outputs empty state, and 501 on unimplemented
  endpoints.
- src/types.ts — VideoModelVariant, VideoModelFamily, VideoCatalogResponse,
  VideoRuntimeStatus, VideoModelTask.
- src/api.ts — getVideoCatalog() and getVideoRuntime() client methods.

No runtime code — purely contract + data + types. The engine slots in
behind these routes when it's ready.
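
The client half might look roughly like this (the inline apiFetch stand-in
is an assumption; the real helper lives elsewhere in src/api.ts):

```ts
// Hypothetical sketch of the two api.ts client methods.
import type { VideoCatalogResponse, VideoRuntimeStatus } from "./types";

// Stand-in for the project's authenticated fetch helper.
async function apiFetch(path: string, token = ""): Promise<Response> {
  const headers: Record<string, string> = token ? { Authorization: `Bearer ${token}` } : {};
  return fetch(path, { headers });
}

export async function getVideoCatalog(): Promise<VideoCatalogResponse> {
  const res = await apiFetch("/api/video/catalog");
  if (!res.ok) throw new Error(`catalog request failed: ${res.status}`);
  return res.json();
}

export async function getVideoRuntime(): Promise<VideoRuntimeStatus> {
  const res = await apiFetch("/api/video/runtime");
  if (!res.ok) throw new Error(`runtime probe failed: ${res.status}`);
  return res.json();
}
```
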
Video generation runtime lands behind the previously-scaffolded API.
The frontend contract is unchanged — endpoints now return real data
from diffusers instead of static placeholders.

Runtime module (backend_service/video_runtime.py):
- DiffusersVideoEngine with probe(), preload(), unload(). Generation
  deliberately raises NotImplementedError for a future phase.
- PIPELINE_REGISTRY mapping each first-wave repo to the right diffusers
  pipeline class: LTXPipeline, MochiPipeline, WanPipeline,
  HunyuanVideoPipeline.
- Dependency probe separates core deps (diffusers, torch, accelerate,
  huggingface_hub, pillow — gate realGenerationAvailable) from output
  deps (imageio, imageio-ffmpeg — warn only, needed for mp4 encoding).
- Device detection: cuda > mps > cpu with bf16 / fp16 / fp32 fallback.
- Memory-saving loaders enabled by default: attention slicing, VAE
  slicing + tiling, with sequential CPU offload fallback if the model
  doesn't fit on the chosen device.
- VideoRuntimeManager facade mirrors ImageRuntimeManager for clean
  state-level substitution.

Helpers (backend_service/helpers/video.py):
- _find_video_variant / _find_video_variant_by_repo
- _video_variant_available_locally via local snapshot validation
- _video_model_payloads enriches catalog variants with availableLocally
  + hasLocalData so the frontend can render install state.

State wiring (backend_service/state.py):
- ChaosEngineState now exposes state.video_runtime, initialised the
  same way as state.image_runtime.

Pydantic models:
- VideoRuntimePreloadRequest / VideoRuntimeUnloadRequest
- VideoGenerationRequest shape defined early so the frontend can type
  against it ahead of the generate endpoint lighting up.

Routes (backend_service/routes/video.py):
- GET  /api/video/catalog   now returns per-variant availableLocally
- GET  /api/video/runtime   delegates to state.video_runtime.capabilities()
- POST /api/video/preload   real: 404 unknown / 409 not-installed / 200 OK
- POST /api/video/unload    real
- GET  /api/video/library   filters catalog by local snapshot readiness
- POST /api/video/generate  still 501 (next phase)
- POST /api/video/download  still 501 (next phase)

Tests:
- tests/test_video_runtime.py — 14 unit tests covering probe, pipeline
  routing, preload/unload lifecycle, and the not-yet-implemented
  generate() guard. No weights are loaded in tests.
- tests/test_video_routes.py — extended to 14 contract tests covering
  the real preload/unload paths and probe-backed runtime endpoint.

Verified: pytest 377 + 4 subtests passing, tsc clean, vitest 114/114.
Live probe on M4 Max 64GB reports activeEngine=diffusers, device=mps,
realGenerationAvailable=true with imageio flagged for generation.

Phase 8: wires /api/video/download, /status, /cancel, /delete to the
existing state.start_download plumbing so the Video tab can pull curated
model snapshots from Hugging Face. Endpoints are locked down to repos in
VIDEO_MODEL_FAMILIES — unknown repos 404 before any download runs.

Adds Wan 2.1 1.3B (~2.5GB) and 14B variants to the catalog and registers
both in PIPELINE_REGISTRY. Both route to the same WanPipeline class as
Wan 2.2 since the version lives in the weights, not the pipeline code.
The 1.3B variant is the intended first-download target for end-to-end
testing on modest hardware.

state.start_download now composes image-then-video post-download
validation so Wan 2.1's diffusers layout is checked the same way LTX or
Hunyuan would be. _unload_repo_from_runtimes also evicts from
video_runtime on delete. _known_repo_size_gb falls back to the video
catalog for preflight size hints.

Frontend api.ts gains downloadVideoModel, getVideoDownloadStatus,
cancelVideoDownload, deleteVideoDownload, preloadVideoModel, and
unloadVideoModel using the existing DownloadStatus types.

Tests: 9 new contract tests for the download routes cover unknown-repo
404s, missing-field 422s, empty status, non-video repo filtering, video
repo surfacing, and cancel/delete on repos that were never downloaded.
The actual HF snapshot_download happy path is left for integration
testing against a real Wan 2.1 1.3B pull — no engine internals are
mocked. PipelineRegistryTests updated with the three Wan variants plus
a new test asserting they all route to WanPipeline.

Real HF preflight on a running Wan 2.1 T2V 1.3B download reports 16.37GB,
not 2.5GB. The "1.3B" is just the transformer — the repo also ships a
UMT5-XXL text encoder (~11GB) and VAE/CLIP weights. Catalog sizeGb now
reflects the full repo footprint for both 1.3B (16.4GB) and 14B (45GB)
variants, so the Discover tab won't mislead users before a download
starts.

Test isolation: make_client() now redirects HF_HUB_CACHE,
HUGGINGFACE_HUB_CACHE, and HF_HOME into the per-test tempdir. Without
this, a unit test calling delete_download on a valid video repo would
physically wipe the user's in-progress snapshot on disk — a real bug we
hit once the first real Wan 2.1 download was running while tests tried
to exercise the same repo. Environment is snapshotted per test and
restored in tearDown.

Replace VideoPlaceholderTab with four live tabs backed by the catalog,
runtime, and download endpoints landed in earlier phases:

- VideoDiscoverTab — search/filter the curated families, download, pause,
  resume, delete, deep-link to the model card.
- VideoModelsTab — list installed variants with preload/unload controls
  driven by the diffusers video runtime.
- VideoStudioTab — model picker plus prompt/negative prompt/seed scaffold
  with the Generate button disabled until Phase 10 lands the mp4 loop.
- VideoGalleryTab — empty-state placeholder until outputs exist.

Adds useVideoState (simpler cousin of useImageState) with 2s download
polling, catalog refresh, preload/unload handlers, and selection sync;
plus utils/videos.ts and types/video.ts for discover filters and
runtime error helpers. The placeholder component is left untouched but
no longer referenced from App.tsx.
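
The 2s polling loop might reduce to something like this (the hook name and
status shape are assumptions, not the real useVideoState internals):

```ts
// Hypothetical sketch of the 2s download polling.
import { useEffect, useState } from "react";

export interface DownloadStatus { active: boolean; progress?: number } // assumed shape

export function useDownloadPolling(
  enabled: boolean,
  fetchStatus: () => Promise<DownloadStatus>,
) {
  const [status, setStatus] = useState<DownloadStatus | null>(null);
  useEffect(() => {
    if (!enabled) return;
    let cancelled = false;
    const tick = async () => {
      try {
        const next = await fetchStatus();
        if (!cancelled) setStatus(next);
      } catch {
        // Transient failures are ignored; the next tick retries.
      }
    };
    void tick();
    const id = setInterval(tick, 2000); // 2s polling while a download is live
    return () => {
      cancelled = true;
      clearInterval(id);
    };
  }, [enabled, fetchStatus]);
  return status;
}
```
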
The base Wan-AI/Wan2.* and tencent/HunyuanVideo repos ship in their
native formats without model_index.json, so diffusers' from_pretrained
can't load them directly. Switch the catalog + pipeline registry to the
-Diffusers mirrors (and hunyuanvideo-community/HunyuanVideo) which ship
the standard diffusers layout WanPipeline and HunyuanVideoPipeline
expect.

This was caught when a 16GB Wan 2.1 1.3B download finished at 99% and
then failed preload with "missing model_index.json; found Wan2.1_VAE.pth,
diffusion_pytorch_model.safetensors, ...".

Phase 10 ships the real video generation loop. DiffusersVideoEngine.generate
now runs the pipeline through diffusers, encodes frames to mp4 via
imageio-ffmpeg, and returns a GeneratedVideo. The FastAPI route persists the
artifact to a day-bucketed outputs directory (mp4 + JSON sidecar) and the
Video Studio / Gallery tabs surface live runs via the existing auth flow.

Backend:
- DiffusersVideoEngine.generate() with _invoke_pipeline / _encode_frames_to_mp4
  test seams; HunyuanVideoPipeline's missing negative_prompt kwarg is handled
  and MPS generators are built on CPU per diffusers docs.
- VideoRuntimeManager.generate() facade returns (GeneratedVideo, runtime_dict)
  and refuses to run when the runtime isn't ready rather than shipping a fake
  clip.
- New outputs CRUD in backend_service/helpers/video.py plus DataLocation
  fields. /api/video/generate persists the artifact, /api/video/outputs lists
  saved clips, /api/video/outputs/{id}/file streams the mp4 via FileResponse,
  and DELETE removes both the mp4 and JSON sidecar.

Frontend:
- api.ts: generateVideo, getVideoOutputs, deleteVideoOutput, and
  fetchVideoOutputBlobUrl (uses apiFetch + createObjectURL so the auth token
  isn't smuggled into a query string for the <video> src; sketched after this list).
- useVideoState: videoOutputs state, handleVideoGenerate (resolves resolution
  + frame count from the selected variant, validates seed, navigates to
  gallery on success), handleDeleteVideoOutput.
- VideoStudioTab: generation is no longer placeholder — the Generate button
  is gated on a real set of preconditions and surfaces specific reasons via
  the button title.
- VideoGalleryTab: real gallery with per-card <video> element backed by a
  blob URL, reveal + delete actions.
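
A sketch of the blob-URL trick from the api.ts bullet above (the token
plumbing is simplified here; the real code goes through apiFetch):

```ts
// fetchVideoOutputBlobUrl, roughly: authenticate via headers (never the
// query string), then mint a blob: URL the <video> element can use as src.
export async function fetchVideoOutputBlobUrl(
  outputId: string,
  token: string, // assumption: the real code pulls this from apiFetch internals
): Promise<string> {
  const res = await fetch(`/api/video/outputs/${outputId}/file`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`stream failed: ${res.status}`);
  const blob = await res.blob();
  // Callers should URL.revokeObjectURL() the result on unmount to avoid leaks.
  return URL.createObjectURL(blob);
}
```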

Tests:
- 20 video runtime tests including generate happy path, empty-frame rejection,
  output-deps enforcement, and the manager facade.
- Video routes: 404 / 409 / 422 error paths, a full generate -> persist ->
  stream -> delete round-trip, RuntimeError surfacing as 400, and the 410
  Gone path when metadata exists but the mp4 is gone from disk.
- Full suite: 399 Python tests + 114 TS tests pass; tsc clean.

Surface a file-manager shortcut on each installed variant card so users
can jump from the UI to the underlying HF snapshot on disk. Exposes a
`localPath` field on image/video variant payloads (populated only when
something is actually on disk) and wires the existing handleRevealPath
handler through all four tabs.

Tabs updated: ImageModelsTab, ImageDiscoverTab (via LatestImageDiscoverCard),
VideoModelsTab, VideoDiscoverTab. The button is platform-aware via the
shared `fileRevealLabel` tooltip ("Show in Finder"/"Show in Explorer"/
"Show in Files").

Upstream v0.1.4 renamed ``generate_dflash_once`` to
``stream_dflash_generate`` (a streaming iterator) and removed the
non-speculative baseline fallback branch. The PyPI artefact is still
stuck at 0.1.0, so pin every install site to the git tag instead.

``_generate_dflash`` now drives the iterator to completion and pulls
the final ``{"event": "summary", ...}`` payload — whose shape matches
the old single-dict return — so the rest of the worker is unchanged.

Two fixes for the video gen loop:

1. snapshot_download for video repos was pulling every legacy checkpoint
   sibling — repos like Lightricks/LTX-Video ship the diffusers pipeline
   layout plus half a dozen standalone safetensors at root, so the same
   2 GB pipeline was advertised as a 200+ GB download and crawled at 1%.
   Plumb an optional allow_patterns list through the HF helper and
   populate it with the standard diffusers-pipeline folders for any
   repo that belongs to VIDEO_MODEL_FAMILIES. Non-video repos keep the
   current unfiltered behaviour.

2. Video Studio dropdown listed every catalogued variant, which was
   confusing when nothing was installed yet. Filter to variants with
   local data (or a live download) and fall back to an inline CTA that
   sends the user to Video Discover when the library is empty.

Video generation needs imageio + imageio-ffmpeg to write mp4 output. They
already appear in videoRuntimeStatus.missingDependencies but previously
only surfaced as muted badges, leaving users to pip install from a
terminal. Now the Video Studio renders an "Install mp4 encoder" callout
when either dep is missing and installs both via the existing
/api/setup/install-package endpoint, then re-probes the runtime.

Both entries are whitelisted in _INSTALLABLE_PIP_PACKAGES. Added 3 tests
covering the new pip install path and a regression guard on the
allowlist keys so the UI contract doesn't silently drift.

The LTX-Video failure ("no file named config.json found in
/snapshots/...") came from a partial download whose model_index.json
listed transformer/, vae/, and tokenizer/ but the subfolders never
landed on disk. validate_local_diffusers_snapshot now walks the index
and flags any required component whose folder is missing or has no
config, so users see "missing components: transformer, tokenizer, vae"
plus the redownload guidance instead of a cryptic OSError from inside
diffusers.from_pretrained.

Adds imageOutputsDirectory and videoOutputsDirectory settings so the
delivery folder for finished renders can point at an external SSD,
Dropbox, or a client folder without moving the entire data directory.
Empty string keeps the default {data dir}/images/outputs or
/videos/outputs. The image and video helpers in app.py resolve the
override on every call, so changes apply to the next generation
without a backend restart.

Replaces the time-based progress estimates in the image + video
generation modals with a live signal driven by the backend pipelines,
and adds a video generation modal mirroring the existing image one.

Backend
- New `progress.ProgressTracker` thread-safe scratchpad with
  `IMAGE_PROGRESS` / `VIDEO_PROGRESS` module singletons. Phase IDs
  (loading / encoding / diffusing / decoding / saving) match what the
  modals already render.
- `image_runtime` and `video_runtime` wrap their generate paths in
  `begin()` / `set_phase()` / `finish()` and feed
  `callback_on_step_end` into `set_step()` so the bar follows the real
  diffuser step counter. TypeError fallback chain drops the callback,
  then `negative_prompt`, for older diffusers signatures.
- New GET `/api/images/progress` and `/api/video/progress` endpoints
  expose the singleton snapshot.

Frontend
- New `useGenerationProgress` hook polls the endpoint every 500ms
  while the modal is busy and reports `null` when the backend goes
  idle so callers can fall back to estimates.
- `LiveProgress` accepts an optional `realProgress` prop. When live,
  it picks the active phase by ID instead of by elapsed time and the
  diffusion phase fills proportionally to `step / totalSteps` rather
  than to a guess (see the sketch after this list).
- New `VideoGenerationModal` mirrors `ImageGenerationModal`: opens at
  generation start, shows the live progress bar, then swaps to the
  rendered clip via the same blob-URL trick as the gallery. Replaces
  the old "switch to gallery on success" behaviour.
- `useVideoState` now drives the modal lifecycle and exposes the run
  metadata (model, prompt, frames, fps, steps, needsPipelineLoad).
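
The proportional-fill logic from the LiveProgress bullet reduces to a small
pure function; the snapshot shape and exact phase handling are assumptions:

```ts
// Hypothetical fill helper; phase IDs match the commit text, names assumed.
const PHASES = ["loading", "encoding", "diffusing", "decoding", "saving"] as const;
type Phase = (typeof PHASES)[number];

interface ProgressSnapshot { phase: Phase; step: number; totalSteps: number }

// Fraction of the diffusion bar to fill: 0 before it starts, step/totalSteps
// while diffusing, 1 once a later phase is active, null when the backend is
// idle so the caller can fall back to time-based estimates.
export function diffusionFillFraction(snap: ProgressSnapshot | null): number | null {
  if (!snap) return null;
  const idx = PHASES.indexOf(snap.phase);
  const diffusingIdx = PHASES.indexOf("diffusing");
  if (idx < diffusingIdx) return 0;
  if (idx > diffusingIdx) return 1;
  return snap.totalSteps > 0 ? Math.min(1, snap.step / snap.totalSteps) : 0;
}
```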

Tests
- New `tests/test_progress.py` (20 tests) locks the tracker lifecycle,
  snapshot contract, post-finish no-op, and endpoint shape /
  isolation. Full pytest, vitest, and `tsc --noEmit` all green.

Wan 2.1 1.3B at catalog defaults (832×480 × 96 frames × 50 steps) blew up
Apple Silicon's MPS backend with a 73 GB attention-matrix allocation,
killing the sidecar mid-generation. The user had no way to dial the run
down without editing code. Now the Studio exposes width / height /
frames / fps / steps / guidance as form fields, and useVideoState seeds
them with short, MPS-safe defaults (33 frames, 30 steps, 5.0 guidance)
pulled from the variant's ``recommendedResolution`` hint.

Frames snap to ``(n - 1) % 4 == 0`` because Wan-family pipelines reject
anything else — the input field steps by 4 and the generate handler
re-snaps defensively in case a stale value slips through. The selected
variant's id drives a reset effect that refreshes width/height from the
catalog hint but leaves user edits alone on unrelated catalog refreshes.
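
The snap itself is tiny; a sketch (function name and rounding choice assumed):

```ts
// Hypothetical sketch: snap a frame count to the nearest n with
// (n - 1) % 4 == 0, which Wan-family pipelines require.
export function snapFrameCount(n: number): number {
  return Math.max(1, Math.round((n - 1) / 4) * 4 + 1);
}

// snapFrameCount(33) === 33; snapFrameCount(34) === 33; snapFrameCount(35) === 37
```
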
``test_outputs_is_empty_until_generation_lands`` failed on any machine
where the user had generated a real video, because ``make_client`` only
redirected HF cache env vars — not the module-level ``VIDEO_OUTPUTS_DIR``
and ``SETTINGS_PATH`` constants in ``backend_service.app``. Those capture
the user's real ``~/.chaosengine/`` paths at import time, so the
``/api/video/outputs`` route kept reading live artifacts from disk.

``make_client`` now patches ``VIDEO_OUTPUTS_DIR``, ``IMAGE_OUTPUTS_DIR``,
and ``SETTINGS_PATH`` into the per-test tempdir and stashes the originals
under sentinel keys (``__video_outputs_dir__`` etc.) in the returned
snapshot dict. ``restore_env`` restores the module attrs on teardown
while skipping the sentinels in its env-var loop.

The SETTINGS_PATH patch matters because a user-set
``videoOutputsDirectory`` in real settings would otherwise bypass our
patched ``VIDEO_OUTPUTS_DIR`` and point the test back at a real location.

Existing tests that patch ``VIDEO_OUTPUTS_DIR`` themselves
(``VideoOutputsPersistenceTests``, ``VideoOutputFileTests``) keep
working: they layer their own override on top of the tempdir value and
restore to it, before ``restore_env`` unwinds to the real original.

Two distinct user-reported bugs:

**Can't delete digits in Frames / Width / Height / FPS / Steps / Guidance.**
The previous ``onChange={Number(event.target.value) || fallback}`` pattern
treated an empty string as 0, then ``|| fallback`` snapped back to the
default — so deleting the last digit of "33" reinstated "33" instantly.
The only workaround was to type the new digits before removing the old
ones (e.g. type "320" then delete the stray leading digit). Now every
numeric input carries
``NaN`` for "user is mid-edit / field is empty", renders as "" via a
``displayNumber`` helper, and snaps to a fallback only on blur.
``handleVideoGenerate`` guards every numeric field against ``NaN`` so
nothing invalid reaches the backend.
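
A sketch of the NaN-as-mid-edit pattern (helper names assumed):

```ts
// Hypothetical sketch: NaN means "field is empty / user is mid-edit".
export function displayNumber(value: number): string {
  return Number.isNaN(value) ? "" : String(value);
}

// onChange parses loosely: empty (or junk) becomes NaN, never a fallback.
export function parseNumericInput(raw: string): number {
  return raw.trim() === "" ? NaN : Number(raw);
}

// Only blur snaps an unfinished edit back to the fallback value.
export function snapOnBlur(value: number, fallback: number): number {
  return Number.isNaN(value) ? fallback : value;
}
```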

**Generate button stayed disabled after a Wan 2.1 MPS crash.**
The hover-only tooltip made the disable reason invisible unless the user
thought to hover. Now the reason is surfaced inline as a muted
"Generate disabled: <reason>" line under the button, so any stuck state
is obvious at a glance. Plus, ``handleVideoGenerate`` now runs
``refreshVideoData`` in the catch path — a sidecar crash can leave
``videoRuntimeStatus`` stale (``realGenerationAvailable`` flipping false
with no trigger to re-probe), so a forced resync on failure gets the
Studio back to a known-good state without requiring a page reload.

The previous safety heuristic used a flat MPS-strict token threshold,
which over-warned on beefy Macs (a 64 GB M4 Max saw "caution" at any
clip >40 frames even though it could clearly handle more) and treated
CUDA the same as MPS. Now the heuristic estimates peak attention memory
from (tokens² × bytes × calibrated-multiplier) and compares it against
the device's effective budget — scaling cleanly from an 8 GB base M1
to a 128 GB M3 Ultra to a 24 GB RTX 4090 with one formula.

Backend surfaces ``deviceMemoryGb`` on ``VideoRuntimeStatus`` from the
existing ``gpu`` helper (sysctl on macOS, nvidia-smi on CUDA). The
Studio now shows an always-visible capacity line under the numeric
controls ("Apple Silicon · 64 GB total · this run ≈ 2.9 GB of
attention memory") so users can see their headroom before generate,
not just when something's already wrong. The warning callout also
talks in GB rather than opaque latent-token counts.

Calibrated against the Wan 2.1 T2V 1.3B 832×480 × 96-frame crash
report: that config correctly lands in "danger" on 16 GB, "caution"
on a 40 GB A100, and "safe" on a 128 GB Mac.

The old estimate only summed attention memory, which under-counted real
resident memory by ~25×. On a 64 GB M4 Max, Wan 2.1 T2V 1.3B at 40 frames
looked "safe" but crashed the backend with an 88 GB allocation: the model
weights + UMT5-XXL text encoder alone occupy ~23 GB resident, before any
attention activations.

Changes:
- assessVideoGenerationSafety now takes baseModelFootprintGb and
  estimates resident footprint with device-class fragmentation factors
  (MPS 1.4x, CPU 1.3x, CUDA 1.05x).
- Peak memory = modelFootprintGb + attentionPeakGb (sketched after this list).
- Short-circuit when the model alone exceeds the caution budget: hand
  back a null suggestion and a "try a smaller model" message instead of
  pretending a smaller clip will help.
- Studio capacity line breaks out model vs attention cost, and the
  safety callout swaps the "Use safer settings" button for "Browse
  smaller models" when no safer settings can rescue the run.
- Fallback indicator on the capacity line when the backend hasn't
  reported real device memory yet (e.g. stale sidecar before restart):
  shows "~16 GB (default — restart backend for real detection)" so the
  conservative default is obvious instead of being presented as truth.
- 9 new videos.test.ts cases covering the Wan 2.1 crash scenario,
  model-too-big short-circuit, device-class fragmentation ordering,
  and ignoring non-positive/NaN footprints.
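
Roughly, the refined estimate (the fragmentation factors come from this
change; the attention constants are illustrative assumptions):

```ts
// Hypothetical sketch of the two-part peak-memory estimate.
type DeviceClass = "mps" | "cuda" | "cpu";

const FRAGMENTATION: Record<DeviceClass, number> = { mps: 1.4, cpu: 1.3, cuda: 1.05 };

export function estimatePeakGb(
  latentTokens: number,
  bytesPerElement: number,     // e.g. 2 for fp16/bf16
  attentionMultiplier: number, // calibrated against the Wan 2.1 crash report
  modelFootprintGb: number,
  device: DeviceClass,
): number {
  // Attention peak scales with tokens squared.
  const attentionPeakGb =
    (latentTokens * latentTokens * bytesPerElement * attentionMultiplier) / 1024 ** 3;
  // Resident model cost carries a device-class fragmentation factor.
  const residentModelGb = modelFootprintGb * FRAGMENTATION[device];
  return residentModelGb + attentionPeakGb;
}
```
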
The My Models table's RAM and Compressed columns were reading
matchedVariant.estimatedMemoryGb from the catalog, computed by a crude
params_b * quant_factor + 1.6 formula against whichever family variant
scored best. When a user's installed model had no close catalog variant
(e.g. Qwen2.5-0.5B, Qwen3-8B, Qwen3-0.6B) the match collapsed onto the
family's flagship, so three wildly different models all reported the
same ~76.6 GB RAM — higher than the biggest 67 GB model in the table.

Replaces the lookup with estimateLibraryItemResidentGb() /
estimateLibraryItemCompressedGb() in utils/library.ts, which start from
item.sizeGb (ground truth — MLX, GGUF, and safetensors all store weights
at runtime precision) and add a small KV-cache term for an 8K GQA
context plus framework overhead. The compressed variant halves the KV
term, matching the ChaosEngine/TurboQuant/RotorQuant cache strategies.

Falls back to the catalog estimate only when sizeGb is unknown or
non-positive, which should be rare.

New library.test.ts coverage:
- Monotonicity across disk sizes (the Qwen regression)
- Sane ballpark for 0.5B and 8B models
- Fallback to catalog estimate on 0/NaN/negative sizeGb
- Null when both on-disk and catalog data are missing
- Compressed < uncompressed at short contexts, with a small delta
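
A sketch of the estimate's shape (the KV and overhead constants here are
illustrative, not the shipped numbers; the real helpers live in
utils/library.ts):

```ts
// Hypothetical sketch of an on-disk-size-based resident estimate.
export function estimateResidentGb(
  sizeGb: number,            // ground truth: weights are stored at runtime precision
  kvCacheGbAt8k: number,     // small KV-cache term for an 8K GQA context (assumed input)
  frameworkOverheadGb = 1.0, // assumed flat overhead
  compressed = false,
): number | null {
  if (!Number.isFinite(sizeGb) || sizeGb <= 0) return null; // caller falls back to catalog
  const kv = compressed ? kvCacheGbAt8k / 2 : kvCacheGbAt8k;  // compressed caches halve KV
  return sizeGb + kv + frameworkOverheadGb;
}
```
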
Two issues hit Windows users from a fresh clone:

1) build.ps1 fails with EPERM during stage-runtime's fs.rmSync of
   .runtime-stage\windows-x86_64. Node's `force: true` does not actually
   chmod read-only files writable on Windows, and Defender / Explorer /
   pip-installed package metadata regularly hold transient locks that
   surface as EPERM/EBUSY/ENOTEMPTY. The whole staging run blew up
   instead of retrying.

2) tauri:dev appeared to hang for 5-15 minutes after staging optional
   packages. It was actually tarring the multi-GB staging tree into
   src-tauri/resources/embedded/runtime-*.tar.gz — output that the
   Tauri shell ignores in dev mode (see src-tauri/src/lib.rs around the
   "development embedded runtime detected; preferring source workspace"
   branch, which returns None and falls back to the live workspace
   without ever extracting the archive).

Changes:
- New safeRmSync() helper: clears the read-only bit recursively before
  rm, then retries on EPERM/EBUSY/ENOTEMPTY/EACCES with exponential
  backoff. macOS/Linux keep the fast path. On final failure surfaces an
  actionable message pointing at Windows Defender exclusions (sketched
  after this list).
- All five fs.rmSync sites in stage-runtime.mjs now use safeRmSync.
- Skip the tar archive when not in --mode=release. Stale archive (if
  any) is removed so the Rust shell does not reach for it. Release
  builds keep the archive untouched.
- build.ps1: pre-clear .runtime-stage to dodge the EPERM root cause,
  and install the same extras stage-runtime validates against
  (`pip install -e ".[desktop,images]"`) instead of a hand-maintained
  short list that was missing diffusers/torch/etc.
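
The retry shape from the first bullet, as a TypeScript-flavoured sketch
(the shipped helper is plain JS in stage-runtime.mjs; attempt count and
backoff timings here are assumptions):

```ts
import fs from "node:fs";
import path from "node:path";

const RETRYABLE = new Set(["EPERM", "EBUSY", "ENOTEMPTY", "EACCES"]);

// rmSync({ force: true }) does not chmod read-only files writable on Windows,
// so clear the read-only bit over the whole tree first.
function clearReadOnly(target: string): void {
  const stat = fs.lstatSync(target, { throwIfNoEntry: false });
  if (!stat) return;
  fs.chmodSync(target, 0o700);
  if (stat.isDirectory()) {
    for (const entry of fs.readdirSync(target)) clearReadOnly(path.join(target, entry));
  }
}

function sleepSync(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* busy-wait is acceptable in a build script */ }
}

export function safeRmSync(target: string, attempts = 5): void {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      if (process.platform === "win32") clearReadOnly(target);
      fs.rmSync(target, { recursive: true, force: true });
      return;
    } catch (err) {
      const code = (err as NodeJS.ErrnoException).code ?? "";
      if (!RETRYABLE.has(code) || attempt === attempts - 1) throw err;
      sleepSync(100 * 2 ** attempt); // exponential backoff for transient locks
    }
  }
}
```
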
Python 3.14 added a stdlib `compression` namespace package (PEP 784) that
exposes `compression.zstd`, `compression.bz2`, and `compression._common`.
Our local regular package shadowed it, and anything in the Python import
graph that reached for `compression._common` (including torch's internal
imports in some environments) failed with
`ModuleNotFoundError: No module named 'compression._common'`.

Renaming to `cache_compression` keeps the package descriptive, avoids the
stdlib collision, and doesn't clash with any PyPI package. All 5 adapter
modules, 10 backend import sites, the test suite, the stage-runtime
validator, the pre-build check, and docs are updated. Directory moves
preserved via `git mv` so history survives.

build.ps1 needs no rename-specific change — it installs via
`pip install -e ".[desktop,images]"` and picks up the new package name
from pyproject.toml's packages.find include list.

The chat-oriented My Models tab already filters on
``modelType === "text" || !item.modelType`` in App.tsx, but discovery was
classifying every video pipeline as "text" because there was no video
detector. Result: HunyuanVideo, Mochi-1, Wan2.x T2V, etc. all appeared
alongside LLMs (and in the chat picker), even while still mid-download.

Add `_looks_like_video_model()` with a curated keyword set covering every
family in ``backend_service/catalog/video_models.py`` plus common T2V/I2V
markers, and run it ahead of the image check in ``_iter_discovered_models``
so the ``modelType="video"`` tag is stable even before ``model_index.json``
lands. Keywords are deliberately specific ("hunyuanvideo" not "hunyuan",
"wan2" not "wan", "mochi-1" not "mochi") so the image-model Hunyuan and
unrelated LLMs don't get swept up.

Video models now only surface in the dedicated Video → My Models view.

Version bump to 0.6.0, updated across pyproject.toml, package.json, package-lock.json,
src-tauri/Cargo.toml, src-tauri/Cargo.lock (project entry only — the
unrelated `tower` crate also at 0.5.3 was left alone), and
src-tauri/tauri.conf.json. Added a 0.6.0 CHANGELOG entry summarising
this release: the Python 3.14 compression-rename fix, on-disk-size-based
library RAM estimates, video models kept out of the LLM list, video-gen
memory safety that includes the model footprint, and the Windows staging
hardening (EPERM retries, dev-mode tar skip, pyproject-driven extras in
build.ps1).

When Wan 2.1 OOMs MPS on Apple Silicon, the Python sidecar is killed by
Metal (no graceful exception) and every in-flight fetch rejects with
WebKit's literal "TypeError: Load failed". That string was being stored
verbatim in the runtime status message, so the Studio displayed "Load
failed" / "ENGINE: UNAVAILABLE" / "Fallback active" — opaque copy that
read like a Diffusers problem rather than a transport problem. Worse,
even after the Tauri-managed sidecar restarted and `backendOnline`
flipped back to true, the video runtime was never re-probed, so the
Generate button stayed stuck disabled even when the user picked a
smaller model (LTX-Video) that would have run fine.

Three fixes:

* Translate WebKit's "Load failed" (and Chromium's "Failed to fetch")
  into "Backend is not responding — try Restart Backend." in
  videoRuntimeErrorStatus, so users see actionable copy instead of the
  raw transport error.
* Add a backend-recovery effect that fires refreshVideoData() when
  backendOnline transitions false→true. Uses a ref to track the
  previous value so we don't refetch on first mount (already covered
  by App.tsx's initial load effect).
* Add a model-change retry effect: when the runtime is stuck in
  activeEngine === "unavailable" and the user picks a different model,
  re-probe the backend. Gives the user a natural "try again with a
  smaller model" path without needing to know about Restart Backend.

Tests cover all four canonical fetch-transport error strings and lock
in that real backend errors (e.g. "Diffusers is not installed") still
pass through unchanged.
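
Sketches of the first two fixes (names and shapes are assumptions; the
shipped helper also covers two more canonical transport strings):

```ts
import { useEffect, useRef } from "react";

const TRANSPORT_ERRORS = ["Load failed", "Failed to fetch"]; // WebKit, Chromium

export function translateVideoRuntimeError(message: string): string {
  return TRANSPORT_ERRORS.some((needle) => message.includes(needle))
    ? "Backend is not responding — try Restart Backend."
    : message; // real backend errors ("Diffusers is not installed") pass through
}

// Refetch when backendOnline flips false -> true; the ref skips the first
// mount, which App.tsx's initial load effect already covers.
export function useBackendRecovery(backendOnline: boolean, refresh: () => void): void {
  const prev = useRef(backendOnline);
  useEffect(() => {
    if (!prev.current && backendOnline) refresh();
    prev.current = backendOnline;
  }, [backendOnline, refresh]);
}
```
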
The Settings tab had grown to seven panels stacked on a 2-column grid
(Appearance, Data Directory, Delivery Folders, Model Directories, Remote
Providers, Hugging Face, Integrations) — squished horizontally on narrow
widths and forcing users to scroll past unrelated controls to reach the
one they wanted.

Group the panels into four sections and navigate between them:

* General — Appearance
* Storage — Data Directory, Delivery Folders, Model Directories
* Providers — Remote Providers, Hugging Face
* Integrations — tooling snippets

The navigation style mirrors the user's existing sidebarMode preference,
so the in-page UX matches the rest of the app:

* tabs mode gets a horizontal pill bar across the top, reusing the same
  .subtab-bar / .subtab classes as the top-level sub-tabs.
* collapsible mode gets a vertical menu down the left (the macOS / iOS
  Settings idiom) with label + short hint per section, sticky on
  scroll. On narrow widths (< 900 px) the menu collapses to a horizontal
  strip so it doesn't eat the content column.

Section selection is local component state — users land on General each
time, matching how most Settings UIs open. A :has() rule in content-grid
drops to a single column when only one panel is in the active section
(e.g. General = Appearance only) so it doesn't look lonely in half a
2-col grid.

No prop surface changes — the SettingsTab already received sidebarMode
from App.tsx for the Appearance panel, so the layout switch reuses it.

The collapsible-mode variant of the Settings page (vertical menu down
the left) felt clunky at the viewport widths this app actually runs at
— too much wasted horizontal space for four shortish labels, and the
sticky-aside positioning interacted awkwardly with the workspace
content frame's scroll.

Use the horizontal sub-tab bar unconditionally for Settings. The
app-wide ``sidebarMode`` preference still controls the top-level
sidebar (where the collapsible/tabs trade-off makes sense) — Settings
just no longer cascades from it.

Removes the ``.settings-side-nav*`` CSS, the ``data-mode`` attribute
hook, the side-nav-button branch in the render, and the per-section
``hint`` field that only the side-menu used. Leaves the section split
itself (General / Storage / Providers / Integrations) and the
``:has()`` rule that gives single-panel sections (e.g. General) a
full-width grid.

Storage is the densest section of Settings — three panels (Data
Directory, Delivery Folders, Model Directories) plus a scrollable list
— so a flat 2-col ``content-grid`` either squished Model Directories'
list into half the row height or dropped one of the small panels into
an orphan second column.

Give Storage its own 2-col layout: the two small ``where files live``
panels stack in the left column, and Model Directories takes the full
height of the right column so its list + add-row can breathe. Each
Settings section now owns its own grid wrapper inside a bare
``.settings-content`` frame, so future sections can pick the shape
that fits them without fighting a shared outer grid.

Surface the resolved delivery paths in the input fields themselves,
not just a placeholder, so users can read the full effective location
even when they haven't set an override — matching what the backend
prints in logs. A ``default`` badge next to the row and a disabled
``Reset to default`` button together make it unambiguous whether a
path is inherited from the data directory or explicitly overridden.

At narrow widths (<=900px) the storage grid collapses to a single
column along with the other multi-col settings layouts.

The Studio used to default to "mps" whenever the backend probe hadn't
reported a device — sidecar dead, Failed to fetch, first-launch race.
On Windows that surfaced "close to the safe limit on Apple Silicon
(MPS) (~8 GB of 16 GB total)" to RTX 4090 users, which both reads as
nonsense and undersells the actual hardware budget.

assessVideoGenerationSafety now buckets the unknown case from
navigator.userAgentData / navigator.platform: macOS stays on MPS,
everything else falls through to CUDA. The result also surfaces
effectiveDevice + effectiveDeviceWasInferred so VideoStudioTab can
label the device correctly when the backend hasn't yet reported one.

Also adds CogVideoX 2B and 5B to the catalog and pipeline registry —
THUDM's open-weight family fills the gap between LTX (2 GB) and Wan
2.2 (14 GB) and is the most-requested missing model in the discover
list.
The "tiktoken is required" mid-generate error from LTX-Video had no
escape route in the UI — only the mp4 encoder bucket got a one-click
install. Same problem for Wan / HunyuanVideo / CogVideoX users when
sentencepiece or protobuf is missing.

Probe now also looks for tiktoken / sentencepiece / protobuf / ftfy
and surfaces any missing ones in missingDependencies. The Studio
groups them into a separate "Install missing video dependencies"
panel that names exactly which packages it'll add, so the user can
fix the LTX failure without dropping to a terminal.

handleInstallVideoOutputDeps grew an optional packages list — the
existing mp4 encoder button keeps its hardcoded pair, the new button
passes whatever the probe surfaced.

git checkout writes "Updated 1 path from the index" to stderr even when
it succeeds. With $ErrorActionPreference = "Stop" at the top of the
script, the ``2>&1 | Out-Null`` redirect wraps that stderr text as a
NativeCommandError — so the cleanup step crashes the build *after* the
NSIS installer has already been produced.

Use git's --quiet flag to suppress the message and a ``2>$null`` file
redirect to swallow anything else (file redirects bypass PowerShell's
stream-wrapping logic). Keep an explicit $LASTEXITCODE check so a real
git failure (e.g. dirty working tree) still surfaces as a build error.

build.ps1 was patching tauri.conf.json to run stage:runtime (dev mode)
before bundling. Dev-mode staging skips building the tar.gz runtime
archive AND writes mode=development into the manifest — so the embedded
Python runtime wasn't in the installer, and the Tauri shell looked for a
live source workspace at the customer's install path that doesn't exist.
Result: a 3 MB installer that can't actually boot the backend.

Switched to stage:runtime:release, added a pre-flight llama.cpp check
with three clear remediation paths, and introduced
CHAOSENGINE_RELEASE_ALLOW_NO_LLAMA=1 for operators who want a
diffusers-only installer without having to compile llama.cpp locally.
stage-runtime.mjs now honours that env var in strict mode — a loud
warning instead of a hard failure.

/api/video/runtime was calling get_gpu_metrics() on every probe, which
shells out to nvidia-smi synchronously. On Windows each spawn flashed
a console window and cost 1-3s, and because total VRAM never changes
we paid that cost on every call. Combined with FastAPI's sync route
blocking a worker, the frontend's 15s fetchJson timeout was firing
and surfacing as "Failed to fetch".

Three fixes stacked defensively:
  - Pass CREATE_NO_WINDOW to every subprocess.check_output in
    helpers/gpu.py so nvidia-smi / sysctl / ioreg no longer pop a
    console window on Windows. Cuts spawn latency too.
  - New get_device_vram_total_gb() with a process-wide cache. The
    first probe pays the shelling-out cost; subsequent ones are a
    dict lookup. video_runtime._detect_device_memory_gb routes
    through it instead of the full live snapshot.
  - Bump getVideoRuntime's fetchJson timeout 15s -> 30s so the very
    first probe of a sidecar's life (which also imports torch) has
    headroom on cold disks.

Added cache and Windows-flag tests in test_gpu.py.

Windows PowerShell 5.1's parser rejects '&&' as an invalid statement
separator even when it appears inside a double-quoted string literal
(PS 7+ and pwsh don't have this issue). Replaced with '; ' so the
script parses on stock Windows installs.

Windows PowerShell 5.1 has a parser quirk where a tokenization error
inside a comment (like the literal string '&&' I used to describe the
previous fix) cascades into parse recovery that then mis-tokenizes
later comments containing apostrophes - reporting things like "token
's' unexpected" on lines like "customer's install path".

Rewrote the new comments to avoid apostrophes and the double-ampersand
literal entirely. Pre-existing "PowerShell's" on the git-checkout
comment is untouched because it parsed cleanly in 223b6af before my
new block triggered recovery mode.

Video Studio was showing "Backend is not responding - try Restart
Backend" whenever /api/video/runtime raised Failed-to-fetch, which
directly contradicted the global BACKEND ONLINE pill driven from
/api/health. The two probes are independent - the sidecar can be up
and answering health checks while the video probe fails (common
during a backend restart, or the first probe of a sidecar's life
while torch is importing on Windows).

Renamed the translated message to "Video runtime did not respond"
so the UI stops claiming the whole backend is dead, and added a
separate branch for fetchJson's "timed out after Xs" error - those
mean the backend accepted the request but didn't answer in time,
which is different from failed-to-fetch and deserves its own copy.

Tests pinned: the message must name "video runtime" (not "backend"),
and the timeout branch must explicitly mention "timed out".

The cmake hint line contained "..\llama.cpp\ (cmake -B build; cmake
--build build)", and Windows PowerShell 5.1 mis-tokenizes the "\ ("
sequence inside a double-quoted string as a subexpression start -
triggering "Missing closing ')' in expression" and a cascade of
parser recovery errors that made the whole if/else block look
unclosed.

Split the hint across two Write-Host calls and dropped parens from
the strings entirely. Plain prose sidesteps the quirk and reads
just as clearly at the prompt.

Four related fixes so a cold Windows sidecar stops leaving the Studios
stuck on "Failed to fetch":

- Warm up PyTorch in a background thread at sidecar startup so the
  video runtime probe returns immediately with an "initializing" status
  instead of blocking on a 30-60s import.
- Whitelist the core image runtime packages (diffusers, torch,
  accelerate, huggingface_hub, pillow) and surface a one-click
  "Install image runtime" button in Image Studio.
- Always offer Restart Backend when the runtime probe failed, not only
  when Tauri manages the sidecar.
- Replace the opaque "Failed to fetch" copy with an honest explanation
  of the 30-60s cold-import window, and poll every 5s while the engine
  is initializing or unavailable so the promised auto-refresh actually
  happens.

Release dates help users spot ancient models that aren't worth downloading.
Curated variants now carry `releaseDate` and the Discover tabs fall back to
the live Hugging Face `createdAt` value, both rendered via a shared
`formatReleaseLabel` helper so the label stays consistent.

File sizes on installed cards were off by 30+ GB for FLUX.1 Schnell because
(a) downloads pulled duplicate standalone safetensors the diffusers pipeline
never loads, and (b) the card showed the curated estimate instead of what is
actually on disk. Image downloads now use an allowlist that mirrors the
video one (keeps the pipeline layout, skips legacy single-file checkpoints),
and both image and video payloads surface the real snapshot directory size
so the installed cards can display "X GB on disk".

build.ps1 silently reported "Build complete!" when tauri build had already
failed because $ErrorActionPreference=Stop does not catch native command
failures. Add Assert-LastExit after every pip/npm/node/npx/git call so a
non-zero exit actually aborts the script, and always run the tauri.conf.json
restore even when the build failed.

The inline node -e "..." that patched tauri.conf.json was fragile under
PowerShell quoting — one misparse left the JSON empty, which then cascaded
into a confusing "EOF while parsing a value at line 1 column 0" from the
next tauri build. Extract the patch/restore into scripts/patch-tauri-conf.mjs
so there is no quoting surface area, and self-heal an empty config by
restoring from git before patching.

On Windows, `pip install torch` from PyPI delivers the CPU-only wheel, which
leaves an RTX 4090 idle and makes FLUX.1 Dev take 7+ minutes per diffusion
step. build.ps1 now installs torch from the CUDA 12.1 index first (override
via CHAOSENGINE_TORCH_INDEX_URL) and the image runtime probe returns an
actionable hint when torch is CPU-only on a machine with nvidia-smi present
— so users see "reinstall with the CUDA wheel" instead of a silent hang.

The Chat tab was also showing a sticky "Failed to fetch" banner after cold
start because refreshImageData / refreshVideoData raced the sidecar's port
bind and surfaced the resulting TypeError as a global error. Add
isTransientNetworkError() and swallow those in both refresh paths — real
HTTP errors still bubble up as before.
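
A sketch of the transient-error check (the helper shape is assumed; the
transport strings are the canonical WebKit and Chromium messages named
earlier in this PR):

```ts
export function isTransientNetworkError(err: unknown): boolean {
  if (!(err instanceof TypeError)) return false; // real HTTP errors still surface
  return err.message.includes("Failed to fetch") || err.message.includes("Load failed");
}
```

In the refresh paths, a catch block can then swallow only these and
rethrow everything else.
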
The backend always required a bearer token on /api and /v1, which blocked
external clients (OpenWebUI, curl, sibling apps) from connecting to the
local server even when the user wanted them to. Introduce a requireApiAuth
setting (default true, so existing installs stay secure) that gates the
auth middleware, surface it as a "Require API token" checkbox on the Server
tab next to LAN access, and persist it through the settings PATCH path so
the toggle hot-applies without a server restart.

Also honour CHAOSENGINE_REQUIRE_AUTH in create_app — values of 0, false,
no, or off disable auth regardless of the saved setting, which is useful
for headless/CI runs that don't have a settings.json to edit.

Tests cover three cases: tokenless request is rejected when the toggle is
on, succeeds when the toggle is off, and flipping the toggle via PATCH
/api/settings takes effect on the very next request without restarting
the app. Env-override path has its own test.

Two related fixes for Windows/Linux users with NVIDIA GPUs where the CPU-only
torch wheel and a slow cold-disk probe were conspiring to make image and
video generation look broken.

- Extract the nvidia-smi presence check into backend_service.helpers.gpu as
  shared nvidia_gpu_present() / gpu_status_snapshot() helpers, add a matching
  /api/system/gpu-status endpoint (auth-exempt so the banner works regardless
  of token state), and surface an amber "Running on CPU" banner in App.tsx
  with a persistent Dismiss. Video runtime now emits the same CUDA-wheel
  install hint the image runtime already did.

- Reorder DiffusersVideoEngine.probe() so torch_warmup_status() runs before
  any importlib.util.find_spec / nvidia-smi work. On a cold NTFS volume the
  find_spec + subprocess cost was pushing /api/video/runtime past the
  frontend's 30s fetch budget; Chromium aborted the fetch with
  "Failed to fetch" and the Studio froze on "ENGINE: UNAVAILABLE" even
  though /api/health kept responding. "not_started" now kicks off the warmup
  and returns fast, and the warmup worker pre-caches VRAM detection and the
  dep find_spec lookups so the post-warmup probe is a hashmap lookup.

Every Windows build attempt failed with parser errors whose location kept
shifting (string terminators, missing braces), which was the tell: the
file itself wasn't malformed, it was being decoded wrong.

PS 5.1 on Windows reads scripts without a BOM as the system ANSI code
page (Windows-1252), not UTF-8. build.ps1 had twelve non-ASCII
characters (em-dashes and box-drawing chars used as section dividers in
comments). Each one's three-byte UTF-8 sequence decodes as three garbage
Windows-1252 characters on the user's machine, and those characters land
inside strings or near braces —
which is why the reported error location kept moving as we fixed
individual strings around them.

Fix: replace the em-dashes and box-drawing separators with ASCII dashes
file is pure ASCII and encoding detection becomes moot. Also update
the stale comment at the bottom to document the real root cause for
future maintainers.

Only build.ps1 is affected; no other .ps1 files in the repo have the
same issue.

- build.ps1: missing llama-server.exe now warns and continues by default;
  opt into the old strict behaviour with CHAOSENGINE_REQUIRE_LLAMA=1
- build.ps1 / build.sh / stage-runtime.mjs: upgrade setuptools>=77 before
  the vendor/ChaosEngine install so PEP 639 license strings validate
- install-cuda-torch: sweep leftover ~* pip stubs and split the install
  into --force-reinstall --no-deps + plain deps pass, so swapping torch
  no longer tries to overwrite markupsafe/_speedups.pyd while it's loaded
- new publish-artifacts.mjs collects .exe/.dmg/.app/.deb/.AppImage/.msi
  into a flat assets/ folder at the repo root; wired into all build paths
- Auto-download a prebuilt Vulkan llama.cpp release from ggml-org in
  stage-runtime.mjs when no local build is present, caching under
  ~/.chaosengine/prebuilt-llama/. The Windows installer now ships with
  native inference without requiring the user to clone llama.cpp and
  run cmake + VS Build Tools first.
- Pin setuptools to >=77,<82 in build.ps1, build.sh, and stage-runtime.mjs
  so recent torch wheels (which declare setuptools<82) stop warning on
  every pip invocation.
- Verify torch.cuda.is_available() after the CUDA install when nvidia-smi
  is present; surface a loud warning (or fail with CHAOSENGINE_REQUIRE_CUDA_TORCH=1)
  so we catch "shipped CPU-only torch on an RTX 4090" at build time.
- Harden the Windows restart-backend path in lib.rs: capture taskkill's
  exit status, fall back to child.kill() on failure, and replace the
  blocking child.wait() with a bounded try_wait loop. Previously a
  non-zero taskkill left the BackendManager mutex held forever and
  deadlocked the UI's runtime_info poll.
cryptopoly merged commit f2416c0 into main on Apr 20, 2026 (1 check failed).

cryptopoly added a commit that referenced this pull request on Apr 21, 2026

User reported CUDA torch install succeeded but Image/Video Studio
still showed DEVICE: CPU and video generation ran on CPU despite
an RTX 4090 being present.

Root cause: the backend was in source-workspace launcher mode (Tauri
couldn't find / fell back from the embedded runtime), so Python ran
against the dev .venv. The extras-site-packages prepend to PYTHONPATH
only existed in apply_embedded_runtime_env, which the source-workspace
branch never calls. Result: Python started with no PYTHONPATH,
found torch in .venv (or nowhere), never looked at the 2.5 GB of
CUDA torch the GPU bundle install had just dropped into
~/.chaosengine/extras/site-packages/.

Evidence from user's diagnostics snapshot:
  PYTHONPATH      | null (not set)
  sysPath         | includes .venv/Lib/site-packages, missing extras

Fix: add the matching PYTHONPATH prepend in the source-workspace
branch of bootstrap(). Same shape as the embedded path, just without
the runtime-specific entries (source-workspace Python auto-discovers
.venv via sys.prefix, so we only need to inject extras to win over
whatever the dev venv happens to have).

Per Karpathy CLAUDE.md #3 (Surgical Changes): single additive block
inside an existing else branch. No existing logic modified. No new
helpers. Embedded path is untouched because it already handles this.

cargo check clean. No Python or TS tests exercise this path (it's
purely subprocess env-var wiring), verification is post-rebuild on
the user's Windows VM: /api/diagnostics/snapshot should show the
extras path in both PYTHONPATH and sysPath.

cryptopoly added a commit that referenced this pull request on Apr 21, 2026

User asked for: fixed-width terminal, single scroll region, step
counter showing progress. Previous per-step <details> cards were OK
on a 3-package install but stacked too tall on the 13-package GPU
bundle — output scrolled off-screen and users lost track of which
step was current.

New layout:

- Single monospace <pre> region, max-height 380px, auto-scrolls to
  bottom on new attempts (tail -f behaviour). Doesn't steal scroll
  on phase transitions — only on new output — so a user reading
  earlier lines doesn't get yanked forward (sketched after this list).
- Step line above the terminal shows 'Step 3/13: accelerate · 42%'
  while running, 'Final: 12/13 packages · 100%' when done.
- Per-attempt markers ([ OK ], [FAIL], [....] for in-progress) line
  up on the left edge so failures are scannable.
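
The scroll discipline from the first bullet might look like this (ref
wiring and names assumed):

```ts
// Hypothetical sketch of the tail -f scroll behaviour.
import { useEffect, useRef } from "react";

export function useTailScroll(outputLength: number) {
  const preRef = useRef<HTMLPreElement | null>(null);
  useEffect(() => {
    const el = preRef.current;
    if (el) el.scrollTop = el.scrollHeight; // jump to the newest line
  }, [outputLength]); // keyed on output length: phase changes alone never scroll
  return preRef;
}
```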

Also strips pip's dep-resolver noise from the displayed output. The
user hit this on their Windows box where their .venv's leftover
turboquant-mlx-full declared an mlx>= constraint that will never be
satisfied on Windows — pip prints a scary-looking ERROR block
('chaos-engine-compressor ... requires safetensors, which is not
installed'), cosmetic but alarming. Raw attempt.output still has
the noise; we just filter it from the rendered terminal. Users who
want the full pip trace still get it via the attempts array.

Per CLAUDE.md #2 (Simplicity First): single component file, no new
abstractions, no new deps. Per #3 (Surgical Changes): only touched
InstallLogPanel.tsx + its CSS block. Per #4 (Goal-Driven): visible
improvement (scroll works, step counter visible, noise suppressed)
verifiable at a glance on next build.

174 TS tests still pass, tsc clean.

cryptopoly added a commit that referenced this pull request on Apr 21, 2026

Two UX asks from the user's latest round:

1. 'Image Studio doesn't say whether it's trying to use CPU or CUDA'
2. 'Even though I just installed everything, the Image Studio still
   shows an install GPU runtime button. I closed and reopened the
   app and it disappeared.'

## Fix C: Device chip

The chip at line 256 only rendered when runtime.device was set,
and since my earlier refactor removed the speculative torch import
from probe() the device is now null until a model is actually loaded.
Added an expectedDevice field to ImageRuntimeStatus that's
computed WITHOUT importing torch — find_spec + nvidia_gpu_present +
platform.machine — so we can show 'Device: cuda (expected)' before
the first Generate even fires. Same constraint as probe(): absolutely
no torch import (would pin torch/lib/*.dll and break the install
flow we just fixed).

The chip now reads:

  Device: cuda                (model loaded, actual device)
  Device: cuda (expected)     (torch installed + NVIDIA seen)
  Device: mps (expected)      (Apple Silicon)
  Device: cpu (expected)      (torch installed, no GPU)
  (hidden)                    (torch not installed)

## Fix D: Post-install restart nudge

Before this, after a successful GPU bundle install the Image Studio
still showed 'Install GPU runtime'. Root cause: backend's sys.path
is snapshotted at spawn time, so find_spec still reports torch as
missing until the backend restarts with the new PYTHONPATH (Fix A's
domain). User was confused — install said success, UI said install
again. App restart made it go away.

Split the 'runtime-not-available' block into two paths:

  - Post-install awaiting restart (job.phase === 'done' &&
    job.requiresRestart): show 'installed to <path>, restart to
    activate' + a Restart Backend button. No install button — you
    just installed, clicking it again would be confusing.
  - All other cases: show the install button as before.

Mirrored the same split to Video Studio for consistency.

Per CLAUDE.md #1 (Think Before Coding): surfaced the 'running
backend can't see new packages' reality to the user instead of
hiding it. Per #3 (Surgical Changes): added one field to the
backend status + one branch to each Studio tab; no shared
component extraction yet because we only use this shape in two
places and hoisting would obscure the per-tab differences.

478 Python tests pass, 174 TS tests pass, tsc clean.