Feature/video gen #3

Merged

cryptopoly merged 55 commits into main from feature/video-gen on Apr 20, 2026

Conversation

@cryptopoly (Owner)

No description provided.

Pulls brand block, nav list, and footer out of App.tsx into
src/components/Sidebar.tsx. Identical JSX and filtering (Conversion tab
still hidden on non-Darwin). Prep work for upcoming sidebar grouping
and video-gen menu additions.

Reorganises the primary nav into collapsible groups with monochrome
stroke icons:

- Standalone: Dashboard, Chat, Server, Logs, Settings
- Grouped:    Models (Discover, My Models)
              Images (Discover, My Models, Studio, Gallery)
              Benchmarks (Run, History)
              Tools (Conversion, Fine-Tuning, Prompts, Plugins)

TabId remains flat so programmatic setActiveTab calls from
useImageState and elsewhere keep working. Adds a useSidebarPrefs
hook backed by localStorage that remembers collapsed groups, sidebar
mode (for Phase 3), and the last-clicked child per group.
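
A sketch of what the localStorage-backed prefs hook could look like (the
storage key, shape, and fallback behaviour here are assumptions, not the
shipped code):

```ts
// Hypothetical sketch of a localStorage-backed sidebar prefs hook.
import { useEffect, useState } from "react";

export interface SidebarPrefs {
  mode: "collapsible" | "tabs";
  collapsedGroups: string[];
  lastChildByGroup: Record<string, string>;
}

const STORAGE_KEY = "sidebar-prefs"; // assumed key
const DEFAULTS: SidebarPrefs = { mode: "collapsible", collapsedGroups: [], lastChildByGroup: {} };

export function useSidebarPrefs(): [SidebarPrefs, (next: SidebarPrefs) => void] {
  const [prefs, setPrefs] = useState<SidebarPrefs>(() => {
    try {
      const raw = localStorage.getItem(STORAGE_KEY);
      return raw ? { ...DEFAULTS, ...JSON.parse(raw) } : DEFAULTS;
    } catch {
      return DEFAULTS; // unreadable or corrupted storage falls back to defaults
    }
  });
  useEffect(() => {
    // Persist on every change so a reload restores the same layout.
    localStorage.setItem(STORAGE_KEY, JSON.stringify(prefs));
  }, [prefs]);
  return [prefs, setPrefs];
}
```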

Groups auto-expand while one of their children is active so users
never lose sight of where they are. Styling drops the standalone
outline boxes and the children's vertical guide line for a cleaner,
more unified look.

Adds four invariant tests in sidebarGroups.test.ts to prevent future
drift: each group's defaultChild exists, every tab.group references
a known group, non-placeholder groups have children, and grouped
tabs declare a shortLabel.
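
One of those invariants, roughly as it might read in vitest (the group
metadata below is a stand-in, not the real sidebarGroups.ts):

```ts
// Hypothetical invariant test; the SIDEBAR_GROUPS sample is illustrative.
import { describe, expect, it } from "vitest";

interface SidebarGroup { id: string; defaultChild: string; children: string[] }

const SIDEBAR_GROUPS: SidebarGroup[] = [
  { id: "models", defaultChild: "my-models", children: ["my-models", "discover"] },
  { id: "benchmarks", defaultChild: "run", children: ["run", "history"] },
];

describe("sidebar group invariants", () => {
  it("every group's defaultChild is one of its own children", () => {
    for (const group of SIDEBAR_GROUPS) {
      expect(group.children).toContain(group.defaultChild);
    }
  });
});
```
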
Adds a second sidebar style alongside the collapsible one, chosen via
a new Settings > Appearance panel:

- Collapsible (default): group headers expand/collapse children inline
- Tabs: groups render as a single compact sidebar button; clicking one
  jumps to its last-used child (or defaultChild on first visit), and a
  horizontal SubtabBar above the workspace content shows sibling tabs.

useSidebarPrefs is now lifted to App so Sidebar, SubtabBar, and the
Settings panel share the same persisted state (mode, collapsedGroups,
lastChildByGroup). Sidebar no longer calls the hook internally; it
receives prefs as props.

SubtabBar only renders when mode is "tabs" and the active tab belongs
to a group. Conversion is filtered out on non-Darwin so the bar stays
consistent with the sidebar.

Within the Models and Images groups, list the locally-installed tab
("My Models" / "Image Models") before the Discover tab. Since the
group's defaultChild maps to whichever tab is first in each group's
natural order via sidebarGroups.ts metadata, this makes "what's on
this machine" the landing view instead of "search and download".

Search remains one click away as the second child.

Adds a Video group to the sidebar alongside Models and Images, with
four child tabs mirroring the Images layout — My Models, Discover,
Studio, Gallery. Routing is wired end-to-end (TabId union, tabs.ts,
sidebarGroups.ts, App.tsx) and every tab renders a shared
VideoPlaceholderTab component.

The placeholder surfaces the planned first-wave engines (LTX-Video,
Wan 2.2, HunyuanVideo, Mochi 1) so users understand what's coming
and what's on the roadmap. No engine code ships — this is purely
UX shell, ready for the video runtime to slot in behind it.

Test updated: the previous "video group may be empty" exception is
gone; every declared group must now have at least one child tab.

Backs the Video tab UX with a minimal, contract-locked API surface so
the frontend can start consuming real data ahead of the runtime:

- backend_service/catalog/video_models.py — first-wave engine catalog
  (LTX-Video, Wan 2.2, HunyuanVideo, Mochi 1) with size, resolution,
  duration defaults, and task support. Mirrors the image catalog shape
  so the frontend can reuse rendering code.
- backend_service/routes/video.py — FastAPI router with:
  * GET  /api/video/catalog   -> planned engines (populated)
  * GET  /api/video/runtime   -> "not available" status (mirrors image
    runtime shape for future drop-in)
  * GET  /api/video/library   -> empty list
  * GET  /api/video/outputs   -> empty list
  * POST /api/video/generate  -> 501
  * POST /api/video/preload   -> 501
  * POST /api/video/download  -> 501
- tests/test_video_routes.py — 9 contract tests covering catalog shape,
  runtime flags, library/outputs empty state, and 501 on unimplemented
  endpoints.
- src/types.ts — VideoModelVariant, VideoModelFamily, VideoCatalogResponse,
  VideoRuntimeStatus, VideoModelTask.
- src/api.ts — getVideoCatalog() and getVideoRuntime() client methods.

No runtime code — purely contract + data + types. The engine slots in
behind these routes when it's ready.
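
The client half might look roughly like this (the inline apiFetch stand-in
is an assumption; the real helper lives elsewhere in src/api.ts):

```ts
// Hypothetical sketch of the two api.ts client methods.
import type { VideoCatalogResponse, VideoRuntimeStatus } from "./types";

// Stand-in for the project's authenticated fetch helper.
async function apiFetch(path: string, token = ""): Promise<Response> {
  const headers: Record<string, string> = token ? { Authorization: `Bearer ${token}` } : {};
  return fetch(path, { headers });
}

export async function getVideoCatalog(): Promise<VideoCatalogResponse> {
  const res = await apiFetch("/api/video/catalog");
  if (!res.ok) throw new Error(`catalog request failed: ${res.status}`);
  return res.json();
}

export async function getVideoRuntime(): Promise<VideoRuntimeStatus> {
  const res = await apiFetch("/api/video/runtime");
  if (!res.ok) throw new Error(`runtime probe failed: ${res.status}`);
  return res.json();
}
```
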
Video generation runtime lands behind the previously-scaffolded API.
The frontend contract is unchanged — endpoints now return real data
from diffusers instead of static placeholders.

Runtime module (backend_service/video_runtime.py):
- DiffusersVideoEngine with probe(), preload(), unload(). Generation
  deliberately raises NotImplementedError for a future phase.
- PIPELINE_REGISTRY mapping each first-wave repo to the right diffusers
  pipeline class: LTXPipeline, MochiPipeline, WanPipeline,
  HunyuanVideoPipeline.
- Dependency probe separates core deps (diffusers, torch, accelerate,
  huggingface_hub, pillow — gate realGenerationAvailable) from output
  deps (imageio, imageio-ffmpeg — warn only, needed for mp4 encoding).
- Device detection: cuda > mps > cpu with bf16 / fp16 / fp32 fallback.
- Memory-saving loaders enabled by default: attention slicing, VAE
  slicing + tiling, with sequential CPU offload fallback if the model
  doesn't fit on the chosen device.
- VideoRuntimeManager facade mirrors ImageRuntimeManager for clean
  state-level substitution.

Helpers (backend_service/helpers/video.py):
- _find_video_variant / _find_video_variant_by_repo
- _video_variant_available_locally via local snapshot validation
- _video_model_payloads enriches catalog variants with availableLocally
  + hasLocalData so the frontend can render install state.

State wiring (backend_service/state.py):
- ChaosEngineState now exposes state.video_runtime, initialised the
  same way as state.image_runtime.

Pydantic models:
- VideoRuntimePreloadRequest / VideoRuntimeUnloadRequest
- VideoGenerationRequest shape defined early so the frontend can type
  against it ahead of the generate endpoint lighting up.

Routes (backend_service/routes/video.py):
- GET  /api/video/catalog   now returns per-variant availableLocally
- GET  /api/video/runtime   delegates to state.video_runtime.capabilities()
- POST /api/video/preload   real: 404 unknown / 409 not-installed / 200 OK
- POST /api/video/unload    real
- GET  /api/video/library   filters catalog by local snapshot readiness
- POST /api/video/generate  still 501 (next phase)
- POST /api/video/download  still 501 (next phase)

Tests:
- tests/test_video_runtime.py — 14 unit tests covering probe, pipeline
  routing, preload/unload lifecycle, and the not-yet-implemented
  generate() guard. No weights are loaded in tests.
- tests/test_video_routes.py — extended to 14 contract tests covering
  the real preload/unload paths and probe-backed runtime endpoint.

Verified: pytest 377 + 4 subtests passing, tsc clean, vitest 114/114.
Live probe on M4 Max 64GB reports activeEngine=diffusers, device=mps,
realGenerationAvailable=true with imageio flagged for generation.

Phase 8: wires /api/video/download, /status, /cancel, /delete to the
existing state.start_download plumbing so the Video tab can pull curated
model snapshots from Hugging Face. Endpoints are locked down to repos in
VIDEO_MODEL_FAMILIES — unknown repos 404 before any download runs.

Adds Wan 2.1 1.3B (~2.5GB) and 14B variants to the catalog and registers
both in PIPELINE_REGISTRY. Both route to the same WanPipeline class as
Wan 2.2 since the version lives in the weights, not the pipeline code.
The 1.3B variant is the intended first-download target for end-to-end
testing on modest hardware.

state.start_download now composes image-then-video post-download
validation so Wan 2.1's diffusers layout is checked the same way LTX or
Hunyuan would be. _unload_repo_from_runtimes also evicts from
video_runtime on delete. _known_repo_size_gb falls back to the video
catalog for preflight size hints.

Frontend api.ts gains downloadVideoModel, getVideoDownloadStatus,
cancelVideoDownload, deleteVideoDownload, preloadVideoModel, and
unloadVideoModel using the existing DownloadStatus types.

Tests: 9 new contract tests for the download routes cover unknown-repo
404s, missing-field 422s, empty status, non-video repo filtering, video
repo surfacing, and cancel/delete on repos that were never downloaded.
The actual HF snapshot_download happy path is left for integration
testing against a real Wan 2.1 1.3B pull — no engine internals are
mocked. PipelineRegistryTests updated with the three Wan variants plus
a new test asserting they all route to WanPipeline.

Real HF preflight on a running Wan 2.1 T2V 1.3B download reports 16.37GB,
not 2.5GB. The "1.3B" is just the transformer — the repo also ships a
UMT5-XXL text encoder (~11GB) and VAE/CLIP weights. Catalog sizeGb now
reflects the full repo footprint for both 1.3B (16.4GB) and 14B (45GB)
variants, so the Discover tab won't mislead users before a download
starts.

Test isolation: make_client() now redirects HF_HUB_CACHE,
HUGGINGFACE_HUB_CACHE, and HF_HOME into the per-test tempdir. Without
this, a unit test calling delete_download on a valid video repo would
physically wipe the user's in-progress snapshot on disk — a real bug we
hit once the first real Wan 2.1 download was running while tests tried
to exercise the same repo. Environment is snapshotted per test and
restored in tearDown.

Replace VideoPlaceholderTab with four live tabs backed by the catalog,
runtime, and download endpoints landed in earlier phases:

- VideoDiscoverTab — search/filter the curated families, download, pause,
  resume, delete, deep-link to the model card.
- VideoModelsTab — list installed variants with preload/unload controls
  driven by the diffusers video runtime.
- VideoStudioTab — model picker plus prompt/negative prompt/seed scaffold
  with the Generate button disabled until Phase 10 lands the mp4 loop.
- VideoGalleryTab — empty-state placeholder until outputs exist.

Adds useVideoState (simpler cousin of useImageState) with 2s download
polling, catalog refresh, preload/unload handlers, and selection sync;
plus utils/videos.ts and types/video.ts for discover filters and
runtime error helpers. The placeholder component is left untouched but
no longer referenced from App.tsx.
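
The 2s polling loop might reduce to something like this (the hook name and
status shape are assumptions, not the real useVideoState internals):

```ts
// Hypothetical sketch of the 2s download polling.
import { useEffect, useState } from "react";

export interface DownloadStatus { active: boolean; progress?: number } // assumed shape

export function useDownloadPolling(
  enabled: boolean,
  fetchStatus: () => Promise<DownloadStatus>,
) {
  const [status, setStatus] = useState<DownloadStatus | null>(null);
  useEffect(() => {
    if (!enabled) return;
    let cancelled = false;
    const tick = async () => {
      try {
        const next = await fetchStatus();
        if (!cancelled) setStatus(next);
      } catch {
        // Transient failures are ignored; the next tick retries.
      }
    };
    void tick();
    const id = setInterval(tick, 2000); // 2s polling while a download is live
    return () => {
      cancelled = true;
      clearInterval(id);
    };
  }, [enabled, fetchStatus]);
  return status;
}
```
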
The base Wan-AI/Wan2.* and tencent/HunyuanVideo repos ship in their
native formats without model_index.json, so diffusers' from_pretrained
can't load them directly. Switch the catalog + pipeline registry to the
-Diffusers mirrors (and hunyuanvideo-community/HunyuanVideo) which ship
the standard diffusers layout WanPipeline and HunyuanVideoPipeline
expect.

This was caught when a 16GB Wan 2.1 1.3B download finished at 99% and
then failed preload with "missing model_index.json; found Wan2.1_VAE.pth,
diffusion_pytorch_model.safetensors, ...".

Phase 10 ships the real video generation loop. DiffusersVideoEngine.generate
now runs the pipeline through diffusers, encodes frames to mp4 via
imageio-ffmpeg, and returns a GeneratedVideo. The FastAPI route persists the
artifact to a day-bucketed outputs directory (mp4 + JSON sidecar) and the
Video Studio / Gallery tabs surface live runs via the existing auth flow.

Backend:
- DiffusersVideoEngine.generate() with _invoke_pipeline / _encode_frames_to_mp4
  test seams; HunyuanVideoPipeline's missing negative_prompt kwarg is handled
  and MPS generators are built on CPU per diffusers docs.
- VideoRuntimeManager.generate() facade returns (GeneratedVideo, runtime_dict)
  and refuses to run when the runtime isn't ready rather than shipping a fake
  clip.
- New outputs CRUD in backend_service/helpers/video.py plus DataLocation
  fields. /api/video/generate persists the artifact, /api/video/outputs lists
  saved clips, /api/video/outputs/{id}/file streams the mp4 via FileResponse,
  and DELETE removes both the mp4 and JSON sidecar.

Frontend:
- api.ts: generateVideo, getVideoOutputs, deleteVideoOutput, and
  fetchVideoOutputBlobUrl (uses apiFetch + createObjectURL so the auth token
  isn't smuggled into a query string for the <video> src; sketched after this list).
- useVideoState: videoOutputs state, handleVideoGenerate (resolves resolution
  + frame count from the selected variant, validates seed, navigates to
  gallery on success), handleDeleteVideoOutput.
- VideoStudioTab: generation is no longer placeholder — the Generate button
  is gated on a real set of preconditions and surfaces specific reasons via
  the button title.
- VideoGalleryTab: real gallery with per-card <video> element backed by a
  blob URL, reveal + delete actions.
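
A sketch of the blob-URL trick from the api.ts bullet above (the token
plumbing is simplified here; the real code goes through apiFetch):

```ts
// fetchVideoOutputBlobUrl, roughly: authenticate via headers (never the
// query string), then mint a blob: URL the <video> element can use as src.
export async function fetchVideoOutputBlobUrl(
  outputId: string,
  token: string, // assumption: the real code pulls this from apiFetch internals
): Promise<string> {
  const res = await fetch(`/api/video/outputs/${outputId}/file`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`stream failed: ${res.status}`);
  const blob = await res.blob();
  // Callers should URL.revokeObjectURL() the result on unmount to avoid leaks.
  return URL.createObjectURL(blob);
}
```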

Tests:
- 20 video runtime tests including generate happy path, empty-frame rejection,
  output-deps enforcement, and the manager facade.
- Video routes: 404 / 409 / 422 error paths, a full generate -> persist ->
  stream -> delete round-trip, RuntimeError surfacing as 400, and the 410
  Gone path when metadata exists but the mp4 is gone from disk.
- Full suite: 399 Python tests + 114 TS tests pass; tsc clean.

Surface a file-manager shortcut on each installed variant card so users
can jump from the UI to the underlying HF snapshot on disk. Exposes a
`localPath` field on image/video variant payloads (populated only when
something is actually on disk) and wires the existing handleRevealPath
handler through all four tabs.

Tabs updated: ImageModelsTab, ImageDiscoverTab (via LatestImageDiscoverCard),
VideoModelsTab, VideoDiscoverTab. The button is platform-aware via the
shared `fileRevealLabel` tooltip ("Show in Finder"/"Show in Explorer"/
"Show in Files").

Upstream v0.1.4 renamed ``generate_dflash_once`` to
``stream_dflash_generate`` (a streaming iterator) and removed the
non-speculative baseline fallback branch. The PyPI artefact is still
stuck at 0.1.0, so pin every install site to the git tag instead.

``_generate_dflash`` now drives the iterator to completion and pulls
the final ``{"event": "summary", ...}`` payload — whose shape matches
the old single-dict return — so the rest of the worker is unchanged.

Two fixes for the video gen loop:

1. snapshot_download for video repos was pulling every legacy checkpoint
   sibling — repos like Lightricks/LTX-Video ship the diffusers pipeline
   layout plus half a dozen standalone safetensors at root, so the same
   2 GB pipeline was advertised as a 200+ GB download and crawled at 1%.
   Plumb an optional allow_patterns list through the HF helper and
   populate it with the standard diffusers-pipeline folders for any
   repo that belongs to VIDEO_MODEL_FAMILIES. Non-video repos keep the
   current unfiltered behaviour.

2. Video Studio dropdown listed every catalogued variant, which was
   confusing when nothing was installed yet. Filter to variants with
   local data (or a live download) and fall back to an inline CTA that
   sends the user to Video Discover when the library is empty.

Video generation needs imageio + imageio-ffmpeg to write mp4 output. They
already appear in videoRuntimeStatus.missingDependencies but previously
only surfaced as muted badges, leaving users to pip install from a
terminal. Now the Video Studio renders an "Install mp4 encoder" callout
when either dep is missing and installs both via the existing
/api/setup/install-package endpoint, then re-probes the runtime.

Both entries are whitelisted in _INSTALLABLE_PIP_PACKAGES. Added 3 tests
covering the new pip install path and a regression guard on the
allowlist keys so the UI contract doesn't silently drift.

The LTX-Video failure ("no file named config.json found in
/snapshots/...") came from a partial download whose model_index.json
listed transformer/, vae/, and tokenizer/ but the subfolders never
landed on disk. validate_local_diffusers_snapshot now walks the index
and flags any required component whose folder is missing or has no
config, so users see "missing components: transformer, tokenizer, vae"
plus the redownload guidance instead of a cryptic OSError from inside
diffusers.from_pretrained.

Adds imageOutputsDirectory and videoOutputsDirectory settings so the
delivery folder for finished renders can point at an external SSD,
Dropbox, or a client folder without moving the entire data directory.
Empty string keeps the default {data dir}/images/outputs or
/videos/outputs. The image and video helpers in app.py resolve the
override on every call, so changes apply to the next generation
without a backend restart.

Replaces the time-based progress estimates in the image + video
generation modals with a live signal driven by the backend pipelines,
and adds a video generation modal mirroring the existing image one.

Backend
- New `progress.ProgressTracker` thread-safe scratchpad with
  `IMAGE_PROGRESS` / `VIDEO_PROGRESS` module singletons. Phase IDs
  (loading / encoding / diffusing / decoding / saving) match what the
  modals already render.
- `image_runtime` and `video_runtime` wrap their generate paths in
  `begin()` / `set_phase()` / `finish()` and feed
  `callback_on_step_end` into `set_step()` so the bar follows the real
  diffuser step counter. TypeError fallback chain drops the callback,
  then `negative_prompt`, for older diffusers signatures.
- New GET `/api/images/progress` and `/api/video/progress` endpoints
  expose the singleton snapshot.

Frontend
- New `useGenerationProgress` hook polls the endpoint every 500ms
  while the modal is busy and reports `null` when the backend goes
  idle so callers can fall back to estimates.
- `LiveProgress` accepts an optional `realProgress` prop. When live,
  it picks the active phase by ID instead of by elapsed time and the
  diffusion phase fills proportionally to `step / totalSteps` rather
  than to a guess (see the sketch after this list).
- New `VideoGenerationModal` mirrors `ImageGenerationModal`: opens at
  generation start, shows the live progress bar, then swaps to the
  rendered clip via the same blob-URL trick as the gallery. Replaces
  the old "switch to gallery on success" behaviour.
- `useVideoState` now drives the modal lifecycle and exposes the run
  metadata (model, prompt, frames, fps, steps, needsPipelineLoad).
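
The proportional-fill logic from the LiveProgress bullet reduces to a small
pure function; the snapshot shape and exact phase handling are assumptions:

```ts
// Hypothetical fill helper; phase IDs match the commit text, names assumed.
const PHASES = ["loading", "encoding", "diffusing", "decoding", "saving"] as const;
type Phase = (typeof PHASES)[number];

interface ProgressSnapshot { phase: Phase; step: number; totalSteps: number }

// Fraction of the diffusion bar to fill: 0 before it starts, step/totalSteps
// while diffusing, 1 once a later phase is active, null when the backend is
// idle so the caller can fall back to time-based estimates.
export function diffusionFillFraction(snap: ProgressSnapshot | null): number | null {
  if (!snap) return null;
  const idx = PHASES.indexOf(snap.phase);
  const diffusingIdx = PHASES.indexOf("diffusing");
  if (idx < diffusingIdx) return 0;
  if (idx > diffusingIdx) return 1;
  return snap.totalSteps > 0 ? Math.min(1, snap.step / snap.totalSteps) : 0;
}
```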

Tests
- New `tests/test_progress.py` (20 tests) locks the tracker lifecycle,
  snapshot contract, post-finish no-op, and endpoint shape /
  isolation. Full pytest, vitest, and `tsc --noEmit` all green.

Wan 2.1 1.3B at catalog defaults (832×480 × 96 frames × 50 steps) blew up
Apple Silicon's MPS backend with a 73 GB attention-matrix allocation,
killing the sidecar mid-generation. The user had no way to dial the run
down without editing code. Now the Studio exposes width / height /
frames / fps / steps / guidance as form fields, and useVideoState seeds
them with short, MPS-safe defaults (33 frames, 30 steps, 5.0 guidance)
pulled from the variant's ``recommendedResolution`` hint.

Frames snap to ``(n - 1) % 4 == 0`` because Wan-family pipelines reject
anything else — the input field steps by 4 and the generate handler
re-snaps defensively in case a stale value slips through. The selected
variant's id drives a reset effect that refreshes width/height from the
catalog hint but leaves user edits alone on unrelated catalog refreshes.
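
The snap itself is tiny; a sketch (function name and rounding choice assumed):

```ts
// Hypothetical sketch: snap a frame count to the nearest n with
// (n - 1) % 4 == 0, which Wan-family pipelines require.
export function snapFrameCount(n: number): number {
  return Math.max(1, Math.round((n - 1) / 4) * 4 + 1);
}

// snapFrameCount(33) === 33; snapFrameCount(34) === 33; snapFrameCount(35) === 37
```
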
``test_outputs_is_empty_until_generation_lands`` failed on any machine
where the user had generated a real video, because ``make_client`` only
redirected HF cache env vars — not the module-level ``VIDEO_OUTPUTS_DIR``
and ``SETTINGS_PATH`` constants in ``backend_service.app``. Those capture
the user's real ``~/.chaosengine/`` paths at import time, so the
``/api/video/outputs`` route kept reading live artifacts from disk.

``make_client`` now patches ``VIDEO_OUTPUTS_DIR``, ``IMAGE_OUTPUTS_DIR``,
and ``SETTINGS_PATH`` into the per-test tempdir and stashes the originals
under sentinel keys (``__video_outputs_dir__`` etc.) in the returned
snapshot dict. ``restore_env`` restores the module attrs on teardown
while skipping the sentinels in its env-var loop.

The SETTINGS_PATH patch matters because a user-set
``videoOutputsDirectory`` in real settings would otherwise bypass our
patched ``VIDEO_OUTPUTS_DIR`` and point the test back at a real location.

Existing tests that patch ``VIDEO_OUTPUTS_DIR`` themselves
(``VideoOutputsPersistenceTests``, ``VideoOutputFileTests``) keep
working: they layer their own override on top of the tempdir value and
restore to it, before ``restore_env`` unwinds to the real original.

Two distinct user-reported bugs:

**Can't delete digits in Frames / Width / Height / FPS / Steps / Guidance.**
The previous ``onChange={Number(event.target.value) || fallback}`` pattern
treated an empty string as 0, then ``|| fallback`` snapped back to the
default — so deleting the last digit of "33" reinstated "33" instantly.
The only workaround was to type the new digits before removing the old
ones (e.g. type "320" then delete the stray leading digit). Now every
numeric input carries
``NaN`` for "user is mid-edit / field is empty", renders as "" via a
``displayNumber`` helper, and snaps to a fallback only on blur.
``handleVideoGenerate`` guards every numeric field against ``NaN`` so
nothing invalid reaches the backend.
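
A sketch of the NaN-as-mid-edit pattern (helper names assumed):

```ts
// Hypothetical sketch: NaN means "field is empty / user is mid-edit".
export function displayNumber(value: number): string {
  return Number.isNaN(value) ? "" : String(value);
}

// onChange parses loosely: empty (or junk) becomes NaN, never a fallback.
export function parseNumericInput(raw: string): number {
  return raw.trim() === "" ? NaN : Number(raw);
}

// Only blur snaps an unfinished edit back to the fallback value.
export function snapOnBlur(value: number, fallback: number): number {
  return Number.isNaN(value) ? fallback : value;
}
```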

**Generate button stayed disabled after a Wan 2.1 MPS crash.**
The hover-only tooltip made the disable reason invisible unless the user
thought to hover. Now the reason is surfaced inline as a muted
"Generate disabled: <reason>" line under the button, so any stuck state
is obvious at a glance. Plus, ``handleVideoGenerate`` now runs
``refreshVideoData`` in the catch path — a sidecar crash can leave
``videoRuntimeStatus`` stale (``realGenerationAvailable`` flipping false
with no trigger to re-probe), so a forced resync on failure gets the
Studio back to a known-good state without requiring a page reload.

The previous safety heuristic used a flat MPS-strict token threshold,
which over-warned on beefy Macs (a 64 GB M4 Max saw "caution" at any
clip >40 frames even though it could clearly handle more) and treated
CUDA the same as MPS. Now the heuristic estimates peak attention memory
from (tokens² × bytes × calibrated-multiplier) and compares it against
the device's effective budget — scaling cleanly from an 8 GB base M1
to a 128 GB M3 Ultra to a 24 GB RTX 4090 with one formula.

Backend surfaces ``deviceMemoryGb`` on ``VideoRuntimeStatus`` from the
existing ``gpu`` helper (sysctl on macOS, nvidia-smi on CUDA). The
Studio now shows an always-visible capacity line under the numeric
controls ("Apple Silicon · 64 GB total · this run ≈ 2.9 GB of
attention memory") so users can see their headroom before generate,
not just when something's already wrong. The warning callout also
talks in GB rather than opaque latent-token counts.

Calibrated against the Wan 2.1 T2V 1.3B 832×480 × 96-frame crash
report: that config correctly lands in "danger" on 16 GB, "caution"
on a 40 GB A100, and "safe" on a 128 GB Mac.

The old estimate only summed attention memory, which under-counted real
resident memory by ~25×. On a 64 GB M4 Max, Wan 2.1 T2V 1.3B at 40 frames
looked "safe" but crashed the backend with an 88 GB allocation: the model
weights + UMT5-XXL text encoder alone occupy ~23 GB resident, before any
attention activations.

Changes:
- assessVideoGenerationSafety now takes baseModelFootprintGb and
  estimates resident footprint with device-class fragmentation factors
  (MPS 1.4x, CPU 1.3x, CUDA 1.05x).
- Peak memory = modelFootprintGb + attentionPeakGb (sketched after this list).
- Short-circuit when the model alone exceeds the caution budget: hand
  back a null suggestion and a "try a smaller model" message instead of
  pretending a smaller clip will help.
- Studio capacity line breaks out model vs attention cost, and the
  safety callout swaps the "Use safer settings" button for "Browse
  smaller models" when no safer settings can rescue the run.
- Fallback indicator on the capacity line when the backend hasn't
  reported real device memory yet (e.g. stale sidecar before restart):
  shows "~16 GB (default — restart backend for real detection)" so the
  conservative default is obvious instead of being presented as truth.
- 9 new videos.test.ts cases covering the Wan 2.1 crash scenario,
  model-too-big short-circuit, device-class fragmentation ordering,
  and ignoring non-positive/NaN footprints.
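
Roughly, the refined estimate (the fragmentation factors come from this
change; the attention constants are illustrative assumptions):

```ts
// Hypothetical sketch of the two-part peak-memory estimate.
type DeviceClass = "mps" | "cuda" | "cpu";

const FRAGMENTATION: Record<DeviceClass, number> = { mps: 1.4, cpu: 1.3, cuda: 1.05 };

export function estimatePeakGb(
  latentTokens: number,
  bytesPerElement: number,     // e.g. 2 for fp16/bf16
  attentionMultiplier: number, // calibrated against the Wan 2.1 crash report
  modelFootprintGb: number,
  device: DeviceClass,
): number {
  // Attention peak scales with tokens squared.
  const attentionPeakGb =
    (latentTokens * latentTokens * bytesPerElement * attentionMultiplier) / 1024 ** 3;
  // Resident model cost carries a device-class fragmentation factor.
  const residentModelGb = modelFootprintGb * FRAGMENTATION[device];
  return residentModelGb + attentionPeakGb;
}
```
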
The My Models table's RAM and Compressed columns were reading
matchedVariant.estimatedMemoryGb from the catalog, computed by a crude
params_b * quant_factor + 1.6 formula against whichever family variant
scored best. When a user's installed model had no close catalog variant
(e.g. Qwen2.5-0.5B, Qwen3-8B, Qwen3-0.6B) the match collapsed onto the
family's flagship, so three wildly different models all reported the
same ~76.6 GB RAM — higher than the biggest 67 GB model in the table.

Replaces the lookup with estimateLibraryItemResidentGb() /
estimateLibraryItemCompressedGb() in utils/library.ts, which start from
item.sizeGb (ground truth — MLX, GGUF, and safetensors all store weights
at runtime precision) and add a small KV-cache term for an 8K GQA
context plus framework overhead. The compressed variant halves the KV
term, matching the ChaosEngine/TurboQuant/RotorQuant cache strategies.

Falls back to the catalog estimate only when sizeGb is unknown or
non-positive, which should be rare.

New library.test.ts coverage:
- Monotonicity across disk sizes (the Qwen regression)
- Sane ballpark for 0.5B and 8B models
- Fallback to catalog estimate on 0/NaN/negative sizeGb
- Null when both on-disk and catalog data are missing
- Compressed < uncompressed at short contexts, with a small delta
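
A sketch of the estimate's shape (the KV and overhead constants here are
illustrative, not the shipped numbers; the real helpers live in
utils/library.ts):

```ts
// Hypothetical sketch of an on-disk-size-based resident estimate.
export function estimateResidentGb(
  sizeGb: number,            // ground truth: weights are stored at runtime precision
  kvCacheGbAt8k: number,     // small KV-cache term for an 8K GQA context (assumed input)
  frameworkOverheadGb = 1.0, // assumed flat overhead
  compressed = false,
): number | null {
  if (!Number.isFinite(sizeGb) || sizeGb <= 0) return null; // caller falls back to catalog
  const kv = compressed ? kvCacheGbAt8k / 2 : kvCacheGbAt8k;  // compressed caches halve KV
  return sizeGb + kv + frameworkOverheadGb;
}
```
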
Two issues hit Windows users from a fresh clone:

1) build.ps1 fails with EPERM during stage-runtime's fs.rmSync of
   .runtime-stage\windows-x86_64. Node's `force: true` does not actually
   chmod read-only files writable on Windows, and Defender / Explorer /
   pip-installed package metadata regularly hold transient locks that
   surface as EPERM/EBUSY/ENOTEMPTY. The whole staging run blew up
   instead of retrying.

2) tauri:dev appeared to hang for 5-15 minutes after staging optional
   packages. It was actually tarring the multi-GB staging tree into
   src-tauri/resources/embedded/runtime-*.tar.gz — output that the
   Tauri shell ignores in dev mode (see src-tauri/src/lib.rs around the
   "development embedded runtime detected; preferring source workspace"
   branch, which returns None and falls back to the live workspace
   without ever extracting the archive).

Changes:
- New safeRmSync() helper: clears the read-only bit recursively before
  rm, then retries on EPERM/EBUSY/ENOTEMPTY/EACCES with exponential
  backoff. macOS/Linux keep the fast path. On final failure surfaces an
  actionable message pointing at Windows Defender exclusions (sketched
  after this list).
- All five fs.rmSync sites in stage-runtime.mjs now use safeRmSync.
- Skip the tar archive when not in --mode=release. Stale archive (if
  any) is removed so the Rust shell does not reach for it. Release
  builds keep the archive untouched.
- build.ps1: pre-clear .runtime-stage to dodge the EPERM root cause,
  and install the same extras stage-runtime validates against
  (`pip install -e ".[desktop,images]"`) instead of a hand-maintained
  short list that was missing diffusers/torch/etc.
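
The retry shape from the first bullet, as a TypeScript-flavoured sketch
(the shipped helper is plain JS in stage-runtime.mjs; attempt count and
backoff timings here are assumptions):

```ts
import fs from "node:fs";
import path from "node:path";

const RETRYABLE = new Set(["EPERM", "EBUSY", "ENOTEMPTY", "EACCES"]);

// rmSync({ force: true }) does not chmod read-only files writable on Windows,
// so clear the read-only bit over the whole tree first.
function clearReadOnly(target: string): void {
  const stat = fs.lstatSync(target, { throwIfNoEntry: false });
  if (!stat) return;
  fs.chmodSync(target, 0o700);
  if (stat.isDirectory()) {
    for (const entry of fs.readdirSync(target)) clearReadOnly(path.join(target, entry));
  }
}

function sleepSync(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* busy-wait is acceptable in a build script */ }
}

export function safeRmSync(target: string, attempts = 5): void {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      if (process.platform === "win32") clearReadOnly(target);
      fs.rmSync(target, { recursive: true, force: true });
      return;
    } catch (err) {
      const code = (err as NodeJS.ErrnoException).code ?? "";
      if (!RETRYABLE.has(code) || attempt === attempts - 1) throw err;
      sleepSync(100 * 2 ** attempt); // exponential backoff for transient locks
    }
  }
}
```
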
Python 3.14 added a stdlib `compression` namespace package (PEP 784) that
exposes `compression.zstd`, `compression.bz2`, and `compression._common`.
Our local regular package shadowed it, and anything in the Python import
graph that reached for `compression._common` (including torch's internal
imports in some environments) failed with
`ModuleNotFoundError: No module named 'compression._common'`.

Renaming to `cache_compression` keeps the package descriptive, avoids the
stdlib collision, and doesn't clash with any PyPI package. All 5 adapter
modules, 10 backend import sites, the test suite, the stage-runtime
validator, the pre-build check, and docs are updated. Directory moves
preserved via `git mv` so history survives.

build.ps1 needs no rename-specific change — it installs via
`pip install -e ".[desktop,images]"` and picks up the new package name
from pyproject.toml's packages.find include list.

The chat-oriented My Models tab already filters on
``modelType === "text" || !item.modelType`` in App.tsx, but discovery was
classifying every video pipeline as "text" because there was no video
detector. Result: HunyuanVideo, Mochi-1, Wan2.x T2V, etc. all appeared
alongside LLMs (and in the chat picker), even while still mid-download.

Add `_looks_like_video_model()` with a curated keyword set covering every
family in ``backend_service/catalog/video_models.py`` plus common T2V/I2V
markers, and run it ahead of the image check in ``_iter_discovered_models``
so the ``modelType="video"`` tag is stable even before ``model_index.json``
lands. Keywords are deliberately specific ("hunyuanvideo" not "hunyuan",
"wan2" not "wan", "mochi-1" not "mochi") so the image-model Hunyuan and
unrelated LLMs don't get swept up.

Video models now only surface in the dedicated Video → My Models view.

Version bump to 0.6.0, updated across pyproject.toml, package.json, package-lock.json,
src-tauri/Cargo.toml, src-tauri/Cargo.lock (project entry only — the
unrelated `tower` crate also at 0.5.3 was left alone), and
src-tauri/tauri.conf.json. Added a 0.6.0 CHANGELOG entry summarising
this release: the Python 3.14 compression-rename fix, on-disk-size-based
library RAM estimates, video models kept out of the LLM list, video-gen
memory safety that includes the model footprint, and the Windows staging
hardening (EPERM retries, dev-mode tar skip, pyproject-driven extras in
build.ps1).

When Wan 2.1 OOMs MPS on Apple Silicon, the Python sidecar is killed by
Metal (no graceful exception) and every in-flight fetch rejects with
WebKit's literal "TypeError: Load failed". That string was being stored
verbatim in the runtime status message, so the Studio displayed "Load
failed" / "ENGINE: UNAVAILABLE" / "Fallback active" — opaque copy that
read like a Diffusers problem rather than a transport problem. Worse,
even after the Tauri-managed sidecar restarted and `backendOnline`
flipped back to true, the video runtime was never re-probed, so the
Generate button stayed stuck disabled even when the user picked a
smaller model (LTX-Video) that would have run fine.

Three fixes:

* Translate WebKit's "Load failed" (and Chromium's "Failed to fetch")
  into "Backend is not responding — try Restart Backend." in
  videoRuntimeErrorStatus, so users see actionable copy instead of the
  raw transport error.
* Add a backend-recovery effect that fires refreshVideoData() when
  backendOnline transitions false→true. Uses a ref to track the
  previous value so we don't refetch on first mount (already covered
  by App.tsx's initial load effect).
* Add a model-change retry effect: when the runtime is stuck in
  activeEngine === "unavailable" and the user picks a different model,
  re-probe the backend. Gives the user a natural "try again with a
  smaller model" path without needing to know about Restart Backend.

Tests cover all four canonical fetch-transport error strings and lock
in that real backend errors (e.g. "Diffusers is not installed") still
pass through unchanged.
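
Sketches of the first two fixes (names and shapes are assumptions; the
shipped helper also covers two more canonical transport strings):

```ts
import { useEffect, useRef } from "react";

const TRANSPORT_ERRORS = ["Load failed", "Failed to fetch"]; // WebKit, Chromium

export function translateVideoRuntimeError(message: string): string {
  return TRANSPORT_ERRORS.some((needle) => message.includes(needle))
    ? "Backend is not responding — try Restart Backend."
    : message; // real backend errors ("Diffusers is not installed") pass through
}

// Refetch when backendOnline flips false -> true; the ref skips the first
// mount, which App.tsx's initial load effect already covers.
export function useBackendRecovery(backendOnline: boolean, refresh: () => void): void {
  const prev = useRef(backendOnline);
  useEffect(() => {
    if (!prev.current && backendOnline) refresh();
    prev.current = backendOnline;
  }, [backendOnline, refresh]);
}
```
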
The Settings tab had grown to seven panels stacked on a 2-column grid
(Appearance, Data Directory, Delivery Folders, Model Directories, Remote
Providers, Hugging Face, Integrations) — squished horizontally on narrow
widths and forcing users to scroll past unrelated controls to reach the
one they wanted.

Group the panels into four sections and navigate between them:

* General — Appearance
* Storage — Data Directory, Delivery Folders, Model Directories
* Providers — Remote Providers, Hugging Face
* Integrations — tooling snippets

The navigation style mirrors the user's existing sidebarMode preference,
so the in-page UX matches the rest of the app:

* tabs mode gets a horizontal pill bar across the top, reusing the same
  .subtab-bar / .subtab classes as the top-level sub-tabs.
* collapsible mode gets a vertical menu down the left (the macOS / iOS
  Settings idiom) with label + short hint per section, sticky on
  scroll. On narrow widths (< 900 px) the menu collapses to a horizontal
  strip so it doesn't eat the content column.

Section selection is local component state — users land on General each
time, matching how most Settings UIs open. A :has() rule in content-grid
drops to a single column when only one panel is in the active section
(e.g. General = Appearance only) so it doesn't look lonely in half a
2-col grid.

No prop surface changes — the SettingsTab already received sidebarMode
from App.tsx for the Appearance panel, so the layout switch reuses it.

The collapsible-mode variant of the Settings page (vertical menu down
the left) felt clunky at the viewport widths this app actually runs at
— too much wasted horizontal space for four shortish labels, and the
sticky-aside positioning interacted awkwardly with the workspace
content frame's scroll.

Use the horizontal sub-tab bar unconditionally for Settings. The
app-wide ``sidebarMode`` preference still controls the top-level
sidebar (where the collapsible/tabs trade-off makes sense) — Settings
just no longer cascades from it.

Removes the ``.settings-side-nav*`` CSS, the ``data-mode`` attribute
hook, the side-nav-button branch in the render, and the per-section
``hint`` field that only the side-menu used. Leaves the section split
itself (General / Storage / Providers / Integrations) and the
``:has()`` rule that gives single-panel sections (e.g. General) a
full-width grid.

Storage is the densest section of Settings — three panels (Data
Directory, Delivery Folders, Model Directories) plus a scrollable list
— so a flat 2-col ``content-grid`` either squished Model Directories'
list into half the row height or dropped one of the small panels into
an orphan second column.

Give Storage its own 2-col layout: the two small ``where files live``
panels stack in the left column, and Model Directories takes the full
height of the right column so its list + add-row can breathe. Each
Settings section now owns its own grid wrapper inside a bare
``.settings-content`` frame, so future sections can pick the shape
that fits them without fighting a shared outer grid.

Surface the resolved delivery paths in the input fields themselves,
not just a placeholder, so users can read the full effective location
even when they haven't set an override — matching what the backend
prints in logs. A ``default`` badge next to the row and a disabled
``Reset to default`` button together make it unambiguous whether a
path is inherited from the data directory or explicitly overridden.

At narrow widths (<=900px) the storage grid collapses to a single
column along with the other multi-col settings layouts.

The Studio used to default to "mps" whenever the backend probe hadn't
reported a device — sidecar dead, Failed to fetch, first-launch race.
On Windows that surfaced "close to the safe limit on Apple Silicon
(MPS) (~8 GB of 16 GB total)" to RTX 4090 users, which both reads as
nonsense and undersells the actual hardware budget.

assessVideoGenerationSafety now buckets the unknown case from
navigator.userAgentData / navigator.platform: macOS stays on MPS,
everything else falls through to CUDA. The result also surfaces
effectiveDevice + effectiveDeviceWasInferred so VideoStudioTab can
label the device correctly when the backend hasn't yet reported one.

Also adds CogVideoX 2B and 5B to the catalog and pipeline registry —
THUDM's open-weight family fills the gap between LTX (2 GB) and Wan
2.2 (14 GB) and is the most-requested missing model in the discover
list.
The "tiktoken is required" mid-generate error from LTX-Video had no
escape route in the UI — only the mp4 encoder bucket got a one-click
install. Same problem for Wan / HunyuanVideo / CogVideoX users when
sentencepiece or protobuf is missing.

Probe now also looks for tiktoken / sentencepiece / protobuf / ftfy
and surfaces any missing ones in missingDependencies. The Studio
groups them into a separate "Install missing video dependencies"
panel that names exactly which packages it'll add, so the user can
fix the LTX failure without dropping to a terminal.

handleInstallVideoOutputDeps grew an optional packages list — the
existing mp4 encoder button keeps its hardcoded pair, the new button
passes whatever the probe surfaced.

git checkout writes "Updated 1 path from the index" to stderr even when
it succeeds. With $ErrorActionPreference = "Stop" at the top of the
script, the ``2>&1 | Out-Null`` redirect wraps that stderr text as a
NativeCommandError — so the cleanup step crashes the build *after* the
NSIS installer has already been produced.

Use git's --quiet flag to suppress the message and a ``2>$null`` file
redirect to swallow anything else (file redirects bypass PowerShell's
stream-wrapping logic). Keep an explicit $LASTEXITCODE check so a real
git failure (e.g. dirty working tree) still surfaces as a build error.

build.ps1 was patching tauri.conf.json to run stage:runtime (dev mode)
before bundling. Dev-mode staging skips building the tar.gz runtime
archive AND writes mode=development into the manifest — so the embedded
Python runtime wasn't in the installer, and the Tauri shell looked for a
live source workspace at the customer's install path that doesn't exist.
Result: a 3 MB installer that can't actually boot the backend.

Switched to stage:runtime:release, added a pre-flight llama.cpp check
with three clear remediation paths, and introduced
CHAOSENGINE_RELEASE_ALLOW_NO_LLAMA=1 for operators who want a
diffusers-only installer without having to compile llama.cpp locally.
stage-runtime.mjs now honours that env var in strict mode — a loud
warning instead of a hard failure.

/api/video/runtime was calling get_gpu_metrics() on every probe, which
shells out to nvidia-smi synchronously. On Windows each spawn flashed
a console window and cost 1-3s, and because total VRAM never changes
we paid that cost on every call. Combined with FastAPI's sync route
blocking a worker, the frontend's 15s fetchJson timeout was firing
and surfacing as "Failed to fetch".

Three fixes stacked defensively:
  - Pass CREATE_NO_WINDOW to every subprocess.check_output in
    helpers/gpu.py so nvidia-smi / sysctl / ioreg no longer pop a
    console window on Windows. Cuts spawn latency too.
  - New get_device_vram_total_gb() with a process-wide cache. The
    first probe pays the shelling-out cost; subsequent ones are a
    dict lookup. video_runtime._detect_device_memory_gb routes
    through it instead of the full live snapshot.
  - Bump getVideoRuntime's fetchJson timeout 15s -> 30s so the very
    first probe of a sidecar's life (which also imports torch) has
    headroom on cold disks.

Added cache and Windows-flag tests in test_gpu.py.

Windows PowerShell 5.1's parser rejects '&&' as an invalid statement
separator even when it appears inside a double-quoted string literal
(PS 7+ and pwsh don't have this issue). Replaced with '; ' so the
script parses on stock Windows installs.

Windows PowerShell 5.1 has a parser quirk where a tokenization error
inside a comment (like the literal string '&&' I used to describe the
previous fix) cascades into parse recovery that then mis-tokenizes
later comments containing apostrophes - reporting things like "token
's' unexpected" on lines like "customer's install path".

Rewrote the new comments to avoid apostrophes and the double-ampersand
literal entirely. Pre-existing "PowerShell's" on the git-checkout
comment is untouched because it parsed cleanly in 223b6af before my
new block triggered recovery mode.

Video Studio was showing "Backend is not responding - try Restart
Backend" whenever /api/video/runtime raised Failed-to-fetch, which
directly contradicted the global BACKEND ONLINE pill driven from
/api/health. The two probes are independent - the sidecar can be up
and answering health checks while the video probe fails (common
during a backend restart, or the first probe of a sidecar's life
while torch is importing on Windows).

Renamed the translated message to "Video runtime did not respond"
so the UI stops claiming the whole backend is dead, and added a
separate branch for fetchJson's "timed out after Xs" error - those
mean the backend accepted the request but didn't answer in time,
which is different from failed-to-fetch and deserves its own copy.

Tests pinned: the message must name "video runtime" (not "backend"),
and the timeout branch must explicitly mention "timed out".

The cmake hint line contained "..\llama.cpp\ (cmake -B build; cmake
--build build)", and Windows PowerShell 5.1 mis-tokenizes the "\ ("
sequence inside a double-quoted string as a subexpression start -
triggering "Missing closing ')' in expression" and a cascade of
parser recovery errors that made the whole if/else block look
unclosed.

Split the hint across two Write-Host calls and dropped parens from
the strings entirely. Plain prose sidesteps the quirk and reads
just as clearly at the prompt.

Four related fixes so a cold Windows sidecar stops leaving the Studios
stuck on "Failed to fetch":

- Warm up PyTorch in a background thread at sidecar startup so the
  video runtime probe returns immediately with an "initializing" status
  instead of blocking on a 30-60s import.
- Whitelist the core image runtime packages (diffusers, torch,
  accelerate, huggingface_hub, pillow) and surface a one-click
  "Install image runtime" button in Image Studio.
- Always offer Restart Backend when the runtime probe failed, not only
  when Tauri manages the sidecar.
- Replace the opaque "Failed to fetch" copy with an honest explanation
  of the 30-60s cold-import window, and poll every 5s while the engine
  is initializing or unavailable so the promised auto-refresh actually
  happens.

Release dates help users spot ancient models that aren't worth downloading.
Curated variants now carry `releaseDate` and the Discover tabs fall back to
the live Hugging Face `createdAt` value, both rendered via a shared
`formatReleaseLabel` helper so the label stays consistent.

File sizes on installed cards were off by 30+ GB for FLUX.1 Schnell because
(a) downloads pulled duplicate standalone safetensors the diffusers pipeline
never loads, and (b) the card showed the curated estimate instead of what is
actually on disk. Image downloads now use an allowlist that mirrors the
video one (keeps the pipeline layout, skips legacy single-file checkpoints),
and both image and video payloads surface the real snapshot directory size
so the installed cards can display "X GB on disk".

build.ps1 silently reported "Build complete!" when tauri build had already
failed because $ErrorActionPreference=Stop does not catch native command
failures. Add Assert-LastExit after every pip/npm/node/npx/git call so a
non-zero exit actually aborts the script, and always run the tauri.conf.json
restore even when the build failed.

The inline node -e "..." that patched tauri.conf.json was fragile under
PowerShell quoting — one misparse left the JSON empty, which then cascaded
into a confusing "EOF while parsing a value at line 1 column 0" from the
next tauri build. Extract the patch/restore into scripts/patch-tauri-conf.mjs
so there is no quoting surface area, and self-heal an empty config by
restoring from git before patching.

On Windows, `pip install torch` from PyPI delivers the CPU-only wheel, which
leaves an RTX 4090 idle and makes FLUX.1 Dev take 7+ minutes per diffusion
step. build.ps1 now installs torch from the CUDA 12.1 index first (override
via CHAOSENGINE_TORCH_INDEX_URL) and the image runtime probe returns an
actionable hint when torch is CPU-only on a machine with nvidia-smi present
— so users see "reinstall with the CUDA wheel" instead of a silent hang.

The Chat tab was also showing a sticky "Failed to fetch" banner after cold
start because refreshImageData / refreshVideoData raced the sidecar's port
bind and surfaced the resulting TypeError as a global error. Add
isTransientNetworkError() and swallow those in both refresh paths — real
HTTP errors still bubble up as before.
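
A sketch of the transient-error check (the helper shape is assumed; the
transport strings are the canonical WebKit and Chromium messages named
earlier in this PR):

```ts
export function isTransientNetworkError(err: unknown): boolean {
  if (!(err instanceof TypeError)) return false; // real HTTP errors still surface
  return err.message.includes("Failed to fetch") || err.message.includes("Load failed");
}
```

In the refresh paths, a catch block can then swallow only these and
rethrow everything else.
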
The backend always required a bearer token on /api and /v1, which blocked
external clients (OpenWebUI, curl, sibling apps) from connecting to the
local server even when the user wanted them to. Introduce a requireApiAuth
setting (default true, so existing installs stay secure) that gates the
auth middleware, surface it as a "Require API token" checkbox on the Server
tab next to LAN access, and persist it through the settings PATCH path so
the toggle hot-applies without a server restart.

Also honour CHAOSENGINE_REQUIRE_AUTH in create_app — values of 0, false,
no, or off disable auth regardless of the saved setting, which is useful
for headless/CI runs that don't have a settings.json to edit.

Tests cover three cases: tokenless request is rejected when the toggle is
on, succeeds when the toggle is off, and flipping the toggle via PATCH
/api/settings takes effect on the very next request without restarting
the app. Env-override path has its own test.

Two related fixes for Windows/Linux users with NVIDIA GPUs where the CPU-only
torch wheel and a slow cold-disk probe were conspiring to make image and
video generation look broken.

- Extract the nvidia-smi presence check into backend_service.helpers.gpu as
  shared nvidia_gpu_present() / gpu_status_snapshot() helpers, add a matching
  /api/system/gpu-status endpoint (auth-exempt so the banner works regardless
  of token state), and surface an amber "Running on CPU" banner in App.tsx
  with a persistent Dismiss. Video runtime now emits the same CUDA-wheel
  install hint the image runtime already did.

- Reorder DiffusersVideoEngine.probe() so torch_warmup_status() runs before
  any importlib.util.find_spec / nvidia-smi work. On a cold NTFS volume the
  find_spec + subprocess cost was pushing /api/video/runtime past the
  frontend's 30s fetch budget; Chromium aborted the fetch with
  "Failed to fetch" and the Studio froze on "ENGINE: UNAVAILABLE" even
  though /api/health kept responding. "not_started" now kicks off the warmup
  and returns fast, and the warmup worker pre-caches VRAM detection and the
  dep find_spec lookups so the post-warmup probe is a hashmap lookup.

Every Windows build attempt failed with parser errors whose location kept
shifting (string terminators, missing braces), which was the tell: the
file itself wasn't malformed, it was being decoded wrong.

PS 5.1 on Windows reads scripts without a BOM as the system ANSI code
page (Windows-1252), not UTF-8. build.ps1 had twelve non-ASCII
characters (em-dashes and box-drawing chars used as section dividers in
comments). Each one's three-byte UTF-8 sequence decodes as three garbage
Windows-1252 characters on the user's machine, and those characters land
inside strings or near braces —
which is why the reported error location kept moving as we fixed
individual strings around them.

Fix: replace the em-dashes and box-drawing separators with ASCII dashes
file is pure ASCII and encoding detection becomes moot. Also update
the stale comment at the bottom to document the real root cause for
future maintainers.

Only build.ps1 is affected; no other .ps1 files in the repo have the
same issue.

- build.ps1: missing llama-server.exe now warns and continues by default;
  opt into the old strict behaviour with CHAOSENGINE_REQUIRE_LLAMA=1
- build.ps1 / build.sh / stage-runtime.mjs: upgrade setuptools>=77 before
  the vendor/ChaosEngine install so PEP 639 license strings validate
- install-cuda-torch: sweep leftover ~* pip stubs and split the install
  into --force-reinstall --no-deps + plain deps pass, so swapping torch
  no longer tries to overwrite markupsafe/_speedups.pyd while it's loaded
- new publish-artifacts.mjs collects .exe/.dmg/.app/.deb/.AppImage/.msi
  into a flat assets/ folder at the repo root; wired into all build paths
- Auto-download a prebuilt Vulkan llama.cpp release from ggml-org in
  stage-runtime.mjs when no local build is present, caching under
  ~/.chaosengine/prebuilt-llama/. The Windows installer now ships with
  native inference without requiring the user to clone llama.cpp and
  run cmake + VS Build Tools first.
- Pin setuptools to >=77,<82 in build.ps1, build.sh, and stage-runtime.mjs
  so recent torch wheels (which declare setuptools<82) stop warning on
  every pip invocation.
- Verify torch.cuda.is_available() after the CUDA install when nvidia-smi
  is present; surface a loud warning (or fail with CHAOSENGINE_REQUIRE_CUDA_TORCH=1)
  so we catch "shipped CPU-only torch on an RTX 4090" at build time.
- Harden the Windows restart-backend path in lib.rs: capture taskkill's
  exit status, fall back to child.kill() on failure, and replace the
  blocking child.wait() with a bounded try_wait loop. Previously a
  non-zero taskkill left the BackendManager mutex held forever and
  deadlocked the UI's runtime_info poll.
cryptopoly merged commit f2416c0 into main on Apr 20, 2026 (1 check failed).

cryptopoly added a commit that referenced this pull request on Apr 21, 2026

User reported CUDA torch install succeeded but Image/Video Studio
still showed DEVICE: CPU and video generation ran on CPU despite
an RTX 4090 being present.

Root cause: the backend was in source-workspace launcher mode (Tauri
couldn't find / fell back from the embedded runtime), so Python ran
against the dev .venv. The extras-site-packages prepend to PYTHONPATH
only existed in apply_embedded_runtime_env, which the source-workspace
branch never calls. Result: Python started with no PYTHONPATH,
found torch in .venv (or nowhere), never looked at the 2.5 GB of
CUDA torch the GPU bundle install had just dropped into
~/.chaosengine/extras/site-packages/.

Evidence from user's diagnostics snapshot:
  PYTHONPATH      | null (not set)
  sysPath         | includes .venv/Lib/site-packages, missing extras

Fix: add the matching PYTHONPATH prepend in the source-workspace
branch of bootstrap(). Same shape as the embedded path, just without
the runtime-specific entries (source-workspace Python auto-discovers
.venv via sys.prefix, so we only need to inject extras to win over
whatever the dev venv happens to have).

Per Karpathy CLAUDE.md #3 (Surgical Changes): single additive block
inside an existing else branch. No existing logic modified. No new
helpers. Embedded path is untouched because it already handles this.

cargo check clean. No Python or TS tests exercise this path (it's
purely subprocess env-var wiring), verification is post-rebuild on
the user's Windows VM: /api/diagnostics/snapshot should show the
extras path in both PYTHONPATH and sysPath.

cryptopoly added a commit that referenced this pull request on Apr 21, 2026

User asked for: fixed-width terminal, single scroll region, step
counter showing progress. Previous per-step <details> cards were OK
on a 3-package install but stacked too tall on the 13-package GPU
bundle — output scrolled off-screen and users lost track of which
step was current.

New layout:

- Single monospace <pre> region, max-height 380px, auto-scrolls to
  bottom on new attempts (tail -f behaviour). Doesn't steal scroll
  on phase transitions — only on new output — so a user reading
  earlier lines doesn't get yanked forward (sketched after this list).
- Step line above the terminal shows 'Step 3/13: accelerate · 42%'
  while running, 'Final: 12/13 packages · 100%' when done.
- Per-attempt markers ([ OK ], [FAIL], [....] for in-progress) line
  up on the left edge so failures are scannable.
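
The scroll discipline from the first bullet might look like this (ref
wiring and names assumed):

```ts
// Hypothetical sketch of the tail -f scroll behaviour.
import { useEffect, useRef } from "react";

export function useTailScroll(outputLength: number) {
  const preRef = useRef<HTMLPreElement | null>(null);
  useEffect(() => {
    const el = preRef.current;
    if (el) el.scrollTop = el.scrollHeight; // jump to the newest line
  }, [outputLength]); // keyed on output length: phase changes alone never scroll
  return preRef;
}
```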

Also strips pip's dep-resolver noise from the displayed output. The
user hit this on their Windows box where their .venv's leftover
turboquant-mlx-full declared an mlx>= constraint that will never be
satisfied on Windows — pip prints a scary-looking ERROR block
('chaos-engine-compressor ... requires safetensors, which is not
installed'), cosmetic but alarming. Raw attempt.output still has
the noise; we just filter it from the rendered terminal. Users who
want the full pip trace still get it via the attempts array.

Per CLAUDE.md #2 (Simplicity First): single component file, no new
abstractions, no new deps. Per #3 (Surgical Changes): only touched
InstallLogPanel.tsx + its CSS block. Per #4 (Goal-Driven): visible
improvement (scroll works, step counter visible, noise suppressed)
verifiable at a glance on next build.

174 TS tests still pass, tsc clean.

cryptopoly added a commit that referenced this pull request on Apr 21, 2026

Two UX asks from the user's latest round:

1. 'Image Studio doesn't say whether it's trying to use CPU or CUDA'
2. 'Even though I just installed everything, the Image Studio still
   shows an install GPU runtime button. I closed and reopened the
   app and it disappeared.'

## Fix C: Device chip

The chip at line 256 only rendered when runtime.device was set,
and since my earlier refactor removed the speculative torch import
from probe() the device is now null until a model is actually loaded.
Added an expectedDevice field to ImageRuntimeStatus that's
computed WITHOUT importing torch — find_spec + nvidia_gpu_present +
platform.machine — so we can show 'Device: cuda (expected)' before
the first Generate even fires. Same constraint as probe(): absolutely
no torch import (would pin torch/lib/*.dll and break the install
flow we just fixed).

The chip now reads:

  Device: cuda                (model loaded, actual device)
  Device: cuda (expected)     (torch installed + NVIDIA seen)
  Device: mps (expected)      (Apple Silicon)
  Device: cpu (expected)      (torch installed, no GPU)
  (hidden)                    (torch not installed)

## Fix D: Post-install restart nudge

Before this, after a successful GPU bundle install the Image Studio
still showed 'Install GPU runtime'. Root cause: backend's sys.path
is snapshotted at spawn time, so find_spec still reports torch as
missing until the backend restarts with the new PYTHONPATH (Fix A's
domain). User was confused — install said success, UI said install
again. App restart made it go away.

Split the 'runtime-not-available' block into two paths:

  - Post-install awaiting restart (job.phase === 'done' &&
    job.requiresRestart): show 'installed to <path>, restart to
    activate' + a Restart Backend button. No install button — you
    just installed, clicking it again would be confusing.
  - All other cases: show the install button as before.

Mirrored the same split to Video Studio for consistency.

Per CLAUDE.md #1 (Think Before Coding): surfaced the 'running
backend can't see new packages' reality to the user instead of
hiding it. Per #3 (Surgical Changes): added one field to the
backend status + one branch to each Studio tab; no shared
component extraction yet because we only use this shape in two
places and hoisting would obscure the per-tab differences.

478 Python tests pass, 174 TS tests pass, tsc clean.