Feature/big switch #1
Merged
cryptopoly merged 2 commits into main on Apr 13, 2026
Conversation
Break up the previous monolithic `backend_service/app.py` into a modular package structure and add frontend components.

- Core app logic and data (catalogs, image/text models, helpers for caching/discovery/persistence/network/settings/system/huggingface/images/formatting/documents) were extracted into `backend_service/catalog`, `backend_service/helpers`, and `backend_service/models`.
- API endpoints and routing were moved into `backend_service/routes`, with dedicated files for chat, images, models, cache, health, server, settings, setup, and OpenAI compatibility; runtime/inference logic was updated (`inference.py`, `vllm_engine.py`).
- Application state moved into `backend_service/state`.
- Many new React frontend components, hooks, types, constants, and utilities were added under `src/` (components, features, hooks, utils, constants), and related configs/tests were updated (`App.tsx`, `tests/test_backend_service.py`, `tsconfig.json`, `vite.config.ts`).

The refactor improves separation of concerns, maintainability, and testability by isolating responsibilities into smaller modules.
- Increase the allowed `contextTokens` max to 2,097,152 across backend models and settings.
- Harden MLX worker streaming: detect worker-exit/lost-model conditions, clear `loaded_model`, and raise clearer errors when `stream_request` fails or raises "No MLX model is loaded".
- Frontend fixes: add a 30s timeout to model search; improve error/detail parsing for `generateChatStream` responses and SSE events.
- Update RuntimeControls presets and cache/fp16 logic (ensure sensible defaults and reapply the active preset on strategy switch).
- Fix `useChat` to only remove empty assistant placeholders (preserving user messages) to avoid dropping user input.
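The MLX worker hardening above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `stream_request`, `loaded_model`, and the "No MLX model is loaded" error string come from the summary, while the class shape and `WorkerLostError` are assumed for the example.

```python
class WorkerLostError(RuntimeError):
    """Raised when the MLX worker exited or lost its loaded model."""


class MlxWorkerClient:
    """Illustrative wrapper around a worker's streaming call."""

    def __init__(self, stream_request):
        self._stream_request = stream_request
        self.loaded_model = None

    def stream(self, payload):
        try:
            for chunk in self._stream_request(payload):
                yield chunk
        except RuntimeError as exc:
            if "No MLX model is loaded" in str(exc):
                # The worker restarted or dropped its model underneath us:
                # clear the stale handle so the next request triggers a clean
                # reload instead of retrying against a model that is gone.
                self.loaded_model = None
                raise WorkerLostError(
                    "MLX worker lost its model; reload required"
                ) from exc
            raise
```

The key design point is that the client's `loaded_model` state is invalidated before re-raising, so callers never see a "loaded" model that the worker no longer has.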
cryptopoly added a commit that referenced this pull request on Apr 21, 2026
Two UX asks from the user's latest round:
1. "Image Studio doesn't say whether it's trying to use CPU or CUDA."
2. "Even though I just installed everything, the Image Studio still shows an install GPU runtime button. I closed and reopened the app and it disappeared."
## Fix C: Device chip
The chip at line 256 only rendered when `runtime.device` was set, and since my earlier refactor removed the speculative torch import from `probe()`, the device is now null until a model is actually loaded.
Added an `expectedDevice` field to `ImageRuntimeStatus` that's computed WITHOUT importing torch (`find_spec` + `nvidia_gpu_present` + `platform.machine`), so we can show "Device: cuda (expected)" before the first Generate even fires. Same constraint as `probe()`: absolutely no torch import, which would pin `torch/lib/*.dll` and break the install flow we just fixed.
The chip now reads:
- `Device: cuda` (model loaded, actual device)
- `Device: cuda (expected)` (torch installed + NVIDIA seen)
- `Device: mps (expected)` (Apple Silicon)
- `Device: cpu (expected)` (torch installed, no GPU)
- hidden (torch not installed)
## Fix D: Post-install restart nudge
Before this, after a successful GPU bundle install the Image Studio still showed "Install GPU runtime". Root cause: the backend's `sys.path` is snapshotted at spawn time, so `find_spec` still reports torch as missing until the backend restarts with the new PYTHONPATH (Fix A's domain). The user was understandably confused: the install said success, the UI said install again, and an app restart made it go away.
Split the "runtime-not-available" block into two paths:
- Post-install, awaiting restart (`job.phase === 'done' && job.requiresRestart`): show "installed to <path>, restart to activate" plus a Restart Backend button. No install button; you just installed, and clicking it again would be confusing.
- All other cases: show the install button as before.
Mirrored the same split to Video Studio for consistency.
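The two-path split can be sketched as a small TypeScript decision helper. The `phase` and `requiresRestart` field names come from the commit message; the `InstallJob` type, the `Panel` union, and the `runtimePanel` function are hypothetical names for this illustration, not the actual component code.

```typescript
// Illustrative shape of the install job the Studio tabs track.
type InstallJob = {
  phase: 'idle' | 'running' | 'done' | 'error';
  requiresRestart?: boolean;
};

type Panel = 'restart-nudge' | 'install-button';

// Decide which panel the "runtime-not-available" block renders.
function runtimePanel(job: InstallJob | null): Panel {
  // Post-install, awaiting restart: the backend's sys.path was snapshotted
  // at spawn, so the fresh bundle stays invisible until a restart.
  if (job?.phase === 'done' && job.requiresRestart) {
    return 'restart-nudge'; // "installed to <path>, restart to activate"
  }
  return 'install-button'; // all other cases: offer the install as before
}
```

Keeping the condition in one place makes it easy to mirror to the Video Studio tab without sharing a component.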
Per CLAUDE.md #1 (Think Before Coding): surfaced the 'running
backend can't see new packages' reality to the user instead of
hiding it. Per #3 (Surgical Changes): added one field to the
backend status + one branch to each Studio tab; no shared
component extraction yet because we only use this shape in two
places and hoisting would obscure the per-tab differences.
478 Python tests pass, 174 TS tests pass, tsc clean.
cryptopoly added a commit that referenced this pull request on Apr 28, 2026
User reported failures on Wan 2.2 5B (GGUF Q4_K_M + base) and LTX-2 distilled (MLX). Three independent bugs identified and fixed.

## 1. Wan / Hunyuan / Mochi / CogVideoX channel error
Symptom: "ValueError: Image must have 1, 2, 3 or 4 channels".
Root cause: WanPipeline (and HunyuanVideo / Mochi / CogVideoX) defaults `output_type="np"`, which returns a 5D numpy array (B, F, H, W, C). Our `_invoke_pipeline` unwrap path expects a PIL list-of-lists; the numpy ndarray leaks through to `_encode_frames_to_mp4`, where `PIL.Image.fromarray` sees a 4D tensor as a single "frame" and rejects it for having more than 4 channels.
Fix: force `output_type="pil"` in `_build_pipeline_kwargs` so every video pipeline returns the same PIL convention LTXPipeline already uses (which is why LTX kept working while Wan broke).

## 2. CFG decay floor flipping classifier-free guidance off mid-loop
Symptom: Wan / LTX silent corruption, sometimes shape mismatches, sometimes the channel error from #1 (the two bugs combined).
Root cause: `_make_step_callback` decayed `pipeline.guidance_scale` from its initial value to 1.0 by the final step. Diffusers' video pipelines compute `do_classifier_free_guidance = self._guidance_scale > 1.0` (strict). At the last step the scale equals 1.0, the property flips to False, and the pipeline tries to run a 1-batch forward on an already-prepared 2-batch (cond + uncond) embedding shape.
Fix: floor the decay at 1.5 instead of 1.0, strictly above the classifier-free threshold, so the pipeline keeps running 2-batch throughout. Tests updated for the new ramp curve, with a new assertion that the floor stays > 1.0.

## 3. mlx-video install spec points at the wrong PyPI package
Symptom: "mlx-video subprocess exited with code 1 after 0.3s" on LTX-2 distilled, even though mlx-video appeared installed under `~/Library/Application Support/ChaosEngineAI/extras/site-packages/`.
Root cause: PyPI's `mlx-video 0.1.0` is an unrelated utilities package (only `load`, `normalize`, `resize`, `to_float`; no LTX-2 / Wan / Hunyuan generation entry points). Blaizzy's actual mlx-video lives only at github.com/Blaizzy/mlx-video.
Fix: pin `_INSTALLABLE_PIP_PACKAGES["mlx-video"]` and the GPU bundle entry to `mlx-video @ git+https://github.com/Blaizzy/mlx-video.git` so the install pulls the real generation engine. Existing test updated to match the new spec shape.

Tests: 701 pytest pass (+2 new); 175 vitest pass; tsc clean.

Action for users:
- Already-installed mlx-video from the broken PyPI package needs to be replaced. Reinstall via Setup → "Install mlx-video" (now pulls from git) or the GPU runtime bundle.
- Wan / Hunyuan / Mochi / CogVideoX clips that were producing the channel-error 500 will now run end-to-end on the diffusers MPS path.
- LTX-Video runs the Phase E2 CFG decay schedule with floor=1.5 instead of 1.0, so output is unaffected by the classifier-free toggle bug.
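The CFG decay fix in bug 2 can be illustrated with a small sketch. The floor value (1.5) and the `> 1.0` strict threshold come from the commit message; the `guidance_at_step` function, its linear ramp, and the constant name are assumptions made for this example (the real `_make_step_callback` may use a different curve).

```python
CFG_DECAY_FLOOR = 1.5
assert CFG_DECAY_FLOOR > 1.0  # must stay strictly above the CFG threshold


def guidance_at_step(initial: float, step: int, total_steps: int) -> float:
    """Linearly decay guidance from `initial` toward the floor over the run.

    Diffusers' video pipelines compute
    do_classifier_free_guidance = guidance_scale > 1.0 (strict), so decaying
    all the way to 1.0 flips CFG off at the last step and breaks the
    already-prepared 2-batch (cond + uncond) forward. Flooring at 1.5 keeps
    the toggle on at every step.
    """
    if total_steps <= 1:
        return max(initial, CFG_DECAY_FLOOR)
    t = step / (total_steps - 1)  # 0.0 at the first step, 1.0 at the last
    return max(initial + t * (CFG_DECAY_FLOOR - initial), CFG_DECAY_FLOOR)
```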