
Feature/big switch #1

Merged
cryptopoly merged 2 commits into main from feature/big-switch on Apr 13, 2026

Conversation

@cryptopoly
Owner

No description provided.

Break up the previous monolithic backend_service/app.py into a modular package structure and add frontend components:

- Core app logic and data (catalogs, image/text models, helpers for caching/discovery/persistence/network/settings/system/huggingface/images/formatting/documents) extracted into backend_service/catalog, backend_service/helpers, and backend_service/models.
- API endpoints and routing moved into backend_service/routes, with dedicated files for chat, images, models, cache, health, server, settings, setup, and OpenAI compatibility; runtime/inference logic updated (inference.py, vllm_engine.py).
- Application state moved into backend_service/state.
- Many new React frontend components, hooks, types, constants, and utilities added under src/ (components, features, hooks, utils, constants); related configs and tests updated (App.tsx, tests/test_backend_service.py, tsconfig.json, vite.config.ts).

The refactor improves separation of concerns, maintainability, and testability by isolating responsibilities into smaller modules.
- Raise the allowed contextTokens maximum to 2,097,152 (2^21) across backend models and settings.
- Harden MLX worker streaming: detect worker-exit and lost-model conditions, clear loaded_model, and raise clearer errors when stream_request fails or raises "No MLX model is loaded".
- Frontend fixes: add a 30s timeout to model search; improve error/detail parsing for generateChatStream responses and SSE events.
- Update RuntimeControls presets and cache/fp16 logic (ensure sensible defaults and reapply the active preset on strategy switch).
- Fix useChat to remove only empty assistant placeholders (preserving user messages) so user input is no longer dropped.
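
A minimal sketch of the worker-stream hardening described above (stream_request, loaded_model, and the "No MLX model is loaded" message come from this description; the subprocess-style worker handle and exact error surface are assumptions):

```python
loaded_model = None  # module-level handle to the currently loaded MLX model

def stream_with_recovery(worker, request):
    """Wrap worker.stream_request so a dead or model-less worker fails loudly."""
    global loaded_model
    try:
        yield from worker.stream_request(request)
    except Exception as exc:
        # poll() returning non-None means the worker process exited; the
        # "No MLX model is loaded" string is the worker's own signal that
        # its model state was lost mid-stream.
        if worker.poll() is not None or "No MLX model is loaded" in str(exc):
            loaded_model = None  # clear stale state so the next request reloads
            raise RuntimeError(
                "MLX worker lost its loaded model; reload the model and retry"
            ) from exc
        raise
```
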
cryptopoly merged commit b621528 into main on Apr 13, 2026
cryptopoly deleted the feature/big-switch branch on April 13, 2026 at 08:22
cryptopoly added a commit that referenced this pull request Apr 21, 2026
Two UX asks from the user's latest round:

1. 'Image Studio doesn't say whether it's trying to use CPU or CUDA'
2. 'Even though I just installed everything, the Image Studio still
   shows an install GPU runtime button. I closed and reopened the
   app and it disappeared.'

## Fix C: Device chip

The chip at line 256 rendered only when runtime.device was set,
and since my earlier refactor removed the speculative torch import
from probe(), the device is now null until a model is actually loaded.
Added an expectedDevice field to ImageRuntimeStatus that's
computed WITHOUT importing torch — find_spec + nvidia_gpu_present +
platform.machine — so we can show 'Device: cuda (expected)' before
the first Generate even fires. Same constraint as probe(): absolutely
no torch import (would pin torch/lib/*.dll and break the install
flow we just fixed).

The chip now reads:

  Device: cuda                (model loaded, actual device)
  Device: cuda (expected)     (torch installed + NVIDIA seen)
  Device: mps (expected)      (Apple Silicon)
  Device: cpu (expected)      (torch installed, no GPU)
  (hidden)                    (torch not installed)
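
A minimal sketch of the torch-free computation (nvidia_gpu_present mirrors the helper named above; how it is wired in is assumed):

```python
import platform
from importlib.util import find_spec

def expected_device(nvidia_gpu_present: bool) -> str | None:
    """Guess the device torch will pick, without importing torch.

    Importing torch here would pin torch/lib/*.dll and break the
    install flow, so only package metadata and platform info are used.
    """
    if find_spec("torch") is None:
        return None  # chip hidden: torch not installed
    if nvidia_gpu_present:
        return "cuda"  # torch installed + NVIDIA seen
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "mps"  # Apple Silicon
    return "cpu"  # torch installed, no GPU
```
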

## Fix D: Post-install restart nudge

Before this, after a successful GPU bundle install the Image Studio
still showed 'Install GPU runtime'. Root cause: backend's sys.path
is snapshotted at spawn time, so find_spec still reports torch as
missing until the backend restarts with the new PYTHONPATH (Fix A's
domain). User was confused — install said success, UI said install
again. App restart made it go away.
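
A sketch of the underlying behavior (the extras path is illustrative):

```python
import sys
from importlib.util import find_spec

# sys.path is built from PYTHONPATH once, at interpreter start; a
# site-packages dir the installer creates afterwards never appears in
# the running backend's path, so the probe keeps reporting "missing".
EXTRAS = "/path/to/extras/site-packages"  # hypothetical install target

if EXTRAS not in sys.path:
    # find_spec still searches the spawn-time sys.path, so this stays
    # False even though the wheel now sits under EXTRAS. Only a backend
    # restart (with the updated PYTHONPATH) changes it, which is why the
    # install job reports a requires-restart state instead of pretending
    # the install took effect immediately.
    torch_visible = find_spec("torch") is not None
```
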

Split the 'runtime-not-available' block into two paths:

  - Post-install awaiting restart (job.phase === 'done' &&
    job.requiresRestart): show 'installed to <path>, restart to
    activate' + a Restart Backend button. No install button — you
    just installed, clicking it again would be confusing.
  - All other cases: show the install button as before.

Mirrored the same split to Video Studio for consistency.

Per CLAUDE.md #1 (Think Before Coding): surfaced the 'running
backend can't see new packages' reality to the user instead of
hiding it. Per #3 (Surgical Changes): added one field to the
backend status + one branch to each Studio tab; no shared
component extraction yet because we only use this shape in two
places and hoisting would obscure the per-tab differences.

478 Python tests pass, 174 TS tests pass, tsc clean.
cryptopoly added a commit that referenced this pull request Apr 28, 2026
User reported failures on Wan 2.2 5B (GGUF Q4_K_M + base) and LTX-2
distilled (MLX). Three independent bugs identified and fixed.

1. Wan / Hunyuan / Mochi / CogVideoX channel error
   Symptom: "ValueError: Image must have 1, 2, 3 or 4 channels".
   Root cause: WanPipeline (and HunyuanVideo / Mochi / CogVideoX) defaults
   ``output_type="np"`` which returns a 5D numpy array (B, F, H, W, C).
   Our ``_invoke_pipeline`` unwrap path expects PIL list-of-lists; the
   numpy ndarray leaks through to ``_encode_frames_to_mp4`` where
   ``PIL.Image.fromarray`` then sees a 4D array as a single "frame"
   and rejects it for >4 channels.
   Fix: force ``output_type="pil"`` in ``_build_pipeline_kwargs`` so
   every video pipeline returns the same PIL convention LTXPipeline
   already uses (which is why LTX kept working but Wan broke).
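
   A sketch of the normalization in ``_build_pipeline_kwargs`` (the
   function name is from this commit; its surrounding plumbing is
   assumed):

```python
from typing import Any

def _build_pipeline_kwargs(base_kwargs: dict[str, Any]) -> dict[str, Any]:
    # WanPipeline / HunyuanVideo / Mochi / CogVideoX default to
    # output_type="np", which yields a 5D (B, F, H, W, C) ndarray that
    # leaks past the PIL unwrap path into PIL.Image.fromarray. Forcing
    # "pil" makes every video pipeline return the list-of-PIL-frames
    # shape LTXPipeline already uses.
    kwargs = dict(base_kwargs)
    kwargs["output_type"] = "pil"
    return kwargs
```
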

2. CFG decay floor flipping classifier-free guidance off mid-loop
   Symptom: silent corruption on Wan / LTX, sometimes shape mismatches,
   sometimes the channel error from bug 1 (the two bugs combined).
   Root cause: ``_make_step_callback`` decayed ``pipeline.guidance_scale``
   from initial → 1.0 by the final step. Diffusers' video pipelines
   compute ``do_classifier_free_guidance = self._guidance_scale > 1.0``
   (strict). At the last step the scale equals 1.0, the property flips
   to False, and the pipeline tries to run a 1-batch forward on an
   already-prepared 2-batch (cond + uncond) embedding shape.
   Fix: floor the decay at 1.5 instead of 1.0 — strictly above the
   classifier-free threshold so the pipeline keeps running 2-batch
   throughout. Tests updated for the new ramp curve. New assertion
   that the floor stays > 1.0.
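
   A sketch of the floored ramp (``_make_step_callback`` is this
   codebase's helper; the linear decay shape is illustrative):

```python
CFG_FLOOR = 1.5  # strictly above diffusers' `guidance_scale > 1.0` gate

def _make_step_callback(initial_scale: float, total_steps: int):
    def on_step_end(pipeline, step: int, timestep, callback_kwargs):
        # Decay guidance toward the floor but never down to 1.0, so
        # do_classifier_free_guidance stays True and the pipeline keeps
        # running the prepared 2-batch (cond + uncond) forward pass.
        frac = step / max(total_steps - 1, 1)
        scale = initial_scale + (CFG_FLOOR - initial_scale) * frac
        pipeline._guidance_scale = max(scale, CFG_FLOOR)
        return callback_kwargs

    return on_step_end
```
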

3. mlx-video install spec points at the wrong PyPI package
   Symptom: "mlx-video subprocess exited with code 1 after 0.3s" on
   LTX-2 distilled — even though mlx-video appeared installed under
   ``~/Library/Application Support/ChaosEngineAI/extras/site-packages/``.
   Root cause: PyPI's ``mlx-video 0.1.0`` is an unrelated utilities
   package (only ``load``, ``normalize``, ``resize``, ``to_float`` —
   no LTX-2 / Wan / Hunyuan generation entry points). Blaizzy's
   actual mlx-video lives only at github.com/Blaizzy/mlx-video.
   Fix: pin ``_INSTALLABLE_PIP_PACKAGES["mlx-video"]`` and the GPU
   bundle entry to ``mlx-video @ git+https://github.com/Blaizzy/mlx-video.git``
   so the install pulls the real generation engine. Existing test
   updated to match the new spec shape.
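
   A sketch of the pinned spec (``_INSTALLABLE_PIP_PACKAGES`` is from
   this commit; the rest of the table is elided):

```python
# A plain `pip install mlx-video` resolves to the unrelated PyPI
# utilities package, so the spec pins Blaizzy's repo via a direct URL:
_INSTALLABLE_PIP_PACKAGES = {
    "mlx-video": "mlx-video @ git+https://github.com/Blaizzy/mlx-video.git",
    # ...other entries unchanged...
}
```
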

Tests: 701 pytest pass (+2 new); 175 vitest pass; tsc clean.

Action for users:
- Already-installed mlx-video from the broken PyPI package needs to be
  replaced. Reinstall via Setup → "Install mlx-video" (now pulls from
  git) or the GPU runtime bundle.
- Wan + Hunyuan + Mochi + CogVideoX clips that were producing the
  channel-error 500 will now run end-to-end on the diffusers MPS path.
- LTX-Video runs the Phase E2 CFG decay schedule with floor=1.5
  instead of 1.0, so output is unaffected by the classifier-free
  toggle bug.