
Feature/big switch #1

Merged
cryptopoly merged 2 commits into main from feature/big-switch on Apr 13, 2026

Conversation

@cryptopoly
Owner

No description provided.

Break up the previous monolithic backend_service/app.py into a modular package structure and add frontend components:

- Core app logic and data (catalogs, image/text models, helpers for caching/discovery/persistence/network/settings/system/huggingface/images/formatting/documents) extracted into backend_service/catalog, backend_service/helpers, and backend_service/models.
- API endpoints and routing moved into backend_service/routes, with dedicated files for chat, images, models, cache, health, server, settings, setup, and OpenAI compatibility; runtime/inference logic updated (inference.py, vllm_engine.py).
- Application state moved into backend_service/state.
- Many new React frontend components, hooks, types, constants, and utilities added under src/ (components, features, hooks, utils, constants); related configs and tests updated (App.tsx, tests/test_backend_service.py, tsconfig.json, vite.config.ts).

The refactor improves separation of concerns, maintainability, and testability by isolating responsibilities into smaller modules.
- Raise the allowed contextTokens maximum to 2,097,152 (2^21) across backend models and settings.
- Harden MLX worker streaming: detect worker-exit and lost-model conditions, clear loaded_model, and raise clearer errors when stream_request fails or raises "No MLX model is loaded".
- Frontend fixes: add a 30s timeout to model search; improve error/detail parsing for generateChatStream responses and SSE events.
- Update RuntimeControls presets and cache/fp16 logic (ensure sensible defaults and reapply the active preset on strategy switch).
- Fix useChat to remove only empty assistant placeholders (preserving user messages) so user input is no longer dropped.
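
A minimal sketch of the worker-stream hardening described above (stream_request, loaded_model, and the "No MLX model is loaded" message come from this description; the subprocess-style worker handle and exact error surface are assumptions):

```python
loaded_model = None  # module-level handle to the currently loaded MLX model

def stream_with_recovery(worker, request):
    """Wrap worker.stream_request so a dead or model-less worker fails loudly."""
    global loaded_model
    try:
        yield from worker.stream_request(request)
    except Exception as exc:
        # poll() returning non-None means the worker process exited; the
        # "No MLX model is loaded" string is the worker's own signal that
        # its model state was lost mid-stream.
        if worker.poll() is not None or "No MLX model is loaded" in str(exc):
            loaded_model = None  # clear stale state so the next request reloads
            raise RuntimeError(
                "MLX worker lost its loaded model; reload the model and retry"
            ) from exc
        raise
```
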
cryptopoly merged commit b621528 into main on Apr 13, 2026
cryptopoly deleted the feature/big-switch branch on April 13, 2026 at 08:22
cryptopoly added a commit that referenced this pull request Apr 21, 2026
Two UX asks from the user's latest round:

1. 'Image Studio doesn't say whether it's trying to use CPU or CUDA'
2. 'Even though I just installed everything, the Image Studio still
   shows an install GPU runtime button. I closed and reopened the
   app and it disappeared.'

## Fix C: Device chip

The chip at line 256 rendered only when runtime.device was set,
and since my earlier refactor removed the speculative torch import
from probe(), the device is now null until a model is actually loaded.
Added an expectedDevice field to ImageRuntimeStatus that's
computed WITHOUT importing torch — find_spec + nvidia_gpu_present +
platform.machine — so we can show 'Device: cuda (expected)' before
the first Generate even fires. Same constraint as probe(): absolutely
no torch import (would pin torch/lib/*.dll and break the install
flow we just fixed).

The chip now reads:

  Device: cuda                (model loaded, actual device)
  Device: cuda (expected)     (torch installed + NVIDIA seen)
  Device: mps (expected)      (Apple Silicon)
  Device: cpu (expected)      (torch installed, no GPU)
  (hidden)                    (torch not installed)
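
A minimal sketch of the torch-free computation (nvidia_gpu_present mirrors the helper named above; how it is wired in is assumed):

```python
import platform
from importlib.util import find_spec

def expected_device(nvidia_gpu_present: bool) -> str | None:
    """Guess the device torch will pick, without importing torch.

    Importing torch here would pin torch/lib/*.dll and break the
    install flow, so only package metadata and platform info are used.
    """
    if find_spec("torch") is None:
        return None  # chip hidden: torch not installed
    if nvidia_gpu_present:
        return "cuda"  # torch installed + NVIDIA seen
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "mps"  # Apple Silicon
    return "cpu"  # torch installed, no GPU
```
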

## Fix D: Post-install restart nudge

Before this, after a successful GPU bundle install the Image Studio
still showed 'Install GPU runtime'. Root cause: backend's sys.path
is snapshotted at spawn time, so find_spec still reports torch as
missing until the backend restarts with the new PYTHONPATH (Fix A's
domain). User was confused — install said success, UI said install
again. App restart made it go away.
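
A sketch of the underlying behavior (the extras path is illustrative):

```python
import sys
from importlib.util import find_spec

# sys.path is built from PYTHONPATH once, at interpreter start; a
# site-packages dir the installer creates afterwards never appears in
# the running backend's path, so the probe keeps reporting "missing".
EXTRAS = "/path/to/extras/site-packages"  # hypothetical install target

if EXTRAS not in sys.path:
    # find_spec still searches the spawn-time sys.path, so this stays
    # False even though the wheel now sits under EXTRAS. Only a backend
    # restart (with the updated PYTHONPATH) changes it, which is why the
    # install job reports a requires-restart state instead of pretending
    # the install took effect immediately.
    torch_visible = find_spec("torch") is not None
```
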

Split the 'runtime-not-available' block into two paths:

  - Post-install awaiting restart (job.phase === 'done' &&
    job.requiresRestart): show 'installed to <path>, restart to
    activate' + a Restart Backend button. No install button — you
    just installed, clicking it again would be confusing.
  - All other cases: show the install button as before.

Mirrored the same split to Video Studio for consistency.

Per CLAUDE.md #1 (Think Before Coding): surfaced the 'running
backend can't see new packages' reality to the user instead of
hiding it. Per #3 (Surgical Changes): added one field to the
backend status + one branch to each Studio tab; no shared
component extraction yet because we only use this shape in two
places and hoisting would obscure the per-tab differences.

478 Python tests pass, 174 TS tests pass, tsc clean.
cryptopoly added a commit that referenced this pull request Apr 28, 2026
User reported failures on Wan 2.2 5B (GGUF Q4_K_M + base) and LTX-2
distilled (MLX). Three independent bugs identified and fixed.

1. Wan / Hunyuan / Mochi / CogVideoX channel error
   Symptom: "ValueError: Image must have 1, 2, 3 or 4 channels".
   Root cause: WanPipeline (and HunyuanVideo / Mochi / CogVideoX) defaults
   ``output_type="np"`` which returns a 5D numpy array (B, F, H, W, C).
   Our ``_invoke_pipeline`` unwrap path expects PIL list-of-lists; the
   numpy ndarray leaks through to ``_encode_frames_to_mp4`` where
   ``PIL.Image.fromarray`` then sees a 4D array as a single "frame"
   and rejects it for >4 channels.
   Fix: force ``output_type="pil"`` in ``_build_pipeline_kwargs`` so
   every video pipeline returns the same PIL convention LTXPipeline
   already uses (which is why LTX kept working but Wan broke).
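
   A sketch of the normalization in ``_build_pipeline_kwargs`` (the
   function name is from this commit; its surrounding plumbing is
   assumed):

```python
from typing import Any

def _build_pipeline_kwargs(base_kwargs: dict[str, Any]) -> dict[str, Any]:
    # WanPipeline / HunyuanVideo / Mochi / CogVideoX default to
    # output_type="np", which yields a 5D (B, F, H, W, C) ndarray that
    # leaks past the PIL unwrap path into PIL.Image.fromarray. Forcing
    # "pil" makes every video pipeline return the list-of-PIL-frames
    # shape LTXPipeline already uses.
    kwargs = dict(base_kwargs)
    kwargs["output_type"] = "pil"
    return kwargs
```
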

2. CFG decay floor flipping classifier-free guidance off mid-loop
   Symptom: silent corruption on Wan / LTX, sometimes shape mismatches,
   sometimes the channel error from bug 1 (the two bugs combined).
   Root cause: ``_make_step_callback`` decayed ``pipeline.guidance_scale``
   from initial → 1.0 by the final step. Diffusers' video pipelines
   compute ``do_classifier_free_guidance = self._guidance_scale > 1.0``
   (strict). At the last step the scale equals 1.0, the property flips
   to False, and the pipeline tries to run a 1-batch forward on an
   already-prepared 2-batch (cond + uncond) embedding shape.
   Fix: floor the decay at 1.5 instead of 1.0 — strictly above the
   classifier-free threshold so the pipeline keeps running 2-batch
   throughout. Tests updated for the new ramp curve. New assertion
   that the floor stays > 1.0.
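
   A sketch of the floored ramp (``_make_step_callback`` is this
   codebase's helper; the linear decay shape is illustrative):

```python
CFG_FLOOR = 1.5  # strictly above diffusers' `guidance_scale > 1.0` gate

def _make_step_callback(initial_scale: float, total_steps: int):
    def on_step_end(pipeline, step: int, timestep, callback_kwargs):
        # Decay guidance toward the floor but never down to 1.0, so
        # do_classifier_free_guidance stays True and the pipeline keeps
        # running the prepared 2-batch (cond + uncond) forward pass.
        frac = step / max(total_steps - 1, 1)
        scale = initial_scale + (CFG_FLOOR - initial_scale) * frac
        pipeline._guidance_scale = max(scale, CFG_FLOOR)
        return callback_kwargs

    return on_step_end
```
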

3. mlx-video install spec points at the wrong PyPI package
   Symptom: "mlx-video subprocess exited with code 1 after 0.3s" on
   LTX-2 distilled — even though mlx-video appeared installed under
   ``~/Library/Application Support/ChaosEngineAI/extras/site-packages/``.
   Root cause: PyPI's ``mlx-video 0.1.0`` is an unrelated utilities
   package (only ``load``, ``normalize``, ``resize``, ``to_float`` —
   no LTX-2 / Wan / Hunyuan generation entry points). Blaizzy's
   actual mlx-video lives only at github.com/Blaizzy/mlx-video.
   Fix: pin ``_INSTALLABLE_PIP_PACKAGES["mlx-video"]`` and the GPU
   bundle entry to ``mlx-video @ git+https://github.com/Blaizzy/mlx-video.git``
   so the install pulls the real generation engine. Existing test
   updated to match the new spec shape.
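
   A sketch of the pinned spec (``_INSTALLABLE_PIP_PACKAGES`` is from
   this commit; the rest of the table is elided):

```python
# A plain `pip install mlx-video` resolves to the unrelated PyPI
# utilities package, so the spec pins Blaizzy's repo via a direct URL:
_INSTALLABLE_PIP_PACKAGES = {
    "mlx-video": "mlx-video @ git+https://github.com/Blaizzy/mlx-video.git",
    # ...other entries unchanged...
}
```
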

Tests: 701 pytest pass (+2 new); 175 vitest pass; tsc clean.

Action for users:
- Already-installed mlx-video from the broken PyPI package needs to be
  replaced. Reinstall via Setup → "Install mlx-video" (now pulls from
  git) or the GPU runtime bundle.
- Wan + Hunyuan + Mochi + CogVideoX clips that were producing the
  channel-error 500 will now run end-to-end on the diffusers MPS path.
- LTX-Video runs the Phase E2 CFG decay schedule with floor=1.5
  instead of 1.0, so output is unaffected by the classifier-free
  toggle bug.