Feature/video gen #4
Merged
cryptopoly merged 15 commits into main on Apr 21, 2026
Conversation
Four targeted fixes from the Windows diagnostic run:

- video_runtime._find_missing: catch ModuleNotFoundError when find_spec walks a dotted name whose parent namespace is absent. The probe was crashing with "No module named 'google'" while checking google.protobuf on a clean install, which made /api/video/runtime return 500 forever and kept the UI stuck on "Video runtime did not respond". Added a regression test covering the exact scenario. (A minimal sketch of the pattern follows this list.)
- useSettings.handleStopServer / handleRestartServer: decide on isTauri() rather than tauriBackend?.managedByTauri. The latter is false until the background bootstrap thread populates it, so clicking Restart before that finished (or after a stale-state refresh) sent the browser down the web-fallback branch that POSTs /api/server/shutdown and then polls for a respawn that never came. Now every desktop-context click goes through restart_backend_sidecar so Rust can own the shutdown + relaunch.
- CI (.github/workflows/build.yml): install the [images] extras in the Python test job. Tests for the diffusers video runtime import torch and diffusers directly; without them every ready-path assertion failed on main after the feature/video-gen merge.
- build.ps1 / build.sh: flip the CUDA torch check from opt-in strict (CHAOSENGINE_REQUIRE_CUDA_TORCH=1) to abort-by-default on NVIDIA hosts, with CHAOSENGINE_ALLOW_CPU_TORCH=1 as the escape. The build now prints the actual Python version + bundled torch version in the error, lists three concrete fix paths (switch to Py 3.13, override the CUDA index, or deliberately ship CPU), and extends the CUDA index candidate list up to cu130/cu129 so newer drivers get picked up.
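The fixed helper is internal to ``video_runtime``; the following is only a minimal sketch of the pattern, with the function name and return shape assumed rather than copied from the real module:

```python
import importlib.util

def find_missing(module_names: list[str]) -> list[str]:
    """Return the subset of module_names that cannot be resolved.

    find_spec("google.protobuf") does not simply return None when the
    parent "google" namespace is absent; it raises ModuleNotFoundError,
    so the probe has to treat that exception as "missing" too.
    """
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ModuleNotFoundError, ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing

# On a clean install without protobuf this returns ["google.protobuf"]
# instead of crashing the /api/video/runtime probe with a 500.
print(find_missing(["google.protobuf"]))
```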
Drops torch + diffusers + accelerate + transformers and all CUDA DLLs from the bundled installer (saves ~1.3 GB). The app now ships chat + llama.cpp out of the box (<~500 MB installer), and Image / Video Studio offer a one-click "Install GPU runtime" button the first time the user visits them. The install writes to a PERSISTENT user-local directory that survives app updates — no more re-downloading 2 GB of CUDA torch every version bump.

## Build-time changes
- ``build.ps1`` / ``build.sh``: drop ``[images]`` from the default venv install + kill the CUDA torch install/verify blocks. Users who want a one-installer GPU build can set ``CHAOSENGINE_BUNDLE_GPU=1`` to restore the old 1.9 GB behaviour (useful for air-gapped deployments).
- ``scripts/stage-runtime.mjs``: ``images`` moves from required → optional in ``validateBundledPythonPackages``. Only ``desktop`` (fastapi, uvicorn, psutil, pypdf, huggingface_hub) is strictly required now.

## Runtime plumbing
- ``src-tauri/src/lib.rs``: compute a platform-native persistent extras path (``%LOCALAPPDATA%\\ChaosEngineAI\\extras\\site-packages`` on Windows, ``~/Library/Application Support/ChaosEngineAI/extras/...`` on macOS, ``$XDG_DATA_HOME/ChaosEngineAI/extras/...`` on Linux), create it on bootstrap, prepend it to ``PYTHONPATH`` so its packages shadow anything in the bundled site-packages, and export ``CHAOSENGINE_EXTRAS_SITE_PACKAGES`` so the backend knows where to install. This path lives OUTSIDE the ephemeral ``%TEMP%`` runtime extraction, so app updates don't wipe the 2 GB of CUDA libraries the user just downloaded.

## New async install endpoint
- ``backend_service/routes/setup.py``: ``POST /api/setup/install-gpu-bundle`` starts a background thread that walks the CUDA index list for torch (cu124 → cu126 → cu128 → cu121 → nightly cu128) and then pip-installs diffusers + accelerate + transformers + safetensors + pillow + huggingface-hub + imageio + imageio-ffmpeg + sentencepiece + tiktoken + protobuf + ftfy from PyPI. Target is the persistent extras dir, so we never touch the DLLs the running backend has loaded (goodbye WinError 5). A disk-space preflight (5 GB required) fails cleanly up front. (A minimal sketch of the job/status pattern follows this summary.)
- ``GET /api/setup/install-gpu-bundle/status`` returns live progress: phase, current package, percent, attempts list, ``noWheelForPython`` flag, and ``cudaVerified`` after the post-install ``torch.cuda.is_available()`` probe. Frontend polls at 1.5 Hz.
- ``GET /api/setup/gpu-bundle-info`` exposes the target dir + package list + approx download size + free disk for a pre-install banner.

## Frontend wiring
- ``src/api.ts``: typed ``startGpuBundleInstall``, ``getGpuBundleStatus``, ``fetchGpuBundleInfo`` helpers + the ``GpuBundleJobState`` interface.
- ``src/hooks/useImageState.ts``: ``handleInstallImageRuntime`` rewritten to use the async bundle job and poll until done, surfacing live progress through ``imageBusyLabel`` (picked up by the existing "Installing..." text in ImageStudioTab).
- ``src/hooks/useVideoState.ts``: new ``handleInstallVideoGpuRuntime`` mirroring the image one. The existing per-package install path stays for targeted fixes (mp4 encoder, tokenizers) that don't need the full 2 GB.
- ``src/features/video/VideoStudioTab.tsx``: adds an "Install GPU runtime" banner when the video probe reports the engine as unavailable (was previously just a Restart Backend button that couldn't fix anything). Also fixes a latent bug where the parent passed a 0-arg wrapper that swallowed the package list from the tokenizer-deps install button.
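The real endpoints live in ``backend_service/routes/setup.py``; this is only a minimal sketch of the background-job / status-polling shape, with the state fields and helper names assumed rather than taken from the actual module:

```python
import threading
from fastapi import APIRouter, HTTPException

router = APIRouter(prefix="/api/setup")

# Single shared job record; the real implementation tracks more fields
# (attempts, noWheelForPython, cudaVerified, per-step pip output, ...).
_job = {"phase": "idle", "current": None, "percent": 0, "message": ""}
_job_lock = threading.Lock()

def _run_install() -> None:
    """Worker thread: walk CUDA indexes for torch, then install the rest of the bundle."""
    try:
        _job.update(phase="installing", current="torch", percent=5)
        # ... pip install --target <persistent extras dir> for each package ...
        _job.update(phase="done", current=None, percent=100,
                    message="GPU bundle installed.")
    except Exception as exc:  # surface *something* even for blank exceptions
        _job.update(phase="error", message=str(exc) or type(exc).__name__)

@router.post("/install-gpu-bundle")
def start_install() -> dict:
    with _job_lock:  # concurrent-start guard
        if _job["phase"] == "installing":
            raise HTTPException(status_code=409, detail="Install already running.")
        _job.update(phase="installing", percent=0, message="")
    threading.Thread(target=_run_install, daemon=True).start()
    return {"started": True}

@router.get("/install-gpu-bundle/status")
def install_status() -> dict:
    return dict(_job)  # frontend polls this at ~1.5 Hz
```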
## Tests
- ``tests/test_setup_routes.py``: new ``InstallGpuBundleTests`` covers the idle-state status response, bundle-info contract, happy-path end-to-end install, disk-space preflight failure, ``noWheelForPython`` flagging after all-indexes fail, and the concurrent-start guard that prevents parallel installs.

466 Python tests pass; 171 TypeScript tests pass; tsc + cargo check clean.
The previous flow collapsed every install failure into a single toast
("GPU bundle install failed."), which is the "failed without reason"
the user reported. The backend already tracks every pip step's stdout
+ stderr in the job state — this exposes that as a collapsible log
panel under the Install GPU Runtime button in both Image Studio and
Video Studio.
UX:
- Auto-expands when the install enters the ``error`` phase so users
see the real pip output without clicking — by far the most common
thing they need when the install fails.
- Stays collapsed on success so the UI doesn't get noisy.
- Shows target dir + Python version + winning CUDA index up top
(everything you'd need to paste into a support thread).
- Each attempt is one row: OK/FAIL badge + step label
(``torch (from cu124)``, ``pip install diffusers``, ``Verify
torch.cuda.is_available()``…) + scrollable terminal-style ``<pre>``
with the last 60 lines of pip output.
- Final status row mirrors ``state.message`` so partial successes
surface cleanly ("CUDA verified, but ftfy + sentencepiece failed —
retry individually").
Backend hardening (``backend_service/routes/setup.py``; a minimal sketch of these message rules follows the list):
- ``except Exception`` block now falls back to the exception type name
when ``str(exc)`` is empty, so the UI never sees a blank ``error``.
- "All CUDA index candidates failed" message now appends the last
three lines of the most recent pip output so the toast itself is
actionable, not just a pointer to the log.
- Non-fatal package failures (after torch lands) are tracked in a
``non_fatal_failures`` list and summarised in the final ``message``,
so success/restart prompts don't lie when half the bundle is missing.
- Verify-CUDA failure path now includes the verify subprocess's last
two output lines in the message — drives the user toward the actual
cause (driver mismatch vs CPU torch) without expanding the log.
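The rules above are plain string handling; this sketch uses assumed names and field shapes, not the real job-state structures:

```python
def error_message(exc: Exception, last_output: str = "") -> str:
    """Never hand the UI a blank error string."""
    msg = str(exc) or type(exc).__name__           # fall back to the exception type name
    tail = "\n".join(last_output.splitlines()[-3:])  # last few lines of the most recent pip output
    return f"{msg}\n{tail}" if tail else msg

def final_message(cuda_verified: bool, non_fatal_failures: list[str]) -> str:
    """Summarise partial success instead of claiming a clean install."""
    if not non_fatal_failures:
        return "GPU bundle installed." + (" CUDA verified." if cuda_verified else "")
    failed = " + ".join(non_fatal_failures)
    prefix = "CUDA verified, but" if cuda_verified else "Torch installed, but"
    return f"{prefix} {failed} failed — retry individually."
```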
Frontend (``src/components/InstallLogPanel.tsx``,
``src/hooks/use{Image,Video}State.ts``,
``src/features/{images,video}/{ImageStudio,VideoStudio}Tab.tsx``,
``src/App.tsx``):
- New ``InstallLogPanel`` component (no emojis, monochrome, matches
existing Studio styling).
- Both hooks now keep a ``gpuBundleJob: GpuBundleJobState | null``
React state, updated each poll tick. Exposed via the hook return
value, plumbed through ``App.tsx`` to the Studio tabs.
- Both Studio tabs render ``<InstallLogPanel job={gpuBundleJob} />``
inside the runtime-actions block whenever
``realGenerationAvailable === false``, so it appears the moment the
install starts and stays around until the user resolves it.
- Image Studio install button text changed from "Install image
runtime" to "Install GPU runtime" for consistency with Video Studio
and the new bundle naming. Description updated to mention the
~2.5 GB persistent download.
- Hook ``handleInstallImageRuntime`` / ``handleInstallVideoGpuRuntime``
now append " See the install log below for per-step pip output.
Target: <dir>" to the toast message on failure, so even users who
don't expand the log get pointed at where the bytes were going.
Image Studio intentionally stays at one install button (vs Video
Studio's three): mp4-encoder + tokenizer-deps buttons in Video Studio
exist because video pipelines lazy-load specific packages at preload
time. Image gen uses diffusers' bundled tokenizers and writes PNG
directly — its needs are covered entirely by the GPU bundle.
466 Python tests pass; 171 TypeScript tests pass; tsc clean.
Verbatim from ~/andrej-karpathy-skills/CLAUDE.md so the rules show up without needing the source file in scope. Lives at the top of the project guide because they govern how to work, not what's in the codebase.
Chat model downloads now appear in My Models as soon as they start, with a live progress badge, and the Discover card header surfaces the same state without requiring the user to expand the card. Mirrors the behaviour video models already had via ``activeVideoDownloads``.

- src/App.tsx: synthesise library rows from ``activeDownloads`` entries that haven't landed in ``workspace.library`` yet (matched by repo via inferred HF cache path / catalog variant), so MyModelsTab renders them alongside real rows.
- src/features/models/MyModelsTab.tsx: treat rows with a ``download://<repo>`` sentinel path as synthetic — hide the reveal button and the expanded library-path line.
- src/features/models/OnlineModelsTab.tsx: add Downloading/Paused/Failed badges to both the curated family and hub card headers so the state is visible on the collapsed row.
- src/features/models/*.test.tsx: cover the new synthetic-row rendering and the new collapsed header badges for curated + hub cards.

npm test: 174 passed. npx tsc --noEmit: clean. pytest tests/: all pass.
Two different meanings of 'backend' were colliding on screen: the footer's green 'BACKEND ONLINE' badge means the Python API sidecar, while the Dashboard's 'Runtime engine: No backend' meant no inference engine was active. Users reasonably read the collision as a bug report. 'Idle' matches RuntimeStatus.state already used elsewhere and doesn't claim the sidecar is down. No logic branches on the literal string.

- backend_service/inference.py: MockInferenceEngine.engine_label
- src/mockData.ts: matching test fixture
- tests/test_backend_service.py: matching FakeRuntime stub

466 Python / 174 TS tests pass, tsc clean.
Two related guardrails.

1. backend_service/state.py::load_model — raise a clear RuntimeError when the request's modelRef isn't in the library AND no path was provided. Previously the request fell through to llama-server's built-in HuggingFace auto-fetch, which on Windows blew up with 'HTTPLIB failed: SSL server verification failed' because the bundled llama-server.exe ships without a CA bundle. That was the root cause of the phantom Nemotron load the user hit. Whatever UI path triggered that load, users will now see 'Download it from the Discover tab first' instead of a confusing SSL error. (A minimal sketch of the guard follows below.)
2. src/features/chat/ChatTab.tsx — Send button is disabled and the textarea Enter-key handler is a no-op when loadedModelRef is null. Placeholder text changes to 'Load a model first...' so the disabled state is self-explanatory. Title attribute provides a hover hint.

Also:
- tests/test_backend_service.py — new test_model_load_rejects_models_not_on_disk locks in the guard behaviour (500 + actionable detail, runtime.load_model never invoked).
- test_compare_stream_uses_exclusive_loading_and_clears_warm_pool updated to pass an explicit path for modelB, since the new guard requires either a library entry or a path (the test isn't exercising the guard itself, just warm-pool eviction).

467 Python / 174 TS tests pass, tsc clean.
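The guard lives in ``backend_service/state.py::load_model``; the sketch below only illustrates the check, and the request/library shapes are assumptions:

```python
def resolve_model_path(model_ref: str, path: str | None, library: dict[str, str]) -> str:
    """Refuse to fall through to llama-server's HuggingFace auto-fetch.

    Without this, an unknown modelRef with no path reaches the bundled
    llama-server.exe, which ships without a CA bundle on Windows and fails
    with an opaque 'SSL server verification failed' error.
    """
    if path:
        return path
    if model_ref in library:
        return library[model_ref]
    raise RuntimeError(
        f"Model '{model_ref}' is not in the local library and no path was "
        "provided. Download it from the Discover tab first."
    )
```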
Three related fixes for the Image/Video Studio install UX.

1. Eager torch warmup was pinning DLLs in the backend process, making it impossible for pip to reinstall torch via the install-gpu-bundle flow. Removed start_torch_warmup() from create_app() and updated probe() in both image and video runtimes to use importlib.util.find_spec instead of an actual import torch. The torch import is now deferred to preload()/generate() where it's actually needed — first-generate pays the cold-import cost (~30s on Windows cold disk) but install-gpu-bundle is no longer blocked by locked torch/lib/*.dll.
2. Image Studio's install button was gated on activeEngine === "unavailable" but the backend never returns that literal — it returns "placeholder" both when packages are missing AND when they're installed-but-broken (the user's "cannot import name 'autocast' from 'torch.amp'" case). The button was effectively invisible in production. Gate is now just !realGenerationAvailable, matching the Video Studio pattern.
3. Added a DLL-lock detector in the install worker: if pip rmtree fails with WinError 5 / Access is denied on a torch DLL (which still happens in one edge case — user already triggered a torch import before clicking Install GPU runtime), we now raise a clear actionable error pointing at the Restart Backend button instead of burying the cause under pip's stack trace. (A minimal sketch of the detector follows below.)

Backend plumbing:
- backend_service/app.py: drop start_torch_warmup() from create_app(). Comment explains why — the warmup infrastructure still exists for future explicit callers.
- backend_service/image_runtime.py: probe() uses find_spec only. Device is reported from the currently-loaded pipeline (None when nothing is loaded) instead of speculatively calling detect_device.
- backend_service/video_runtime.py: same treatment. Message now lists missing_core packages explicitly so the preload() RuntimeError string tells callers what to install.
- backend_service/routes/setup.py: new _looks_like_dll_lock() + short-circuit in _install_torch_walking_indexes. Message points at Restart Backend.

Frontend:
- src/features/images/ImageStudioTab.tsx: install button always shows when runtime isn't available (removed dead activeEngine gate).

Tests:
- tests/test_video_runtime.py: renamed test_probe_reports_ready_when_all_deps_and_torch_import_cleanly -> test_probe_reports_ready_when_all_deps_findable, asserts device is None. Removed test_probe_returns_initializing_fast_path_when_warmup_in_progress (that code path no longer exists). Added test_probe_does_not_import_torch as a regression lock so nobody reintroduces the DLL-pin bug.
- tests/test_setup_routes.py: new DllLockDetectionTests covering the three cases that matter — WinError 5 on a torch DLL (match), unrelated pip failure (no match), WinError 5 on a non-torch file (no match).

470 Python / 174 TS tests pass, tsc clean.
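The real ``_looks_like_dll_lock()`` may match more patterns; this sketch only illustrates the intent described above, and its exact signature and the second helper are invented for the example:

```python
def looks_like_dll_lock(pip_output: str) -> bool:
    """Heuristic: pip failed because a loaded torch DLL could not be removed.

    Matches only when an access-denied error points at a torch DLL, so an
    unrelated pip failure (or a lock on a non-torch file) falls through to
    the normal error path.
    """
    text = pip_output.lower()
    access_denied = "winerror 5" in text or "access is denied" in text
    return access_denied and "torch" in text and ".dll" in text

def raise_if_dll_locked(pip_output: str) -> None:
    """How the install worker might use it: replace pip's stack trace with an actionable message."""
    if looks_like_dll_lock(pip_output):
        raise RuntimeError(
            "A torch DLL is locked by the running backend. "
            "Click Restart Backend, then retry the GPU runtime install."
        )
```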
Replaces the 'ask user to run PowerShell' troubleshooting loop with a one-click snapshot the user can copy as Markdown and paste into a support thread. Everything we'd otherwise manually query — OS + hardware + driver versions, runtime paths, env vars, GPU package presence, torch subprocess status, extras size + free disk, backend log tail — lives in one collapsible panel with a Copy diagnostics button.

Backend endpoints (backend_service/routes/diagnostics.py):
- GET /api/diagnostics/snapshot — structured JSON with app, os, hardware (CPU/memory/swap/disks/GPU from nvidia-smi/sysctl), python, runtime, gpu (find_spec + torch-in-subprocess probe so we never pin DLLs), extras, environment (pinned + redacted), logs tail.
- GET /api/diagnostics/log-tail?lines=N — clamped at 500 lines.
- POST /api/diagnostics/reextract-runtime — deletes the ephemeral %TEMP% embedded-runtime extraction so the next backend restart re-extracts from scratch. Useful when the runtime tree is corrupted from a crashed install.

Frontend (src/features/settings/DiagnosticsPanel.tsx):
- Renders the snapshot as collapsible sections with monospace key/value tables. Empty / missing values render as '—' so 'not applicable' is visually distinct from 'broken'.
- Copy button formats the entire snapshot as a Markdown document (tables for each section, fenced code block for the log tail) and writes it to the clipboard. Status flips to 'Copied!' for 2.5s.
- Re-extract runtime button (with confirm) + Restart Backend button grouped under a muted 'Advanced' note so users don't hit them by accident.

Redaction (sketched below):
- Env vars whose name contains token / secret / password / api_key come back as ***redacted***. Pinned CHAOSENGINE_* / PYTHON* / PATH / VIRTUAL_ENV vars appear even when unset so users can see what the backend inherited vs didn't.

Regression locks:
- test_snapshot_gpu_uses_find_spec_without_importing_torch asserts the snapshot endpoint never pulls torch into sys.modules. The whole install-flow fix this week hinges on keeping the backend torch-free; this test prevents a future diagnostics addition from regressing it.
- test_snapshot_redacts_secret_env_vars locks in the redaction rules so users can safely paste payloads into public chat.

Wiring:
- New Diagnostics subtab added to SettingsTab alongside General / Storage / Providers / Integrations.
- App.tsx threads backendOnline / onRestartServer / busyAction into SettingsTab so the Refresh + Restart buttons gate correctly.

478 Python / 174 TS tests pass, tsc clean.
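A minimal sketch of the redaction rule, assuming a hypothetical pinned-variable list (the real list lives in ``backend_service/routes/diagnostics.py``):

```python
import os

SECRET_MARKERS = ("token", "secret", "password", "api_key")
# Hypothetical pinned list for the sketch; the real module pins the
# CHAOSENGINE_* / PYTHON* / PATH / VIRTUAL_ENV names it cares about.
PINNED = ("CHAOSENGINE_EXTRAS_SITE_PACKAGES", "PYTHONPATH", "PATH", "VIRTUAL_ENV")

def environment_section() -> dict[str, str | None]:
    """Pinned vars always appear (None when unset); secret-looking values are masked."""
    section: dict[str, str | None] = {name: os.environ.get(name) for name in PINNED}
    for name in os.environ:
        if any(marker in name.lower() for marker in SECRET_MARKERS):
            section[name] = "***redacted***"
    return section
```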
The user reported two 28 GB llama-server.exe processes surviving ChaosEngineAI closing all windows. Root cause: cleanup_orphaned_backends only matched 'backend_service.app' commandlines, so when the Python sidecar crashed before taskkill /F /T could cascade to its children, the llama.cpp binary children were reparented and left running.

Now sweeps four process shapes on both platforms, parent-liveness-checked the same way as before:
- backend_service.app (Python sidecar, unchanged)
- llama-server (standard llama.cpp)
- llama-server-turbo (TurboQuant / RotorQuant fork)
- llama-cli (defensive — not currently spawned but may be)

Unix: extended the ps-based pattern match into a const marker array. Turbo listed before 'llama-server' so substring match doesn't swallow the more specific name. Windows: split the single WMIC query into two filters (commandline LIKE for Python, name= for the llama binaries) because those can't compose cleanly in one WHERE clause. Lifted the shared parse + tasklist + taskkill logic into sweep_orphans_by_wmic_filter. (A rough illustration of the parent-liveness idea follows below.)

478 Python tests still pass. cargo check --release clean.
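The sweep itself is implemented in Rust inside the Tauri shell; purely as an illustration of the parent-liveness idea (not the actual code), a rough Python equivalent using psutil, with the app/parent check simplified:

```python
import psutil

ORPHAN_MARKERS = ("backend_service.app", "llama-server-turbo", "llama-server", "llama-cli")

def _parent_is_our_app(ppid: int) -> bool:
    """Is the parent still a live ChaosEngineAI / sidecar process?"""
    try:
        parent = psutil.Process(ppid)
        blob = parent.name() + " " + " ".join(parent.cmdline())
        return "ChaosEngineAI" in blob or "backend_service.app" in blob
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        return False

def sweep_orphans() -> None:
    """Kill matching processes whose parent (the app or its sidecar) is gone."""
    for proc in psutil.process_iter(["pid", "name", "cmdline", "ppid"]):
        blob = " ".join(proc.info["cmdline"] or []) + " " + (proc.info["name"] or "")
        if any(marker in blob for marker in ORPHAN_MARKERS) and not _parent_is_our_app(proc.info["ppid"]):
            proc.kill()  # orphaned: parent died without cleaning up its children
```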
User hit 'Failed to fetch' in the Diagnostics panel — the very panel
meant to troubleshoot failures fetch-failed. Most likely one of the
helpers (psutil disk_usage on a funky Windows mountpoint, a hung
nvidia-smi, a serialization edge case) raised and uvicorn returned an
unparseable response before the 500 handler could serialize.
Every section builder now runs inside a _guarded() wrapper that
catches any exception and returns {'error': '<Type>: <msg>', 'section':
'<name>'} in place of the normal section dict. The frontend's
Record<string, unknown> typing accepts that shape unchanged, so empty
values just render as '—' and the user still sees the other sections.
The panel's whole point is to stay useful when things break; it can't
itself be fragile. 478 Python tests still pass.
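A minimal sketch of that wrapper, assuming the section builders take no arguments (the real signature may differ):

```python
from typing import Any, Callable

def guarded(name: str, builder: Callable[[], dict[str, Any]]) -> dict[str, Any]:
    """Run one snapshot-section builder; never let it take down the whole endpoint."""
    try:
        return builder()
    except Exception as exc:  # psutil quirks, hung nvidia-smi, serialization edge cases
        return {"error": f"{type(exc).__name__}: {exc}", "section": name}

# snapshot = {name: guarded(name, fn) for name, fn in SECTION_BUILDERS.items()}
```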
Two fixes surfaced by the user's first working diagnostics snapshot on
their Windows VM.
1. find_spec('google.protobuf') raises ModuleNotFoundError (not returns
None) when the 'google' namespace package isn't installed at all.
That crashed _gpu_info() whole — the section came back as
{error: 'ModuleNotFoundError: No module named google'} in the
snapshot. Same edge case we already fixed in video_runtime._find_missing;
the diagnostics helper didn't get the treatment. Now has a
_has_module() wrapper that catches ModuleNotFoundError / ValueError /
ImportError and returns False, matching the caller's 'is it
installed?' semantic.
2. The Diagnostics panel body wasn't scrollable. The snapshot has ~9
sections plus a 200-line log tail, so on a 1080p screen the last
few sections AND the Re-extract button at the bottom were below
the fold, unreachable. Capped the body at 75vh with overflow-y:
auto so users can scroll through the whole thing. The Copy button
already worked regardless so support flows were fine, just
frustrating to verify visually.
Both trivial fixes, shipped together.
Closes the main memory-leak failure mode the user reported: 28 GB
llama-server.exe processes surviving app close on Windows. The
existing reactive sweep (cleanup_orphaned_backends) catches these on
the next app launch, but only if the user actually relaunches. While
the app stays closed the orphans are pinning VRAM + RAM indefinitely.
Now three proactive mechanisms — each handling the case its platform
is best at — layered so no single crash path can leak children.
## Windows (new): Job Objects with KILL_ON_JOB_CLOSE
- Added windows-sys crate with Win32_Foundation + Win32_System_JobObjects
features.
- New src-tauri/src/lib.rs::windows_job module creates a singleton Job
Object at first spawn via CreateJobObjectW, flips the
JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE bit via SetInformationJobObject,
holds the handle in a process-lifetime OnceLock<usize>.
- After command.spawn() we call AssignProcessToJobObject on the
child. Grandchildren (llama-server spawned by Python) auto-inherit
job membership on Windows 8+.
- When Tauri exits FOR ANY REASON — graceful close, crash, Task
Manager's End Task, OOM killer, power loss handshake, panic —
Windows kernel closes the handle and the KILL_ON_JOB_CLOSE flag
makes it atomically kill every process in the job. No user-mode
code runs; nothing can leak.
- Best-effort: if Job creation fails the reactive sweep still catches
anything that leaks.
## Linux (new): PR_SET_PDEATHSIG in pre_exec
- Added libc::prctl(PR_SET_PDEATHSIG, SIGKILL) inside the existing
setsid pre_exec block. Kernel now delivers SIGKILL to the backend
Python when Tauri dies, even if the Python watchdog hasn't ticked
yet (closes the worst-case 500ms race in the getppid poll).
- Python then dies, which signals its session (llama-server +
anything else) via the watchdog's SIGTERM+SIGKILL cascade below.
## macOS / Linux (hardened): watchdog now SIGTERM+SIGKILL
- backend_service/app.py::_watch_parent_and_exit was already sending
SIGTERM to the process group on parent death, but graceful SIGTERM
can be ignored indefinitely by a buggy / stuck llama-server. Now
we SIGTERM first (gives llama-server a chance to flush caches +
release GPU handles), wait 300ms, then SIGKILL the whole group as
belt-and-braces. killpg(pgrp, SIGKILL) kills us too so we never
need to loop; if SIGKILL is somehow ignored we fall through to
os._exit(0). (A minimal sketch of this cascade follows below.)
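A minimal sketch of the cascade, with names assumed; the real watchdog in ``backend_service/app.py::_watch_parent_and_exit`` also handles parent-death detection and platform guards:

```python
import os
import signal
import time

def terminate_process_group() -> None:
    """SIGTERM first so llama-server can flush and release GPU handles, then SIGKILL."""
    # Assumption for this sketch: ignore SIGTERM in our own process so we
    # survive our own group-wide SIGTERM long enough to escalate.
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    pgrp = os.getpgrp()
    os.killpg(pgrp, signal.SIGTERM)   # polite shutdown for the whole session
    time.sleep(0.3)                   # 300 ms grace period
    os.killpg(pgrp, signal.SIGKILL)   # belt-and-braces: kills us too, so no loop needed
    os._exit(0)                       # reached only if SIGKILL was somehow a no-op
```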
## Coverage matrix
| | Graceful close | Tauri crash | Tauri SIGKILL |
| --- | --- | --- | --- |
| macOS | setsid+shutdown | watchdog | watchdog |
| Linux | setsid+shutdown | PDEATHSIG | PDEATHSIG |
| Windows | taskkill /T | Job Object | Job Object |

Fallback: cleanup_orphaned_backends on next launch (all OSes).
cargo check clean on macOS (Windows cfg-gated code not compiled
locally but mechanically audited against windows-sys 0.61 API).
478 Python + 174 TS tests still pass.
Rolls all four version-declaring files forward together so nothing goes out of sync:
- package.json (frontend + vite build)
- package-lock.json (npm lockfile mirror)
- pyproject.toml (backend Python package)
- src-tauri/Cargo.toml (Rust shell package)
- src-tauri/Cargo.lock (Cargo picked up the new windows-sys dep for the orphan-prevention Job Object work AND the chaosengineai package version change in one regen)
- src-tauri/tauri.conf.json (bundle + updater manifest version)

Cargo.lock was going to need committing regardless — the orphan-prevention commit added a top-level windows-sys dependency in Cargo.toml and the lockfile must mirror that for reproducible builds. Rolling it into the version bump keeps the diff single-commit and bisect-friendly.

478 Python / 174 TS tests pass, cargo check clean.
User reported the app hanging on 'Loading workspace state...' after
installing 0.6.2 over 0.6.0. Diagnostic showed the extraction tree in
a partial state:
backend 21/04 16:51 <-- stale, from 0.6.0
bin 21/04 17:53 <-- stale
python 21/04 18:13 <-- stale
manifest.json 21/04 19:57 <-- current, from 0.6.2
And backend/ was EMPTY inside — all the Python source that boots the
FastAPI sidecar had been rmtree'd but the unpack never re-populated
it. Root cause: ensure_embedded_runtime_extracted had this line:
if extraction_root.exists() {
let _ = fs::remove_dir_all(&extraction_root); // swallows err
}
On Windows, fs::remove_dir_all fails with 'Access denied' when a DLL
is held open — our own torch/lib/*.dll or llama-server.exe from a
previous session, or even a brief Defender scan lock. The let _
discarded that error; unpack then ran into a partially-emptied tree
and wrote only whatever entries came BEFORE the locked file. Result
shipped above: backend/ cleared, nothing re-extracted, manifest
written as if everything was fine.
## Fix: key extraction dir by manifest content hash
Each unique build manifest lands in its own directory:
chaosengine-embedded-runtime/
windows-x86_64-a1b2c3d4/ <-- 0.6.0 build
windows-x86_64-e5f6g7h8/ <-- 0.6.2 build
Fresh extraction is always to a fresh directory. No existing tree is
ever overwritten, so no rmtree race. If a specific-hash directory
exists in a partial state (e.g. we crashed mid-unpack), we DO rmtree
it — but only that one dir, and we propagate errors now instead of
swallowing them so users see a clear message.
Hash is a 32-bit DefaultHasher fingerprint (4B possible values,
collision chance negligible for the handful of manifest versions a
user ever sees). 8 hex chars keeps Windows MAX_PATH headroom for
deep torch/lib paths. (The keying idea is illustrated below.)
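The implementation is Rust in src-tauri; purely to illustrate the keying idea (not the actual code), here is a Python stand-in that uses ``zlib.crc32`` in place of Rust's ``DefaultHasher``:

```python
import zlib
from pathlib import Path

def extraction_dir_for(manifest_text: str, platform_tag: str, root: Path) -> Path:
    """Each unique build manifest gets its own extraction directory."""
    fingerprint = zlib.crc32(manifest_text.encode("utf-8"))  # 32-bit stand-in for DefaultHasher
    return root / f"{platform_tag}-{fingerprint:08x}"        # 8 hex chars keeps MAX_PATH headroom

# e.g. chaosengine-embedded-runtime/windows-x86_64-a1b2c3d4
print(extraction_dir_for("manifest for 0.6.2", "windows-x86_64",
                         Path("chaosengine-embedded-runtime")))
```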
## Cleanup for upgrading users
cleanup_legacy_extraction_root deletes the pre-0.6.2 unsuffixed
path (chaosengine-embedded-runtime/<platform>/) on bootstrap,
best-effort. Idempotent — once it's gone it's a no-op forever. This
means users upgrading from 0.6.0/0.6.1 don't have to manually nuke
their extraction dir before the first 0.6.2 launch.
## Diagnostics re-extract button
Updated Python's _runtime_extraction_root to return the PARENT
directory (chaosengine-embedded-runtime/) instead of computing a
specific platform-tag path. After the hash-suffix switch, Python
can't know the current hash without access to the manifest. Deleting
the whole parent matches the button's semantics anyway — 'clear
everything and re-extract from scratch on next launch'.
478 Python tests pass, cargo check clean.
cryptopoly added a commit that referenced this pull request on Apr 21, 2026:
User asked for: fixed-width terminal, single scroll region, step
counter showing progress. Previous per-step <details> cards were OK
on a 3-package install but stacked too tall on the 13-package GPU
bundle — output scrolled off-screen and users lost track of which
step was current.
New layout:
- Single monospace <pre> region, max-height 380px, auto-scrolls to
bottom on new attempts (tail -f behaviour). Doesn't steal scroll
on phase transitions — only on new output — so a user reading
earlier lines doesn't get yanked forward.
- Step line above the terminal shows 'Step 3/13: accelerate · 42%'
while running, 'Final: 12/13 packages · 100%' when done.
- Per-attempt markers ([ OK ], [FAIL], [....] for in-progress) line
up on the left edge so failures are scannable.
Also strips pip's dep-resolver noise from the displayed output. The
user hit this on their Windows box where their .venv's leftover
turboquant-mlx-full declared an mlx>= constraint that will never be
satisfied on Windows — pip prints a scary-looking ERROR block
('chaos-engine-compressor ... requires safetensors, which is not
installed'), cosmetic but alarming. Raw attempt.output still has
the noise; we just filter it from the rendered terminal. Users who
want the full pip trace still get it via the attempts array.
Per CLAUDE.md #2 (Simplicity First): single component file, no new
abstractions, no new deps. Per #3 (Surgical Changes): only touched
InstallLogPanel.tsx + its CSS block. Per #4 (Goal-Driven): visible
improvement (scroll works, step counter visible, noise suppressed)
verifiable at a glance on next build.
174 TS tests still pass, tsc clean.