feat(ui): batch worker failure notifications #372
Merged
A systemd cascade like start-limit-hit hitting half a dozen queue workers at once, or an OOM kill taking out every worker for a site in the same tick, used to fire one push per unit and spam the user with five or six near-identical "Worker failed on …" notifications back to back. The watcher detects new failures every five seconds, so the bursts were tightly clustered and unavoidable on the receiving side.

The watcher now hands its new-failure delta to a small batcher that buffers by unit and arms a single five-second flush timer. Failures arriving later in the same window join the in-flight batch without resetting the timer, so the grouped push lands at most five seconds after the first failure, even under a sustained cascade. A single isolated failure still goes out with the existing per-unit shape (deep link to the affected site, per-unit tag for browser dedupe). Two or more failures collapse into one "N workers failed" payload that lists every worker@site and is tagged lerd-workers-group, so a later grouped push supersedes the earlier one in the notification tray instead of stacking.

Added matching notify_worker_failed_group_title and _body keys to all eight locale message files, with the Turkish strings translated and the rest falling back to English the same way the existing worker_failed entries do.
geodro added a commit that referenced this pull request on May 18, 2026:
First beta of the 1.21.0 line. The headline is desktop notifications via Web Push (#353), with a per-category settings page polished alongside a dashboard health row (#354). The PHP-FPM image grows a real shell environment (zsh, starship, eza, bat, fzf, and zoxide) isolated from the host (#358), then sheds around 800 MB of build toolchain in a multi-stage split that shrinks the image from 1.36 GB to 535 MB without losing any of its 68 PHP modules (#364).

A new on-demand commands feature surfaces one-shot framework actions across the dashboard, the lerd run CLI, the command palette, and four new MCP tools, all backed by a generalised Dropdown component that replaces every native select in the UI (#363). The site detail header gets a browser-style address bar with the favicon, TLS lock, and LAN-share chip, and worktrees are promoted from a dropdown to tabs (#365). An Env tab joins Overview, Tinker, and Dumps to show the project .env verbatim (#366), and the tray menu picks up Dump bridge and Notifications toggles that update live via a new KindDumpsStatus event (#373).

Postgres grows 17 and 18 alternates alongside a new MySQL 9.7 LTS line, all gated by a canonical-version pin so flipping the yaml canonical no longer silently major-jumps existing installs (#361). Türkçe joins the dashboard languages (#355), a public_dir override lands in .lerd.yaml for projects with a non-standard document root (#370), every git invocation in the tree now flows through internal/git (#356), and worker-failure pushes are batched so a systemd cascade no longer fires six near-identical notifications back to back (#372).

The post-1.20.2 fix queue also covers the worktree-manager button rendering on non-git sites (#357), TLS certs not refreshing when a secured site's domain set changed (#367), streamed worktree install and a wave of audit follow-ups (#368), and tinker swallowing bare-expression results when the dump bridge was on (#371).