feat: CLI-via-Goosed unified agent architecture with multi-agent routing #7238
Draft
bioinfornatics wants to merge 277 commits into block:main
Three new endpoints under /analytics/routing/:
- POST /inspect: score a message against all agent modes, return decision + detail
- POST /eval: run YAML eval set, return metrics/results/report
- GET /catalog: list all registered agents and their modes
Adds public score_mode_detail() to IntentRouter for per-mode scoring visibility.
Wires existing systemd/launchd generators into CLI subcommands:
- goose service install: install goosed as system service
- goose service uninstall: stop and remove service
- goose service status: show service status
- goose service logs: tail service logs
Cross-platform: systemd user units on Linux, launchd agents on macOS.
…es (REACT-3+)
Remove 284 lines of inline duplicates (StreamState, StreamAction, streamReducer, initialState, pushMessage, prefersReducedMotion, streamFromResponse) from useChatStream.ts, replacing them with imports from chatStream/streamReducer.ts and chatStream/streamDecoder.ts. 860 → 576 lines. Zero behavioral change: the hook body is unchanged.
Design document covering 7 backlog items for routing analytics:
- BL-1: Analytics server endpoints (now implemented)
- BL-2: Analytics UI page with 3 tabs
- BL-3: Live user feedback (thumbs up/down)
- BL-4: LLM judge for routing quality
- BL-5: Automatic misrouting detection
- BL-6: Test set management UI
- BL-7: Session outcome tracking
Includes feedback loop architecture diagram and priority matrix.
- A2A/MCP/ACP protocol comparison matrix
- Agentic implementation mindmap (2026-02-15)
- Knowledge graph seed data for mindmap
- Protocol state-of-the-art analysis vs goose implementation
Extract QA functionality from CodingAgent's single 'qa' mode into a dedicated QA Agent with four specialized modes:
- analyze: static analysis and code smell detection (read-only)
- test-design: test case generation and strategy planning
- coverage-audit: test suite gap analysis and coverage reporting
- review: structured code review with severity-based findings
Each mode has its own prompt template, tool group access, and recommended extensions. The QA Agent is registered in IntentRouter alongside GooseAgent and CodingAgent. Includes 9 unit tests and 8 routing eval test cases (37 total). Default mode: analyze.
… catalog
BL-2: Three-tab analytics page consuming the /analytics/routing/ endpoints:
- RoutingInspector: type a message, see per-mode scoring with matched keywords and the routing decision
- EvalRunner: paste/upload YAML eval sets, run against IntentRouter, view metrics with per-agent/per-mode accuracy and confusion matrix
- AgentCatalog: browse all registered agents and their modes with when_to_use descriptions and tool group details
Components are standalone and ready to be wired into the app's navigation/sidebar.
…, Promtail)
Docker Compose-compatible configs for local observability:
- prometheus.yml: scrape goosed metrics endpoint
- loki.yaml: local log aggregation storage
- promtail.yml: ship goosed logs to Loki
These support the OTel pipeline and analytics dashboard.
Collaborator
thanks @bioinfornatics I like that general idea - looks like a lot of work to tidy up conflicts but would like to see what it looks like if you could show it here.
PM Agent (4 modes):
- requirements: user stories, acceptance criteria, PRDs
- prioritize: RICE/MoSCoW scoring and backlog ordering
- roadmap: milestone planning, phased rollout, risk registers
- stakeholder: personas, competitive analysis, KPIs
Security Agent (4 modes):
- threat-model: STRIDE analysis, attack surface mapping
- vulnerability: SAST-style code review, injection analysis
- compliance: OWASP ASVS, PCI-DSS, SOC 2, HIPAA audit
- pentest: security test plans and attack scenarios
Both registered in IntentRouter with keyword routing. Routing eval expanded from 37 to 53 test cases. Total: 18 new unit tests (9 per agent).
…earn modes
Standalone Research Agent for knowledge work:
- investigate: deep-dive technical research with source citations
- compare: structured technology comparisons with decision matrices
- summarize: document/discussion summarization with key decisions
- learn: concept explanations with examples and learning paths
Registered in IntentRouter, 8 eval test cases (61 total). 10 unit tests, all passing.
- Add AnalyticsView import and /analytics route to App.tsx
- Add AnalyticsRoute component wrapper
- Add 'agents' and 'analytics' to View type in navigationUtils.ts
- Add navigation handler cases for agents and analytics views
Mark GooseAgent's 'specialist' mode as deprecated with guidance to use the new dedicated agents instead (QA, Security, PM, Research).
- Add 'deprecated: Option<String>' to BuiltinMode and AgentMode structs
- Specialist mode still works but carries deprecation message
- All other modes explicitly marked deprecated: None
- Propagated through to_agent_modes() and to_public_agent_modes()
- 3 new tests: deprecation present, non-specialist not deprecated, propagation to AgentMode
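A minimal sketch of how the deprecation marker described in this commit could be carried from a built-in mode to its public form. The struct shapes are simplified stand-ins, and the `assistant` slug and the deprecation wording are hypothetical; only the `deprecated: Option<String>` field, the 'specialist' mode, and the `to_agent_modes()` name come from the commit message.

```rust
/// Simplified, hypothetical stand-in for the real BuiltinMode struct.
#[derive(Debug, Clone, PartialEq)]
pub struct BuiltinMode {
    pub slug: &'static str,
    /// `None` for active modes, `Some(guidance)` for deprecated ones.
    pub deprecated: Option<String>,
}

/// Simplified, hypothetical stand-in for the public AgentMode struct.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentMode {
    pub slug: String,
    pub deprecated: Option<String>,
}

/// Example mode list; the "assistant" slug and the message text are assumptions.
pub fn builtin_modes() -> Vec<BuiltinMode> {
    vec![
        BuiltinMode { slug: "assistant", deprecated: None },
        BuiltinMode {
            slug: "specialist",
            deprecated: Some(
                "Use the dedicated QA, Security, PM, or Research agents instead.".to_string(),
            ),
        },
    ]
}

/// Propagate the deprecation message unchanged into the public representation.
pub fn to_agent_modes(modes: &[BuiltinMode]) -> Vec<AgentMode> {
    modes
        .iter()
        .map(|m| AgentMode {
            slug: m.slug.to_string(),
            deprecated: m.deprecated.clone(),
        })
        .collect()
}
```

Keeping the field an `Option<String>` means a deprecated mode stays selectable while clients can surface the guidance text next to it.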
Add a comprehensive evaluation analytics system inspired by Copilot Studio for testing, tracking, and monitoring orchestrator routing accuracy.
Backend (Rust):
- New eval_storage.rs: Full eval persistence with SQLite tables for datasets, test cases, and eval runs (schema v8 migration)
- 10 new API endpoints: dataset CRUD, eval run execution, overview dashboard, and topic-level analytics
- SessionAnalytics: Extended usage insights (daily activity, provider breakdown, token trends, top directories)
- Integration with IntentRouter for live evaluation execution
Frontend (React/TypeScript):
- 3-view architecture: Dashboard / Evaluate / Configure
- Dashboard: Unified view combining usage KPIs with eval health indicators, regression alerts, accuracy trends, and per-agent performance bars
- Evaluate: Overview tab (KPIs + trends), Datasets tab (full CRUD with inline editor + YAML mode), Run History (detail view with confusion matrix), Topics (tag-based accuracy analysis)
- Configure: Routing Inspector, Legacy Eval Runner, Agent Catalog
- Analytics entry added to sidebar navigation with BarChart3 icon
- recharts library added for chart visualizations
Quality gates: cargo fmt, clippy, tsc --noEmit all pass
- Add tool_analytics.rs: Extract tool usage metrics from existing messages DB using SQLite JSON functions (zero new instrumentation needed)
- Track per-tool call counts, error rates, success rates, extension breakdown
- Daily tool activity trends (calls, errors, unique tools over 30 days)
- Per-session tool summaries (calls, errors, top tool)
- Agent performance metrics (provider breakdown, session duration stats, active extensions, avg tools/messages per session)
- Add GET /analytics/tools and GET /analytics/tools/agents endpoints
- Add ToolAnalyticsTab.tsx: Full tool analytics UI with KPI cards, daily activity chart, sortable tool table, extension pie chart, session tool table, and agent performance panels
- Update AnalyticsView with 3-group tab architecture (Observe/Evaluate/Configure)
- Register all new types in OpenAPI and regenerate TypeScript client
…, visual DAG workflows
Proposes restructuring the desktop app from 9 sidebar items to 4 zones:
- Home (merged chat + workspace)
- Workflows (recipes + visual DAG builder + schedules)
- Observatory (analytics + monitoring + eval)
- Platform (extensions + agents + apps + settings)
Key innovations:
- Persistent context-adaptive prompt bar on every page
- Generative UI responses (inline charts, forms, tables in chat)
- Visual no-code DAG workflow builder (agent/tool graph editor)
- Slash commands for universal navigation and actions
Implementation phased: navigation restructure → adaptive prompt → DAG builder → live monitoring
…render research
Updated the design document with:
- DAG workflow format research (Argo, CWL, LangGraph, n8n, CrewAI, AutoGen)
- Proposed Goose Pipeline Format (YAML-based, extends recipe concept)
- Node types: trigger, agent, tool, condition, transform, human, subpipeline, a2a
- TOON format research for token-efficient data serialization
- json-render (Vercel) integration for generative UI
- Component catalog spec with Zod schemas
- Goose 3-tier architecture diagram
- Conversational vs DAG workflow comparison
- Implementation phases with effort estimates
- React Flow recommendation for visual DAG editor
Key decisions documented:
- Sessions are per-project, multi-session supported
- Two workflow types: conversational (.md/.yaml/.toon) and visual DAG (.yaml)
- No existing standard for agent DAG workflows; propose Goose Pipeline Format
- json-render for guardrailed generative UI, MCP Apps for complex custom UIs
- Desktop and CLI are the maintained interfaces
## Summary
- CodingAgent: reduced from 8 SDLC modes to 5 behavioral modes
  - Removed: pm, qa, security (overlap with dedicated agents)
  - Renamed: backend → code, sre+devsecops → devops
  - Added: debug (new behavioral mode)
  - Kept: architect, frontend
- New prompt templates: code.md, debug.md, devops.md
  Each follows 7-section persona structure: Identity → Expertise → Tools → Approach → Boundaries → Communication
- Intent router: improved scoring with absolute match bonus to prevent short keyword lists from outscoring specific agents
- Orchestrator: updated catalog, routing tests, serialization tests
- ACP discovery/IDE/agent_card: updated mode counts and assertions
- Architecture doc: docs/architecture/agent-persona-cleanup.md
  Full design with current-state analysis, proposed model, prompt design patterns, routing test matrix, implementation plan
All 222 agent tests pass. All 40 server tests pass. cargo fmt + clippy clean.
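To illustrate why an absolute match bonus helps, here is a small sketch that compares a pure ratio score with a ratio-plus-bonus score. It is not the IntentRouter implementation: the `MATCH_BONUS` weight, the substring matching, the mode slugs, and every function name below are assumptions made for the example.

```rust
/// Hypothetical weight added per matched keyword; the PR does not state the real value.
const MATCH_BONUS: f64 = 0.25;

/// Hypothetical, simplified view of a mode and its routing keywords.
pub struct ModeKeywords<'a> {
    pub slug: &'a str,
    pub keywords: &'a [&'a str],
}

/// Count how many keywords occur in the message (case-insensitive substring match).
pub fn matched_count(message: &str, keywords: &[&str]) -> usize {
    let msg = message.to_lowercase();
    keywords
        .iter()
        .filter(|k| msg.contains(k.to_lowercase().as_str()))
        .count()
}

/// Ratio-only score: a one-keyword list reaches 1.0 with a single hit.
pub fn ratio_score(message: &str, keywords: &[&str]) -> f64 {
    if keywords.is_empty() {
        return 0.0;
    }
    matched_count(message, keywords) as f64 / keywords.len() as f64
}

/// Ratio plus a bonus per absolute match, so several specific hits outweigh one generic hit.
pub fn bonus_score(message: &str, keywords: &[&str]) -> f64 {
    ratio_score(message, keywords) + MATCH_BONUS * matched_count(message, keywords) as f64
}

/// Pick the highest-scoring mode, or None when nothing matches at all.
pub fn best_mode<'a>(
    message: &str,
    modes: &[ModeKeywords<'a>],
    score: fn(&str, &[&str]) -> f64,
) -> Option<&'a str> {
    modes
        .iter()
        .map(|m| (m.slug, score(message, m.keywords)))
        .filter(|(_, s)| *s > 0.0)
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal))
        .map(|(slug, _)| slug)
}
```

With a generic mode whose only keyword is "test" and a QA mode with five keywords of which three match, the ratio score prefers the generic mode (1.0 vs 0.6), while the bonus score prefers the QA mode (1.25 vs 1.35).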
Consolidate 9 flat menu items into 4 semantic zones:
- Home Zone: Home + Chat with collapsible recent sessions
- Workflows: Recipes, Apps, Scheduler
- Observatory: Analytics, Agents
- Platform: Extensions, Settings
Each zone is a collapsible group with a header icon and label. Zones auto-expand when they contain the active route. Items within zones are indented for visual hierarchy. Part of UX Phase 1 (goose4-e5x).
…y zone
- Add ToolsHealthView component with KPI cards, daily activity chart, sortable tools table with health indicators, and extension breakdown
- Group sidebar sessions by project (working_dir) with collapsible project headers and folder icons
- Add Tools entry to Observatory navigation zone
- Add /tools route in App.tsx
The Tools Health page monitors tool execution health, success rates, and failure patterns across all extensions. Sessions in the sidebar are now grouped by project when spanning multiple working directories.
409f4a2 to f8e7762
…oposal
- Replace 'Extensions' concept with three Catalogs (Agents, Tools, Workflows)
- Document resolved decisions: sessions are project-bound, default General project
- Multi-session support for multi-tasking within projects
- Updated open questions to focus on registry, versioning, TOON adoption
- Catalog lifecycle: browse → install → configure → version → share
Phase 2 of UX redesign: persistent prompt bar across all pages.
- PromptBarContext: zone-aware context (home/chat/workflows/observatory/platform)
  - Maps URL paths to navigation zones
  - Provides zone-specific placeholder text and hints
  - Global slash commands: /new, /recipe, /model, /project
  - Zone-specific commands: /eval, /run, /schedule, /install, /settings
  - Session creation fallback via PROMPT_BAR_SUBMIT event
- PromptBar component: lightweight fixed-bottom input
  - Slash command autocomplete with keyboard navigation
  - Cmd+K (Ctrl+K) global focus shortcut
  - Auto-hides on /pair route (ChatInput handles it there)
  - Zone-aware hint text
  - Loading state during session creation
- AppLayout: wraps content with PromptBarProvider + renders PromptBar
  - Renders between ChatSessionsContainer and ReasoningDetailPanel
  - PromptBarProvider wraps inside SidebarProvider for theme access
Closes: goose4-cj2
…modes
Phase 2A + 2B of agent persona cleanup:
- Add UniversalMode enum (ask/plan/write/review/debug) as shared behavioral abstraction across all agents. Each mode defines base tool groups and ACP-compatible AgentMode conversion.
- Add DeveloperAgent using UniversalMode set, replacing CodingAgent's ad-hoc modes with proper behavioral stances.
- Add 5 developer prompt templates (ask.md, plan.md, write.md, review.md, debug.md) with 7-section structure (Identity/Mode/Tools/Approach/Boundaries).
- Register DeveloperAgent in IntentRouter (disabled, alongside CodingAgent for backward compat during transition).
- Register developer templates in prompt_template.rs.
Design:
- Agent = WHO (persona/role), orthogonal to Mode = HOW (behavioral stance)
- Aligned with ACP SessionMode, A2A AgentSkill, and Kilo Code custom modes
- UniversalMode is shared; agents add persona-specific extra tools per mode
All 282 tests pass (242 goose + 40 server). Zero clippy warnings.
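A rough sketch of the "shared mode plus persona-specific extras" idea from this commit. The five slugs come from the commit message; the tool group names ("read", "edit", "command", "browser"), the `ALL` constant, and the `tool_groups` helper are hypothetical and only illustrate the shape.

```rust
/// The five shared behavioral modes named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UniversalMode {
    Ask,
    Plan,
    Write,
    Review,
    Debug,
}

impl UniversalMode {
    pub const ALL: [UniversalMode; 5] = [
        UniversalMode::Ask,
        UniversalMode::Plan,
        UniversalMode::Write,
        UniversalMode::Review,
        UniversalMode::Debug,
    ];

    /// Slug used in routing and prompts (ask/plan/write/review/debug).
    pub fn slug(self) -> &'static str {
        match self {
            UniversalMode::Ask => "ask",
            UniversalMode::Plan => "plan",
            UniversalMode::Write => "write",
            UniversalMode::Review => "review",
            UniversalMode::Debug => "debug",
        }
    }

    pub fn from_slug(slug: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|m| m.slug() == slug)
    }

    /// Hypothetical base tool groups; the real group names are not given in the PR.
    pub fn base_tool_groups(self) -> &'static [&'static str] {
        match self {
            UniversalMode::Ask | UniversalMode::Plan | UniversalMode::Review => &["read"],
            UniversalMode::Write | UniversalMode::Debug => &["read", "edit", "command"],
        }
    }
}

/// Merge a mode's base groups with an agent's persona-specific extras, without duplicates.
pub fn tool_groups(mode: UniversalMode, extras: &[&'static str]) -> Vec<&'static str> {
    let mut groups: Vec<&'static str> = mode.base_tool_groups().to_vec();
    for extra in extras {
        if !groups.contains(extra) {
            groups.push(*extra);
        }
    }
    groups
}
```

In this reading an agent only declares what it adds on top of a mode, so the behavioral stance stays identical across personas.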
…tent prompt bar
Hub now listens for PROMPT_BAR_SUBMIT custom events dispatched by the PromptBar component. When a user types in the prompt bar on the home page, it creates a new session and navigates to the chat view, providing the same flow as typing directly in the Hub's ChatInput.
- Recipes → Workflows (under Workflows zone)
- Agents → Agent Catalog (under Observatory zone)
- Extensions → Tools Catalog (under Catalogs zone)
- Platform zone → Catalogs zone
- Updated tooltips to reflect new terminology
…lan/write/review)
- Refactored QaAgent, PmAgent, SecurityAgent, ResearchAgent to use UniversalMode enum
- Added Serialize derive to UniversalMode for serde compatibility
- Created 16 new prompt templates (4 agents × 4 modes: ask/plan/write/review)
- Updated intent_router scoring with agent-level description bonus (critical for universal modes where all agents share same mode keywords)
- Updated template registry to use new universal prompt paths
- All 275 tests pass (235 goose + 40 server), zero clippy warnings
- Create CatalogsOverview.tsx: unified entry page showing all 3 catalogs (Tools, Agents, Workflows) as cards with item counts and previews
- Add /catalogs route in App.tsx with CatalogsRoute component
- Make 'Catalogs' zone header clickable → navigates to /catalogs overview
- Zone labels with route prop show separate click target vs collapse chevron
- Cross-catalog search filtering
- Quick action buttons (add, import, browse registry) per catalog
- Status indicators for enabled/disabled items
Phase 5: Verify routing accuracy with new agent×mode model.
- 50 test cases across 6 agents (Goose, Coding, QA, PM, Security, Research)
- All modes use universal slugs (ask/plan/write/review/debug)
- Tagged with priority (p0/p1) and category for targeted analysis
- 3 ambiguity test cases for edge-case disambiguation
- Keyword router baseline: 42% agent accuracy (expected for fast-path)
- LLM orchestrator handles remaining routing via splitting.md prompt
- analytics-architecture.md: Backend modules, API endpoints, data flow, SQL patterns for tool analytics extraction, frontend component map
- navigation-architecture.md: 4-zone sidebar structure, routing map, prompt bar behavior, session grouping, design decisions, remaining work
…ge for ACP
Knowledge Graph:
- 1,651 entities (was 1,637), 2,339 relations (was 2,315)
- New entities: PolicyEngine, PolicyStore, PolicyRule, PolicyDecision, AuditLogger, AuditEvent, QuotaManager, control_plane routes, authorize_middleware, route_to_policy, UserIdentity.roles, AgentSlotRegistry SQLite, keyring session tokens, X-Api-Key decision
- New relations: middleware chain wiring, AppState composition, control plane management, RBAC checking, persistence
Non-Obvious Knowledge:
- authorize_middleware uses Extension<Arc<T>> not State<Arc<T>> (dual axum)
- #[allow(dead_code)] required for route-handler-only functions
- Middleware execution order and Extension layer ordering
- Policy engine evaluation semantics (first-match, Abstain=allow, RBAC OR)
- Quota hierarchical scope matching (usage per exact scope, not aggregated)
- AgentSlotRegistry SQLite schema
- Control plane API endpoint reference table
- route_to_policy action mapping examples
- Test count checkpoints (987 goose + 97 server = 1,084+)
- CLI session tokens now in keyring (migrate_legacy_token auto-migration)
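The policy evaluation semantics noted above (first-match, Abstain=allow, RBAC OR) could be read roughly as in the sketch below. This is one interpretation of three terse phrases, not the PolicyEngine code: the `Rule` fields, the prefix-based action matching, the convention that an empty role list applies to every user, and the function names are all assumptions.

```rust
/// Possible outcomes of a rule; a simplified stand-in for the real PolicyDecision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
    Abstain,
}

/// Hypothetical rule shape: applies to actions with a given prefix and to users
/// holding ANY of the listed roles (an empty list applies to every user).
pub struct Rule {
    pub action_prefix: &'static str,
    pub roles: &'static [&'static str],
    pub decision: Decision,
}

fn rule_matches(rule: &Rule, user_roles: &[&str], action: &str) -> bool {
    let action_ok = action.starts_with(rule.action_prefix);
    // "RBAC OR": one shared role is enough.
    let role_ok = rule.roles.is_empty()
        || rule.roles.iter().any(|needed| user_roles.contains(needed));
    action_ok && role_ok
}

/// "First-match": the first matching rule decides.
/// "Abstain=allow": no matching rule, or an explicit Abstain, lets the request through.
pub fn is_allowed(rules: &[Rule], user_roles: &[&str], action: &str) -> bool {
    let decision = rules
        .iter()
        .find(|rule| rule_matches(rule, user_roles, action))
        .map(|rule| rule.decision)
        .unwrap_or(Decision::Abstain);
    decision != Decision::Deny
}
```

Under this reading, rule order matters: a deny rule for guests placed before a general allow rule blocks any user who holds the guest role, even alongside other roles.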
…l-plane/v1
Avoid naming collision with existing ACP (Agent Communication Protocol) routes. The /acp/* prefix is used for IDE ↔ agent communication (Cursor, Windsurf). The /control-plane/v1/* prefix is now used for enterprise management API. Also adds control-plane route mappings to authorize_middleware's route_to_policy. Updates docs and knowledge graph references accordingly.
…ement from default rules)
The deny-guest-management policy rule is now extracted to a standalone guest_management_deny_rule() function. Multi-tenant deployments should add it explicitly via the control plane API when OIDC auth is configured. Local desktop mode no longer blocks config/management operations for guest users.
…ironment
- Local (default): no restrictions, permissive for solo desktop use
- Team: deny guest management on shared servers (secret key configured)
- Enterprise: full enforcement + deny guest execution (OIDC configured)
- Auto-detect from GOOSE_SECURITY_MODE env, or infer from OIDC/secret key presence
- PolicyStore.for_mode() factory replaces hardcoded default_rules()
- state.rs uses SecurityMode::detect() for auto-configuration
- 19/19 policy tests pass, clippy clean
…resh non-obvious knowledge for SecurityMode + UI cleanup
- UserAvatarMenu: persistent avatar/dropdown in top-right of every page. Shows guest icon or user initials, Settings link, Sign in/out
- AuthConfig in config.yaml: support preset providers (google, azure, github, gitlab, auth0, okta, aws) and custom OIDC configuration
- Server loads auth config at startup and auto-registers OIDC provider
- 6 unit tests for auth config parsing
- New WelcomePage component combining authentication and AI provider selection on a single page
- Two-step flow: Step 1 (auth) shows OIDC providers + API key in grid, Step 2 (provider) shows AI model providers in grid
- Responsive grid layout using auto-fill for horizontal space usage
- OIDC provider buttons with branded icons (Google, GitHub, Microsoft, etc.)
- Progress indicator and skip/get-started bottom bar
- Auth step auto-skipped when not required
- Replaces old WelcomeRoute (was just ProviderSettings wrapper)
…elcome.png
- Dark left sidebar (320px) with Goose logo, title, and panel switcher
- Right panel: auth grid or provider grid depending on active panel
- OIDC providers in responsive grid (auto-fill, 180px min)
- AI providers reusing existing ProviderCard in grid (auto-fill, 200px min)
- Panel switcher in sidebar shows auth/provider completion status
- Auth panel: OIDC buttons + API key form + skip option
- Provider panel: card grid + Get Started button
- Matches goose_welcome.png reference design
- SecurityMode::detect() no longer infers Team mode from GOOSE_SERVER__SECRET_KEY (always set by CLI when spawning goosed). Use GOOSE_TEAM_MODE=1 explicitly.
- Add telemetry/setup routes to route_to_policy as write:* (not manage:*)
- Local mode has NO deny rules: everything passes through for best solo UX
- Team/Enterprise modes still enforce guest restrictions
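A sketch of what mode detection might look like after this change. It is an illustration under assumptions, not the actual `SecurityMode::detect()`: the `detect_with` helper, the injected environment lookup, the `oidc_configured` flag, the accepted values of GOOSE_SECURITY_MODE, and the precedence order are all hypothetical. Only the environment variable names and the local/team/enterprise strings come from the commit messages.

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SecurityMode {
    Local,
    Team,
    Enterprise,
}

impl fmt::Display for SecurityMode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            SecurityMode::Local => "local",
            SecurityMode::Team => "team",
            SecurityMode::Enterprise => "enterprise",
        };
        f.write_str(name)
    }
}

impl SecurityMode {
    /// Hypothetical helper: `env` is an injected lookup so detection can be tested
    /// without touching the process environment.
    pub fn detect_with(env: impl Fn(&str) -> Option<String>, oidc_configured: bool) -> Self {
        // Assumed precedence 1: an explicit, recognized override wins.
        if let Some(value) = env("GOOSE_SECURITY_MODE") {
            match value.trim().to_lowercase().as_str() {
                "local" => return SecurityMode::Local,
                "team" => return SecurityMode::Team,
                "enterprise" => return SecurityMode::Enterprise,
                _ => {} // unrecognized value: fall through to inference
            }
        }
        // Assumed precedence 2: configured OIDC implies Enterprise.
        if oidc_configured {
            return SecurityMode::Enterprise;
        }
        // Team mode must be requested explicitly; GOOSE_SERVER__SECRET_KEY is
        // deliberately not consulted because the CLI always sets it.
        if env("GOOSE_TEAM_MODE").as_deref() == Some("1") {
            return SecurityMode::Team;
        }
        SecurityMode::Local
    }
}
```

A production caller could pass `|key| std::env::var(key).ok()` as the lookup; injecting it keeps the decision logic free of global state.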
- Remove 459 lines of inline onboarding UI from ProviderGuard
- ProviderGuard now only checks provider state and delegates to WelcomePage
- Dead code cleanup: removed unused imports, state, callbacks, setup handlers
- ProviderGuard: 526 → 84 lines
- ProviderGuard only checks provider state + redirects to /welcome
- No inline UI, no setup modals, no analytics, just check and redirect
- WelcomePage is the dedicated first-launch page at /welcome route
- Clear separation: guard (component) vs welcome (page)
- New AuthSection component in settings/auth/
- Uses design system: Button, Input, Separator, Switch, LoadingState
- Shows auth status, SSO providers grid, API key form
- Auth mode indicator (Enterprise vs Local)
- Wired into SettingsView as new 'auth' tab with Shield icon
…t, Separator)
- Replace 8 raw <button> with Button component (variant/size/disabled)
- Replace 1 raw <input> with Input component
- Add Separator between auth sections
- Use Loader2 spinner instead of raw SVG
- 420 lines → 310 lines, 0 raw HTML controls
- Matches design system audit recommendations
… dedicated folders
- pages/: WelcomePage, LoginView, LauncherView
- guards/: AuthGuard, ProviderGuard
- modals/: AnnouncementModal, TelemetryOptOutModal, SetupModal, OllamaSetup
- Updated all imports in App.tsx, App.test.tsx, TelemetrySettings, OllamaSetup.test
- Fixed App.test.tsx mocks for new component paths
- All 325+ tests pass (only pre-existing MarkdownContent failure)
Move 25 component files into atomic design subdirectories:
atoms/ (12 files, self-contained primitives): button, input, separator, switch, skeleton, scroll-area, Tooltip, tabs, Dot, Expand, Stop, icons
molecules/ (13 files, composed from atoms): card, dialog, collapsible, dropdown-menu, sheet, Select, sidebar, BackButton, BaseModal, ConfirmationModal, Diagnostics, JsonSchemaForm, RecipeWarningModal
design-system/ (13 files, unchanged, already organized)
Changes:
- Create atoms/index.ts and molecules/index.ts barrel exports
- Fix all ~100 consumer import paths across the codebase
- Fix internal cross-references (molecules → ../atoms/)
- Fix cn utility import depths (../../utils → ../../../utils)
- Fix design-system/LoadingState.tsx skeleton import path
No functional changes. 325 tests passing (1 pre-existing failure).
…nto domain dirs
Move all 40 root-level .tsx files from components/ into domain directories:
chat/ (10 files): BaseChat, ChatInput, ChatSessionsContainer, GooseMessage, UserMessage, ProgressiveMessageList, MessageQueue, MentionPopover, WelcomeState, LoadingGoose
messages/ (11 files + 1 test): ToolCallWithResponse, ToolCallArguments, ToolCallConfirmation, ToolCallStatusIndicator, ToolApprovalButtons, WorkBlockIndicator, ReasoningDetailPanel, MessageCopyLink, MCPUIResourceRenderer, ElicitationRequest, MarkdownContent
branding/ (4 files): GooseLogo, WelcomeGooseLogo, FlyingBird, AnimatedIcons
modals/ (2 files + 2 tests): ExtensionInstallModal, ParameterInputModal (+ OllamaSetup.test.tsx)
shared/ (8 files): ImagePreview, ItemIcon, SessionIndicators, ErrorBoundary, RecipeHeader, ApiKeyTester, UserAvatarMenu, GroupedExtensionLoadingToast
contexts/ (2 files moved to src/contexts/): ConfigContext, ModelAndProviderContext
Updated ~95 files: all consumer imports, test mock paths, and internal cross-references. Zero root .tsx files remain in components/.
Tests: 325 passing, 1 pre-existing failure (MarkdownContent)
tsc: clean (3 pre-existing warnings only)
Backend:
- Add security_mode field to AuthStatusResponse (auth_config.rs)
- Returns SecurityMode::detect().to_string() (local/team/enterprise)
- Regenerated OpenAPI spec + SDK types
Frontend:
- Rewrite AuthSection.tsx with 5 composable cards:
  - AuthStatusCard: shows current identity, auth method, mode badge
  - ApiKeyCard: set API key (all modes)
  - OidcCard: SSO login buttons (team/enterprise only)
  - EnterpriseCard: tenant/policy/quota info (enterprise only)
  - SignOutCard: logout button when authenticated
- Add securityMode to useAuth hook state + context
- Cards conditionally render based on detected security tier
- 210 → 195 lines, zero pre-existing warnings fixed
Replace status-card dashboard with a configuration-first UI:
- Card grid to pick auth method: No Auth / GitHub / GitLab / Azure AD / Google / Custom OIDC / API Key
- OIDC form: issuer URL, client ID, client secret + advanced (tenant/group claims)
- API Key form: shared secret input
- Auto-loads existing provider config on mount
- Green dot on configured methods
- Session banner with sign-out when authenticated
- Save/Sign-in action buttons with error/success feedback
…count SQL errors
Backend (tool_analytics.rs):
- Replace all 'message_count' column references with subquery JOINs against the messages table (sessions table has no message_count column)
- Affects get_response_quality (overall, daily trend, by-provider queries) and get_agent_performance (provider stats, avg stats)
- Use INNER JOIN for quality queries (only sessions with messages) and LEFT JOIN for performance queries (include empty sessions)
Frontend (ResponseQualityTab.tsx):
- Move useChartColors() hook above early returns to fix 'Rendered more hooks than during the previous render' violation
- React hooks must be called in the same order every render
Also: remove stale email reference in AuthSection.tsx
…smatch
SQLite's AVG() on integer columns (COUNT(*)) can return INTEGER when all values are identical, causing sqlx decode errors when Rust expects f64. Added CAST(... AS REAL) to all AVG() expressions over integer-derived columns in get_agent_performance and get_response_quality queries.
Replace misleading 'schema may not exist yet' error messages with specific handler names for easier debugging:
- get_agent_performance: 'Analytics agent performance query failed'
- get_response_quality: 'Analytics response quality query failed'
Root cause: MAX(created_timestamp) - MIN(created_timestamp) returns INTEGER in SQLite (timestamps are stored as unix epoch integers). Rust expects f64. Wrapping with CAST(... AS REAL) fixes the type mismatch. This was the actual 'column 0: f64 not compatible with INTEGER' error from get_agent_performance — the duration_stats query, not the AVG() queries.
…ith two-step Rust approach
The per-tool error query joined two json_each CTEs (18K tool_requests × 18K tool_responses) causing a 336M row cross-join that timed out.
New approach:
1. Fetch error response IDs (small set, ~193 rows, 9ms)
2. Fetch tool requests only from sessions that had errors
3. Match IDs in Rust using HashSet (O(n) lookup)
This makes the /analytics/tools endpoint respond in <1s instead of timing out.
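The in-memory matching step of this approach can be sketched as follows. The `ToolRequest` row shape and the `errors_per_tool` function are hypothetical; the sketch assumes the two SQL queries have already returned the error response IDs and the candidate tool requests.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical row shape for a tool request fetched from the messages table.
pub struct ToolRequest {
    pub id: String,
    pub tool_name: String,
}

/// Count errors per tool by checking each request ID against the set of
/// error-response IDs. Building the set and scanning the requests is linear
/// overall, since each HashSet lookup is constant time on average.
pub fn errors_per_tool(
    error_response_ids: &[String],
    requests: &[ToolRequest],
) -> HashMap<String, usize> {
    let error_ids: HashSet<&str> = error_response_ids.iter().map(String::as_str).collect();
    let mut counts = HashMap::new();
    for request in requests {
        if error_ids.contains(request.id.as_str()) {
            *counts.entry(request.tool_name.clone()).or_insert(0) += 1;
        }
    }
    counts
}
```

Compared with joining the two json_each result sets in SQL, this replaces a row-by-row cross product with a single pass over each list.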
The #[instrument] on dispatch_tool_call logged the entire Session struct
(name, extension_data, conversation) on every tool call. The session name
was stale ('New Chat') because the background naming task hadn't completed.
Fix: skip session in #[instrument], log only session_id as a span field.
When the assistant finishes generating a response, the ReasoningDetailPanel (work block side panel) now auto-closes so the final answer is visible in the main chat area. Previously, the panel stayed open after streaming ended, causing the analysis result to appear only in the side panel instead of the main chat. Added a useEffect that detects the isStreaming transition (true → false) and calls closeDetail() when the panel is active.
…x codebase
- Install @biomejs/biome 2.4.2 as dev dependency
- Configure biome.json with project-specific rules (a11y warnings, style warnings)
- Migrate ESLint rules via 'biome migrate eslint'
- Migrate Prettier settings via 'biome migrate prettier'
- Add npm scripts: lint, lint:check, format, format:check, check, check:ci
- Auto-fix 281 files: import sorting, useImportType, useConst, useOptionalChain, useTemplate, useNodejsImportProtocol
- Fix tsc errors introduced by auto-fix: hoisting issues in RecipesView, LocalModelManager; null-safe progress in AlertBox; optional fileContent in ImportRecipeForm; remove stale @ts-expect-error in generative-registry
- 0 errors, 411 warnings (all at warn level for incremental cleanup)
- tsc: clean (0 errors)
- Tests: 325 passing, 1 pre-existing failure
…inite loop
Biome's useExhaustiveDependencies auto-fix removed 'children' from the
auto-scroll useEffect dependency array. This caused the scroll area to
stop reacting to content changes and, more critically, triggered an
infinite setState loop in @radix-ui/react-scroll-area:
Error: Maximum update depth exceeded
at @radix-ui_react-scroll-area.js:71:66
at setRef
The EventEmitter memory leak (11 add-extension listeners) was a
secondary symptom — the ErrorBoundary recovery from the scroll crash
caused ExtensionInstallModal to re-mount repeatedly, accumulating
listeners.
The /config/pricing 404 is expected behavior — custom providers
(e.g. custom_claude_(azure)) have no canonical pricing data. The
frontend already handles this gracefully with pricingFailed state.
Add biome-ignore comment to prevent future auto-removal of the dep.
…o-scroll
The auto-scroll useEffect depended on the `children` prop, which is a new React element reference on every render. This caused:
1. useEffect firing every render (not just on content changes)
2. scrollTo() triggering Radix ScrollArea's internal setRef callback
3. setRef calling setState → re-render → new children ref → loop
4. 'Maximum update depth exceeded' crash
Fix: Replace children dependency with ResizeObserver on the viewport's scroll container. ResizeObserver fires only when content size actually changes, breaking the render loop while preserving auto-scroll behavior.
Before: useEffect([children, autoScroll]) → fires every render
After: ResizeObserver on scrollContent → fires on resize only
…vent infinite loop
The useRegisterSession hook had a single useEffect that both registered
(setter({...})) and unregistered (cleanup setter(null)) the session.
Its dependency array included unstable references (functions, arrays, objects)
that changed identity every render, causing:
render → effect fires → setter({...}) → provider re-renders →
BaseChat re-renders → new refs → cleanup setter(null) → re-render → loop
Split into two effects:
1. Register/unregister: depends only on sessionId + stableSubmit.
The cleanup setter(null) runs ONLY here — on session change or unmount.
2. Update fields: depends on primitive/length values only.
Uses functional updater setter(prev => ...) with no cleanup.
Also widened setSessionState type to accept functional updater pattern.
Dependency array uses .length for arrays to avoid identity-based re-runs.
CLI-via-Goosed: Unified Agent Architecture
Summary
This PR introduces a unified architecture where the CLI communicates with agents through `goosed` (the server binary), aligning desktop and CLI on a single communication path. It also adds multi-agent orchestration with an intent router, ACP/A2A protocol compatibility, and comprehensive UI improvements.
Key Changes
🏗️ Architecture: CLI-via-Goosed
- … `goosed` server instead of directly instantiating agents
- `GoosedClient` manages server lifecycle (spawn, health check, graceful shutdown)
- … `~/.config/goose/goosed.state`)
- `goose service install|uninstall|status|logs` for managed daemon lifecycle (systemd/launchd)
🤖 Multi-Agent System
- … `judge`, `planner`, `recipe_maker`) filtered from public discovery
📡 Protocol Compatibility
📊 Analytics & Observability
- `POST /analytics/routing/inspect`, `POST /analytics/routing/eval`, `GET /analytics/routing/catalog`
- `orchestrator.route`, `orchestrator.llm_classify`, `intent_router.route`
🖥️ UI Improvements
- `useChatStream` split into `streamReducer.ts` + `streamDecoder.ts` (860→576 lines)
🔒 Security & Reliability
- … `/runs` endpoints via `ServiceBuilder`
- `ErrorResponse` on all 11 bare `StatusCode` returns in `runs.rs`
- `ErrorResponse::bad_request()` and `conflict()` constructors added
Quality Gates
- `cargo build --all-targets`
- `cargo fmt --check`
- `cargo clippy --all-targets -- -D warnings`
- `cargo test -p goose --lib` (789 tests)
- `cargo test -p goose-server` (40 tests)
- `npx tsc --noEmit`
- `npx vitest run` (325/326, 1 pre-existing)
- `npx eslint`
New Test Coverage
Routing Evaluation Baseline
Files Changed
Follow-up Work