Conversation
- Implement hardware detection service (RAM, CPU, OS)
- Add Ollama manager for binary lifecycle and model downloads
- Create provider-agnostic LLM client abstraction layer
- Implement config service with JSON persistence
- Add IPC handlers for all LLM operations
- Create FirstRunWizard component for user onboarding
- Integrate LLM services into main process with auto-start
- Expose LLM API to renderer via preload bridge
- Add comprehensive documentation in LLM_INTEGRATION.md

Architecture:
- Hardware tier detection (insufficient/minimal/recommended/excellent)
- Model download with real-time progress tracking
- Provider-swappable design (future: OpenAI, Anthropic, Gemini)
- Config stored in app data JSON, secrets ready for encrypted DB

The integration bundles the Ollama binary with the installer and downloads Mistral 7B (~4GB) on first run, based on hardware check results.
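The hardware tier detection above can be sketched as a pure function. Only the four tier names come from this integration; the RAM/CPU thresholds and the function shape are illustrative assumptions (Mistral 7B is a ~4GB download, so roughly 8GB of RAM is treated as the floor here).

```typescript
// Hypothetical sketch of hardware tier detection. Tier names are from the
// commit; the numeric thresholds are assumptions for illustration only.
type HardwareTier = 'insufficient' | 'minimal' | 'recommended' | 'excellent'

function detectTier(totalRamGb: number, cpuCores: number): HardwareTier {
  if (totalRamGb < 8) return 'insufficient'          // cannot hold a 7B model
  if (totalRamGb < 12 || cpuCores < 4) return 'minimal'
  if (totalRamGb < 32) return 'recommended'
  return 'excellent'
}

console.log(detectTier(16, 8)) // → 'recommended'
```

The main process would feed values from Node's `os.totalmem()` and `os.cpus().length` into a function of this shape.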
- Create BackendConfiguration component with tabs (LocalDB, VectorDB, LLM, Automation)
- Implement one-click automatic LLM installation in the LLM tab
- Show hardware info, system requirements, and installation status
- Auto-install flow: Start Ollama -> Download Mistral 7B -> Verify
- Integrate into the Settings modal instead of a separate wizard
- Progress bar with real-time status updates
- Error handling and Ollama detection

This matches the intended UX: backend configuration lives in Settings, with a single click to install everything automatically.
- Enable LLM tab in BackendSwitcherNew component (previously disabled)
- Add LLM configuration UI with hardware info and status
- Implement one-click auto-install in the extension UI
- Add HTTP API endpoints for the extension to communicate with Electron:
  * GET /api/llm/status - Get Ollama and model status
  * GET /api/llm/hardware - Get system hardware info
  * POST /api/llm/start - Start Ollama server
  * POST /api/llm/download-model - Download Mistral 7B
- Show checkmark on LLM tab when ready
- Progress bar with real-time status in extension

Clicking the LLM tab in the extension now opens the full LLM setup and configuration UI with automatic installation.
- Modified BackendSwitcher.tsx (used in the sidepanel sidebar) instead of BackendSwitcherNew.tsx
- Enable LLM tab with the same functionality as implemented earlier
- Add LLM state management (status, hardware, install progress)
- Implement loadLlmStatus and handleAutoInstallLlm functions
- Show hardware info, system compatibility, and installation status
- One-click auto-install button with real-time progress
- Checkmark on LLM tab when ready
- Matches the purple-themed sidebar UI from the screenshot

The LLM tab in the extension sidebar is now clickable and fully functional.
…kend Configuration panel
…r service

Phase 1: Added Ollama as LLM provider option in all agent box forms
- Updated content-script.tsx sidebar agent box form
- Updated grid-script-v2.js display grid form
- Updated grid-script.js display grid form
- Added Ollama models: mistral:7b, llama3:8b, phi3:mini, mistral:14b

Phase 2: Created AgentExecutor service module
- Loads agent configuration from chrome.storage
- Loads agent box LLM settings
- Builds prompts from the agent reasoning section (goals, role, rules)
- Calls Ollama via the Electron app HTTP API
- Supports fallback to global/default LLM settings
Phase 3: Wire agent box execution and display
- Added execute button (▶) to agent boxes in sidepanel
- Import and integrate AgentExecutor service
- Execute button calls agent with page context
- Display results in agent box output area
- Show notifications for execution status

Phase 4: Agent box config persistence
- Verified save handlers in grid-script.js capture provider/model
- Verified save handlers in grid-script-v2.js capture provider/model
- Agent box configurations properly stored and retrieved

Phase 5: Error handling and fallbacks
- Added checkElectronConnection() to verify the app is running
- Added checkOllamaRunning() to verify Ollama status
- Added 60s timeout for LLM requests
- Improved error messages for network failures
- Improved error messages for timeout scenarios

All phases of o.plan.md implementation complete. Next: user testing of the agent → LLM → agent box flow.
This PR is being reviewed by Cursor Bugbot
```typescript
      }
    }
  })
}, [downloadDetails])
```
Bug: useEffect dependency on downloadDetails causes listener accumulation
The useEffect hook for listening to download progress has downloadDetails in the dependency array, causing the effect to re-run every time downloadDetails state updates. This creates a new event listener every time progress updates, accumulating listeners without cleanup. The listener should be set up only once on mount. The dependency array should be empty [] or the effect should include proper cleanup to remove the previous listener.
- Updated AgentExecutor.loadAgentConfig() to query SQLite via the orchestrator API
- Updated AgentExecutor.loadAgentBoxLLMSettings() to search sessions in SQLite
- Updated AgentExecutor.loadGlobalLLMSettings() to load from SQLite
- Removed chrome.storage dependency (storageGet import)
- All agent data is now correctly loaded from the default database (SQLite)
…l Publisher

Breaking change: Provider now represents the model publisher, not the runtime connector.

**New Structure (Option A):**
- Provider = model publisher (Mistral, Meta, Microsoft, OpenAI, Anthropic, Google, xAI)
- Model = specific model version (7b, 14b, llama-3-8b, gpt-4o-mini, etc.)
- Runtime = implicit, based on provider:
  - Mistral, Meta, Microsoft → use Ollama (local)
  - OpenAI, Anthropic, Google, xAI → use their APIs (cloud)

**Changes:**
- Updated all agent box forms (content-script.tsx, grid-script-v2.js, grid-script.js)
- Updated AgentExecutor to determine runtime from provider
- Added getOllamaModelName() to map provider + model to Ollama format:
  - Mistral + 7b → mistral:7b
  - Meta + llama-3-8b → llama3:8b
  - Microsoft + phi-3-mini → phi3:mini
- Cloud API providers show a helpful error message (not yet implemented)

**Why:** Ollama is a connector/runtime, not a provider. It can run models from multiple publishers locally. This structure correctly represents the architecture and allows future cloud API integration.
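The provider→runtime rule and the `getOllamaModelName()` mapping can be sketched directly from the examples in the commit message. The function shapes and the error text are assumptions; the provider lists and the three model-tag mappings come from the commit itself.

```typescript
// Sketch of the provider→runtime split (Option A). Shapes are illustrative.
type Provider = 'mistral' | 'meta' | 'microsoft' | 'openai' | 'anthropic' | 'google' | 'xai'

const LOCAL_PROVIDERS: Provider[] = ['mistral', 'meta', 'microsoft']

// Runtime is implicit: local publishers run via Ollama, the rest via cloud APIs.
function runtimeFor(provider: Provider): 'ollama' | 'cloud' {
  return LOCAL_PROVIDERS.includes(provider) ? 'ollama' : 'cloud'
}

// Map (provider, model) pairs to Ollama tags, per the examples in the commit.
const OLLAMA_TAGS: Record<string, string> = {
  'mistral/7b': 'mistral:7b',
  'meta/llama-3-8b': 'llama3:8b',
  'microsoft/phi-3-mini': 'phi3:mini',
}

function getOllamaModelName(provider: Provider, model: string): string {
  const tag = OLLAMA_TAGS[`${provider}/${model}`]
  if (tag === undefined) throw new Error(`No local mapping for ${provider}/${model}`)
  return tag
}

console.log(runtimeFor('meta'))                        // → 'ollama'
console.log(runtimeFor('openai'))                      // → 'cloud'
console.log(getOllamaModelName('meta', 'llama-3-8b'))  // → 'llama3:8b'
```

Keeping the mapping in one table makes adding a cloud branch later a matter of checking `runtimeFor()` before choosing the HTTP target.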
Created complete wiring system for agent input routing and output display:

**New Services:**
1. InputCoordinator - routes input events to matching agents
   - Finds agents with enabled listener sections
   - Matches by patterns, input sources, input types
   - Falls back to default agent (agent01) if no matches
   - Calls AgentExecutor for each matched agent
2. OutputCoordinator - routes LLM output to the correct Agent Box
   - Loads agent execution section config
   - Resolves target agent box (explicit, matching, fallback)
   - Appends content to agent box (append/replace modes)
   - TODO stubs for session data updates and UI refresh
3. AgentExecutor - enhanced with runAgentExecution()
   - New method for coordinator integration
   - Builds prompts from input events
   - Uses LlmClient for HTTP bridge to Electron
   - Returns standardized AgentExecutionResult
4. LlmClient - HTTP wrapper for Electron LLM API
   - Sends chat completion requests to Electron
   - 60s timeout with helpful error messages
   - Checks Electron/Ollama availability

**Configuration Types:**
- ListenerSectionConfig - controls agent input filtering
- ExecutionSectionConfig - controls output routing
- InputEventPayload - normalized input from command/DOM
- SystemAgentConfig - marks system agents
- Updated AgentConfig with new sections

**System Agents:**
- Auto-create Input Coordinator and Output Coordinator on session load
- Marked with isSystemAgent: true
- systemAgentType: 'input_coordinator' | 'output_coordinator'
- Stored in SQLite as special agent configs

**Integration:**
- Command chat in sidepanel now routes through InputCoordinator
- Async handling with error messages
- Placeholder assistant responses while processing

**TODO Comments for Future:**
- Load all agents from SQLite for pattern matching
- Load session agent boxes for output routing
- Update agent box in session data and emit UI refresh
- Session creation triggering system agent setup

All coordinator services use the singleton pattern for easy access.
Changed behavior when no matching agents with enabled listeners are found:

**Before (WRONG):**
- No matches → fall back to agent01 always

**After (CORRECT):**
- No matches with listeners → check for agents WITHOUT listener sections
- If agents without listeners exist (always-on agents) → execute those
- If NO agents without listeners → do NOT forward input at all

**Logic:**
1. Find agents with enabled listener sections that match input patterns
2. If matches found → execute those agents
3. If NO matches found:
   a. Find agents WITHOUT listener section enabled (reasoning/execution only)
   b. If such agents exist → execute those (always-on agents)
   c. If NO such agents exist → do nothing (no forwarding)

**Why:**
- Agents without listener sections are 'always-on' - they process any input
- Agents with listener sections only process matching input
- If no one wants the input (no matches, no always-on), don't force execution

**New Methods:**
- findAgentsWithoutListeners() - finds always-on agents
- executeAgents() - extracted common execution logic

Removed the getDefaultAgent() method, as it implemented the incorrect fallback behavior.
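The corrected selection rule can be sketched as a single function. The `Agent` shape and substring-based pattern matching are simplifying assumptions; the precedence (listener matches, then always-on agents, then nothing) is the behavior described above.

```typescript
// Sketch of the routing rule: listener-matched agents win; otherwise
// always-on agents (no enabled listener section); otherwise nobody.
interface Agent {
  id: string
  listener?: { enabled: boolean; patterns: string[] }
}

function selectAgents(agents: Agent[], input: string): Agent[] {
  const matched = agents.filter(a =>
    a.listener !== undefined &&
    a.listener.enabled &&
    a.listener.patterns.some(p => input.includes(p))
  )
  if (matched.length > 0) return matched
  // No listener matches: fall through to always-on agents only.
  return agents.filter(a => !a.listener?.enabled)
}

const agents: Agent[] = [
  { id: 'agent01' }, // always-on: no listener section
  { id: 'agent02', listener: { enabled: true, patterns: ['summarize'] } },
]
console.log(selectAgents(agents, 'summarize this page').map(a => a.id)) // → ['agent02']
console.log(selectAgents(agents, 'hello').map(a => a.id))               // → ['agent01']
```

An empty result means the input is simply not forwarded (a later commit adds a direct-LLM fallback for that case).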
InputCoordinator now returns agent responses for chat display:
- handleInputEvent() returns string | null (response content)
- executeAgents() collects all agent responses
- Multiple agent responses separated by '---'
- Returns null if no agents processed input

Sidepanel command chat updated:
- Displays actual LLM response from agents
- Shows 'No agents available' if the response is null
- Response appears in chat as an assistant message
- ALSO routes to agent box via OutputCoordinator

Flow:
1. User types in command chat
2. InputCoordinator finds matching agents
3. Agents execute via LLM
4. Response shown in chat immediately
5. Response also routed to agent box (if configured)

Benefits:
- User sees immediate feedback in chat
- Agent box gets output for persistent display
- Dual output: chat (conversation) + agent box (workspace)
Fixed agent loading issues:

1. findAgentsWithoutListeners() now properly scans agents 01-10
   - Checks whether each agent exists in SQLite
   - Filters for agents with reasoning capability
   - Only includes agents WITHOUT enabled listener sections
   - Returns all matching always-on agents
2. Added direct LLM fallback when NO agents exist
   - If no matching agents AND no always-on agents
   - Calls the LLM directly with just the user input
   - No agent instructions/rules applied
   - Gives a standard model response
3. Flow is now:
   a. Try agents with matching listener patterns
   b. If none, try agents without listeners (always-on)
   c. If none, direct LLM call (standard response)

Why this fixes the issue:
- The user's agent01 (no listener enabled) will now be found
- The agent will execute and output to its agent box
- If no agents are configured, the user still gets an LLM response

Default chat behavior:
- When no agents exist: standard model response (no rules)
- When agents exist: follows agent instructions
Updated OutputCoordinator behavior:

**Before:**
- If executionSection.targetOutputAgentBoxId not set → no routing
- User had to explicitly configure where output goes

**After:**
- If executionSection.targetOutputAgentBoxId not set → use default (same agent's box)
- executionSection only needed to CHANGE default behavior
- Always routes output, even if execution section not configured

**Logic:**
1. Check for explicit target (user wants non-default behavior)
2. If none, default to the agent box with the matching agent number
3. If no matching box, fall back to the first available box

**Why:**
- 'Respond to' selection is optional - for changing defaults
- Default behavior: agent outputs to its own agent box
- User only configures when they want cross-agent routing

Example:
- Agent 01 with no 'Respond to' configured → outputs to Agent Box 01
- Agent 01 with 'Respond to: Agent Box 03' → outputs to Agent Box 03
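The three-step precedence can be sketched in a few lines. Field names (`id`, `agentNumber`) are assumptions; the precedence order is taken from the commit message.

```typescript
// Sketch of the default output routing: explicit 'Respond to' target wins,
// else the agent's own box, else the first available box.
interface AgentBox { id: string; agentNumber: string }

function resolveTargetBox(
  agentNumber: string,
  explicitTargetId: string | undefined,
  boxes: AgentBox[],
): AgentBox | undefined {
  // 1. Explicit target: the user asked for non-default routing.
  if (explicitTargetId !== undefined) {
    const explicit = boxes.find(b => b.id === explicitTargetId)
    if (explicit !== undefined) return explicit
  }
  // 2. Default: the box with the agent's own number.
  // 3. Fallback: the first available box.
  return boxes.find(b => b.agentNumber === agentNumber) ?? boxes[0]
}

const boxes: AgentBox[] = [
  { id: 'box-a', agentNumber: '01' },
  { id: 'box-b', agentNumber: '03' },
]
console.log(resolveTargetBox('01', undefined, boxes)?.id) // → 'box-a' (own box)
console.log(resolveTargetBox('01', 'box-b', boxes)?.id)   // → 'box-b' (explicit)
console.log(resolveTargetBox('07', undefined, boxes)?.id) // → 'box-a' (fallback)
```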
…e button from sidepanel
- Implement resolveTargetAgentBox() to find matching agent boxes in session
- Implement loadSessionData() to load session from SQLite via HTTP API
- Implement appendToAgentBox() to update agent box output via chrome message
- Add AGENT_BOX_OUTPUT_UPDATE message handler in sidepanel
- Add handleAgentBoxOutputUpdate() to persist output changes to SQLite
- Default routing: output goes to the same agent's box if no explicit target set
- Rebuild Electron app with /api/llm/chat endpoint

Fixes:
- HTTP 404 error for /api/llm/chat
- Agent box output not displaying when agent has only Reasoning + Execution sections active
- Output routing defaults correctly when 'Respond to' is not selected
- Increase timeout to 120 seconds (2 minutes) for slow models
- Add Ollama readiness check before processing requests
- Add detailed logging for LLM calls
- Initialize LLM client on each request to ensure config is current
- Return 503 if Ollama is not ready

WIP: investigating agent box positioning issue on session restore
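A request guard like the 120-second limit above can be written as a generic timeout wrapper. The helper itself is plain TypeScript; where it is attached (around the chat request inside the `/api/llm/chat` handler) is an assumption about the project's structure.

```typescript
// Generic timeout guard in the spirit of the 120s request limit.
// Rejects with a labeled error if `work` does not settle within `ms`.
async function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms)
  })
  try {
    return await Promise.race([work, timeout])
  } finally {
    if (timer !== undefined) clearTimeout(timer) // avoid a dangling timer
  }
}
```

Usage would look like `await withTimeout(llmClient.chat(messages), 120_000, 'LLM request')`, where `llmClient.chat` is a hypothetical name for the underlying Ollama call; the labeled error message is what surfaces to the user when a slow model exceeds the limit.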
CRITICAL BUG FIX:
- handleAgentBoxOutputUpdate was loading the entire session from SQLite and saving it back
- This was corrupting the session data structure on every LLM output
- Agent boxes were losing their position, tabIndex, and other fields
- Sessions were becoming malformed after any LLM interaction

SOLUTION:
- Changed handleAgentBoxOutputUpdate to ONLY update UI state (optimistic update)
- No longer touches SQLite during output updates
- Agent box outputs are ephemeral by design (cleared on reload)
- Session structure is now preserved correctly
- Only structural changes (add/remove/move boxes) trigger SQLite saves

This fixes:
- Agent boxes not showing in master tabs after session reload
- Display grid behavior changes
- Session management corruption
- InputCoordinator now properly detects agents with Reasoning AND Execution enabled
- Added detailed logging for always-on agent detection
- Checks both capabilities and listener section status
- Better console output for debugging agent routing

Still TODO:
- Add content script listener for UPDATE_AGENT_BOX_OUTPUT message
- Optimize LLM performance (high CPU usage)
- Test agent box output display
DRASTIC PERFORMANCE IMPROVEMENTS:
- Reduced maxTokens from 2000 → 500 (4x faster)
- Reduced temperature from 0.7 → 0.3 (more focused, faster)
- Added num_ctx: 2048 (context window limit)
- Limited num_thread: 4 (prevent CPU overload)
- Added top_k: 20, top_p: 0.9 (faster sampling)
- Added repeat_penalty: 1.1 (prevent loops)
- Reduced timeout from 120s → 60s

These settings should prevent:
- 100% CPU usage
- System freezing
- Signal timeouts
- Long wait times

BEFORE: system freezes, 100% CPU, 2+ minute responses
AFTER: ~15-30s responses expected, manageable CPU usage

Note: Mistral 7B is still heavy. If issues persist, recommend switching to:
- phi-3-mini (3B parameters, much faster)
- or using a remote API (OpenAI/Claude)
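The tuned values map onto the `options` object of an Ollama chat/generate request; `num_predict` is Ollama's name for the max-token cap (the `maxTokens: 500` above). The exact request shape here is a sketch, but the option keys are standard Ollama parameters.

```typescript
// Ollama generation options matching the tuned values above.
const generationOptions = {
  num_predict: 500,     // was 2000: 4x fewer tokens to generate
  temperature: 0.3,     // was 0.7: more focused, faster sampling
  num_ctx: 2048,        // cap the context window
  num_thread: 4,        // limit threads to prevent CPU overload
  top_k: 20,            // faster sampling
  top_p: 0.9,
  repeat_penalty: 1.1,  // discourage repetition loops
}

// Sent as the `options` field of an Ollama request body (sketch):
const body = JSON.stringify({
  model: 'mistral:7b',
  messages: [{ role: 'user', content: 'hello' }],
  options: generationOptions,
})
```

Lowering `num_predict` and `num_ctx` is what bounds the work per request; `num_thread` trades latency for leaving CPU headroom for the rest of the system.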
CRITICAL CHANGES:
- Default model changed from mistral:7b to mistral:7b-instruct-q4_0 (75% smaller)
- Added model deletion API (deleteModel)
- Added detailed model info API (getModelDetails)
- Improved hardware detection to check FREE RAM, not just total
- Added model recommendations based on actual available RAM
- Added 6 opensource models with quantized variants

NEW APIS:
- DELETE /api/llm/model - delete installed model
- GET /api/llm/models - get detailed model information
- IPC: llm:getModelDetails, llm:deleteModel

HARDWARE CHECK IMPROVEMENTS:
- Now checks free RAM (more accurate)
- Recommends tinyllama for <2GB free
- Recommends phi3:mini for 2-4GB free
- Recommends mistral Q4 for 4-6GB free
- Recommends mistral Q5 for 6-10GB free
- Only recommends full models for 10GB+ free

MODEL CONFIGS:
- mistral:7b-instruct-q4_0 (4GB RAM, 2.6GB disk)
- mistral:7b-instruct-q5_K_M (5GB RAM, 3.2GB disk)
- mistral:7b (8GB RAM, 4.1GB disk)
- phi3:mini (2GB RAM, 2.3GB disk)
- tinyllama (1GB RAM, 0.6GB disk)
- llama3:8b (8GB RAM, 4.7GB disk)
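The free-RAM recommendation ladder above translates into a small pure function. The thresholds and model tags come straight from the commit; only the function shape is an illustrative assumption.

```typescript
// Recommendation thresholds exactly as listed in the hardware check notes.
function recommendModel(freeRamGb: number): string {
  if (freeRamGb < 2) return 'tinyllama'
  if (freeRamGb < 4) return 'phi3:mini'
  if (freeRamGb < 6) return 'mistral:7b-instruct-q4_0'    // new default
  if (freeRamGb < 10) return 'mistral:7b-instruct-q5_K_M'
  return 'mistral:7b' // full models only with 10GB+ free
}

console.log(recommendModel(5)) // → 'mistral:7b-instruct-q4_0'
```

Using *free* rather than total RAM is the key change: a 16GB machine with 3GB free gets `phi3:mini` recommended instead of an unusable full 7B model.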
- Added installedModels state and selectedModel
- Added handleDeleteModel function
- Added handleInstallModel function with progress tracking
- Updated loadLlmStatus to fetch installed models
- Added availableModels list with 6 opensource options
- Simplified handleAutoInstallLlm to use the new functions

Next: replace the entire LLM tab UI with a model selection dropdown and delete buttons
MAJOR UI CHANGES (Backend Configuration > LLM Tab):
✅ Shows FREE RAM (more accurate than total)
✅ Hardware-based model recommendations
✅ List of installed models with delete buttons
✅ Model selection dropdown with 6 opensource options
✅ Marks recommended models with ⭐
✅ Shows which models are already installed
✅ Real-time download progress
✅ Model descriptions and requirements

NEW FEATURES:
- Delete models to free disk space
- Install any opensource model (TinyLlama, Phi-3, Mistral Q4/Q5/Full, Llama 3)
- See actual hardware capabilities (free vs total RAM)
- Get personalized model recommendations
- Better warnings for low-end hardware

MODEL OPTIONS:
1. TinyLlama (1GB RAM, 0.6GB disk) - ultra fast
2. Phi-3 Mini (2-3GB RAM, 2.3GB disk) - very fast
3. Mistral 7B Q4 (4GB RAM, 2.6GB disk) - default ⭐
4. Mistral 7B Q5 (5GB RAM, 3.2GB disk) - better quality
5. Mistral 7B Full (8GB RAM, 4.1GB disk) - best quality
6. Llama 3 8B (8GB RAM, 4.7GB disk) - alternative

This addresses the user's critical issue: hardware was shown as compatible, but the full Mistral 7B was too heavy. The system now accurately detects FREE RAM and recommends appropriate quantized models.
- Removed unused 'req' parameter in GET /api/llm/models
- Removed unused 'effectiveRamGb' variable in hardware check
- Removed unused 'totalRamGb' parameter in generateWarnings

All TypeScript errors resolved; ready to rebuild.
…tching

REDESIGNED per user request:
- Installer shows ADVISORY recommendations, never forces a choice
- All models selectable with color-coded compatibility indicators
- User has the freedom to choose any model
- One-click model switching in Backend Configuration
- Auto-wires selected model to orchestrator config

INSTALLER improvements:
- 6 opensource models (TinyLlama, Phi-3, Mistral Q4/Q5/Full, Llama 3)
- Color-coded: green (works well), yellow (may be slow), red (not recommended)
- Recommended tag based on actual free RAM
- Advisory messages explain impact on performance
- NEVER disables options - user can override recommendations
- Auto-updates config after installation

BACKEND CONFIGURATION improvements:
- Shows ACTIVE model with a badge
- 'Use This' button to switch models instantly
- Auto-updates orchestrator config when switching
- Cannot delete the active model (safety)
- Visual highlight on the active model

NEW API: POST /api/llm/config - update model selection

This gives users freedom while providing intelligent guidance.
CRITICAL FIX:
- Extension was using the old BackendSwitcher.tsx without the model management UI
- Replaced with BackendSwitcherNew.tsx content
- Now shows installed models with delete/switch buttons
- Model selection dropdown with 6 opensource options
- Hardware-based recommendations

USER-REPORTED ISSUE: the LLM tab in the sidepanel was still showing the old UI after rebuild. This was because the sidepanel uses BackendSwitcher.tsx, not BackendSwitcherNew.tsx.

SOLUTION: copied the complete UI from BackendSwitcherNew to BackendSwitcher so both components have the same functionality.

NOW WORKING:
- Delete existing models (mistral:7b)
- Install new models (quantized versions)
- Switch the active model
- Auto-wires to orchestrator
USER-REQUESTED CHANGES:
1. Allow deleting the active model (removed restriction)
2. Added 4 high-end opensource models to the selection

NEW HIGH-END MODELS:
- Llama 3.1 8B (~4.7GB, 8GB RAM)
- Mixtral 8x7B (MoE) (~26GB, 32GB RAM)
- Llama 3.1 70B (~40GB, 64GB RAM)
- Qwen 2 72B (~41GB, 64GB RAM)

CHANGES:
- Removed isActive check from delete button
- Users can now delete any model, including the active one
- Added 4 new models to availableModels list
- Updated MODEL_CONFIGS in config.ts
- Updated LlmSetupWizard with the new models

TOTAL MODELS NOW: 10
- Low-end: TinyLlama, Phi-3 Mini
- Mid-range: Mistral Q4, Q5, Full
- Standard: Llama 3 8B, Llama 3.1 8B
- High-end: Mixtral 8x7B, Llama 3.1 70B, Qwen 2 72B
Fixed missing function declaration for generateWarnings that was accidentally removed during earlier refactoring.
```typescript
    console.error('[HTTP] Error updating LLM config:', error)
    res.status(500).json({ ok: false, error: error.message })
  }
})
```
Bug: LLM config update doesn't persist to disk
The HTTP endpoint for updating LLM configuration calls llmConfigService.update() instead of llmConfigService.save(). According to the config.ts file, update() only modifies the config in memory and does not persist changes to the file. The configuration changes will be lost when the app restarts. The IPC handler in ipc.ts correctly uses save(), but the HTTP endpoint does not. This should call await llmConfigService.save({ modelId }) to ensure changes are persisted.
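The `update()`/`save()` split can be sketched as follows. The service shape is an assumption reconstructed from the review comment, not the project's actual `config.ts`, but it shows why calling only `update()` loses changes on restart.

```typescript
// Sketch of the in-memory vs persisted config split described in the review.
import { promises as fs } from 'fs'
import * as os from 'os'
import * as path from 'path'

class LlmConfigService {
  private config: Record<string, unknown> = {}

  constructor(private readonly file: string) {}

  // In-memory only: callers must still save() to persist.
  update(patch: Record<string, unknown>): void {
    Object.assign(this.config, patch)
  }

  // Applies an optional patch, then writes the config file to disk.
  async save(patch: Record<string, unknown> = {}): Promise<void> {
    Object.assign(this.config, patch)
    await fs.writeFile(this.file, JSON.stringify(this.config, null, 2))
  }
}

// The HTTP handler should therefore end with:
//   await llmConfigService.save({ modelId })
// rather than only llmConfigService.update({ modelId }).
```

With this split, the IPC handler's `save()` survives restarts while the HTTP endpoint's bare `update()` does not, which is exactly the inconsistency the review flags.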
…ures

- Restored original BackendSwitcher.tsx from commit a653cb7
- Preserved original theme colors and DBeaver buttons in the Local DB tab
- Added ONLY LLM-specific features to the LLM tab:
  * Display installed models with delete option (including the active model)
  * Model selection dropdown with 10 opensource options
  * Real-time download progress
  * Hardware-based recommendations with free RAM display
  * Allow deletion of the currently active model
- All other tabs (Local DB, etc.) remain unchanged
```typescript
  } catch (error) {
    console.error('[MAIN] Error initializing LLM services:', error)
  }
```
Bug: LLM config update not persisted, config load not awaited before client initialization
The HTTP POST /api/llm/config endpoint (line 1195) awaits llmConfigService.update({ modelId }), but update() only modifies the in-memory config; save() is never invoked, so the change is lost when the app restarts. Separately, llmConfigService.load() is not awaited before the LLM client is initialized, creating a race condition where initialization can run before the config file has actually been read from disk. The correct flow is to await load() during startup and, in the endpoint, call update() followed by await save().
Ollama and Mistral 7B are integrated into the installer and the sidebar UI.
Note
Adds local LLM support (Ollama/Mistral) with Electron services and HTTP/IPCs, a first-run setup wizard and model management UI, plus extension-side agent execution and routing wired to the new LLM API.
- `electron/main/llm/*` services: hardware checks, config, Ollama lifecycle, provider-agnostic client, and IPC.
- `electron/main.ts`: initialize LLM services, auto-start Ollama, and expose HTTP endpoints (`/api/llm/*`: status, hardware, start, download, delete model, list models, chat, config).
- `preload.ts`: expose `window.llmAPI` (check/status/start/download/chat/config/progress).
- `LlmSetupWizard` and `FirstRunWizard`; integrate in `App.tsx` (first-run flow, reopen from Settings).
- `BackendConfiguration` with LLM tab: hardware/status view, install/switch/delete models, progress.
- `InputCoordinator`, `OutputCoordinator`, `AgentExecutor`, `llm/LlmClient`, and shared types.
- Agent box forms (`grid-script*.js`, `content-script.tsx`).
- `LLM_INTEGRATION.md`, `LLM_WIZARD_GUIDE.md`, implementation and testing docs for agents and the coordination layer.

Written by Cursor Bugbot for commit f682c89. This will update automatically on new commits.