feat: enhance Ollama model handling with cloud support (#65)
- Updated the Ollama model types to include `OllamaModelInfo` and `OllamaModelKind`.
- Modified the DashboardPage to use the new model structure for a better representation of available models.
- Implemented cloud model discovery and caching in the new `cloud` module, allowing seamless integration of cloud models into the existing workflow.
- Added fallback logic for cloud model rate limits: when a cloud model is throttled, the app automatically switches to the last used local model.
- Enhanced the AppState to track the last local model for improved error handling during cloud interactions.
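The fallback behavior in the list above can be sketched in a few lines. This is a minimal, dependency-free illustration; the `ChatError` and `pick_fallback` names are hypothetical stand-ins, not the PR's actual API:

```rust
// Illustrative sketch of the rate-limit fallback described above.
// `ChatError` and `pick_fallback` are hypothetical names, not the PR's API.

#[derive(Debug, PartialEq)]
enum ChatError {
    RateLimited,
    Other(String),
}

/// On a cloud rate-limit error, fall back to the last used local model if
/// one is recorded; any other error (or no known local model) yields None
/// so the caller can propagate the original failure.
fn pick_fallback(err: &ChatError, last_local: Option<&str>) -> Option<String> {
    match (err, last_local) {
        (ChatError::RateLimited, Some(local)) => Some(local.to_string()),
        _ => None,
    }
}

fn main() {
    assert_eq!(
        pick_fallback(&ChatError::RateLimited, Some("llama3.2:3b")),
        Some("llama3.2:3b".to_string())
    );
    // A non-rate-limit error never triggers the downgrade.
    assert_eq!(
        pick_fallback(&ChatError::Other("boom".into()), Some("llama3.2:3b")),
        None
    );
    println!("fallback sketch ok");
}
```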
📝 Walkthrough

The PR adds cloud model discovery to Ollama by scraping ollama.com, classifies models as Local or Cloud, and implements automatic fallback to local models when cloud models hit rate limits. It updates the HTTP API, state, agent logic, and frontend components to support model categorization throughout the stack.
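The Local/Cloud classification the walkthrough mentions can be sketched as follows. This assumes membership in the discovered cloud catalog is the deciding signal (consistent with the service-layer code discussed later in the review); the type names mirror the PR's `ModelInfo`/`ModelKind` but the helper is illustrative:

```rust
// Sketch of Local vs Cloud classification. Assumption: a model is Cloud
// when its name appears in the discovered cloud catalog; `classify` is an
// illustrative helper, not the PR's actual function.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ModelKind {
    Local,
    Cloud,
}

#[derive(Debug, PartialEq)]
struct ModelInfo {
    name: String,
    kind: ModelKind,
}

/// Mark a model Cloud when it appears in the discovered cloud list,
/// otherwise Local.
fn classify(name: &str, cloud_names: &[&str]) -> ModelInfo {
    let kind = if cloud_names.contains(&name) {
        ModelKind::Cloud
    } else {
        ModelKind::Local
    };
    ModelInfo {
        name: name.to_string(),
        kind,
    }
}

fn main() {
    let cloud = ["gpt-oss:120b-cloud"]; // example catalog entry
    assert_eq!(classify("gpt-oss:120b-cloud", &cloud).kind, ModelKind::Cloud);
    assert_eq!(classify("llama3.2:3b", &cloud).kind, ModelKind::Local);
}
```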
Sequence Diagram

```mermaid
sequenceDiagram
    participant App as App Startup
    participant CloudModule as cloud.rs
    participant Cache as Model Cache
    participant OllamaWebsite as ollama.com
    participant Catalog as ModelCatalog
    participant Agent as Agent
    participant ChatService as chat_with_tools
    participant OllamaLocal as Ollama Local/Cloud

    App->>CloudModule: spawn list_cloud_models()
    CloudModule->>Cache: check cache (TTL: 1h)
    alt Cache Valid
        Cache-->>CloudModule: return cached models
    else Cache Expired/Missing
        CloudModule->>OllamaWebsite: scrape cloud search page
        OllamaWebsite-->>CloudModule: HTML with model slugs
        loop Per-slug concurrent requests
            CloudModule->>OllamaWebsite: fetch model detail page
            OllamaWebsite-->>CloudModule: extract cloud models
        end
        CloudModule->>Cache: store discovered models
    end
    CloudModule-->>App: models discovered (fire-and-forget)

    App->>Catalog: build model_catalog()
    Catalog->>CloudModule: request cloud models
    CloudModule-->>Catalog: return discovered models
    Catalog-->>App: ModelInfo[] with Local/Cloud kinds

    User->>Agent: send chat request
    Agent->>Agent: select cloud model
    Agent->>ChatService: call chat_with_cloud_fallback()
    ChatService->>OllamaLocal: attempt cloud model chat
    alt Success
        OllamaLocal-->>ChatService: response
        ChatService-->>Agent: return result
    else Rate-Limit Error
        ChatService->>Catalog: retrieve last_local_model
        Catalog-->>ChatService: local fallback model
        ChatService->>Agent: log downgrade
        Agent->>Agent: update state.preferred_model
        ChatService->>OllamaLocal: retry with local model
        OllamaLocal-->>ChatService: response
        ChatService-->>Agent: return result
    end
```
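The 1-hour cache check in the diagram can be sketched without any dependencies using `std::time::Instant`. The `ModelCache` struct and `get_or_refresh` helper are illustrative assumptions, not the PR's actual cache implementation:

```rust
// Dependency-free sketch of the 1h model cache from the diagram:
// refresh only when the cached entry is older than the TTL.
// `ModelCache` and `get_or_refresh` are illustrative names.
use std::time::{Duration, Instant};

struct ModelCache {
    models: Vec<String>,
    fetched_at: Instant,
    ttl: Duration,
}

impl ModelCache {
    fn is_fresh(&self) -> bool {
        self.fetched_at.elapsed() < self.ttl
    }

    /// Serve the cached models when fresh; otherwise call `refresh`
    /// and store the result with a new timestamp.
    fn get_or_refresh(&mut self, refresh: impl FnOnce() -> Vec<String>) -> &[String] {
        if !self.is_fresh() {
            self.models = refresh();
            self.fetched_at = Instant::now();
        }
        &self.models
    }
}

fn main() {
    let mut cache = ModelCache {
        models: vec![],
        fetched_at: Instant::now(),
        ttl: Duration::ZERO, // force an immediate refresh for the demo
    };
    let got = cache
        .get_or_refresh(|| vec!["qwen3-coder:480b-cloud".into()]) // example model name
        .to_vec();
    assert_eq!(got, vec!["qwen3-coder:480b-cloud".to_string()]);

    // With a 1h TTL (as in the PR) the next lookup serves the cache.
    cache.ttl = Duration::from_secs(3600);
    let again = cache.get_or_refresh(|| vec!["should-not-run".into()]).to_vec();
    assert_eq!(got, again);
}
```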
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src-tauri/src/modules/ollama/service.rs (1)
Lines 56-143: ⚠️ Potential issue | 🟠 Major

Don't return before adding cloud models when Ollama is reachable but locally empty.

A reachable daemon with no active/pulled local models reaches lines 132-134 and returns `Err` before lines 136-143 append discovered cloud models. That makes `/v1/ollama/models` report unreachable and prevents cloud-only users from selecting cloud models. Track daemon reachability separately from catalog emptiness, then append cloud models when the daemon responded.
🐛 Proposed fix

```diff
 pub async fn model_catalog(timeout_ms: u64) -> Result<ModelCatalog, String> {
     let client = http_client();
     let timeout = std::time::Duration::from_millis(timeout_ms);
     let mut active: Option<String> = None;
+    let mut daemon_reachable = false;
     match client.get(OLLAMA_PS_URL).timeout(timeout).send().await {
         Ok(resp) => {
             if !resp.status().is_success() {
                 log::warn!(
                     "ollama {}: non-success HTTP {}",
@@
                     resp.status()
                 );
             } else {
+                daemon_reachable = true;
                 match resp.json::<serde_json::Value>().await {
                     Ok(body) => {
                         active = body["models"]
                             .as_array()
@@
     let mut models: Vec<ModelInfo> = Vec::new();
     match client.get(OLLAMA_TAGS_URL).timeout(timeout).send().await {
         Ok(resp) => {
             if !resp.status().is_success() {
                 log::warn!(
@@
                     resp.status()
                 );
             } else {
+                daemon_reachable = true;
                 match resp.json::<serde_json::Value>().await {
                     Ok(body) => {
                         models = body["models"]
                             .as_array()
@@
-    // Cloud models are proxied through the local daemon, so if local Ollama
-    // is unreachable they aren't usable either — keep the original error.
-    if active.is_none() && models.is_empty() {
+    // Cloud models are proxied through the local daemon, so if local Ollama
+    // is unreachable they aren't usable either — keep the original error.
+    if !daemon_reachable && active.is_none() && models.is_empty() {
         return Err("ollama unreachable: no active model and no pulled models".to_string());
     }
-    for cloud_name in cloud::list_cloud_models().await {
-        if !models.iter().any(|m| m.name == cloud_name) {
-            models.push(ModelInfo {
-                name: cloud_name,
-                kind: ModelKind::Cloud,
-            });
+    if daemon_reachable {
+        for cloud_name in cloud::list_cloud_models().await {
+            if !models.iter().any(|m| m.name == cloud_name) {
+                models.push(ModelInfo {
+                    name: cloud_name,
+                    kind: ModelKind::Cloud,
+                });
+            }
         }
     }
     Ok(ModelCatalog { active, models })
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src-tauri/src/modules/ollama/service.rs` around lines 56 - 143, The code currently returns early when active.is_none() && models.is_empty(), which happens before cloud::list_cloud_models() is appended; fix by tracking daemon reachability separately and deferring the error return until after adding cloud models. Add a let mut daemon_reachable = false; and set daemon_reachable = true whenever a request to OLLAMA_PS_URL or OLLAMA_TAGS_URL succeeds (i.e., Ok(resp) with resp.status().is_success()), leave active and models logic unchanged, then move the unreachable check to after the loop that calls cloud::list_cloud_models() and change it to if !daemon_reachable && models.is_empty() { return Err(...) } so cloud-only models are included even when local catalog is empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src-tauri/src/modules/ollama/cloud.rs`:
- Around line 105-120: The current refresh in
cloud_models_for_slug/list_cloud_models collects names from parallel detail
fetches and writes Ok(out) even when out is empty, which causes an empty cache
to replace a working catalog; change the function to treat an entirely-empty
result as an error (return Err) so list_cloud_models() can fall back to the
previous cache, and also limit the detail-page fan-out by bounding concurrency
(e.g., use a tokio::sync::Semaphore or FuturesUnordered with buffer_unordered to
cap parallel tasks) when spawning cloud_models_for_slug tasks to avoid request
bursts.
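The two fixes this prompt asks for can be sketched without tokio. The review suggests `tokio::sync::Semaphore` or `buffer_unordered` for the real async code; the dependency-free version below bounds the fan-out by processing slugs in chunks of threads, and rejects an all-empty refresh so the previous cache survives. All names here (`validate_refresh`, `fetch_bounded`, `MAX_CONCURRENT`) are illustrative assumptions:

```rust
// Sketch of the two requested fixes: (1) treat an entirely-empty discovery
// result as an error so callers keep the previous cache, and (2) cap the
// fan-out of per-slug detail fetches. Chunked threads stand in for the
// Semaphore / buffer_unordered bound suggested in the review.
use std::thread;

const MAX_CONCURRENT: usize = 4;

/// Reject an entirely-empty discovery result so list_cloud_models() can
/// fall back to the previously cached catalog instead of overwriting it.
fn validate_refresh(models: Vec<String>) -> Result<Vec<String>, String> {
    if models.is_empty() {
        Err("cloud discovery returned no models; keeping previous cache".into())
    } else {
        Ok(models)
    }
}

/// Run `fetch` for every slug, at most MAX_CONCURRENT at a time.
fn fetch_bounded(slugs: Vec<String>, fetch: fn(&str) -> Vec<String>) -> Vec<String> {
    let mut out = Vec::new();
    for chunk in slugs.chunks(MAX_CONCURRENT) {
        // Spawn one thread per slug in this chunk, then join them all
        // before starting the next chunk, bounding concurrency.
        let handles: Vec<_> = chunk
            .iter()
            .cloned()
            .map(|slug| thread::spawn(move || fetch(&slug)))
            .collect();
        for h in handles {
            out.extend(h.join().expect("fetch thread panicked"));
        }
    }
    out
}

fn main() {
    // Hypothetical fetcher standing in for the per-slug detail-page scrape.
    fn fake_fetch(slug: &str) -> Vec<String> {
        vec![format!("{slug}-cloud")]
    }
    let slugs: Vec<String> = (0..10).map(|i| format!("model{i}")).collect();
    let models = fetch_bounded(slugs, fake_fetch);
    assert_eq!(models.len(), 10);
    assert!(validate_refresh(models).is_ok());
    assert!(validate_refresh(vec![]).is_err());
}
```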
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aaea324a-248d-40e9-a852-d319638be302
📒 Files selected for processing (10)
- src-tauri/src/app.rs
- src-tauri/src/infrastructure/http_server.rs
- src-tauri/src/modules/bot/agent.rs
- src-tauri/src/modules/ollama/cloud.rs
- src-tauri/src/modules/ollama/mod.rs
- src-tauri/src/modules/ollama/service.rs
- src-tauri/src/shared/state.rs
- src/modules/ollama/index.ts
- src/modules/ollama/types.ts
- src/pages/DashboardPage.tsx