
feat: enhance Ollama model handling with cloud support #65

Merged

maximedogawa merged 2 commits into main from 62-feature-ollama-cloud-model-access on Apr 19, 2026
Conversation

maximedogawa (Collaborator) commented Apr 19, 2026

  • Updated the Ollama model types to include `OllamaModelInfo` and `OllamaModelKind` (see the type sketch below).
  • Modified the `DashboardPage` to use the new model structure for a clearer view of available models.
  • Implemented cloud model discovery and caching in the new `cloud` module, integrating cloud models into the existing workflow.
  • Added fallback logic for cloud model rate limits: when a cloud model is rate-limited, the app automatically switches to the last used local model.
  • Extended `AppState` to track the last local model for improved error handling during cloud interactions.
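
For reference, a minimal sketch of what the new Rust-side types could look like. The names come from this PR's summary; the exact derives and serde attributes are assumptions, not code quoted from the diff:

```rust
use serde::{Deserialize, Serialize};

/// Whether a model runs on the local Ollama daemon or is proxied from ollama.com.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")] // matches the frontend's "local" | "cloud" kind strings
pub enum ModelKind {
    Local,
    Cloud,
}

/// A catalog entry pairing a model name with its kind.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ModelInfo {
    pub name: String,
    pub kind: ModelKind,
}
```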

Summary by CodeRabbit

  • New Features
    • Added support for cloud-based models from ollama.com alongside local models.
    • Models now display as either "local" or "cloud" in the UI for clarity.
    • Automatic fallback to your last-used local model when cloud models encounter rate limits, enabling uninterrupted replies.
    • Background discovery and caching of available cloud models on startup.

maximedogawa linked an issue on Apr 19, 2026 that may be closed by this pull request
coderabbitai (Bot) commented Apr 19, 2026

Warning

Rate limit exceeded

@maximedogawa has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 27 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 19 minutes and 27 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ff604ebf-c87c-4ab9-a4c9-b958312f251e

📥 Commits

Reviewing files that changed from the base of the PR and between 780a94d and c8126df.

📒 Files selected for processing (3)
  • src-tauri/src/modules/bot/agent.rs
  • src-tauri/src/modules/ollama/cloud.rs
  • src-tauri/src/modules/ollama/service.rs
📝 Walkthrough

The PR adds cloud model discovery to Ollama by scraping ollama.com, classifies models as Local or Cloud, and implements automatic fallback to local models when cloud models hit rate limits. It updates the HTTP API, state, agent logic, and frontend components to support model categorization throughout the stack.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Cloud Model Discovery**<br>`src-tauri/src/modules/ollama/cloud.rs`, `src-tauri/src/modules/ollama/mod.rs` | Introduces a new `list_cloud_models()` function that scrapes ollama.com, extracts cloud model names via regex, caches results with a 1-hour TTL, and falls back to cached data on network errors. Adds a module export to make it available in the `ollama` module. (A caching sketch follows this table.) |
| **Model Classification & Catalog**<br>`src-tauri/src/modules/ollama/service.rs` | Adds a `ModelKind` enum (Local/Cloud), a `ModelInfo` struct with name and kind, and classification logic. Updates `ModelCatalog.models` from `Vec<String>` to `Vec<ModelInfo>`. Extends catalog building to merge the active model with cloud-discovered models and adds a rate-limit error detection utility. |
| **HTTP API & DTOs**<br>`src-tauri/src/infrastructure/http_server.rs` | Introduces an `OllamaModelDto` DTO with name and kind fields. Changes `OllamaModelsResponse.models` from `Vec<String>` to `Vec<OllamaModelDto>`. Updates model lookup to search by name and validate model kind; persists the last-selected local model to state when applicable. |
| **App Initialization & State**<br>`src-tauri/src/app.rs`, `src-tauri/src/shared/state.rs` | Spawns a background async task to prefetch cloud models during startup. Adds a `last_local_model` field to `AppState` for persisting fallback targets across sessions. (A spawn sketch also follows this table.) |
| **Agent Rate-Limit Fallback**<br>`src-tauri/src/modules/bot/agent.rs` | Introduces a `chat_with_cloud_fallback()` helper that wraps the chat function; on cloud rate-limit errors it retrieves the last local model, logs the downgrade, updates state, and retries. Routes all chat calls through this fallback (see the sketch after the sequence diagram). |
| **Frontend Types**<br>`src/modules/ollama/types.ts`, `src/modules/ollama/index.ts` | Adds an `OllamaModelKind` type (`"local" \| "cloud"`) and an `OllamaModelInfo` interface. Updates `OllamaModelsResponse.models` from `string[]` to `OllamaModelInfo[]` and exports the new types alongside existing ones. |
| **Frontend UI**<br>`src/pages/DashboardPage.tsx` | Updates model state to `OllamaModelInfo[]`. Renders dropdown options as the model name with an optional `· cloud` suffix based on kind, using `model.name` for option keys and values. |
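
To make the discovery and caching behavior concrete, here is a minimal sketch of the pattern described in the Cloud Model Discovery row, folding in the reviewer's suggestion (later in this thread) that an empty refresh should never overwrite working cache data. Only the `list_cloud_models()` name comes from this PR; the cache shape, URL, and regex are illustrative assumptions:

```rust
use once_cell::sync::Lazy;
use regex::Regex;
use std::time::{Duration, Instant};
use tokio::sync::Mutex;

const TTL: Duration = Duration::from_secs(60 * 60); // 1-hour TTL, per the walkthrough

// Hypothetical cache shape; the PR's actual fields may differ.
struct CloudModelCache {
    fetched_at: Option<Instant>,
    models: Vec<String>,
}

static CACHE: Lazy<Mutex<CloudModelCache>> = Lazy::new(|| {
    Mutex::new(CloudModelCache { fetched_at: None, models: Vec::new() })
});

pub async fn list_cloud_models() -> Vec<String> {
    let mut cache = CACHE.lock().await;
    if let Some(at) = cache.fetched_at {
        if at.elapsed() < TTL {
            return cache.models.clone(); // cache still fresh
        }
    }
    match scrape_cloud_models().await {
        // Only overwrite the cache with a non-empty result, so a bad scrape
        // never wipes a working catalog (the reviewer's suggestion).
        Ok(models) if !models.is_empty() => {
            cache.fetched_at = Some(Instant::now());
            cache.models = models.clone();
            models
        }
        // Network error or empty page: keep serving whatever we had before.
        _ => cache.models.clone(),
    }
}

// Illustrative scrape; the real URL and extraction regex are not shown in this PR page.
async fn scrape_cloud_models() -> Result<Vec<String>, reqwest::Error> {
    let html = reqwest::get("https://ollama.com/search?c=cloud").await?.text().await?;
    let re = Regex::new(r#"href="/library/([\w.\-:]+)""#).unwrap();
    Ok(re.captures_iter(&html).map(|c| c[1].to_string()).collect())
}
```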
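And the background prefetch from the App Initialization & State row might look like the snippet below. The use of `tauri::async_runtime::spawn` is assumed from the project's Tauri layout, not quoted from the PR:

```rust
// Inside app setup (src-tauri/src/app.rs in this PR): fire-and-forget warm-up
// of the cloud model cache so the first catalog request doesn't pay the
// scrape latency.
tauri::async_runtime::spawn(async {
    let models = crate::modules::ollama::cloud::list_cloud_models().await;
    log::info!("prefetched {} cloud model(s)", models.len());
});
```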

Sequence Diagram

```mermaid
sequenceDiagram
    participant App as App Startup
    participant CloudModule as cloud.rs
    participant Cache as Model Cache
    participant OllamaWebsite as ollama.com
    participant Catalog as ModelCatalog
    participant Agent as Agent
    participant ChatService as chat_with_tools
    participant OllamaLocal as Ollama Local/Cloud

    App->>CloudModule: spawn list_cloud_models()
    CloudModule->>Cache: check cache (TTL: 1h)
    alt Cache Valid
        Cache-->>CloudModule: return cached models
    else Cache Expired/Missing
        CloudModule->>OllamaWebsite: scrape cloud search page
        OllamaWebsite-->>CloudModule: HTML with model slugs
        loop Per-slug concurrent requests
            CloudModule->>OllamaWebsite: fetch model detail page
            OllamaWebsite-->>CloudModule: extract cloud models
        end
        CloudModule->>Cache: store discovered models
    end
    CloudModule-->>App: models discovered (fire-and-forget)

    App->>Catalog: build model_catalog()
    Catalog->>CloudModule: request cloud models
    CloudModule-->>Catalog: return discovered models
    Catalog-->>App: ModelInfo[] with Local/Cloud kinds

    User->>Agent: send chat request
    Agent->>Agent: select cloud model
    Agent->>ChatService: call chat_with_cloud_fallback()
    ChatService->>OllamaLocal: attempt cloud model chat
    
    alt Success
        OllamaLocal-->>ChatService: response
        ChatService-->>Agent: return result
    else Rate-Limit Error
        ChatService->>Catalog: retrieve last_local_model
        Catalog-->>ChatService: local fallback model
        ChatService->>Agent: log downgrade
        Agent->>Agent: update state.preferred_model
        ChatService->>OllamaLocal: retry with local model
        OllamaLocal-->>ChatService: response
        ChatService-->>Agent: return result
    end
```
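
To make the fallback path in the diagram concrete, here is a minimal sketch of the wrapper. Only the name `chat_with_cloud_fallback` comes from the walkthrough; `chat`, `is_rate_limit_error`, and the `AppState` accessors are hypothetical stand-ins for the PR's internals:

```rust
/// Wraps the chat call; on a cloud rate-limit error, retries once with the
/// last-used local model. Assumes `AppState` has interior mutability
/// (e.g. a Mutex) behind its accessors.
async fn chat_with_cloud_fallback(
    state: &AppState,
    model: &str,
    prompt: &str,
) -> Result<String, String> {
    match chat(model, prompt).await {
        Err(err) if is_rate_limit_error(&err) => {
            // Cloud model hit its rate limit: downgrade to the last local model.
            let fallback = state
                .last_local_model()
                .ok_or_else(|| format!("rate-limited with no local fallback: {err}"))?;
            log::warn!("cloud model '{model}' rate-limited; falling back to '{fallback}'");
            state.set_preferred_model(&fallback);
            chat(&fallback, prompt).await
        }
        other => other,
    }
}
```

Per the walkthrough, updating state here is what makes subsequent requests use the local model instead of retrying the rate-limited cloud one.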

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Poem

🐰 Cloud models hop in from afar,
When rate-limits strike like a star,
Local fallback saves the day,
Agents chat come what may! ⚡🌩️

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main objective of the changeset: enhancing Ollama model handling with cloud support through cloud model discovery, integration, and fallback logic. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which meets the required threshold of 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai (Bot) left a comment
Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src-tauri/src/modules/ollama/service.rs (1)

56-143: ⚠️ Potential issue | 🟠 Major

Don’t return before adding cloud models when Ollama is reachable but locally empty.

A reachable daemon with no active/pulled local models reaches Lines 132-134 and returns Err before Lines 136-143 append discovered cloud models. That makes /v1/ollama/models report unreachable and prevents cloud-only users from selecting cloud models.

Track daemon reachability separately from catalog emptiness, then append cloud models when the daemon responded.

🐛 Proposed fix

```diff
 pub async fn model_catalog(timeout_ms: u64) -> Result<ModelCatalog, String> {
     let client = http_client();
     let timeout = std::time::Duration::from_millis(timeout_ms);
 
     let mut active: Option<String> = None;
+    let mut daemon_reachable = false;
     match client.get(OLLAMA_PS_URL).timeout(timeout).send().await {
         Ok(resp) => {
             if !resp.status().is_success() {
                 log::warn!(
                     "ollama {}: non-success HTTP {}",
@@
                     resp.status()
                 );
             } else {
+                daemon_reachable = true;
                 match resp.json::<serde_json::Value>().await {
                     Ok(body) => {
                         active = body["models"]
                             .as_array()
@@
     let mut models: Vec<ModelInfo> = Vec::new();
     match client.get(OLLAMA_TAGS_URL).timeout(timeout).send().await {
         Ok(resp) => {
             if !resp.status().is_success() {
                 log::warn!(
@@
                     resp.status()
                 );
             } else {
+                daemon_reachable = true;
                 match resp.json::<serde_json::Value>().await {
                     Ok(body) => {
                         models = body["models"]
                             .as_array()
@@
-    // Cloud models are proxied through the local daemon, so if local Ollama
-    // is unreachable they aren't usable either — keep the original error.
-    if active.is_none() && models.is_empty() {
+    // Cloud models are proxied through the local daemon, so if local Ollama
+    // is unreachable they aren't usable either — keep the original error.
+    if !daemon_reachable && active.is_none() && models.is_empty() {
         return Err("ollama unreachable: no active model and no pulled models".to_string());
     }
 
-    for cloud_name in cloud::list_cloud_models().await {
-        if !models.iter().any(|m| m.name == cloud_name) {
-            models.push(ModelInfo {
-                name: cloud_name,
-                kind: ModelKind::Cloud,
-            });
+    if daemon_reachable {
+        for cloud_name in cloud::list_cloud_models().await {
+            if !models.iter().any(|m| m.name == cloud_name) {
+                models.push(ModelInfo {
+                    name: cloud_name,
+                    kind: ModelKind::Cloud,
+                });
+            }
         }
     }
 
     Ok(ModelCatalog { active, models })
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src-tauri/src/modules/ollama/service.rs` around lines 56 - 143, The code
currently returns early when active.is_none() && models.is_empty(), which
happens before cloud::list_cloud_models() is appended; fix by tracking daemon
reachability separately and deferring the error return until after adding cloud
models. Add a let mut daemon_reachable = false; and set daemon_reachable = true
whenever a request to OLLAMA_PS_URL or OLLAMA_TAGS_URL succeeds (i.e., Ok(resp)
with resp.status().is_success()), leave active and models logic unchanged, then
move the unreachable check to after the loop that calls
cloud::list_cloud_models() and change it to if !daemon_reachable &&
models.is_empty() { return Err(...) } so cloud-only models are included even
when local catalog is empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src-tauri/src/modules/ollama/cloud.rs`:
- Around line 105-120: The current refresh in
cloud_models_for_slug/list_cloud_models collects names from parallel detail
fetches and writes Ok(out) even when out is empty, which causes an empty cache
to replace a working catalog; change the function to treat an entirely-empty
result as an error (return Err) so list_cloud_models() can fall back to the
previous cache, and also limit the detail-page fan-out by bounding concurrency
(e.g., use a tokio::sync::Semaphore or FuturesUnordered with buffer_unordered to
cap parallel tasks) when spawning cloud_models_for_slug tasks to avoid request
bursts.

---

Outside diff comments:
In `@src-tauri/src/modules/ollama/service.rs`:
- Around line 56-143: The code currently returns early when active.is_none() &&
models.is_empty(), which happens before cloud::list_cloud_models() is appended;
fix by tracking daemon reachability separately and deferring the error return
until after adding cloud models. Add a let mut daemon_reachable = false; and set
daemon_reachable = true whenever a request to OLLAMA_PS_URL or OLLAMA_TAGS_URL
succeeds (i.e., Ok(resp) with resp.status().is_success()), leave active and
models logic unchanged, then move the unreachable check to after the loop that
calls cloud::list_cloud_models() and change it to if !daemon_reachable &&
models.is_empty() { return Err(...) } so cloud-only models are included even
when local catalog is empty.
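
The inline comment above also asks to bound the per-slug fan-out. A minimal sketch of that pattern with the `futures` crate's `buffer_unordered`; the `fetch_all_bounded` name, the `cloud_models_for_slug` signature, and the concrete limit are assumptions, not the PR's actual code:

```rust
use futures::stream::{self, StreamExt};

/// Fetch per-slug model lists with at most `limit` requests in flight.
async fn fetch_all_bounded(slugs: Vec<String>, limit: usize) -> Vec<String> {
    stream::iter(slugs)
        .map(|slug| async move { cloud_models_for_slug(&slug).await })
        .buffer_unordered(limit)                   // cap concurrent detail fetches
        .filter_map(|res| async move { res.ok() }) // drop per-slug failures
        .collect::<Vec<Vec<String>>>()
        .await
        .into_iter()
        .flatten()
        .collect()
}

// Stand-in for the PR's per-slug detail fetch; the body is illustrative only.
async fn cloud_models_for_slug(slug: &str) -> Result<Vec<String>, reqwest::Error> {
    let _html = reqwest::get(format!("https://ollama.com/library/{slug}"))
        .await?
        .text()
        .await?;
    Ok(vec![format!("{slug}-cloud")]) // placeholder extraction
}
```

A `tokio::sync::Semaphore` around spawned tasks would achieve the same cap if the code keeps its current `tokio::spawn` structure, which is the comment's other suggested option.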

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: aaea324a-248d-40e9-a852-d319638be302

📥 Commits

Reviewing files that changed from the base of the PR and between 985cfff and 780a94d.

📒 Files selected for processing (10)
  • src-tauri/src/app.rs
  • src-tauri/src/infrastructure/http_server.rs
  • src-tauri/src/modules/bot/agent.rs
  • src-tauri/src/modules/ollama/cloud.rs
  • src-tauri/src/modules/ollama/mod.rs
  • src-tauri/src/modules/ollama/service.rs
  • src-tauri/src/shared/state.rs
  • src/modules/ollama/index.ts
  • src/modules/ollama/types.ts
  • src/pages/DashboardPage.tsx

Outdated comment thread on src-tauri/src/modules/ollama/cloud.rs
maximedogawa merged commit 4e27524 into main on Apr 19, 2026
1 check passed
maximedogawa deleted the 62-feature-ollama-cloud-model-access branch on April 19, 2026 at 01:01


Development

Successfully merging this pull request may close these issues.

[Feature] Ollama Cloud Model access

1 participant