feat: enhance Ollama model handling with cloud support (#65)
- Updated the Ollama model types to include `OllamaModelInfo` and `OllamaModelKind`.
- Modified the DashboardPage to use the new model structure for a better representation of available models.
- Implemented cloud model discovery and caching in the new `cloud` module, allowing seamless integration of cloud models into the existing workflow.
- Added fallback logic for cloud model rate limits: when a cloud model is throttled, the app automatically switches to the last used local model.
- Enhanced the AppState to track the last local model for improved error handling during cloud interactions.
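The fallback behavior in the list above can be sketched in a few lines. This is a minimal, dependency-free illustration; the `ChatError` and `pick_fallback` names are hypothetical stand-ins, not the PR's actual API:

```rust
// Illustrative sketch of the rate-limit fallback described above.
// `ChatError` and `pick_fallback` are hypothetical names, not the PR's API.

#[derive(Debug, PartialEq)]
enum ChatError {
    RateLimited,
    Other(String),
}

/// On a cloud rate-limit error, fall back to the last used local model if
/// one is recorded; any other error (or no known local model) yields None
/// so the caller can propagate the original failure.
fn pick_fallback(err: &ChatError, last_local: Option<&str>) -> Option<String> {
    match (err, last_local) {
        (ChatError::RateLimited, Some(local)) => Some(local.to_string()),
        _ => None,
    }
}

fn main() {
    assert_eq!(
        pick_fallback(&ChatError::RateLimited, Some("llama3.2:3b")),
        Some("llama3.2:3b".to_string())
    );
    // A non-rate-limit error never triggers the downgrade.
    assert_eq!(
        pick_fallback(&ChatError::Other("boom".into()), Some("llama3.2:3b")),
        None
    );
    println!("fallback sketch ok");
}
```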
📝 Walkthrough

The PR adds cloud model discovery to Ollama by scraping ollama.com, classifies models as Local or Cloud, and implements automatic fallback to local models when cloud models hit rate limits. It updates the HTTP API, state, agent logic, and frontend components to support model categorization throughout the stack.
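The Local/Cloud classification the walkthrough mentions can be sketched as follows. This assumes membership in the discovered cloud catalog is the deciding signal (consistent with the service-layer code discussed later in the review); the type names mirror the PR's `ModelInfo`/`ModelKind` but the helper is illustrative:

```rust
// Sketch of Local vs Cloud classification. Assumption: a model is Cloud
// when its name appears in the discovered cloud catalog; `classify` is an
// illustrative helper, not the PR's actual function.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ModelKind {
    Local,
    Cloud,
}

#[derive(Debug, PartialEq)]
struct ModelInfo {
    name: String,
    kind: ModelKind,
}

/// Mark a model Cloud when it appears in the discovered cloud list,
/// otherwise Local.
fn classify(name: &str, cloud_names: &[&str]) -> ModelInfo {
    let kind = if cloud_names.contains(&name) {
        ModelKind::Cloud
    } else {
        ModelKind::Local
    };
    ModelInfo {
        name: name.to_string(),
        kind,
    }
}

fn main() {
    let cloud = ["gpt-oss:120b-cloud"]; // example catalog entry
    assert_eq!(classify("gpt-oss:120b-cloud", &cloud).kind, ModelKind::Cloud);
    assert_eq!(classify("llama3.2:3b", &cloud).kind, ModelKind::Local);
}
```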
Sequence Diagram

```mermaid
sequenceDiagram
    participant App as App Startup
    participant CloudModule as cloud.rs
    participant Cache as Model Cache
    participant OllamaWebsite as ollama.com
    participant Catalog as ModelCatalog
    participant Agent as Agent
    participant ChatService as chat_with_tools
    participant OllamaLocal as Ollama Local/Cloud

    App->>CloudModule: spawn list_cloud_models()
    CloudModule->>Cache: check cache (TTL: 1h)
    alt Cache Valid
        Cache-->>CloudModule: return cached models
    else Cache Expired/Missing
        CloudModule->>OllamaWebsite: scrape cloud search page
        OllamaWebsite-->>CloudModule: HTML with model slugs
        loop Per-slug concurrent requests
            CloudModule->>OllamaWebsite: fetch model detail page
            OllamaWebsite-->>CloudModule: extract cloud models
        end
        CloudModule->>Cache: store discovered models
    end
    CloudModule-->>App: models discovered (fire-and-forget)

    App->>Catalog: build model_catalog()
    Catalog->>CloudModule: request cloud models
    CloudModule-->>Catalog: return discovered models
    Catalog-->>App: ModelInfo[] with Local/Cloud kinds

    User->>Agent: send chat request
    Agent->>Agent: select cloud model
    Agent->>ChatService: call chat_with_cloud_fallback()
    ChatService->>OllamaLocal: attempt cloud model chat
    alt Success
        OllamaLocal-->>ChatService: response
        ChatService-->>Agent: return result
    else Rate-Limit Error
        ChatService->>Catalog: retrieve last_local_model
        Catalog-->>ChatService: local fallback model
        ChatService->>Agent: log downgrade
        Agent->>Agent: update state.preferred_model
        ChatService->>OllamaLocal: retry with local model
        OllamaLocal-->>ChatService: response
        ChatService-->>Agent: return result
    end
```
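The 1-hour cache check in the diagram can be sketched without any dependencies using `std::time::Instant`. The `ModelCache` struct and `get_or_refresh` helper are illustrative assumptions, not the PR's actual cache implementation:

```rust
// Dependency-free sketch of the 1h model cache from the diagram:
// refresh only when the cached entry is older than the TTL.
// `ModelCache` and `get_or_refresh` are illustrative names.
use std::time::{Duration, Instant};

struct ModelCache {
    models: Vec<String>,
    fetched_at: Instant,
    ttl: Duration,
}

impl ModelCache {
    fn is_fresh(&self) -> bool {
        self.fetched_at.elapsed() < self.ttl
    }

    /// Serve the cached models when fresh; otherwise call `refresh`
    /// and store the result with a new timestamp.
    fn get_or_refresh(&mut self, refresh: impl FnOnce() -> Vec<String>) -> &[String] {
        if !self.is_fresh() {
            self.models = refresh();
            self.fetched_at = Instant::now();
        }
        &self.models
    }
}

fn main() {
    let mut cache = ModelCache {
        models: vec![],
        fetched_at: Instant::now(),
        ttl: Duration::ZERO, // force an immediate refresh for the demo
    };
    let got = cache
        .get_or_refresh(|| vec!["qwen3-coder:480b-cloud".into()]) // example model name
        .to_vec();
    assert_eq!(got, vec!["qwen3-coder:480b-cloud".to_string()]);

    // With a 1h TTL (as in the PR) the next lookup serves the cache.
    cache.ttl = Duration::from_secs(3600);
    let again = cache.get_or_refresh(|| vec!["should-not-run".into()]).to_vec();
    assert_eq!(got, again);
}
```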
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src-tauri/src/modules/ollama/service.rs (1)
Lines 56-143: ⚠️ Potential issue | 🟠 Major

Don't return before adding cloud models when Ollama is reachable but locally empty.

A reachable daemon with no active/pulled local models reaches lines 132-134 and returns `Err` before lines 136-143 append discovered cloud models. That makes `/v1/ollama/models` report unreachable and prevents cloud-only users from selecting cloud models. Track daemon reachability separately from catalog emptiness, then append cloud models when the daemon responded.
🐛 Proposed fix

```diff
 pub async fn model_catalog(timeout_ms: u64) -> Result<ModelCatalog, String> {
     let client = http_client();
     let timeout = std::time::Duration::from_millis(timeout_ms);
     let mut active: Option<String> = None;
+    let mut daemon_reachable = false;
     match client.get(OLLAMA_PS_URL).timeout(timeout).send().await {
         Ok(resp) => {
             if !resp.status().is_success() {
                 log::warn!(
                     "ollama {}: non-success HTTP {}",
@@
                     resp.status()
                 );
             } else {
+                daemon_reachable = true;
                 match resp.json::<serde_json::Value>().await {
                     Ok(body) => {
                         active = body["models"]
                             .as_array()
@@
     let mut models: Vec<ModelInfo> = Vec::new();
     match client.get(OLLAMA_TAGS_URL).timeout(timeout).send().await {
         Ok(resp) => {
             if !resp.status().is_success() {
                 log::warn!(
@@
                     resp.status()
                 );
             } else {
+                daemon_reachable = true;
                 match resp.json::<serde_json::Value>().await {
                     Ok(body) => {
                         models = body["models"]
                             .as_array()
@@
-    // Cloud models are proxied through the local daemon, so if local Ollama
-    // is unreachable they aren't usable either — keep the original error.
-    if active.is_none() && models.is_empty() {
+    // Cloud models are proxied through the local daemon, so if local Ollama
+    // is unreachable they aren't usable either — keep the original error.
+    if !daemon_reachable && active.is_none() && models.is_empty() {
         return Err("ollama unreachable: no active model and no pulled models".to_string());
     }
-    for cloud_name in cloud::list_cloud_models().await {
-        if !models.iter().any(|m| m.name == cloud_name) {
-            models.push(ModelInfo {
-                name: cloud_name,
-                kind: ModelKind::Cloud,
-            });
+    if daemon_reachable {
+        for cloud_name in cloud::list_cloud_models().await {
+            if !models.iter().any(|m| m.name == cloud_name) {
+                models.push(ModelInfo {
+                    name: cloud_name,
+                    kind: ModelKind::Cloud,
+                });
+            }
         }
     }
     Ok(ModelCatalog { active, models })
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src-tauri/src/modules/ollama/service.rs` around lines 56 - 143, The code currently returns early when active.is_none() && models.is_empty(), which happens before cloud::list_cloud_models() is appended; fix by tracking daemon reachability separately and deferring the error return until after adding cloud models. Add a let mut daemon_reachable = false; and set daemon_reachable = true whenever a request to OLLAMA_PS_URL or OLLAMA_TAGS_URL succeeds (i.e., Ok(resp) with resp.status().is_success()), leave active and models logic unchanged, then move the unreachable check to after the loop that calls cloud::list_cloud_models() and change it to if !daemon_reachable && models.is_empty() { return Err(...) } so cloud-only models are included even when local catalog is empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src-tauri/src/modules/ollama/cloud.rs`:
- Around line 105-120: The current refresh in
cloud_models_for_slug/list_cloud_models collects names from parallel detail
fetches and writes Ok(out) even when out is empty, which causes an empty cache
to replace a working catalog; change the function to treat an entirely-empty
result as an error (return Err) so list_cloud_models() can fall back to the
previous cache, and also limit the detail-page fan-out by bounding concurrency
(e.g., use a tokio::sync::Semaphore or FuturesUnordered with buffer_unordered to
cap parallel tasks) when spawning cloud_models_for_slug tasks to avoid request
bursts.
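The two fixes this prompt asks for can be sketched without tokio. The review suggests `tokio::sync::Semaphore` or `buffer_unordered` for the real async code; the dependency-free version below bounds the fan-out by processing slugs in chunks of threads, and rejects an all-empty refresh so the previous cache survives. All names here (`validate_refresh`, `fetch_bounded`, `MAX_CONCURRENT`) are illustrative assumptions:

```rust
// Sketch of the two requested fixes: (1) treat an entirely-empty discovery
// result as an error so callers keep the previous cache, and (2) cap the
// fan-out of per-slug detail fetches. Chunked threads stand in for the
// Semaphore / buffer_unordered bound suggested in the review.
use std::thread;

const MAX_CONCURRENT: usize = 4;

/// Reject an entirely-empty discovery result so list_cloud_models() can
/// fall back to the previously cached catalog instead of overwriting it.
fn validate_refresh(models: Vec<String>) -> Result<Vec<String>, String> {
    if models.is_empty() {
        Err("cloud discovery returned no models; keeping previous cache".into())
    } else {
        Ok(models)
    }
}

/// Run `fetch` for every slug, at most MAX_CONCURRENT at a time.
fn fetch_bounded(slugs: Vec<String>, fetch: fn(&str) -> Vec<String>) -> Vec<String> {
    let mut out = Vec::new();
    for chunk in slugs.chunks(MAX_CONCURRENT) {
        // Spawn one thread per slug in this chunk, then join them all
        // before starting the next chunk, bounding concurrency.
        let handles: Vec<_> = chunk
            .iter()
            .cloned()
            .map(|slug| thread::spawn(move || fetch(&slug)))
            .collect();
        for h in handles {
            out.extend(h.join().expect("fetch thread panicked"));
        }
    }
    out
}

fn main() {
    // Hypothetical fetcher standing in for the per-slug detail-page scrape.
    fn fake_fetch(slug: &str) -> Vec<String> {
        vec![format!("{slug}-cloud")]
    }
    let slugs: Vec<String> = (0..10).map(|i| format!("model{i}")).collect();
    let models = fetch_bounded(slugs, fake_fetch);
    assert_eq!(models.len(), 10);
    assert!(validate_refresh(models).is_ok());
    assert!(validate_refresh(vec![]).is_err());
}
```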
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aaea324a-248d-40e9-a852-d319638be302
📒 Files selected for processing (10)
- src-tauri/src/app.rs
- src-tauri/src/infrastructure/http_server.rs
- src-tauri/src/modules/bot/agent.rs
- src-tauri/src/modules/ollama/cloud.rs
- src-tauri/src/modules/ollama/mod.rs
- src-tauri/src/modules/ollama/service.rs
- src-tauri/src/shared/state.rs
- src/modules/ollama/index.ts
- src/modules/ollama/types.ts
- src/pages/DashboardPage.tsx