feat(local_ai): default to cloud fallback on <8GB RAM devices #589

Merged: senamakel merged 9 commits into tinyhumansai:main from senamakel:feat/local-ai on Apr 16, 2026

senamakel (Member) commented Apr 16, 2026

Summary

  • Adds a soft RAM floor (8 GB) for local AI: devices below 8 GB default to disabled (cloud fallback) instead of downloading and running the 1B param Gemma model via Ollama.
  • The user can still opt into local AI on a low-RAM device — this is a default, not a hard gate.
  • On sufficient-RAM devices, both paths are available: the primary "Continue" button bootstraps local AI, with a subtle "Skip — use cloud AI instead" link.

What changed

| Layer | File | Change |
| --- | --- | --- |
| Rust (presets) | src/openhuman/local_ai/presets.rs | MIN_RAM_GB_FOR_LOCAL_AI = 8, device_supports_local_ai(), should_default_to_cloud_fallback() |
| Rust (bootstrap) | src/openhuman/local_ai/service/bootstrap.rs | When no tier explicitly selected + RAM < 8 GB → effective_config.local_ai.enabled = false |
| Rust (RPC) | src/openhuman/local_ai/schemas.rs | Presets endpoint returns recommend_disabled: bool |
| Frontend (types) | app/src/utils/tauriCommands/localAi.ts | Added recommend_disabled field to PresetsResponse |
| Frontend (bootstrap) | app/src/utils/localAiBootstrap.ts | Skips preset application + asset download when recommend_disabled |
| Frontend (onboarding) | app/src/pages/onboarding/steps/LocalAIStep.tsx | Two UIs: "AI — Cloud Mode" (low RAM) vs standard Ollama setup (sufficient RAM) |
| Tests | presets, bootstrap, LocalAIStep | 7 new test cases covering RAM floor, cloud fallback, user override |
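
As a rough illustration of the presets-layer change listed in the table above, the sketch below shows how the RAM floor and the two predicates could fit together. Only `MIN_RAM_GB_FOR_LOCAL_AI`, `device_supports_local_ai()`, `should_default_to_cloud_fallback()`, and `total_ram_gb()` are names that appear in this PR; the `DeviceProfile` struct, its field, the constructor, and the `u64` types are assumptions made for the example, not the actual code in `presets.rs`.

```rust
/// Soft RAM floor for local AI, as stated in the PR summary (8 GB).
pub const MIN_RAM_GB_FOR_LOCAL_AI: u64 = 8;

/// Hypothetical stand-in for the project's device profile type.
pub struct DeviceProfile {
    ram_gb: u64, // assumed field; the real profile may store bytes or a float
}

impl DeviceProfile {
    /// Hypothetical constructor used only by this example.
    pub fn from_ram_gb(ram_gb: u64) -> Self {
        Self { ram_gb }
    }

    /// Method name taken from the review's sequence diagram; return type assumed.
    pub fn total_ram_gb(&self) -> u64 {
        self.ram_gb
    }
}

/// True when the device meets the RAM floor (>= 8 GB).
pub fn device_supports_local_ai(device: &DeviceProfile) -> bool {
    device.total_ram_gb() >= MIN_RAM_GB_FOR_LOCAL_AI
}

/// True when onboarding should default to the cloud fallback (< 8 GB).
/// This is only a default: the user can still opt in to local AI.
pub fn should_default_to_cloud_fallback(device: &DeviceProfile) -> bool {
    !device_supports_local_ai(device)
}
```

Under these assumptions a 4 GB device defaults to the cloud fallback, while devices with exactly 8 GB or more keep the standard local AI path.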

Test plan

  • Rust unit tests: cargo test --lib -- openhuman::local_ai::presets::tests openhuman::local_ai::service::bootstrap::tests (all 16 pass)
  • Frontend unit tests: yarn test:unit -- src/pages/onboarding/steps/__tests__/LocalAIStep.test.tsx (all 6 pass)
  • Manual: on a >= 8 GB device, onboarding shows standard Ollama setup with "Skip" link
  • Manual: simulate < 8 GB (or use env override) to verify cloud fallback UI appears as default
  • Verify explicit tier selection on a low-RAM device is still honored (user override path)
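
The last three manual checks can be pictured with a small decision function. This is a minimal sketch, assuming a simplified `LocalAiConfig` with only `enabled` and `selected_tier` fields and a hypothetical `effective_local_ai_config` helper. It mirrors the behavior described for `config_with_recommended_tier_if_unselected` in this PR but is not that function; in particular, applying the recommended tier on sufficient-RAM devices is an assumption inferred from the function's name.

```rust
pub const MIN_RAM_GB_FOR_LOCAL_AI: u64 = 8;

/// Simplified, hypothetical view of the local AI part of the config.
#[derive(Debug, Clone, PartialEq)]
pub struct LocalAiConfig {
    pub enabled: bool,
    pub selected_tier: Option<String>,
}

/// Hypothetical helper: computes the config that bootstrap would act on.
pub fn effective_local_ai_config(
    config: &LocalAiConfig,
    total_ram_gb: u64,
    recommended_tier: &str,
) -> LocalAiConfig {
    let mut effective = config.clone();

    // An explicit tier selection is always honored (user override path).
    if effective.selected_tier.is_some() {
        return effective;
    }

    // No explicit selection and RAM below the floor: default to cloud fallback.
    if total_ram_gb < MIN_RAM_GB_FOR_LOCAL_AI {
        effective.enabled = false;
        return effective;
    }

    // Assumed behavior on sufficient RAM: fall back to the recommended tier.
    effective.selected_tier = Some(recommended_tier.to_string());
    effective
}
```

With no tier selected, a 4 GB device ends up with `enabled = false`; the same device with an explicit selection keeps its config unchanged.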

Summary by CodeRabbit

  • New Features
    • Added device RAM detection to automatically recommend cloud AI fallback on systems below 8GB RAM.
    • Added "Skip — use cloud AI instead" button during local AI onboarding.
    • When local AI is not recommended, users can still opt to "Use local AI anyway" to override the recommendation.

- Introduced a new `draft_sent` field in the `StreamingState` struct to track when a draft message has been posted, decoupling the existence of a draft from the ability to edit it.
- Updated the `flush_streaming_edit` function to set `draft_sent` upon posting a draft, ensuring that duplicate messages are not sent if the backend fails to return an ID.
- Modified the `finalize_channel_reply` function to prevent sending a fresh message if a draft was already posted without an ID, improving user experience by avoiding duplicate bubbles.

These changes enhance the robustness of message handling during streaming edits, ensuring a smoother interaction for users.
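
The draft-handling change described above can be sketched as a small state machine. `StreamingState` and `draft_sent` are names from the commit message; the `draft_id` field, the `FinalizeAction` enum, and both helper functions are hypothetical and only illustrate the rule "a posted draft without an ID must not trigger a fresh message".

```rust
/// Simplified view of the streaming state; `draft_id` is an assumed field name.
#[derive(Debug, Default)]
pub struct StreamingState {
    pub draft_id: Option<String>,
    /// Set once a draft has been posted, even if no ID came back.
    pub draft_sent: bool,
}

/// Hypothetical outcome of finalizing a channel reply.
#[derive(Debug, PartialEq)]
pub enum FinalizeAction {
    /// A draft exists and is editable: update it in place.
    EditDraft(String),
    /// A draft was posted but cannot be edited: do not send a second bubble.
    SkipFreshSend,
    /// Nothing was posted yet: send the reply as a new message.
    SendFresh,
}

/// Hypothetical helper: records the result of posting a draft.
pub fn record_draft_post(state: &mut StreamingState, returned_id: Option<String>) {
    state.draft_sent = true;
    if returned_id.is_some() {
        state.draft_id = returned_id;
    }
}

/// Hypothetical helper: decides what finalization should do.
pub fn finalize_action(state: &StreamingState) -> FinalizeAction {
    match (&state.draft_id, state.draft_sent) {
        (Some(id), _) => FinalizeAction::EditDraft(id.clone()),
        (None, true) => FinalizeAction::SkipFreshSend,
        (None, false) => FinalizeAction::SendFresh,
    }
}
```

Keeping `draft_sent` separate from `draft_id` is what lets the middle case exist: the draft is known to be visible even though it cannot be edited.
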
- Added logic to determine if the device is below the RAM threshold, defaulting to a cloud AI fallback if necessary.
- Introduced a new UI for low-RAM devices, informing users about the cloud mode and providing options to continue with cloud AI or force-enable local AI.
- Updated the `ensureRecommendedLocalAiPresetIfNeeded` function to return a `recommend_disabled` flag based on device RAM.
- Enhanced tests to cover new cloud fallback UI and local AI consent handling.

These changes improve the onboarding experience by providing clearer options based on device capabilities.

coderabbitai (Bot, Contributor) commented Apr 16, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: de52dd2e-6d8a-46b4-bf58-a6d8c62ec373

📥 Commits

Reviewing files that changed from the base of the PR and between 3c63864 and 482c24d.

📒 Files selected for processing (10)
  • app/src/components/BottomTabBar.tsx
  • app/src/components/settings/panels/LocalModelPanel.tsx
  • app/src/components/settings/panels/local-model/DeviceCapabilitySection.tsx
  • app/src/pages/onboarding/steps/LocalAIStep.tsx
  • app/src/pages/onboarding/steps/__tests__/LocalAIStep.test.tsx
  • app/src/utils/localAiBootstrap.ts
  • app/src/utils/tauriCommands/localAi.ts
  • src/openhuman/local_ai/presets.rs
  • src/openhuman/local_ai/schemas.rs
  • src/openhuman/local_ai/service/bootstrap.rs
📝 Walkthrough


This change introduces a RAM-based recommendation for Local AI onboarding. When a device has less than 8 GB of RAM, the backend recommends disabling local AI, the frontend shows a cloud fallback UI, and the user can still opt in to local AI. Backend services compare device RAM against the threshold and return a recommend_disabled flag; the onboarding component probes that flag on mount and renders the matching UI path.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Backend RAM Detection & Schema: src/openhuman/local_ai/presets.rs, src/openhuman/local_ai/schemas.rs | Added MIN_RAM_GB_FOR_LOCAL_AI constant (8 GB), plus device_supports_local_ai() and should_default_to_cloud_fallback() functions to evaluate device RAM. Schema handler now computes and includes recommend_disabled flag in preset responses. |
| Bootstrap Service Configuration: src/openhuman/local_ai/service/bootstrap.rs | Enhanced config_with_recommended_tier_if_unselected to short-circuit when device RAM is below threshold and no tier is explicitly selected, disabling local AI and logging device RAM state. Refactored tests to use shared test_device(ram_gb) helper and added low-RAM coverage. |
| Frontend Type & Utility Layer: app/src/utils/tauriCommands/localAi.ts, app/src/utils/localAiBootstrap.ts | Extended PresetsResponse with optional recommend_disabled field. Updated ensureRecommendedLocalAiPresetIfNeeded to detect disabled recommendations and return null tier state without applying presets. |
| Onboarding Component & Tests: app/src/pages/onboarding/steps/LocalAIStep.tsx, app/src/pages/onboarding/steps/__tests__/LocalAIStep.test.tsx | Added state and effect hook to probe preset availability on mount. Introduced conditional UI: when recommendations disabled, renders cloud-mode fallback with "Continue with Cloud" and "Use local AI anyway" options; otherwise shows original local-AI UI with "Skip" option. Extended mock and added test cases for low-RAM behavior. |

Sequence Diagram

sequenceDiagram
    participant Component as LocalAIStep<br/>(Frontend)
    participant Util as ensureRecommended<br/>LocalAiPresetIfNeeded
    participant Backend as Backend Service<br/>(Presets/Bootstrap)
    participant Device as Device<br/>Profile

    Component->>Component: Component mounts
    Component->>Util: Call ensureRecommendedLocalAiPresetIfNeeded()
    Util->>Backend: Request preset recommendations
    Backend->>Device: Check device.total_ram_gb()
    Device-->>Backend: Return RAM (e.g., 4 GB or 16 GB)
    Backend->>Backend: Evaluate RAM >= 8 GB threshold
    Backend-->>Util: Return {recommend_disabled: true/false, ...}
    Util-->>Component: Resolve with preset state
    Component->>Component: Update recommendDisabled state
    alt Recommend Disabled (RAM < 8 GB)
        Component->>Component: Render Cloud Fallback UI
        Note over Component: Show "Continue with Cloud"<br/>+ "Use local AI anyway"
    else Recommend Enabled (RAM ≥ 8 GB)
        Component->>Component: Render Local AI UI
        Note over Component: Show "Continue with Local AI"<br/>+ "Skip — use cloud AI instead"
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • Al629176

Poem

🐰 A bunny hops through code so bright,
RAM checks guide the onboarding flight,
Eight gigs or bust—a threshold fair,
Cloud falls back when memory's spare,
But users still may choose their way, 🌥️✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: introducing a soft 8GB RAM floor for local AI, defaulting to cloud fallback on lower-RAM devices while preserving user override capability. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%. |


…option

- Remove MVP ceiling that clamped selection to the 2-4 GB tier, in bootstrap
  and in the apply_preset RPC — users can now pick any tier.
- Add special "disabled" tier string to apply_preset that toggles
  config.local_ai.enabled = false so the app uses the cloud summarizer.
- Presets RPC now returns local_ai_enabled so the UI can render the
  currently-active state correctly.
- Rewrite DeviceCapabilitySection: tiers are now clickable buttons that
  call apply_preset; adds a "Disabled — Cloud fallback" card at the top
  marked "Recommended" on low-RAM devices and "Active" when enabled=false.
- Remove the MVP message copy from the model tier panel.
- Remove unread-count badge from the bottom tab bar.
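
A minimal sketch of the special "disabled" tier described in the list above, assuming a simplified config struct. The tier string `"disabled"` and the `config.local_ai.enabled = false` effect come from the commit message; the `LocalAiConfig` shape, the `DISABLED_TIER` constant, and the choices to keep the previous tier when disabling and to re-enable local AI when a real tier is picked are assumptions, not confirmed behavior of the `apply_preset` RPC.

```rust
/// Special tier string from the commit message.
pub const DISABLED_TIER: &str = "disabled";

/// Simplified, hypothetical view of the local AI part of the config.
#[derive(Debug, Clone, PartialEq)]
pub struct LocalAiConfig {
    pub enabled: bool,
    pub selected_tier: Option<String>,
}

/// Hypothetical preset application: either switch to the cloud fallback
/// or record the chosen tier.
pub fn apply_preset(config: &mut LocalAiConfig, tier: &str) {
    if tier == DISABLED_TIER {
        // Cloud fallback: turn local AI off. Keeping the previously
        // selected tier is an assumption of this sketch.
        config.enabled = false;
    } else {
        // Any tier may be picked now that the MVP ceiling is gone.
        // Re-enabling local AI here is also an assumption.
        config.enabled = true;
        config.selected_tier = Some(tier.to_string());
    }
}
```

In this sketch, calling `apply_preset(&mut cfg, "disabled")` flips only the `enabled` flag, so a later tier selection can switch local AI back on.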

coderabbitai (Bot, Contributor) left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
src/openhuman/local_ai/schemas.rs (1)

762-770: Update the controller schema metadata for the new response field.

handle_local_ai_presets now returns recommend_disabled, but schemas("local_ai_presets") still documents only presets/recommended/current tier. That leaves generated docs and schema consumers behind the actual API shape.

As per coding guidelines, "Add concise rustdoc / code comments where the flow is not obvious; update AGENTS.md, architecture docs, or feature docs when repository rules or user-visible behavior change".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/local_ai/schemas.rs` around lines 762 - 770, The schema
metadata for the "local_ai_presets" response needs to be updated to include the
new boolean field recommend_disabled; locate the schemas registration for
"local_ai_presets" in src/openhuman/local_ai/schemas.rs (the function or map
that builds schemas("local_ai_presets")) and add a concise schema entry and
rustdoc describing recommend_disabled (boolean, true when the device is below
the RAM floor and the cloud fallback is recommended instead of local AI)
alongside existing entries for presets/recommended/current/selected_tier/device,
and update any example responses or documentation comments in the same file so
generated docs and consumers reflect the new field.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/pages/onboarding/steps/LocalAIStep.tsx`:
- Around line 21-35: The effect currently calls
ensureRecommendedLocalAiPresetIfNeeded('[LocalAIStep:probe]') which
mutates/saves presets on mount; change it to a read-only presets probe (call the
non-mutating fetch/get function that returns presets instead of
ensureRecommendedLocalAiPresetIfNeeded), use its result to setRecommendDisabled
(and preserve the existing cancelled cleanup and error fallback), and remove any
code that writes or persists presets from this useEffect so that actual preset
application remains only inside handleConsent (keep handleConsent as the place
that calls ensureRecommendedLocalAiPresetIfNeeded or other mutating/apply
logic).

In `@app/src/utils/localAiBootstrap.ts`:
- Around line 73-86: The short-circuit on presets.recommend_disabled returns
without persisting a tier, which prevents the explicit low-RAM opt-in from being
recorded and causes config_with_recommended_tier_if_unselected() to keep
disabling local AI; instead, when recommend_disabled is true but the user has
explicitly opted-in, persist the chosen tier before bootstrapping by setting
selectedTier to recommendedTier (and hadSelectedTier = true / appliedTier as
appropriate) or split the probe path from the mutating apply path so the
read-only probe still returns without mutating but the opt-in flow calls the
persist/apply logic; update the logic around recommend_disabled in the function
that returns {presets, recommendedTier, selectedTier, hadSelectedTier,
appliedTier} and ensure bootstrapLocalAiWithRecommendedPreset() sees the
persisted selectedTier so config_with_recommended_tier_if_unselected() will not
revert the choice.

In `@src/openhuman/local_ai/service/bootstrap.rs`:
- Around line 284-288: Change the tracing::info! call that logs
device.total_ram_gb() vs MIN_RAM_GB_FOR_LOCAL_AI to tracing::debug! (keep the
same structured fields and message), i.e., replace the info-level checkpoint in
bootstrap.rs that references device.total_ram_gb() and
crate::openhuman::local_ai::presets::MIN_RAM_GB_FOR_LOCAL_AI with a debug-level
trace so this development-flow diagnostic follows the repo logging policy.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e5817e15-9b02-411e-8fff-c047261e04b2

📥 Commits

Reviewing files that changed from the base of the PR and between c26ca79 and 3c63864.

📒 Files selected for processing (7)
  • app/src/pages/onboarding/steps/LocalAIStep.tsx
  • app/src/pages/onboarding/steps/__tests__/LocalAIStep.test.tsx
  • app/src/utils/localAiBootstrap.ts
  • app/src/utils/tauriCommands/localAi.ts
  • src/openhuman/local_ai/presets.rs
  • src/openhuman/local_ai/schemas.rs
  • src/openhuman/local_ai/service/bootstrap.rs

Comment thread app/src/pages/onboarding/steps/LocalAIStep.tsx
Comment thread app/src/utils/localAiBootstrap.ts Outdated
Comment thread src/openhuman/local_ai/service/bootstrap.rs Outdated
senamakel and others added 5 commits April 16, 2026 10:09
…et handling

- Removed the `ensureRecommendedLocalAiPresetIfNeeded` function call from `LocalAIStep`, replacing it with `openhumanLocalAiPresets` to directly fetch preset information.
- Updated the logic in `ensureRecommendedLocalAiPresetIfNeeded` to ensure the recommended tier is applied correctly based on user consent, improving clarity in the local AI setup process.
- Enhanced tests to mock the new `openhumanLocalAiPresets` function, ensuring coverage for low-RAM device scenarios and local AI consent handling.
- Updated documentation in the `schemas.rs` file to reflect changes in the data structure returned by the presets RPC.

These changes improve the onboarding experience by providing a more direct and efficient method for managing local AI presets based on device capabilities.

- Refactored the mock for `openhumanLocalAiPresets` to enhance readability and maintainability.
- Ensured the mock returns consistent device information and preset details for testing scenarios.
- This change supports better test coverage and clarity in the LocalAIStep component's behavior during onboarding.

hadSelectedTier / selectedTier represent the incoming state ("was a tier
already selected before this call"), not the post-apply state. Setting
both to the just-applied tier broke the existing localAiBootstrap
contract and the unit test that encodes it.

The Rust-side persistence fix is independent: removing the
recommend_disabled short-circuit ensures openhumanLocalAiApplyPreset
actually writes the selected tier to disk on the opt-in path, so
config_with_recommended_tier_if_unselected() honors the user's choice.
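
The contract described in this commit message can be sketched as follows. The real helper is the TypeScript function `ensureRecommendedLocalAiPresetIfNeeded`; this is a Rust rendering for illustration only, and the `PresetOutcome` struct, the `ensure_recommended_preset` function, and the `apply` callback are hypothetical. The point it shows is that the "selected" fields report the state before the call, while the applied tier is reported separately.

```rust
/// Hypothetical result type mirroring the fields named in the review
/// (selectedTier, hadSelectedTier, appliedTier).
#[derive(Debug)]
pub struct PresetOutcome {
    /// Tier that was already selected before this call, if any.
    pub selected_tier: Option<String>,
    /// Whether a tier was already selected before this call.
    pub had_selected_tier: bool,
    /// Tier written by this call, if it applied one.
    pub applied_tier: Option<String>,
}

/// Hypothetical helper: applies the recommended tier only when none was
/// selected, and reports the incoming state unchanged.
pub fn ensure_recommended_preset(
    incoming_selected: Option<&str>,
    recommended: &str,
    mut apply: impl FnMut(&str),
) -> PresetOutcome {
    let had_selected_tier = incoming_selected.is_some();

    let applied_tier = if had_selected_tier {
        None
    } else {
        // Persist the tier so later bootstrap runs see an explicit choice.
        apply(recommended);
        Some(recommended.to_string())
    };

    PresetOutcome {
        // Deliberately NOT overwritten with the just-applied tier.
        selected_tier: incoming_selected.map(str::to_string),
        had_selected_tier,
        applied_tier,
    }
}
```

Callers that need the post-apply state read `applied_tier`; callers that ask "had the user already chosen?" read `had_selected_tier`, which stays false on the first run even though a tier was just written.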
@senamakel senamakel merged commit 7db7140 into tinyhumansai:main Apr 16, 2026
8 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
…mansai#589)

* Enhance draft message handling in streaming edits

- Introduced a new `draft_sent` field in the `StreamingState` struct to track when a draft message has been posted, decoupling the existence of a draft from the ability to edit it.
- Updated the `flush_streaming_edit` function to set `draft_sent` upon posting a draft, ensuring that duplicate messages are not sent if the backend fails to return an ID.
- Modified the `finalize_channel_reply` function to prevent sending a fresh message if a draft was already posted without an ID, improving user experience by avoiding duplicate bubbles.

These changes enhance the robustness of message handling during streaming edits, ensuring a smoother interaction for users.

* feat(onboarding): enhance LocalAIStep for low-RAM device handling

- Added logic to determine if the device is below the RAM threshold, defaulting to a cloud AI fallback if necessary.
- Introduced a new UI for low-RAM devices, informing users about the cloud mode and providing options to continue with cloud AI or force-enable local AI.
- Updated the `ensureRecommendedLocalAiPresetIfNeeded` function to return a `recommend_disabled` flag based on device RAM.
- Enhanced tests to cover new cloud fallback UI and local AI consent handling.

These changes improve the onboarding experience by providing clearer options based on device capabilities.

* style: apply prettier formatting to LocalAIStep

* feat(local_ai): unlock model selection + add Disabled/cloud fallback option

- Remove MVP ceiling that clamped selection to the 2-4 GB tier, in bootstrap
  and in the apply_preset RPC — users can now pick any tier.
- Add special "disabled" tier string to apply_preset that toggles
  config.local_ai.enabled = false so the app uses the cloud summarizer.
- Presets RPC now returns local_ai_enabled so the UI can render the
  currently-active state correctly.
- Rewrite DeviceCapabilitySection: tiers are now clickable buttons that
  call apply_preset; adds a "Disabled — Cloud fallback" card at the top
  marked "Recommended" on low-RAM devices and "Active" when enabled=false.
- Remove the MVP message copy from the model tier panel.
- Remove unread-count badge from the bottom tab bar.

* refactor(onboarding): streamline LocalAIStep and update local AI preset handling

- Removed the `ensureRecommendedLocalAiPresetIfNeeded` function call from `LocalAIStep`, replacing it with `openhumanLocalAiPresets` to directly fetch preset information.
- Updated the logic in `ensureRecommendedLocalAiPresetIfNeeded` to ensure the recommended tier is applied correctly based on user consent, improving clarity in the local AI setup process.
- Enhanced tests to mock the new `openhumanLocalAiPresets` function, ensuring coverage for low-RAM device scenarios and local AI consent handling.
- Updated documentation in the `schemas.rs` file to reflect changes in the data structure returned by the presets RPC.

These changes improve the onboarding experience by providing a more direct and efficient method for managing local AI presets based on device capabilities.

* test(LocalAIStep): improve mock implementation for local AI presets

- Refactored the mock for `openhumanLocalAiPresets` to enhance readability and maintainability.
- Ensured the mock returns consistent device information and preset details for testing scenarios.
- This change supports better test coverage and clarity in the LocalAIStep component's behavior during onboarding.

* fix(local_ai): preserve hadSelectedTier semantics in bootstrap return

hadSelectedTier / selectedTier represent the incoming state ("was a tier
already selected before this call"), not the post-apply state. Setting
both to the just-applied tier broke the existing localAiBootstrap
contract and the unit test that encodes it.

The Rust-side persistence fix is independent: removing the
recommend_disabled short-circuit ensures openhumanLocalAiApplyPreset
actually writes the selected tier to disk on the opt-in path, so
config_with_recommended_tier_if_unselected() honors the user's choice.