fix(local-ai): cap local models to 1b preset by senamakel · Pull Request #1029 · tinyhumansai/openhuman

senamakel · 2026-04-29T18:37:52Z

Summary

Restrict local AI preset selection to the single 1B Gemma preset in the core presets RPC and apply-preset path.
Change local AI defaults and allowlist fallbacks to gemma3:1b-it-qat, with vision disabled and lightweight embeddings.
Keep GUI preset/bootstrap expectations aligned with the new single-preset behavior.
Add regression tests that verify only the 1B preset is exposed and larger tiers are rejected.

Problem

The app still exposed larger local model tiers even though the 1B model is sufficient for local summarization.
Larger models increase battery and resource usage, and they created a mismatch between the intended lightweight local-AI experience and what the GUI/core allowed.
Some core defaults still pointed at larger local models, so config/bootstrap behavior was not fully aligned with the intended cap.

Solution

Update local AI config defaults and model ID fallbacks to prefer the 1B Gemma preset and lightweight embeddings.
Return only MVP-allowed presets from local_ai_presets, and reject non-allowed larger tiers in local_ai_apply_preset and env-tier overrides.
Adjust the preset metadata copy so the GUI presents the 1B preset as the only local model choice.
Update app and Rust tests to assert the new ram_2_4gb / 1B-only behavior.
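
The gating described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `MVP_MAX_TIER`, `is_mvp_allowed`, `ram_2_4gb`, and `gemma3:1b-it-qat` come from the PR text, while the larger tier names, the `Preset` struct, and the `all_presets` / `mvp_presets` helpers are hypothetical.

```rust
// Sketch only. Names other than `MVP_MAX_TIER`, `is_mvp_allowed`, `ram_2_4gb`
// and `gemma3:1b-it-qat` are assumptions made for illustration.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelTier {
    Ram2To4Gb,
    Ram4To8Gb,  // hypothetical larger tier
    Ram8To16Gb, // hypothetical larger tier
    Custom,
}

/// The largest tier this build is allowed to expose.
pub const MVP_MAX_TIER: ModelTier = ModelTier::Ram2To4Gb;

impl ModelTier {
    /// True only for the single tier permitted in this build.
    pub fn is_mvp_allowed(self) -> bool {
        self == MVP_MAX_TIER
    }

    /// Wire name of the tier (only `ram_2_4gb` is confirmed by the PR).
    pub fn as_str(self) -> &'static str {
        match self {
            ModelTier::Ram2To4Gb => "ram_2_4gb",
            ModelTier::Ram4To8Gb => "ram_4_8gb",
            ModelTier::Ram8To16Gb => "ram_8_16gb",
            ModelTier::Custom => "custom",
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Preset {
    pub tier: ModelTier,
    pub chat_model_id: &'static str,
}

/// Full preset table as it might exist internally.
/// The larger entries use placeholder model names.
pub fn all_presets() -> Vec<Preset> {
    vec![
        Preset { tier: ModelTier::Ram2To4Gb, chat_model_id: "gemma3:1b-it-qat" },
        Preset { tier: ModelTier::Ram4To8Gb, chat_model_id: "placeholder-medium-model" },
        Preset { tier: ModelTier::Ram8To16Gb, chat_model_id: "placeholder-large-model" },
    ]
}

/// What a presets RPC would return: only tiers allowed in this build.
pub fn mvp_presets() -> Vec<Preset> {
    all_presets()
        .into_iter()
        .filter(|p| p.tier.is_mvp_allowed())
        .collect()
}
```

Filtering at the point where presets are listed, rather than deleting the larger entries, would let a later build widen the cap by changing a single constant. Whether the PR keeps the larger entries around is not stated here.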

Submission Checklist

Unit tests — Vitest (app/) and cargo test (core) for the changed local AI preset/bootstrap paths
E2E / integration — Not added; this change is configuration/preset gating and was covered with targeted core RPC-facing tests plus app unit tests
Doc comments — Updated local AI preset/model comments where the build-level cap changed
Inline comments — Existing comments cover the non-obvious gating/allowlist behavior; no new inline comments were needed beyond that

Impact

Desktop local AI now consistently defaults to the 1B text-only preset when enabled.
The GUI no longer offers larger local-model tiers, and the core rejects attempts to opt into them through the preset RPC or OPENHUMAN_LOCAL_AI_TIER.
This reduces battery/runtime cost for local summarization and removes drift between UI and backend behavior.
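
As a rough illustration of the `OPENHUMAN_LOCAL_AI_TIER` behavior, the sketch below evaluates an override value and either applies it or ignores it with a reason that would be logged as a warning. Only the variable name, `ModelTier::Custom`, `is_mvp_allowed`, and `ram_2_4gb` come from the PR. The `TierOverride` type, the `parse` helper, the `ram_4_8gb` tier name, and the message wording are assumptions.

```rust
// Sketch only: `TierOverride`, `parse`, `ram_4_8gb` and the message texts are hypothetical.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelTier {
    Ram2To4Gb,
    Ram4To8Gb, // hypothetical larger tier
    Custom,
}

impl ModelTier {
    /// Parse a tier name, ignoring surrounding whitespace and letter case.
    pub fn parse(raw: &str) -> Option<Self> {
        match raw.trim().to_ascii_lowercase().as_str() {
            "ram_2_4gb" => Some(Self::Ram2To4Gb),
            "ram_4_8gb" => Some(Self::Ram4To8Gb),
            "custom" => Some(Self::Custom),
            _ => None,
        }
    }

    pub fn is_mvp_allowed(self) -> bool {
        matches!(self, Self::Ram2To4Gb)
    }
}

/// Outcome of inspecting the override value.
#[derive(Debug, PartialEq, Eq)]
pub enum TierOverride {
    /// The preset for this tier should be applied.
    Apply(ModelTier),
    /// The override is dropped; the string is the reason to log as a warning.
    Ignored(String),
}

/// Pure decision function, kept separate from the env lookup so it is easy to test.
pub fn evaluate_tier_override(raw: &str) -> TierOverride {
    match ModelTier::parse(raw) {
        None => TierOverride::Ignored(format!("unrecognized tier '{}'", raw.trim())),
        Some(ModelTier::Custom) => {
            TierOverride::Ignored("custom tier cannot be selected via env".to_string())
        }
        Some(tier) if !tier.is_mvp_allowed() => {
            TierOverride::Ignored(format!("tier {:?} is not allowed in this build", tier))
        }
        Some(tier) => TierOverride::Apply(tier),
    }
}

/// Reads the real environment variable; returns None when it is unset.
pub fn tier_override_from_env() -> Option<TierOverride> {
    std::env::var("OPENHUMAN_LOCAL_AI_TIER")
        .ok()
        .map(|raw| evaluate_tier_override(&raw))
}
```

With this shape, setting the variable to a larger tier leaves the defaults untouched instead of failing config load, which matches the "warning logs" wording in the review summary but is otherwise a guess about the actual control flow.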

Summary by CodeRabbit

Configuration Updates
- Local AI now defaults to the lighter 1B Gemma 3 model for improved performance
- Vision disabled by default
- Embedding model switched to all-minilm for efficiency
- Preset system simplified to a single MVP-optimized 1B tier
Behavior Changes
- Unsupported or custom model tiers are blocked in this build and rejected with a clear error message
Tests
- Unit and e2e tests updated to expect the single MVP 1B preset and tier behavior

coderabbitai · 2026-04-29T18:38:06Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 792e750a-ef76-415c-a1aa-4b6fe1bbfcd2

📥 Commits

Reviewing files that changed from the base of the PR and between 34d2aec and 5585c25.

📒 Files selected for processing (1)

tests/json_rpc_e2e.rs

📝 Walkthrough

The PR restricts local AI presets to a single MVP tier (ram_2_4gb), updates default model IDs to lighter choices, and adds runtime and env-override validation to reject custom or non-MVP tiers; tests, schemas, and handler responses are updated accordingly.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Configuration Validation | `src/openhuman/config/schema/load.rs` | Env override parsing now rejects `ModelTier::Custom` and any tier not allowed for this build (`!tier.is_mvp_allowed()`); only MVP-allowed tiers trigger preset application; warning logs updated. |
| Default Model Identifiers | `src/openhuman/config/schema/local_ai.rs`, `src/openhuman/local_ai/model_ids.rs` | Default/fallback model IDs changed: chat → `gemma3:1b-it-qat`, vision → `""` (disabled), embedding → `all-minilm:latest`. |
| Preset System & Tier Management | `src/openhuman/local_ai/presets.rs` | Preset exposure constrained to a single MVP tier (`ram_2_4gb`); `recommend_tier` returns `MVP_MAX_TIER`; preset metadata updated to reflect Gemma 3 1B and disabled vision; tests adjusted. |
| API Schemas & Handlers | `src/openhuman/local_ai/schemas.rs` | `local_ai_apply_preset` schema now advertises only `ram_2_4gb`/`disabled`; handlers return MVP-only presets and reject non-MVP tiers at runtime with explicit error messages. |
| Tests & Frontend Mock | `app/src/utils/__tests__/localAiBootstrap.test.ts`, `src/openhuman/local_ai/schemas_tests.rs`, `tests/json_rpc_e2e.rs` | Mocks and tests updated to expect a single `ram_2_4gb` preset, reflect new model/quantization fields and applied tier, and assert handler rejection of unsupported larger tiers. |
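
To make the "Default Model Identifiers" row concrete, here is a small sketch of how the new defaults and a blank-means-fallback rule could look. The three model ID values are taken from the table above. The constant names, the `LocalAiModels` struct, and `resolve_model_id` are hypothetical and are not claimed to match `local_ai.rs` or `model_ids.rs`.

```rust
// Sketch only: the ID values come from the PR; all names here are assumptions.

pub const DEFAULT_CHAT_MODEL_ID: &str = "gemma3:1b-it-qat";
pub const DEFAULT_VISION_MODEL_ID: &str = ""; // empty string means vision is disabled
pub const DEFAULT_EMBEDDING_MODEL_ID: &str = "all-minilm:latest";

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LocalAiModels {
    pub chat_model_id: String,
    pub vision_model_id: String,
    pub embedding_model_id: String,
}

impl Default for LocalAiModels {
    fn default() -> Self {
        Self {
            chat_model_id: DEFAULT_CHAT_MODEL_ID.to_string(),
            vision_model_id: DEFAULT_VISION_MODEL_ID.to_string(),
            embedding_model_id: DEFAULT_EMBEDDING_MODEL_ID.to_string(),
        }
    }
}

impl LocalAiModels {
    /// Vision counts as enabled only when a non-blank model ID is configured.
    pub fn vision_enabled(&self) -> bool {
        !self.vision_model_id.trim().is_empty()
    }
}

/// Use the configured ID unless it is missing or blank, in which case
/// fall back to the given default.
pub fn resolve_model_id(configured: Option<&str>, fallback: &str) -> String {
    match configured.map(str::trim) {
        Some(id) if !id.is_empty() => id.to_string(),
        _ => fallback.to_string(),
    }
}
```

Representing "vision disabled" as an empty ID follows the `""` value shown in the table. A dedicated `Option<String>` or boolean flag would be an alternative design, but nothing in the PR text indicates which the code base uses beyond the empty default.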

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API_Handler as "Local AI Handler"
    participant Presets as "Presets Module"
    participant Config as "Config Loader/Applier"

    Client->>API_Handler: POST /local_ai/apply_preset { tier: "ram_2_4gb" }
    API_Handler->>Presets: validate_tier(tier)
    Presets-->>API_Handler: OK (tier is MVP-allowed)
    API_Handler->>Config: apply_preset_to_config(tier)
    Config->>Config: reject if ModelTier::Custom or !is_mvp_allowed()
    Config-->>API_Handler: applied config / result
    API_Handler-->>Client: 200 OK (appliedTier: "ram_2_4gb")
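
The flow in the diagram can be approximated in a few lines. The sketch below borrows the labels `validate_tier` and `apply_preset_to_config` from the diagram, but their signatures, the `LocalAiConfig` and `ApplyPresetResponse` types, the error wording, and the handling of `disabled` are all assumptions rather than the handler's real code.

```rust
// Sketch only: types, signatures and messages are hypothetical.

#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct LocalAiConfig {
    pub enabled: bool,
    pub selected_tier: Option<String>,
    pub chat_model_id: String,
}

#[derive(Debug, PartialEq, Eq)]
pub struct ApplyPresetResponse {
    pub applied_tier: String,
}

/// Accept only the values the schema advertises: `ram_2_4gb` or `disabled`.
pub fn validate_tier(tier: &str) -> Result<(), String> {
    match tier {
        "ram_2_4gb" | "disabled" => Ok(()),
        other => Err(format!(
            "tier '{}' is not available in this build; expected 'ram_2_4gb' or 'disabled'",
            other
        )),
    }
}

/// Mutate the config for an already validated tier.
pub fn apply_preset_to_config(config: &mut LocalAiConfig, tier: &str) {
    if tier == "disabled" {
        // Assumption: disabling clears the selected tier.
        config.enabled = false;
        config.selected_tier = None;
    } else {
        config.enabled = true;
        config.selected_tier = Some(tier.to_string());
        config.chat_model_id = "gemma3:1b-it-qat".to_string();
    }
}

/// Handler entry point: validate first so a rejected tier never touches the config.
pub fn apply_preset(
    config: &mut LocalAiConfig,
    tier: &str,
) -> Result<ApplyPresetResponse, String> {
    validate_tier(tier)?;
    apply_preset_to_config(config, tier);
    Ok(ApplyPresetResponse { applied_tier: tier.to_string() })
}
```

Validating before mutating means a request for a larger tier returns an error and leaves the stored config exactly as it was, which is the property the regression tests in this PR are described as checking.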

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Local AI: unified preset bootstrap, first-run Home flow, and resilient model pulls #304: Overlaps local_ai preset/tier surface changes and selected_tier/application bootstrap flow.
Add RAM-tiered local AI presets #425: Modifies model defaults and preset/model-id resolution paths that intersect with these updates.
feat(local_ai): default to cloud fallback on <8GB RAM devices #589: Alters presets and schema logic around which tiers are exposed and validation helpers used here.

Poem

🐰 I hopped through code on silent paws,

Picked one small tier with gentle laws,
Gemma 1B, vision tucked away,
Light and snug to save the day,
Ram_2_4gb—my cozy play.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: restricting local AI models to a 1B preset, which is the core objective across all modified files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 95.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

🧹 Nitpick comments (1)

src/openhuman/local_ai/presets.rs (1)
197-203: Consider removing unused computation.

The ram_gb value is computed on line 199, but the recommendation logic no longer uses it and always returns MVP_MAX_TIER. It still adds context to the debug log, so keeping it is harmless; if RAM-based recommendation is not expected to come back, the binding can be inlined into the log call.
♻️ Optional: Inline the log if RAM-based logic is permanently removed
 pub fn recommend_tier(device: &DeviceProfile) -> ModelTier {
-    let ram_gb = device.total_ram_gb();
     let tier = MVP_MAX_TIER;
-    tracing::debug!(ram_gb, ?tier, "[local_ai] recommended model tier");
+    tracing::debug!(ram_gb = device.total_ram_gb(), ?tier, "[local_ai] recommended model tier");
     tier
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/local_ai/presets.rs` around lines 197 - 203, The function
recommend_tier computes ram_gb from DeviceProfile but never uses it and always
returns MVP_MAX_TIER; either remove the unused ram_gb binding and its inclusion
in the tracing::debug call or restore RAM-based logic—update the function
recommend_tier to either drop ram_gb and simplify the debug message to not
reference ram_gb, or implement the intended tier selection using ram_gb so the
variable is used before returning MVP_MAX_TIER.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 961fd556-69d9-473b-9762-acd11c229993

📥 Commits

Reviewing files that changed from the base of the PR and between 74074cd and e0c68f0.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (7)

app/src/utils/__tests__/localAiBootstrap.test.ts
src/openhuman/config/schema/load.rs
src/openhuman/config/schema/local_ai.rs
src/openhuman/local_ai/model_ids.rs
src/openhuman/local_ai/presets.rs
src/openhuman/local_ai/schemas.rs
src/openhuman/local_ai/schemas_tests.rs

fix(local-ai): cap local models to 1b preset

e0c68f0

senamakel requested a review from a team April 29, 2026 18:37

coderabbitai Bot reviewed Apr 29, 2026

senamakel and others added 3 commits April 29, 2026 15:37

Merge branch 'main' into fix/models

34d2aec

Merge branch 'main' into pr/1029

b298a48

test(local-ai): align json_rpc_e2e presets test with 1B-only MVP cap

5585c25

senamakel merged commit ac97a96 into tinyhumansai:main Apr 30, 2026
13 of 15 checks passed

coderabbitai Bot mentioned this pull request May 6, 2026

feat(local-ai): off by default, scoped usage flags, faster auto-updates #1263

Merged

12 tasks

coderabbitai Bot mentioned this pull request May 13, 2026

feat(local_ai): unify memory embeddings — cloud Voyage default + live local toggle + dev infra #1640

Merged

12 tasks

AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026

fix(local-ai): cap local models to 1b preset (tinyhumansai#1029)

99e39e2

senamakel merged 4 commits into tinyhumansai:main from senamakel:fix/models