docs(local-ai): document local chat routing #2383

Merged: graycyrus merged 3 commits into tinyhumansai:main from aqilaziz:codex/2375-local-chat-reasoning-docs on May 22, 2026

@aqilaziz aqilaziz commented May 21, 2026

Summary

  • update Local AI docs to stop claiming chat/reasoning must stay cloud
  • document chat_provider = ollama: and reasoning_provider = ollama: as explicit opt-ins
  • clarify that heavy hints stay cloud unless the matching workload provider is routed locally

Refs #2375

Testing

  • git diff --check

Summary by CodeRabbit

  • Documentation
    • Clarified Local AI is opt-in and controlled via workload provider settings
    • Added explicit guidance for routing Chat and Reasoning to local providers
    • Explained lightweight/medium vs heavy workload routing behavior, cloud defaults, and local-runtime reachability requirement
    • Showed unified provider-field pattern (e.g., chat_provider, reasoning_provider) with a TOML example
    • Documented legacy flag behavior and migration compatibility for older opt-in booleans

@aqilaziz aqilaziz requested a review from a team May 21, 2026 00:43

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d6ad12da-1460-41fd-8ef6-58f044df7f19

📥 Commits

Reviewing files that changed from the base of the PR and between 09218fa and 0c98da6.

📒 Files selected for processing (1)
  • gitbooks/features/model-routing/local-ai.md

📝 Walkthrough

The Local AI docs are revised to state local AI is opt-in, add Chat and Reasoning as routable workloads, clarify cloud-default behavior, refine lightweight/medium vs heavy routing heuristics, and document unified *_provider configuration with a TOML example.

Changes

Local AI routing opt-in and provider configuration

| Layer / File(s) | Summary |
| --- | --- |
| Intro and local workloads table (`gitbooks/features/model-routing/local-ai.md`) | Local AI intro and "What runs local when you turn it on" table mark local execution as opt-in and add on-device Chat and Reasoning. |
| Cloud-defaults and router heuristics (`gitbooks/features/model-routing/local-ai.md`) | Clarifies what stays in the cloud by default and refines router behavior: lightweight/medium hints may prefer local only when `local_ai.runtime_enabled` is true and provider reachable; heavy hints default to cloud unless explicitly routed. |
| Unified `*_provider` fields and TOML example (`gitbooks/features/model-routing/local-ai.md`) | Documents unified workload `*_provider` fields (e.g., `chat_provider`, `reasoning_provider`), shows TOML example, and states cloud-default behavior for unset/blank/cloud/non-local values plus provider-health dependency. |
| Legacy flag behavior and migration notes (`gitbooks/features/model-routing/local-ai.md`) | Adds "Legacy flag behavior": `local_ai.usage.*` booleans are used only during preset application/initial migration and do not override `*_provider` routing after migration; updates remaining sections to match unified routing framing. |
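
The routing rules summarized above can be sketched in a few lines. This is only an illustration in Python, not the project's Rust router: `RouterConfig`, `route_hint`, `is_local`, and the `local_reachable` argument are hypothetical names, and treating any `ollama:`-prefixed value as "local" is a simplifying assumption. The sketch also assumes heavy hints need a reachable local provider, which the walkthrough implies but does not spell out. Only the hint names, `local_ai.runtime_enabled`, and the `*_provider` field names come from this PR.

```python
from dataclasses import dataclass, field

# Hint names as listed in the review discussion of this PR.
LIGHT_OR_MEDIUM_HINTS = {
    "hint:reaction", "hint:classify", "hint:format", "hint:sentiment",
    "hint:summarize", "hint:medium", "hint:tool_lite",
}
# Heavy hints, mapped to the workload provider field that must opt in.
HEAVY_HINT_FIELDS = {
    "hint:reasoning": "reasoning_provider",
    "hint:agentic": "agentic_provider",
    "hint:coding": "coding_provider",
}


@dataclass
class RouterConfig:  # hypothetical stand-in for the real config
    runtime_enabled: bool = False  # mirrors local_ai.runtime_enabled
    providers: dict = field(default_factory=dict)  # e.g. {"chat_provider": "ollama:..."}


def is_local(provider):
    """Assumption: a local provider string starts with 'ollama:'."""
    return bool(provider) and provider.strip().lower().startswith("ollama:")


def route_hint(hint, cfg, local_reachable):
    """Return 'local' or 'cloud' for a hint under the documented defaults."""
    if not local_reachable:
        return "cloud"  # an unreachable local runtime always falls back to cloud
    if hint in LIGHT_OR_MEDIUM_HINTS:
        return "local" if cfg.runtime_enabled else "cloud"
    field_name = HEAVY_HINT_FIELDS.get(hint)
    if field_name and is_local(cfg.providers.get(field_name)):
        return "local"  # heavy hint explicitly routed to a local provider
    return "cloud"  # heavy and unknown hints stay cloud by default
```

With `runtime_enabled=True` and no provider fields set, `route_hint("hint:summarize", cfg, True)` yields `"local"` while `route_hint("hint:reasoning", cfg, True)` stays `"cloud"`.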

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through docs where routes are penned,
Opt-in paths for chat and thought ascend,
TOML maps the way — ollama, if you plea,
Cloud waits by default until you set it free.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'docs(local-ai): document local chat routing' accurately describes the main change: expanding Local AI documentation to cover local chat routing with explicit opt-in configuration. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
gitbooks/features/model-routing/local-ai.md (2)

39-39: 💤 Low value

Consider clarifying "when local AI is enabled".

The phrase "when local AI is enabled" could be more precise. Does it mean local_ai.runtime_enabled = true, or does it require specific local_ai.usage.* flags to be set? Given the dual configuration mechanism (usage flags and provider fields), explicit criteria would reduce ambiguity.

That said, the linked router reference may already provide this detail.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gitbooks/features/model-routing/local-ai.md` at line 39, Clarify the
ambiguous phrase "when local AI is enabled" by specifying the exact
configuration checks the router uses: state that the router picks a local
provider only if local_ai.runtime_enabled is true and the relevant
local_ai.usage.* flag for the hint category (or the corresponding provider
field) allows that workload, and note that Ollama (or the configured local
provider) must be reachable; update the sentence referencing the router to
explicitly mention these config keys (local_ai.runtime_enabled, local_ai.usage.*
and provider fields) so readers know which flags must be set for the local
preference to apply.

23-24: 💤 Low value

Consider aligning the Implementation column with other rows.

The Chat and Reasoning rows show configuration field syntax in the Implementation column, while other rows reference actual source file paths. For consistency, consider either:

  • Referencing the router/provider code that handles these workloads (e.g., src/openhuman/providers/router.rs or similar)
  • Or adding a note that routing is config-driven and pointing to the Opting in section

The configuration syntax is already well-documented in lines 71-78.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gitbooks/features/model-routing/local-ai.md` around lines 23 - 24, The
Implementation column for the Chat and Reasoning rows should be made consistent
with other rows: instead of just showing the config syntax
(Config::chat_provider and Config::reasoning_provider), update the cell to
either reference the module/function that performs routing for those workloads
(the router/provider implementation that dispatches to local models) or add a
short note pointing readers to the "Opting in" section that explains
config-driven routing; locate the routing code by searching for the
router/provider implementation that handles chat/reasoning and update the
markdown to mention that symbol or the "Opting in" section accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@gitbooks/features/model-routing/local-ai.md`:
- Around line 71-78: Explain in the doc how the usage flags relate to the provider fields:
  - The boolean flags `local_ai.usage.embeddings`, `local_ai.usage.heartbeat`, `local_ai.usage.learning_reflection`, and `local_ai.usage.subconscious` are feature toggles that enable/disable use of local AI for those capability groups.
  - The provider fields (`embeddings_provider`, `heartbeat_provider`, `learning_provider`, `subconscious_provider`, and the other `*_provider` fields such as `agentic_provider`, `coding_provider`, `memory_provider`) select the specific runtime when the corresponding feature is enabled.
  - State the precedence: if a usage flag is false, the local provider is ignored even if a `*_provider` is set; if the usage flag is true and the `*_provider` is unset/blank or set to "cloud", the cloud/default route is used; setting a provider (e.g., embeddings_provider="ollama:...") with the usage flag true routes that workload on-device.
  - Give a short recommendation to either set both (enable flag + provider) or just set the flag to false to force cloud.
  - Add a brief sentence listing/pointing out `agentic_provider`, `coding_provider`, `memory_provider` as additional capability-specific providers, with a link or reference to the detailed provider docs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1a03e392-b534-4ae7-b0e7-6302aac5d4ab

📥 Commits

Reviewing files that changed from the base of the PR and between 369a392 and 1038d75.

📒 Files selected for processing (1)
  • gitbooks/features/model-routing/local-ai.md

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (3)
gitbooks/features/model-routing/local-ai.md (3)

78-80: ⚡ Quick win

Complete the precedence documentation for legacy flags vs provider fields.

Lines 78-80 state that *_provider fields are the source of truth and legacy flags don't override them after migration, which partially addresses the past review comment. However, the detailed precedence rules are still not fully explained.

Consider adding a brief clarification of the interaction:

  • When is a legacy usage.* flag consulted?
  • What happens if local_ai.usage.embeddings = false but embeddings_provider = "ollama:all-minilm"?
  • Should users set both or just one?
📋 Suggested addition after line 80

Add a subsection that clarifies the precedence:

### Legacy flag behavior

The `local_ai.usage.*` booleans (e.g., `local_ai.usage.embeddings`) are only used during preset application and initial migration. After migration, the corresponding `*_provider` field (e.g., `embeddings_provider`) becomes the sole routing control:
- If `embeddings_provider = "ollama:all-minilm"` → routes to Ollama regardless of the legacy flag
- If `embeddings_provider` is unset, blank, or `"cloud"` → routes to cloud regardless of the legacy flag

**Recommendation**: Use the `*_provider` fields directly in your configuration. The legacy `usage.*` flags are maintained for backward compatibility with existing presets.
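
The precedence proposed in this suggestion can be sketched as a small resolver in which the legacy flag plays no part. This is a minimal Python illustration under stated assumptions, not the project's Rust logic: `resolve_route` is a hypothetical helper, any `ollama:`-prefixed value is treated as local, and falling back to cloud when the model name after `ollama:` is empty is an assumption of this sketch.

```python
def resolve_route(provider, legacy_usage_flag=None):
    """Return ("local", model) or ("cloud", None) for one workload.

    `legacy_usage_flag` is accepted only to show that it is ignored
    once the `*_provider` field is the routing control.
    """
    value = (provider or "").strip()
    if value.lower().startswith("ollama:"):
        # Split once so model tags that contain ':' stay intact.
        model = value.split(":", 1)[1]
        if model:
            return ("local", model)
    # Unset, blank, "cloud", or any non-local value routes to cloud.
    return ("cloud", None)
```

For example, `resolve_route("ollama:all-minilm", legacy_usage_flag=False)` returns `("local", "all-minilm")`, while `resolve_route("cloud", legacy_usage_flag=True)` returns `("cloud", None)`.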

Based on learnings from past review: The relationship between local_ai.usage.* flags and *_provider fields needs clarification including precedence rules and user guidance.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gitbooks/features/model-routing/local-ai.md` around lines 78 - 80, Add a
concise "Legacy flag behavior" subsection after the existing paragraph
clarifying that Config::workload_local_model (and related routing logic) treats
the `*_provider` fields as the definitive source of truth once migration/presets
complete; explain that `local_ai.usage.*` booleans are only consulted during
preset application/initial migration and do not override provider fields
afterwards, give two explicit examples (e.g., `embeddings_provider =
"ollama:all-minilm"` routes on-device regardless of `local_ai.usage.embeddings =
false`, and an unset/blank/"cloud" `*_provider` forces cloud routing regardless
of legacy flags), and add a short user recommendation to prefer setting
`*_provider` fields directly.

39-39: ⚡ Quick win

Consider breaking the routing logic into separate sentences for clarity.

This sentence describes two distinct routing behaviors (lightweight/medium vs heavy hints) with multiple conditions in a single statement. Breaking it into two or three sentences would improve readability.

📝 Suggested restructure
-For **lightweight or medium chat hints** (`hint:reaction`, `hint:classify`, `hint:format`, `hint:sentiment`, `hint:summarize`, `hint:medium`, `hint:tool_lite`), the [router](README.md) can prefer the local provider only when `local_ai.runtime_enabled = true` and the configured local provider is reachable. Heavy hints (`hint:reasoning`, `hint:agentic`, `hint:coding`) stay cloud unless the matching workload provider field is explicitly configured locally.
+For **lightweight or medium chat hints** (`hint:reaction`, `hint:classify`, `hint:format`, `hint:sentiment`, `hint:summarize`, `hint:medium`, `hint:tool_lite`), the [router](README.md) can prefer the local provider when `local_ai.runtime_enabled = true` and the configured local provider is reachable.
+
+**Heavy hints** (`hint:reasoning`, `hint:agentic`, `hint:coding`) always stay cloud by default. They only route locally when the matching workload provider field (e.g., `reasoning_provider`, `agentic_provider`, `coding_provider`) is explicitly configured with a local provider string.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gitbooks/features/model-routing/local-ai.md` at line 39, Split the long
sentence into two clearer sentences: one that states routing for
lightweight/medium chat hints (list the hint tokens: hint:reaction,
hint:classify, hint:format, hint:sentiment, hint:summarize, hint:medium,
hint:tool_lite) is allowed to prefer the local provider only when
local_ai.runtime_enabled = true and the configured local provider is reachable,
and a second sentence that states heavy hints (hint:reasoning, hint:agentic,
hint:coding) remain cloud by default unless the matching workload provider field
is explicitly configured locally; keep the same terminology (“router”, “local
provider reachable”, and “matching workload provider field”) so the revised
lines in local-ai.md are easier to scan and understand.

88-90: 💤 Low value

Optional: Vary sentence structure to avoid repetition.

Three consecutive bullet points begin with "You," which creates a slightly repetitive rhythm. Consider varying the structure for one or two of them.

✏️ Possible variations
 - You ingest large volumes of email / chat and want **embeddings to never leave the machine**.
 - You want **summary-tree building** to work offline.
-- You're privacy-sensitive about background reflection ("subconscious") loops.
+- **Background reflection** ("subconscious") loops require privacy-sensitive handling.

or

-- You ingest large volumes of email / chat and want **embeddings to never leave the machine**.
+- **Large volumes of email / chat** ingestion where embeddings should never leave the machine.
 - You want **summary-tree building** to work offline.
 - You're privacy-sensitive about background reflection ("subconscious") loops.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@gitbooks/features/model-routing/local-ai.md` around lines 88 - 90, The three
consecutive bullets all start with "You," creating repetition; change one or two
bullet leads to vary sentence structure - e.g., reword "You ingest large volumes
of email / chat and want **embeddings to never leave the machine**" to "Keep
embeddings local when ingesting large volumes of email/chat," or change "You
want **summary-tree building** to work offline" to "Enable summary‑tree building
to run offline," and/or convert "You're privacy‑sensitive about background
reflection ("subconscious") loops" to "Privacy‑sensitive users may want to
disable background reflection ("subconscious") loops" so the list reads less
repetitive while preserving meaning.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6e10e340-4ff9-4fc9-b989-d76cba5911da

📥 Commits

Reviewing files that changed from the base of the PR and between 1038d75 and 09218fa.

📒 Files selected for processing (1)
  • gitbooks/features/model-routing/local-ai.md

@graycyrus graycyrus left a comment

Looks good, nice work!

@graycyrus graycyrus merged commit 70ef427 into tinyhumansai:main May 22, 2026
27 checks passed
senamakel pushed a commit to aqilaziz/openhuman that referenced this pull request May 23, 2026
Co-authored-by: Aqil Aziz <aqilaziz@users.noreply.github.com>