Skip to content

feat(search): unified web-access firewall (fetch + browser) with tri-state Allowed-websites control + friendly block errors#2704

Merged
M3gA-Mind merged 10 commits into
tinyhumansai:mainfrom
sanil-23:feat/web-access-allowlist-settings
May 26, 2026
Merged

feat(search): unified web-access firewall (fetch + browser) with tri-state Allowed-websites control + friendly block errors#2704
M3gA-Mind merged 10 commits into
tinyhumansai:mainfrom
sanil-23:feat/web-access-allowlist-settings

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented May 26, 2026

Summary

Settings → Search gains a single "Allowed websites" control that governs every way the agent reaches the web on-disk, backed by [http_request].allowed_domains via the existing config.get/update_search_settings RPC:

  • Tri-state access mode — Allow all · Custom · Block all (segmented control). Replaces the earlier binary "Allow all" toggle; "Block all" is now a first-class, labelled state instead of the hidden "leave the box empty" path. Typed hosts are preserved when switching modes.
  • Unified fetch + browser firewall. The browser tool (browser_open + browser) now shares the same host list via a new tools::ops::browser_allowed_domains() helper instead of the now-deprecated [browser].allowed_domains. The "*" allow-all wildcard is stripped for the browser, so a fetch-side "Allow all" never silently opens a real Chromium (JS + logged-in sessions) to every site. Browser allow-all stays gated by OPENHUMAN_BROWSER_ALLOW_ALL + browser.enabledfail-safe: unifying can only narrow the browser's reach, never widen it.
  • Default ["*"] (allow all public sites) so web research works out of the box. The SSRF guard still blocks local/private hosts regardless. url_guard's matcher treats "*" as allow-all.
  • Friendly web_fetch/curl block errors that point the user/LLM to the setting instead of leaking config.toml/internal keys.
  • Web search is not gated by this list (it's controlled by the engine + API key only); the UI hint says so.
  • i18n synced to all locales — the new access-mode keys are added to every *-1 chunk file, not just English.

Problem

The agent's research flow uses web_fetch, gated by the SSRF-hardened host allowlist in url_guard. Two issues made this a wall:

  1. The allowlist had no UI and no RPC — editable only by hand in config.toml, and the default was empty (every fresh user's first web search failed).
  2. The block errors were dev-oriented; the subagent circuit-breaker then surfaced a generic "web search isn't available" with no path to fix it.

On top of that, OpenHuman had two separate web-access firewalls[http_request].allowed_domains (fetch/curl) and [browser].allowed_domains (the browser tool) — with opposite defaults and no shared UI, so "what can the agent reach on the web?" had no single answer. And url_guard's matcher had no * wildcard, so there was no clean "allow everything" option.

Solution

  • config/ops.rs + config/schemas.rs: extend SearchSettingsPatch / SearchSettingsUpdate and get_search_settings with allowed_domains + allow_all, reading/writing [http_request].allowed_domains. allow_all=true collapses to ["*"]; false drops the wildcard.
  • config/schema/tools.rs: HttpRequestConfig defaults allowed_domains to ["*"]; [browser].allowed_domains marked deprecated (still parsed, no longer gates navigation).
  • url_guard.rs: host_matches_allowlist treats "*" as allow-all (public hosts only — local/private still rejected); both block errors rewritten to name the host, cause, and exact fix.
  • tools/ops.rs: new browser_allowed_domains() helper feeds the browser tool the shared fetch list with "*" stripped; web_fetch/curl/http_request keep treating "*" as allow-all.
  • SearchPanel.tsx: the "Allowed websites" section is now a tri-state Allow all / Custom / Block all segmented control (the host editor + Save appear only in Custom). Hint copy clarifies the list covers fetch + browser, not search.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — added/updated tests target the changed lines: url_guard wildcard + SSRF, HttpRequestConfig default, apply_search_settings round-trip, browser_allowed_domains (shares fetch list minus "*"), and the rewritten SearchPanel suite (each mode + host-preservation). CI coverage.yml enforces the gate.
  • N/A: no docs/TEST-COVERAGE-MATRIX.md feature row maps to this Settings/allowlist control — flag in review if a row is wanted.
  • N/A: no matrix feature IDs apply to this change.
  • No new external network dependencies introduced (this change governs outbound access; it adds none).
  • N/A: does not touch release-cut surfaces in docs/RELEASE-MANUAL-SMOKE.md.
  • N/A: no tracking issue — feature requested directly; nothing to auto-close.

Impact

  • Desktop (core + React). Behavior change: web_fetch/curl is allowed by default (["*"]) where it was effectively blocked; local/private hosts remain blocked by the SSRF guard. The browser tool now derives its allowlist from the same list (minus "*").
  • Security posture: one place to reason about the agent's web reach. The browser tool can only ever be narrowed by this unification — it never inherits fetch's blanket "*", and still requires browser.enabled + OPENHUMAN_BROWSER_ALLOW_ALL to go wide. Broader default outbound reach for fetch (public hosts only) is a deliberate zero-friction tradeoff that the new UI makes easy to scope down.

Related

  • Closes: N/A (no tracking issue)
  • Resolved follow-ups from the original PR scope: ✅ folded [browser].allowed_domains into the unified "Allowed websites" control; ✅ synced the new i18n keys to all non-English locale chunks.
  • Follow-up PR(s)/TODOs: remove the now-dead settings.search.allowAllAria i18n key (cosmetic).

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: feat/web-access-allowlist-settings
  • Commit SHA: 11c6e9f

Validation Run

  • pnpm exec prettier --check — clean on changed frontend files (full repo check skipped due to known CRLF drift).
  • pnpm typecheck — clean (tsc --noEmit, exit 0).
  • pnpm i18n:check — missing 0 / extra 0 / drifted 0 across all locales (new keys present in every chunk-1).
  • Focused tests: pnpm debug unit SearchPanel → 7 passed.
  • Rust: cargo check --manifest-path Cargo.toml --lib --tests — exit 0 (reused the local dev build's cached sys-crate artifacts in the shared target).
  • N/A: Tauri fmt/check — no app/src-tauri source changes (core + frontend only).

Validation Blocked

  • command: cargo test (run the new browser_allowed_domains unit test)
  • error: linking the Rust test binary needs the MSVC env (vcvars/SDK/linker) that the Windows dev wrapper sets up; the bare shell lacks it, and cargo check --tests (which passed) does not link.
  • impact: the browser_allowed_domains test compiles but was not executed locally — it's a pure deterministic filter, and CI runs it. Everything else above was verified locally.

Behavior Changes

  • Intended: single tri-state web-access control; browser tool shares the fetch allowlist (minus "*"); web_fetch/curl default to allow-all; friendlier block errors.
  • User-visible: Settings → Search now shows Allow all / Custom / Block all; web research works without config; clearer guidance when a host is blocked. No change to a vault/registration surface.

Parity Contract

  • Legacy preserved: exact-host + subdomain matching unchanged; SSRF blocklist (loopback/RFC1918/link-local/…) unchanged and still enforced even with "*". allow_all=false + empty list reproduces the old block-by-default behavior.
  • Browser parity: with "*" stripped, an explicit-host list applies to the browser exactly as before; the browser never gains blanket access from a fetch-side toggle, and browser.enabled + OPENHUMAN_BROWSER_ALLOW_ALL gates are untouched.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none
  • Canonical PR: this
  • Resolution: N/A

Summary by CodeRabbit

  • New Features

    • Website allowlist settings: three-mode selector (Allow all / Custom / Block all) with multiline domain editor and save control; browser and fetch use the unified allowlist.
  • Bug Fixes

    • Clearer "allowed websites" messages; defaults now allow public sites by default while still blocking local/private addresses.
  • Tests

    • Comprehensive unit and UI tests for loading, mode switching, editing, normalization, and persistence.
  • Documentation

    • Added translation strings for the new UI across many languages.

Review Change Stack

…k errors

- Add an "Allowed websites" section to Settings → Search: a host allowlist
  (one per line) plus an "Allow all sites" toggle, backed by
  [http_request].allowed_domains via config get/update_search_settings.
- Default the allowlist to ["*"] (allow all public sites) so web research
  works out of the box; the SSRF guard still blocks local/private hosts.
- url_guard: support "*" allow-all in the host matcher (mirrors the browser
  tool), and rewrite both block errors to be user/LLM-friendly — pointing to
  the new setting instead of leaking config.toml internals, so the agent can
  relay an actionable message instead of "web search isn't available".
- Tests: matcher wildcard (allows public hosts, still blocks localhost/
  private); HttpRequestConfig default = ["*"]; apply_search_settings
  round-trip (set/trim/dedupe list, allow_all on/off); SearchPanel UI
  (load list, toggle allow-all, save edited list, hide editor when allow-all).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 requested a review from a team May 26, 2026 17:08
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an "Allowed websites" allowlist end-to-end: RPC types and schema, backend normalization/persistence with "*" wildcard semantics and audit logging, HTTP defaults, browser allowlist derivation, URL validation/messages, frontend Settings UI + tests, and i18n keys.

Changes

Allowed Websites Allowlisting Feature

Layer / File(s) Summary
Type contracts and RPC schema
app/src/utils/tauriCommands/config.ts, src/openhuman/config/schemas.rs, src/openhuman/config/ops.rs
SearchSettingsUpdate/SearchSettings and SearchSettingsPatch expose allowed_domains and allow_all for RPC and controller handling.
Backend persistence and ops
src/openhuman/config/ops.rs, src/openhuman/config/ops_tests.rs
apply_search_settings normalizes allowed_domains (trim, drop blanks, sort, dedupe) and applies allow_all by setting ["*"] or removing it; get_search_settings reports allowed_domains and allow_all; tests added.
HTTP config defaults
src/openhuman/config/schema/tools.rs
HttpRequestConfig defaults allowed_domains to ["*"] via helper + manual Default impl; unit test asserts defaults.
Browser tool allowlist derivation
src/openhuman/tools/ops.rs, src/openhuman/tools/ops_tests.rs
Adds browser_allowed_domains(...) to derive browser navigation allowlist from HTTP allowed_domains by stripping "*", and wires it into browser tool registration; tests added.
URL validation with wildcard enforcement
src/openhuman/tools/impl/network/url_guard.rs, src/openhuman/tools/impl/network/http_request_tests.rs
host_matches_allowlist treats "*" as explicit allow-all while retaining SSRF/local-private blocking; error messages updated to "allowed websites"; tests adjusted/added.
Frontend UI state and rendering
app/src/components/settings/panels/SearchPanel.tsx, app/src/utils/tauriCommands/config.ts, app/src/lib/i18n/en.ts
SearchPanel adds AccessMode and allowedText state, one-time init from settings (excluding "*"), selectMode/persistAllowedDomains handlers, a radio group for access mode, textarea editor for custom mode, and save/persist wiring; controls disabled while saving.
Frontend UI tests
app/src/components/settings/panels/SearchPanel.test.tsx
Vitest/RTL tests verify initial mode derivation, toggles persisting allow_all or empty lists, domain-list saving with normalization, editor visibility rules, and editor-state retention across mode switches.
i18n translations
app/src/lib/i18n/chunks/*, app/src/lib/i18n/en.ts
Adds settings.search.allowedSites* and settings.search.access* translation keys across locale chunks for label, ARIA text, hint, placeholder, and save button.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant SearchPanel
  participant TauriRPC as Tauri RPC
  participant ConfigOps as Config Ops
  User->>SearchPanel: toggle allow mode / edit textarea / click Save
  SearchPanel->>TauriRPC: send SearchSettingsUpdate (allowed_domains?, allow_all?)
  TauriRPC->>ConfigOps: apply SearchSettingsPatch
  ConfigOps-->>TauriRPC: return updated SearchSettings (allowed_domains, allow_all)
  TauriRPC-->>SearchPanel: updated settings
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • graycyrus
  • senamakel
  • M3gA-Mind

Poem

🐰 A rabbit typed hosts, one per line,
Wildcard "*" made the public web fine.
Toggle and save, the settings obey,
SSRF guards still block the stray.
Hooray — the browse list hops into play!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: a unified web-access firewall with a tri-state control for allowed websites in search settings, covering the primary objectives and key features of the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team. labels May 26, 2026
The six `settings.search.allowedSites*` keys were only added to en-1.ts.
The i18n coverage gate requires en.ts (canonical English map) to match the
en-N chunks, and every locale to carry the same keys. Add them to en.ts and
mirror English-fallback values into all 12 locale chunks (via
scripts/i18n-mirror-missing.mjs). `pnpm i18n:check` now passes (missing/extra/
drift = 0). Real translations can be filled in later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/tools/impl/network/url_guard.rs (1)

607-619: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix stale assertion string after error-message rename.

This test still checks for "allowed_domains", but the validator now returns "allowed websites", so this assertion will fail.

Suggested fix
-                err.contains("allowed_domains"),
+                err.contains("allowed websites"),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/tools/impl/network/url_guard.rs` around lines 607 - 619, Test
ssrf_alternate_notations_rejected_by_validate_url fails because it asserts
err.contains("allowed_domains") but validate_url now returns "allowed websites";
update the assertion string to match the validator's current message. Locate the
test function ssrf_alternate_notations_rejected_by_validate_url and change the
expected substring in the assert! from "allowed_domains" to "allowed websites"
(or use a constant/error enum if available from validate_url) so the assertion
checks the current error text returned by validate_url when using the allow
vector.
🧹 Nitpick comments (1)
app/src/components/settings/panels/SearchPanel.tsx (1)

129-158: 💤 Low value

Consider frontend validation for domain format (optional UX improvement).

The persistAllowedDomains function accepts raw textarea input, splits by newline, trims, and filters empty lines. The backend normalizes and validates domains, but adding basic frontend validation (e.g., checking for invalid characters, spaces, schemes) could provide immediate user feedback and prevent round-trip errors.

Example validation pattern
 const persistAllowedDomains = async () => {
   if (!settings || status.kind === 'saving') return;
   const domains = allowedText
     .split('\n')
     .map(s => s.trim())
     .filter(Boolean);
+  // Optional: basic domain format check
+  const invalid = domains.find(d => /\s|:\/\//.test(d));
+  if (invalid) {
+    setStatus({ kind: 'error', message: `Invalid domain format: ${invalid}` });
+    return;
+  }
   setStatus({ kind: 'saving' });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/components/settings/panels/SearchPanel.tsx` around lines 129 - 158,
persistAllowedDomains currently sends whatever is in allowedText to the backend;
add client-side validation before calling openhumanUpdateSearchSettings: parse
allowedText into domains (as you're already doing), validate each entry for
basic domain format (no URL schemes like http:// or https://, no spaces or
paths, only allowed hostname characters and optional wildcard/punycode if
supported) using a simple regex, and if any domain is invalid call setStatus({
kind: 'error', message: 'Invalid domain: <domain>' }) and return early; only
call openhumanUpdateSearchSettings({ allowed_domains: domains, allow_all: false
}) when all domains pass validation and keep the existing status transitions and
refresh logic that uses openhumanGetSearchSettings and setSettings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/config/ops.rs`:
- Around line 1009-1028: Add structured, grep-friendly state-transition logs
around the allowlist update branches (the blocks handling update.allowed_domains
and update.allow_all) that record correlation fields (config path identifier),
the before and after counts of config.http_request.allowed_domains, and the
wildcard state (whether "*" is present) but do NOT log raw domain values; emit
one log before mutation and one after for each branch (use the project's
structured logger, e.g., info! or debug! with key=value pairs like
path="http_request.allowed_domains" before_count=?, after_count=?,
before_wildcard=?, after_wildcard=?, operation="set_domains" /
"toggle_allow_all") so reviewers can grep transitions without exposing domains.

---

Outside diff comments:
In `@src/openhuman/tools/impl/network/url_guard.rs`:
- Around line 607-619: Test ssrf_alternate_notations_rejected_by_validate_url
fails because it asserts err.contains("allowed_domains") but validate_url now
returns "allowed websites"; update the assertion string to match the validator's
current message. Locate the test function
ssrf_alternate_notations_rejected_by_validate_url and change the expected
substring in the assert! from "allowed_domains" to "allowed websites" (or use a
constant/error enum if available from validate_url) so the assertion checks the
current error text returned by validate_url when using the allow vector.

---

Nitpick comments:
In `@app/src/components/settings/panels/SearchPanel.tsx`:
- Around line 129-158: persistAllowedDomains currently sends whatever is in
allowedText to the backend; add client-side validation before calling
openhumanUpdateSearchSettings: parse allowedText into domains (as you're already
doing), validate each entry for basic domain format (no URL schemes like http://
or https://, no spaces or paths, only allowed hostname characters and optional
wildcard/punycode if supported) using a simple regex, and if any domain is
invalid call setStatus({ kind: 'error', message: 'Invalid domain: <domain>' })
and return early; only call openhumanUpdateSearchSettings({ allowed_domains:
domains, allow_all: false }) when all domains pass validation and keep the
existing status transitions and refresh logic that uses
openhumanGetSearchSettings and setSettings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 327af0c8-385a-439b-980d-7f88a14628c8

📥 Commits

Reviewing files that changed from the base of the PR and between 87f8ef4 and 8de264c.

📒 Files selected for processing (10)
  • app/src/components/settings/panels/SearchPanel.test.tsx
  • app/src/components/settings/panels/SearchPanel.tsx
  • app/src/lib/i18n/chunks/en-1.ts
  • app/src/utils/tauriCommands/config.ts
  • src/openhuman/config/ops.rs
  • src/openhuman/config/ops_tests.rs
  • src/openhuman/config/schema/tools.rs
  • src/openhuman/config/schemas.rs
  • src/openhuman/tools/impl/network/http_request_tests.rs
  • src/openhuman/tools/impl/network/url_guard.rs

Comment thread src/openhuman/config/ops.rs
The i18n-mirror-missing.mjs output wasn't Prettier-formatted, failing the
Type Check workflow's format:check step. Run prettier --write on the 12
locale -1.ts chunks (only the newly mirrored keys reflow).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/lib/i18n/chunks/ar-1.ts`:
- Around line 1322-1327: The new translation entries (e.g.
"settings.search.allowedSitesLabel", "settings.search.allowAllAria",
"settings.search.allowedSitesHint", "settings.search.allowedSitesAllOn",
"settings.search.allowedSitesPlaceholder", "settings.search.allowedSitesSave")
are not formatted according to Prettier and fail CI; run Prettier on the file
(app/src/lib/i18n/chunks/ar-1.ts) or run your repo's formatting script (e.g.
prettier --write or npm run format) to normalize spacing, commas and quotes,
then re-commit the formatted file.

In `@app/src/lib/i18n/chunks/bn-1.ts`:
- Around line 1332-1337: The new translation keys
("settings.search.allowedSitesLabel", "settings.search.allowAllAria",
"settings.search.allowedSitesHint", "settings.search.allowedSitesAllOn",
"settings.search.allowedSitesPlaceholder", "settings.search.allowedSitesSave")
are not Prettier-formatted; run Prettier on the file containing these keys
(app/src/lib/i18n/chunks/bn-1.ts) or format the object to match project Prettier
rules (spacing, trailing commas, consistent quotes) and then re-stage/commit the
result (or run prettier --write . and update the commit) so the hunk passes
prettier --check.

In `@app/src/lib/i18n/chunks/es-1.ts`:
- Around line 1347-1352: The new i18n block is unformatted and failing prettier
checks; run Prettier on app/src/lib/i18n/chunks/es-1.ts (or format just the
added keys) so the entries like "settings.search.allowedSitesLabel",
"settings.search.allowedSitesHint", "settings.search.allowedSitesPlaceholder",
and "settings.search.allowedSitesSave" follow the project's Prettier style
(consistent spacing, trailing commas, and line breaks) and then re-run prettier
--check before committing.

In `@app/src/lib/i18n/chunks/fr-1.ts`:
- Around line 1352-1357: The new translation block with keys like
"settings.search.allowedSitesLabel", "settings.search.allowedSitesHint",
"settings.search.allowedSitesAllOn", "settings.search.allowedSitesPlaceholder",
and "settings.search.allowedSitesSave" is not formatted to the repo’s Prettier
config; run Prettier (e.g. prettier --write or your repo's format script) on
app/src/lib/i18n/chunks/fr-1.ts or reformat that block so it matches the
project's Prettier output and then re-run prettier --check before pushing.

In `@app/src/lib/i18n/chunks/hi-1.ts`:
- Around line 1330-1335: The added translations block (keys like
"settings.search.allowedSitesLabel", "settings.search.allowedSitesPlaceholder",
"settings.search.allowedSitesSave", etc.) is not formatted to project Prettier
rules; run Prettier over this file or reformat this hunk to match the repo style
(e.g., wrap/align quotes, commas and indentation) so that `prettier --check`
passes, then stage the formatted hi-1.ts changes before merging.

In `@app/src/lib/i18n/chunks/id-1.ts`:
- Around line 1335-1340: The added translation entries (e.g.,
"settings.search.allowedSitesLabel", "settings.search.allowedSitesHint",
"settings.search.allowedSitesPlaceholder", "settings.search.allowedSitesSave")
are not formatted per the repo Prettier rules; reformat that block (or the whole
file) using the repository Prettier config (e.g., run the project's format
script or `prettier --write` for ts/tsx/js/jsx) so the hunk matches CI
formatting before merging.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c152e314-8fd3-419f-9848-19d62acb3443

📥 Commits

Reviewing files that changed from the base of the PR and between 8de264c and 4761d97.

📒 Files selected for processing (13)
  • app/src/lib/i18n/chunks/ar-1.ts
  • app/src/lib/i18n/chunks/bn-1.ts
  • app/src/lib/i18n/chunks/de-1.ts
  • app/src/lib/i18n/chunks/es-1.ts
  • app/src/lib/i18n/chunks/fr-1.ts
  • app/src/lib/i18n/chunks/hi-1.ts
  • app/src/lib/i18n/chunks/id-1.ts
  • app/src/lib/i18n/chunks/it-1.ts
  • app/src/lib/i18n/chunks/ko-1.ts
  • app/src/lib/i18n/chunks/pt-1.ts
  • app/src/lib/i18n/chunks/ru-1.ts
  • app/src/lib/i18n/chunks/zh-CN-1.ts
  • app/src/lib/i18n/en.ts
✅ Files skipped from review due to trivial changes (6)
  • app/src/lib/i18n/chunks/pt-1.ts
  • app/src/lib/i18n/chunks/zh-CN-1.ts
  • app/src/lib/i18n/chunks/de-1.ts
  • app/src/lib/i18n/chunks/it-1.ts
  • app/src/lib/i18n/chunks/ko-1.ts
  • app/src/lib/i18n/chunks/ru-1.ts

Comment thread app/src/lib/i18n/chunks/ar-1.ts Outdated
Comment thread app/src/lib/i18n/chunks/bn-1.ts Outdated
Comment thread app/src/lib/i18n/chunks/es-1.ts Outdated
Comment thread app/src/lib/i18n/chunks/fr-1.ts Outdated
Comment thread app/src/lib/i18n/chunks/hi-1.ts Outdated
Comment thread app/src/lib/i18n/chunks/id-1.ts Outdated
The friendly block errors pointed to "Settings → Search → Allowed websites",
but the panel lives under Developer Options. Correct to the real nav path:
Settings → Developer Options → Search engine → Allowed websites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 marked this pull request as draft May 26, 2026 17:29
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/tools/impl/network/url_guard.rs (1)

60-65: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update the stale allowlist assertion in the same file.

This wording no longer includes "allowed_domains", so ssrf_alternate_notations_rejected_by_validate_url at Lines 607-619 will now fail even though the rejection is correct. Please switch that assertion to "allowed websites" or another stable signal.

🧪 Suggested follow-up
-            assert!(
-                err.contains("allowed_domains"),
-                "Expected allowlist rejection for {notation}, got: {err}"
-            );
+            assert!(
+                err.contains("allowed websites"),
+                "Expected allowlist rejection for {notation}, got: {err}"
+            );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/tools/impl/network/url_guard.rs` around lines 60 - 65, The test
failure is caused by the error message in the host rejection path using a stale
token; update the error string emitted when host_matches_allowlist(&host,
allowed_domains) fails so it contains the stable phrase "allowed websites" (or
similar stable wording) instead of the old "allowed_domains" token so
ssrf_alternate_notations_rejected_by_validate_url can match; locate the
rejection in url_guard.rs (the block that calls anyhow::bail! when
host_matches_allowlist returns false) and modify the message text to include
"allowed websites" and preserve the existing guidance about Settings → Developer
Options → Search engine → Allowed websites.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/openhuman/tools/impl/network/url_guard.rs`:
- Around line 60-65: The test failure is caused by the error message in the host
rejection path using a stale token; update the error string emitted when
host_matches_allowlist(&host, allowed_domains) fails so it contains the stable
phrase "allowed websites" (or similar stable wording) instead of the old
"allowed_domains" token so ssrf_alternate_notations_rejected_by_validate_url can
match; locate the rejection in url_guard.rs (the block that calls anyhow::bail!
when host_matches_allowlist returns false) and modify the message text to
include "allowed websites" and preserve the existing guidance about Settings →
Developer Options → Search engine → Allowed websites.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3c5d26af-a4b9-4299-ae1f-d8470ce9ae7b

📥 Commits

Reviewing files that changed from the base of the PR and between 228b533 and 19d3222.

📒 Files selected for processing (1)
  • src/openhuman/tools/impl/network/url_guard.rs

…ebsites control

Settings → Search "Allowed websites" now exposes the access policy as a
three-way choice — Allow all | Custom | Block all — instead of a binary
"Allow all" toggle whose only "block everything" path was an empty box.
Block-all is now a first-class, labelled state.

Under the hood the two web-access firewalls are unified onto a single host
list (http_request.allowed_domains):

- The browser tool (browser_open + browser) now reads that shared list via
  a new tools::ops::browser_allowed_domains() helper, instead of the
  now-deprecated [browser].allowed_domains. The "*" allow-all wildcard is
  stripped for the browser, so a fetch-side "Allow all" never silently
  opens a real Chromium (with JS + logged-in sessions) to every site.
  Browser allow-all stays gated by OPENHUMAN_BROWSER_ALLOW_ALL and
  browser.enabled — net effect is fail-safe: unifying can only narrow the
  browser's reach, never widen it.
- web_fetch / curl / http_request keep treating "*" as allow-all.
- Web search is intentionally NOT gated by this list (hint copy updated).

i18n: add accessModeAria / accessAllowAll / accessCustom / accessBlockAll /
accessBlockAllHint across en.ts and all locale chunk-1 files.

Tests: rewrite SearchPanel.test.tsx for the three modes (incl. preserving
typed hosts across mode switches); add a Rust unit test for
browser_allowed_domains (shares the fetch list minus "*").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 changed the title feat(search): manage web-access allowlist in Settings + friendly block errors feat(search): unified web-access firewall (fetch + browser) with tri-state Allowed-websites control + friendly block errors May 26, 2026
@sanil-23 sanil-23 marked this pull request as ready for review May 26, 2026 18:08
…tations

Addresses CodeRabbit review on tinyhumansai#2704: emit a structured state-transition
log when [http_request].allowed_domains is updated via the search-settings
RPC. Records before/after host counts + the allow-all wildcard flag only —
never the raw hosts (redaction rule), matching the repo's debug-logging policy.

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good architecture — unifying two separate web-access firewalls into one settings-backed control is the right move, and the fail-safe design (browser never inherits "*" from fetch via browser_allowed_domains()) is solid. The initializedRef pattern in the frontend prevents settings refreshes from clobbering in-progress host edits. Tests cover all tri-state transitions on both sides.

Blockers:

  1. CI is red. Frontend Unit Tests and Rust Core Tests + Quality are both failing on the latest commit. Need green before this can move forward.
  2. Verify the settings navigation path in the url_guard block-error messages — see inline comment.

Minor notes (not blocking):

  • Non-English i18n chunks still carry the old hint: "Leave empty to block all web access" — now misleading since "Block all" is an explicit toggle. Worth syncing the fallback text in a follow-up.
  • Orphaned settings.search.allowAllAria key — already noted in the PR description for follow-up, 👍.
Area Files Verdict
Rust config ops.rs, schemas.rs, schema/tools.rs, ops_tests.rs ✓ clean — defaults, RPC, round-trip test
Rust tools tools/ops.rs, tools/ops_tests.rs browser_allowed_domains well-tested
Rust url_guard url_guard.rs, http_request_tests.rs wildcard + SSRF ✓ — error-message path needs check
Frontend SearchPanel.tsx, .test.tsx, config.ts ✓ solid
i18n 13 chunks + en.ts keys synced, hint-text drift in non-EN (minor)

CodeRabbit dedup: skipped transition-log suggestion (ops.rs:1028) and Prettier formatting (addressed in 228b533).

"Network tool is enabled but no allowed_domains are configured. Add [http_request].allowed_domains in config.toml"
"I'm not allowed to open any website yet — the allowed websites list is empty. \
Add the site you need (or turn on \"Allow all sites\") under \
Settings → Developer Options → Search engine → Allowed websites, then ask me again."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] The hardcoded path Settings → Developer Options → Search engine → Allowed websites appears in both block-error messages (here and line 65). Please verify it matches the actual UI navigation — from the SearchPanel component it looks like the control might live under Settings → Search → Allowed websites instead.

If the path is wrong, users will hunt for a setting that doesn't exist at that location. Worth a quick manual check since this is the primary user-facing guidance when a fetch gets blocked.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 3b10576. The path was wrong, though the accurate one is Settings → Advanced → Search engine → Allowed websites (not Settings → Search): SearchPanel is registered as the search entry inside DeveloperOptionsPanel, and that section renders as "Advanced" (settings.developerOptions = "Advanced"). Both block-error messages now point there.

Copy link
Copy Markdown
Contributor

@M3gA-Mind M3gA-Mind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall: well-structured PR with solid Rust + Vitest coverage and correct security semantics (browser can only be narrowed, never widened). Two issues below need fixing before merge; the rest are suggestions.

Comment on lines +48 to +50
"I'm not allowed to open any website yet — the allowed websites list is empty. \
Add the site you need (or turn on \"Allow all sites\") under \
Settings → Developer Options → Search engine → Allowed websites, then ask me again."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug — wrong settings path in error message.

This says Settings → Developer Options → Search engine → Allowed websites, but the new control lives at Settings → Search → Allowed websites (it's in SearchPanel, not under a "Developer Options" subtree). A user following this error won't find the setting.

Same issue on line 64.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 3b10576. The path was wrong, though the accurate one is Settings → Advanced → Search engine → Allowed websites (not Settings → Search): SearchPanel is registered as the search entry inside DeveloperOptionsPanel, and that section renders as "Advanced" (settings.developerOptions = "Advanced"). Both block-error messages now point there.

Comment on lines +62 to +64
"I'm not allowed to open '{host}' — it isn't in your allowed websites. \
Add it (or turn on \"Allow all sites\") under \
Settings → Developer Options → Search engine → Allowed websites, then ask me again."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same wrong path as the empty-list error above — Developer Options → Search engine doesn't match the actual panel location.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 3b10576. The path was wrong, though the accurate one is Settings → Advanced → Search engine → Allowed websites (not Settings → Search): SearchPanel is registered as the search entry inside DeveloperOptionsPanel, and that section renders as "Advanced" (settings.developerOptions = "Advanced"). Both block-error messages now point there.

Comment thread app/src/lib/i18n/en.ts Outdated
'settings.search.placeholderParallel': 'pk_...',
'settings.search.placeholderBrave': 'BSA...',
'settings.search.allowedSitesLabel': 'Allowed websites',
'settings.search.allowAllAria': 'Allow all sites',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead i18n keysettings.search.allowAllAria is never referenced in the UI. The segmented control uses settings.search.accessModeAria instead. The PR body even flags this as cleanup debt. It's also been added to all 13 non-English locale chunk files. Better to drop it here and not ship it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 3b10576 — removed settings.search.allowAllAria from en.ts and all 14 locale chunk files (UI uses accessModeAria). pnpm i18n:check stays green (parity 0/0/0).

Comment on lines +175 to +179
const persistAllowedDomains = () => {
const domains = allowedText
.split('\n')
.map(s => s.trim())
.filter(Boolean);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No hostname validation. split('\n').trim().filter(Boolean) doesn't check that entries look like hostnames. A user entering https://reuters.com or reuters.com/path will get an entry that never matches url_guard's host-based comparison. Consider stripping a leading https?:// and any path component, or at minimum showing an inline hint (e.g. the placeholder already does this well — a validation note in the hint copy for Custom mode would help).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 3b10576persistAllowedDomains now runs each entry through a normalizeAllowedHost helper that strips a leading scheme and any path/query/fragment (https://reuters.com/marketsreuters.com) before persisting, so pasted URLs match url_guard's host comparison. Added a unit test covering it.

curl and (when enabled) the browser tool. Web search is not
gated by this list. */}
<div className="rounded-xl border border-stone-200 dark:border-neutral-800 bg-white dark:bg-neutral-900 p-3 space-y-2">
<label className="text-xs font-semibold text-stone-700 dark:text-neutral-200">
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accessibility: <label> not associated with any input. This element has no htmlFor referencing an input's id, so screen readers will announce it as an orphan label. Since it acts as a section heading rather than a form label, a <p> or <span> with font-semibold would be semantically cleaner here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3b10576 — it's a section heading, so changed the orphan <label> (no htmlFor) to a <p> with the same classes.

Comment on lines +168 to +170
fn default_http_allowed_domains() -> Vec<String> {
vec!["*".to_string()]
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behavior change for existing users — worth calling out explicitly. Previously HttpRequestConfig had no allowed_domains default (empty = block-all). After this PR, any existing config.toml that omits the key will get ["*"] on next read, silently opening outbound web access.

The SSRF guard limits the blast radius to public hosts, and the new Settings UI makes it easy to lock down, so this tradeoff is reasonable — but it should appear as a note in the release changelog so users who relied on the restrictive default are aware.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — this is intentional and called out in the PR's Impact + Parity Contract sections (default flips to ["*"] = public hosts only; SSRF guard still blocks local/private; allow_all=false + empty list reproduces the old block-by-default). Agreed it warrants a changelog line at release-cut; flagging for the release notes rather than a code change here.

Comment on lines +1022 to +1028
if let Some(allow_all) = update.allow_all {
if allow_all {
config.http_request.allowed_domains = vec!["*".to_string()];
} else {
config.http_request.allowed_domains.retain(|d| d != "*");
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch field ordering: allow_all wins over allowed_domains in the same call. Because allow_all is evaluated after allowed_domains, a combined patch like { allowed_domains: ["github.com"], allow_all: true } silently discards the explicit list and writes ["*"]. The Rust docstring documents this, but SearchSettingsUpdate in config.ts doesn't. Worth a one-line note there so callers know not to send both fields together with conflicting intent.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 3b10576 — added a note to SearchSettingsUpdate in config.ts that allow_all is applied after allowed_domains server-side, so it wins in a combined patch and callers shouldn't send both with conflicting intent.

…i#2704)

main HEAD (87f8ef4) is red on checks unrelated to this feature; bundling
the fixes here per request so tinyhumansai#2704's CI can go green. Each is a logical
merge conflict (both source PRs green alone, broken in combination):

- config/ops_tests.rs: tinyhumansai#2631 grew AutonomySettingsPatch to 7 fields, but
  apply_autonomy_settings_updates_action_budget still used a bare 1-field
  literal (E0063). Add ..Default::default().
- loopbackOauthListener.test.ts: tinyhumansai#2690 changed DEFAULT_TIMEOUT_SECS 300->60,
  but the bind-fail test still asserted 300. Update to 60.

Verified: cargo check --tests compiles; loopbackOauthListener 9/9 pass.

Co-Authored-By: Claude <noreply@anthropic.com>
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 26, 2026
sanil-23 and others added 3 commits May 26, 2026 20:51
…error wording (tinyhumansai#2704)

This PR rewrote the fetch/curl block errors from the dev-oriented
"...allowed_domains..." to the user-facing "...allowed websites...".
Three pre-existing tests asserted on the old substring and only surfaced
once the unrelated main compile error (E0063) was cleared in this branch,
unmasking the Rust test run:
- curl::execute_rejects_allowlist_miss
- curl::validate_url_rejects_disallowed_domain
- url_guard::ssrf_alternate_notations_rejected_by_validate_url
Each now asserts "allowed websites". (browser_open errors use a separate
message path and still say allowed_domains — left unchanged.)

Co-Authored-By: Claude <noreply@anthropic.com>
…mansai#2704)

Two more tests surfaced once the E0063 compile error was cleared (the whole
Rust suite was masked on both this branch and main). Neither is touched by
this PR; both encode pre-summarization-v1-tier behavior and the new values
are the intended behavior, not regressions:

- memory::chat::build_chat_runtime_defaults_to_openhuman_resolved_model:
  build_chat_runtime resolves the "summarization" role, which now has a
  dedicated tier -> summarization-v1 (was the generic reasoning-v1 default).
- inference::provider::factory::make_openhuman_backend_forwards_unknown_hint_verbatim:
  hint:summarization is now a canonical hint (-> summarization-v1), so it is
  dropped from the unknown-hint verbatim-passthrough list.

Verified via the mock harness that the remaining bare-run failures (composio
*via_mock* + memory-singleton parallel race) pass with scripts/test-rust-with-mock.sh.

Co-Authored-By: Claude <noreply@anthropic.com>
…alization, a11y (tinyhumansai#2704)

Addresses graycyrus + M3gA-Mind review feedback:
- url_guard block errors pointed at "Settings → Developer Options → Search
  engine"; the control's actual visible path is "Settings → Advanced →
  Search engine → Allowed websites" (SearchPanel is nested under the Advanced
  menu — settings.developerOptions renders as "Advanced"). Both reviewers
  flagged the old path was wrong; corrected to the accurate path.
- Dropped dead i18n key settings.search.allowAllAria (UI uses accessModeAria)
  from en.ts + all 14 locale chunk files — clears the cleanup debt the PR body
  flagged and keeps i18n parity green.
- SearchPanel: normalize pasted entries to bare hosts (strip scheme + path)
  so "https://reuters.com/markets" matches url_guard's host comparison;
  added a unit test.
- SearchPanel: orphan <label> (no htmlFor) → <p> section heading (a11y).
- config.ts: documented that allow_all overrides allowed_domains in a combined
  patch.

Co-Authored-By: Claude <noreply@anthropic.com>
@M3gA-Mind M3gA-Mind merged commit 77c15cb into tinyhumansai:main May 26, 2026
28 of 30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature Net-new user-facing capability or product behavior. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants