security: harden gateway auth defaults and restrict auto-pair#123
security: harden gateway auth defaults and restrict auto-pair#123kakuteki wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
…VIDIA#117) - dangerouslyDisableDeviceAuth now defaults to false; only enabled when NEMOCLAW_DISABLE_DEVICE_AUTH=1 is explicitly set (removes the "temporary" bypass from commit 5fb629f that was never reverted) - allowInsecureAuth is now derived from the CHAT_UI_URL scheme: true for http (local dev), false for https (production/remote) - auto-pair watcher now only approves known client types (openclaw-control-ui, webchat) instead of unconditionally approving every pending device request
0516af4 to
6f603c7
Compare
📝 WalkthroughWalkthroughThe script now makes insecure authentication settings configurable via environment variables instead of hardcoded values, and introduces client/mode whitelisting for auto-pairing device approval to restrict which clients can self-register. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 145-151: The skipped entries are left in pending and reprocessed;
update the handling in the device validation branch (the checks using device,
clientId/clientMode, ALLOWED_CLIENTS and ALLOWED_MODES) to explicitly mark or
reject the request so it won't stay pending: call the existing rejection routine
(e.g. reject_request(requestId) or send a failure response) when device is not a
dict or when client_id/client_mode are not allowed, or alternatively record the
requestId into a handled_requests set and ensure the watcher filters
handled_requests out before the if pending: path; reference the variables
device, clientId/clientMode, client_id/client_mode, ALLOWED_CLIENTS,
ALLOWED_MODES, requestId and pending to locate where to insert the
rejection/marking logic.
- Around line 46-50: The current logic sets allow_insecure = parsed.scheme !=
'https', which enables insecure auth for malformed or unsupported schemes;
change this to only allow insecure when the parsed.scheme is explicitly 'http'
(e.g., allow_insecure = parsed.scheme == 'http'), and treat any other value
(empty or unknown scheme) as secure/closed (allow_insecure = False) or abort
early if you prefer stricter behavior; update the code that builds
gateway['controlUi'] (the allowInsecure and dangerouslyDisableDeviceAuth fields)
so only an explicit 'http' origin enables allowInsecure, referencing
disable_device_auth, parsed.scheme, gateway['controlUi'], allowInsecure, and
dangerouslyDisableDeviceAuth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5cc4f01e-c4ce-4b1b-b808-15ed22d53c8d
📒 Files selected for processing (1)
scripts/nemoclaw-start.sh
| disable_device_auth = os.environ.get('NEMOCLAW_DISABLE_DEVICE_AUTH', '') == '1' | ||
| allow_insecure = parsed.scheme != 'https' | ||
| gateway['controlUi'] = { | ||
| 'allowInsecureAuth': True, | ||
| 'dangerouslyDisableDeviceAuth': True, | ||
| 'allowInsecureAuth': allow_insecure, | ||
| 'dangerouslyDisableDeviceAuth': disable_device_auth, |
There was a problem hiding this comment.
Fail closed for invalid CHAT_UI_URL values.
allow_insecure = parsed.scheme != 'https' enables insecure auth for any non-HTTPS input, including malformed values like example.com or unsupported schemes. That weakens the new default unless the caller gets the URL exactly right. Only an explicit http://... origin should enable this path; everything else should disable insecure auth or abort early.
🔐 Suggested fix
chat_ui_url = os.environ.get('CHAT_UI_URL', 'http://127.0.0.1:18789')
parsed = urlparse(chat_ui_url)
-chat_origin = f"{parsed.scheme}://{parsed.netloc}" if parsed.scheme and parsed.netloc else 'http://127.0.0.1:18789'
local_origin = f'http://127.0.0.1:{os.environ.get("PUBLIC_PORT", "18789")}'
+if parsed.scheme in {'http', 'https'} and parsed.netloc:
+ chat_origin = f"{parsed.scheme}://{parsed.netloc}"
+ allow_insecure = parsed.scheme == 'http'
+else:
+ chat_origin = local_origin
+ allow_insecure = False
origins = [local_origin]
if chat_origin not in origins:
origins.append(chat_origin)
gateway = cfg.setdefault('gateway', {})
gateway['mode'] = 'local'
disable_device_auth = os.environ.get('NEMOCLAW_DISABLE_DEVICE_AUTH', '') == '1'
-allow_insecure = parsed.scheme != 'https'
gateway['controlUi'] = {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/nemoclaw-start.sh` around lines 46 - 50, The current logic sets
allow_insecure = parsed.scheme != 'https', which enables insecure auth for
malformed or unsupported schemes; change this to only allow insecure when the
parsed.scheme is explicitly 'http' (e.g., allow_insecure = parsed.scheme ==
'http'), and treat any other value (empty or unknown scheme) as secure/closed
(allow_insecure = False) or abort early if you prefer stricter behavior; update
the code that builds gateway['controlUi'] (the allowInsecure and
dangerouslyDisableDeviceAuth fields) so only an explicit 'http' origin enables
allowInsecure, referencing disable_device_auth, parsed.scheme,
gateway['controlUi'], allowInsecure, and dangerouslyDisableDeviceAuth.
| if not isinstance(device, dict): | ||
| continue | ||
| client_id = device.get('clientId', '') | ||
| client_mode = device.get('clientMode', '') | ||
| if client_id not in ALLOWED_CLIENTS and client_mode not in ALLOWED_MODES: | ||
| print(f'[auto-pair] rejected unknown client={client_id} mode={client_mode}') | ||
| continue |
There was a problem hiding this comment.
Skipped requests never leave pending.
These continue paths only ignore or log bad entries. Because those requests remain pending, the watcher reprocesses them every second and never reaches the convergence path while they exist. Please reject them explicitly, or at least record handled requestIds and filter them out before the if pending: branch.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/nemoclaw-start.sh` around lines 145 - 151, The skipped entries are
left in pending and reprocessed; update the handling in the device validation
branch (the checks using device, clientId/clientMode, ALLOWED_CLIENTS and
ALLOWED_MODES) to explicitly mark or reject the request so it won't stay
pending: call the existing rejection routine (e.g. reject_request(requestId) or
send a failure response) when device is not a dict or when client_id/client_mode
are not allowed, or alternatively record the requestId into a handled_requests
set and ensure the watcher filters handled_requests out before the if pending:
path; reference the variables device, clientId/clientMode,
client_id/client_mode, ALLOWED_CLIENTS, ALLOWED_MODES, requestId and pending to
locate where to insert the rejection/marking logic.
|
Hi maintainers, could you please approve the CI workflow run for this PR? I've rebased onto the latest main. Thank you! |
|
Thanks for addressing the security concern and providing a fix for the gateway auth defaults, this is a significant improvement to the security posture of NemoClaw. |
cv
left a comment
There was a problem hiding this comment.
The security intent here is correct — device auth should default to enabled, and insecure auth should derive from the URL scheme. But the implementation is superseded by other in-flight work.
Stale against main
Based on the old inline Python config writer in nemoclaw-start.sh. Main has since locked openclaw.json to root-owned mode 444 — the inline config write will fail at runtime.
Auto-pair filtering superseded by #690
The clientId/clientMode allowlist is nearly identical to what #690 does. #690 also adds timeout reduction (600s→180s), one-shot exit after first approval, and token cleanup on start/exit. Note: we flagged in our #690 review that clientId is client-supplied and spoofable (the gateway stores connectParams.client.id verbatim — no server-side validation). The same issue applies to this PR's ALLOWED_CLIENTS check.
Contradicts #114
PR #114 unconditionally sets dangerouslyDisableDeviceAuth: true for remote access. This PR defaults it to false. These take opposite positions — they need to be reconciled before either merges.
Conflicts with #721
The gateway isolation PR restructures the entire entrypoint with gosu privilege separation, config integrity hashing, and PATH hardening. Both the inline config block and the auto-pair watcher are significantly changed in #721.
Recommendation
The auth hardening (points 1 and 2) is worth preserving but needs to be re-implemented against the current entrypoint structure. The auto-pair changes (point 3) should be dropped in favor of #690 which is more comprehensive. Consider rebasing and scoping this PR to just the auth defaults after #690 and #721 land.
…VIDIA#117) Fixes NVIDIA#117 — supersedes PR NVIDIA#123 which targeted stale code paths. Dockerfile: secure-by-default controlUi config - dangerouslyDisableDeviceAuth now defaults to false (was hardcoded True) - allowInsecureAuth derived from CHAT_UI_URL scheme (false for https) - New build-arg NEMOCLAW_DISABLE_DEVICE_AUTH=0 as opt-in escape hatch - Env var promoted safely via ENV (no ARG interpolation into Python) nemoclaw-start.sh: auto-pair client whitelisting - Only approve devices with known clientId (openclaw-control-ui) or clientMode (webchat); reject and log unknown clients - Validate device is a dict before accessing fields - Log client identity on approval for audit trail Tests (13 new) - Dockerfile: no hardcoded True for auth settings, env var derivation, ARG default, ENV promotion before RUN layer - Start script: whitelist definitions, rejection logic, dict validation, client identity logging, no unconditional approval pattern
- allowInsecureAuth: use explicit http allowlist (parsed.scheme == 'http') instead of https denylist (parsed.scheme != 'https') so malformed or unknown schemes default to secure (CodeRabbit finding on NVIDIA#123) - Add HANDLED set to track rejected/approved requestIds so rejected devices are not reprocessed every poll cycle (CodeRabbit finding on NVIDIA#123) - Document that clientId/clientMode are client-supplied and spoofable; the allowlist is defense-in-depth, not a trust boundary (@cv review on NVIDIA#123) - Reference PR NVIDIA#690 for the more comprehensive auto-pair fix (one-shot exit, timeout reduction, token cleanup)
) Fixes #117 — supersedes PR #123 which targeted stale code paths. Dockerfile: secure-by-default controlUi config - dangerouslyDisableDeviceAuth now defaults to false (was hardcoded True) - allowInsecureAuth derived from CHAT_UI_URL scheme (false for https) - New build-arg NEMOCLAW_DISABLE_DEVICE_AUTH=0 as opt-in escape hatch - Env var promoted safely via ENV (no ARG interpolation into Python) nemoclaw-start.sh: auto-pair client whitelisting - Only approve devices with known clientId (openclaw-control-ui) or clientMode (webchat); reject and log unknown clients - Validate device is a dict before accessing fields - Log client identity on approval for audit trail Tests (13 new) - Dockerfile: no hardcoded True for auth settings, env var derivation, ARG default, ENV promotion before RUN layer - Start script: whitelist definitions, rejection logic, dict validation, client identity logging, no unconditional approval pattern
- allowInsecureAuth: use explicit http allowlist (parsed.scheme == 'http') instead of https denylist (parsed.scheme != 'https') so malformed or unknown schemes default to secure (CodeRabbit finding on #123) - Add HANDLED set to track rejected/approved requestIds so rejected devices are not reprocessed every poll cycle (CodeRabbit finding on #123) - Document that clientId/clientMode are client-supplied and spoofable; the allowlist is defense-in-depth, not a trust boundary (@cv review on #123) - Reference PR #690 for the more comprehensive auto-pair fix (one-shot exit, timeout reduction, token cleanup)
…VIDIA#117) Fixes NVIDIA#117 — supersedes PR NVIDIA#123 which targeted stale code paths. Dockerfile: secure-by-default controlUi config - dangerouslyDisableDeviceAuth now defaults to false (was hardcoded True) - allowInsecureAuth derived from CHAT_UI_URL scheme (false for https) - New build-arg NEMOCLAW_DISABLE_DEVICE_AUTH=0 as opt-in escape hatch - Env var promoted safely via ENV (no ARG interpolation into Python) nemoclaw-start.sh: auto-pair client whitelisting - Only approve devices with known clientId (openclaw-control-ui) or clientMode (webchat); reject and log unknown clients - Validate device is a dict before accessing fields - Log client identity on approval for audit trail Tests (13 new) - Dockerfile: no hardcoded True for auth settings, env var derivation, ARG default, ENV promotion before RUN layer - Start script: whitelist definitions, rejection logic, dict validation, client identity logging, no unconditional approval pattern
- allowInsecureAuth: use explicit http allowlist (parsed.scheme == 'http') instead of https denylist (parsed.scheme != 'https') so malformed or unknown schemes default to secure (CodeRabbit finding on NVIDIA#123) - Add HANDLED set to track rejected/approved requestIds so rejected devices are not reprocessed every poll cycle (CodeRabbit finding on NVIDIA#123) - Document that clientId/clientMode are client-supplied and spoofable; the allowlist is defense-in-depth, not a trust boundary (@cv review on NVIDIA#123) - Reference PR NVIDIA#690 for the more comprehensive auto-pair fix (one-shot exit, timeout reduction, token cleanup)
…1217) ## Summary Fixes #117 — Insecure authentication configuration in gateway setup. > **Supersedes PR #123**, which targeted the correct security issues but patched code paths that have since moved. Recommend closing #123 in favor of this PR. --- ## Background NemoClaw runs an AI assistant inside a locked-down sandbox. The **gateway** is the front door to that sandbox, and the **Control UI** is a web dashboard users open in their browser to interact with the assistant. There were three security problems with that front door: ### Problem 1: Device auth was permanently disabled 🔓 The config had `dangerouslyDisableDeviceAuth: True` hardcoded. This meant any device could connect to the gateway without proving its identity. It was added as a temporary workaround (commit `5fb629f`) and never reverted. While the sandbox network isolation mitigates this today, it becomes dangerous if combined with LAN-bind changes or cloudflared tunnels in remote deployments — resulting in an unauthenticated, publicly reachable dashboard. **Fix:** Default to `false` (device auth enabled). Developers who genuinely need it disabled for headless/dev environments can set `NEMOCLAW_DISABLE_DEVICE_AUTH=1` as a build arg. ### Problem 2: Insecure auth was always allowed, even over HTTPS 🌐 `allowInsecureAuth: True` was hardcoded, meaning insecure (non-HTTPS) authentication was permitted regardless of deployment context. For local development on `http://127.0.0.1` this is fine, but for production/remote deployments over `https://` it defeats the purpose of TLS. **Fix:** Derive it from the `CHAT_UI_URL` scheme. If the URL is `http://` → allow insecure auth (local dev). If `https://` → block it (production). ### Problem 3: The auto-pair bouncer let everyone in 🚪 The `start_auto_pair` watcher automatically approved *every* pending device pairing request with zero validation. A rogue or unexpected client type could pair with the gateway unchallenged. **Fix:** Only approve devices with recognized `clientId` (`openclaw-control-ui`) or `clientMode` (`webchat`). Unknown clients are rejected and logged. --- ## Why this PR exists (instead of merging #123) PR #123 (opened Mar 17) correctly identified all three issues and proposed the right fixes. However, since then, commit `0f8eedd` ("make openclaw.json immutable at runtime, move config to build time") moved the gateway `controlUi` configuration from `scripts/nemoclaw-start.sh` into the **Dockerfile**. The config is now baked at image build time and verified via SHA-256 hash at startup. PR #123's changes to the start script's Python config block would patch dead code — that block no longer exists. This PR applies the same security logic where the code actually lives now: | Issue | PR #123 patched | This PR patches | |-------|----------------|----------------| | `dangerouslyDisableDeviceAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | `allowInsecureAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | Auto-pair whitelisting | `scripts/nemoclaw-start.sh` ✅ | `scripts/nemoclaw-start.sh` ✅ | --- ## Changes ### `Dockerfile` (+8/-2) - `dangerouslyDisableDeviceAuth` → derived from `NEMOCLAW_DISABLE_DEVICE_AUTH` env var (default `0` = secure) - `allowInsecureAuth` → derived from `CHAT_UI_URL` scheme (`false` when `https://`) - New `ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0` with `ENV` promotion (follows existing C-2 safe pattern — no ARG interpolation into Python source) ### `scripts/nemoclaw-start.sh` (+12/-4) - Auto-pair only approves `ALLOWED_CLIENTS = {'openclaw-control-ui'}` and `ALLOWED_MODES = {'webchat'}` - Validates `isinstance(device, dict)` before field access - Rejects unknown clients with `[auto-pair] rejected unknown client=... mode=...` log - Logs `client=` on approval for audit trail - Documents `NEMOCLAW_DISABLE_DEVICE_AUTH` in script header ### `test/security-c2-dockerfile-injection.test.js` (+6 tests) - No hardcoded `True` for `dangerouslyDisableDeviceAuth` - No hardcoded `True` for `allowInsecureAuth` - Env var derivation for both settings - `ARG` defaults to `0` - `ENV` promotion before `RUN` layer ### `test/nemoclaw-start.test.js` (+7 tests) - Whitelist definitions (`ALLOWED_CLIENTS`, `ALLOWED_MODES`) - Rejection logic for unknown clients - Dict type validation - Client identity in approval logs - No unconditional approval pattern - `NEMOCLAW_DISABLE_DEVICE_AUTH` documented --- ## Why no Brev E2E test These changes are **deterministic config values and simple conditional logic**, not runtime behavior with environmental ambiguity: 1. **Dockerfile config** — Two Python booleans evaluated at build time: `os.environ.get(...) == '1'` and `parsed.scheme != 'https'`. If the source is correct, the output is correct. 2. **Auto-pair whitelisting** — Python set membership check. No shell interpretation, no SSH pipeline, no quoting edge cases. This is fundamentally different from something like the telegram bridge injection fix, where you *must* send real payloads through a real shell on a real sandbox to prove metacharacters don't get interpreted. Our changes have no gap between "looks right in source" and "works right at runtime." The 13 source-pattern tests provide sufficient confidence. If deeper verification is desired in the future, a **build-time unit test** that runs the Dockerfile's Python snippet and asserts the JSON output would be the appropriate next step — a regular CI test, not a Brev instance. --- ## Test plan - [x] All 752 tests pass (739 existing + 13 new) - [x] `shellcheck` ✅ - [x] `hadolint` ✅ - [x] `shfmt` / `prettier` ✅ - [x] `gitleaks` ✅ - [ ] Default onboarding: gateway starts with device auth enabled, Control UI pairing works via auto-pair - [ ] `NEMOCLAW_DISABLE_DEVICE_AUTH=1` build: device auth is disabled (backward-compatible escape hatch) - [ ] `CHAT_UI_URL=https://...` build: `allowInsecureAuth` is `false` - [ ] `CHAT_UI_URL=http://127.0.0.1:18789` (default): `allowInsecureAuth` is `true` - [ ] Unknown device type connecting to gateway: auto-pair rejects and logs - [ ] `openclaw-control-ui` / `webchat` devices: auto-pair approves as before <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Security** * Device authentication can be toggled at build time; UI insecure-auth is allowed only when the control UI URL uses http. * **Bug Fixes** * Auto-pair processing now validates inputs, rejects unknown client/mode, logs identifiers, and deduplicates requests to avoid repeated approvals/rejections. * **Tests** * Added tests for auto-pair validation, logging, de-duplication, and configurable auth behavior. * **Chores** * Sandbox/setup flow now enforces the build-time device-auth setting for images. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…1217) ## Summary Fixes #117 — Insecure authentication configuration in gateway setup. > **Supersedes PR #123**, which targeted the correct security issues but patched code paths that have since moved. Recommend closing #123 in favor of this PR. --- ## Background NemoClaw runs an AI assistant inside a locked-down sandbox. The **gateway** is the front door to that sandbox, and the **Control UI** is a web dashboard users open in their browser to interact with the assistant. There were three security problems with that front door: ### Problem 1: Device auth was permanently disabled 🔓 The config had `dangerouslyDisableDeviceAuth: True` hardcoded. This meant any device could connect to the gateway without proving its identity. It was added as a temporary workaround (commit `5fb629f`) and never reverted. While the sandbox network isolation mitigates this today, it becomes dangerous if combined with LAN-bind changes or cloudflared tunnels in remote deployments — resulting in an unauthenticated, publicly reachable dashboard. **Fix:** Default to `false` (device auth enabled). Developers who genuinely need it disabled for headless/dev environments can set `NEMOCLAW_DISABLE_DEVICE_AUTH=1` as a build arg. ### Problem 2: Insecure auth was always allowed, even over HTTPS 🌐 `allowInsecureAuth: True` was hardcoded, meaning insecure (non-HTTPS) authentication was permitted regardless of deployment context. For local development on `http://127.0.0.1` this is fine, but for production/remote deployments over `https://` it defeats the purpose of TLS. **Fix:** Derive it from the `CHAT_UI_URL` scheme. If the URL is `http://` → allow insecure auth (local dev). If `https://` → block it (production). ### Problem 3: The auto-pair bouncer let everyone in 🚪 The `start_auto_pair` watcher automatically approved *every* pending device pairing request with zero validation. A rogue or unexpected client type could pair with the gateway unchallenged. **Fix:** Only approve devices with recognized `clientId` (`openclaw-control-ui`) or `clientMode` (`webchat`). Unknown clients are rejected and logged. --- ## Why this PR exists (instead of merging #123) PR #123 (opened Mar 17) correctly identified all three issues and proposed the right fixes. However, since then, commit `0f8eedd` ("make openclaw.json immutable at runtime, move config to build time") moved the gateway `controlUi` configuration from `scripts/nemoclaw-start.sh` into the **Dockerfile**. The config is now baked at image build time and verified via SHA-256 hash at startup. PR #123's changes to the start script's Python config block would patch dead code — that block no longer exists. This PR applies the same security logic where the code actually lives now: | Issue | PR #123 patched | This PR patches | |-------|----------------|----------------| | `dangerouslyDisableDeviceAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | `allowInsecureAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | Auto-pair whitelisting | `scripts/nemoclaw-start.sh` ✅ | `scripts/nemoclaw-start.sh` ✅ | --- ## Changes ### `Dockerfile` (+8/-2) - `dangerouslyDisableDeviceAuth` → derived from `NEMOCLAW_DISABLE_DEVICE_AUTH` env var (default `0` = secure) - `allowInsecureAuth` → derived from `CHAT_UI_URL` scheme (`false` when `https://`) - New `ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0` with `ENV` promotion (follows existing C-2 safe pattern — no ARG interpolation into Python source) ### `scripts/nemoclaw-start.sh` (+12/-4) - Auto-pair only approves `ALLOWED_CLIENTS = {'openclaw-control-ui'}` and `ALLOWED_MODES = {'webchat'}` - Validates `isinstance(device, dict)` before field access - Rejects unknown clients with `[auto-pair] rejected unknown client=... mode=...` log - Logs `client=` on approval for audit trail - Documents `NEMOCLAW_DISABLE_DEVICE_AUTH` in script header ### `test/security-c2-dockerfile-injection.test.js` (+6 tests) - No hardcoded `True` for `dangerouslyDisableDeviceAuth` - No hardcoded `True` for `allowInsecureAuth` - Env var derivation for both settings - `ARG` defaults to `0` - `ENV` promotion before `RUN` layer ### `test/nemoclaw-start.test.js` (+7 tests) - Whitelist definitions (`ALLOWED_CLIENTS`, `ALLOWED_MODES`) - Rejection logic for unknown clients - Dict type validation - Client identity in approval logs - No unconditional approval pattern - `NEMOCLAW_DISABLE_DEVICE_AUTH` documented --- ## Why no Brev E2E test These changes are **deterministic config values and simple conditional logic**, not runtime behavior with environmental ambiguity: 1. **Dockerfile config** — Two Python booleans evaluated at build time: `os.environ.get(...) == '1'` and `parsed.scheme != 'https'`. If the source is correct, the output is correct. 2. **Auto-pair whitelisting** — Python set membership check. No shell interpretation, no SSH pipeline, no quoting edge cases. This is fundamentally different from something like the telegram bridge injection fix, where you *must* send real payloads through a real shell on a real sandbox to prove metacharacters don't get interpreted. Our changes have no gap between "looks right in source" and "works right at runtime." The 13 source-pattern tests provide sufficient confidence. If deeper verification is desired in the future, a **build-time unit test** that runs the Dockerfile's Python snippet and asserts the JSON output would be the appropriate next step — a regular CI test, not a Brev instance. --- ## Test plan - [x] All 752 tests pass (739 existing + 13 new) - [x] `shellcheck` ✅ - [x] `hadolint` ✅ - [x] `shfmt` / `prettier` ✅ - [x] `gitleaks` ✅ - [ ] Default onboarding: gateway starts with device auth enabled, Control UI pairing works via auto-pair - [ ] `NEMOCLAW_DISABLE_DEVICE_AUTH=1` build: device auth is disabled (backward-compatible escape hatch) - [ ] `CHAT_UI_URL=https://...` build: `allowInsecureAuth` is `false` - [ ] `CHAT_UI_URL=http://127.0.0.1:18789` (default): `allowInsecureAuth` is `true` - [ ] Unknown device type connecting to gateway: auto-pair rejects and logs - [ ] `openclaw-control-ui` / `webchat` devices: auto-pair approves as before <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Security** * Device authentication can be toggled at build time; UI insecure-auth is allowed only when the control UI URL uses http. * **Bug Fixes** * Auto-pair processing now validates inputs, rejects unknown client/mode, logs identifiers, and deduplicates requests to avoid repeated approvals/rejections. * **Tests** * Added tests for auto-pair validation, logging, de-duplication, and configurable auth behavior. * **Chores** * Sandbox/setup flow now enforces the build-time device-auth setting for images. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
|
Hi @kakuteki — thank you for this contribution. Your PR was the first to identify and propose fixes for all three security issues in the gateway auth configuration: the hardcoded Since this PR was opened, the codebase has shifted — commit We had several PRs in flight tackling overlapping parts of the same problem (#690, #1149, and this one). To consolidate approaches and target the code where it now lives, we merged #1217 ( We referred to your implementation while building the consolidated fix, and the input from your proposal directly shaped the final design. Closing this in favor of the merged approach in #1217. The linked issue #117 is already closed as well. Thank you again for the security contribution — we hope to see more from you on NemoClaw! 🙏 |
|
Closing — superseded by #1217 (see comment above). |
…VIDIA#1217) ## Summary Fixes NVIDIA#117 — Insecure authentication configuration in gateway setup. > **Supersedes PR NVIDIA#123**, which targeted the correct security issues but patched code paths that have since moved. Recommend closing NVIDIA#123 in favor of this PR. --- ## Background NemoClaw runs an AI assistant inside a locked-down sandbox. The **gateway** is the front door to that sandbox, and the **Control UI** is a web dashboard users open in their browser to interact with the assistant. There were three security problems with that front door: ### Problem 1: Device auth was permanently disabled 🔓 The config had `dangerouslyDisableDeviceAuth: True` hardcoded. This meant any device could connect to the gateway without proving its identity. It was added as a temporary workaround (commit `5fb629f`) and never reverted. While the sandbox network isolation mitigates this today, it becomes dangerous if combined with LAN-bind changes or cloudflared tunnels in remote deployments — resulting in an unauthenticated, publicly reachable dashboard. **Fix:** Default to `false` (device auth enabled). Developers who genuinely need it disabled for headless/dev environments can set `NEMOCLAW_DISABLE_DEVICE_AUTH=1` as a build arg. ### Problem 2: Insecure auth was always allowed, even over HTTPS 🌐 `allowInsecureAuth: True` was hardcoded, meaning insecure (non-HTTPS) authentication was permitted regardless of deployment context. For local development on `http://127.0.0.1` this is fine, but for production/remote deployments over `https://` it defeats the purpose of TLS. **Fix:** Derive it from the `CHAT_UI_URL` scheme. If the URL is `http://` → allow insecure auth (local dev). If `https://` → block it (production). ### Problem 3: The auto-pair bouncer let everyone in 🚪 The `start_auto_pair` watcher automatically approved *every* pending device pairing request with zero validation. A rogue or unexpected client type could pair with the gateway unchallenged. **Fix:** Only approve devices with recognized `clientId` (`openclaw-control-ui`) or `clientMode` (`webchat`). Unknown clients are rejected and logged. --- ## Why this PR exists (instead of merging NVIDIA#123) PR NVIDIA#123 (opened Mar 17) correctly identified all three issues and proposed the right fixes. However, since then, commit `0f8eedd` ("make openclaw.json immutable at runtime, move config to build time") moved the gateway `controlUi` configuration from `scripts/nemoclaw-start.sh` into the **Dockerfile**. The config is now baked at image build time and verified via SHA-256 hash at startup. PR NVIDIA#123's changes to the start script's Python config block would patch dead code — that block no longer exists. This PR applies the same security logic where the code actually lives now: | Issue | PR NVIDIA#123 patched | This PR patches | |-------|----------------|----------------| | `dangerouslyDisableDeviceAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | `allowInsecureAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | Auto-pair whitelisting | `scripts/nemoclaw-start.sh` ✅ | `scripts/nemoclaw-start.sh` ✅ | --- ## Changes ### `Dockerfile` (+8/-2) - `dangerouslyDisableDeviceAuth` → derived from `NEMOCLAW_DISABLE_DEVICE_AUTH` env var (default `0` = secure) - `allowInsecureAuth` → derived from `CHAT_UI_URL` scheme (`false` when `https://`) - New `ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0` with `ENV` promotion (follows existing C-2 safe pattern — no ARG interpolation into Python source) ### `scripts/nemoclaw-start.sh` (+12/-4) - Auto-pair only approves `ALLOWED_CLIENTS = {'openclaw-control-ui'}` and `ALLOWED_MODES = {'webchat'}` - Validates `isinstance(device, dict)` before field access - Rejects unknown clients with `[auto-pair] rejected unknown client=... mode=...` log - Logs `client=` on approval for audit trail - Documents `NEMOCLAW_DISABLE_DEVICE_AUTH` in script header ### `test/security-c2-dockerfile-injection.test.js` (+6 tests) - No hardcoded `True` for `dangerouslyDisableDeviceAuth` - No hardcoded `True` for `allowInsecureAuth` - Env var derivation for both settings - `ARG` defaults to `0` - `ENV` promotion before `RUN` layer ### `test/nemoclaw-start.test.js` (+7 tests) - Whitelist definitions (`ALLOWED_CLIENTS`, `ALLOWED_MODES`) - Rejection logic for unknown clients - Dict type validation - Client identity in approval logs - No unconditional approval pattern - `NEMOCLAW_DISABLE_DEVICE_AUTH` documented --- ## Why no Brev E2E test These changes are **deterministic config values and simple conditional logic**, not runtime behavior with environmental ambiguity: 1. **Dockerfile config** — Two Python booleans evaluated at build time: `os.environ.get(...) == '1'` and `parsed.scheme != 'https'`. If the source is correct, the output is correct. 2. **Auto-pair whitelisting** — Python set membership check. No shell interpretation, no SSH pipeline, no quoting edge cases. This is fundamentally different from something like the telegram bridge injection fix, where you *must* send real payloads through a real shell on a real sandbox to prove metacharacters don't get interpreted. Our changes have no gap between "looks right in source" and "works right at runtime." The 13 source-pattern tests provide sufficient confidence. If deeper verification is desired in the future, a **build-time unit test** that runs the Dockerfile's Python snippet and asserts the JSON output would be the appropriate next step — a regular CI test, not a Brev instance. --- ## Test plan - [x] All 752 tests pass (739 existing + 13 new) - [x] `shellcheck` ✅ - [x] `hadolint` ✅ - [x] `shfmt` / `prettier` ✅ - [x] `gitleaks` ✅ - [ ] Default onboarding: gateway starts with device auth enabled, Control UI pairing works via auto-pair - [ ] `NEMOCLAW_DISABLE_DEVICE_AUTH=1` build: device auth is disabled (backward-compatible escape hatch) - [ ] `CHAT_UI_URL=https://...` build: `allowInsecureAuth` is `false` - [ ] `CHAT_UI_URL=http://127.0.0.1:18789` (default): `allowInsecureAuth` is `true` - [ ] Unknown device type connecting to gateway: auto-pair rejects and logs - [ ] `openclaw-control-ui` / `webchat` devices: auto-pair approves as before <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Security** * Device authentication can be toggled at build time; UI insecure-auth is allowed only when the control UI URL uses http. * **Bug Fixes** * Auto-pair processing now validates inputs, rejects unknown client/mode, logs identifiers, and deduplicates requests to avoid repeated approvals/rejections. * **Tests** * Added tests for auto-pair validation, logging, de-duplication, and configurable auth behavior. * **Chores** * Sandbox/setup flow now enforces the build-time device-auth setting for images. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…VIDIA#1217) ## Summary Fixes NVIDIA#117 — Insecure authentication configuration in gateway setup. > **Supersedes PR NVIDIA#123**, which targeted the correct security issues but patched code paths that have since moved. Recommend closing NVIDIA#123 in favor of this PR. --- ## Background NemoClaw runs an AI assistant inside a locked-down sandbox. The **gateway** is the front door to that sandbox, and the **Control UI** is a web dashboard users open in their browser to interact with the assistant. There were three security problems with that front door: ### Problem 1: Device auth was permanently disabled 🔓 The config had `dangerouslyDisableDeviceAuth: True` hardcoded. This meant any device could connect to the gateway without proving its identity. It was added as a temporary workaround (commit `5fb629f`) and never reverted. While the sandbox network isolation mitigates this today, it becomes dangerous if combined with LAN-bind changes or cloudflared tunnels in remote deployments — resulting in an unauthenticated, publicly reachable dashboard. **Fix:** Default to `false` (device auth enabled). Developers who genuinely need it disabled for headless/dev environments can set `NEMOCLAW_DISABLE_DEVICE_AUTH=1` as a build arg. ### Problem 2: Insecure auth was always allowed, even over HTTPS 🌐 `allowInsecureAuth: True` was hardcoded, meaning insecure (non-HTTPS) authentication was permitted regardless of deployment context. For local development on `http://127.0.0.1` this is fine, but for production/remote deployments over `https://` it defeats the purpose of TLS. **Fix:** Derive it from the `CHAT_UI_URL` scheme. If the URL is `http://` → allow insecure auth (local dev). If `https://` → block it (production). ### Problem 3: The auto-pair bouncer let everyone in 🚪 The `start_auto_pair` watcher automatically approved *every* pending device pairing request with zero validation. A rogue or unexpected client type could pair with the gateway unchallenged. **Fix:** Only approve devices with recognized `clientId` (`openclaw-control-ui`) or `clientMode` (`webchat`). Unknown clients are rejected and logged. --- ## Why this PR exists (instead of merging NVIDIA#123) PR NVIDIA#123 (opened Mar 17) correctly identified all three issues and proposed the right fixes. However, since then, commit `0f8eedd` ("make openclaw.json immutable at runtime, move config to build time") moved the gateway `controlUi` configuration from `scripts/nemoclaw-start.sh` into the **Dockerfile**. The config is now baked at image build time and verified via SHA-256 hash at startup. PR NVIDIA#123's changes to the start script's Python config block would patch dead code — that block no longer exists. This PR applies the same security logic where the code actually lives now: | Issue | PR NVIDIA#123 patched | This PR patches | |-------|----------------|----------------| | `dangerouslyDisableDeviceAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | `allowInsecureAuth` | `scripts/nemoclaw-start.sh` (removed) | `Dockerfile` (build-time config) | | Auto-pair whitelisting | `scripts/nemoclaw-start.sh` ✅ | `scripts/nemoclaw-start.sh` ✅ | --- ## Changes ### `Dockerfile` (+8/-2) - `dangerouslyDisableDeviceAuth` → derived from `NEMOCLAW_DISABLE_DEVICE_AUTH` env var (default `0` = secure) - `allowInsecureAuth` → derived from `CHAT_UI_URL` scheme (`false` when `https://`) - New `ARG NEMOCLAW_DISABLE_DEVICE_AUTH=0` with `ENV` promotion (follows existing C-2 safe pattern — no ARG interpolation into Python source) ### `scripts/nemoclaw-start.sh` (+12/-4) - Auto-pair only approves `ALLOWED_CLIENTS = {'openclaw-control-ui'}` and `ALLOWED_MODES = {'webchat'}` - Validates `isinstance(device, dict)` before field access - Rejects unknown clients with `[auto-pair] rejected unknown client=... mode=...` log - Logs `client=` on approval for audit trail - Documents `NEMOCLAW_DISABLE_DEVICE_AUTH` in script header ### `test/security-c2-dockerfile-injection.test.js` (+6 tests) - No hardcoded `True` for `dangerouslyDisableDeviceAuth` - No hardcoded `True` for `allowInsecureAuth` - Env var derivation for both settings - `ARG` defaults to `0` - `ENV` promotion before `RUN` layer ### `test/nemoclaw-start.test.js` (+7 tests) - Whitelist definitions (`ALLOWED_CLIENTS`, `ALLOWED_MODES`) - Rejection logic for unknown clients - Dict type validation - Client identity in approval logs - No unconditional approval pattern - `NEMOCLAW_DISABLE_DEVICE_AUTH` documented --- ## Why no Brev E2E test These changes are **deterministic config values and simple conditional logic**, not runtime behavior with environmental ambiguity: 1. **Dockerfile config** — Two Python booleans evaluated at build time: `os.environ.get(...) == '1'` and `parsed.scheme != 'https'`. If the source is correct, the output is correct. 2. **Auto-pair whitelisting** — Python set membership check. No shell interpretation, no SSH pipeline, no quoting edge cases. This is fundamentally different from something like the telegram bridge injection fix, where you *must* send real payloads through a real shell on a real sandbox to prove metacharacters don't get interpreted. Our changes have no gap between "looks right in source" and "works right at runtime." The 13 source-pattern tests provide sufficient confidence. If deeper verification is desired in the future, a **build-time unit test** that runs the Dockerfile's Python snippet and asserts the JSON output would be the appropriate next step — a regular CI test, not a Brev instance. --- ## Test plan - [x] All 752 tests pass (739 existing + 13 new) - [x] `shellcheck` ✅ - [x] `hadolint` ✅ - [x] `shfmt` / `prettier` ✅ - [x] `gitleaks` ✅ - [ ] Default onboarding: gateway starts with device auth enabled, Control UI pairing works via auto-pair - [ ] `NEMOCLAW_DISABLE_DEVICE_AUTH=1` build: device auth is disabled (backward-compatible escape hatch) - [ ] `CHAT_UI_URL=https://...` build: `allowInsecureAuth` is `false` - [ ] `CHAT_UI_URL=http://127.0.0.1:18789` (default): `allowInsecureAuth` is `true` - [ ] Unknown device type connecting to gateway: auto-pair rejects and logs - [ ] `openclaw-control-ui` / `webchat` devices: auto-pair approves as before <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Security** * Device authentication can be toggled at build time; UI insecure-auth is allowed only when the control UI URL uses http. * **Bug Fixes** * Auto-pair processing now validates inputs, rejects unknown client/mode, logs identifiers, and deduplicates requests to avoid repeated approvals/rejections. * **Tests** * Added tests for auto-pair validation, logging, de-duplication, and configurable auth behavior. * **Chores** * Sandbox/setup flow now enforces the build-time device-auth setting for images. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
Fixes #117 — Insecure authentication configuration in gateway setup.
The gateway's
controlUiconfig unconditionally setdangerouslyDisableDeviceAuth: TrueandallowInsecureAuth: True. This was added as a temporary workaround (commit5fb629f, message: "Disable deviceauth temporarily") but was never reverted. While the currentmainbranch is protected by sandbox network isolation andgateway.mode = 'local', these settings become dangerous if combined with a LAN-bind change (e.g. PR #114) or a cloudflared tunnel in remote deployments — resulting in an unauthenticated, publicly reachable dashboard.Changes
1.
dangerouslyDisableDeviceAuth— default tofalseDevice-pairing auth is now enabled by default. For development or headless environments that genuinely need it disabled, set
NEMOCLAW_DISABLE_DEVICE_AUTH=1explicitly.2.
allowInsecureAuth— derive from URL schemeInstead of unconditionally allowing insecure auth, it is now enabled only when
CHAT_UI_URLuseshttp://(local development). Whenhttps://is configured (production/remote), insecure auth is automatically disabled.3.
start_auto_pair— restrict to known client typesThe auto-pair watcher previously approved every pending device request unconditionally. It now only approves devices with recognized
clientId(openclaw-control-ui) orclientMode(webchat). Unknown clients are logged and rejected.Test plan
NEMOCLAW_DISABLE_DEVICE_AUTH=1: device auth is disabled (backward-compatible escape hatch)CHAT_UI_URL=https://...:allowInsecureAuthisfalseCHAT_UI_URL=http://127.0.0.1:18789(default):allowInsecureAuthistrueopenclaw-control-ui/webchatdevices: auto-pair approves as beforeSummary by CodeRabbit
New Features
Improvements