fix: detect Docker Desktop socket on macOS#224
fix: detect Docker Desktop socket on macOS#224brianwtaylor wants to merge 4 commits intoNVIDIA:mainfrom
Conversation
a43532e to
702d05c
Compare
Extract detectContainerSocket() with dependency injection and add Podman socket candidates (machine, rootless, QEMU). Colima sockets are checked first to preserve existing behavior. Add Podman fallback to isDockerRunning() with host-gateway caveat warning. Closes #116 Signed-off-by: Brian Taylor <brian@briantaylor.xyz> Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
702d05c to
3441151
Compare
📝 WalkthroughWalkthroughAdds a full interactive onboarding flow, broadens container socket detection (Docker, Colima, Podman) with tests, introduces gateway/sandbox/NIM/OpenClaw setup logic, extends CLI commands, and adds a Telegram sandbox bridge and related helpers. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI as "CLI (nemoclaw)"
participant Onb as "onboard.js"
participant Runner as "runner.js (detectContainerSocket)"
participant Gateway as "Gateway"
participant Sandbox
User->>CLI: run `onboard`
CLI->>Onb: invoke onboard()
Onb->>Runner: detectContainerSocket(opts)
Runner-->>Onb: socket path / runtime or null
Onb->>Gateway: startGateway(runtime)
Gateway-->>Onb: health OK
Onb->>Sandbox: createSandbox(name, buildCtx)
Sandbox-->>Onb: sandbox ready
Onb->>Sandbox: setupNim()/setupInference()/setupOpenclaw()
Sandbox-->>Onb: services registered (openshell, NIM)
Onb->>CLI: printDashboard(status)
CLI-->>User: display onboarding summary
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Important Merge conflicts detected (Beta)
✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
bin/lib/runner.js (1)
12-13: Update the function docstring to include Docker Desktop in the priority order.The implementation now checks Colima, then Docker Desktop, then Podman; the comment should match that.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/runner.js` around lines 12 - 13, Update the function docstring that currently states the priority as "Colima first, then Podman" to reflect the actual implementation order (Colima, then Docker Desktop, then Podman); locate the comment above the detectContainerRuntimeSocket function (or the block starting "Detect a container runtime socket") and edit the description to list Colima, Docker Desktop, then Podman, and mention it returns the socket path or null.bin/lib/onboard.js (1)
30-34: Differentiate runtime type instead of returning a plain boolean.This path returns
truefor Podman, but downstream messaging still states Docker is running. Consider returning"docker" | "podman" | nullso preflight output and follow-up guidance stay accurate.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 30 - 34, The Podman branch in bin/lib/onboard.js currently returns a plain boolean, causing downstream messaging to incorrectly report Docker; change the function’s return contract to return "docker" | "podman" | null instead of true/false: when runCapture("podman info", ...) succeeds return "podman", ensure the Docker detection branch returns "docker", and return null when neither runtime is found; update any callers that use this function (and the preflight/messaging logic) to branch on the returned string so output and guidance correctly reference the detected runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 30-34: The Podman branch in bin/lib/onboard.js currently returns a
plain boolean, causing downstream messaging to incorrectly report Docker; change
the function’s return contract to return "docker" | "podman" | null instead of
true/false: when runCapture("podman info", ...) succeeds return "podman", ensure
the Docker detection branch returns "docker", and return null when neither
runtime is found; update any callers that use this function (and the
preflight/messaging logic) to branch on the returned string so output and
guidance correctly reference the detected runtime.
In `@bin/lib/runner.js`:
- Around line 12-13: Update the function docstring that currently states the
priority as "Colima first, then Podman" to reflect the actual implementation
order (Colima, then Docker Desktop, then Podman); locate the comment above the
detectContainerRuntimeSocket function (or the block starting "Detect a container
runtime socket") and edit the description to list Colima, Docker Desktop, then
Podman, and mention it returns the socket path or null.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 061ade7a-a5ff-4871-881e-c7ff5bf96fe2
📒 Files selected for processing (3)
bin/lib/onboard.jsbin/lib/runner.jstest/runner.test.js
…iority docs - Return "docker" | "podman" | null from detectContainerRuntime() so preflight messaging accurately reflects which runtime was found - Fix detectContainerSocket docstring to list Docker Desktop in the priority order (Colima → Docker Desktop → Podman) - Add JSDoc to run() and runCapture() for docstring coverage
Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
bin/lib/onboard.js (1)
478-493:⚠️ Potential issue | 🟡 MinorStep 6 is a no-op placeholder.
setupOpenclawprints a success message but performs no actual setup. If OpenClaw gateway launch is deferred, consider documenting when it actually happens.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 478 - 493, The setupOpenclaw function is a no-op that prints a misleading success message; update it to either perform the actual OpenClaw gateway launch or clearly mark the step as deferred. Locate setupOpenclaw and either implement the real launch flow (invoke the sandbox connect/create commands or call the helper that runs the gateway, handle errors and await completion, and keep the existing step(6, 7, ...) usage) or change the console message to indicate the launch is deferred and add a TODO comment documenting where/when the gateway is actually started (and avoid printing a false success).scripts/telegram-bridge.js (2)
33-33:⚠️ Potential issue | 🟡 Minor
activeSessionsmap is declared but never populated.The
activeSessionsMap is initialized and/resetdeletes from it, but no code ever adds entries. This appears to be dead code or incomplete session tracking.Consider either implementing session history tracking or removing this unused declaration.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/telegram-bridge.js` at line 33, The Map activeSessions is declared but never populated, so either implement session tracking where messages are processed or remove the unused Map and its reset logic: if you want session history, add entries to activeSessions keyed by chatId inside the main message handler (e.g., where messages are received/handled — look for the bot.on('message') or onText handler) and update/clear history in the '/reset' command handler; otherwise remove the activeSessions declaration and any code that deletes from it (the '/reset' branch) to eliminate dead code.
112-127:⚠️ Potential issue | 🟠 MajorAPI key visible in process list; temp file not cleaned on spawn error.
- The
NVIDIA_API_KEYis interpolated into the SSH command string, making it visible inpsoutput on both local and remote hosts.- If
spawnemits anerrorevent (e.g., SSH binary not found), the temp config file atconfPathis never deleted.🔒 Suggested improvements
Pass the API key via SSH's
SendEnvor write it to a temp file that's sourced remotely. For the cleanup issue:proc.on("error", (err) => { + try { require("fs").unlinkSync(confPath); } catch {} resolve(`Error: ${err.message}`); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/telegram-bridge.js` around lines 112 - 127, In runAgentInSandbox avoid embedding API_KEY in the remote command (so it doesn't appear in ps) and ensure confPath is always removed on spawn errors: instead of interpolating NVIDIA_API_KEY into cmd, pass the secret via a secure channel (e.g., configure SSH to SendEnv and pass API_KEY in the child process env, or write the key to a temporary file keyPath, securely scp it to the remote side and source it there) and remove direct "'${API_KEY}'" interpolation in cmd; also attach handlers to proc ('error', 'exit', 'close') and wrap spawn in try/catch/finally to always unlink confPath (and keyPath if created) on error or completion so temp files are cleaned even when ssh binary is missing.
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
179-187: Colima socket detection duplicates logic fromrunner.js.This inline socket detection mirrors the Colima paths in
detectContainerSocket. Consider reusing the centralized function to avoid divergence if socket paths change.♻️ Optional: Reuse detectContainerSocket
+const { detectContainerSocket } = require("./runner"); + // CoreDNS fix — always run. k3s-inside-Docker has broken DNS on all platforms. -const home = process.env.HOME || "/tmp"; -const colimaSocket = [ - path.join(home, ".colima/default/docker.sock"), - path.join(home, ".config/colima/default/docker.sock"), -].find((s) => fs.existsSync(s)); -if (colimaSocket) { +const detectedSocket = detectContainerSocket(); +const isColima = detectedSocket && detectedSocket.includes(".colima"); +if (isColima) { console.log(" Patching CoreDNS for Colima...");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 179 - 187, This duplicates the Colima socket path detection; instead call the centralized detectContainerSocket function (import/require detectContainerSocket from the module that defines it) and use its result to decide whether to run the CoreDNS patch. Replace the inline path array and fs.existsSync logic with a call to detectContainerSocket(), then if the returned socket path is truthy and indicates Colima (e.g., contains "colima" or matches the Colima socket patterns), run the existing run(...) invocation unchanged; keep the run call and the ignoreError option as-is so behavior remains identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@bin/lib/onboard.js`:
- Around line 478-493: The setupOpenclaw function is a no-op that prints a
misleading success message; update it to either perform the actual OpenClaw
gateway launch or clearly mark the step as deferred. Locate setupOpenclaw and
either implement the real launch flow (invoke the sandbox connect/create
commands or call the helper that runs the gateway, handle errors and await
completion, and keep the existing step(6, 7, ...) usage) or change the console
message to indicate the launch is deferred and add a TODO comment documenting
where/when the gateway is actually started (and avoid printing a false success).
In `@scripts/telegram-bridge.js`:
- Line 33: The Map activeSessions is declared but never populated, so either
implement session tracking where messages are processed or remove the unused Map
and its reset logic: if you want session history, add entries to activeSessions
keyed by chatId inside the main message handler (e.g., where messages are
received/handled — look for the bot.on('message') or onText handler) and
update/clear history in the '/reset' command handler; otherwise remove the
activeSessions declaration and any code that deletes from it (the '/reset'
branch) to eliminate dead code.
- Around line 112-127: In runAgentInSandbox avoid embedding API_KEY in the
remote command (so it doesn't appear in ps) and ensure confPath is always
removed on spawn errors: instead of interpolating NVIDIA_API_KEY into cmd, pass
the secret via a secure channel (e.g., configure SSH to SendEnv and pass API_KEY
in the child process env, or write the key to a temporary file keyPath, securely
scp it to the remote side and source it there) and remove direct "'${API_KEY}'"
interpolation in cmd; also attach handlers to proc ('error', 'exit', 'close')
and wrap spawn in try/catch/finally to always unlink confPath (and keyPath if
created) on error or completion so temp files are cleaned even when ssh binary
is missing.
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 179-187: This duplicates the Colima socket path detection; instead
call the centralized detectContainerSocket function (import/require
detectContainerSocket from the module that defines it) and use its result to
decide whether to run the CoreDNS patch. Replace the inline path array and
fs.existsSync logic with a call to detectContainerSocket(), then if the returned
socket path is truthy and indicates Colima (e.g., contains "colima" or matches
the Colima socket patterns), run the existing run(...) invocation unchanged;
keep the run call and the ignoreError option as-is so behavior remains
identical.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f89ab487-f0dc-4b37-b757-a72529eeac72
📒 Files selected for processing (5)
bin/lib/onboard.jsbin/lib/runner.jsbin/nemoclaw.jsscripts/telegram-bridge.jstest/preflight.test.js
✅ Files skipped from review due to trivial changes (1)
- test/preflight.test.js
|
Closing — #286 supersedes this with centralized macOS Docker socket detection via platform.js. |
Summary
~/.docker/run/docker.sock(DOCKER_HOST not set Error while installation. #137)Depends on #144
Fixes #137
Test plan
Automated Tests
New
detectContainerSockettests (3 added, 9 total with #144):Hardware Validation
Validated against real macOS socket layout:
~/.colima/default/docker.sock— correct prioritySocket connectivity verified:
DOCKER_HOST=unix:///Users/kobe/.colima/default/docker.sock docker pssuccessfully listed 4 running containers.Priority order confirmed on real hardware:
~/.colima/default/docker.sock) — present, selected~/.docker/run/docker.sock) — not present, would be checked nextDocker Desktop socket path (
~/.docker/run/docker.sock) matches the official Docker Desktop documentation for macOS. The priority order ensures existing Colima users are unaffected by this change.Summary by CodeRabbit
New Features
Tests