feat(cli): install iii-engine binary first, Docker opt-in#396
Conversation
Reorder startEngine() fallback so the native iii binary is the happy path. Docker compose becomes opt-in via AGENTMEMORY_USE_DOCKER=1, an explicit prompt choice, or CI mode falls back to it only when auto-install fails. Fixes a class of "engine started but REST never responded" failures observed on Rancher Desktop, where docker compose returns 0 on a degraded pull and we wait 15s for /livez against nothing. New tier order: 1. iii on PATH 2. ~/.local/bin/iii / fallback paths 3. clack select (install / docker / manual) -- auto-install in CI 4. Docker compose only via explicit opt-in 5. fail-loud with install instructions Extracts the installer logic out of runUpgrade into runIiiInstaller() so first-run and upgrade share the same pinned-v0.11.2 curl+tar path. Adds an `agentmemory stop` command: - writes ~/.agentmemory/iii.pid from spawnEngineBackground - SIGTERM with 3s wait, escalates to SIGKILL (iii ignores SIGTERM) - lsof :3111 fallback when pidfile is stale or missing - clears pidfile on success Engine persistence is unchanged: `detached: true` + `child.unref()` means memories keep flowing after npx exits. A second `npx` short- circuits at isEngineRunning() before re-prompting. Tightens installInstructions() copy: drops the AGENTMEMORY_III_VERSION rationale tail (now in --help) and reorders so the curl path leads, Docker is path B with the opt-in env var. Help: documents stop, AGENTMEMORY_USE_DOCKER, AGENTMEMORY_III_VERSION.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR refactors the CLI's iii-engine lifecycle management: it adds version probing, pidfile tracking for non-Docker processes, an auto-installer that downloads pinned releases, rewrites the startup decision logic to interactively choose between Docker/install/manual modes based on environment and TTY status, updates install guidance, simplifies the upgrade flow, and introduces a new ChangesEngine Lifecycle and Installation Management
Sequence Diagramsflowchart TD
A["startEngine"] --> B{"AGENTMEMORY_USE_DOCKER set?"}
B -->|Yes| C["Use Docker compose"]
B -->|No| D{"Interactive mode<br/>TTY and not CI?"}
D -->|Yes| E["Prompt user:<br/>install/docker/manual"]
D -->|No| F{"docker-compose.yml exists?"}
E -->|install| G["runIiiInstaller"]
E -->|docker| C
E -->|manual| H["Set no-engine"]
G -->|Success| I["startIiiBin<br/>record PID to pidfile"]
G -->|Fail| J{"Interactive?"}
J -->|Yes| C
J -->|No| H
F -->|Yes| C
F -->|No| H
C --> K["Docker compose startup"]
I --> L["Engine running"]
H --> M["No engine mode"]
K --> N["Docker engine running"]
sequenceDiagram
participant User
participant runStop
participant pidfile
participant lsof
participant Process
User->>runStop: stop command
runStop->>pidfile: read pidfile
runStop->>lsof: findEnginePidsByPort (non-Windows)
runStop->>Process: signalAndWait: SIGTERM
Process-->>runStop: exit or timeout
alt timeout
runStop->>Process: escalate SIGKILL
end
Process-->>runStop: confirmed exit
runStop->>pidfile: clear pidfile
runStop-->>User: success or error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cli.ts`:
- Around line 1376-1415: runStop currently treats any port-bound process as the
engine and will kill host proxies when the engine was started via Docker
compose; modify runStop/readEnginePidfile logic to detect Docker-started engines
and avoid killing host PIDs: when starting via Docker compose, write a small
state file (engine kind and compose file path) alongside the pidfile; in
runStop, check readEnginePidfile() and this state file first—if the state
indicates "docker-compose" or a docker-compose.yml is discoverable and pidfile
is empty, refuse to signal host PIDs and instead either run `docker compose -f
<file> down` (using the stored compose path) or print a clear instruction to run
that command and exit non-zero; keep existing behavior (findEnginePidsByPort +
signalAndWait) only when state shows a non-docker native engine or no compose
file is present, and still call clearEnginePidfile()/clean up state when
shutdown completes.
- Around line 1378-1385: The stop flow currently short-circuits on
isEngineRunning() and clears the pidfile even if a process exists; change the
logic to first call getRestPort(), then call findEnginePidsByPort(port) (or
equivalent PID lookup) in addition to isEngineRunning(), and only call
clearEnginePidfile() and p.outro("Nothing to stop.") when isEngineRunning() is
false AND findEnginePidsByPort(port) returns no candidate PIDs; if candidate
PIDs are found but the HTTP probe failed, preserve the pidfile and surface the
PID(s) so the user can investigate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
Two issues caught in PR #396 review: 1. Docker-started engine would let runStop kill host proxies. `findEnginePidsByPort` returned every PID holding :3111. When the engine was started via `docker compose up -d`, the listening socket lives inside the container; on the host, lsof reports the Docker socket proxy (`com.docker.backend`, vmnetd, etc.). The stop flow would happily SIGTERM/SIGKILL those, taking down Docker Desktop networking for every other container the user has running. Adds an EngineState record at ~/.agentmemory/engine-state.json written by startEngine at spawn time (kind=native|docker, configPath or composeFile). runStop checks state first: - kind=docker: run `docker compose -f <file> down`, never signal host PIDs. Verify the docker binary is still on PATH and the compose file still exists; otherwise print the exact command the user should run. - kind=native: existing pidfile + lsof flow. - state missing + compose file discoverable + port held + no pidfile: refuse to signal blind, point at `docker compose down`. Catches engines started by a different shell or older agentmemory build that didn't write state. 2. runStop cleared the pidfile when isEngineRunning() returned false, even if a stale process was still bound to the port. The HTTP probe can fail while the engine is hung, paused (SIGSTOP), or in a half-closed state. Clearing the pidfile in that case made the next run silently start a second engine on a conflicting port. Now we keep the pidfile and surface the live PIDs with `ps -p` + `lsof` hints so the user can investigate before manual cleanup. 3. Bonus fix: `lsof -i :3111 -t` also returns CLIENT PIDs, not just the LISTEN socket owner. The CLI's own keep-alive fetch in isEngineRunning() showed up as a client, so signalAndWait would SIGKILL its own parent — exit code 137, files never cleaned up. Restrict to `-sTCP:LISTEN` and filter out `process.pid` as a belt-and-braces guard. Verified with four scenarios on a real iii engine: - native + pidfile + state: SIGTERM → SIGKILL → exit 0, state cleared - docker state pointing at missing compose: refusal, state preserved, engine untouched - paused engine (kill -STOP): preserves pidfile, surfaces PIDs - lsof self-pid filter: CLI no longer kills itself Build: dist/cli.mjs 44.46 kB. Tests: 892/903 (11 pre-existing mcp-standalone failures unrelated to CLI changes).
…cker-aware teardown (#401) Patch bump. No breaking changes to the public API or the existing CLI subcommands. CLI bootstrap flow changes are additive; existing Docker users keep working via the new AGENTMEMORY_USE_DOCKER=1 opt-in. Files bumped (9): - package.json - packages/mcp/package.json - plugin/.claude-plugin/plugin.json - plugin/.codex-plugin/plugin.json - src/version.ts - src/types.ts (ExportData.version union — also adds 0.9.13 which was missed in the previous release) - src/functions/export-import.ts (supportedVersions Set) - test/export-import.test.ts - CHANGELOG.md PRs included since v0.9.13: #396 — feat(cli): install iii-engine binary first, Docker opt-in; agentmemory stop with Docker-aware teardown; LISTEN-only lsof filter so the CLI no longer signals its own parent. #397 — docs(readme): move OpenHuman after pi in agents grid.
Summary
AGENTMEMORY_USE_DOCKER=1, an explicit prompt choice, or auto-fallback when install failsagentmemory stopcommand with SIGTERM → SIGKILL escalation (iii ignores SIGTERM) andlsof :3111fallback~/.agentmemory/iii.pidfromspawnEngineBackgroundsostopdoesn't have to greppsWhy
Cold-start
npx @agentmemory/agentmemorycould spend 15s waiting for/livezagainst nothing when Docker was the only fallback.docker compose up -dreturns 0 even when the underlying daemon is degraded — most recently hit on Rancher Desktop, where the engine never starts but the CLI thinks it did and times out with a misleading "engine started but REST never responded".The native iii binary path was already implemented in
runUpgrade(curl + tar to~/.local/bin), it just wasn't wired into first-run. This PR extracts it intorunIiiInstaller()and makes it the default tier.Persistence is unchanged:
detached: true+child.unref()means memories keep flowing afternpxexits. A secondnpxshort-circuits atisEngineRunning().agentmemory stopis the explicit teardown.New fallback chain (cli.ts:271)
iiion PATH → start~/.local/bin/iii/ fallback paths → startp.select{ install / docker / manual }CI=1auto-picks installAGENTMEMORY_USE_DOCKER=1auto-picks dockerBehaviour matrix
AGENTMEMORY_USE_DOCKER=1iii --versionmatchesIIPINNED_VERSIONon second runisEngineRunning(), no promptagentmemory stop~/.agentmemory/iii.pidfirst, falls back tolsof -i :3111 -tNothing to stopwhen port closedTest plan
npm test— 903/903 passingnpm run build— clean (44.41 kB cli.mjs)node dist/cli.mjs --help— shows newstopcommand and env varsnode dist/cli.mjs stopagainst a running iii engine on :3111 — SIGTERM → SIGKILL, port releasednode dist/cli.mjs stopagainst no engine — short-circuits with "Nothing to stop"~/.local/bin/iii, runnpx @agentmemory/agentmemory, expect clack prompt → install → startisEngineRunning()~/.rd/bin/dockerbut noiii; expect clack prompt (not Docker)CI=1 npx @agentmemory/agentmemorynon-interactively installs binaryAGENTMEMORY_USE_DOCKER=1 npx ...keeps current Docker behaviourSummary by CodeRabbit
New Features
stopcommand to manage and terminate engine processesImprovements