agentmemory stop disconnects worker but leaves iii engine running with stale function registrations #474

## Symptom

When iterating on agentmemory locally (rebuild → swap dist → restart), the new code on disk is NOT picked up by the running engine even after `agentmemory stop && agentmemory`. The runtime continues to serve the OLD function definitions until the iii engine process is explicitly killed.

## Repro

1. Run agentmemory normally
2. Modify any function source, e.g. `src/functions/diagnostics.ts` — add a new category to `ALL_CATEGORIES`
3. `npm run build`
4. Copy the new dist over the deployed location, e.g.:
   ```bash
   rm -rf /opt/homebrew/lib/node_modules/@agentmemory/agentmemory/dist
   cp -R dist /opt/homebrew/lib/node_modules/@agentmemory/agentmemory/dist
   ```
5. `agentmemory stop && nohup agentmemory > /tmp/am.log 2>&1 < /dev/null & disown`
6. Verify livez: `curl -s http://localhost:3111/agentmemory/livez` returns 200 OK
7. Probe the changed function: it returns the OLD behavior. Either the new category is absent from the response, or the new `ALL_CATEGORIES` entries are silently dropped because `categories.filter((c) => ALL_CATEGORIES.includes(c))` rejects them.

I confirmed in my own repro that the deployed `dist/index.mjs` SHA matches the locally-built one and the new code IS present in the bundle — but the running process behaves as if it's still on the old code.
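For anyone reproducing this, the bundle comparison can be scripted along these lines. This is a minimal sketch: the helper names `sha256_of` and `same_bundle` are made up for illustration, and the deployed path is the example Homebrew location from step 4, so adjust it to your install.

```bash
#!/usr/bin/env bash

# Print the SHA-256 of a file, using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# Succeed only if both files exist and hash identically.
same_bundle() {
  [ -f "$1" ] && [ -f "$2" ] && [ "$(sha256_of "$1")" = "$(sha256_of "$2")" ]
}

# Example paths from the repro above; both are assumptions about your layout.
BUILT="dist/index.mjs"
DEPLOYED="/opt/homebrew/lib/node_modules/@agentmemory/agentmemory/dist/index.mjs"

if same_bundle "$BUILT" "$DEPLOYED"; then
  echo "bundles match: the stale behavior is not a copy problem"
else
  echo "bundles differ or are missing: fix the copy step first"
fi
```

A match here rules out the copy step and points at the running processes instead.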

## Workaround that DOES work

```bash
# pkill -f takes an extended regex, so alternation is a bare | (a \| would match a literal pipe)
pkill -9 -f 'iii|node dist/index'
nohup agentmemory > /tmp/am.log 2>&1 < /dev/null & disown
```

After the hard kill, the next start picks up the new bundle immediately.

## Likely cause

`agentmemory stop` appears to disconnect the agentmemory node worker from the iii engine, but the iii engine process keeps running. When `agentmemory` starts again, it reconnects to the existing iii engine, which presumably retains the prior worker's registered function definitions (or otherwise serves them from a cache that the new worker can't override).

A pid check during my repro confirmed the iii Rust binary (`~/.local/bin/iii`) keeps running across `agentmemory stop` invocations.
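The pid check can be repeated with something like the sketch below. It assumes the engine's process name is exactly `iii` (inferred from the binary path above); `iii_pids` and `surviving_pids` are hypothetical helper names, not part of agentmemory or iii. The part that actually calls `agentmemory stop` only runs if the CLI is on `PATH`.

```bash
#!/usr/bin/env bash

# Space-separated pids of processes named exactly "iii" (assumed engine name).
iii_pids() {
  pgrep -x iii | tr '\n' ' '
}

# Print each pid (one per line) that appears in both space-separated lists.
surviving_pids() {
  local before="$1" after="$2" pid
  for pid in $before; do
    case " $after " in
      *" $pid "*) echo "$pid" ;;
    esac
  done
}

# Snapshot pids, stop agentmemory, then see which engine pids are still there.
if command -v agentmemory >/dev/null 2>&1; then
  before=$(iii_pids)
  agentmemory stop
  sleep 1   # give any shutdown a moment to complete
  survivors=$(surviving_pids "$before" "$(iii_pids)")
  if [ -n "$survivors" ]; then
    echo "iii survived agentmemory stop, pid(s): $survivors"
  else
    echo "no iii process survived agentmemory stop"
  fi
fi
```

If the same pid shows up before and after the stop, the engine was never restarted, which matches the behavior described above.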

## Impact

- Anyone iterating locally on agentmemory has to know to `pkill -9` before changes take effect. The normal "stop && start" loop silently keeps the old code live, which is extremely confusing
- Related: the iii websocket reconnect chatter (`[OTel] Disconnected from engine, will reconnect…`) in the log makes it harder to tell whether a restart actually happened, because the same reconnect messages appear during regular long-running operations too

## Suggested fix

`agentmemory stop` should signal the iii engine process to exit, not just disconnect the worker, and it should verify that the iii pid is gone before returning. If iii has a graceful-shutdown API, prefer that; otherwise send SIGTERM with a short timeout and fall back to SIGKILL.
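The SIGTERM-then-SIGKILL fallback could look roughly like this in shell form. It is only a sketch of the idea, not how `agentmemory stop` is implemented: `stop_pid` is a hypothetical helper, and how the CLI would discover the engine's pid is left open.

```bash
#!/usr/bin/env bash

# Ask a process to exit, wait up to <timeout> seconds, then force-kill it.
# Returns 0 once the pid is gone, non-zero if it is still alive at the end.
# Note: `kill -0` also fails for processes owned by another user, so this
# sketch treats those as "gone".
stop_pid() {
  local pid="$1" timeout="${2:-5}" waited=0

  kill -0 "$pid" 2>/dev/null || return 0    # already gone
  kill -TERM "$pid" 2>/dev/null             # polite request first

  while [ "$waited" -lt "$timeout" ]; do
    kill -0 "$pid" 2>/dev/null || return 0  # exited within the grace period
    sleep 1
    waited=$((waited + 1))
  done

  kill -0 "$pid" 2>/dev/null || return 0
  kill -KILL "$pid" 2>/dev/null             # grace period expired
  sleep 1
  ! kill -0 "$pid" 2>/dev/null              # verify before returning
}
```

Called as `stop_pid "$engine_pid" 5`, this gives the engine five seconds to shut down cleanly and only returns success once the pid can no longer be signaled.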

## Context

Surfaced while iterating on #472 (chunking) and #473 (lesson visibility) — every rebuild+restart cycle was returning misleading "the new code isn't deployed" symptoms until I switched to the kill-9 workaround.
