Symptom
When iterating on agentmemory locally (rebuild → swap dist → restart), the new code on disk is NOT picked up by the running engine even after agentmemory stop && agentmemory. The runtime continues to serve the OLD function definitions until the iii engine process is explicitly killed.
Repro
- Run agentmemory normally
- Modify any function source, e.g. src/functions/diagnostics.ts: add a new category to ALL_CATEGORIES
- Rebuild: npm run build
- Copy the new dist over the deployed location, e.g.:
  rm -rf /opt/homebrew/lib/node_modules/@agentmemory/agentmemory/dist
  cp -R dist /opt/homebrew/lib/node_modules/@agentmemory/agentmemory/dist
- Restart: agentmemory stop && nohup agentmemory > /tmp/am.log 2>&1 < /dev/null & disown
- Verify livez: curl -s http://localhost:3111/agentmemory/livez returns 200 OK
- Probe the changed function: it returns the OLD behavior (the new category is absent from the response, or the new ALL_CATEGORIES entries are silently filtered out because categories.filter((c) => ALL_CATEGORIES.includes(c)) rejects them)
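To show why the stale list fails silently rather than with an error, here is a minimal sketch of the filter quoted in the last step. The category names and the selectCategories helper are hypothetical; the real ALL_CATEGORIES lives in src/functions/diagnostics.ts.

```typescript
// Hypothetical category names, for illustration only.
const OLD_ALL_CATEGORIES: readonly string[] = ["storage", "index"];
const NEW_ALL_CATEGORIES: readonly string[] = [...OLD_ALL_CATEGORIES, "chunking"];

// Same shape as the filter quoted above, parameterised over the allow-list.
function selectCategories(requested: string[], allowed: readonly string[]): string[] {
  return requested.filter((c) => allowed.includes(c));
}

const requested = ["storage", "chunking"];

// A process still holding the old list drops the new category without raising anything.
const staleResult = selectCategories(requested, OLD_ALL_CATEGORIES);

// A process running the rebuilt bundle keeps it.
const freshResult = selectCategories(requested, NEW_ALL_CATEGORIES);

console.log({ staleResult, freshResult });
```

Because the filter only narrows the request, a stale process looks healthy from the outside: the call succeeds and the new category is simply missing.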
I confirmed in my own repro that the deployed dist/index.mjs SHA matches the locally-built one and the new code IS present in the bundle — but the running process behaves as if it's still on the old code.
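For anyone who wants to repeat the bundle comparison, a minimal sketch follows. It assumes SHA-256 (any content hash would serve), and the helper names are hypothetical; the paths in the usage comment are the ones from the repro.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Hex SHA-256 digest of a string or buffer.
function sha256Hex(data: string | Buffer): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hex SHA-256 digest of a file's bytes.
function sha256OfFile(path: string): string {
  return sha256Hex(readFileSync(path));
}

// True when the locally built bundle and the deployed bundle have identical content.
function sameBundle(builtPath: string, deployedPath: string): boolean {
  return sha256OfFile(builtPath) === sha256OfFile(deployedPath);
}

// Usage:
// sameBundle(
//   "dist/index.mjs",
//   "/opt/homebrew/lib/node_modules/@agentmemory/agentmemory/dist/index.mjs",
// );
```

A true result here, combined with old behavior at runtime, rules out a failed copy and points at the running processes instead.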
Workaround that DOES work
pkill -9 -f "iii|node dist/index"
nohup agentmemory > /tmp/am.log 2>&1 < /dev/null & disown
After the hard kill, the next start picks up the new bundle immediately.
Likely cause
agentmemory stop appears to disconnect the agentmemory node worker from the iii engine, but the iii engine process keeps running. When agentmemory starts again, it reconnects to the existing iii engine, which presumably retains the prior worker's registered function definitions (or otherwise serves them from a cache that the new worker can't override).
A pid check during my repro confirmed the iii Rust binary (~/.local/bin/iii) keeps running across agentmemory stop invocations.
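The hypothesis can be pictured with a toy model. This is NOT iii's implementation, and ToyEngine and its methods are invented for illustration; it only shows one way a long-lived engine could keep serving a previous worker's definitions after a reconnect.

```typescript
// Toy model only: one possible mechanism, not iii's actual code.
type Handler = () => string;

class ToyEngine {
  private readonly registry = new Map<string, Handler>();

  // Hypothesised flaw: an existing registration wins, so a reconnecting
  // worker cannot replace the definition left behind by its predecessor.
  register(functionId: string, handler: Handler): void {
    if (!this.registry.has(functionId)) {
      this.registry.set(functionId, handler);
    }
  }

  invoke(functionId: string): string {
    const handler = this.registry.get(functionId);
    if (!handler) throw new Error(`unknown function: ${functionId}`);
    return handler();
  }
}

// The engine process outlives the worker across "agentmemory stop".
const engine = new ToyEngine();
engine.register("diagnostics", () => "old"); // first worker, old bundle
engine.register("diagnostics", () => "new"); // restarted worker, new bundle
const served = engine.invoke("diagnostics");

// After a hard kill the engine restarts with an empty registry.
const freshEngine = new ToyEngine();
freshEngine.register("diagnostics", () => "new");
const servedAfterKill = freshEngine.invoke("diagnostics");

console.log({ served, servedAfterKill });
```

If the real cause is a cache rather than a first-registration-wins rule, the observable effect would be the same: only a fresh engine process serves the new definitions.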
Impact
- Anyone doing local iteration on agentmemory has to know to pkill -9 for changes to take effect; the normal "stop && start" loop silently keeps old code live, which is extremely confusing
- Related: the iii websocket reconnect chatter ([OTel] Disconnected from engine, will reconnect…) in the log makes it harder to tell whether a restart actually happened, because the same reconnect messages appear during regular long-running operations too
Suggested fix
agentmemory stop should signal the iii engine process to exit, not just disconnect the worker. Verify the iii pid is gone before returning. If iii has a graceful-shutdown API, prefer that; otherwise SIGTERM with a short timeout and SIGKILL fallback.
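A rough sketch of the signal-based fallback, assuming the stop command already knows the engine pid and runs on a POSIX system. The function names, the 3-second grace period, and the 50 ms poll interval are all hypothetical; a graceful-shutdown API, if iii has one, would replace the SIGTERM step.

```typescript
import { setTimeout as sleep } from "node:timers/promises";

type StopResult = "not-running" | "exited-after-sigterm" | "killed-with-sigkill";

// Signal 0 delivers nothing; it only reports whether the pid exists.
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

// Sends a signal, tolerating the race where the process has already exited.
function trySignal(pid: number, signal: NodeJS.Signals): void {
  try {
    process.kill(pid, signal);
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code !== "ESRCH") throw err;
  }
}

// Polls until the pid disappears or the timeout elapses; true means it is gone.
async function waitForExit(pid: number, timeoutMs: number, pollMs = 50): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (isAlive(pid)) {
    if (Date.now() >= deadline) return false;
    await sleep(pollMs);
  }
  return true;
}

// SIGTERM first, SIGKILL if the process outlives the grace period,
// and only return once the pid is confirmed gone.
async function stopEngine(pid: number, graceMs = 3000): Promise<StopResult> {
  if (!isAlive(pid)) return "not-running";
  trySignal(pid, "SIGTERM");
  if (await waitForExit(pid, graceMs)) return "exited-after-sigterm";
  trySignal(pid, "SIGKILL");
  if (!(await waitForExit(pid, graceMs))) {
    throw new Error(`process ${pid} still running after SIGKILL`);
  }
  return "killed-with-sigkill";
}
```

Returning a distinct result for each path would also give the log an unambiguous line saying whether the engine actually went away, which addresses the reconnect-chatter confusion noted under Impact.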
Context
Surfaced while iterating on #472 (chunking) and #473 (lesson visibility) — every rebuild+restart cycle was returning misleading "the new code isn't deployed" symptoms until I switched to the kill-9 workaround.