feat: MCP server enhancements + dimos mcp CLI + agent context (DIM-686, DIM-687)#1451

Merged
spomichter merged 40 commits into dev from feat/dim-686-mcp-agent-cli
Mar 9, 2026

Conversation


@spomichter spomichter commented Mar 6, 2026

What

MCP server enhancements + CLI tools for interacting with running DimOS instances via the Model Context Protocol.

MCP Server (DIM-686)

  • McpServer introspection as @Skill tools: server_status, list_modules, agent_send are standard MCP tools discoverable via tools/list (previously custom JSON-RPC methods)
  • McpAdapter client class (dimos/agents/mcp/mcp_adapter.py): Single JSON-RPC client used by CLI + tests. Uses requests, unique uuid4() IDs, URL discovery from RunEntry or GlobalConfig
  • Transport start()/stop() lifecycle in agent_send

CLI

  • dimos mcp list-tools — list available MCP tools
  • dimos mcp call <tool> --arg key=value or --json-args '{"key": "val"}' — call any tool
  • dimos mcp status — server status (modules, skills, PID)
  • dimos mcp modules — module-skill mapping
  • dimos agent-send <message> — send message to running agent via LCM
  • All commands use McpAdapter.from_run_entry() for URL discovery
  • Proper error handling with McpError exceptions
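
Under the hood these commands drive plain JSON-RPC 2.0 over HTTP. A minimal sketch of the envelope and error unwrapping (illustrative only — build_tools_call and unwrap are hypothetical names, not the actual McpAdapter API):

```python
import uuid
from typing import Any

def build_tools_call(tool_name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """JSON-RPC 2.0 envelope for an MCP tools/call, with a unique uuid4 id."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

def unwrap(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Raise on a JSON-RPC error response; otherwise return the content list."""
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"MCP error {err.get('code')}: {err.get('message')}")
    return response.get("result", {}).get("content", [])

req = build_tools_call("server_status", {})
ok = unwrap({"jsonrpc": "2.0", "id": req["id"],
             "result": {"content": [{"type": "text", "text": "up"}]}})
print(ok[0]["text"])  # up
```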

Integration Tests

  • 29 pytest tests (test_mcp_integration.py) covering every MCP method, CLI command, error path, lifecycle, and rapid restart
  • Tests use shared McpAdapter — no duplication
  • Demo scripts renamed with demo_ prefix (not executed in CI)

Other Fixes

  • worker.py: os.kill(pid, 0) for process liveness check (replaces is_alive() + AssertionError catch)
  • requests>=2.28 added as explicit dependency
  • Port 9990 sourced only from GlobalConfig.mcp_port
  • Blueprint renamed to demo-mcp-stress-test (hidden from dimos list)

Review Comments Addressed

All 27 of Paul's review comments addressed:

  • McpAdapter class (P1/P3/P4/P8/P13/P14/P17/P21)
  • Custom methods → @Skill tools (P16)
  • UUID IDs (P5), imports at top (P6), --json-args (P7)
  • Assert after cleanup (P11), transport start/stop (P20)
  • requests in pyproject (P12), skill_count removed (P18)
  • File renames (P22/P26), GlobalConfig port (P25)
  • Duplicate CI env removed (P9), object→Any (P15)

Not in scope: agent_context.py was deleted in a prior commit — runtime info available via dimos mcp status.

- close devnull/stderr_file after dup2 (fd leak)
- remove PID from pre-fork output (printed parent PID, not daemon PID)
- show log_dir not log_dir/dimos.jsonl (file doesn't exist yet)
- keep tests in tests/ (dimos/conftest.py breaks isolated tests)
dimos status — shows running instances with run-id, pid, blueprint, uptime, log dir
dimos stop — sends SIGTERM (or SIGKILL with --force), waits 5s, escalates if needed
  --pid to target specific instance, --all to stop everything
  cleans registry on stop

also adds get_most_recent() to run_registry.
8 new tests covering sigterm, sigkill, escalation, dead process cleanup, most-recent lookup.
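
The SIGTERM-then-escalate sequence can be sketched against a child process; the real dimos stop targets a detached daemon via the run registry, so this is only an illustration of the escalation logic (stop is a hypothetical helper):

```python
import signal
import subprocess
import sys

def stop(proc: subprocess.Popen, timeout: float = 5.0, force: bool = False) -> None:
    """Send SIGTERM (SIGKILL when force=True), wait up to `timeout` seconds,
    then escalate to SIGKILL if the process has not exited."""
    proc.send_signal(signal.SIGKILL if force else signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()   # escalate
        proc.wait()   # reap so no zombie is left behind

# demo: a child that would sleep 60s is stopped promptly
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
stop(child, timeout=5.0)
print(child.returncode)  # negative signal number, e.g. -15 for SIGTERM
```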
lightweight 2-module blueprint (PingModule + PongModule) that needs no
hardware, no LFS data, and no replay files. tests real forkserver workers
and module deployment.

covers:
- single worker lifecycle (build -> health check -> registry -> stop)
- multiple workers (2 workers, both alive)
- health check detects dead worker (SIGKILL -> detect failure)
- registry entry JSON field roundtrip
- stale entry cleanup (dead PIDs removed, live entries kept)
both stdout and stderr redirect to the same file after daemonize(),
so stderr.log was misleading. daemon.log better describes the contents.
folds DIM-685 into daemon PR to avoid merge conflicts on dimos.py.

- set_run_log_dir() before blueprint.build() routes structlog to
  <log_base>/<run-id>/main.jsonl
- workers inherit DIMOS_RUN_LOG_DIR env var via forkserver
- FileHandler replaces RotatingFileHandler (multi-process rotation unsafe)
- fallback: env var check -> legacy per-pid files
- 6 unit tests for routing logic
setup_logger() runs at import time throughout the codebase, creating
FileHandlers pointing to the legacy log path. set_run_log_dir() was
resetting the global path but not updating these existing handlers,
so main.jsonl was created but stayed empty (0 bytes).

fix: iterate all stdlib loggers and redirect their FileHandlers to
the new per-run path. verified: main.jsonl now receives structured
JSON logs (1050 bytes, 5 lines in test run).
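
The fix described above can be sketched like this — walk every registered logger, close each FileHandler created at import time, and attach a fresh one pointing at the per-run path (redirect_file_handlers is a hypothetical name, not the actual function):

```python
import logging
import tempfile
from pathlib import Path

def redirect_file_handlers(new_dir: Path, filename: str = "main.jsonl") -> None:
    """Re-point every stdlib FileHandler at the per-run path. Handlers built at
    import time keep their old stream, so each is closed and replaced outright
    (rather than poking the private handler._open() API)."""
    new_dir.mkdir(parents=True, exist_ok=True)
    target = str(new_dir / filename)
    names = [None] + list(logging.root.manager.loggerDict)
    for name in names:
        logger = logging.getLogger(name)  # getLogger(None) is the root logger
        for handler in list(logger.handlers):
            if not isinstance(handler, logging.FileHandler):
                continue  # leave StreamHandlers (console output) alone
            replacement = logging.FileHandler(target)
            replacement.setLevel(handler.level)
            replacement.setFormatter(handler.formatter)
            logger.removeHandler(handler)
            handler.close()
            logger.addHandler(replacement)

# demo: a handler created "at import time" gets redirected to the per-run dir
tmp = Path(tempfile.mkdtemp())
log = logging.getLogger("demo_redirect")
log.setLevel(logging.INFO)
log.addHandler(logging.FileHandler(tmp / "legacy.log"))
redirect_file_handlers(tmp / "run-0001")
log.info("routed to the per-run file")
```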
testpaths in pyproject.toml is ['dimos'], so tests/ at repo root
wouldn't be picked up by CI. moved all 3 test files to dimos/core/
alongside existing core tests. all 41 tests pass with conftest active.
matches convention from test_worker.py — forkserver-based tests are
marked slow so they run in CI but are skipped in local default pytest.

local default: 36 tests (unit only)
CI (-m 'not (tool or mujoco)'): 41 tests (unit + e2e)
…682, DIM-684)

16 tests using typer CliRunner with real subprocess PIDs:

status (7 tests):
- no instances, running instance details, uptime formatting
- MCP port, multiple instances, dead PID filtering

stop (9 tests):
- default most recent, --pid, --pid not found
- --all, --all empty, --force SIGKILL
- already-dead cleanup, SIGTERM verification
…684)

3 new e2e tests that exercise dimos status and stop against a live
PingPong blueprint with real forkserver workers:

- status shows live blueprint details (run_id, PID, blueprint name)
- registry entry findable after real build, workers alive
- coord.stop() kills workers, registry cleaned
- move system imports to top of run(), logger.info before heavy imports
- remove hardcoded MCP port line from daemon output
- add n_workers/n_modules properties to ModuleCoordinator (public API)
- single-instance model: remove --all/--pid from stop, simplify status
- use _get_user_data_dir() for XDG-compliant registry/log paths
- remove mcp_port from RunEntry (should be in GlobalConfig)
- inline _shutdown_handler as closure in install_signal_handlers
- add SIGKILL wait poll (2s) to avoid zombie race with port conflict check
- replace handler._open() private API with new FileHandler construction
- fix e2e _clean_registry to monkeypatch REGISTRY_DIR
- reset logging_config module globals in test fixture
- move test imports to module level
structlog FileHandler writes to main.jsonl — daemon.log only ever
captured signal shutdown messages. no useful content.
…port

fixes mypy name-defined error from import reorganization
- simplify health_check to single alive check (build is synchronous)
- remove --health-timeout flag (no longer polling)
- add workers property to ModuleCoordinator (public API)
- separate try-excepts for kill/wait in sleeper fixture cleanup
- move repeated imports to top in test_per_run_logs
- convert all test helpers to fixtures with cleanup
- replace os.environ['CI']='1' at import time with monkeypatch fixture
- replace all time.sleep() with polling loops in tests
- mark slow stop tests @pytest.mark.slow
- move stop_entry logic from CLI to run_registry
- remove # type: ignore, properly type _stop_entry
- remove duplicate port conflict test
- remove redundant monkeypatch cleanup in test_per_run_logs
- list(glob()) to avoid mutation during iteration in cleanup_stale
- XDG_STATE_HOME compliant _get_state_dir() for registry/log paths
- remove redundant cli_config_overrides in e2e builds
- move duplicate imports to module level in e2e tests
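
The "polling loops instead of time.sleep()" pattern mentioned above is roughly this (wait_until is a hypothetical helper, not code from the PR):

```python
import threading
import time
from typing import Callable

def wait_until(predicate: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll `predicate` until true or the deadline passes. Replaces fixed
    time.sleep() waits, which are either too slow or flaky under load."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# demo: a condition set from another thread is seen well before the deadline
ready = threading.Event()
threading.Timer(0.2, ready.set).start()
print(wait_until(ready.is_set, timeout=5.0))  # True
```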
@spomichter

@greptile review


greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR introduces a McpAdapter client class, promotes three introspection handlers (server_status, list_modules, agent_send) from custom JSON-RPC methods to standard @skill tools, and adds a full dimos mcp CLI sub-command group backed by 29 integration tests. The architecture is clean and addresses the previous round of review comments well.

Two bugs need fixing before merge:

  • dimos/core/worker.py — The pid property now calls os.kill(pid, 0) but does not guard against self._process.pid being None (unstarted process). os.kill(None, 0) raises TypeError, which is not caught by except OSError, resulting in an unhandled exception.
  • dimos/core/tests/demo_mcp_killtest.py — test_mcp_basic_ops() still calls the removed custom methods dimos/status and dimos/agent_send. These now return error code -32601, causing KeyError in both assertion blocks and making every kill/restart cycle report failures.

One lower-priority item worth addressing:

  • dimos/agents/mcp/mcp_adapter.py — requests.HTTPError (raised by raise_for_status() on 4xx/5xx responses) is not caught by CLI callers, which only handle requests.ConnectionError and McpError, leaving users with raw Python tracebacks on server errors.

Confidence Score: 2/5

  • Not safe to merge — one production bug in worker.py and one demo script that will always fail due to removed methods.
  • The worker.py pid property introduces a real TypeError regression on unstarted processes (a code path that can be hit under normal operation). Additionally, demo_mcp_killtest.py still calls the two custom JSON-RPC methods that were explicitly removed in this PR, guaranteeing failures in the demo/stress suite. Both require code changes before the PR is production-ready.
  • dimos/core/worker.py (uncaught TypeError in pid property) and dimos/core/tests/demo_mcp_killtest.py (calls to deleted JSON-RPC methods).

Important Files Changed

  • dimos/agents/mcp/mcp_adapter.py: New McpAdapter JSON-RPC client class used by CLI and tests; well-structured with _unwrap error handling, but requests.HTTPError (non-200 responses) is not converted to McpError, leaving CLI callers exposed to unhandled exceptions.
  • dimos/agents/mcp/mcp_server.py: Refactored MCP server converts server_status, list_modules, and agent_send from custom JSON-RPC methods to standard @Skill tools; custom handler code removed cleanly, and module deduplication now uses insertion-order-preserving dict.fromkeys.
  • dimos/core/worker.py: Changed pid liveness check from is_alive() to os.kill(pid, 0), but the new code does not guard against self._process.pid being None (unstarted process), causing an uncaught TypeError instead of OSError.
  • dimos/robot/cli/dimos.py: Adds mcp subcommand group and agent-send command with proper McpError and ConnectionError handling; empty tool output is no longer treated as failure.
  • dimos/core/test_mcp_integration.py: 29 integration tests covering MCP lifecycle, CLI, error handling, and rapid restarts; class-scoped fixtures reduce setup overhead, but the mcp_shared fixture places assert ready after teardown, so a startup failure produces confusing cascading test errors rather than an immediate diagnostic.
  • dimos/core/tests/demo_mcp_killtest.py: Kill/restart stress demo still calls the removed dimos/status and dimos/agent_send JSON-RPC methods; these now return error -32601, causing KeyError in both assertion blocks and making every test cycle report failures.

Sequence Diagram

sequenceDiagram
    participant CLI as dimos CLI
    participant Adapter as McpAdapter
    participant Server as McpServer (FastAPI)
    participant Handler as handle_request()
    participant RPC as RpcCall (skill dispatch)
    participant Module as StressTestModule / McpServer skills

    CLI->>Adapter: mcp list-tools / call / status / modules
    Adapter->>Server: POST /mcp {jsonrpc, method, params}
    Server->>Handler: handle_request(body, app.state.skills, app.state.rpc_calls)

    alt method == "initialize"
        Handler-->>Server: _handle_initialize()
    else method == "tools/list"
        Handler-->>Server: _handle_tools_list(skills)
    else method == "tools/call"
        Handler->>RPC: rpc_calls[name](**args)
        RPC->>Module: skill method (echo / ping / server_status / list_modules / agent_send)
        Module-->>RPC: result
        RPC-->>Handler: result
        Handler-->>Server: _jsonrpc_result_text(result)
    else unknown method
        Handler-->>Server: _jsonrpc_error(-32601)
    end

    Server-->>Adapter: JSONResponse
    Adapter->>Adapter: _unwrap() — raises McpError on error key
    Adapter-->>CLI: typed result or McpError

Comments Outside Diff (2)

  1. dimos/core/tests/demo_mcp_killtest.py, line 1263-1277 (link)

    Calls to removed custom JSON-RPC methods will always fail

    test_mcp_basic_ops() still calls the old custom JSON-RPC methods dimos/status (line 1263) and dimos/agent_send (line 1272) that were removed in this PR. The new handle_request only handles initialize, tools/list, and tools/call; calling any other method returns {"error": {"code": -32601, ...}}.

    As a result, both blocks will raise KeyError when they attempt to access result["result"]["modules"] or assert on result["result"]["content"], causing every kill/restart cycle to accumulate failures.

    Replace with the equivalent tools/call invocations:

        # server_status (was dimos/status)
        try:
            result = mcp_call("tools/call", {"name": "server_status", "arguments": {}})
            data = json.loads(result["result"]["content"][0]["text"])
            assert "StressTestModule" in data["modules"]
            p(f"server_status → pid={data['pid']}")
        except Exception as e:
            p(f"server_status failed: {e}", ok=False)
            failures += 1
    
        # agent_send (was dimos/agent_send)
        try:
            result = mcp_call("tools/call", {"name": "agent_send", "arguments": {"message": "hello from killtest"}})
            assert "hello from killtest" in result["result"]["content"][0]["text"]
            p("agent_send → delivered")
        except Exception as e:
            p(f"agent_send failed: {e}", ok=False)
            failures += 1
  2. dimos/core/test_mcp_integration.py, line 450-453 (link)

    assert ready fires during teardown, masking the real failure

    wait_for_ready() is called before yield, but assert ready is placed after coord.stop() in teardown. If the MCP server never starts, the fixture still yields, all tests in the class run (and fail with confusing ConnectionErrors or wrong assertions), and the root-cause assertion only triggers during fixture teardown — attributed to the last test in the class rather than the startup step.

    Move the assertion before yield so a startup failure is surfaced immediately:

        coord = bp.build()
        ready = _adapter().wait_for_ready()
        assert ready, "MCP server did not start within timeout"
        yield coord
        coord.stop()

Last reviewed commit: 315b201

"dimos status # show running instance",
"dimos stop # stop the instance",
"dimos stop --force # force kill (SIGKILL)",
"dimos restart # restart with same args",

Documented dimos restart command does not exist

The generated context.md instructs agents to run dimos restart, but no such command exists in dimos/robot/cli/dimos.py. An AI agent parsing this file will attempt to call dimos restart, receive a Typer error, and either fail silently or behave unexpectedly. The context file should only document commands that are actually implemented.

Suggested change
"dimos restart # restart with same args",
"dimos mcp modules # module-skill mapping",

Remove the dimos restart line from the CLI Commands section, or implement the command first.

Comment on lines +333 to +336
result = _mcp_call("tools/call", {"name": tool_name, "arguments": arguments})
content = result.get("result", {}).get("content", [])
for item in content:
typer.echo(item.get("text", str(item)))

JSON-RPC errors silently produce empty output

When the MCP server returns a JSON-RPC error response (e.g., {"jsonrpc": "2.0", "id": 1, "error": {"code": -32601, "message": "Unknown: ..."} }), result.get("result", {}) will be {}, so content will be [], and the loop produces no output at all. The user calling dimos mcp call <tool> gets no feedback that the call failed — no error message, no non-zero exit code.

The response should be checked for a top-level "error" key before extracting content:

    if "error" in result:
        err = result["error"]
        typer.echo(f"Error: {err.get('message', err)}", err=True)
        raise typer.Exit(1)
    content = result.get("result", {}).get("content", [])
    for item in content:
        typer.echo(item.get("text", str(item)))

Comment on lines +117 to +139
def _handle_dimos_status(req_id: Any) -> dict[str, Any]:
"""Return running modules, skills, and server info."""
import os

skills = app.state.skills
return _jsonrpc_result(
req_id,
{
"pid": os.getpid(),
"modules": list({s.class_name for s in skills}),
"skills": [s.func_name for s in skills],
"skill_count": len(skills),
},
)


def _handle_dimos_list_modules(req_id: Any) -> dict[str, Any]:
"""Return module names and their skills."""
skills = app.state.skills
modules: dict[str, list[str]] = {}
for s in skills:
modules.setdefault(s.class_name, []).append(s.func_name)
return _jsonrpc_result(req_id, {"modules": modules})

New handlers bypass the skills parameter, coupling to global state

_handle_tools_list and _handle_tools_call both receive skills / rpc_calls as explicit arguments (enabling unit testing and potential multi-tenancy), but _handle_dimos_status and _handle_dimos_list_modules reach into app.state.skills directly. This makes these two handlers untestable in isolation (e.g., via handle_request(...) with a synthetic skill list) and will silently diverge if the skills passed to handle_request ever differ from the global app.state.

Consider accepting skills as a parameter just like the other handlers do, and pass it through from handle_request.

lines.append("")
lines.append("Use `dimos mcp list-tools` for full schema, or call via:")
lines.append("```")
lines.append('curl -s localhost:9990/mcp -d \'{"jsonrpc":"2.0","id":1,"method":"tools/list"}\'')

Hardcoded port in curl example ignores mcp_port parameter

The function accepts mcp_port as a parameter and correctly uses it for the URL on line 60, but the curl example on this line still hardcodes localhost:9990 regardless of the value of mcp_port. If the port is overridden (e.g., in tests or multi-instance scenarios), the curl snippet in the generated context file will be wrong.

Suggested change
lines.append('curl -s localhost:9990/mcp -d \'{"jsonrpc":"2.0","id":1,"method":"tools/list"}\'')
lines.append(f'curl -s localhost:{mcp_port}/mcp -d \'{{"jsonrpc":"2.0","id":1,"method":"tools/list"}}\'')

Comment on lines +218 to +223
@pytest.mark.slow
class TestDaemonMCPRecovery:
"""Test MCP recovery after daemon crashes and restarts."""

def test_restart_after_clean_stop(self):
"""Stop then start again — MCP should come back."""

stop() called twice on the same coordinator

test_mcp_dead_after_stop explicitly calls mcp_blueprint.stop() to assert the server goes down, but the mcp_blueprint fixture also calls coord.stop() in its teardown (yield coord; coord.stop()). This means stop() is invoked twice on the same coordinator. Depending on the implementation of stop(), this can raise exceptions, log spurious errors, or hang — especially since McpServer.stop() waits on _serve_future.result(timeout=5.0) which may raise if already stopped.

Consider testing this in an independent fixture that handles teardown manually, or guarding against double-stop in the coordinator/module.

Comment on lines +78 to +89
blueprint="stress-test",
started_at=datetime.now(timezone.utc).isoformat(),
log_dir="/tmp/dimos-stress-test",
cli_args=["stress-test"],
config_overrides={"n_workers": 1},
)
entry.save()
yield entry
entry.remove()


def _mcp_call(method: str, params: dict | None = None, port: int = MCP_PORT) -> dict:

Hardcoded /tmp/dimos-stress-test path breaks test isolation

The mcp_entry fixture creates a RunEntry with log_dir="/tmp/dimos-stress-test" as a hard-coded string. If multiple test runs execute concurrently (e.g., in CI with parallelism), they will share the same path and can interfere with each other. The fixture already has access to tmp_path (via mcp_blueprint's implicit dependency chain); consider using str(tmp_path / "stress-test") for isolation, or at least a unique suffix tied to run_id.

…ests (DIM-686, DIM-687)

MCP server:
- new dimos/status and dimos/list_modules JSON-RPC methods

CLI:
- dimos mcp list-tools/call/status/modules commands
- uses stdlib urllib (no requests dependency)

agent context (DIM-687):
- generate_context() writes context.md on dimos run

stress tests (23 tests):
- MCP lifecycle, tools, error handling, rapid calls, restart cycles
- all tests use fixtures with cleanup, poll loops (no sleep)

Closes DIM-686, Closes DIM-687
- remove dimos restart from agent context (not in this branch yet)
- handle JSON-RPC errors in dimos mcp call (show error, exit 1)
- pass skills as parameter to dimos/status and dimos/list_modules handlers
- fix hardcoded port in curl example (use mcp_port parameter)
- fix double stop() in test_mcp_dead_after_stop (standalone coordinator)
- use tmp_path for log_dir in mcp_entry fixture (test isolation)
@spomichter spomichter force-pushed the feat/dim-686-mcp-agent-cli branch from 4afc593 to 9caa3fb on March 6, 2026 22:24
@spomichter

@greptile review — addressed all 6 comments: removed undocumented restart from context, added JSON-RPC error handling in CLI, passed skills as parameter to handlers, fixed hardcoded port in curl example, fixed double stop() in test, used tmp_path for test isolation.

- dimos/agent_send MCP method publishes on /human_input LCM channel
- dimos agent-send 'message' CLI wraps the MCP call
- 4 new tests: MCP send, empty message, CLI send, no-server
lines.append("Use `dimos mcp list-tools` for full schema, or call via:")
lines.append("```")
lines.append(
f'curl -s localhost:{mcp_port}/mcp -d \'{"jsonrpc":"2.0","id":1,"method":"tools/list"}\''

f-string evaluates JSON dict as Python expression

The curly braces in {"jsonrpc":"2.0","id":1,"method":"tools/list"} are treated as an f-string expression, not a literal string. Python evaluates the dict literal and calls str() on it, producing Python dict repr with single quotes and unquoted integers — not valid JSON. The resulting context.md will contain a broken curl command:

# Generated (wrong):
curl -s localhost:9990/mcp -d '{'jsonrpc': '2.0', 'id': 1, 'method': 'tools/list'}'

Double-braces must be used to escape literal curly braces in an f-string:

Suggested change
f'curl -s localhost:{mcp_port}/mcp -d \'{"jsonrpc":"2.0","id":1,"method":"tools/list"}\''
lines.append(
f'curl -s localhost:{mcp_port}/mcp -d \'{{"jsonrpc":"2.0","id":1,"method":"tools/list"}}\''
)

This produces the intended output: curl -s localhost:9990/mcp -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
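
The escaping rule is easy to verify directly; this snippet is illustrative, not code from the PR:

```python
mcp_port = 9990

# Doubled braces {{ }} emit literal { and }; {mcp_port} remains a substitution.
cmd = f'curl -s localhost:{mcp_port}/mcp -d \'{{"jsonrpc":"2.0","id":1,"method":"tools/list"}}\''
print(cmd)
# → curl -s localhost:9990/mcp -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
```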

Comment on lines +245 to +256
def test_registry_cleanup_after_stop(self, mcp_blueprint, mcp_entry):
"""Registry entry should be removable after stop."""
assert health_check(mcp_blueprint)

runs = list_runs(alive_only=True)
assert len(runs) == 1

mcp_blueprint.stop()
mcp_entry.remove()

runs = list_runs(alive_only=True)
assert len(runs) == 0

Double stop() on coordinator via mcp_blueprint fixture

test_registry_cleanup_after_stop uses the mcp_blueprint fixture (whose teardown always calls coord.stop()) but also explicitly calls mcp_blueprint.stop() at line 252 before teardown runs. This triggers a double-stop: once in the test body and again in the fixture. Because McpServer.stop() waits on _serve_future.result(timeout=5.0) — which will raise or hang if the future has already been resolved — this can cause spurious errors, hangs, or misleading test failures.

Consider refactoring the test to use its own isolated coordinator (without the mcp_blueprint fixture) so it can own teardown:

def test_registry_cleanup_after_stop(self, mcp_entry):
    global_config.update(viewer_backend="none", n_workers=1)
    bp = autoconnect(StressTestModule.blueprint(), McpServer.blueprint())
    coord = bp.build()
    try:
        assert health_check(coord)
        runs = list_runs(alive_only=True)
        assert len(runs) == 1
        coord.stop()
        mcp_entry.remove()
        runs = list_runs(alive_only=True)
        assert len(runs) == 0
    finally:
        # guard in case test body fails before explicit stop
        try:
            coord.stop()
        except Exception:
            pass

- escape f-string curly braces in curl example (agent_context.py)
- fix double stop() in test_registry_cleanup_after_stop
- add JSON-RPC error handling to all MCP CLI commands
- add type annotation for LCM transport
- add agent-send to generated context.md CLI commands
@spomichter

@greptile review — fixed all 3 round-2 comments: escaped f-string JSON braces, eliminated double stop() in registry cleanup test, added error handling to all MCP CLI commands (not just call).

- dimos/module_io MCP method: skills grouped by module with schema
- dimos mcp module-io CLI: human-readable module/skill listing
- 2 new tests
deadline = time.monotonic() + timeout
while time.monotonic() < deadline:
try:
resp = requests.post(

Oh, so you are using requests. But you need to add it to pyproject.toml. We have it because other packages depend on it.

deadline = time.monotonic() + timeout
while time.monotonic() < deadline:
try:
requests.post(

We need an McpAdapter which can handle using correct ids and should use an mcp_url, not just port.

entry.remove()


def _mcp_call(

There's another _mcp_call in robot/cli/dimos.py which uses urllib.request.urlopen. This one is shorter and sweeter. (But I still think all of this MCP stuff should be a class.)


def _mcp_call(
method: str, params: dict[str, object] | None = None, port: int = MCP_PORT
) -> dict[str, object]:

object is almost never a good type. When it can be anything the type should be Any.

return _handle_tools_list(req_id, skills)
if method == "tools/call":
return await _handle_tools_call(req_id, params, rpc_calls)
if method == "dimos/status":

While this is technically possible because MCP just uses jsonrpc, I'm not sure it makes sense for us to implement our own methods. I can't find anything that explicitly bans this or allows it. But how can it be used by others? They don't know what the methods are.

I think all of these should be tools. Clients can learn about them by calling tools/list.

@spomichter (Author) replied:

fixed

"pid": os.getpid(),
"modules": list(dict.fromkeys(s.class_name for s in skills)),
"skills": [s.func_name for s in skills],
"skill_count": len(skills),

Suggested change
"skill_count": len(skills),

Clients can call len() themselves.

return _jsonrpc_result(
req_id,
{
"pid": os.getpid(),

This is the pid of whatever worker McpServer is deployed to. I assume you want the main process PID, right?

@spomichter (Author) replied:

yeah now pulling from latest RunEntry.pid

Comment on lines +147 to +154
try:
from dimos.core.transport import pLCMTransport

transport: pLCMTransport[str] = pLCMTransport("/human_input")
transport.publish(message)
return _jsonrpc_result_text(req_id, f"Message sent to agent: {message[:100]}")
except Exception as e:
return _jsonrpc_error(req_id, -32000, f"Failed to send: {e}")

You need start/stop. (Technically, start is also called on the first publish but that's a race condition.)

Suggested change
try:
from dimos.core.transport import pLCMTransport
transport: pLCMTransport[str] = pLCMTransport("/human_input")
transport.publish(message)
return _jsonrpc_result_text(req_id, f"Message sent to agent: {message[:100]}")
except Exception as e:
return _jsonrpc_error(req_id, -32000, f"Failed to send: {e}")
try:
from dimos.core.transport import pLCMTransport
transport: pLCMTransport[str] = pLCMTransport("/human_input")
transport.start()
transport.publish(message)
return _jsonrpc_result_text(req_id, f"Message sent to agent: {message[:100]}")
except Exception as e:
return _jsonrpc_error(req_id, -32000, f"Failed to send: {e}")
finally:
transport.stop()

print(f"{'=' * 60}")


def wait_for_mcp(timeout: float = 20.0) -> bool:

Too much MCP duplication.

You have:

  • wait_for_mcp in dimos/core/tests/e2e_mcp_killtest.py
  • _wait_for_mcp in dimos/core/test_e2e_mcp_stress.py
  • wait_for_mcp in dimos/core/tests/e2e_devex_test.py

So many other MCP functions are duplicated.

I already have an mcp client in agents/mcp/mcp_client.py , but in that case "client" refers to the whole Module.

Like I said in other comments, we need an McpAdapter to abstract this. (I'm using the term "McpAdapter" so we don't end up confusing it with the McpClient Module.)

@@ -0,0 +1,306 @@
#!/usr/bin/env python3

This is not a test. It's not executed in CI.

I've been calling these manual verifications "demo" to avoid confusion.

import sys
import time

REPO_DIR = os.path.dirname(

Use from dimos.constants import DIMOS_PROJECT_ROOT

# See the License for the specific language governing permissions and
# limitations under the License.

"""Standalone MCP kill/restart stress test — no pytest.
Contributor

I don't like these non-test tests. There's no value in test code that's not being executed. If it's not executed in CI, who is going to run this?

While my mujoco tests also don't run in CI, I do run them myself, and I plan to have them work in CI eventually.


from dimos.core.module_coordinator import ModuleCoordinator

MCP_PORT = 9990
Contributor

We have 9990 all over the codebase. We should put it in a single place.
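
One way to centralize it: GlobalConfig.mcp_port is where this PR ended up sourcing the port per its description, though the dataclass layout below is just a sketch:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalConfig:
    """Single source of truth for process-wide settings.

    Call sites read mcp_port from here instead of hard-coding 9990.
    """

    mcp_port: int = 9990


# Example call site: build the server URL from the config, not a literal.
MCP_URL = f"http://127.0.0.1:{GlobalConfig().mcp_port}/"
```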

# See the License for the specific language governing permissions and
# limitations under the License.

"""MCP server e2e stress tests.
Contributor

I fail to see how these are stress tests. They're supposed to test the limits of the system.

Also, they're not true end-to-end tests because they don't launch the dimos command. e2e_devex_test.py and e2e_mcp_killtest.py are end-to-end, but they're not tests.

Normally, it's a good idea to run stress tests in an end-to-end manner because it disconnects the test harness from the system under test so that a slow system doesn't affect the measurements of the harness.

return _jsonrpc_result(req_id, {"modules": modules})


def _handle_dimos_agent_send(req_id: Any, params: dict[str, Any]) -> dict[str, Any]:
Contributor

Just to confirm, this is for us to use without using humancli, right?

Because otherwise it doesn't make sense for openclaw to send a message to our agent, when it is supposed to be the agent.

Contributor Author

Correct, no humancli needed. It's an easy way for the agent to access our agent interface.

Contributor

No, that's what I mean. Why would we want openclaw to access our agent. Shouldn't openclaw be the agent?

Contributor Author

I consider any dimOS agent a subagent. For example, since our dimOS agents need to be small and local, their sole purpose would be robot operation. Openclaw replaces the human as the coder and tester. Of course, since we have MCP, openclaw can directly call any skills that the dimOS agent can as well.

Contributor Author

Openclaw by default doesn't have MCP, so this CLI agent-send thing is actually needed for it to easily communicate with the dimOS agent without making the user set up mcporter or some other workaround.

Contributor Author

Yeah, thinking about it more, I assume agent-send won't really be used; the idea was scaffolding for subagent stuff. Agents can call dimos mcp [tool call] directly even without an MCP server integrated. The CLI tool is a wrapper over MCP.

Address Paul review comments on PR #1451:

- New McpAdapter class replacing 3 duplicated _mcp_call implementations
- Convert dimos/status, list_modules, agent_send to @Skill on McpServer
- CLI thin wrappers over McpAdapter, added --json-args flag
- worker.py: os.kill(pid, 0) for pid check
- Renamed test files (demo_ prefix for non-CI, integration for pytest)
- transport.start()/stop(), removed skill_count, requests in pyproject

29/29 tests pass locally (41s).
@spomichter
Contributor Author

@paul-nechifor made changes. I do want those CLI tests eventually made to run in CI, so I followed your demo_ naming convention for now.

@spomichter
Contributor Author

@greptile review

Comment on lines +162 to +167
        return None
    try:
        # Signal 0 just checks if the process is alive.
        pid: int = self._process.pid  # type: ignore[assignment]
        os.kill(pid, 0)
        return pid
Contributor

TypeError not caught when process.pid is None

self._process.pid is None when a multiprocessing.Process has been created but .start() has not been called yet. Passing None to os.kill(None, 0) raises TypeError: an integer is required, which is not a subclass of OSError and therefore escapes the except OSError clause — resulting in an unhandled exception wherever Worker.pid is called.

The old code was safe because is_alive() returns False for an unstarted process, cleanly returning None.

Suggested change

Before:

        return None
    try:
        # Signal 0 just checks if the process is alive.
        pid: int = self._process.pid  # type: ignore[assignment]
        os.kill(pid, 0)
        return pid

After:

    if self._process is None:
        return None
    try:
        # Signal 0 just checks if the process is alive.
        pid = self._process.pid
        if pid is None:
            return None
        os.kill(pid, 0)
        return pid
    except OSError:
        return None

Contributor Author

Near-impossible edge case, but fine.
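
The failure mode greptile describes is easy to reproduce standalone — a hypothetical snippet showing that a None pid escapes an `except OSError` clause, since TypeError is not an OSError:

```python
import os

# An unstarted multiprocessing.Process has pid=None; os.kill requires an int,
# so passing None raises TypeError, which `except OSError` does not catch.
try:
    os.kill(None, 0)  # type: ignore[arg-type]
    caught = "no exception"
except OSError:
    caught = "OSError"
except TypeError:
    caught = "TypeError"

print(caught)
```

Hence the explicit `if pid is None: return None` guard before calling os.kill.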

- McpAdapter.call(): catch requests.HTTPError and re-raise as McpError
  so CLI callers get clean error messages instead of raw tracebacks
- Worker.pid: check for None before os.kill() — unstarted processes
  have pid=None which would raise TypeError (not caught by OSError)
McpServer runs in a forkserver worker, so os.getpid() returns the
worker PID. Read from RunEntry instead to get the main daemon PID
that dimos stop/status actually needs.
Replace manual string splitting with _KeyValueType click.ParamType
per Paul's review suggestion. Validation and JSON coercion now handled
by click's type system instead of inline loop.
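
The JSON-coercion half of that ParamType can be sketched independently of click. parse_key_value below is a hypothetical helper name; in the real CLI this logic would sit inside the ParamType's convert() method, with click's self.fail() replacing the ValueError:

```python
import json


def parse_key_value(value: str) -> tuple[str, object]:
    """Split a KEY=VALUE argument, coercing VALUE through JSON when possible.

    Numbers, booleans, null, lists, and quoted strings parse as JSON;
    anything that isn't valid JSON falls back to a plain string.
    """
    if "=" not in value:
        raise ValueError(f"{value!r} is not in KEY=VALUE format")
    key, _, raw = value.partition("=")
    try:
        return key, json.loads(raw)
    except json.JSONDecodeError:
        return key, raw
```

So `dimos mcp call tool --arg port=9990 --arg name=robot` would yield an int and a string respectively, without the caller quoting JSON by hand.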
- Update test_mcp_integration.py and demo_mcp_killtest.py to use
  'viewer' instead of 'viewer_backend' (renamed in #1477)
- Fix test_cli_call_tool_wrong_arg_format: exit code 2 (click
  ParamType validation) instead of 1, check KEY=VALUE in output
- Merge dev to pick up #1477 rename and #1494 replay loop
@spomichter spomichter merged commit efda7cb into dev Mar 9, 2026
12 checks passed
@spomichter spomichter deleted the feat/dim-686-mcp-agent-cli branch March 9, 2026 16:04