refactor: rename --viewer-backend to --viewer#1477
Conversation
Greptile SummaryThis PR is a clean rename refactor: the Key changes:
Status: Safe to merge after fixing the stale env-var reference in the documentation. Confidence Score: 3/5
|
b7f2298 to
dd47c38
Compare
Rename GlobalConfig field viewer_backend -> viewer. The CLI flag auto-generates from field name, so --viewer-backend becomes --viewer. All Python references, comments, and docs updated. No behavioral change.
dd47c38 to
0fb66b1
Compare
…6, DIM-687) (#1451) * feat(cli): add daemon mode for dimos run (DIM-681) * fix: address greptile review — fd leak, wrong PID, fabricated log path - close devnull/stderr_file after dup2 (fd leak) - remove PID from pre-fork output (printed parent PID, not daemon PID) - show log_dir not log_dir/dimos.jsonl (file doesn't exist yet) - keep tests in tests/ (dimos/conftest.py breaks isolated tests) * feat(cli): add dimos stop and dimos status commands (DIM-682, DIM-684) dimos status — shows running instances with run-id, pid, blueprint, uptime, log dir dimos stop — sends SIGTERM (or SIGKILL with --force), waits 5s, escalates if needed --pid to target specific instance, --all to stop everything cleans registry on stop also adds get_most_recent() to run_registry. 8 new tests covering sigterm, sigkill, escalation, dead process cleanup, most-recent lookup. * test: add e2e daemon lifecycle tests with PingPong blueprint lightweight 2-module blueprint (PingModule + PongModule) that needs no hardware, no LFS data, and no replay files. tests real forkserver workers and module deployment. covers: - single worker lifecycle (build -> health check -> registry -> stop) - multiple workers (2 workers, both alive) - health check detects dead worker (SIGKILL -> detect failure) - registry entry JSON field roundtrip - stale entry cleanup (dead PIDs removed, live entries kept) * fix: rename stderr.log to daemon.log (addresses greptile review) both stdout and stderr redirect to the same file after daemonize(), so stderr.log was misleading. daemon.log better describes the contents. * fix: resolve mypy type errors in stop command (DIM-681) * feat: per-run log directory with unified main.jsonl (DIM-685) folds DIM-685 into daemon PR to avoid merge conflicts on dimos.py. - set_run_log_dir() before blueprint.build() routes structlog to <log_base>/<run-id>/main.jsonl - workers inherit DIMOS_RUN_LOG_DIR env var via forkserver - FileHandler replaces RotatingFileHandler (multi-process rotation unsafe) - fallback: env var check -> legacy per-pid files - 6 unit tests for routing logic * fix: migrate existing FileHandlers when set_run_log_dir is called setup_logger() runs at import time throughout the codebase, creating FileHandlers pointing to the legacy log path. set_run_log_dir() was resetting the global path but not updating these existing handlers, so main.jsonl was created but stayed empty (0 bytes). fix: iterate all stdlib loggers and redirect their FileHandlers to the new per-run path. verified: main.jsonl now receives structured JSON logs (1050 bytes, 5 lines in test run). * chore: move daemon tests to dimos/core/ for CI discovery testpaths in pyproject.toml is ['dimos'], so tests/ at repo root wouldn't be picked up by CI. moved all 3 test files to dimos/core/ alongside existing core tests. all 41 tests pass with conftest active. * chore: mark e2e daemon tests as slow matches convention from test_worker.py — forkserver-based tests are marked slow so they run in CI but are skipped in local default pytest. local default: 36 tests (unit only) CI (-m 'not (tool or mujoco)'): 41 tests (unit + e2e) * test: add CLI integration tests for dimos stop and dimos status (DIM-682, DIM-684) 16 tests using typer CliRunner with real subprocess PIDs: status (7 tests): - no instances, running instance details, uptime formatting - MCP port, multiple instances, dead PID filtering stop (9 tests): - default most recent, --pid, --pid not found - --all, --all empty, --force SIGKILL - already-dead cleanup, SIGTERM verification * test: add e2e CLI tests against real running blueprint (DIM-682, DIM-684) 3 new e2e tests that exercise dimos status and stop against a live PingPong blueprint with real forkserver workers: - status shows live blueprint details (run_id, PID, blueprint name) - registry entry findable after real build, workers alive - coord.stop() kills workers, registry cleaned * fix: address paul's review comments - move system imports to top of run(), logger.info before heavy imports - remove hardcoded MCP port line from daemon output - add n_workers/n_modules properties to ModuleCoordinator (public API) - single-instance model: remove --all/--pid from stop, simplify status - use _get_user_data_dir() for XDG-compliant registry/log paths - remove mcp_port from RunEntry (should be in GlobalConfig) - inline _shutdown_handler as closure in install_signal_handlers - add SIGKILL wait poll (2s) to avoid zombie race with port conflict check - replace handler._open() private API with new FileHandler construction - fix e2e _clean_registry to monkeypatch REGISTRY_DIR - reset logging_config module globals in test fixture - move test imports to module level * fix: drop daemon.log, redirect all stdio to /dev/null structlog FileHandler writes to main.jsonl — daemon.log only ever captured signal shutdown messages. no useful content. * fix: restore LOG_BASE_DIR import, remove duplicate set_run_log_dir import fixes mypy name-defined error from import reorganization * fix: address remaining paul review comments - simplify health_check to single alive check (build is synchronous) - remove --health-timeout flag (no longer polling) - add workers property to ModuleCoordinator (public API) - separate try-excepts for kill/wait in sleeper fixture cleanup - move repeated imports to top in test_per_run_logs * fix: address all remaining paul review comments - convert all test helpers to fixtures with cleanup - replace os.environ['CI']='1' at import time with monkeypatch fixture - replace all time.sleep() with polling loops in tests - mark slow stop tests @pytest.mark.slow - move stop_entry logic from CLI to run_registry - remove # type: ignore, properly type _stop_entry - remove duplicate port conflict test - remove redundant monkeypatch cleanup in test_per_run_logs - list(glob()) to avoid mutation during iteration in cleanup_stale - XDG_STATE_HOME compliant _get_state_dir() for registry/log paths - remove redundant cli_config_overrides in e2e builds - move duplicate imports to module level in e2e tests * fix: remove module docstring from test_daemon.py * feat: MCP server enhancements, dimos mcp CLI, agent context, stress tests (DIM-686, DIM-687) MCP server: - new dimos/status and dimos/list_modules JSON-RPC methods CLI: - dimos mcp list-tools/call/status/modules commands - uses stdlib urllib (no requests dependency) agent context (DIM-687): - generate_context() writes context.md on dimos run stress tests (23 tests): - MCP lifecycle, tools, error handling, rapid calls, restart cycles - all tests use fixtures with cleanup, poll loops (no sleep) Closes DIM-686, Closes DIM-687 * fix: address greptile review on PR #1451 - remove dimos restart from agent context (not in this branch yet) - handle JSON-RPC errors in dimos mcp call (show error, exit 1) - pass skills as parameter to dimos/status and dimos/list_modules handlers - fix hardcoded port in curl example (use mcp_port parameter) - fix double stop() in test_mcp_dead_after_stop (standalone coordinator) - use tmp_path for log_dir in mcp_entry fixture (test isolation) * feat: dimos agent-send CLI + MCP method - dimos/agent_send MCP method publishes on /human_input LCM channel - dimos agent-send 'message' CLI wraps the MCP call - 4 new tests: MCP send, empty message, CLI send, no-server * fix: address greptile review round 2 - escape f-string curly braces in curl example (agent_context.py) - fix double stop() in test_registry_cleanup_after_stop - add JSON-RPC error handling to all MCP CLI commands - add type annotation for LCM transport - add agent-send to generated context.md CLI commands * feat: module IO introspection via MCP + CLI - dimos/module_io MCP method: skills grouped by module with schema - dimos mcp module-io CLI: human-readable module/skill listing - 2 new tests * fix: daemon context generation + standalone e2e stress tests Bug fixes found during e2e testing: - worker.pid: catch AssertionError after daemonize() when is_alive() asserts _parent_pid == os.getpid() (double-fork changes PID, but stored process.pid is still valid) - agent_context: fix f-string ValueError from triple braces in curl example — split into f-string + plain string concat New standalone test scripts (no pytest): - e2e_mcp_killtest.py: SIGKILL/restart stress test (3 cycles, verifies MCP recovers after crash, tests all endpoints) - e2e_devex_test.py: full developer experience test simulating OpenClaw agent workflow (daemon start → CLI ops → logs → stop) Register stress-test blueprint in all_blueprints.py for `dimos run stress-test --daemon` support. * refactor: strip module-io, fix greptile review issues 7-13 - Remove dimos/module_io handler + 'dimos mcp module-io' CLI command (showed skill descriptions, not actual IO — misleading) - Remove unused rpc_calls param from _handle_dimos_agent_send - Add JSON-RPC error checking to mcp_list_tools, mcp_status, mcp_modules - Fix empty tool content: exit 0 with '(no output)' instead of exit 1 - Add Content-Type header to curl example in agent context - Fix double-stop in test_registry_cleanup_after_stop - Remove module-io references from standalone test scripts * cleanup: remove agent_context.py, fix final greptile nits - Delete agent_context.py entirely — runtime info is available via `dimos status` and `dimos mcp status`, no need for a separate file - Remove generate_context() calls from CLI (daemon + foreground paths) - Fix non-deterministic module list in dimos/status (set → dict.fromkeys) - Remove unused heartbeat: Out[str] from StressTestModule - Remove dead list literal from e2e_mcp_killtest.py * fix: address latest greptile review round - Remove stale module-io step from e2e_devex_test (command was removed) - Fix step numbering in devex test (7-10 sequential, no gaps) - Fix double remove() on registry entry — let fixture handle cleanup - Remove misleading 'non-default to avoid conflicts' port comment * fix: resolve mypy errors in worker.py and stress_test_module - worker.py: use typed local variable for process.pid instead of direct return (fixes no-any-return) - stress_test_module.py: add return type annotation to start() * perf: class-scoped MCP fixtures, 125s → 51s test runtime - Make mcp_shared fixture class-scoped: blueprint starts once per class instead of once per test (~4s setup saved per test) - Move no-server tests (dead_after_stop, cli_no_server_error, agent_send_cli_no_server) to dedicated TestMCPNoServer class to avoid port conflict with shared fixture - Add full type annotations to all fixtures and helpers - Add docstring explaining performance rationale - Remove unused mcp_blueprint fixture (replaced by mcp_shared) * fix: resolve remaining CI failures (mypy + all_blueprints) - Fix mypy errors in standalone test scripts (e2e_devex_test.py, e2e_mcp_killtest.py): typed CompletedProcess, multiprocessing.Event, dict params, assert pid not None before os.kill - Regenerate all_blueprints.py (stress-test entry now alphabetically sorted) * refactor: McpAdapter class + convert custom methods to @Skill tools Address Paul review comments on PR #1451: - New McpAdapter class replacing 3 duplicated _mcp_call implementations - Convert dimos/status, list_modules, agent_send to @Skill on McpServer - CLI thin wrappers over McpAdapter, added --json-args flag - worker.py: os.kill(pid, 0) for pid check - Renamed test files (demo_ prefix for non-CI, integration for pytest) - transport.start()/stop(), removed skill_count, requests in pyproject 29/29 tests pass locally (41s). * fix: alphabetical order in all_blueprints.py for demo-mcp-stress-test * fix: catch HTTPError in McpAdapter, guard None pid in Worker - McpAdapter.call(): catch requests.HTTPError and re-raise as McpError so CLI callers get clean error messages instead of raw tracebacks - Worker.pid: check for None before os.kill() — unstarted processes have pid=None which would raise TypeError (not caught by OSError) * fix: server_status returns main process PID, not worker PID McpServer runs in a forkserver worker, so os.getpid() returns the worker PID. Read from RunEntry instead to get the main daemon PID that dimos stop/status actually needs. * refactor: use click.ParamType for --arg parsing in mcp call Replace manual string splitting with _KeyValueType click.ParamType per Paul's review suggestion. Validation and JSON coercion now handled by click's type system instead of inline loop. * fix: viewer_backend → viewer rename + KeyValueType test fix - Update test_mcp_integration.py and demo_mcp_killtest.py to use 'viewer' instead of 'viewer_backend' (renamed in #1477) - Fix test_cli_call_tool_wrong_arg_format: exit code 2 (click ParamType validation) instead of 1, check KEY=VALUE in output - Merge dev to pick up #1477 rename and #1494 replay loop * fix: mypy arg-type error for KeyValueType dict(args)
Renames the GlobalConfig field
viewer_backend→viewer, which auto-generates the CLI flag as--viewerinstead of--viewer-backend.Shorter, cleaner. All Python references, comments, and docs updated. No behavioral change.
15 files, ±30 lines