feat(cli): dimos restart command (DIM-683) #1476
Greptile Summary
This PR implements Key points:
Confidence Score: 5/5
Last reviewed commit: 7406c2c
dimos/robot/cli/dimos.py
Outdated
    if isinstance(value, bool):
        if value:
            cmd.append(flag)
    else:
Boolean False config overrides are silently dropped on restart.
When a boolean config override is explicitly False (e.g., the original run was started with --no-verbose or another --no-<flag> style option), the current code only appends --<flag> for True values and appends nothing for False, causing the --no-<flag> form to be lost when reconstructing the command.
Looking at create_dynamic_callback() (lines 65-77), boolean GlobalConfig fields use the --flag/--no-flag Typer pattern. Any False override stored in config_overrides must emit --no-<flag> to preserve the original invocation.
Suggested change:
    if isinstance(value, bool):
        if value:
            cmd.append(flag)
        else:
            cmd.append(f"--no-{key.replace('_', '-')}")
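A minimal sketch of the flag reconstruction this comment describes, assuming overrides are stored as a plain mapping and that non-boolean values are passed as `--key value`. The helper name `overrides_to_flags` and the sample keys are hypothetical and do not come from the PR.

```python
from collections.abc import Mapping


def overrides_to_flags(config_overrides: Mapping[str, object]) -> list[str]:
    """Turn stored overrides back into CLI flags (hypothetical helper)."""
    flags: list[str] = []
    for key, value in config_overrides.items():
        name = key.replace("_", "-")
        if isinstance(value, bool):
            # With a --flag/--no-flag option, False must be emitted explicitly,
            # otherwise the restarted run silently falls back to the default.
            flags.append(f"--{name}" if value else f"--no-{name}")
        else:
            # Assumption: every other value is passed as "--key value".
            flags.extend([f"--{name}", str(value)])
    return flags


print(overrides_to_flags({"verbose": False, "grpc_port": 9877}))
```

The boolean check comes first because `bool` is a subclass of `int` in Python, so a generic numeric branch would otherwise swallow `True` and `False`.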
Stop the running instance and re-invoke it with the same blueprint args and config overrides. Uses os.execvp to replace the process so the new run inherits the terminal.
    dimos restart           # graceful stop + restart
    dimos restart --force   # SIGKILL + restart
    dimos restart --daemon  # restart in background
Store sys.argv in RunEntry.original_argv when a run starts. The restart command now replays the exact original command via os.execvp instead of fragile config_overrides → CLI flag reconstruction.
Removes --daemon flag from restart (already in saved argv if used).
Tested: daemon start → restart → new instance starts with same args.
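The record-and-replay idea in this commit can be sketched as follows. This is an illustration only: `RunRecord`, `save_record`, `load_record`, and the JSON file layout are hypothetical stand-ins, and nothing here is taken from the actual run registry code.

```python
import json
import sys
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class RunRecord:  # hypothetical stand-in for RunEntry
    pid: int
    original_argv: list[str] = field(default_factory=list)


def save_record(path: Path, record: RunRecord) -> None:
    """Persist the record as JSON (assumed storage format)."""
    path.write_text(json.dumps(asdict(record)))


def load_record(path: Path) -> RunRecord:
    """Read a record back from disk."""
    return RunRecord(**json.loads(path.read_text()))


with tempfile.TemporaryDirectory() as tmp:
    registry_file = Path(tmp) / "run.json"
    # At run start: capture the exact command line.
    save_record(registry_file, RunRecord(pid=1234, original_argv=list(sys.argv)))
    # At restart: read it back unchanged.
    replay = load_record(registry_file).original_argv
    # A real restart would now call os.execvp(replay[0], replay); that call is
    # left out because exec replaces the current process.
    print(replay == list(sys.argv))
```

Replaying the saved argv avoids having to know how each option maps back to a flag, which is the failure mode the earlier boolean-override comment pointed out.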
581e015 to fd0b8a9
@greptile
dimos/robot/cli/dimos.py
Outdated
    typer.echo(f" {msg}")

    typer.echo(f" Running: {' '.join(argv)}")
    os.execvp(argv[0], argv)
Unhandled OSError from os.execvp
os.execvp raises FileNotFoundError (a subclass of OSError) if argv[0] is not found on PATH, and PermissionError if it is not executable. Since the old instance has already been stopped at this point, an unhandled exception here leaves the system with nothing running and exposes a raw Python traceback to the user instead of a clean error message.
Suggested change:
-    os.execvp(argv[0], argv)
+    try:
+        os.execvp(argv[0], argv)
+    except OSError as exc:
+        typer.echo(f"Error: failed to restart — {exc}", err=True)
+        raise typer.Exit(1)
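The failure path can be exercised without Typer. The sketch below assumes a hypothetical wrapper, `exec_or_report`, that returns an exit status instead of raising `typer.Exit`; the binary name is deliberately one that should not exist on PATH.

```python
import os
import sys


def exec_or_report(argv: list[str]) -> int:
    """Replace the current process with argv, or return 1 if exec fails."""
    try:
        os.execvp(argv[0], argv)  # does not return on success
    except OSError as exc:  # covers FileNotFoundError and PermissionError
        print(f"Error: failed to restart: {exc}", file=sys.stderr)
        return 1


# The lookup fails, so the process is not replaced and the error path runs.
status = exec_or_report(["dimos-binary-that-should-not-exist-xyz"])
print(status)
```

Because the old instance is already stopped by the time exec runs, a clean error message here is the only feedback the user gets that nothing is running.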
dimos/robot/cli/dimos.py
Outdated
    msg, _ok = stop_entry(entry, force=force)
    typer.echo(f" {msg}")

    typer.echo(f" Running: {' '.join(argv)}")
    os.execvp(argv[0], argv)
No wait after SIGKILL before exec in force mode
When --force is used, stop_entry sends SIGKILL and returns immediately without waiting for the old process to exit. os.execvp then runs the new dimos run which will call check_port_conflicts(). Because entry.remove() has already been called inside stop_entry, the registry check passes, but the underlying gRPC port may still be transiently held by the dying process. In practice SIGKILL is near-instantaneous on Linux, but a small polling wait before exec would make the force path as robust as the SIGTERM path (which already waits up to 5 s):
    msg, _ok = stop_entry(entry, force=force)
    typer.echo(f" {msg}")
    # For SIGKILL the port is released almost immediately, but a brief
    # poll ensures the new run's port-conflict check never races the old process.
    from dimos.core.run_registry import is_pid_alive
    for _ in range(20):  # up to 2 s
        if not is_pid_alive(entry.pid):
            break
        time.sleep(0.1)
    typer.echo(f" Running: {' '.join(argv)}")
    os.execvp(argv[0], argv)

    @main.command()
    def restart(
        force: bool = typer.Option(False, "--force", "-f", help="Force kill before restarting"),
    ) -> None:
        """Restart the running DimOS instance with the same arguments."""
--daemon usage in PR description is unimplemented
The PR description explicitly advertises dimos restart --daemon as a supported invocation, but the restart command only declares a --force / -f option. Running dimos restart --daemon today produces a Typer "No such option" error.
If background restart is genuinely desired, a --daemon flag needs to be added here (and os.execvp replaced with a subprocess spawn for that code path). If the feature is out of scope for this PR the description should be corrected to avoid user confusion.
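For reference, the subprocess-spawn alternative mentioned in this comment might look like the sketch below. It is not part of the PR: `spawn_detached` is a hypothetical helper, and the approach assumes a POSIX system where `start_new_session=True` is available.

```python
import subprocess
import sys


def spawn_detached(argv: list[str]) -> int:
    """Start argv in a new session and return its PID (POSIX sketch)."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the caller's terminal session
    )
    return proc.pid


# Stand-in command; a real restart would pass the saved original argv.
pid = spawn_detached([sys.executable, "-c", "pass"])
print(pid > 0)
```

Unlike os.execvp, this returns control to the caller, which is what a background restart would need.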
- Wrap os.execvp in try/except OSError so a missing binary gives a clean error instead of a traceback (after the old instance is dead)
- Poll up to 2s for old process to exit before exec, preventing port conflict races when --force sends SIGKILL
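The polling step in this commit can be sketched with a deadline-based loop. The helpers below are assumptions for illustration: `pid_is_running` is a local stand-in for the registry's is_pid_alive, implemented here with signal 0, and `wait_for_exit` is a hypothetical name.

```python
import os
import time


def pid_is_running(pid: int) -> bool:
    """Best-effort liveness probe using signal 0 (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(pid: int, timeout: float = 2.0, interval: float = 0.1) -> bool:
    """Return True once pid is gone, False if it is still alive at the deadline."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_is_running(pid):
            return True
        time.sleep(interval)
    return not pid_is_running(pid)


print(wait_for_exit(os.getpid(), timeout=0.2))  # False: this process is still running
```

One caveat of the signal-0 probe: a child that has exited but has not yet been reaped by its parent still counts as existing, so the wait can run to the full timeout in that case.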
a0a7e5f to 9b0efa8
@greptile
    cli_args: list[str] = field(default_factory=list)
    config_overrides: dict[str, object] = field(default_factory=dict)
    grpc_port: int = 9877
    original_argv: list[str] = field(default_factory=list)
Very minor suggestion, but it might be good to make these kinds of things immutable in general in future. This also avoids needing the default_factory:
Suggested change:
-    original_argv: list[str] = field(default_factory=list)
+    original_argv: Sequence[str] = ()
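A small sketch of why the immutable default works without default_factory, using a hypothetical `Entry` class in place of RunEntry:

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class Entry:  # hypothetical stand-in for RunEntry
    original_argv: Sequence[str] = ()  # immutable default, no default_factory


a, b = Entry(), Entry()
print(a.original_argv == b.original_argv == ())


# By contrast, a bare mutable default is rejected when the class is defined.
try:
    @dataclass
    class Bad:
        original_argv: list = []
except ValueError as exc:
    print(f"rejected: {exc}")
```

Sharing one empty tuple between instances is safe because it cannot be mutated. If the entry is round-tripped through JSON, the value would come back as a list, which the `Sequence[str]` annotation still covers.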
Summary
Adds dimos restart command that stops the running DimOS instance and re-invokes with the same arguments.
Supersedes #1445 (which targeted the now-merged daemon branch and referenced removed _stop_entry function).
Usage
How it works
- original_argv (exact sys.argv from the original run)
- stop_entry() (SIGTERM or SIGKILL with --force)
- os.execvp() to replay the original command
No CLI flag reconstruction; just replays sys.argv verbatim. If the original run used --daemon, the restart inherits it automatically.
Changes
- dimos/core/run_registry.py: Add original_argv field to RunEntry (+1 line)
- dimos/robot/cli/dimos.py: Store sys.argv in RunEntry, add restart command (+30 lines)
Related
Contributor License Agreement