Greptile Summary
This PR introduces a look-out loop perception skill ( Key changes and issues found:
Confidence Score: 2/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Agent
participant PerceiveLoopSkill
participant ImageStream
participant VlModel
Agent->>PerceiveLoopSkill: look_out_for(["person"])
activate PerceiveLoopSkill
PerceiveLoopSkill->>VlModel: start()
PerceiveLoopSkill->>ImageStream: subscribe(sharpness_window)
PerceiveLoopSkill-->>Agent: "Started looking for ["person"]..."
deactivate PerceiveLoopSkill
loop Every ~2s (sharpness window)
ImageStream->>PerceiveLoopSkill: _on_image(frame)
activate PerceiveLoopSkill
Note over PerceiveLoopSkill: Acquires _lock
PerceiveLoopSkill->>VlModel: query_detections(image, active_lookout_str)
VlModel-->>PerceiveLoopSkill: detections (empty → return)
Note over PerceiveLoopSkill: Releases _lock
deactivate PerceiveLoopSkill
end
ImageStream->>PerceiveLoopSkill: _on_image(frame) [match found]
activate PerceiveLoopSkill
Note over PerceiveLoopSkill: Acquires _lock (held for entire block)
PerceiveLoopSkill->>VlModel: query_detections(image, active_lookout_str)
VlModel-->>PerceiveLoopSkill: detections [non-empty]
PerceiveLoopSkill->>PerceiveLoopSkill: dispose subscription, clear _active_lookout
PerceiveLoopSkill->>VlModel: query(image, describe_prompt)
VlModel-->>PerceiveLoopSkill: description
PerceiveLoopSkill->>VlModel: stop()
Note over PerceiveLoopSkill: Releases _lock
PerceiveLoopSkill->>Agent: add_message("Found a match for ('person',)...")
deactivate PerceiveLoopSkill
Agent->>PerceiveLoopSkill: stop_looking_out()
activate PerceiveLoopSkill
Note over PerceiveLoopSkill: Acquires _lock → reads active_lookout_str → releases _lock
PerceiveLoopSkill->>PerceiveLoopSkill: _stop_lookout() [re-acquires _lock]
PerceiveLoopSkill-->>Agent: "Stopped looking out for..."
deactivate PerceiveLoopSkill
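The locking pattern in the diagram above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from this PR: `FakeVlModel`, `Subscription`, and `LookoutSketch` are hypothetical stand-ins, frames are plain strings, and a list named `messages` takes the place of the agent's `add_message` call. As in the diagram, the lock is held across both model calls and the agent is notified only after the lock is released.

```python
import threading


class FakeVlModel:
    """Hypothetical stand-in for the vision-language model (not the DimOS API)."""

    def query_detections(self, image, query):
        # Pretend a target is "detected" when its name appears in the frame label.
        return [word for word in query.split(", ") if word in image]

    def query(self, image, prompt):
        return f"description of {image}"


class Subscription:
    """Hypothetical disposable handle standing in for an image-stream subscription."""

    def __init__(self):
        self.disposed = False

    def dispose(self):
        self.disposed = True


class LookoutSketch:
    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()  # non-reentrant, so never nest acquisitions
        self._active_lookout = None
        self._subscription = None
        self.messages = []  # stands in for the agent's add_message(...)

    def look_out_for(self, targets):
        with self._lock:
            self._active_lookout = tuple(targets)
            self._subscription = Subscription()
        return f"Started looking for {list(targets)}..."

    def on_image(self, frame):
        with self._lock:
            if self._active_lookout is None:
                return  # a late frame arrived after the lookout ended
            targets = self._active_lookout
            if not self._model.query_detections(frame, ", ".join(targets)):
                return  # nothing found, keep watching
            # Match found: tear down the subscription before describing the frame.
            self._subscription.dispose()
            self._subscription = None
            self._active_lookout = None
            description = self._model.query(frame, "Describe what you see.")
        # Notify the agent only after the lock has been released.
        self.messages.append(f"Found a match for {targets}: {description}")

    def stop_looking_out(self):
        with self._lock:
            targets = self._active_lookout
        # The lock is released here, so _stop_lookout can safely re-acquire it.
        self._stop_lookout()
        return f"Stopped looking out for {targets}"

    def _stop_lookout(self):
        with self._lock:
            if self._subscription is not None:
                self._subscription.dispose()
                self._subscription = None
            self._active_lookout = None
```

Because `threading.Lock` is not reentrant, `stop_looking_out` has to release the lock before calling `_stop_lookout`, which is the "re-acquires _lock" step in the diagram. The gap between those two acquisitions is also a window in which `on_image` can run and change the state.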
Last reviewed commit: 32fbd6b
5002609 to 32fbd6b
32fbd6b to 649db7b
leshy left a comment
looks good, small suggestions at Paul's discretion
0e07984 to ea7a704
ea7a704 to dfb30a0
Release v0.0.11

82 PRs, 10 contributors, 396 files changed.

This release brings a production CLI, MCP tooling, temporal memory, and first-class support for coding agents. Dask has been removed. The entire stack now runs from `dimos run` through `dimos stop`.

### Agent-Native Development

DimOS is now built to be driven by coding agents. Point OpenClaw, Claude Code, or Cursor at [AGENTS.md](AGENTS.md) and they can build, run, and debug Dimensional applications using the CLI and MCP interfaces directly.

- **AGENTS.md**: comprehensive onboarding doc covering architecture, CLI reference, skill rules, and blueprint quick-reference. Your agent reads this and starts coding.
- **MCP server**: all `@skill` methods exposed as HTTP tools. External agents call `dimos mcp call relative_move --arg forward=0.5` or connect via JSON-RPC.
- **MCP CLI**: `dimos mcp list-tools`, `dimos mcp call`, `dimos mcp status`, `dimos mcp modules`
- **Agent context logging**: MCP tool calls and agent messages logged to per-run JSONL for debugging and replay.

### CLI & Daemon

Full process lifecycle, no more Ctrl-C in tmux.

- `dimos run --daemon`: background execution with health checks and run registry
- `dimos stop [--force]`: graceful shutdown with SIGTERM → SIGKILL fallback
- `dimos restart`: replays the original CLI arguments
- `dimos status`: PID, blueprint, uptime, MCP port
- `dimos log -f`: structured per-run logs with follow, JSON output, filtering
- `dimos show-config`: resolved GlobalConfig with source tracing

### Temporal-Spatial Memory

Robots in physical space ingest hours of video and lidar. Temporal-spatial memory gives them a human-like understanding of the world: causal object relationships, entity tracking through time and physical space, and the ability to answer complex temporal queries: *Who spends the most time in the kitchen? What time on average do I wake up? Which set of switches toggles the main lights? Who was at the office at 9am last Thursday?*

Traditional frame-level embeddings (CLIP, ViT) lose temporal context and don't scale beyond a handful of frames. Video transformers are expensive and don't operate in RGB-D. Dimensional agents work with video + lidar natively, tracking entities across hours and days.

```bash
dimos --replay --replay-dir unitree_go2_office_walk2 run unitree-go2-temporal-memory
```

### Interactive Viewer

Custom Rerun fork (`dimos-viewer`) is now the default. Click-to-navigate: click a point in the 3D view → PointStamped → A* planner → robot moves.

- Camera | 3D split layout on Go2, G1, and drone blueprints
- Native keyboard teleop in the viewer
- `--viewer rerun|rerun-web|rerun-connect|foxglove|none`

### Drone Support

Drone blueprints modernized to match Go2 composition pattern. `drone-basic` and `drone-agentic` work with replay, Rerun, and the full CLI.

```bash
dimos --replay run drone-basic
dimos --replay run drone-agentic
```

### More

- **Go2 fleet control**: multi-robot with `--robot-ips` (#1487)
- **Replay `--replay-dir`**: select dataset, loops by default (#1519, #1494)
- **Interactive install**: `curl -fsSL .../install.sh | bash` (#1395)
- **Nix on non-Debian Linux** (#1472)
- **Remove Dask**: native worker pool (#1365)
- **Remove asyncio dependency** (#1367)
- **Perceive loop**: continuous observation module for agents (#1411)
- **Worker resource monitor**: `dtop` TUI (#1378)
- **G1 agent wiring fix** (#1518)
- **Rerun rate limiting**: prevents viewer OOM on continuous streams (#1509, #1521)
- **RotatingFileHandler**: prevents unbounded log growth (#1492)
- **Test coverage** (#1397), draft PR CI skip (#1398), manipulation test fixes (#1522)

### Breaking Changes

- `--viewer-backend` renamed to `--viewer`
- Dask removed: blueprints using Dask workers need migration to native worker pool
- Default viewer changed from `rerun-web` to `rerun` (native dimos-viewer)

### Contributors

@spomichter, @PaulNechifor, @ruthwikdasyam, @summeryang, @MustafaBhadsorawala, @leshy, @sambull, @JeffHykin, @RadientBrain

## Contributor License Agreement

- [x] I have read and approved the [CLA](https://github.com/dimensionalOS/dimos/blob/main/CLA.md).
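For readers unfamiliar with the "SIGTERM → SIGKILL fallback" listed under CLI & Daemon in the release notes above, the general pattern might look like the sketch below. It is not the `dimos stop` implementation; `stop_process` and `grace_seconds` are hypothetical names, and the sketch uses only the Python standard library.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask a child process to exit, then force-kill it if it does not comply."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX; cannot be caught or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"


if __name__ == "__main__":
    # Demo: a child that would otherwise sleep for 30 seconds.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(stop_process(child, grace_seconds=5.0))
```

The grace period gives a well-behaved process time to flush logs and release resources, and the forced kill bounds how long shutdown can take.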
Problem
We need a skill that can scan for something continuously.
Closes DIM-634
Solution
spatial-memory to save on VRAM).
Breaking Changes
None
How to Test
MuJoCo test ("lookout for a person (any person) and when you see him, follow him")
Contributor License Agreement