feat(dask): remove dask #1365

Merged: leshy merged 1 commit into dev from paul/feat/remove-dask on Feb 28, 2026

@paul-nechifor (Contributor) commented Feb 25, 2026

Problem

  • Dask is slow to start.
  • Causes issues with shutdown (you get ugly errors when hitting Ctrl+C).
  • Required a lot of ugly patches in core/__init__.py.
  • Causes confusion because things are imported once per worker (e.g., see Jeff's rerun issue or my actor confusion).

Closes DIM-609

Solution

  • Removed the dask package entirely.
  • Moved imports out of dimos.core. You now have to import each name from the module where it is defined.
  • Removed everything in dimos/core/__init__.py.
  • Made dimos shut down cleanly, without any exceptions, and right away.
  • Inlined some of the imports which were causing slowdowns.
  • Added a short doc about py-spy so others can investigate speed if they want to.
  • Fixed mujoco errors on shutdown.

Breaking Changes

  • UtilizationModule is gone.
  • Blueprint configurations can no longer use lambdas. multiprocessing requires parameters to be picklable, and pickle serializes a function as a reference to its module and qualified name; a lambda has no importable name, so there is nothing to look up on the other side. This worked with Dask because Dask (through cloudpickle) serialized the lambda's bytecode and reconstructed it on the other side.
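
A minimal sketch of the lambda restriction, using only the standard library. `person_only`, `is_picklable`, and the `"label"` key are hypothetical stand-ins for a blueprint parameter, not names taken from dimos.

```python
import functools
import operator
import pickle


def is_picklable(obj: object) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# pickle stores a function as a reference: its module plus qualified name.
# A lambda's qualified name is "<lambda>", which cannot be looked up again
# on the receiving side, so pickling it fails.
lambda_filter = lambda det: det["label"] == "person"


# A module-level function has a real qualified name that a worker process
# can import, which is why the lambdas were replaced with named functions.
def person_only(det: dict) -> bool:
    return det["label"] == "person"


# Another picklable option: stdlib callables combined with functools.partial.
is_person = functools.partial(operator.eq, "person")

print(lambda_filter.__qualname__)   # <lambda>
print(person_only.__qualname__)     # person_only
print(is_picklable(lambda_filter))  # False
print(is_picklable(is_person))      # True
```

Third-party serializers such as cloudpickle can serialize lambdas by value, but the standard `multiprocessing` module uses `pickle`, so named callables are the simplest fix.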

How to Test

Test any blueprint. Example:

uv run dimos run unitree-go2

Contributor License Agreement

  • I have read and approved the CLA.

@paul-nechifor marked this pull request as draft February 25, 2026 06:33

greptile-apps bot commented Feb 25, 2026

Too many files changed for review. (135 files found, 100 file limit)

    opacity=0.2,
    background="#484981",
),
"world/camera_info": _convert_camera_info,

@paul-nechifor (author) commented:

This change is needed because multiprocessing can't pickle lambdas. Lambdas don't have names, so there's nothing to reference them by on the other side.

Dask cheats by serializing the bytecode of the lambda.

@paul-nechifor marked this pull request as ready for review February 27, 2026 00:34
@@ -1,278 +0,0 @@
from __future__ import annotations

@paul-nechifor (author) commented:

deleted dask stuff here

    self._close_module()

def _close_module(self) -> None:
    with self._module_closed_lock:

@paul-nechifor (author) commented:

don't double close
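
A minimal sketch of the "close at most once" guard, assuming a boolean flag protected by the lock. Only `_close_module` and `_module_closed_lock` appear in the diff; `ClosableModule`, `_module_closed`, `_release_resources`, and `cleanup_calls` are hypothetical names for illustration.

```python
import threading


class ClosableModule:
    """Hypothetical module whose cleanup must run at most once."""

    def __init__(self) -> None:
        self._module_closed = False
        self._module_closed_lock = threading.Lock()
        self.cleanup_calls = 0  # only here so the sketch can be checked

    def stop(self) -> None:
        self._close_module()

    def _close_module(self) -> None:
        # Check and set the flag under the lock so that two concurrent
        # callers cannot both pass the check.
        with self._module_closed_lock:
            if self._module_closed:
                return
            self._module_closed = True
        self._release_resources()

    def _release_resources(self) -> None:
        self.cleanup_calls += 1


module = ClosableModule()
threads = [threading.Thread(target=module.stop) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(module.cleanup_calls)  # 1
```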

        raise TypeError(f"Input {input_name} is not a valid stream")
    input_stream.connection = remote_stream

def dask_receive_msg(self, input_name: str, msg: Any) -> None:

@paul-nechifor (author) commented:

no longer need these dask functions

for module_class, module in reversed(self._deployed_modules.items()):
    logger.info("Stopping module...", module=module_class.__name__)
    try:
        module.stop()

@paul-nechifor (author) commented:

continue stopping other modules even if one errors
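
The shutdown loop above can be sketched as follows. This is an illustration under assumptions, not the code in the PR: `FakeModule` and `stop_all` are hypothetical, and the standard `logging` module stands in for the structured logger used in dimos.

```python
import logging

logger = logging.getLogger("shutdown_sketch")


class FakeModule:
    """Hypothetical module; stop() can be made to fail for the demo."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.stopped = False

    def stop(self) -> None:
        if self.fail:
            raise RuntimeError("stop failed")
        self.stopped = True


def stop_all(modules: dict) -> list:
    """Stop modules in reverse deployment order; return the names that failed."""
    failed = []
    for name, module in reversed(modules.items()):
        try:
            module.stop()
        except Exception as exc:
            # Record the error and keep going so that one bad module
            # does not leave the remaining ones running.
            logger.warning("Error stopping module %s: %s", name, exc)
            failed.append(name)
    return failed


modules = {"a": FakeModule(), "b": FakeModule(fail=True), "c": FakeModule()}
failed = stop_all(modules)
print(failed)  # ['b']
```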



@pytest.mark.slow
def test_worker_pool_modules_share_workers(create_worker_manager):

@paul-nechifor (author) commented:

add test that verifies that two modules can deploy to the same worker

self._module_class: type[ModuleT] = module_class
self._args: tuple[Any, ...] = args
self._kwargs: dict[Any, Any] = kwargs or {}
"""Generic worker process that can host multiple modules."""

@paul-nechifor (author) commented:

Changed Worker to support deploying multiple modules to it.

    self._closed = False
    self._started = False

def start(self) -> None:

@paul-nechifor (author) commented:

Start the workers beforehand.

@@ -1,60 +0,0 @@
# Copyright 2025-2026 Dimensional Inc.

@paul-nechifor (author) commented:

I don't think this was used and we have a blueprint to test the cam

Returns:
    Detection2DBBox instance or None if invalid
"""
from dimos.perception.detection.type import Detection2DBBox

@paul-nechifor (author) commented:

Inline to avoid torch import.

else:
    logger.warning("Tracking module failed to start")

self.connection.start()

@paul-nechifor (author) commented:

start no longer returns a boolean

from dimos.robot.unitree.g1.blueprints.basic.unitree_g1_basic import unitree_g1_basic


def _person_only(det: Any) -> bool:

@paul-nechifor (author) commented:

cannot serialize lambdas, so using functions

@leshy (Contributor) left a comment:

large PR but actually can't find any objections to core/ stuff

@leshy merged commit ed35bd7 into dev Feb 28, 2026
12 checks passed
model = MockModel(json_path=self.config.model_fixture)

with self._lock:
    # Here to prevent unwanted imports in the file.

Contributor commented:

@paul-nechifor I assume these are LLM-generated comments; they're on every import shift. Not useful, and they look ugly.

@paul-nechifor (author) commented Feb 28, 2026:

They're not LLM generated. I specifically added them there to indicate that those imports should not be moved to the top because they cause imports (like that of torch) which slow down the start of dimos run.

Without those comments, people might move them to the top. I often move unnecessary local imports because they're added by LLMs. LLMs often prefer to add local imports because they have the habit of producing the smallest diff possible, even when it doesn't make sense.
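
A small sketch of the deferred-import pattern described above. The standard-library `colorsys` module stands in for a heavy dependency such as torch, and `to_hsv` is a hypothetical function; the point is only that the import cost is paid on first use rather than at startup.

```python
import sys


def to_hsv(r: float, g: float, b: float) -> tuple:
    # Here to prevent unwanted imports in the file: colorsys stands in for
    # a heavy dependency and is only loaded when this function first runs.
    import colorsys

    return colorsys.rgb_to_hsv(r, g, b)


# Usually False at this point, unless something else already imported it.
print("colorsys" in sys.modules)

hsv = to_hsv(1.0, 0.0, 0.0)

print("colorsys" in sys.modules)  # True
print(hsv)  # (0.0, 1.0, 1.0)
```

Moving such an import back to the top of the file would make every process that loads the module pay for the dependency, which is what the in-code comments are meant to prevent.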


with self._lock:
    if self._tracker is None:
        # Here to prevent unwanted imports in the file.

Contributor commented:

and here

super().__init__()
self._skill_started = False

# Here to prevent unwanted imports in the file.

Contributor commented:

and here

@spomichter mentioned this pull request Mar 11, 2026
1 task
spomichter added a commit that referenced this pull request Mar 12, 2026
Release v0.0.11

82 PRs, 10 contributors, 396 files changed.

This release brings a production CLI, MCP tooling, temporal memory, and first-class support for coding agents. Dask has been removed. The entire stack now runs from `dimos run` through `dimos stop`.

### Agent-Native Development

DimOS is now built to be driven by coding agents. Point OpenClaw, Claude Code, or Cursor at [AGENTS.md](AGENTS.md) and they can build, run, and debug Dimensional applications using the CLI and MCP interfaces directly.

- **AGENTS.md** — comprehensive onboarding doc: architecture, CLI reference, skill rules, blueprint quick-reference. Your agent reads this and starts coding.
- **MCP server** — all `@skill` methods exposed as HTTP tools. External agents call `dimos mcp call relative_move --arg forward=0.5` or connect via JSON-RPC.
- **MCP CLI** — `dimos mcp list-tools`, `dimos mcp call`, `dimos mcp status`, `dimos mcp modules`
- **Agent context logging** — MCP tool calls and agent messages logged to per-run JSONL for debugging and replay.

### CLI & Daemon

Full process lifecycle — no more Ctrl-C in tmux.

- `dimos run --daemon` — background execution with health checks and run registry
- `dimos stop [--force]` — graceful shutdown with SIGTERM → SIGKILL fallback
- `dimos restart` — replays the original CLI arguments
- `dimos status` — PID, blueprint, uptime, MCP port
- `dimos log -f` — structured per-run logs with follow, JSON output, filtering
- `dimos show-config` — resolved GlobalConfig with source tracing
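
The SIGTERM → SIGKILL fallback can be sketched with the standard library. This illustrates the general pattern, not how `dimos stop` is implemented: `stop_process`, its `timeout`, and the reading of `force` as "skip the graceful step" are all assumptions.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, timeout: float = 5.0, force: bool = False) -> int:
    """Stop a child process, escalating from SIGTERM to SIGKILL on POSIX."""
    if not force:
        proc.terminate()  # sends SIGTERM on POSIX
        try:
            return proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            pass  # did not exit in time; fall through to the hard kill
    proc.kill()  # sends SIGKILL on POSIX
    return proc.wait()


# Demo: start a child that would sleep for a minute, then stop it.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(child, timeout=5.0)
print(exit_code)
```

On POSIX a negative exit code is the number of the signal that ended the process; the value differs on Windows, so the sketch does not assume a specific one.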

### Temporal-Spatial Memory

Robots in physical space ingest hours of video and lidar. Temporal-spatial memory gives them a human-like understanding of the world — causal object relationships, entity tracking through time and physical space, and the ability to answer complex temporal queries:

*Who spends the most time in the kitchen? What time on average do I wake up? Which set of switches toggles the main lights? Who was at the office at 9am last Thursday?*

Traditional frame-level embeddings (CLIP, ViT) lose temporal context and don't scale beyond a handful of frames. Video transformers are expensive and don't operate in RGB-D. Dimensional agents work with video + lidar natively, tracking entities across hours and days.

```bash
dimos --replay --replay-dir unitree_go2_office_walk2 run unitree-go2-temporal-memory
```

### Interactive Viewer

Custom Rerun fork (`dimos-viewer`) is now the default. Click-to-navigate: click a point in the 3D view → PointStamped → A* planner → robot moves.

- Camera | 3D split layout on Go2, G1, and drone blueprints
- Native keyboard teleop in the viewer
- `--viewer rerun|rerun-web|rerun-connect|foxglove|none`

### Drone Support

Drone blueprints modernized to match Go2 composition pattern. `drone-basic` and `drone-agentic` work with replay, Rerun, and the full CLI.

```bash
dimos --replay run drone-basic
dimos --replay run drone-agentic
```

### More

- **Go2 fleet control** — multi-robot with `--robot-ips` (#1487)
- **Replay `--replay-dir`** — select dataset, loops by default (#1519, #1494)
- **Interactive install** — `curl -fsSL .../install.sh | bash` (#1395)
- **Nix on non-Debian Linux** (#1472)
- **Remove Dask** — native worker pool (#1365)
- **Remove asyncio dependency** (#1367)
- **Perceive loop** — continuous observation module for agents (#1411)
- **Worker resource monitor** — `dtop` TUI (#1378)
- **G1 agent wiring fix** (#1518)
- **Rerun rate limiting** — prevents viewer OOM on continuous streams (#1509, #1521)
- **RotatingFileHandler** — prevents unbounded log growth (#1492)
- **Test coverage** (#1397), draft PR CI skip (#1398), manipulation test fixes (#1522)

### Breaking Changes

- `--viewer-backend` renamed to `--viewer`
- Dask removed — blueprints using Dask workers need migration to native worker pool
- Default viewer changed from `rerun-web` to `rerun` (native dimos-viewer)

### Contributors

@spomichter, @PaulNechifor, @ruthwikdasyam, @summeryang, @MustafaBhadsorawala, @leshy, @sambull, @JeffHykin, @RadientBrain

## Contributor License Agreement

- [x] I have read and approved the [CLA](https://github.com/dimensionalOS/dimos/blob/main/CLA.md).
3 participants