fix(rerun): only rate-limit heavy message types (Image, PointCloud2) #1521

Merged
spomichter merged 2 commits into dev from fix/rate-limit-heavy-only on Mar 11, 2026

Conversation

@spomichter

Problem

PR #1509 added blanket per-entity-path rate-limiting (10 Hz max) to the Rerun bridge to prevent viewer OOM from high-bandwidth camera streams. This inadvertently dropped low-frequency but critical messages like navigation Path and PointStamped (click-to-nav), breaking path visualization in the viewer.

Fix

Only rate-limit message types with large payloads that actually cause viewer OOM:

  • Image (~1 MB/frame at 30 fps)
  • PointCloud2 (~600-800 KB/frame from lidar)

Light messages (Path, PointStamped, Twist, TF, EntityMarkers, etc.) now pass through unthrottled.
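
For a sense of scale, the camera figures quoted above can be turned into a rough bandwidth estimate. This is only back-of-the-envelope arithmetic, assuming 1 MB means 10^6 bytes; the helper name `stream_mb_per_sec` is made up for illustration and is not part of the bridge.

```python
# Figures taken from the PR description; everything else is illustrative.
IMAGE_BYTES_PER_FRAME = 1_000_000  # ~1 MB per Image frame
CAMERA_FPS = 30                    # camera publishes at 30 fps
RATE_LIMIT_HZ = 10                 # cap introduced in PR #1509


def stream_mb_per_sec(bytes_per_frame: int, hz: float) -> float:
    """Approximate bandwidth of a stream in MB/s (1 MB = 10**6 bytes)."""
    return bytes_per_frame * hz / 1_000_000


# Without the limiter the viewer would ingest every camera frame.
unthrottled = stream_mb_per_sec(IMAGE_BYTES_PER_FRAME, CAMERA_FPS)

# With the limiter, at most RATE_LIMIT_HZ frames per second get through.
throttled = stream_mb_per_sec(
    IMAGE_BYTES_PER_FRAME, min(CAMERA_FPS, RATE_LIMIT_HZ)
)

print(f"unthrottled: {unthrottled} MB/s, throttled: {throttled} MB/s")
```

Under these assumptions a single camera stream drops from roughly 30 MB/s to roughly 10 MB/s, which is why the limiter is worth keeping for heavy types even though it is removed for light ones.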

Changes

  • Added _HEAVY_MSG_TYPES = (Image, PointCloud2) tuple
  • Rate limiter now checks isinstance(msg, _HEAVY_MSG_TYPES) before throttling
  • One-line logic change + imports

Testing

  • Verified imports resolve correctly
  • Image and PointCloud2 are the only types with payloads large enough to cause OOM
  • All other message types pass through instantly, fixing click-to-nav path rendering

The blanket per-entity-path rate limiter (PR #1509) was dropping
low-frequency but critical messages like navigation Path and
PointStamped (click-to-nav).

Only rate-limit types with large payloads that actually cause viewer
OOM: Image (~1 MB/frame) and PointCloud2 (~600-800 KB/frame).
Light messages (Path, Twist, TF, EntityMarkers, etc.) now pass
through unthrottled.
@greptile-apps

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR narrows the Rerun bridge's rate-limiting guard (introduced in #1509) so it only applies to the two message types that carry large payloads — Image and PointCloud2 — instead of every entity path. This restores pass-through delivery for lightweight navigation messages (Path, PointStamped, Twist, TF, etc.) that were previously being dropped at the 10 Hz cap, fixing broken path visualization and click-to-navigate functionality.

  • Added module-level _HEAVY_MSG_TYPES = (Image, PointCloud2) constant with a clear explanatory comment
  • Changed the rate-limit condition from `if self.config.min_interval_sec > 0` to `if self.config.min_interval_sec > 0 and isinstance(msg, _HEAVY_MSG_TYPES)`, so only heavy messages are throttled
  • Minor style inconsistency in imports: PointCloud2 is imported directly from its module (dimos.msgs.sensor_msgs.PointCloud2) while Image is imported via the package __init__; since both are exported from dimos.msgs.sensor_msgs, a single unified import would be cleaner

Confidence Score: 5/5

  • Safe to merge — the one-line logic change is correct and well-targeted, with no risk of regressions.
  • The change is minimal and surgical: a single isinstance guard added to an already-working rate-limit block. Image and PointCloud2 are both correctly imported and are concrete types whose isinstance check will also cover any future subclasses. Light messages now skip the throttle entirely, restoring the previously broken navigation overlay. No data loss, no concurrency concerns, no side effects on the rest of the bridge.
  • No files require special attention.
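
The point about subclasses follows from how `isinstance` treats a tuple of types. A small sketch, using stand-in classes and a hypothetical `CompressedImage` subclass that is not known to exist in the codebase:

```python
# Stand-in classes for illustration; the real ones live in the dimos package.
class Image: ...
class PointCloud2: ...
class Path: ...


class CompressedImage(Image):
    """Hypothetical future subclass, used only to show the isinstance behaviour."""


_HEAVY_MSG_TYPES = (Image, PointCloud2)


def is_heavy(msg: object) -> bool:
    # isinstance accepts a tuple and returns True if msg is an instance of
    # any listed class or of any subclass of a listed class.
    return isinstance(msg, _HEAVY_MSG_TYPES)


print(is_heavy(Image()), is_heavy(CompressedImage()), is_heavy(Path()))
```

A type that carries a large payload but does not inherit from either class would not be throttled, so it would have to be added to the tuple explicitly.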

Important Files Changed

| Filename | Overview |
| --- | --- |
| dimos/visualization/rerun/bridge.py | Rate-limiting guard narrowed from all message types to only Image and PointCloud2; logic is correct and well-commented. Minor style issue: the PointCloud2 import uses a direct module path while Image uses the package __init__; both are available from the package directly. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Pubsub message received] --> B[_on_message]
    B --> C[Compute entity_path]
    C --> D{min_interval_sec > 0\nAND isinstance msg,\n_HEAVY_MSG_TYPES?}
    D -- No\n light msg e.g. Path, PointStamped, TF --> F[Apply visual override / to_rerun]
    D -- Yes\n Image or PointCloud2 --> E{now - last_log\n< min_interval_sec?}
    E -- Too soon --> G[Drop frame / return]
    E -- OK --> H[Update _last_log timestamp]
    H --> F
    F --> I{rerun_data is None?}
    I -- Yes --> J[Suppress / return]
    I -- No --> K{is_rerun_multi?}
    K -- Yes --> L[rr.log each path+archetype]
    K -- No --> M[rr.log entity_path, archetype]
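
The lower half of the flowchart (everything after the throttle decision) can be sketched as a small dispatch function. This is an assumption-laden illustration: `dispatch` is a hypothetical name, `is_rerun_multi` is given a guessed definition (a list of path/archetype pairs), and `log` is a callable standing in for `rr.log`.

```python
from typing import Any, Callable


def is_rerun_multi(rerun_data: Any) -> bool:
    """Guessed definition: a list of (path, archetype) pairs counts as 'multi'."""
    return isinstance(rerun_data, list)


def dispatch(entity_path: str, rerun_data: Any, log: Callable[[str, Any], None]) -> int:
    """Log converted data and return the number of log calls made."""
    if rerun_data is None:
        return 0  # conversion produced nothing: suppress
    if is_rerun_multi(rerun_data):
        for path, archetype in rerun_data:
            log(path, archetype)  # one log call per path/archetype pair
        return len(rerun_data)
    log(entity_path, rerun_data)  # single archetype on the computed entity path
    return 1
```

Passing a recording function as `log` makes the three branches easy to exercise without a running viewer.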

Last reviewed commit: 1083f46

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
spomichter merged commit 5bebf73 into dev on Mar 11, 2026 (12 checks passed)
spomichter deleted the fix/rate-limit-heavy-only branch on March 11, 2026 18:24
spomichter mentioned this pull request on Mar 11, 2026 (1 task)
spomichter added a commit that referenced this pull request Mar 12, 2026
Release v0.0.11

82 PRs, 10 contributors, 396 files changed.

This release brings a production CLI, MCP tooling, temporal memory, and first-class support for coding agents. Dask has been removed. The entire stack now runs from `dimos run` through `dimos stop`.

### Agent-Native Development

DimOS is now built to be driven by coding agents. Point OpenClaw, Claude Code, or Cursor at [AGENTS.md](AGENTS.md) and they can build, run, and debug Dimensional applications using the CLI and MCP interfaces directly.

- **AGENTS.md** — comprehensive onboarding doc: architecture, CLI reference, skill rules, blueprint quick-reference. Your agent reads this and starts coding.
- **MCP server** — all `@skill` methods exposed as HTTP tools. External agents call `dimos mcp call relative_move --arg forward=0.5` or connect via JSON-RPC.
- **MCP CLI** — `dimos mcp list-tools`, `dimos mcp call`, `dimos mcp status`, `dimos mcp modules`
- **Agent context logging** — MCP tool calls and agent messages logged to per-run JSONL for debugging and replay.
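
For agents connecting via JSON-RPC, the `relative_move` call above might look like the following request body. This sketch assumes the server follows the standard MCP `tools/call` request shape; the endpoint URL, port, and transport are not given in these notes and are deliberately left out, and `build_tool_call` is a hypothetical helper.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request in the MCP `tools/call` shape."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Same tool and argument as the CLI example: relative_move with forward=0.5.
payload = build_tool_call("relative_move", {"forward": 0.5})
body = json.dumps(payload)
print(body)
```

The resulting string is what a client would send to the server; how it is delivered depends on the MCP transport DimOS exposes.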

### CLI & Daemon

Full process lifecycle — no more Ctrl-C in tmux.

- `dimos run --daemon` — background execution with health checks and run registry
- `dimos stop [--force]` — graceful shutdown with SIGTERM → SIGKILL fallback
- `dimos restart` — replays the original CLI arguments
- `dimos status` — PID, blueprint, uptime, MCP port
- `dimos log -f` — structured per-run logs with follow, JSON output, filtering
- `dimos show-config` — resolved GlobalConfig with source tracing
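
The SIGTERM → SIGKILL fallback used by `dimos stop` is a common shutdown pattern. A minimal sketch of the idea using `subprocess`, assuming a child process handle is available; the actual command presumably works from the run registry, and `stop_process` and `demo` are hypothetical names:

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, grace_sec: float = 5.0) -> str:
    """Ask a process to exit, then force-kill it if it ignores the request."""
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_sec)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        proc.wait()  # reap the process so it does not linger as a zombie
        return "killed"


def demo() -> str:
    # Start a child that would otherwise sleep for a minute, then stop it.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    return stop_process(child, grace_sec=10.0)
```

The grace period gives the process a chance to flush logs and release hardware before the forced kill.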

### Temporal-Spatial Memory

Robots in physical space ingest hours of video and lidar. Temporal-spatial memory gives them a human-like understanding of the world — causal object relationships, entity tracking through time and physical space, and the ability to answer complex temporal queries:

*Who spends the most time in the kitchen? What time on average do I wake up? Which set of switches toggles the main lights? Who was at the office at 9am last Thursday?*

Traditional frame-level embeddings (CLIP, ViT) lose temporal context and don't scale beyond a handful of frames. Video transformers are expensive and don't operate in RGB-D. Dimensional agents work with video + lidar natively, tracking entities across hours and days.

```bash
dimos --replay --replay-dir unitree_go2_office_walk2 run unitree-go2-temporal-memory
```

### Interactive Viewer

Custom Rerun fork (`dimos-viewer`) is now the default. Click-to-navigate: click a point in the 3D view → PointStamped → A* planner → robot moves.

- Camera | 3D split layout on Go2, G1, and drone blueprints
- Native keyboard teleop in the viewer
- `--viewer rerun|rerun-web|rerun-connect|foxglove|none`

### Drone Support

Drone blueprints modernized to match Go2 composition pattern. `drone-basic` and `drone-agentic` work with replay, Rerun, and the full CLI.

```bash
dimos --replay run drone-basic
dimos --replay run drone-agentic
```

### More

- **Go2 fleet control** — multi-robot with `--robot-ips` (#1487)
- **Replay `--replay-dir`** — select dataset, loops by default (#1519, #1494)
- **Interactive install** — `curl -fsSL .../install.sh | bash` (#1395)
- **Nix on non-Debian Linux** (#1472)
- **Remove Dask** — native worker pool (#1365)
- **Remove asyncio dependency** (#1367)
- **Perceive loop** — continuous observation module for agents (#1411)
- **Worker resource monitor** — `dtop` TUI (#1378)
- **G1 agent wiring fix** (#1518)
- **Rerun rate limiting** — prevents viewer OOM on continuous streams (#1509, #1521)
- **RotatingFileHandler** — prevents unbounded log growth (#1492)
- **Test coverage** (#1397), draft PR CI skip (#1398), manipulation test fixes (#1522)
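
The RotatingFileHandler item refers to the standard-library handler in `logging.handlers`. A minimal sketch of how it bounds log growth; the sizes, file name, and function names are illustrative and are not the values DimOS uses:

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_rotating_logger(log_path: str, max_bytes: int = 1024, backup_count: int = 2) -> logging.Logger:
    """Create a logger whose file rolls over once it reaches max_bytes."""
    logger = logging.getLogger(f"sketch.{log_path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = RotatingFileHandler(log_path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def demo_rotation() -> list[str]:
    """Write enough lines to force several rollovers and list the files left."""
    log_dir = tempfile.mkdtemp()
    logger = make_rotating_logger(os.path.join(log_dir, "run.log"), max_bytes=200)
    for i in range(200):
        logger.info("message %d", i)
    for handler in logger.handlers:
        handler.close()
    return sorted(os.listdir(log_dir))
```

However many lines are written, at most `backup_count + 1` files remain on disk, each no larger than `max_bytes`.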

### Breaking Changes

- `--viewer-backend` renamed to `--viewer`
- Dask removed — blueprints using Dask workers need migration to native worker pool
- Default viewer changed from `rerun-web` to `rerun` (native dimos-viewer)

### Contributors

@spomichter, @PaulNechifor, @ruthwikdasyam, @summeryang, @MustafaBhadsorawala, @leshy, @sambull, @JeffHykin, @RadientBrain

## Contributor License Agreement

- [x] I have read and approved the [CLA](https://github.com/dimensionalOS/dimos/blob/main/CLA.md).