GraspGen generator with docker dimos module integration#1119

Merged
JalajShuklaSS merged 8 commits into dev from jalaj_graspgen_docker_integration
Feb 7, 2026

Conversation

@JalajShuklaSS

Adds GraspGen grasp generation to DimOS and introduces Docker-backed module execution, enabling dependency-heavy modules to run in containers while behaving like native modules with RPC, streaming, and autoconnect support. GraspGen is intended to become a subskill of the manipulation stack, where it will be used to generate grasps.

GraspGen Module

  • Adds a GraspGen module for grasp generation.
  • Generates grasp candidates from pointcloud inputs and publishes grasp poses via streams.
  • Exposes lifecycle control and grasp generation via RPC.
  • Designed to run inside Docker to isolate CUDA, PyTorch, and EGL dependencies and avoid ABI conflicts.
  • Includes a temporary test demo that composes with existing perception and manipulation pipelines.

Docker-backed Module Support

  • Introduces DockerModule (host side handler) and StandaloneModuleRunner (container runtime)
  • Enables containerized modules to behave like native modules from the blueprint perspective
  • Supports RPC communication via LCM multicast
  • Enables stream wiring via configure_stream, compatible with existing autoconnect logic
  • Current stream support uses pLCMTransport (topic-based pub/sub)
  • Full transport parity (SHM, JPEG-SHM, typed LCM) can be added once this spec and schematic have been discussed

@greptile-apps

greptile-apps bot commented Jan 27, 2026

Greptile Overview

Greptile Summary

This PR introduces Docker-backed module execution to DimOS and adds GraspGen grasp generation capability. The implementation enables dependency-heavy modules (CUDA, PyTorch, EGL) to run in isolated containers while behaving like native modules from the blueprint perspective.

Key Changes:

  • Docker Module Infrastructure: New DockerModule host-side handler and StandaloneModuleRunner for containerized execution with RPC communication via LCM multicast
  • Dockerfile Conversion: Automated footer injection system (docker_build.py, module-install.sh) that converts any Dockerfile into a DimOS module container
  • GraspGen Integration: Adds GraspGenModule with grasp pose generation from pointclouds, supporting multiple grippers (Robotiq 2F-140, Franka Panda, suction cup)
  • Perception Extensions: Enhanced ObjectSceneRegistrationModule with pointcloud extraction methods for grasp generation support
  • Stream Wiring: New configure_stream RPC method enables Docker modules to participate in autoconnect stream wiring
  • End-to-End Demo: Complete demo_perception_grasping blueprint integrating camera → perception → grasp generation → agent

Issues Found:

  • Path mismatch between Dockerfile (GRASPGEN_PATH=/app/GraspGen) and Python default (/dimos/third_party/GraspGen) will cause initialization failures
  • Config file search returns non-existent path instead of raising error
  • Personal test file objscene_registreation_myversion.py with typo should not be committed
  • Temporary testing module temp_graspgen_testing.py marked as temporary but being merged
  • PoseArray uses insecure pickle encoding instead of proper LCM schema
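
The first two issues above (a default path that disagrees with the Dockerfile, and a config search that returns a path that does not exist) can be illustrated with a lookup that fails loudly. This is a minimal sketch, not the PR's code: `resolve_graspgen_root` and `find_config` are hypothetical helpers, and only the `GRASPGEN_PATH` variable name and the two directory paths come from the review.

```python
import os
from pathlib import Path
from typing import Iterable


def resolve_graspgen_root(default: str = "/dimos/third_party/GraspGen") -> Path:
    """Prefer GRASPGEN_PATH from the environment so the image and Python agree."""
    return Path(os.environ.get("GRASPGEN_PATH", default))


def find_config(name: str, search_dirs: Iterable[Path]) -> Path:
    """Return the first existing candidate, or raise instead of returning a bad path."""
    tried = []
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
        tried.append(str(candidate))
    raise FileNotFoundError(
        f"{name} not found; tried: {', '.join(tried) or 'no directories'}"
    )
```

Raising at lookup time surfaces a misconfigured image when the module starts, rather than later when the model tries to open the file.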

Architecture Notes:
The Docker module deployment bypasses standard Dask actor deployment, which may affect existing lifecycle hooks. The implementation uses host networking for LCM multicast, requiring --network=host mode.
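
The host-networking requirement can be made concrete with a sketch of how a host-side handler might assemble its `docker run` invocation. This is an assumption about the shape of such a helper, not the contents of `docker_module.py`: `build_docker_run_command` and the image and container names in the usage line are hypothetical, while the flags themselves (`--network=host`, `--gpus all`, `-e`, `--rm`, `--name`) are standard Docker CLI options.

```python
from typing import Dict, List, Optional


def build_docker_run_command(
    image: str,
    name: str,
    env: Optional[Dict[str, str]] = None,
    use_gpu: bool = True,
) -> List[str]:
    """Build an argv list for `docker run` suitable for subprocess.run()."""
    # Host networking lets LCM multicast traffic reach the container.
    cmd = ["docker", "run", "--rm", "--name", name, "--network=host"]
    if use_gpu:
        # Expose all GPUs; requires the NVIDIA container toolkit on the host.
        cmd += ["--gpus", "all"]
    for key, value in sorted((env or {}).items()):
        cmd += ["-e", f"{key}={value}"]
    cmd.append(image)
    return cmd
```

For example, `build_docker_run_command("dimos-graspgen:latest", "graspgen", {"GRASPGEN_PATH": "/app/GraspGen"})` yields a list that can be passed directly to `subprocess.run`, which avoids shell quoting problems.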

Confidence Score: 3/5

  • This PR has a solid architecture but contains a critical path-mismatch bug and includes temporary/test files that should not be merged
  • Score of 3 reflects well-designed Docker module infrastructure and comprehensive implementation, but the GRASPGEN_PATH mismatch between Dockerfile and Python code will cause runtime failures. Additionally, the inclusion of personal test files and temporary modules reduces production readiness.
  • Pay close attention to dimos/grasping/graspgen_module.py (path mismatch bug), dimos/perception/objscene_registreation_myversion.py (should be removed), and dimos/grasping/temp_graspgen_testing.py (temporary code)

Important Files Changed

| Filename | Overview |
|---|---|
| dimos/core/docker_module.py | Adds DockerModule host-side handler and StandaloneModuleRunner for containerized module execution with RPC communication |
| dimos/grasping/graspgen_module.py | Implements GraspGenModule with Docker-based execution for grasp pose generation with CUDA dependencies |
| dimos/core/__init__.py | Integrates Docker module deployment into DimosCluster with automatic detection and lifecycle management |
| dimos/grasping/temp_graspgen_testing.py | Temporary test pipeline wiring perception to grasp generation via RPC calls (marked as temporary) |
| dimos/msgs/geometry_msgs/PoseArray.py | Adds PoseArray message type with pickle-based encoding/decoding for grasp pose arrays |
| dimos/perception/objscene_registreation_myversion.py | Appears to be a personal copy of object_scene_registration with typo in filename (should not be committed) |
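
To show why the review flags pickle for PoseArray, here is a sketch of a fixed-layout binary encoding that cannot execute code when decoded. It is not an LCM schema and not the PR's implementation: the `Pose` tuple layout, `encode_poses`, and `decode_poses` are assumptions made for illustration, and a real fix would more likely use an LCM-generated type.

```python
import struct
from typing import List, Tuple

# Assumed layout for illustration: x, y, z, qx, qy, qz, qw
Pose = Tuple[float, float, float, float, float, float, float]

_HEADER = struct.Struct("<I")   # little-endian uint32: number of poses
_POSE = struct.Struct("<7d")    # seven little-endian float64 values


def encode_poses(poses: List[Pose]) -> bytes:
    """Serialize poses as a count followed by fixed-size records."""
    parts = [_HEADER.pack(len(poses))]
    parts.extend(_POSE.pack(*pose) for pose in poses)
    return b"".join(parts)


def decode_poses(data: bytes) -> List[Pose]:
    """Parse bytes produced by encode_poses, rejecting malformed input."""
    if len(data) < _HEADER.size:
        raise ValueError("buffer too short for header")
    (count,) = _HEADER.unpack_from(data, 0)
    expected = _HEADER.size + count * _POSE.size
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    return [
        _POSE.unpack_from(data, _HEADER.size + i * _POSE.size)
        for i in range(count)
    ]
```

Unlike `pickle.loads`, the decoder only ever produces tuples of floats, so a malicious or corrupted message on the multicast bus results in a `ValueError` rather than arbitrary object construction.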

Sequence Diagram

sequenceDiagram
    participant User
    participant Agent
    participant GraspingSkill
    participant GraspPipeline
    participant Perception
    participant GraspGen
    participant Docker
    
    Note over User,Docker: Initialization Phase
    User->>Agent: Start demo_perception_grasping blueprint
    Agent->>Docker: Deploy GraspGenModule
    Docker->>Docker: Build image with DimOS footer
    Docker->>GraspGen: Start container with LCMRPC
    GraspGen-->>Docker: Module ready signal
    Docker-->>Agent: DockerModule handle created
    Agent->>Perception: Deploy ObjectSceneRegistration
    Agent->>GraspPipeline: Deploy GraspPipeline
    Agent->>GraspingSkill: Deploy GraspingSkillContainer
    
    Note over User,Docker: Grasp Generation Flow
    User->>Agent: "Generate grasps for cup"
    Agent->>GraspingSkill: generate_grasps("cup")
    GraspingSkill->>GraspPipeline: Publish trigger ("cup")
    GraspPipeline->>Perception: RPC: get_object_pointcloud_by_name("cup")
    Perception-->>GraspPipeline: PointCloud2
    GraspPipeline->>Perception: RPC: get_full_scene_pointcloud()
    Perception-->>GraspPipeline: Scene PointCloud2
    GraspPipeline->>GraspGen: RPC: generate_grasps(pc, scene_pc)
    GraspGen->>GraspGen: Initialize GraspGen model (if needed)
    GraspGen->>GraspGen: Run inference with CUDA
    GraspGen->>GraspGen: Filter collisions (optional)
    GraspGen->>GraspingSkill: Publish PoseArray via stream
    GraspingSkill->>GraspingSkill: _on_grasps() callback
    GraspingSkill-->>Agent: Return grasp summary
    Agent-->>User: "Generated 100 grasps for 'cup'"
    
    Note over User,Docker: Cleanup Phase
    User->>Agent: Stop blueprint
    Agent->>Docker: Stop all DockerModules
    Docker->>GraspGen: Send SIGTERM
    GraspGen->>GraspGen: Cleanup GPU memory
    Docker->>Docker: Remove container

@greptile-apps greptile-apps bot left a comment

10 files reviewed, 10 comments

@greptile-apps

greptile-apps bot commented Jan 27, 2026

Additional Comments (1)

dimos/perception/object_scene_registration.py
The parameter is exclude_object_id but usage in the grasp pipeline passes object names. Verify the parameter semantics match the actual usage.

logger = setup_logger()


class GraspingSkillContainer(SkillModule):

We need to move away from using skill containers. Skills should just be in the Module itself.

Jalaj Shukla added 2 commits January 30, 2026 11:13
- Move grasping module from dimos/grasping/ to dimos/manipulation/grasping/
- Rename docker_module.py to docker_runner.py with simplified implementation
- Isolate the Docker container so it has no langchain dependency
- Add proper docstrings with Args for OpenAI function calling
- Simplify visualize_grasps.py to minimal debug tool
- Add get_object_pointcloud_by_object_id RPC for object ID lookups

@mustafab0 mustafab0 left a comment

Please delete temp files before merging

@paul-nechifor

@JalajShuklaSS You have a few mypy failures and test failures. I can help with mypy if you need me to.

Type fixes:
- Add type annotations for ndarray, list, dict parameters
- Add type: ignore comments for Docker-only imports (torch, grasp_gen)
- Fix RPC call args from tuple to list in docker_runner.py
- Add Iterator return type to PoseArray.__iter__
- Fix import order for ruff compliance

Cleanup:
- Delete unused objscene_registreation_myversion.py
- Update .dockerignore to only allow data/.lfs/
@JalajShuklaSS JalajShuklaSS requested review from alexlin2 and removed request for alexlin2 February 4, 2026 00:57
@JalajShuklaSS JalajShuklaSS self-assigned this Feb 4, 2026
Jalaj Shukla added 3 commits February 4, 2026 13:51
Fixes:
- Update test_classmethods to expect 10 RPCs (matches current Module base class)
- Regenerate all_blueprints.py to include new grasping_module

Formatting (ruff/pre-commit auto-fixes):
- Apply import ordering and TYPE_CHECKING patterns across grasping modules
- Fix whitespace and blank line formatting
- Update JSON indentation in calibration.json
- Regenerate uv.lock
@JalajShuklaSS JalajShuklaSS merged commit 0ee31cd into dev Feb 7, 2026
28 of 30 checks passed
spomichter added a commit that referenced this pull request Feb 21, 2026
Release v0.0.10: Manipulation Stack, MuJoCo Simulation, DDS Transport, Web and Native Visualization via Rerun


## Highlights

88+ commits, 20 contributors, 700+ files changed.

The TLDR: **a complete manipulation stack**, **MuJoCo simulation**, **DDS transport**, and **a rewritten visualization pipeline**. Agents are no longer bolted on top — they're refactored as native modules with direct stream access. The entire ROS message dependency has been removed from core DimOS, and we've added VR, phone, and arm teleoperation stacks. You can now vibecode a pick-and-place task from natural language to motor commands. Installation has been significantly streamlined — no more direnv, simpler setup, and the web viewer is now the default.

---

## 🚀 New Features

### Simulation
- **MuJoCo simulation module** — Run any DimOS blueprint in simulation with no hardware. Supports xArm and Unitree embodiments, parses MJCF/URDF for robot properties, monotonic clock timing (no `time.sleep`). `dimos --simulation run unitree-go2` ([#1035](#1035)) by @jca0
- **Simulation teleop blueprints** — Added simulation teleop blueprints for Piper, xArm6, and xArm7. ([#1308](#1308)) by @mustafab0

### Manipulation
- **Modular manipulation stack** — Full planning stack with Drake: FK/IK solvers (Jacobian + Drake optimization), RRT path planning, world model with obstacle monitoring, multi-robot management. xArm6/7 and Piper support. ([#1079](#1079)) by @mustafab0
- **Joint servo and cartesian controllers** — Joint position/velocity controllers and cartesian IK task with Pinocchio solver. PoseStamped stream input for real-time control. ([#1116](#1116)) by @mustafab0
- **GraspGen integration** — Grasp generation via Docker-hosted GPU model. Lazy container startup, thread-safe init, RPC `generate_grasps()` returns ranked PoseArray. ([#1119](#1119), [#1234](#1234)) by @JalajShuklaSS
- **Gripper control** — Gripper RPC methods on control coordinator, exposed adapter property for custom implementations. ([#1213](#1213)) by @mustafab0
- **Detection3D and Object support** — Object input topics, TF support on manipulation module, pointcloud-to-convex-hull for Drake imports. ([#1236](#1236)) by @mustafab0
- **Agentic pick and place** — Reimplemented manipulation skills for agent-driven pick-and-place workflows. ([#1237](#1237)) by @mustafab0
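
The "lazy container startup, thread-safe init" behavior noted for GraspGen above can be sketched with double-checked locking. This is a generic pattern under stated assumptions, not the DimOS implementation: `LazyResource` and its `factory` argument are hypothetical names, with the factory standing in for whatever starts the container and loads the model.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyResource(Generic[T]):
    """Create an expensive resource on first use, exactly once across threads."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._value: Optional[T] = None
        self._ready = False

    def get(self) -> T:
        if not self._ready:              # fast path: no lock once initialized
            with self._lock:
                if not self._ready:      # re-check while holding the lock
                    self._value = self._factory()
                    self._ready = True
        return self._value  # type: ignore[return-value]
```

Concurrent first callers block on the lock until the factory finishes, so an RPC such as `generate_grasps()` never observes a half-initialized model.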

### Teleoperation
- **Quest VR teleoperation** — Full WebXR + Deno bridge stack. Quest controller data (pose, trigger, grip) streamed to DimOS modules. Monitor-style locking for control loops. ([#1215](#1215)) by @ruthwikdasyam
- **Phone teleoperation** — Control Go2 from your phone with a web-based teleop interface. ([#1280](#1280)) by @ruthwikdasyam
- **Arm teleop with Pinocchio IK** — Single and dual arm teleoperation using Pinocchio inverse kinematics. Blueprints for xArm, Piper, and dual configurations. ([#1246](#1246)) by @ruthwikdasyam

### Transports & Infrastructure
- **DDS transport protocol** — CycloneDDS transport with configurable QoS (high-throughput and reliable profiles). Optional install, benchmark integration. ([#1174](#1174)) by @Kaweees
- **Pubsub pattern subscriptions** — Glob and regex pattern matching for topic subscriptions. `subscribe_all()` for bridge-style consumers. Topic type encoding in channel strings (`/topic#module.ClassName`). ([#1114](#1114)) by @leshy
- **LCM raw bytes passthrough** — Skip `lcm_encode()` when message is already bytes. ([#1223](#1223)) by @leshy
- **Unified TimeSeriesStore** — Pluggable backends (InMemory, SQLite, Pickle, PostgreSQL) with SortedKeyList for O(log n) operations. Replaces the old replay system and TimestampedCollection. Collection API with slice, range, and streaming methods. ([#1080](#1080)) by @leshy
- **DimosROS benchmark tests** — Benchmark suite for ROS transport performance. ([#1087](#1087)) by @leshy
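
The pattern-subscription item above mentions glob and regex matching and a channel string of the form `/topic#module.ClassName`. The sketch below shows one way such matching could work using only the standard library. `split_channel` and `match_topics` are hypothetical helpers, not the DimOS pubsub API, and the message type names in the usage note are made up.

```python
import fnmatch
import re
from typing import List, Optional, Tuple, Union


def split_channel(channel: str) -> Tuple[str, Optional[str]]:
    """Split '/topic#module.ClassName' into (topic, type name or None)."""
    topic, sep, type_name = channel.partition("#")
    return topic, (type_name if sep else None)


def match_topics(
    pattern: Union[str, "re.Pattern[str]"], channels: List[str]
) -> List[str]:
    """Return channels whose topic part matches a glob string or compiled regex."""
    matched = []
    for channel in channels:
        topic, _ = split_channel(channel)
        if isinstance(pattern, str):
            ok = fnmatch.fnmatchcase(topic, pattern)    # glob semantics
        else:
            ok = pattern.fullmatch(topic) is not None   # whole-topic regex match
        if ok:
            matched.append(channel)
    return matched
```

With channels such as `/camera/color#sensor_msgs.Image` and `/odom#nav_msgs.Odometry`, the glob `"/camera/*"` selects only the camera channel, and `re.compile(r"/o\w+")` selects only the odometry channel. Note that `fnmatch` lets `*` match across `/`, which may or may not be the desired semantics for nested topics.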

### Navigation
- **FASTLIO2 support** — Hardware-verified localization with arm64 support. Docker deployment with FAR Planner, terrain analysis, and bagfile playback mode. Builds or-tools from source on arm64. ([#1149](#1149)) by @baishibona
- **Native Livox + FASTLIO2 module** — First-class DimOS native module for Livox Mid-360 lidar with FASTLIO2 localization. ([#1235](#1235)) by @leshy

### Visualization
- **RerunBridge module and CLI** — New bridge that subscribes to all LCM messages and logs those with `to_rerun()` to Rerun viewer. GlobalConfig singleton, web viewer support. Replaces the old rerun initialization system. ([#1154](#1154)) by @leshy
- **Webcam rerun visualization** — Camera module logs to Rerun with pinhole projection for 3D visualization. ([#1117](#1117)) by @ruthwikdasyam
- **Default viewer switched to rerun-web** — Browser-based viewer is now the default for broader compatibility. No native viewer install needed. ([#1324](#1324)) by @spomichter

### Agents
- **Agent refactor** — Restructured agent module with cleaner imports and global config integration. ([#1211](#1211)) by @paul-nechifor
- **Timestamp knowledge** — Agents now have timestamp awareness in prompts for temporal reasoning. ([#1093](#1093)) by @ClaireBookworm
- **Observe skill** — Go2 can now observe (capture and describe) its environment via agent skill. ([#1109](#1109)) by @paul-nechifor

### Platform & Hardware
- **G1 without ROS** — Unitree G1 blueprints decoupled from ROS dependency. Lazy imports for fast startup. ([#1221](#1221)) by @jeff-hykin
- **ARM (aarch64) support** — DimOS runs on ARM hardware. Platform-conditional dependencies, open3d source builds for arm64. ([#1229](#1229)) by @jeff-hykin
- **Universal joint/hardware schema** — `HardwareComponent` dataclass with `JointState`, `JointName` type aliases. Backend registry with auto-discovery for SDK adapters. ([#1040](#1040), [#1067](#1067)) by @mustafab0

---

## 🔧 Improvements

- **Optional Dask** — Start without Dask using `--no-dask` flag. Startup time reduced from ~60s to ~45s. ([#1111](#1111), [#1232](#1232)) by @paul-nechifor
- **RPC rework** — Renamed `ModuleBlueprint` → `_BlueprintAtom`, `ModuleBlueprintSet` → `Blueprint`, `ModuleConnection` → `Stream`. Added `ModuleRef`, improved type hints throughout. ([#1143](#1143)) by @jeff-hykin
- **Image class simplification** — Rewritten as pure NumPy dataclass. Removed CUDA backend, unused methods (solve_pnp, csrt_tracker), and image_impls/ directory. ([#1161](#1161)) by @leshy
- **Odometry message cleanup** — Simplified Odometry message type. ([#1256](#1256)) by @leshy
- **Remove all ROS message dependencies** — Purged ROS message types from core DimOS. Refactored rosnav to use ROSTransport. Removed dead ROS bridge code. ([#1230](#1230)) by @alexlin2
- **Removed bad function serialization** — Eliminated unnecessary serialization of Python functions. ([#1121](#1121)) by @paul-nechifor
- **Benchmark IEC units** — Switched bandwidth benchmarks from SI to IEC units for accuracy. ([#1147](#1147)) by @leshy
- **Pubsub typing improvements** — Thread-safety locks on `subscribe_new_topics` and `subscribe_all`. Proper type params across pubsub stack. ([#1153](#1153)) by @leshy
- **Autogenerated blueprint list** — Blueprints are now auto-discovered and listed. ([#1100](#1100)) by @paul-nechifor
- **Generic Buttons message** — Renamed `QuestButtons` to `Buttons` with generic field names for cross-platform teleop. ([#1261](#1261)) by @ruthwikdasyam
- **Dev container uses ros-dev image** — `./bin/dev` now runs the ROS-enabled dev image. ([#1170](#1170)) by @leshy
- **LSP support** — Added python-lsp-server and python-lsp-ruff to dev dependencies. ([#1169](#1169)) by @leshy
- **Lazy-load pyrealsense2** — RealSense camera module uses lazy imports to avoid errors in simulation environments without the SDK. ([#1309](#1309)) by @spomichter
- **Removed unused mmcv and mmengine** — Dead Detic dependencies removed, eliminating slow source builds from install. ([#1319](#1319)) by @spomichter
- **Simplified installation** — Removed direnv requirement, streamlined install instructions across all platforms. ([#1315](#1315)) by @spomichter
- **DDS extra excluded from --all-extras** — `cyclonedds` requires a source build, so `dds` is now excluded from `uv sync --all-extras` by default. ([#1318](#1318)) by @spomichter
- **Nix pre-commit skip** — Skip pre-commit install if hooks already exist. ([#1162](#1162)) by @leshy
- **Removed base-requirements** — Consolidated dependency management. ([#1098](#1098)) by @paul-nechifor
- **Removed old graspnet** — Cleaned up deprecated graspnet version. ([#1248](#1248)) by @paul-nechifor
- **Code cleanup** — Removed `tofix` markers ([#1216](#1216)), fixed ruff issues ([#1112](#1112)), removed old README_installation.md ([#1101](#1101)) by @paul-nechifor
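
The SI-to-IEC switch for bandwidth benchmarks listed above comes down to dividing by 1024 rather than 1000 and labeling the result KiB, MiB, and so on. A minimal sketch follows. `format_iec` is a hypothetical helper, not the function used in the DimOS benchmarks.

```python
def format_iec(num_bytes: float, suffix: str = "B") -> str:
    """Format a byte count with binary (IEC) prefixes: KiB, MiB, GiB, TiB."""
    units = ["", "Ki", "Mi", "Gi", "Ti"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if abs(value) < 1024.0 or unit == units[-1]:
            return f"{value:.2f} {unit}{suffix}"
        value /= 1024.0
    raise AssertionError("unreachable")
```

Under this convention, 1,000,000 bytes is reported as 976.56 KiB instead of 1 MB, which is the kind of difference the change is meant to make explicit.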

---

## 🐛 Bug Fixes

- Fix LFS updating (move from .local to venv) ([#1090](#1090)) by @jeff-hykin
- Launch hotfixes: git clone HTTPS, get_data main branch ([#1091](#1091)) by @spomichter
- Fix camera demo not showing in Rerun ([#1148](#1148)) by @jeff-hykin
- Default to rerun native viewer ([#1099](#1099)) by @Nabla7
- Fix exploration blocking agent loop ([#1258](#1258)) by @paul-nechifor
- Fix person-follow blocking agent loop ([#1278](#1278)) by @paul-nechifor
- Skip metric3d tests on unsupported xformers GPUs (Blackwell compute capability >9.0) ([#1225](#1225)) by @leshy
- Fix manipulation tests ([#1218](#1218), [#1247](#1247)) by @jeff-hykin, @paul-nechifor
- Fix control coordinator e2e test ([#1212](#1212)) by @mustafab0
- Fix xarm7-sim broken e2e tests ([#1294](#1294)) by @paul-nechifor
- Pin langchain to restore supported providers ([#1241](#1241)) by @spomichter
- Fix missing library dependencies in Nix flake ([#1240](#1240)) by @Kaweees
- Fix discord invite link ([#1122](#1122)) by @spomichter
- macOS edgecase fix ([#1096](#1096)) by @jeff-hykin
- Fix second N in logo ([#1250](#1250)) by @jeff-hykin
- Fix Unitree Go2 minor issues ([#1307](#1307)) by @paul-nechifor
- Fix broken tests ([#1305](#1305)) by @ruthwikdasyam
- Fix `uv sync` for some macOS systems ([#1322](#1322)) by @jeff-hykin
- Fix mmcv install ([#1313](#1313)) by @paul-nechifor
- Fix mypy issues ([#1150](#1150), [#1167](#1167), [#1257](#1257)) by @leshy, @paul-nechifor, @jeff-hykin
- Fix Nix install uv pip extras ([#1321](#1321)) by @spomichter

---

## 📚 Documentation

- **Major docs overhaul** — New README with feature grid, hardware table, quickstart. Navigation, transports, data streams, and agent docs. ([#1295](#1295)) by @leshy
- **Day 1 docs** — Comprehensive getting started guide, development docs, contributing guide, architecture overview. Executable blueprint docs via md-babel-py. ([#1064](#1064)) by @jeff-hykin
- **Arm integration guide** — How-to for integrating new robotic arms with DimOS. ([#1238](#1238)) by @mustafab0
- **MCP documentation update** — Updated MCP install and usage instructions. ([#1251](#1251)) by @Kaweees
- **Docker docs** — First pass on Docker deployment documentation. ([#1151](#1151)) by @leshy
- **Transports documentation** — Encode/decode mixins, SHM examples, ROS/DDS transport docs. ([#1107](#1107)) by @leshy
- **Rerun API examples** — Updated examples for the new RerunBridge API. ([#1262](#1262)) by @jeff-hykin
- **PR template** added ([#1172](#1172)) by @christiefhyang
- **Simplified install instructions** — Removed direnv, streamlined across all platforms. ([#1315](#1315)) by @spomichter
- **Python example restored** — Added back the Python usage example. ([#1317](#1317)) by @jeff-hykin
- **Nix install updated** — Replaced uv with pip for Nix compatibility. ([#1326](#1326)) by @ruthwikdasyam
- **README improvements** ([#1311](#1311)) by @paul-nechifor
- **Simplified writing docs** — Consolidated writing_docs to a single markdown file. ([#1254](#1254)) by @jeff-hykin

---

## 🏗️ CI & Build

- **ci-complete gate** — Dynamic branch protection via single aggregated status check. MD-only PRs no longer blocked. ([#1279](#1279)) by @spomichter
- **Path-based test filtering** — Test jobs fully skip (no container spin-up) when no relevant code changed. ([#1284](#1284), [#1286](#1286)) by @spomichter
- **Navigation docker build workflow** — CI builds for the ROS navigation stack. ([#1259](#1259)) by @spomichter
- **CUDA test marker** — `@pytest.mark.cuda` for GPU-dependent tests. ([#1220](#1220)) by @jeff-hykin
- **e2e test marker** — Marked end-to-end tests for selective CI runs. ([#1110](#1110)) by @paul-nechifor
- **pytest stdin fix** — Added `-s` to default addopts for LCM autoconf compatibility. ([#1320](#1320)) by @spomichter

---

## ⚠️ Breaking Changes

- **RPC renames**: `ModuleBlueprint` → `_BlueprintAtom`, `ModuleBlueprintSet` → `Blueprint`, `ModuleConnection` → `Stream` ([#1143](#1143))
- **Image class rewrite**: `CudaImage` and `NumpyImage` removed. Image is now a pure NumPy dataclass. Methods like `solve_pnp`, `csrt_tracker`, `from_depth`, `to_depth_meters` removed. ([#1161](#1161))
- **ROS messages removed from core**: All `to_ros`/`from_ros` conversion methods removed. Use `ROSTransport` instead. ([#1230](#1230))
- **QuestButtons → Buttons**: Renamed with generic field names. ([#1261](#1261))
- **RerunBridge replaces old rerun init**: `dimos.dashboard.rerun_init` removed. Use `RerunBridgeModule` or the `rerun-bridge` CLI. ([#1154](#1154))
- **Unitree directory restructuring**: `unitree_go2` → `unitree/go2`, `unitree_g1` → `unitree/g1`. Blueprint names updated. ([#1221](#1221))
- **Default viewer is now rerun-web**: Use `--viewer-backend rerun` to restore native viewer. ([#1324](#1324))

---

## Quickstart

```bash
# Install
uv pip install dimos[base,unitree]

# Try it (no hardware needed)
# NOTE: First run downloads ~2.4 GB from LFS
dimos --replay run unitree-go2

# Simulate
uv pip install dimos[base,unitree,sim]
dimos --simulation run unitree-go2
```

---

## New Contributors 🎉

- @ruthwikdasyam — Quest VR teleoperation, phone teleop, arm teleop, webcam rerun viz
- @JalajShuklaSS — GraspGen integration
- @jca0 — MuJoCo simulation module
- @christiefhyang — PR template

---

**Full Changelog**: [v0.0.9...v0.0.10](v0.0.9...v0.0.10)
5 participants