DimOS Manipulation Framework, ObjectDetectionStream Changes#308

Merged
spomichter merged 15 commits into dev from manipulation_constraints
Jun 2, 2025

Conversation

@spomichter
Contributor

@spomichter spomichter commented May 29, 2025

Changes

  • Added manipulation_constraint.py with AbstractConstraint base class and three concrete constraint types (Translation, Rotation, Force)
  • Implemented RotationConstraintSkill for controlling rotation axes, angles, and pivot points
  • Implemented TranslationConstraintSkill for bounded/unbounded translations with reference points
  • Implemented ForceConstraintSkill for force-limited manipulations with min/max forces and direction
  • Created ManipulateSkill that combines constraint types for executing complex manipulation tasks
  • Added a test file with a manipulation development interface that streams VLM ManipulationPoint locations on RGB images
  • Created ManipulationInterface
  • Created ManipulationHistory with pytest unit test

DimOS Manipulation Framework

A comprehensive framework for defining, executing, and tracking robot manipulation tasks with constraint-based control.

Overview

The DimOS Manipulation Framework structures robot manipulation tasks with:

  • Type-safe constraint definitions (Translation, Rotation, Force)
  • Detailed task history with advanced search capabilities
  • Integration with object detection systems
  • Comprehensive metadata tracking

Key Components

Core Types

| Type | Description |
| --- | --- |
| ManipulationTask | Defines a manipulation operation with target objects, points, and constraints |
| AbstractConstraint | Base class for all constraint types |
| TranslationConstraint | Controls movement along specific axes with bounds |
| RotationConstraint | Manages rotation around specific axes with angle limits |
| ForceConstraint | Specifies force parameters for manipulation |
| ManipulationTaskConstraint | Container for multiple constraints |
| ManipulationMetadata | Stores context information about the manipulation environment |
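
To make the relationship between these types more concrete, here is a minimal sketch of how a shared base class and a container could fit together. It is not the code from this PR: the field names are taken from the usage examples in this description, while `Vec3`, the `add` and `of_type` methods, and the use of dataclasses are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Type

Vec3 = Tuple[float, float, float]  # hypothetical stand-in for dimos.types.vector.Vector


@dataclass
class AbstractConstraint:
    """Common base so constraints can be stored and filtered uniformly."""
    description: str = ""


@dataclass
class TranslationConstraint(AbstractConstraint):
    translation_axis: Optional[str] = None  # e.g. "y"
    bounds_min: Optional[Vec3] = None
    bounds_max: Optional[Vec3] = None


@dataclass
class RotationConstraint(AbstractConstraint):
    rotation_axis: Optional[str] = None  # e.g. "roll"
    start_angle: Optional[Vec3] = None
    end_angle: Optional[Vec3] = None


@dataclass
class ForceConstraint(AbstractConstraint):
    min_force: float = 0.0
    max_force: float = 0.0
    force_direction: Optional[Vec3] = None


@dataclass
class ManipulationTaskConstraint:
    """Container holding any mix of constraint types (hypothetical API)."""
    constraints: List[AbstractConstraint] = field(default_factory=list)

    def add(self, constraint: AbstractConstraint) -> None:
        self.constraints.append(constraint)

    def of_type(self, kind: Type[AbstractConstraint]) -> List[AbstractConstraint]:
        # Filter by concrete class, relying on the shared base type.
        return [c for c in self.constraints if isinstance(c, kind)]


container = ManipulationTaskConstraint()
container.add(TranslationConstraint(translation_axis="y", description="slide along Y"))
container.add(ForceConstraint(min_force=2.0, max_force=5.0, force_direction=(0, 0, -1)))
print([type(c).__name__ for c in container.constraints])
```

Keeping every constraint behind one base type is what lets a task carry a single list of constraints rather than one attribute per constraint kind.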

History Tracking

| Type | Description |
| --- | --- |
| ManipulationHistory | Stores and retrieves manipulation task history |
| ManipulationHistoryEntry | Records individual task execution with results |

Integration Classes

| Type | Description |
| --- | --- |
| ManipulationInterface | Connects robot hardware to manipulation logic |
| MockManipulationRobot | Test Robot class with webcam camera |

Usage Examples

Creating a Basic Manipulation Task

from dimos.types.manipulation import ManipulationTask
from dimos.types.vector import Vector
import time

# Create a simple task to pick up a cup
task = ManipulationTask(
    description="Pick up the cup",
    target_object="cup",
    target_point=(100, 200),  # 2D point from image detection
    task_id="task1",
    metadata={
        "timestamp": time.time(),
        "objects": {
            "cup1": {
                "object_id": 1,
                "label": "cup",
                "confidence": 0.95,
                "position": {"x": 1.5, "y": 2.0, "z": 0.5},
            }
        }
    }
)

Adding Constraints to a Task

from dimos.types.manipulation import TranslationConstraint, RotationConstraint, ForceConstraint

# Translation constraint (movement only along Y-axis)
translation_constraint = TranslationConstraint(
    translation_axis="y",
    reference_point=Vector(2.5, 1.0, 0.3),
    bounds_min=Vector(2.0, 0.5, 0.3),
    bounds_max=Vector(3.0, 1.5, 0.3),
    target_point=Vector(2.7, 1.2, 0.3),
    description="Constrained translation along Y-axis only"
)
task.add_constraint(translation_constraint)

# Rotation constraint (rotation only around X-axis/roll)
rotation_constraint = RotationConstraint(
    rotation_axis="roll",
    start_angle=Vector(0, 0, 0),
    end_angle=Vector(90, 0, 0),
    pivot_point=Vector(2.5, 1.0, 0.3),
    description="Constrained rotation around X-axis (roll only)"
)
task.add_constraint(rotation_constraint)

# Force constraint (apply moderate downward force)
force_constraint = ForceConstraint(
    min_force=2.0,
    max_force=5.0,
    force_direction=Vector(0, 0, -1),
    description="Apply moderate downward force during manipulation"
)
task.add_constraint(force_constraint)
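
Before attaching a bounded translation constraint, it can help to check that the target point actually lies inside the bounds. The sketch below is an illustration only: `within_bounds` and `clamp_to_bounds` are hypothetical helpers, not part of DimOS, and points are plain tuples rather than `Vector` objects.

```python
from typing import Sequence, Tuple


def within_bounds(point: Sequence[float],
                  bounds_min: Sequence[float],
                  bounds_max: Sequence[float],
                  tol: float = 1e-9) -> bool:
    """Return True if every coordinate lies inside [min, max], within a tolerance."""
    return all(lo - tol <= p <= hi + tol
               for p, lo, hi in zip(point, bounds_min, bounds_max))


def clamp_to_bounds(point: Sequence[float],
                    bounds_min: Sequence[float],
                    bounds_max: Sequence[float]) -> Tuple[float, ...]:
    """Project a point onto the axis-aligned box defined by the bounds."""
    return tuple(min(max(p, lo), hi)
                 for p, lo, hi in zip(point, bounds_min, bounds_max))


# Values from the translation constraint example above.
bounds_min = (2.0, 0.5, 0.3)
bounds_max = (3.0, 1.5, 0.3)
target_ok = within_bounds((2.7, 1.2, 0.3), bounds_min, bounds_max)
clamped = clamp_to_bounds((3.4, 1.2, 0.3), bounds_min, bounds_max)
print(target_ok, clamped)
```

Whether the framework itself rejects or clamps an out-of-bounds target is not stated in this PR, so a check like this belongs on the caller's side.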

Using a Robot to Execute Manipulation

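from dimos.robot.robot import MockManipulationRobot
from dimos.skills.skills import SkillLibrary
from dimos.skills.manipulation.manipulate_skill import ManipulateSkill
from dimos.types.manipulation import TranslationConstraint, RotationConstraint
from dimos.types.vector import Vector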

# Initialize robot and skills
skill_library = SkillLibrary()
skill_library.register(ManipulateSkill)
robot = MockManipulationRobot(skill_library=skill_library)

# Create and execute a manipulation task
manipulation_skill = robot.get_skill(ManipulateSkill)

# Create constraint objects
translation_constraint = TranslationConstraint(
    translation_axis="y",
    reference_point=Vector(2.5, 1.0, 0.3),
    bounds_min=Vector(2.0, 0.5, 0.3),
    bounds_max=Vector(3.0, 1.5, 0.3)
)

rotation_constraint = RotationConstraint(
    rotation_axis="roll",
    start_angle=Vector(0, 0, 0),
    end_angle=Vector(90, 0, 0),
    pivot_point=Vector(2.5, 1.0, 0.3)
)

# Execute the skill with proper constraint objects
result = manipulation_skill(
    target_object="bottle",
    target_point=(150, 250),
    constraints=[translation_constraint, rotation_constraint]
)

Manipulation History

Creating a ManipulationHistory instance and Adding an Entry

from dimos.manipulation.manipulation_history import ManipulationHistory
from dimos.types.manipulation import ManipulationTask

# Initialize history (will load existing history if available)
history = ManipulationHistory(output_dir="/path/to/history")

# Create a task entry directly through history
entry = history.create_task_entry(
    task=task,  # a ManipulationTask instance, e.g. the one created above
    result={"status": "success", "execution_time": 2.5},
    agent_response="Successfully picked up the cup"
)

# Basic methods for retrieving entries
cup_entries = history.get_entries_by_object("cup")

Searching Manipulation History

The ManipulationHistory class can search stored tasks by task fields, metadata, constraint properties, and results:

from dimos.manipulation.manipulation_history import ManipulationHistory

# Initialize history
history = ManipulationHistory(output_dir="/path/to/history")

# Example 1: Search by object properties and metadata
# Find all tasks involving cups with position.z greater than 0.4
cup_entries = history.search(**{
    "task.target_object": "cup",
    "task.metadata.objects.*.position.z": (">", 0.4)
})

# Example 2: Search by constraint properties
# Find tasks with rotation constraints around roll axis with end angle of 90 degrees
rotation_tasks = history.search(**{
    "task.constraints.*.rotation_axis": "roll",
    "task.constraints.*.end_angle.x": 90
})

# Example 3: Search by time range and execution status
# Find successful tasks executed in the last hour
import time
current_time = time.time()
one_hour_ago = current_time - 3600

recent_success = history.search(**{
    "task.metadata.timestamp": (">", one_hour_ago),
    "result.status": "success"
})

Additional ManipulationHistory Search Parameters

Search keys use dot notation to reach any field of a history entry. A value is either matched directly or given as an (operator, operand) tuple for comparisons:

Time-Based Searches

  • Search for tasks after a specific time: search(**{"task.metadata.timestamp": ('>', start_time)})
  • Find tasks from the last 30 minutes: search(**{"task.metadata.timestamp": ('>=', time.time() - 1800)})
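  • Tasks between two timestamps: a Python dict literal keeps only the last value for a repeated key, so putting "task.metadata.timestamp" twice in one call applies only the second bound. Run one search per bound instead, search(**{"task.metadata.timestamp": ('>', start_time)}) and search(**{"task.metadata.timestamp": ('<', end_time)}), and keep the entries returned by both.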

Constraint Property Searches

  • Tasks with specific reference points: search(**{"task.constraints.*.reference_point.x": 2.5})
  • Tasks with specific rotation angles: search(**{"task.constraints.*.end_angle.x": 90})
  • Tasks with locked axes: search(**{"task.constraints.*.lock_x": True})
  • Tasks using a specific constraint type: search(**{"task.constraints.*.rotation_axis": "roll"})

Object and Result Searches

  • Tasks involving specific objects: search(**{"task.metadata.objects.*.label": "cup"})
  • Tasks with specific outcome: search(**{"result.status": "success"})
  • Tasks with specific errors: search(**{"result.error": "Collision"})
  • Tasks with execution time: search(**{"result.execution_time": ("<", 3.0)})

Wildcard and Nested Property Searches

  • Using * to match any object in a collection: search(**{"task.metadata.objects.*.position.z": ('>', 0.5)})
  • String contains matching (automatic for string values): search(**{"manipulation_response": "successfully"})
  • Deep nested property searches: search(**{"task.metadata.objects.bottle1.position.x": 2.5})

Combined Multi-Criteria Searches

  • All criteria must match: search(**{"task.target_object": "cup", "result.status": "success", "task.constraints.*.translation_axis": "y"})
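
The search behaviour described above (dot-notation paths, `*` wildcards, comparison tuples, substring matching for strings, and all criteria required) can be illustrated with a small matcher over nested dicts. This is a sketch under assumptions, not the implementation in ManipulationHistory: `resolve`, `matches`, and `search` are hypothetical names, and entries are plain dicts instead of ManipulationHistoryEntry objects.

```python
import operator
from typing import Any, Dict, Iterator, List

_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
        "<=": operator.le, "==": operator.eq}


def resolve(obj: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reachable by following path; '*' fans out over children."""
    if not path:
        yield obj
        return
    head, rest = path[0], path[1:]
    if head == "*":
        if isinstance(obj, dict):
            children = list(obj.values())
        elif isinstance(obj, (list, tuple)):
            children = list(obj)
        else:
            children = []
        for child in children:
            yield from resolve(child, rest)
    elif isinstance(obj, dict) and head in obj:
        yield from resolve(obj[head], rest)


def matches(value: Any, expected: Any) -> bool:
    """Compare one resolved value against one search criterion."""
    if (isinstance(expected, tuple) and len(expected) == 2
            and isinstance(expected[0], str) and expected[0] in _OPS):
        op, operand = expected
        try:
            return bool(_OPS[op](value, operand))
        except TypeError:  # e.g. comparing a string with a number
            return False
    if isinstance(expected, str) and isinstance(value, str):
        return expected.lower() in value.lower()  # substring match
    return value == expected


def search(entries: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Keep entries where every criterion matches at least one resolved value."""
    return [e for e in entries
            if all(any(matches(v, exp) for v in resolve(e, key.split(".")))
                   for key, exp in criteria.items())]


entries = [
    {"task": {"target_object": "cup",
              "metadata": {"objects": {"cup1": {"position": {"z": 0.5}}}}},
     "result": {"status": "success"}},
    {"task": {"target_object": "bottle",
              "metadata": {"objects": {"b1": {"position": {"z": 0.2}}}}},
     "result": {"status": "failed"}},
]
hits = search(entries, **{"task.target_object": "cup",
                          "task.metadata.objects.*.position.z": (">", 0.4)})
print(len(hits))
```

Passing the criteria as `**{...}` is needed because keys such as `task.target_object` are not valid Python identifiers and cannot be written as ordinary keyword arguments.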

How Manipulation Integrates with ObjectDetectionStream

The manipulation system integrates with object detection systems to track and manipulate detected objects:

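import time

from dimos.types.manipulation import ManipulationTask

# Get detected objects from the robot's vision system
# (the result is a List[ObjectData]; `robot` is the instance created earlier)
detected_objects = robot.manipulation_interface.get_latest_objects()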

# Create manipulation metadata with detected objects
objects_data = {}
for obj in detected_objects:
    obj_id = obj.get("object_id", str(time.time()))
    objects_data[obj_id] = dict(obj)
    
metadata = {"timestamp": time.time(), "objects": objects_data}

# Create task with the detected objects
task = ManipulationTask(
    description="Pick up detected cup",
    target_object="cup",
    target_point=detected_objects[0]["center_point"],
    metadata=metadata
)
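
The example above targets `detected_objects[0]`, which fails on an empty list and may not be the cup. A small selection step, sketched below, picks the most confident detection with the requested label. `pick_target` is a hypothetical helper, and it assumes each detection is dict-like with the `label`, `confidence`, and `center_point` keys used elsewhere in this description.

```python
from typing import Any, Dict, List, Optional


def pick_target(detections: List[Dict[str, Any]],
                label: str,
                min_confidence: float = 0.5) -> Optional[Dict[str, Any]]:
    """Return the highest-confidence detection with the given label, or None."""
    candidates = [d for d in detections
                  if d.get("label") == label
                  and d.get("confidence", 0.0) >= min_confidence]
    return max(candidates, key=lambda d: d.get("confidence", 0.0), default=None)


# Illustrative detections; real ones come from get_latest_objects().
detections = [
    {"object_id": 1, "label": "bottle", "confidence": 0.97, "center_point": (150, 250)},
    {"object_id": 2, "label": "cup", "confidence": 0.62, "center_point": (90, 180)},
    {"object_id": 3, "label": "cup", "confidence": 0.95, "center_point": (100, 200)},
]
best_cup = pick_target(detections, "cup")
print(best_cup["center_point"] if best_cup else "no cup detected")
```

The returned detection's `center_point` can then be passed as `target_point`, and a `None` result gives the caller a clear place to skip or retry.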

return constraint

# TODO: Implement
def _execute_manipulation(self, constraint: ManipulationConstraint) -> Dict[str, Any]:
Contributor Author

@spomichter spomichter May 29, 2025

@yashas-salankimatt @atharva620 Where y'all would integrate with constraint executor / motion planning

@spomichter spomichter changed the title from "Rotation, Translation, and Force manipulation constraints and ManipulationSkills implemented" to "Significant Manipulation updates, Added masks to ObjectDetectionStream" Jun 2, 2025
alexlin2 previously approved these changes Jun 2, 2025
@spomichter spomichter changed the title from "Significant Manipulation updates, Added masks to ObjectDetectionStream" to "DimOS Manipulation Framework, ObjectDetectionStream Changes" Jun 2, 2025
@spomichter spomichter merged commit 77047ee into dev Jun 2, 2025
14 checks passed
@spomichter spomichter deleted the manipulation_constraints branch June 2, 2025 23:32
spomichter added a commit that referenced this pull request Oct 28, 2025
Release v0.0.5


## What's Changed
* Unitree WebRTC implementation on rebased dev by @leshy in #277
* Update ros_observable_topic timeout to 100s by @leshy in #273
* Updated README, more clear on API key requirements and updated go2_ros2_sdk remote by @spomichter in #272
* Release v0.0.4 Patch: readme changes by @spomichter in #292
* Readme patch v0.0.4 by @spomichter in #293
* Development container & CI by @leshy in #278
* env/devcontainer ruff formatting/typing by @leshy in #294
* Global reformat 100 line length  by @spomichter in #300
* Global code reformat with ruff by @leshy in #295
* Position/Vector type cleanup & tests by @leshy in #297
* Linelength100 by @leshy in #301
* Auto-delivery of binary data files for testing, rewrite of dev script by @leshy in #298
* pre-commit hooks in dev container & CI, automatic LFS upload by @leshy in #303
* Removed all submodules - Testing by @spomichter in #306
* Fixed v0.0.4 Unitree ROS runfile broken by WebRTC development, Vector.py fixes by @spomichter in #307
* test/mapper by @leshy in #305
* Reduced CI cleanup frequency to PRs only into dev/main by @spomichter in #312
* DimOS Manipulation Framework, ObjectDetectionStream Changes by @spomichter in #308
* Added auto-license header to pre-commit by @spomichter in #336
* Move thread fix for alex planner by @leshy in #334
* base typing cleanup, sensor reply tests+docs by @leshy in #309
* devcontainer docs by @leshy in #338
* ci docs by @leshy in #339
* Add Cerebras Agent by @joshuajerin in #310
* Repo cleanup by @leshy in #340
* noros builds by @leshy in #341
* Update testing_stream_reply.md by @leshy in #342
* ONNX conversions for YOLOv11 and FastSAM by @mdaiter in #350
* Test cicd fake ros change by @spomichter in #361
* Reverted cleanup workflow frequency to on any PUSH due to CICD docker workflow issues by @spomichter in #360
* Trigger docker ros rerun by @spomichter in #363
* Ros CI change detection by @leshy in #364
* trigger full rebuild by @leshy in #365
* Add CLIP ONNX conversion and support, with passing vision and text tests by @mdaiter in #353
* CI fix 3 by @leshy in #367
* ONNX Support for YOLO, SAM2 + Unit tests for CLIP, YOLO, SAM2 by @spomichter in #345
* LFS moved to utils from testing by @leshy in #368
* Contact graspnet integration on pytorch and pyproject build processes setup with cuda/manipulation tags by @spomichter in #370
* data/* deletions by @leshy in #369
* Ci pre-commit and docker builds run in parallel by @leshy in #372
* Ci shared docker cache by @leshy in #371
* Unitree WebRTC integrated with full functionality, remove all ROS dependency, refactored entire robot base class and connection interface, added explore skill by @alexlin2 in #279
* Unitree WebRTC only implementation, Exploration skills [Staging --> Dev] by @spomichter in #379
* Dask lcm multiprocess by @leshy in #377
* DimOS Packaging & Build Improvements for CPU-only, CUDA, Manipulation installations by @spomichter in #394
* Multitree go2 by @leshy in #381
* better LCM system checks, fixes bin/lfs_push by @leshy in #382
* UnitreeSpeak skill over webrtc, Voice Interface added on localhost, Voice interface on mobile device on network by @spomichter in #400
* FIX: multiprocess by @leshy in #402
* Lcmspy cli by @leshy in #404
* changed position type name to pose by @alexlin2 in #358
* WIP: foxglove bridge stub by @leshy in #411
* Create running_without_devcontainer.md by @leshy in #405
* new LCM class format support by @leshy in #417
* Fixed PoseStamped ros_msgs error in dimos-lcm by @spomichter in #457
* Fixes move stream issue, Odom receive issue by @leshy in #456
* Small stream/type fixes for unitree by @leshy in #460
* Local planner, Global Planner, Explore, SpatialMemory working via LCM/Dask Multiprocess by @spomichter in #467
* Added working runfile to Unitreego2Light class by @spomichter in #474
* Point Cloud Filtering and Segmentation, Full 6DOF Object pose estimation, Grasp generation, ZED driver support, Hosted grasp integration by @spomichter in #458
* Stream fixes, Twist, Pose, Quaternion updates by @leshy in #471
* Added self-hosted runner to full CICD by @spomichter in #484
* Full Unitree (Local planner, Explore, SpatialMemory) FakeRTC/WebRTC LCM modules working in self-hosted devcontainer  by @spomichter in #487
* Porting types/ LCM msgs/ new LCM types, Transform visualization by @leshy in #477
* Tracking streams lcm dask refactor by @spomichter in #488
* Pytransforms by @leshy in #491
* Fix python and dev docker builds for CICD by @spomichter in #489
* Remove PIL Image Usage by @alexlin2 in #490
* Added missing __init__.py's to transforms  by @spomichter in #493
* Added tofix pytest tag back to addopts by @spomichter in #494
* Added module docs by @spomichter in #495
* SpatialMemory converted to Dask module, input LCM odom and video streams by @spomichter in #481
* Run modules tests only on 16gb runner by @spomichter in #499
* Trigger CI only on PR or push to main/dev by @spomichter in #500
* Added more aggressive cleanup workflows by @spomichter in #501
* Visual Servoing for Pick and Place Demo by @alexlin2 in #476
* Testing run-tests container pull fix and removed modules tests by @spomichter in #505
* Fix permissions in pre-build-cleanup by @spomichter in #508
* Moved pre-build cleanup to build template by @spomichter in #509
* dimos lcm update to main branch latest commit by @leshy in #498
* RPC Kwargs by @leshy in #503
* Transform system, stream convinience features, type checking by @leshy in #504
* Dimoslcm bump by @leshy in #510
* Testing UV builds in docker by @spomichter in #513
* OccupancyGrid, Path types by @leshy in #511
* subscribing to transports/streams from main loop by @leshy in #524
* Alex Lin's version of ROS Nav2 by @alexlin2 in #514
* Agent refactor conversation history by @spomichter in #541
* Exposed optional memory_limit param in dimos core by @spomichter in #540
* Agent refactor by @spomichter in #535
* Validating transforms with ros examples by @leshy in #538
* rpc timeout by @leshy in #542
* MuJoCo Simulation by @paul-nechifor in #539
* Revert "MuJoCo Simulation" by @spomichter in #548
* perception refactor to be on parity with old architecture by @alexlin2 in #534
* Skill coordinator by @leshy in #536
* WIP Mujoco simulation by @paul-nechifor in #549
* Fix event loop leak by @paul-nechifor in #547
* Correct way to build package directly in non-editable mode, no manife… by @spomichter in #551
* Office environment mujoco by @paul-nechifor in #554
* Less bandwidth usage on LCM, bug fixed with navigation by @alexlin2 in #559
* disabled old agent tests by @leshy in #563
* Camera Module Refactor, added image rectification by @alexlin2 in #566
* long rpc timeout by @leshy in #569
* Twist message for all move command, added keyboard teleop for easy robot control in sim by @alexlin2 in #570
* numerical sort for sensor replay by @leshy in #564
* 2d detection module by @leshy in #567
* Stream timestamp alignment by @leshy in #557
* Sharpness for Images by @leshy in #560
* Jetson humanoid integration by @spomichter in #590
* 2d detection module + Agent2 - yolo demo by @leshy in #582
* jetson.md cleanup by @spomichter in #602
* Unitree b1 integration with continuous cmd_vel Twist interface, joystick control for testing, C++ UDP server for onboard B1 by @spomichter in #601
* Joystick integrated g1 humanoid by @spomichter in #603
* Unitree b1 manipulation pose integration by @spomichter in #604
* use SHM in Foxglove by @paul-nechifor in #607
* CPU isolated shared mem by @mdaiter in #589
* silence unnecessary unitree go 2 tricks by @paul-nechifor in #615
* Pshm to lcm by @paul-nechifor in #616
* Unitree agents2 skill integration paul by @paul-nechifor in #617
* Unitree go2 runfile integration tool call issues by @spomichter in #605
* gstreamer camera by @paul-nechifor in #613
* zed local node by @leshy in #623
* ROS Bridge for Unitree G1 and B1 Navigation, Working G1 navigation by @spomichter in #610
* B1 ros navigation rebase by @spomichter in #626
* Added build directory to gitignore by @yashas-salankimatt in #628
* 2D detection module + Pointcloud localization by @leshy in #583
* Camera calibration loading by @leshy in #629
* Agent2 nav skills by @paul-nechifor in #630
* WIP shared mem again by @paul-nechifor in #650
* Fix leaks by @paul-nechifor in #649
* Fix SHM leak by @paul-nechifor in #652
* Suppress echos with counter by @paul-nechifor in #653
* Removing websocket vis causing crazy lag by @spomichter in #656
* Suppress with UUID by @paul-nechifor in #655
* Modules navigate object bbox by @spomichter in #654
* Ros bridge test fix by @alexlin2 in #660
* video g1 spatial mem + detection - tomerge by @leshy in #651
* Update README.md by @spomichter in #664
* Image upgrades! Impls for CUDA + numpy, along with an abstraction and full backwards compatibility by @mdaiter in #612
* Revert "Image upgrades! Impls for CUDA + numpy, along with an abstraction and full backwards compatibility" by @leshy in #665
* Detection second pass by @leshy in #662
* CudaImage by @spomichter in #671
* Add start/stop to all modules and other resources by @paul-nechifor in #675
* forgotten context managers by @paul-nechifor in #676
* CUDAImage, NumpyImage, Image implementations with robust backend tests for image operations by @spomichter in #680
* CudaImage by @spomichter in #677
* alibaba env var fix by @leshy in #673
* Rename FakeRTC --> ReplayRTC by @spomichter in #681
* Fix websocketvis performance rebase by @spomichter in #682
* Alexl ros nav intergration by @alexlin2 in #632
* detection pipeline rewrite, embedding, vl model standardization, reid system by @leshy in #674
* cli tooling theme by @leshy in #687
* Fix spatial memory bug in g1  by @spomichter in #689
* Add autoconnect back2 by @paul-nechifor in #684
* Add ability to remap module connections name. by @paul-nechifor in #698
* Add transport which encodes images as JPEG to improve performance. by @paul-nechifor in #693
* New Ruff autofixes by @paul-nechifor in #694

## New Contributors
* @joshuajerin made their first contribution in #310
* @mdaiter made their first contribution in #350
* @yashas-salankimatt made their first contribution in #628

**Full Changelog**: https://github.com/dimensionalOS/dimos/commits/v0.0.5
spomichter added a commit that referenced this pull request Jan 8, 2026
DimOS Manipulation Framework, ObjectDetectionStream Changes

Changes
- Added manipulation_constraint.py with AbstractConstraint base class and three concrete constraint types (Translation, Rotation, Force)
- Implemented RotationConstraintSkill for controlling rotation axes, angles, and pivot points
- Implemented TranslationConstraintSkill for bounded/unbounded translations with reference points
- Implemented ForceConstraintSkill for force-limited manipulations with min/max forces and direction
- Created generic AbstractManipulationSkill
- Created Manipulate AbstractManipulationSkill that combines constraint types for executing complex manipulation tasks
- Added testfile with Manipulation development interface with streaming of VLM ManipulationPoint locations on RGB images
- Created ManipulationInterface
- Created ManipulationHistory with pytest unit test

Former-commit-id: 77047ee
paul-nechifor pushed a commit that referenced this pull request Jan 8, 2026
DimOS Manipulation Framework, ObjectDetectionStream Changes

Former-commit-id: 77ed416 [formerly 77047ee]
Former-commit-id: 1f04ee3
jeff-hykin pushed a commit that referenced this pull request Jan 9, 2026
DimOS Manipulation Framework, ObjectDetectionStream Changes

Former-commit-id: 7be7469 [formerly 77047ee]
Former-commit-id: 1f04ee3