Feature: Detection3D and Object support to Manipulation Module #1217
Conversation
Greptile Overview

Greptile Summary: Adds Detection3D and Object support to the manipulation module for perception-driven motion planning.
Confidence Score: 4/5
Sequence Diagram

```mermaid
sequenceDiagram
    participant OSR as ObjectSceneRegistration
    participant MM as ManipulationModule
    participant WM as WorldMonitor
    participant WOM as WorldObstacleMonitor
    participant DW as DrakeWorld
    participant TF as TF Publisher Thread
    participant Client as ManipulationClient

    Note over OSR,MM: Perception Integration Flow
    OSR->>OSR: Deduplicate detections via ObjectDB
    OSR->>MM: objects.publish(Object[])
    MM->>WM: on_objects(objects)
    WM->>WOM: on_objects(objects)
    WOM->>WOM: Cache objects with stable object_id

    Note over Client,DW: User Workflow
    Client->>MM: refresh_obstacles(min_duration)
    MM->>WM: refresh_obstacles(min_duration)
    WM->>WOM: refresh_obstacles(min_duration)
    WOM->>WOM: Filter cached objects by duration
    WOM->>WOM: pointcloud_to_convex_hull_obj()
    WOM->>DW: add_obstacle(Obstacle{MESH/BOX})
    DW->>DW: Create Convex or Box shape
    WOM-->>Client: List of added obstacles

    Note over MM,TF: TF Publishing (10Hz Thread)
    loop Every 0.1s
        TF->>WM: get_ee_pose(robot_id)
        WM->>WM: scratch_context()
        WM->>DW: get_ee_pose(ctx, robot_id)
        DW-->>TF: PoseStamped
        TF->>WM: get_link_pose(robot_id, link)
        WM->>WM: scratch_context()
        WM->>DW: get_link_pose(ctx, robot_id, link)
        DW-->>TF: PoseStamped
        TF->>TF: tf.publish(Transform[])
    end

    Note over Client,DW: Motion Planning with Obstacles
    Client->>MM: goto_object("cup", dz=0.1)
    MM->>MM: Match object from cache
    MM->>MM: plan_pose(x, y, z+offset)
    MM->>DW: check_collision_free()
    DW->>DW: Check against obstacles (MESH/BOX)
    MM-->>Client: Plan result
    Client->>MM: execute()
    MM->>MM: task_invoke("traj_arm", "execute", trajectory)
```
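The 10 Hz TF-publishing thread in the diagram can be sketched as below. The `world_monitor` and `tf` objects here are hypothetical stand-ins for the dimos `WorldMonitor` and TF publisher (only the loop structure and rate come from the PR; the method names mirror the diagram):

```python
import threading
import time


def tf_publisher_loop(world_monitor, tf, robot_id, extra_links, stop: threading.Event):
    """Publish EE and extra-link poses at ~10 Hz until `stop` is set.

    `world_monitor` and `tf` are stand-ins for the dimos WorldMonitor and
    TF publisher; only the loop structure is meant literally.
    """
    while not stop.is_set():
        # Gather the end-effector pose plus any configured extra links
        transforms = [world_monitor.get_ee_pose(robot_id)]
        for link in extra_links:
            transforms.append(world_monitor.get_link_pose(robot_id, link))
        tf.publish(transforms)
        time.sleep(0.1)  # 10 Hz
```

Running this on a dedicated thread keeps TF publishing decoupled from planning, matching the "TF Publishing (10Hz Thread)" lane above.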
Force-pushed from 82e6db4 to d071ef6
Force-pushed from f51144b to 6c77e02
* fix xarm adapter gripper method
* exposed adapter as a property to control coordinator; this enables custom method implementation
* RPC calls for gripper control added to control coordinator
* gripper RPC methods added to manipulation module
* updated manipulation client
Force-pushed from d071ef6 to 9431928
…ator' into feature/mustafa-detection3d-pcd-manipulation
…izer (#1227)
* with separate preview urdf
* running meshcat on its dedicated thread allows for real-time preview updates
* added meshcat viz executor shutdown
* removed sleep on meshcat thread; now time.sleep is only called in the rpc thread
* preview urdf is now persistent and does not disappear after preview
* wrapped meshcat thread executor in its own class
```python
def pointcloud_to_convex_hull_obj(
    points: NDArray[np.float64],
```
why are you keeping points as numpy arrays? You should use PointCloud2, which is stored in o3d and has very fast native convex hull calculations, so this function would be one line; or you could add .to_convex_hull on the PointCloud2 class?
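Since the actual `PointCloud2` wrapper isn't shown in the excerpt, here is a minimal sketch of the one-liner the reviewer has in mind, using `scipy.spatial.ConvexHull` so it is self-contained (the open3d equivalent, `o3d.geometry.PointCloud.compute_convex_hull()`, is noted in the docstring):

```python
import numpy as np
from scipy.spatial import ConvexHull


def pointcloud_to_convex_hull(points: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return (hull vertex coordinates, triangle indices into `points`)
    for an (N, 3) point cloud.

    Open3D equivalent suggested in the review:
        mesh, _ = o3d.geometry.PointCloud(
            o3d.utility.Vector3dVector(points)).compute_convex_hull()
    """
    hull = ConvexHull(points)
    return points[hull.vertices], hull.simplices
```

Either route collapses the hand-rolled hull construction to one call; the trade-off is an open3d (or scipy/qhull) dependency at this layer.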
```python
# These must be imported at runtime (not TYPE_CHECKING) for In/Out port creation
from dimos.msgs.sensor_msgs import JointState
from dimos.msgs.trajectory_msgs import JointTrajectory
from dimos.perception.detection.type.detection3d.object import Object as DetObject  # noqa: TC001
```
not a huge fan of this Object type; since this seems to be a current top-level interface for the world, it is a potential issue. Not reviewing it in detail; feel free to own/rewrite it.
```python
def get_link_pose(
    self, ctx: Any, robot_id: WorldRobotID, link_name: str
) -> NDArray[np.float64]:
```
why are you returning np arrays and not poses/transforms? As long as your external interface is dimos types I don't care, but you will suffer.
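A minimal sketch of the conversion the reviewer is asking for: wrap the raw 4x4 numpy transform in a typed pose at the interface boundary. `PoseStamped` here is a hypothetical stand-in (the real dimos `PoseStamped` already appears in the sequence diagram above, but its fields are not shown in this excerpt):

```python
from dataclasses import dataclass

import numpy as np
from scipy.spatial.transform import Rotation


@dataclass
class PoseStamped:  # hypothetical stand-in for the dimos PoseStamped type
    frame_id: str
    position: np.ndarray     # (3,) xyz translation
    orientation: np.ndarray  # (4,) quaternion, xyzw order


def matrix_to_pose(T: np.ndarray, frame_id: str) -> PoseStamped:
    """Convert a 4x4 homogeneous transform into a typed pose."""
    return PoseStamped(
        frame_id=frame_id,
        position=T[:3, 3].copy(),
        orientation=Rotation.from_matrix(T[:3, :3]).as_quat(),  # xyzw
    )
```

Doing this at the `get_link_pose` boundary keeps numpy arrays internal while callers only ever see dimos-style types.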
```python
coordinator_task_name: str | None = None
gripper_hardware_id: str | None = None
# TF publishing for extra links (e.g., camera mount)
tf_extra_links: list[str] = field(default_factory=list)
```
shouldn't this be a list of actual transforms, from a particular existing arm frame to a new one (like Joint3 -> Joint_Camera)? I can have multiple cameras, etc. Why strings? They don't encode actual spatial relationships.
Merged 285351e into feature/mustafa-add-gripper-control-for-control-coordinator
Problem
- Detected 3D objects from perception cannot be used as collision obstacles for motion planning.
- ManipulationModule lacks the TF publishing needed for eye-in-hand camera transform chains.
Solution
- ObjectSceneRegistration publishes deduplicated objects via a new objects output port with pointclouds.
- ManipulationModule subscribes async via observable() and caches objects in WorldObstacleMonitor.
- A dedicated 10Hz TF thread publishes EE and tf_extra_links poses using scratch_context().
- Eager self.tf init prevents autoconf() blocking in Dask workers.
- WorldObstacleMonitor syncs BOX obstacles to Drake on refresh(), with opt-in convex hull meshes.
- pointcloud_to_convex_hull_obj centers points at the origin to avoid double-transforming with the pose.
- ManipulationClient adds refresh(), detections(), goto_object(), and perception status commands.
- xarm_perception blueprint wires xArm7 + eye-in-hand RealSense + ObjectSceneRegistration + Foxglove.
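The centering behavior of `pointcloud_to_convex_hull_obj` mentioned above can be sketched as follows. The function name `center_points` is an illustration, not the PR's API; the idea is that the hull mesh is expressed in an obstacle-local frame and the centroid becomes the obstacle pose translation, so the world transform is applied exactly once:

```python
import numpy as np


def center_points(points: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Center an (N, 3) cloud at the origin; return (centered points, centroid).

    The centroid becomes the obstacle pose's translation, so the mesh
    vertices are expressed once in the obstacle frame and the world
    transform is applied only via the pose (no double transform).
    """
    centroid = points.mean(axis=0)
    return points - centroid, centroid
```

Without this step, vertices already expressed in world coordinates would be shifted again by the obstacle pose, placing the hull at roughly twice its distance from the origin.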
Breaking Changes
None.
How to Test
closes DIM-353
closes DIM-397
closes DIM-396