This repository provides Spatial Concepts-based Prompts with Large Language Models for Robot Action Planning (SpCoRAP).
- Maintainer: Shoichi Hasegawa (hasegawa.shoichi@em.ci.ritsumei.ac.jp)
- Author: Shoichi Hasegawa (hasegawa.shoichi@em.ci.ritsumei.ac.jp)
Required:
- Ubuntu: 20.04
- ROS: Noetic
- Python: 3.8
Confirmed Environment:
- Ubuntu: 20.04 LTS
- ROS: Noetic
- Python: 3.8.10
git clone https://gitlab.com/general-purpose-tools-in-emlab/probabilistic-generative-models/spatial-concept-models/em_spcorap.git
cd em_spcorap
pip install -r requirements.txt
For LLM-based planning, this study uses FlexBE (a SMACH-based behavior engine). Clone and build the following two repositories:
This section explains the spatial concept model programs used in this study. The process consists of four major steps: data collection, data conversion, learning, and inference.
In the simulation experiment of this study, a robot is placed in a home environment built in Gazebo, and data is collected with rosbag. The map itself should be created using your robot’s mapping tool.
A portion of the rosbag files used in this study is available as samples. These can be used to verify the operation of SpCoRAP: https://drive.google.com/drive/folders/15hbax2xDYlx29VukOD4_np8CygtVAbcm?usp=sharing
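To quickly check what a downloaded sample bag contains before converting it, you can list its topics with the standard `rosbag` Python API. This is only a minimal sketch; the bag file name below is an example:

```python
# Minimal sketch: list the topics, message types, and message counts in one sample bag.
# "bathroom_sink_1.bag" is just an example name; point it at any downloaded bag.
import rosbag

with rosbag.Bag("bathroom_sink_1.bag") as bag:
    info = bag.get_type_and_topic_info()
    for topic, topic_info in info.topics.items():
        print(f"{topic}: {topic_info.msg_type} ({topic_info.message_count} messages)")
```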
Usage of convertion_from_rosbag_to_img_for_spco_dataset (for generating image feature data):
- Place the rosbag folder under `/convertion_from_rosbag_to_img_for_spco_dataset/data` with the following structure:
rosbag
├ bathroom_sink
│ ├ bathroom_sink_1.bag
│ ├ ...
├ bed
│ ├ bed_1.bag
│ ├ ...
├ bedroom_closet
│ ├ bedroom_closet_1.bag
│ ├ ...
- Run:
python main.py
- The folders `data/video` and `data/image` are generated. The first image of each bag file is stored in `data/reconstructed_image` in the order bathroom_sink → bed → bedroom_closet. These reconstructed images are then used to extract the image features stored in `data/img` (see the sketch below).
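For reference, the idea behind the reconstructed images can be sketched as follows. This is not the repository's `main.py`; the image topic name and the relative paths are assumptions, so adapt them to your bags:

```python
# Sketch only: save the first image message of each bag as data/reconstructed_image/<n>.png.
# Run from convertion_from_rosbag_to_img_for_spco_dataset; the topic name is an assumption.
import glob
import os

import cv2
import rosbag
from cv_bridge import CvBridge

IMAGE_TOPIC = "/hsrb/head_rgbd_sensor/rgb/image_raw"  # assumed camera topic; adjust to your robot
bridge = CvBridge()

os.makedirs("data/reconstructed_image", exist_ok=True)
# Sorted glob yields bathroom_sink -> bed -> bedroom_closet, matching the order above.
for i, bag_path in enumerate(sorted(glob.glob("data/rosbag/*/*.bag")), start=1):
    with rosbag.Bag(bag_path) as bag:
        for _, msg, _ in bag.read_messages(topics=[IMAGE_TOPIC]):
            frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
            cv2.imwrite(f"data/reconstructed_image/{i}.png", frame)
            break  # keep only the first image of this bag
```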
Usage of convertion_from_rosbag_to_pose_for_spco_dataset (for generating position data):
- Place the rosbag folders under `/convertion_from_rosbag_to_pose_for_spco_dataset/data` in the same structure as above.
- Run:
python rosbag2pose_ieee_access2025.py
- `data/position_exp.csv` is generated, containing five rows for each class (bathroom_sink → bed → bedroom_closet). A `data/position` directory is also created, containing each position as a separate CSV file (see the sketch below).
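The position conversion can be sketched in the same spirit. This is not `rosbag2pose_ieee_access2025.py`; the localization topic and the choice of one representative pose per bag are assumptions:

```python
# Sketch only: write one (x, y) position per bag into data/position_exp.csv.
# Run from convertion_from_rosbag_to_pose_for_spco_dataset; the pose topic is an assumption.
import csv
import glob

import rosbag

POSE_TOPIC = "/amcl_pose"  # assumed localization topic (PoseWithCovarianceStamped)

rows = []
for bag_path in sorted(glob.glob("data/rosbag/*/*.bag")):
    with rosbag.Bag(bag_path) as bag:
        for _, msg, _ in bag.read_messages(topics=[POSE_TOPIC]):
            position = msg.pose.pose.position
            rows.append([position.x, position.y])
            break  # one representative pose per bag

with open("data/position_exp.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```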
Usage of convertion_from_image_to_boo_for_spco_dataset:
- Place reconstructed images under:
reconstructed_image
├ 1.png
├ 2.png
├ ...
├ 15.png
- Launch object detector:
roslaunch detic_ros node.launch
- Run:
python spco2_object_features_detic_ver.py
- `data/detect_image` and `data/tmp_boo` are generated (the sketch below illustrates the bag-of-objects idea).
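For intuition, a bag-of-objects (BoO) feature can be built as a count vector over an object vocabulary. The vocabulary and detections below are illustrative only and do not reflect the actual `detic_ros` output format:

```python
# Sketch only: turn a list of detected object labels for one image into a BoO count vector.
import numpy as np

object_vocab = ["toothbrush", "cup", "pillow", "book", "hanger"]  # assumed vocabulary
detected_labels = ["cup", "toothbrush", "cup"]                    # example detections for one image

boo = np.zeros(len(object_vocab), dtype=int)
for label in detected_labels:
    if label in object_vocab:
        boo[object_vocab.index(label)] += 1

print(boo)  # -> [1 2 0 0 0]
```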
- Place `place_word_list.csv` in the `data` folder (the first column is the list of place names).
- Run:
python spco2_word_generator_from_utterance.py
- `data/utterance` and `data/tmp` are generated (see the sketch below).
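Similarly, the place-word features can be thought of as bag-of-words counts over the names in `place_word_list.csv`. The sketch below is not `spco2_word_generator_from_utterance.py`; the utterance and the naive substring matching are examples only:

```python
# Sketch only: count place-name occurrences in one teaching utterance.
import csv

import numpy as np

with open("data/place_word_list.csv") as f:
    place_words = [row[0] for row in csv.reader(f) if row]  # first column: place names

utterance = "this is the bathroom sink"  # example teaching utterance
bow = np.zeros(len(place_words), dtype=int)
for i, word in enumerate(place_words):
    if word in utterance:  # naive substring match, for illustration only
        bow[i] += 1

print(dict(zip(place_words, bow)))
```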
- Adjust the paths in `__init__.py`, `spco2_learn_concepts_non_gmapping.py`, and `spco2_visualizer.py` to your environment.
- Store the converted data in the paths specified inside `spco2_learn_concepts_non_gmapping.py`.
- Run the learning:
python spco2_learn_concepts_non_gmapping.py
- The learning results are saved under `/spco2_boo/rgiro_spco2_slam/data/output/test/max_likelihood_param/`; the final-step parameters in `max_likelihood_param` can be used for inference.
- Visualize the learned results with `python spco2_visualizer.py`, which displays the Gaussian distributions in Rviz (see the sketch below).
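Each learned position distribution is a 2D Gaussian (a mean and a covariance), and the visualizer draws these in Rviz. The standalone sketch below, with made-up numbers, shows the same idea as a matplotlib ellipse:

```python
# Sketch only: draw a 2D Gaussian position distribution as a 2-sigma ellipse.
# The mean and covariance here are made up, not learned parameters.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

mu = np.array([1.0, 2.0])                   # example mean position (x, y)
sigma = np.array([[0.3, 0.1],
                  [0.1, 0.2]])              # example covariance

vals, vecs = np.linalg.eigh(sigma)          # principal axes of the Gaussian
order = vals.argsort()[::-1]                # sort eigenvalues in descending order
vals, vecs = vals[order], vecs[:, order]
angle = np.degrees(np.arctan2(vecs[1, 0], vecs[0, 0]))
width, height = 2 * 2 * np.sqrt(vals)       # full axis lengths of the 2-sigma ellipse

fig, ax = plt.subplots()
ax.add_patch(Ellipse(mu, width, height, angle=angle, fill=False, color="red"))
ax.plot(mu[0], mu[1], "r+")
ax.set_xlim(-1, 4)
ax.set_ylim(0, 5)
plt.show()
```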
Usage of inference_object_to_position_dist_index.py:
- Place the following learned parameter files into `/crossmodal_inference_for_spco/data/params` (a sanity-check sketch follows this subsection):
index.csv
mu.csv
Object_W_list.csv
particle0.csv
phi.csv
pi.csv
sig.csv
theta.csv
W_list.csv
W.csv
Xi.csv
- Run:
python inference_object_to_position_dist_index.py
- The inference results are saved to `data/result/result_object_2_position_dist_index.csv`.
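Before running the inference scripts, it can help to confirm that the parameter files listed above are present and loadable. The sketch below (run from `crossmodal_inference_for_spco`) only checks file presence and rough shape; it does not reproduce the inference itself:

```python
# Sketch only: verify the learned parameter CSVs exist under data/params and report their shapes.
import os

import numpy as np

PARAM_DIR = "data/params"
PARAM_FILES = [
    "index.csv", "mu.csv", "Object_W_list.csv", "particle0.csv", "phi.csv",
    "pi.csv", "sig.csv", "theta.csv", "W_list.csv", "W.csv", "Xi.csv",
]

for name in PARAM_FILES:
    path = os.path.join(PARAM_DIR, name)
    if not os.path.exists(path):
        print(f"missing: {name}")
        continue
    try:
        data = np.loadtxt(path, delimiter=",", dtype=str)  # load as text just to check the layout
        print(f"{name}: shape {np.atleast_2d(data).shape}")
    except ValueError as e:  # some files may not be plain rectangular CSV matrices
        print(f"{name}: could not parse as a CSV matrix ({e})")
```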
Usage of inference_place_word_to_position_dist_index.py:
- Store the same set of parameter files in the params folder.
- Run:
python inference_place_word_to_position_dist_index.py
- The results are saved to `data/result/result_place_word_2_position_dist_index.csv`.
`spcorap_planner_for_distribution.py` is the main script that runs on FlexBE. Place the following files in the folders specified inside the script (all except the OpenAI API key are included as samples):
- OPENAI_API_KEY.key
- robot_behavior_info.yml
- prompt.txt
- W_list.csv
- pi.csv
- phi.csv
- W.csv
- Object_W_list.csv
- Xi.csv
- Launch FlexBE:
roslaunch flexbe_app_default.launch
- In the Behavior Dashboard, select Load Behavior
- Choose spcorap_for_distribution
- The state machine will appear in the StateMachine Editor
- Start execution in the Runtime Control panel by clicking Start Execution (green button)
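For reference, the LLM side of the planner can be sketched as reading the API key from `OPENAI_API_KEY.key` and the prompt from `prompt.txt`, then querying the OpenAI chat completions API. This is not the code inside `spcorap_planner_for_distribution.py`; the model name and the user instruction below are placeholders:

```python
# Sketch only: query an LLM with the prompt file and an example instruction.
from openai import OpenAI

with open("OPENAI_API_KEY.key") as f:
    api_key = f.read().strip()
with open("prompt.txt") as f:
    prompt = f.read()

client = OpenAI(api_key=api_key)
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name; use whichever model your account provides
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": "Put away the toothbrush."},  # example instruction
    ],
)
print(response.choices[0].message.content)
```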
This study uses Toyota’s Human Support Robot (HSR). Action skill programs are based on libraries distributed to HSR users and cannot be publicly released.
However, open-source HSR-related programs are available and can be used to implement navigation and manipulation:
The object detection program is based on the following repository:
The prompts folder contains the following prompt files. Use them as needed:
- commonsense_placement_prompt_for_object_placement_real.txt
Prompt for real-world object placement using commonsense reasoning.
- commonsense_placement_prompt_for_object_placement_sim.txt
Prompt for simulated object placement using commonsense reasoning.
- prompt.txt
Prompt for the NLMap method combined with an LLM.
- prompt.txt
Prompt for the SpCoRAP-based method.
- referring_place_name_instruction_for_put_away.txt
Instruction prompt referring to place names for put-away tasks.
- referring_place_name_instruction_for_search.txt
Instruction prompt referring to place names for search tasks.
- referring_surrounding_objects_instruction_for_put_away.txt
Instruction prompt referring to surrounding objects for put-away tasks.
- referring_surrounding_objects_instruction_for_search.txt
Instruction prompt referring to surrounding objects for search tasks.