ModuBot: Modular LLM-Driven Multi-Persona Robot Framework

Demo Videos: Watch demo videos on Google Drive

A modular, extensible ROS 2 framework for controlling humanoid robots with LLM-powered conversation, persona-aware behavior, and synchronized gesture-speech actions. Demonstrated using the Webots simulator with two humanoid robots: NAO and Robotis OP2.

🎯 Key Features

LLM-Powered Conversations — Natural dialogue driven by large language models with persona-specific behavior
Multi-Persona Support — Easily switch between personalities (Angry Cab Driver, Polite Teacher, Polite Receptionist, etc.)
Semantic Action Matching — Context-appropriate gestures selected via embedding-based intent matching
Gesture-Speech Synchronization — Dynamic voice rate adjustment to sync actions with speech duration
Anticipatory Action Module — Learns from failure to adapt action selection for short dialogues
Highly Modular Design — Identical project structure across robots; only joint names and tuning differ
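
The embedding-based matching listed above can be illustrated with a minimal sketch, assuming each action is stored as a name mapped to an embedding vector and the best match is picked by cosine similarity. The toy 3-dimensional vectors, the action names, and the `threshold` parameter are hypothetical; the repository's own embeddings (generate_action_embeddings.py, action_embeddings.pkl) are not reproduced here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)

def select_action(intent_embedding, action_embeddings, threshold=0.5):
    """Return the best-matching action name, or None if no score exceeds threshold."""
    best_name, best_score = None, threshold
    for name, embedding in action_embeddings.items():
        score = cosine_similarity(intent_embedding, embedding)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical toy embeddings; real ones would come from a sentence-embedding model.
ACTIONS = {
    "wave": [1.0, 0.0, 0.0],
    "nod": [0.0, 1.0, 0.0],
    "shrug": [0.0, 0.0, 1.0],
}

print(select_action([0.9, 0.1, 0.0], ACTIONS))
```

Returning None below the threshold lets the caller skip the gesture instead of forcing a poor match; whether the framework does this is an assumption of the sketch.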

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                              ROBOT BRAIN                                    │
│                                                                             │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌──────────────┐              │
│  │   VAD   │───▶│   STT   │───▶│   LLM   │───▶│ ActionSelect │              │
│  │ (Silero)│    │ (Riva)  │    │ (Llama) │    │ (Embeddings) │              │
│  └─────────┘    └─────────┘    └─────────┘    └──────┬───────┘              │
│       ▲                              │               │                      │
│       │                              ▼               ▼                      │
│   [Microphone]                 ┌─────────┐    ┌───────────────┐             │
│                                │   TTS   │    │  Publish to   │             │
│                                │ (Riva)  │    │/perform_action│             │
│                                └────┬────┘    └──────┬────────┘             │
│                                     │                │                      │
│                                     ▼                │                      │
│                               [Speaker]              │                      │
└──────────────────────────────────────────────────────┼──────────────────────┘
                                                       │
                           ROS 2 Topic: /perform_action│
                                                       ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                              ROBOT DRIVER                                       │
│                                                                                 │
│  ┌────────────────┐    ┌─────────────────┐    ┌─────────────────┐              │
│  │ Action Callback│───▶│ Animation Engine│───▶│  Webots Motors  │              │
│  │(ROS Subscriber)│    │ (Sine Wave Math)│    │ (Joint Control) │              │
│  └────────────────┘    └─────────────────┘    └─────────────────┘              │
└─────────────────────────────────────────────────────────────────────────────────┘
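
As a rough illustration of the "Sine Wave Math" stage in the driver, the sketch below samples a joint target that oscillates around a center angle for the length of an action. The function names, parameters, and numeric values are hypothetical and not taken from the driver code; the Webots motor call appears only in a comment because it needs a running simulation.

```python
import math

def sine_joint_target(t, center, amplitude, frequency_hz, phase=0.0):
    """Target joint angle (radians) at time t (seconds) for a sinusoidal gesture."""
    return center + amplitude * math.sin(2.0 * math.pi * frequency_hz * t + phase)

def sample_trajectory(duration_s, timestep_s, **wave):
    """Sample the sine target once per timestep, including both endpoints."""
    steps = int(round(duration_s / timestep_s))
    return [sine_joint_target(i * timestep_s, **wave) for i in range(steps + 1)]

# Hypothetical arm swing: 2 s long, 20 ms steps, oscillating around -1.0 rad.
traj = sample_trajectory(2.0, 0.02, center=-1.0, amplitude=0.4, frequency_hz=1.0)

# In a Webots controller, each value would be sent to the joint with
# motor.setPosition(value) once per simulation step.
print(len(traj), min(traj), max(traj))
```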

🤖 Supported Robots

We demonstrate the framework with two humanoid robots. The modular design means the project structure is identical — only robot names, joint configurations, and action tuning differ.

| Robot | Description | Documentation |
|---|---|---|
| SoftBank NAO | Popular humanoid research robot | 📖 NAO Controller README |
| Robotis OP2 | Open-source humanoid platform | 📖 OP2 Controller README |

📊 Evaluation Metrics

The evaluation was conducted over 10 runs of 20 conversation messages each, covering 3 personas (Angry Cab Driver, Polite Teacher, Polite Receptionist).

| Metric | Score | Description |
|---|---|---|
| Persona Fidelity | 90% | Robot maintains persona-consistent language, tone, and behavior across interactions |
| Action Grounding & Synchronization | 83% ± 2% | Correct action retrieval + well-timed gesture–speech synchronization. Voice rate dynamically adjusted based on words-to-speak vs action duration |
| Emotional TTS Quality | 0.87 ± 0.03 | High emotional expressiveness (Whisper Large + Magpie) |
| Interaction Latency | 3.2 ± 0.3 s | End-to-end STT → LLM → [TTS + Action] pipeline latency (15-20 words). Includes cloud API overhead |
| HRI User Study | 4.1 ± 0.5 / 5 | MOS-equivalent rating from 20 participants for likeability, clarity, and perceived intelligence |
| Robustness & Reliability | 81% ± 5% | Recovery from uncertainty/noise; avoidance of unsafe motions. Includes anticipatory module that learns from failure to adapt action selection for short dialogues |
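
The voice-rate adjustment mentioned under Action Grounding & Synchronization can be sketched as follows, assuming a nominal speaking speed and clamping limits. The 150 words-per-minute baseline, the 0.7 to 1.4 rate limits, and the function name are illustrative assumptions, not values taken from the framework.

```python
def voice_rate_for_action(text, action_duration_s, base_wpm=150.0,
                          min_rate=0.7, max_rate=1.4):
    """Return a TTS rate multiplier so speech lasts roughly as long as the gesture.

    A rate above 1.0 means speak faster than nominal; below 1.0 means slower.
    """
    if action_duration_s <= 0:
        raise ValueError("action_duration_s must be positive")
    words = len(text.split())
    if words == 0:
        return 1.0  # nothing to say, keep the default rate
    nominal_speech_s = words / base_wpm * 60.0  # duration at the baseline speed
    rate = nominal_speech_s / action_duration_s
    # Clamp so the voice never becomes unnaturally fast or slow.
    return max(min_rate, min(max_rate, rate))

sentence = "one two three four five six seven eight nine ten"
print(voice_rate_for_action(sentence, action_duration_s=4.0))
```

Clamping trades exact synchronization for intelligibility: when the required rate falls outside the limits, the gesture and the speech will end at slightly different times.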

🔧 Modularity & Extensibility

The framework is designed for rapid adaptation and extension:

| Task | Effort | Details |
|---|---|---|
| Add New Persona | ~20 ± 5 minutes | JSON-only configuration |
| Add New Action | ~30 ± 3 LOC | Action-specific driver control implementation |
| Add New Robot | 150–200 LOC (~30-40 min) | Previously required 8-10 hours of extensive effort per robot |
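
Since adding a persona is described as JSON-only configuration, a persona entry might look like the sketch below. The field names (`name`, `system_prompt`, `voice`), the `load_persona` helper, and the sample text are all hypothetical; the real schema is defined by the project and may differ.

```python
import json

# Hypothetical schema: fields every persona entry is assumed to need.
REQUIRED_FIELDS = ("name", "system_prompt")

PERSONA_JSON = """
{
  "name": "Polite Receptionist",
  "system_prompt": "You are a courteous receptionist. Keep replies short and helpful.",
  "voice": {"rate": 1.0}
}
"""

def load_persona(raw):
    """Parse a persona definition and check that the assumed required fields exist."""
    persona = json.loads(raw)
    missing = [field for field in REQUIRED_FIELDS if field not in persona]
    if missing:
        raise ValueError(f"persona is missing fields: {missing}")
    return persona

persona = load_persona(PERSONA_JSON)
print(persona["name"])
```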

🚀 Why ROS 2?

Wide Community Support — Extensive documentation, tutorials, and active development
Robot Agnostic — Framework adapts to any humanoid robot with ROS 2 support
Modular by Design — Nodes communicate via topics, enabling flexible system composition
Simulation Ready — Seamless integration with Webots and other simulators

📁 Directory Structure

.
├── ReadMe.md                    # This file
├── my_nao_controller/           # NAO Robot Controller Package
│   ├── config/
│   │   └── config.py            # API keys, paths, parameters
│   ├── docs/
│   │   └── README.md            # Detailed NAO documentation
│   ├── launch/
│   │   └── robot_launch.py      # ROS 2 launch file
│   ├── my_nao_controller/
│   │   ├── nao_brain.py         # AI node (STT → LLM → TTS → Action)
│   │   ├── nao_driver.py        # Animation engine (Webots controller)
│   │   ├── nao_action_vocab.py  # Action vocabulary definitions
│   │   ├── personas.py          # Persona definitions
│   │   ├── generate_action_embeddings.py
│   │   └── action_embeddings.pkl
│   ├── resource/
│   │   └── nao.urdf
│   ├── worlds/
│   │   └── nao_world.wbt
│   ├── package.xml
│   ├── setup.py
│   └── run.py
│
└── op2_controller/              # Robotis OP2 Controller Package
    ├── config/
    │   └── config.py
    ├── docs/
    │   └── README.md            # Detailed OP2 documentation
    ├── launch/
    │   └── robot_launch.py
    ├── op2_controller/
    │   ├── op2_brain.py
    │   ├── op2_driver.py
    │   ├── op2_action_vocab.py
    │   ├── personas.py
    │   ├── generate_action_embeddings.py
    │   └── action_embeddings.pkl
    ├── resource/
    │   └── op2.urdf
    ├── worlds/
    │   └── op2_world.wbt
    ├── package.xml
    ├── setup.py
    └── run.py

🛠️ System Requirements

| Component | Version |
|---|---|
| OS | Ubuntu 22.04 LTS (Jammy Jellyfish) |
| ROS 2 | Humble Hawksbill |
| Simulator | Webots R2023b or newer |
| Python | 3.10+ |

📚 Getting Started

Clone the repository into your ROS 2 workspace:

cd ~/ros2_ws/src
git clone https://github.com/endeavorXx/ROS-Nao-Simulation-in-Webots.git

Follow robot-specific instructions:
- NAO Controller Setup
- OP2 Controller Setup

Build the workspace:

cd ~/ros2_ws
colcon build
source install/setup.bash

📖 Documentation

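For detailed setup, configuration, and usage instructions, refer to the robot-specific documentation:
- NAO Controller README (my_nao_controller/docs/README.md)
- OP2 Controller README (op2_controller/docs/README.md)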

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

Built with ROS 2 Humble, Webots, and various open-source AI/ML libraries including Silero VAD, NVIDIA Riva, and Sentence Transformers.
