🎭 Improv AI - Real-Time Theater Background Generator

An AI-powered system that listens to improv performers in real-time, detects location context from speech, generates appropriate background images, and automatically sends them to QLab for display. Perfect for improv theaters wanting dynamic, responsive backdrops.

✨ Features

  • 🎤 Real-time speech recognition - Listens to performers via microphone
  • 🧠 AI location detection - Identifies settings from natural speech
  • 🎨 Dynamic image generation - Creates backgrounds using DALL-E 3
  • 📚 Smart library system - Reuses existing environments for speed
  • 🎵 Ambient sound integration - Adds appropriate audio atmospheres
  • 🎬 QLab automation - Seamlessly integrates with theater tech setup
  • ⚡ Intelligent rate limiting - Optimized for live performance

🚀 Quick Start

Prerequisites

  • macOS (for QLab integration)
  • Python 3.8+
  • QLab 5 (show control software for audio, video, and lighting)
  • OpenAI API key with DALL-E access
  • Microphone for speech input

Installation

  1. Clone the repository:

    git clone https://github.com/your-username/improv-ai.git
    cd improv-ai
  2. Set up Python virtual environment:

    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:

    # For macOS users (install PortAudio first):
    brew install portaudio
    
    # For Ubuntu/Debian users:
    # sudo apt-get install portaudio19-dev python3-pyaudio
    
    # For Windows users:
    # PyAudio wheel may be needed from: https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio
    
    pip install -r requirements.txt

    PyAudio Installation Issues?

    • macOS: brew install portaudio then pip install pyaudio
    • Linux: sudo apt-get install portaudio19-dev
    • Windows: Download wheel from unofficial binaries
    • Alternative: pip install --global-option='build_ext' --global-option='-I/usr/local/include' --global-option='-L/usr/local/lib' pyaudio
  4. Configure your OpenAI API key:

    cp .env.example .env
    # Edit .env and add your OpenAI API key
  5. Set up QLab:

    • Open QLab 5
    • Enable OSC in QLab preferences
    • Create a new workspace for your show
  6. Add ambient sounds (optional):

    python3 get_ambient_sounds.py  # See guide for sound sources

Usage

  1. Start the system:

    python3 main.py                    # Full features (DALL-E 3 + ambient sounds)
    python3 main.py --fast             # Faster generation (DALL-E 2)
    python3 main.py --no-sounds        # Disable ambient sounds
    python3 main.py --auto-default 5   # Auto-default backdrop after 5min
  2. Begin your improv performance! The system will:

    • Listen for location mentions ("Let's go to the coffee shop")
    • Generate or reuse appropriate backgrounds
    • Send images to QLab automatically
    • Add ambient sound cues (if enabled)
  3. Tech controls:

    • Press d + Enter to trigger default backdrop
    • Ctrl+C to exit gracefully

🎭 How It Works

Speech Detection

The system uses Python's SpeechRecognition library to continuously listen for speech patterns that indicate location changes:

# Examples of detected phrases:
"Let's go to the Italian restaurant"   → italian_restaurant.png
"We're at the park now"                → park.png
"This coffee shop is crowded"          → coffee_shop.png
"Welcome to our office"                → office.png

Location Intelligence

AI analyzes speech context to extract reusable environment names:

  • "Fancy Italian bistro" β†’ italian_restaurant
  • "Szechuan noodle place" β†’ chinese_restaurant
  • "Dark spooky forest" β†’ dark_forest

Image Generation

  • First mention: Generates new DALL-E 3 image (high quality)
  • Subsequent mentions: Instantly reuses from library
  • Optimized prompts: Creates intimate, theater-appropriate backgrounds
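　
A sketch of the check-the-library-first flow, assuming images are keyed by environment name in generated_images/ (function and prompt details are illustrative):

    import urllib.request
    from pathlib import Path
    from openai import OpenAI

    client = OpenAI()
    LIBRARY = Path("generated_images")

    def get_background(environment: str) -> Path:
        """Return a cached backdrop if one exists, otherwise generate it with DALL-E 3."""
        image_path = LIBRARY / f"{environment}.png"
        if image_path.exists():
            return image_path  # instant reuse for repeated locations

        result = client.images.generate(
            model="dall-e-3",
            prompt=f"Wide theatrical backdrop of a {environment.replace('_', ' ')}, "
                   "warm stage lighting, no people, no text",
            size="1792x1024",
        )
        urllib.request.urlretrieve(result.data[0].url, image_path)  # cache for next time
        return image_path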

QLab Integration

Automatically creates and triggers QLab cues via AppleScript:

  • Video cues for background images
  • Audio cues for ambient sounds
  • Auto-stops previous backgrounds
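
A sketch of how QLab can be driven from Python by shelling out to osascript; the cue script here is a simplified illustration, while the real automation in qlab_integration.py also handles stopping previous backgrounds:

    import subprocess

    def run_applescript(script: str) -> str:
        """Run an AppleScript snippet via osascript and return its output."""
        result = subprocess.run(
            ["osascript", "-e", script],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()

    def show_background(image_path: str) -> None:
        script = f'''
        tell application id "com.figure53.QLab.5" to tell front workspace
            make type "Video"
            set newCue to last item of (selected as list)
            set file target of newCue to POSIX file "{image_path}"
            start newCue
        end tell
        '''
        run_applescript(script)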

🔧 Configuration

Environment Variables (.env)

OPENAI_API_KEY=your_api_key_here
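
The key is typically loaded at startup; a minimal sketch assuming the python-dotenv package:

    import os
    from dotenv import load_dotenv

    load_dotenv()  # reads the .env file in the project root
    api_key = os.environ["OPENAI_API_KEY"]  # raises KeyError if the key is missing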

Customization Options

  • Rate limiting: Adjust min_interval in main.py (default: 15 seconds)
  • Image quality: Use the --fast flag to generate with DALL-E 2 instead of the default DALL-E 3
  • Ambient sounds: Use --no-sounds flag to disable audio cues
  • Auto-default: Use --auto-default N for backdrop after N minutes
  • Speech sensitivity: Modify phrase_time_limit in speech_recognizer.py
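
A minimal sketch of the kind of interval check that min_interval implies (the actual logic in main.py may differ):

    import time

    class RateLimiter:
        """Allow at most one image generation every min_interval seconds."""

        def __init__(self, min_interval: float = 15.0):
            self.min_interval = min_interval
            self._last = 0.0

        def allow(self) -> bool:
            now = time.monotonic()
            if now - self._last >= self.min_interval:
                self._last = now
                return True
            return False

Typical use: wrap each potential generation in a limiter.allow() check so rapid-fire location mentions do not trigger repeated API calls.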

πŸ“ Project Structure

improv-ai/
├── main.py                 # Main application orchestrator
├── speech_recognizer.py    # Real-time speech recognition
├── image_generator.py      # AI image generation & library
├── sound_generator.py      # Ambient sound system
├── qlab_integration.py     # QLab AppleScript automation
├── get_ambient_sounds.py   # Sound collection utility
├── test_microphone.py      # Microphone test utility
├── generated_images/       # Environment image library
├── generated_sounds/       # Ambient audio files
├── requirements.txt        # Python dependencies
└── README.md               # This file

🎵 Adding Ambient Sounds

The system supports ambient audio for immersive environments:

  1. Run the sound collection guide:

    python3 get_ambient_sounds.py
  2. Download sounds from:

    • Freesound.org (free, Creative Commons)
    • Zapsplat.com (professional quality)
    • YouTube Audio Library
    • AI generation tools (Suno, Udio)
  3. Save as: environment_ambient.wav in generated_sounds/

    • park_ambient.wav
    • restaurant_ambient.wav
    • coffee_shop_ambient.wav
    • etc.
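
A small helper sketch for spotting which library environments still lack an ambient track, following the environment_ambient.wav naming convention:

    from pathlib import Path

    images = Path("generated_images")
    sounds = Path("generated_sounds")

    # Report backdrops that have no matching ambient file yet
    for image in sorted(images.glob("*.png")):
        ambient = sounds / f"{image.stem}_ambient.wav"
        if not ambient.exists():
            print(f"Missing ambient sound: {ambient.name}")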

🎬 Theater Integration Tips

For Tech Operators

  • Set up QLab workspace before show
  • Test speech recognition levels during sound check
  • Use default backdrop feature (d + Enter) between scenes
  • Monitor rate limiting - system prevents API spam

For Performers

  • Speak naturally - system detects context, not commands
  • Be specific about locations: "Italian restaurant" vs just "restaurant"
  • Allow ~3 seconds for new environments to generate
  • Library reuse is instant for repeated locations

Performance Optimization

  • Internet required for new image generation
  • Offline capable for library reuse
  • 15-second rate limit prevents API overuse
  • Smart caching balances quality with speed

Speech Recognition Quality Tips

  • Test your setup: Run python3 test_microphone.py before shows
  • Microphone placement: 2-6 feet from performers works best
  • Use quality mics: USB microphones often outperform built-in ones
  • Reduce echo: Soft furnishings help absorb sound reflections
  • Consistent volume: Train performers to project consistently
  • Clear enunciation: Theater projection techniques work well
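
A minimal microphone check along these lines (a sketch, not necessarily identical to the bundled test_microphone.py):

    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print("Calibrating for ambient noise...")
        recognizer.adjust_for_ambient_noise(source, duration=2)
        print(f"Energy threshold: {recognizer.energy_threshold:.0f} - say something!")
        audio = recognizer.listen(source, timeout=10, phrase_time_limit=5)

    try:
        print("Heard:", recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        print("Could not understand the audio - check mic placement and levels.")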

πŸ› οΈ Troubleshooting

Common Issues

"Could not understand speech"

  • Run microphone test: python3 test_microphone.py
  • Check microphone permissions in System Settings
  • Reduce background noise (close windows, turn off fans)
  • Speak louder and more clearly
  • Use a better quality microphone if possible
  • Adjust settings in speech_recognizer.py:
    • energy_threshold: Lower for quieter environments
    • pause_threshold: Increase if cutting off too soon
    • phrase_time_limit: Increase for longer sentences

"QLab connection failed"

  • Ensure QLab 5 is running
  • Enable OSC in QLab preferences
  • Check that workspace is open

"OpenAI API error"

  • Verify API key in .env file
  • Check API quota/billing
  • Ensure DALL-E access is enabled

Images too large/small

  • Modify prompt generation in image_generator.py
  • Adjust QLab video cue settings
  • Check theater projector resolution

🤝 Contributing

Contributions welcome! Areas for improvement:

  • Additional ambient sound mappings
  • Enhanced location detection
  • Support for other theater software
  • Multi-language speech recognition
  • Custom prompt templates

📜 License

MIT License - see LICENSE file for details.

🎭 Credits

Created for improv theater communities. Special thanks to:

  • OpenAI for DALL-E API
  • Figure 53 for QLab
  • The improv community for inspiration

Made with ❤️ for the theater community
