This application demonstrates OpenAI Realtime API usage on an ESP32-S3 device with a 5-inch HMI LCD panel. It provides a graphical user interface (GUI) for configuring WiFi settings and entering your OpenAI API key, then establishes a WebRTC connection to the OpenAI Realtime API. Audio input is sent to the model, which returns text responses and a transcription of the audio.
- Embedded Device Focus: Designed for ELECROW CrowPanel Advance 5.0-HMI. For detailed device hardware information, see Device Hardware Documentation.
- Real-time Communication: Establishes a WebRTC connection with OpenAI Realtime API.
- Voice Interaction: Transcribes audio input and displays the model’s text responses.
- OpenAI Responses API: After audio is captured/streamed, the transcription is sent to the OpenAI Responses API for final processing when the mic is toggled off.
- User-friendly GUI: Built using LVGL 8.4.
- Session Persistence: WiFi settings and session configurations are saved in non-volatile storage.
- Easy Build & Flash: Build from source using ESP-IDF v5.4 or flash prebuilt images.
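To illustrate the Responses API step from the feature list, here is a minimal sketch of how a finished transcription could be packed into a JSON request body. It is not taken from this project's source: `build_responses_body` and `json_escape` are hypothetical helpers, and the model name in the usage note is a placeholder. The only thing assumed about the API is that the request body carries `model` and `input` fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, escaping the characters that
 * must not appear raw inside a JSON string.
 * Returns 0 on success, -1 if dst is too small. */
static int json_escape(const char *src, char *dst, size_t dst_size)
{
    size_t o = 0;
    if (dst_size == 0) {
        return -1;
    }
    for (const unsigned char *p = (const unsigned char *)src; *p; ++p) {
        char tmp[8];
        int n;
        switch (*p) {
        case '"':  n = snprintf(tmp, sizeof tmp, "\\\""); break;
        case '\\': n = snprintf(tmp, sizeof tmp, "\\\\"); break;
        case '\n': n = snprintf(tmp, sizeof tmp, "\\n");  break;
        case '\r': n = snprintf(tmp, sizeof tmp, "\\r");  break;
        case '\t': n = snprintf(tmp, sizeof tmp, "\\t");  break;
        default:
            if (*p < 0x20) {
                /* Remaining control characters use the \uXXXX form. */
                n = snprintf(tmp, sizeof tmp, "\\u%04x", (unsigned)*p);
            } else {
                tmp[0] = (char)*p;
                tmp[1] = '\0';
                n = 1;
            }
            break;
        }
        if (o + (size_t)n >= dst_size) {
            return -1; /* no room for this chunk plus the terminator */
        }
        memcpy(dst + o, tmp, (size_t)n);
        o += (size_t)n;
    }
    dst[o] = '\0';
    return 0;
}

/* Hypothetical helper: write {"model":..., "input":...} into out.
 * model is assumed to be a plain identifier that needs no escaping.
 * Returns the body length, or -1 if either buffer is too small. */
int build_responses_body(const char *model, const char *transcript,
                         char *out, size_t out_size)
{
    char escaped[512]; /* assumption: transcripts fit in 512 bytes once escaped */
    int n;

    assert(model != NULL && transcript != NULL && out != NULL);
    if (json_escape(transcript, escaped, sizeof escaped) != 0) {
        return -1;
    }
    n = snprintf(out, out_size, "{\"model\":\"%s\",\"input\":\"%s\"}",
                 model, escaped);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

Calling `build_responses_body("gpt-4o-mini", transcript, buf, sizeof buf)` would fill `buf` with a body ready to be POSTed over HTTPS together with the API key. The firmware may well build its requests differently, for example with a JSON library such as cJSON.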
- Install ESP-IDF framework v5.4.
- Clone the repository.
- Dependencies are installed via the framework component manager (see idf_component.yml).
- Build and flash using the following commands:

      idf.py build
      idf.py -p PORT flash
- Use flash_tool.exe to flash the prebuilt images.
- WiFi Setup: Navigate to the WiFi tab and enter your SSID and password.
- Authentication: Go to the Auth tab and input your OpenAI API key (non-free tier account required).
- Mic Control & Realtime Communication:
  - Tap the on-screen mic button to start and stop audio capture.
  - While the mic is on, audio is streamed to the OpenAI Realtime API for live transcription.
  - When you tap the mic off, the complete audio request is sent to the OpenAI Responses API for final processing.
  - Transcriptions and final responses are displayed in the terminal.
- Session Controls: Use the terminal to clear the screen or disconnect and stop communication.
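The mic toggle described above can be pictured as a two-state machine. The sketch below is illustrative only and is not the project's actual code: the type and function names (`mic_ctrl_t`, `mic_on_tap`, the `ACTION_*` values) are hypothetical, and the real firmware would trigger the WebRTC stream and the HTTPS request where this sketch simply returns an action code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states of the on-screen mic button. */
typedef enum {
    MIC_IDLE,      /* mic off, nothing is being streamed */
    MIC_STREAMING  /* mic on, audio goes to the Realtime API */
} mic_state_t;

/* Hypothetical actions the caller should perform after a tap. */
typedef enum {
    ACTION_START_REALTIME_STREAM,  /* begin streaming for live transcription */
    ACTION_SEND_TO_RESPONSES_API   /* stop streaming, request the final response */
} mic_action_t;

typedef struct {
    mic_state_t state;
} mic_ctrl_t;

/* Flip the mic state on each tap and report what should happen next. */
mic_action_t mic_on_tap(mic_ctrl_t *ctrl)
{
    assert(ctrl != NULL);
    if (ctrl->state == MIC_IDLE) {
        ctrl->state = MIC_STREAMING;
        return ACTION_START_REALTIME_STREAM;
    }
    ctrl->state = MIC_IDLE;
    return ACTION_SEND_TO_RESPONSES_API;
}
```

In an LVGL-based GUI, a function like `mic_on_tap` would typically be called from the button's click event callback, keeping the state logic separate from the drawing and networking code.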
- ESP-IDF Components: All dependencies are listed in the idf_component.yml file and are downloaded automatically.
- LVGL 8.4: Used for the user interface.
- ESP WebRTC Examples: Heavily inspired by Espressif's WebRTC Solution.