A collection of audio codecs with a standardized API
Updated May 27, 2025 - Python
Moshi: open-source speech-text foundation model for real-time full-duplex voice dialogue. Uses Mimi neural audio codec. PyTorch, MLX (Apple Silicon) and Rust backends. Moshika & Moshiko voices. Maintained by KuchikiRenji.