Full-length song generation with latent diffusion. Part of the Zen LM ecosystem.
Mu generates full-length songs (up to 4:45) using latent diffusion with a Diffusion Transformer backbone. Generation is conditioned on a text prompt, and lyrics can optionally be supplied for the model to align the vocals to.
- Full-length song generation (up to 4:45)
- Lyrics-conditioned generation
- Diffusion Transformer architecture
- Audio codec for high-quality reconstruction
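For readers new to latent diffusion, the toy sketch below illustrates the general idea: start from noise in a compact latent space, refine it over a fixed number of steps with a denoiser, then hand the result to an audio codec for decoding. It is not Mu's API. The names `latent_frames`, `toy_denoiser`, and `sample`, the latent frame rate of 10 per second, and the Euler-style update are all assumptions chosen for illustration, and the placeholder denoiser simply stands in where a Diffusion Transformer would be.

```python
import math
import random

MAX_SECONDS = 4 * 60 + 45  # 285 s, the 4:45 limit listed above


def latent_frames(seconds, frames_per_second=10.0):
    """Latent sequence length for a clip (frames_per_second is an assumed codec rate)."""
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"duration must be in (0, {MAX_SECONDS}] seconds")
    return math.ceil(seconds * frames_per_second)


def toy_denoiser(x, t, cond):
    """Hypothetical stand-in for the Diffusion Transformer.

    Returns the velocity of a straight-line path from x to the conditioning
    target, so the sampler below lands exactly on cond at t = 1.
    """
    return [(c - xi) / (1.0 - t) for xi, c in zip(x, cond)]


def sample(cond, steps=32, seed=0):
    """Euler integration from Gaussian noise (t = 0) to a latent (t = 1)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in cond]  # initial noise latent
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = toy_denoiser(x, t, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x  # a real pipeline would pass this to the audio codec decoder


if __name__ == "__main__":
    print(latent_frames(MAX_SECONDS))  # latent length for a full-length song
    print(sample([0.5, -1.0, 2.0]))
```

In a real system the conditioning would come from text and lyrics encoders rather than being the target itself, and the denoiser would be a trained network.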
- zen-musician — Music generation models
- zen-foley — Sound effect generation
- Zen LM — Full model family
Apache 2.0