Text-to-speech, speech-to-text, and voice cloning CLI for Apple Silicon, built on mlx-audio.
- macOS with Apple Silicon (M-series)
- uv
- ffmpeg (for non-WAV audio conversion in the API server)
```
uv sync
```

```
# Use default model (Kokoro) and voice
uv run voice.py say "Hello world!"

# Choose a model and voice
uv run voice.py say "Bonjour !" -m voxtral -v fr_male

# Save without playing
uv run voice.py say "Hello" -o greeting.wav --no-play
```

```
uv run voice.py clone "Text to speak" reference.wav
uv run voice.py clone "Text to speak" reference.wav -m voxtral
```

```
uv run voice.py transcribe audio.wav
uv run voice.py transcribe audio.wav --stream
```

```
uv run voice.py voices            # all voices
uv run voice.py voices -m kokoro  # kokoro voices only
```
```
uv run voice.py models            # available model shortcuts
```

| Shortcut | Model ID |
|---|---|
| kokoro, kokoro-tts (default) | mlx-community/Kokoro-82M-bf16 |
| voxtral, voxtral-tts | mlx-community/Voxtral-4B-TTS-2603-mlx-4bit |
You can also pass any full Hugging Face model ID with -m.
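The resolution described above can be sketched as follows. This is a hypothetical helper for illustration, not the actual voice.py internals: known shortcuts map to the full Hugging Face IDs from the table, and any other string is passed through unchanged as a full model ID.

```python
# Hypothetical sketch of -m shortcut resolution (not voice.py's actual code).
# Shortcut aliases map to full Hugging Face model IDs; anything unrecognized
# is assumed to already be a full model ID and is returned as-is.
SHORTCUTS = {
    "kokoro": "mlx-community/Kokoro-82M-bf16",
    "kokoro-tts": "mlx-community/Kokoro-82M-bf16",
    "voxtral": "mlx-community/Voxtral-4B-TTS-2603-mlx-4bit",
    "voxtral-tts": "mlx-community/Voxtral-4B-TTS-2603-mlx-4bit",
}

def resolve_model(name: str = "kokoro") -> str:
    """Return the full model ID for a shortcut, or the input unchanged."""
    return SHORTCUTS.get(name, name)
```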
Starts an OpenAI-compatible transcription API on port 4444.
```
uv run voice.py serve
uv run voice.py serve -p 8080  # custom port
```

`POST /v1/audio/transcriptions`
| Parameter | Type | Default | Description |
|---|---|---|---|
| file | UploadFile | required | Audio file (WAV, WebM, MP3, MP4, OGG, FLAC, AAC) |
| model | string | "base" | Model identifier |
| language | string | "en" | Language code |
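The parameters above travel as multipart/form-data. A minimal sketch of how such a request body can be assembled with only the Python standard library; `build_multipart` is a hypothetical helper written for illustration, not part of voice.py.

```python
# Sketch: assemble a multipart/form-data body carrying the endpoint's
# fields (model, language) plus the audio file. Standard library only.
import io
import uuid

def build_multipart(fields, file_name, file_bytes):
    """Encode text fields and one file part; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(value.encode() + b"\r\n")
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        (
            f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
    )
    buf.write(file_bytes + b"\r\n")
    buf.write(f"--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"

body, content_type = build_multipart(
    {"model": "base", "language": "en"}, "recording.wav", b"RIFF...."
)
# Send with e.g. urllib.request.Request(
#     "http://localhost:4444/v1/audio/transcriptions",
#     data=body, headers={"Content-Type": content_type}, method="POST")
```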
Response:

```
{"text": "transcribed text"}
```

Example:
```
curl -X POST http://localhost:4444/v1/audio/transcriptions \
  -F "file=@recording.webm" \
  -F "language=en"
```

Use ./cli to manage a persistent background service via launchd.
```
./cli install    # install and start on login
./cli status     # check if running
./cli logs       # tail logs
./cli restart    # restart the service
./cli stop       # stop the service
./cli uninstall  # stop and remove the service
```

Logs are written to /tmp/voice-tts.log and /tmp/voice-tts.err.