A fast CLI tool for AI agents to convert their text output to speech using Chatterbox TTS on Apple Silicon.
git clone https://github.com/EmZod/speak.git
cd speak
bun install
# First run auto-installs Python dependencies
bun run src/index.ts "Hello, world!" --play

Create an alias for easier access:

alias speak="bun run $(pwd)/src/index.ts"

- macOS with Apple Silicon (M Series)
- Bun
- Python 3.10+
- sox (for long documents):
brew install sox
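
The requirements above can be checked before the first run. The following is a minimal sketch, not something shipped with speak: `check_prereqs` and `python_ok` are hypothetical helper names, and the only assumption is that the tools are installed under the names `bun`, `python3`, and `sox`.

```shell
# Hypothetical helper (not part of speak): report which of the given tools are on PATH.
check_prereqs() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical helper: succeed only if python3 exists and is version 3.10 or newer.
python_ok() {
  command -v python3 >/dev/null 2>&1 &&
    python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)'
}

check_prereqs bun python3 sox || echo "Install the missing tools before running speak."
python_ok || echo "Python 3.10+ not found."
```

The check only looks at PATH and the Python version. It does not verify the Apple Silicon requirement or the Python dependencies that the first run installs.
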
speak "Hello, world!" --play # Generate and play
speak article.md --stream # Stream long content
speak --clipboard --play # Read from clipboard
speak document.md --output out.wav # Save to file

# Long documents - auto-chunk for reliability
speak book.md --auto-chunk --output book.wav
# Resume interrupted generation
speak --resume manifest.json
# Batch processing
speak *.md --output-dir ~/Audio/
# Estimate duration before generating
speak --estimate document.md
# Concatenate audio files
speak concat part1.wav part2.wav --out combined.wav

| Command | Description |
|---|---|
| `speak <text\|file>` | Generate speech |
| `speak health` | Check system status |
| `speak models` | List available models |
| `speak concat <files>` | Combine audio files |
| `speak daemon kill` | Stop TTS server |

| Option | Description |
|---|---|
| `--play` | Play after generation |
| `--stream` | Stream as it generates |
| `--output <path>` | Output file or directory |
| `--auto-chunk` | Chunk long documents |
| `--estimate` | Show duration estimate |
| `--dry-run` | Preview without generating |
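
The `--auto-chunk` and `--output` options can be combined in a loop when each document should become its own WAV file. This is a rough sketch rather than a built-in feature: `speak_each` and the `SPEAK_CMD` override are hypothetical names introduced here, and the loop assumes `speak` resolves as an executable on PATH (a shell alias is not picked up through a variable expansion).

```shell
# Hypothetical wrapper (not part of speak): one WAV per Markdown file.
# SPEAK_CMD is an assumed override so the loop can be previewed with `echo`.
speak_each() {
  out_dir="$1"
  shift
  for f in "$@"; do
    # Strip the directory and the .md suffix to name the output file.
    base="$(basename "$f" .md)"
    ${SPEAK_CMD:-speak} "$f" --auto-chunk --output "$out_dir/$base.wav"
  done
}

# Preview the arguments each call would receive, without generating audio.
SPEAK_CMD=echo speak_each "$HOME/Audio" notes.md book.md
```

Dropping the `SPEAK_CMD=echo` prefix runs the real command for each file. Only flags documented above are passed.
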
- docs/usage.md - Complete usage guide
- docs/configuration.md - Config file, environment variables, shell setup
- docs/troubleshooting.md - Common issues and fixes
- SKILL.md - Agent-optimized reference
- CHANGELOG.md - Version history
- .agentic/ - Agentic engineering artifacts (optimization reports, focus group tests)
bun install # Install dependencies
bun test # Run tests
bun run typecheck # Type check

Copy SKILL.md to your agent's skills directory:

cp SKILL.md ~/.claude/skills/speak-tts/SKILL.md

See AGENTS.md for setup details.
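
A plain `cp` fails when the target directory does not exist yet. The sketch below creates it first. `install_skill` is a hypothetical helper name, and the default destination is simply the path used in the command above.

```shell
# Hypothetical helper (not part of speak): copy SKILL.md, creating the target directory first.
install_skill() {
  src="${1:-SKILL.md}"
  dest_dir="${2:-$HOME/.claude/skills/speak-tts}"
  if [ ! -f "$src" ]; then
    echo "not found: $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/SKILL.md"
}

# Run from the repository root.
install_skill || echo "Run this from the speak checkout."
```

Passing a second argument points the copy at a different skills directory, for agents that do not read from `~/.claude/skills`.
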
MIT
