From 4275cfe25d0a43be4739c3c3c1b382d1f59f93d0 Mon Sep 17 00:00:00 2001
From: Clarence Etnel <clarence@example.com>
Date: Thu, 23 Apr 2026 07:47:12 +0200
Subject: [PATCH] docs: add ElevenLabs TTS integration guide

Adds documentation for using ElevenLabs as an alternative TTS provider,
including setup, voice cloning, and a comparison table with Kokoro.

Closes #337
---
 skills/hyperframes/references/tts.md | 83 ++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)
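
A minimal Python sketch of the "Generate narration" step, in case the `elevenlabs` command used in the guide is not available on a given machine. It assumes the 1.x+ SDK client interface (`ElevenLabs().text_to_speech.convert`, which yields the audio as byte chunks). `write_chunks`, `synthesize_to_file`, and `YOUR_VOICE_ID` are hypothetical names for illustration, not part of HyperFrames or the SDK. The API returns MP3 by default, so the usage comment writes `narration.mp3`.

```python
from pathlib import Path
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], out_path: str) -> int:
    """Write streamed audio chunks to out_path and return the byte count."""
    total = 0
    with Path(out_path).open("wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fh.write(chunk)
                total += len(chunk)
    return total


def synthesize_to_file(client, text: str, voice_id: str, out_path: str) -> int:
    """Hypothetical helper; `client` is assumed to be elevenlabs.client.ElevenLabs."""
    audio = client.text_to_speech.convert(voice_id=voice_id, text=text)
    return write_chunks(audio, out_path)


# Assumed usage (needs `pip install elevenlabs` and ELEVENLABS_API_KEY set):
#   from elevenlabs.client import ElevenLabs
#   script = Path("script.txt").read_text(encoding="utf-8")
#   synthesize_to_file(ElevenLabs(), script, "YOUR_VOICE_ID", "narration.mp3")
```

The helper takes the client as an argument so it can be exercised with a stub and without network access.
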
diff --git a/skills/hyperframes/references/tts.md b/skills/hyperframes/references/tts.md
index c403564d8..6fafd8f03 100644
--- a/skills/hyperframes/references/tts.md
+++ b/skills/hyperframes/references/tts.md
@@ -69,7 +69,90 @@ npx hyperframes tts script.txt --voice af_heart --output narration.wav
 npx hyperframes transcribe narration.wav  # → transcript.json with word-level timestamps
 ```
 
+## Alternative: ElevenLabs API
+
+For production-quality voices with custom voice cloning, use [ElevenLabs](https://elevenlabs.io) as an external TTS provider.
+
+### Setup
+
+1. Create an ElevenLabs account and get your API key from [elevenlabs.io/app/settings](https://elevenlabs.io/app/settings)
+2. Set the environment variable:
+
+```bash
+export ELEVENLABS_API_KEY=your_api_key_here
+```
+
+3. Install the ElevenLabs Python SDK:
+
+```bash
+pip install elevenlabs
+```
+
+### Generate Speech
+
+```bash
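+# List available voices (assumes an `elevenlabs` command is on your PATH; check your SDK version)
+elevenlabs voices list
+
+# Generate speech with a specific voice (the API returns MP3 by default; confirm the file is real WAV)
+elevenlabs text-to-speech "Your narration script here" --voice Rachel --output narration.wav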
+```
+
+### Voice Cloning
+
+ElevenLabs supports instant voice cloning from a 30-second audio sample:
+
+```bash
+# Clone a voice from an audio sample
+elevenlabs voices add --name "MyVoice" --file reference_audio.wav
+
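+# Use the cloned voice (SDK 1.x+ client API; pass the cloned voice's ID, not its name)
+python3 -c "
+from elevenlabs.client import ElevenLabs; from elevenlabs.play import play
+audio = ElevenLabs().text_to_speech.convert(voice_id='YOUR_VOICE_ID', text='Hello world')
+play(audio)  # the client reads ELEVENLABS_API_KEY from the environment
+"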
+```
+
+### Integration with HyperFrames
+
+Generate the narration with ElevenLabs, then use it in your composition:
+
+```bash
+# Step 1: Generate narration
+elevenlabs text-to-speech -f script.txt --voice Rachel --output narration.wav
+
+# Step 2: Transcribe for captions
+npx hyperframes transcribe narration.wav  # → transcript.json
+
+# Step 3: Use in your composition
+```
+
+```html
+<audio
+  id="narration"
+  data-start="0"
+  data-duration="auto"
+  data-track-index="2"
+  src="narration.wav"
+  data-volume="1"
+></audio>
+```
+
+### Kokoro vs ElevenLabs
+
+| Feature | Kokoro (local) | ElevenLabs (API) |
+|---------|---------------|-------------------|
+| Cost | Free | $5+/month |
+| Latency | ~2s | ~0.5s |
+| Voice quality | Good | Excellent |
+| Voice cloning | No | Yes |
+| Languages | 8 | 29+ |
+| Offline | Yes | No |
+| Setup | pip install | pip install + API key |
+
 ## Requirements
 
 - Python 3.8+ with `kokoro-onnx` and `soundfile`
 - Model downloads on first use (~311 MB + ~27 MB voices, cached in `~/.cache/hyperframes/tts/`)
+- For ElevenLabs (optional): the `elevenlabs` Python package (`pip install elevenlabs`) and the `ELEVENLABS_API_KEY` environment variable