Press a key. Speak. Press again. Text appears.
Voice-to-text for Linux. 100% local, no subscriptions.
# 1. Install Python dependencies
sudo apt install python3-evdev python3-pyaudio python3-yaml python3-gi ydotool
# 2. Get whisper.cpp (OPTION A - Easiest: Download pre-compiled binary)
# Download from: https://github.com/ggerganov/whisper.cpp/releases
# Extract to ./whisper.cpp/
# Download a model: cd whisper.cpp && bash models/download-ggml-model.sh base
# 2. Get whisper.cpp (OPTION B - Compile for better performance)
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp && mkdir build && cd build && cmake .. && make -j$(nproc) && cd ..
bash models/download-ggml-model.sh base && cd ..
# 3. Run it
python3 verbose.py
# 4. Use it - Press F9, speak, press F9 againLocal voice-to-text using whisper.cpp. Works in any application - terminals, browsers, IDEs.
No cloud dependencies. No API keys needed.
- Multiple configs with different hotkeys (F9 for coding, F10 for emails)
- Auto-correct common mistakes ("cloud code" → "Claude Code")
- Phrase shortcuts ("my email" → expands to full address)
- Works on login, lives in system tray
- Cancel anytime with ESC
- Press F9 → Red icon appears
- Speak → "echo hello world"
- Press F9 → Orange icon while processing
- Watch it type → Text appears exactly where your cursor was
Press ESC anytime to cancel.
- Install system dependencies:
sudo apt install python3-evdev python3-pyaudio python3-yaml python3-gi ydotool portaudio19-dev
sudo usermod -a -G input $USER # Required for keyboard/ydotool access
# Log out and back in for group change to take effect- Set up uinput permissions (required for ydotool):
echo 'KERNEL=="uinput", GROUP="input", MODE="0660"' | sudo tee /etc/udev/rules.d/80-uinput.rules
sudo udevadm control --reload-rules && sudo udevadm trigger-
Get whisper.cpp:
- Download the latest release from whisper.cpp releases
- Extract to
./whisper.cpp/in the verbose directory - Make sure the binary is at
./whisper.cpp/build/bin/whisper-cli
-
Download a model:
cd whisper.cpp
bash models/download-ggml-model.sh base # or tiny/small/medium/large-v1
cd ..- Run it:
python3 verbose.pyIf pre-compiled binaries don't work or you want CPU-optimized builds:
# Install build dependencies
sudo apt install cmake build-essential
# Clone and build
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
mkdir build && cd build
cmake ..
make -j$(nproc)
cd ..
# Download model
bash models/download-ggml-model.sh base
cd .../install-service.shThis installs Verbose as a systemd user service. See Auto-start section for details.
# configs/coding.yaml - F9 for CLI safety
hotkey: "<f9>"
avoid_newlines: true # Prevents accidental command execution
dictionary:
"cloud code": "Claude Code"
"postgres": "PostgreSQL"
# configs/writing.yaml - F10 for natural text
hotkey: "<f10>"
avoid_newlines: false # Keeps paragraph breaks
shortcuts:
"my email": "you@example.com"Each hotkey loads its own config. Press F9 for coding, F10 for emails. Simple.
Ubuntu 22.04+ (or any Linux with evdev and ydotool). ~200MB disk space. That's it.
Contributions welcome! Here's where help is needed:
The project currently requires manual installation. I don't know how to package for Linux distributions - if you do, help would be appreciated:
- Snap package
- Flatpak
- AUR (Arch)
- Debian/Ubuntu .deb
- AppImage
Open an issue if you'd like to help with any of these.
- Bug fixes and improvements
- Better error handling
- Documentation improvements
- Multi-language support
Read CLAUDE.md for architecture details. Keep changes simple and focused.
Single Python file. ~500 lines. Built with Claude Code.
MIT - Do whatever you want with it.
- Built with whisper.cpp by @ggerganov
- Inspired by Talon Voice and Wispr Flow (but free and local)
- Designed with Claude Code
Issues? Check docs/TROUBLESHOOTING.md.
Still stuck? Open an issue.