Voca — Your Local Voice Clone Assistant

中文 | English

A local-first desktop app for voice cloning. Download it and start using it: high-quality speech synthesis and voice cloning run entirely on your machine.

Highlights

Fully Offline — After model download, all inference runs locally with no network required and no privacy concerns
Zero Configuration — First launch automatically handles environment detection, runtime download, model download & warm-up
High-Quality Voice Cloning — Powered by the VoxCPM engine, supporting bilingual (Chinese & English) speech synthesis and voice cloning
Fine-Grained Control — Adjustable CFG guidance scale, inference steps, seed, text normalization, post-processing denoising, and more
Extreme Clone Mode — Uses reference audio transcription to further improve voice fidelity
Built-in ASR — Automatically transcribes reference audio with the SenseVoice ASR engine, with manual editing support
Dual Model Sources — Download models from Hugging Face or ModelScope, with automatic source recommendation
Bilingual UI — Chinese and English interface

Getting Started

System Requirements

Item	Requirement
OS	macOS 14.0 (Sonoma) or later
Chip	Apple Silicon (M1/M2/M3/M4)
Disk Space	~6 GB (app + models)

Installation

Go to the Releases page and download the latest .dmg file
Open the DMG and drag Voca into the Applications folder
On first launch, follow the guided setup to download models and start using the app

About App Signing & Notarization

Voca is signed with an Apple Developer ID and notarized by Apple, so macOS should open it without security warnings.

If you still hit a Gatekeeper warning on first launch (e.g. "Voca" cannot be opened, "Voca is damaged and can't be opened", or "cannot verify the developer"), it's usually because macOS has attached a quarantine attribute to files downloaded via the browser. You can remove the quarantine flag by running the following command in Terminal:
sudo xattr -dr com.apple.quarantine /Applications/Voca.app
Then reopen Voca. Alternatively, open System Settings → Privacy & Security and click Open Anyway.

First Launch

Voca includes a complete onboarding flow:

Environment Check → Runtime Download → Model Download & Verification → Model Warm-up → Ready to Use

Just follow the on-screen instructions — no manual configuration needed.
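The flow above can be read as a strictly ordered pipeline in which each stage must succeed before the next one starts. The following is a minimal sketch of that idea, not Voca's actual implementation: the `Stage` enum, the `run_onboarding` function, and the callback-per-stage design are all hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Callable, Dict, List


class Stage(Enum):
    """Onboarding stages, declared in the order the README lists them."""
    ENVIRONMENT_CHECK = "Environment Check"
    RUNTIME_DOWNLOAD = "Runtime Download"
    MODEL_DOWNLOAD = "Model Download & Verification"
    MODEL_WARMUP = "Model Warm-up"
    READY = "Ready to Use"


def run_onboarding(steps: Dict[Stage, Callable[[], bool]]) -> List[Stage]:
    """Run the stages in order and stop at the first one that fails.

    `steps` maps each working stage to a callable returning True on success.
    READY is appended only when every earlier stage has succeeded.
    """
    completed: List[Stage] = []
    for stage in Stage:  # Enum iteration follows definition order
        if stage is Stage.READY:
            completed.append(stage)
            break
        step = steps.get(stage)
        if step is None or not step():
            break
        completed.append(stage)
    return completed
```

If the model download step reported a failure, `run_onboarding` would return only the first two stages, which a UI could use to resume from the failed stage on the next attempt.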

Features

Speech Generation Workspace

Enter text, select a model and voice, and generate high-quality speech with one click. Generation tasks are queued, so you can submit several requests without waiting for earlier ones to finish.

Adjustable generation parameters:

Parameter	Description
CFG Scale	Controls generation guidance strength
Inference Steps	Balance between quality and speed
Seed	Fix seed for reproducible results, or randomize
Text Normalization	Automatically handles numbers, abbreviations, etc.
Post-Processing Denoise	Removes background noise after generation
Extreme Clone Mode	Uses reference audio transcription to improve voice cloning fidelity
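To make the table concrete, here is one way the parameters could be grouped into a single settings object. This is a sketch under stated assumptions: the class name, field names, default values, and validation rules are all hypothetical and do not reflect Voca's or VoxCPM's real defaults.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationParams:
    # Every name and default below is an illustrative assumption.
    cfg_scale: float = 2.0           # guidance strength
    inference_steps: int = 10        # more steps: slower, usually higher quality
    seed: Optional[int] = None       # None means "randomize on each run"
    normalize_text: bool = True      # expand numbers, abbreviations, etc.
    denoise: bool = False            # post-processing denoise
    extreme_clone: bool = False      # use the reference audio transcript

    def resolved_seed(self) -> int:
        """Return the fixed seed, or draw a fresh 32-bit seed when none is set."""
        if self.seed is not None:
            return self.seed
        return random.randrange(2**32)

    def validate(self) -> None:
        """Reject values that cannot describe a meaningful run."""
        if self.cfg_scale < 0:
            raise ValueError("cfg_scale must not be negative")
        if self.inference_steps < 1:
            raise ValueError("inference_steps must be at least 1")
```

Storing the seed as `None` until generation time keeps "randomize" and "fixed" as one field; logging the value returned by `resolved_seed()` is what would make a random run reproducible later.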

Voice Library

Manage preset and custom voices. To create a custom voice, upload a reference audio clip; the built-in SenseVoice ASR engine transcribes it automatically, and you can edit the transcript by hand.

Generation History

View all task statuses (queued / generating / completed / failed / cancelled). Completed tasks can be played back and exported as audio files.
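The five statuses above suggest a small state machine. The sketch below encodes one plausible set of transitions; the transition table is an assumption made for illustration (for example, that a task can be cancelled while queued or generating), and the names `TaskStatus`, `ALLOWED`, `advance`, and `can_export` are hypothetical.

```python
from enum import Enum


class TaskStatus(Enum):
    QUEUED = "queued"
    GENERATING = "generating"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumed transitions: the three end states have no outgoing edges.
ALLOWED = {
    TaskStatus.QUEUED: {TaskStatus.GENERATING, TaskStatus.CANCELLED},
    TaskStatus.GENERATING: {
        TaskStatus.COMPLETED,
        TaskStatus.FAILED,
        TaskStatus.CANCELLED,
    },
    TaskStatus.COMPLETED: set(),
    TaskStatus.FAILED: set(),
    TaskStatus.CANCELLED: set(),
}


def advance(current: TaskStatus, new: TaskStatus) -> TaskStatus:
    """Return `new` if the move is allowed, otherwise raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {new.value}")
    return new


def can_export(status: TaskStatus) -> bool:
    """Only completed tasks can be played back and exported."""
    return status is TaskStatus.COMPLETED
```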

Model Management

The built-in model catalog can download from Hugging Face or ModelScope and automatically recommends the better source for your network. It manages both TTS models and auxiliary models (ASR, audio enhancement).
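How Voca chooses a source is not described here. One simple approach, shown purely as an assumption, is to time a request to each host and prefer the fastest one that responds. The function names, the probe URLs, and the timeout are hypothetical; the probe is passed in as a parameter so the selection logic can be exercised without network access.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Dict, Optional

# Hosts to probe. These are the public site roots, used here only as
# an assumed reachability check.
SOURCES: Dict[str, str] = {
    "huggingface": "https://huggingface.co",
    "modelscope": "https://www.modelscope.cn",
}


def probe_latency(url: str, timeout: float = 3.0) -> Optional[float]:
    """Return the seconds taken to get any HTTP response, or None on a network error."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            pass
    except urllib.error.HTTPError:
        pass  # the server answered with an error status, so it is reachable
    except OSError:
        return None  # DNS failure, refused connection, timeout, etc.
    return time.monotonic() - start


def recommend_source(
    probe: Callable[[str], Optional[float]] = probe_latency,
) -> Optional[str]:
    """Pick the reachable source with the lowest latency, or None if none respond."""
    timings = {name: probe(url) for name, url in SOURCES.items()}
    reachable = {name: t for name, t in timings.items() if t is not None}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)
```

A single latency probe is a rough signal; measuring throughput on a small file would be a more faithful estimate of download speed, at the cost of extra traffic.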

In-App Update Check

Check for new versions in Settings. When an update is available, the app opens the corresponding Release page for download.
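An update check of this kind comes down to comparing the installed version with the latest release tag. The sketch below shows a numeric comparison, assuming plain dotted tags such as `v1.2.3`; the function names are hypothetical, and tags with pre-release suffixes (for example `1.0.0-beta`) are not handled.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v1.2.3' or '1.2.3' into (1, 2, 3)."""
    return tuple(int(part) for part in tag.lstrip("vV").split("."))


def update_available(current: str, latest: str) -> bool:
    """Return True when `latest` is strictly newer than `current`."""
    cur, new = parse_version(current), parse_version(latest)
    # Pad the shorter tuple with zeros so '1.2' and '1.2.0' compare equal.
    width = max(len(cur), len(new))
    cur += (0,) * (width - len(cur))
    new += (0,) * (width - len(new))
    return new > cur
```

Comparing integer tuples rather than strings matters once a component reaches two digits: as strings, "0.10.0" sorts before "0.9.0".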

Tech Stack

Layer	Technology
Desktop Framework	Tauri 2 (Rust)
Frontend	React 19 + TypeScript + Vite
Inference Service	Python (FastAPI + Uvicorn) sidecar
Speech Engine	VoxCPM
Runtime	Python 3.11+
Platform	macOS 14.0+ (Apple Silicon)

Roadmap

Upcoming development directions. Priorities may shift based on community feedback.

Lighter inference backend — Migrate ASR from PyTorch/FunASR to ONNX Runtime, significantly reducing app size and model download size
Quantized model support — INT8 and other quantized inference to lower memory and disk usage
Richer TTS capabilities — Support for more TTS models and expanded speech synthesis features
Windows support

Have ideas or suggestions? Let us know via Issues.

Contributing

Note: Voca is still in its early stages. The engineering experience (build process, developer docs, code structure, etc.) may not be fully polished yet. If you run into any issues while using or developing, we'd love for you to open an Issue or contribute directly — let's make it better together.

Ways to get involved:

Submit bug reports or feature requests → Issues
Submit code improvements → Pull Request
Improve documentation or translations

Known Limitations

Currently macOS (Apple Silicon) only; Windows support is planned
First launch requires an internet connection to download models (~1–2 GB); fully offline after that
Voice cloning quality depends heavily on reference audio quality — clean audio with no background noise is recommended

Acknowledgments

VoxCPM — Speech synthesis engine
Tauri — Desktop application framework
SenseVoice — Speech recognition model
Model: Claude Opus 4.6 & GPT-5.4

License

This project is licensed under the Apache License 2.0.
