EnjoyCloudDev/Voca

Voca — Your Local Voice Clone Assistant

A local-first desktop app for voice cloning. Download it and start using it: high-quality speech synthesis and voice cloning run entirely on your machine.


Screenshots

[Screenshots: Voice Studio, Settings]

Highlights

  • Fully Offline — After model download, all inference runs locally with no network required and no privacy concerns
  • Zero Configuration — First launch automatically handles environment detection, runtime download, model download & warm-up
  • High-Quality Voice Cloning — Powered by the VoxCPM engine, supporting bilingual (Chinese & English) speech synthesis and voice cloning
  • Fine-Grained Control — Adjustable CFG guidance scale, inference steps, seed, text normalization, post-processing denoising, and more
  • Extreme Clone Mode — Uses reference audio transcription to further improve voice fidelity
  • Built-in ASR — Automatically transcribes reference audio with the SenseVoice ASR engine, with manual editing support
  • Dual Model Sources — Download models from Hugging Face or ModelScope, with automatic source recommendation
  • Bilingual UI — Chinese and English interface

Getting Started

System Requirements

Item       | Requirement
OS         | macOS 14.0 (Sonoma) or later
Chip       | Apple Silicon (M1/M2/M3/M4)
Disk Space | ~6 GB (app + models)

Installation

  1. Go to the Releases page and download the latest .dmg file
  2. Open the DMG and drag Voca into the Applications folder
  3. On first launch, follow the guided setup to download models and start using the app

About App Signing & Notarization

Voca is signed with an Apple Developer ID and has been successfully notarized by Apple, so it is safe to run on macOS.

If you still hit a Gatekeeper warning on first launch (e.g. "Voca" cannot be opened, "Voca is damaged and can't be opened", or "cannot verify the developer"), it's usually because macOS has attached a quarantine attribute to files downloaded via the browser. You can remove the quarantine flag by running the following command in Terminal:

sudo xattr -dr com.apple.quarantine /Applications/Voca.app

Then reopen Voca. Alternatively, open System Settings → Privacy & Security and click Open Anyway.

First Launch

Voca includes a complete onboarding flow:

Environment Check → Runtime Download → Model Download & Verification → Model Warm-up → Ready to Use

Just follow the on-screen instructions — no manual configuration needed.
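The onboarding flow can be pictured as an ordered pipeline in which each stage must succeed before the next one starts. The sketch below is illustrative only: the stage names come from the flow described here, while `run_onboarding` and the per-stage callables are hypothetical and are not taken from Voca's code.

```python
from typing import Callable, Dict, List

# Stage names taken from the onboarding flow; the order matters.
STAGES: List[str] = [
    "Environment Check",
    "Runtime Download",
    "Model Download & Verification",
    "Model Warm-up",
]


def run_onboarding(steps: Dict[str, Callable[[], bool]]) -> str:
    """Run each stage in order and stop at the first one that fails.

    `steps` maps a stage name to a hypothetical callable that returns True
    on success. Returns "Ready to Use" or a message naming the failed stage.
    """
    for name in STAGES:
        if not steps[name]():
            return f"Failed at: {name}"
    return "Ready to Use"


if __name__ == "__main__":
    always_ok = {name: (lambda: True) for name in STAGES}
    print(run_onboarding(always_ok))
```

Stopping at the first failure means a later stage, such as model warm-up, never runs against a missing runtime or an unverified model.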

Features

Speech Generation Workspace

Enter text, select a model and voice, and generate high-quality speech with one click. Tasks are queued, so you can submit several generation requests without waiting for earlier ones to finish.

Adjustable generation parameters:

Parameter               | Description
CFG Scale               | Controls generation guidance strength
Inference Steps         | Balance between quality and speed
Seed                    | Fix seed for reproducible results, or randomize
Text Normalization      | Automatically handles numbers, abbreviations, etc.
Post-Processing Denoise | Removes background noise after generation
Extreme Clone Mode      | Uses reference audio transcription to improve voice cloning fidelity
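
To make the parameters listed above concrete, here is a minimal sketch of how they might be grouped and validated before a request is sent to the inference service. The class name, field names, and default values are all assumptions made for illustration. Voca's actual request schema and defaults are not documented here.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationParams:
    """Hypothetical container mirroring the adjustable parameters."""

    cfg_scale: float = 2.0        # guidance strength (default is an assumption)
    inference_steps: int = 10     # more steps: slower, usually higher quality
    seed: Optional[int] = None    # None means "pick a random seed"
    normalize_text: bool = True   # expand numbers, abbreviations, etc.
    denoise: bool = False         # post-processing denoise
    extreme_clone: bool = False   # Extreme Clone Mode
    reference_transcript: Optional[str] = None

    def resolved_seed(self) -> int:
        """Return the fixed seed, or draw a fresh one when none is set."""
        if self.seed is not None:
            return self.seed
        return random.randrange(2**32)

    def validate(self) -> None:
        """Raise ValueError if any parameter is out of range."""
        if self.cfg_scale <= 0:
            raise ValueError("cfg_scale must be positive")
        if self.inference_steps < 1:
            raise ValueError("inference_steps must be at least 1")
        # Extreme Clone Mode relies on the reference audio transcription.
        if self.extreme_clone and not self.reference_transcript:
            raise ValueError("extreme_clone needs a reference transcript")
```

Fixing `seed` makes repeated runs with the same text and voice comparable, which helps when tuning CFG Scale or Inference Steps one at a time.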

Voice Library

Manage preset and custom voices. To create a custom voice, upload reference audio; the built-in SenseVoice ASR engine transcribes it automatically, and you can edit the transcript by hand.

Generation History

View all task statuses (queued / generating / completed / failed / cancelled). Completed tasks can be played back and exported as audio files.
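
The five task statuses suggest a small state machine. The sketch below shows one plausible set of transitions. The status names come from the list above, but the allowed transitions and the helper names are assumptions, since the README does not spell them out.

```python
from enum import Enum


class TaskStatus(Enum):
    QUEUED = "queued"
    GENERATING = "generating"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Assumed transitions: the README lists the states but not the allowed moves.
ALLOWED = {
    TaskStatus.QUEUED: {TaskStatus.GENERATING, TaskStatus.CANCELLED},
    TaskStatus.GENERATING: {
        TaskStatus.COMPLETED,
        TaskStatus.FAILED,
        TaskStatus.CANCELLED,
    },
    # Terminal states have no outgoing transitions.
    TaskStatus.COMPLETED: set(),
    TaskStatus.FAILED: set(),
    TaskStatus.CANCELLED: set(),
}


def transition(current: TaskStatus, new: TaskStatus) -> TaskStatus:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


def can_export(status: TaskStatus) -> bool:
    """Only completed tasks can be played back and exported."""
    return status is TaskStatus.COMPLETED
```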

Model Management

The built-in model catalog supports downloading from Hugging Face or ModelScope and automatically recommends the better source for your network. It manages both TTS models and auxiliary models (ASR, audio enhancement).
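
How the source recommendation works is not described in this README. One simple approach, shown here purely as an assumption, is to measure the response time of each source and pick the fastest reachable one. The function name and the latency inputs are hypothetical.

```python
from typing import Dict, Optional


def recommend_source(latencies_ms: Dict[str, Optional[float]]) -> Optional[str]:
    """Pick the reachable source with the lowest measured latency.

    `latencies_ms` maps a source name to its latency in milliseconds, or to
    None when the source could not be reached. Returns None if no source
    is reachable.
    """
    reachable = {name: ms for name, ms in latencies_ms.items() if ms is not None}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


if __name__ == "__main__":
    # Example measurements (made up for illustration).
    print(recommend_source({"Hugging Face": 850.0, "ModelScope": 120.0}))
```

Keeping the measurement separate from the decision makes the choice easy to test without network access.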

In-App Update Check

Check for new versions in Settings. When an update is available, the app opens the corresponding Release page for download.
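
An update check of this kind usually comes down to comparing the installed version with the latest release tag. The sketch below assumes plain dotted numeric versions such as `v0.2.0`. Voca's actual versioning scheme and comparison logic are not documented here, and both function names are hypothetical.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v1.2.3' or '1.2' into a 3-part tuple such as (1, 2, 3) or (1, 2, 0).

    Pre-release suffixes like '-beta' are not handled in this sketch.
    """
    parts = [int(part) for part in tag.lstrip("vV").split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions with zeros
    return tuple(parts)


def update_available(current: str, latest: str) -> bool:
    """True when the latest release is strictly newer than the installed one."""
    return parse_version(latest) > parse_version(current)
```

Comparing integer tuples rather than strings avoids ranking `1.9.0` above `1.10.0`.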

Tech Stack

Layer             | Technology
Desktop Framework | Tauri 2 (Rust)
Frontend          | React 19 + TypeScript + Vite
Inference Service | Python (FastAPI + Uvicorn) sidecar
Speech Engine     | VoxCPM
Runtime           | Python 3.11+
Platform          | macOS 14.0+ (Apple Silicon)

Roadmap

Planned development directions are listed below; priorities may shift based on community feedback.

  • Lighter inference backend — Migrate ASR from PyTorch/FunASR to ONNX Runtime, significantly reducing app size and model download size
  • Quantized model support — INT8 and other quantized inference to lower memory and disk usage
  • Richer TTS capabilities — Support for more TTS models and expanded speech synthesis features
  • Windows support

Have ideas or suggestions? Let us know via Issues.

Contributing

Note: Voca is still in its early stages. The engineering experience (build process, developer docs, code structure, etc.) may not be fully polished yet. If you run into any issues while using or developing Voca, we'd love for you to open an Issue or contribute directly. Let's make it better together.

Ways to get involved:

  • Submit bug reports or feature requests → Issues
  • Submit code improvements → Pull Request
  • Improve documentation or translations

Known Limitations

  • Currently macOS (Apple Silicon) only; Windows support is planned
  • First launch requires an internet connection to download models (~1–2 GB); fully offline after that
  • Voice cloning quality depends heavily on reference audio quality — clean audio with no background noise is recommended

Acknowledgments

License

This project is licensed under the Apache License 2.0.

