A high-performance, local voice assistant with real-time transcription, LLM reasoning, and text-to-speech. Runs fully offline after setup and features Sesame CSM for expressive speech synthesis. Real-time factor: 0.6x on an NVIDIA RTX 4070 Ti Super.
- Real-time Speech-to-Text using `distil-whisper`
- On-device LLM using Llama 3.2 1B
- Natural TTS via Sesame CSM (`senstella/csm-expressiva-1b`)
- Desktop GUI with Tauri/React
- Conversation history and speaking animations
- GPU acceleration with CUDA
- Modular Docker-based backend
- Frontend: Tauri 2.5.1, React 18+, TypeScript
- Backend: Python 3.10, FastAPI, Uvicorn
- Models: `distil-whisper` (large-v3.5), Llama 3.2 1B (GGUF), Sesame CSM
- NVIDIA GPU: 8GB+ VRAM
- 32GB RAM
- Docker Desktop
- NVIDIA GPU Drivers (CUDA 12.1+)
- NVIDIA Container Toolkit
- Node.js & npm (v18+)
- Rust & Cargo
- Hugging Face access to Llama 3.2 1B
- Prerequisites:
  - Install Docker Desktop and ensure it is running
  - Install Rust, Tauri, and the NVIDIA Container Toolkit (a quick verification sketch follows this list)
  - Request access to Llama 3.2 1B on Hugging Face
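Before moving on, it can help to confirm the host toolchain and GPU passthrough are working. This is only a sketch; the CUDA image tag below is an example and is not something this repo ships.

```bash
# Confirm the host toolchain is installed
docker --version
node --version && npm --version
cargo --version

# Confirm the NVIDIA Container Toolkit exposes the GPU to containers.
# The CUDA image tag is just an example; any CUDA 12.1+ base image works.
docker run --rm --gpus all nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi
```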
- Configuration:
  - Edit the `.env` file and set `HUGGING_FACE_TOKEN=hf_yourTokenHere` (example below)
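A minimal example of the resulting `.env`. Only the token variable comes from the step above; any other settings the project may expect are not shown here.

```bash
# .env — use your own token from https://huggingface.co/settings/tokens
HUGGING_FACE_TOKEN=hf_yourTokenHere
```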
- Backend:
  - Build: `docker compose build`
  - Run: `docker compose up -d` (see the health-check sketch after this list)
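Once the containers are up, a quick way to confirm the backend is healthy. The `docker compose` commands are the same ones used elsewhere in this README; the port and path in the `curl` line are assumptions (Uvicorn's default is 8000) and may differ from what this repo's compose file maps.

```bash
# List services and confirm the backend container is running
docker compose ps

# Follow backend logs while models download on first start
docker compose logs -f

# Assumption: the FastAPI app listens on localhost:8000 (Uvicorn default);
# adjust the port if docker-compose.yml maps a different one.
curl -s http://localhost:8000/docs > /dev/null && echo "backend reachable"
```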
- Frontend:
  - Install dependencies: `cd frontend && npm install && npm install uuid`
  - Start: `npm run tauri dev` (a production-build sketch follows this list)
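For a distributable desktop build rather than the dev server, Tauri's standard build command applies. This assumes `frontend/package.json` wires up the usual `tauri` script, which this README does not state explicitly.

```bash
# Assumption: frontend/package.json defines the standard "tauri" script.
cd frontend
npm run tauri build   # bundles typically land under src-tauri/target/release/bundle
```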
- Add your Hugging Face token and request access to the models (links to be added)
- Build backend: `docker compose build`
- Start backend: `docker compose up -d`
- Build frontend: `cd frontend && npm install && npm install uuid`
- Start frontend: `cd frontend && npm run tauri dev`
- View logs: `docker compose logs -f`
- Stop: `docker compose down`
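The same steps collected into one copy-pasteable sequence, run from the repository root and assuming `HUGGING_FACE_TOKEN` is already set in `.env`:

```bash
# One-shot setup and launch from the repository root.
# Assumes HUGGING_FACE_TOKEN is already set in .env (see Configuration above).
docker compose build             # build the backend images
docker compose up -d             # start the backend
cd frontend
npm install && npm install uuid  # install frontend dependencies
npm run tauri dev                # launch the desktop app
```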