VoiceNotes

A voice note application with transcription, semantic search, and weekly digests.

Features

Voice Recording: Record voice notes directly in the browser
Automatic Transcription: Uses Whisper large-v3 for accurate transcription
Speaker Diarization: Identify different speakers in recordings (optional, uses pyannote.audio)
Semantic Search: Find notes by meaning using sentence-transformers + ChromaDB
Weekly Digest: See top topics and activity stats for each week
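
As a rough illustration of the weekly digest idea, the sketch below groups notes by ISO week and reports a note count and the most frequent topics for each week. It is not the project's actual implementation: the function name `weekly_digest`, the field names `created_at` and `topics`, and the choice of ISO weeks are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def weekly_digest(notes, top_n=3):
    """Group notes by ISO week; return note counts and top topics per week.

    Each note is a dict with hypothetical fields:
      created_at: ISO 8601 timestamp string
      topics:     list of topic strings
    """
    weeks = defaultdict(lambda: {"note_count": 0, "topics": Counter()})
    for note in notes:
        iso = datetime.fromisoformat(note["created_at"]).isocalendar()
        key = f"{iso[0]}-W{iso[1]:02d}"  # ISO year and ISO week number
        weeks[key]["note_count"] += 1
        weeks[key]["topics"].update(note["topics"])
    return {
        key: {
            "note_count": week["note_count"],
            "top_topics": [t for t, _ in week["topics"].most_common(top_n)],
        }
        for key, week in sorted(weeks.items())
    }


sample_notes = [
    {"created_at": "2024-01-01T09:00:00", "topics": ["budget", "hiring"]},
    {"created_at": "2024-01-03T14:30:00", "topics": ["budget"]},
    {"created_at": "2024-01-08T10:00:00", "topics": ["roadmap"]},
]
digest = weekly_digest(sample_notes)
```

Here the first two notes fall in the same ISO week and the third in the following one, so `digest` has two entries.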

Tech Stack

Layer	Technology
Frontend	React + Vite, TailwindCSS
Backend	Python FastAPI
Transcription	OpenAI Whisper (local, large-v3)
Diarization	pyannote.audio (local, speaker-diarization-3.1)
Embeddings	sentence-transformers (all-MiniLM-L6-v2)
Vector Search	ChromaDB
Database	SQLite
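
To show what the embeddings and vector search layers do conceptually, here is a minimal pure-Python sketch that ranks notes by cosine similarity to a query vector. The 3-dimensional vectors, note ids, and function names are made up for illustration; in the app the vectors would come from all-MiniLM-L6-v2 (384 dimensions) and the lookup would be handled by ChromaDB. This README does not say which distance metric the app configures, so cosine similarity is an assumption here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_notes(query_vec, note_vecs, top_k=2):
    """Return the top_k (note_id, score) pairs, most similar first."""
    scored = [
        (note_id, cosine_similarity(query_vec, vec))
        for note_id, vec in note_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for real model output (hypothetical values).
toy_note_vecs = {
    "groceries": [1.0, 0.0, 0.0],
    "standup": [0.0, 1.0, 0.0],
    "shopping-list": [0.9, 0.1, 0.0],
}
results = rank_notes([1.0, 0.0, 0.0], toy_note_vecs)
```

With these toy values, the two notes whose vectors point in nearly the same direction as the query rank ahead of the unrelated one.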

Getting Started

Prerequisites

Python 3.10+
Node.js 18+
uv (Python package manager)
~10GB RAM for Whisper large-v3

Backend Setup

cd backend

# Install dependencies with uv
uv sync

# Run the server
uv run uvicorn main:app --reload --host 0.0.0.0 --port 8000

The first run downloads the Whisper large-v3 model (~3GB) and the sentence-transformers model (~90MB).

Frontend Setup

cd frontend

# Install dependencies
npm install

# Run dev server
npm run dev

Open http://localhost:5173 in your browser.

Speaker Diarization (Optional)

To enable speaker identification in recordings:

1. Create a free account at https://huggingface.co
2. Accept the model terms at https://huggingface.co/pyannote/speaker-diarization-3.1
3. Get your token at https://huggingface.co/settings/tokens
4. Set the environment variable:

export HF_TOKEN=your_huggingface_token

If HF_TOKEN is not set, transcription will still work but without speaker labels.
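
The fallback described above could be sketched as follows. This is a guess at the shape of the check, not the code in diarization.py: the function names `get_hf_token` and `transcription_mode` and the mode strings are hypothetical.

```python
import os


def get_hf_token(env=None):
    """Return the Hugging Face token, or None if HF_TOKEN is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def transcription_mode(env=None):
    """Pick a processing mode: diarization only runs when a token is available."""
    if get_hf_token(env):
        return "transcribe+diarize"
    return "transcribe-only"  # transcription still works, without speaker labels
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment, for example `transcription_mode({"HF_TOKEN": "hf_example"})`.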

API Endpoints

Endpoint	Method	Description
`/api/notes`	GET	List all notes
`/api/notes`	POST	Upload new voice note
`/api/notes/{id}`	GET	Get single note
`/api/notes/{id}`	PUT	Update note
`/api/notes/{id}`	DELETE	Delete note
`/api/notes/{id}/audio`	GET	Get audio file
`/api/search`	GET	Semantic search
`/api/digest`	GET	Weekly digest

Project Structure

voice-note/
├── backend/
│   ├── main.py              # FastAPI app
│   ├── transcription.py     # Whisper integration
│   ├── diarization.py       # Speaker diarization with pyannote
│   ├── embeddings.py        # Sentence transformers + ChromaDB
│   ├── topics.py            # Topic extraction for digest
│   ├── database.py          # SQLite models
│   └── pyproject.toml       # uv dependencies
├── frontend/
│   ├── src/
│   │   ├── components/      # React components
│   │   ├── pages/           # Route pages
│   │   ├── api.ts           # API client
│   │   └── App.tsx          # Main app
│   └── package.json
└── data/                    # Audio files & databases

License

MIT
