vineet-codes/voice-note

VoiceNotes

A voice note application with transcription, semantic search, and weekly digests.

Features

  • Voice Recording: Record voice notes directly in the browser
  • Automatic Transcription: Uses Whisper large-v3 for accurate transcription
  • Speaker Diarization: Identify different speakers in recordings (optional, uses pyannote.audio)
  • Semantic Search: Find notes by meaning using sentence-transformers + ChromaDB
  • Weekly Digest: See top topics and activity stats for each week
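To illustrate what "find notes by meaning" amounts to, here is a minimal sketch of similarity ranking over embedding vectors. It is not the project's code: the toy 3-dimensional vectors, the note ids, and the `rank_notes` helper are all hypothetical, and the pure-Python cosine similarity stands in for the lookup that ChromaDB would perform over sentence-transformers embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_notes(query_vec, note_vecs):
    """Return note ids ordered from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), note_id)
        for note_id, vec in note_vecs.items()
    ]
    scored.sort(reverse=True)
    return [note_id for _, note_id in scored]


# Hypothetical toy embeddings; all-MiniLM-L6-v2 produces 384-dimensional vectors.
notes = {
    "groceries": [0.9, 0.1, 0.0],
    "standup": [0.1, 0.9, 0.2],
    "workout": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

ranking = rank_notes(query, notes)
print(ranking)
```

In a real setup the query text and each transcript would be embedded by the same model, so that notes with related meaning end up close together even when they share no words.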

Tech Stack

Layer          Technology
Frontend       React + Vite, TailwindCSS
Backend        Python FastAPI
Transcription  OpenAI Whisper (local, large-v3)
Diarization    pyannote.audio (local, speaker-diarization-3.1)
Embeddings     sentence-transformers (all-MiniLM-L6-v2)
Vector Search  ChromaDB
Database       SQLite

Getting Started

Prerequisites

  • Python 3.10+
  • Node.js 18+
  • uv (Python package manager)
  • ~10GB RAM for Whisper large-v3

Backend Setup

cd backend

# Install dependencies with uv
uv sync

# Run the server
uv run uvicorn main:app --reload --host 0.0.0.0 --port 8000

The first run will download the Whisper model (~3GB) and sentence-transformer model (~90MB).

Frontend Setup

cd frontend

# Install dependencies
npm install

# Run dev server
npm run dev

Open http://localhost:5173 in your browser.

Speaker Diarization (Optional)

To enable speaker identification in recordings:

  1. Create a free account at https://huggingface.co
  2. Accept model terms at https://huggingface.co/pyannote/speaker-diarization-3.1
  3. Get your token at https://huggingface.co/settings/tokens
  4. Set the environment variable:
export HF_TOKEN=your_huggingface_token

If HF_TOKEN is not set, transcription will still work but without speaker labels.
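The fallback described above can be sketched as a simple environment check. This is an illustration under assumptions, not the code in `diarization.py`: the function names `diarization_enabled` and `describe_mode` are hypothetical, and only the `HF_TOKEN` variable name comes from this README.

```python
import os


def diarization_enabled(env=None):
    """Return True when a non-empty HF_TOKEN is present.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


def describe_mode(env=None):
    """Summarize which pipeline would run for the given environment."""
    if diarization_enabled(env):
        return "transcription with speaker labels"
    return "transcription only"


print(describe_mode())
```

Treating a blank token the same as a missing one avoids attempting a model download that would fail authentication.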

API Endpoints

Endpoint               Method  Description
/api/notes             GET     List all notes
/api/notes             POST    Upload new voice note
/api/notes/{id}        GET     Get single note
/api/notes/{id}        PUT     Update note
/api/notes/{id}        DELETE  Delete note
/api/notes/{id}/audio  GET     Get audio file
/api/search            GET     Semantic search
/api/digest            GET     Weekly digest
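A short sketch of how a client might build requests for these endpoints with the Python standard library. The paths and methods come from the table; the `q` query parameter name and the helper function names are assumptions, since this README does not document request parameters. The base URL matches the port used in Backend Setup.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # port from the uvicorn command above


def search_request(query, base_url=BASE_URL):
    """Build a GET request for /api/search.

    The parameter name "q" is an assumption; check main.py for the real one.
    """
    params = urlencode({"q": query})
    return Request(f"{base_url}/api/search?{params}", method="GET")


def delete_note_request(note_id, base_url=BASE_URL):
    """Build a DELETE request for /api/notes/{id}."""
    return Request(f"{base_url}/api/notes/{note_id}", method="DELETE")


req = search_request("budget meeting")
print(req.get_method(), req.full_url)

# Sending the request requires the backend to be running:
# from urllib.request import urlopen
# with urlopen(req) as resp:
#     print(resp.read().decode())
```

FastAPI also serves interactive documentation at /docs by default, which is the quickest way to confirm the actual parameter names and response shapes.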

Project Structure

voice-note/
├── backend/
│   ├── main.py              # FastAPI app
│   ├── transcription.py     # Whisper integration
│   ├── diarization.py       # Speaker diarization with pyannote
│   ├── embeddings.py        # Sentence transformers + ChromaDB
│   ├── topics.py            # Topic extraction for digest
│   ├── database.py          # SQLite models
│   └── pyproject.toml       # uv dependencies
├── frontend/
│   ├── src/
│   │   ├── components/      # React components
│   │   ├── pages/           # Route pages
│   │   ├── api.ts           # API client
│   │   └── App.tsx          # Main app
│   └── package.json
└── data/                    # Audio files & databases
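The tree lists `topics.py` as topic extraction for the digest. As a rough idea of what a weekly aggregation could look like, here is a sketch that groups notes by ISO week and counts topics. The input shape (a creation date plus a list of topic strings per note) and the `weekly_digest` function are hypothetical, not taken from the repository.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_digest(notes, top_n=3):
    """Group (created_date, topics) pairs by ISO week.

    Returns {"YYYY-Www": {"note_count": int, "top_topics": [str, ...]}}.
    """
    weeks = defaultdict(lambda: {"note_count": 0, "topics": Counter()})
    for created, topics in notes:
        iso = created.isocalendar()  # (ISO year, ISO week, ISO weekday)
        key = f"{iso[0]}-W{iso[1]:02d}"
        weeks[key]["note_count"] += 1
        weeks[key]["topics"].update(topics)
    return {
        key: {
            "note_count": data["note_count"],
            "top_topics": [t for t, _ in data["topics"].most_common(top_n)],
        }
        for key, data in weeks.items()
    }


# Hypothetical sample data
sample = [
    (date(2024, 1, 1), ["budget", "hiring"]),
    (date(2024, 1, 3), ["budget"]),
    (date(2024, 1, 8), ["travel"]),
]
digest = weekly_digest(sample)
print(digest)
```

Keying on the ISO year and week rather than the calendar year keeps notes from late December and early January in the same bucket when they fall in the same week.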

License

MIT
