A local podcast monitoring and transcription system that:
- Monitors RSS feeds for new podcast episodes
- Downloads new episodes automatically
- Transcribes audio using either OpenAI Whisper API or local mlx-whisper
- Summarizes content using either OpenAI GPT-4 or local LLMs via Ollama
- Stores metadata in SQLite
- Generates daily feed summaries
- Ensure you have Python 3.x installed
- Install ffmpeg (required for audio processing):
brew install ffmpeg- Install and set up Ollama (required only for local summarization):
# Install Ollama
brew install ollama
# Start Ollama server
ollama serve
# In a new terminal, pull the model
ollama pull qwen2.5:3b
# Verify the model is working
ollama run qwen2.5:3b "Hello, how are you?"- Set up Python environment:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt- Configure environment variables (if using OpenAI):
cp .env.example .env
# Edit .env and add your OpenAI API key if using OpenAI servicesThe application can be configured through config.py. Key settings include:
AUDIO_STORAGE_PATH: Where to store downloaded podcasts (default:~/Podcasts)TRANSCRIPT_STORAGE_PATH: Where to store transcripts (default:~/Podcasts/Transcripts)
PODCAST_FEEDS: List of RSS feed URLs to monitorMAX_EPISODES_PER_FEED: Maximum number of episodes to pull from each feed (default: 5)
TRANSCRIPTION_MODE: Choose between "local" or "openai"- "local": Uses mlx-whisper locally (free, no API key needed)
- "openai": Uses OpenAI's Whisper API (requires API key)
WHISPER_MODEL: Model to use for local transcription (default: "mlx-community/distil-whisper-large-v3")
SUMMARIZATION_MODE: Choose between "local" or "openai"- "local": Uses Ollama locally (free, no API key needed)
- "openai": Uses OpenAI's GPT-4 (requires API key)
OPENAI_SUMMARY_MODEL: Model to use for OpenAI summarizationOLLAMA_MODEL: Model to use for local summarization (default: "qwen2.5:3b")OLLAMA_URL: URL for Ollama server (default: "http://localhost:11434")
CHECK_INTERVAL_MINUTES: How often to check feeds (default: 60)RETAIN_DAYS: How many days of history to keep (default: 30)
- Configure your settings in
config.py - If using local summarization, ensure Ollama is running:
# Start Ollama in a separate terminal if not already running
ollama serve- Run the services:
To run the background service that monitors feeds and processes episodes:
python main.pyThe background service will:
- Check configured RSS feeds
- Download new episodes
- Generate transcripts (using chosen transcription method)
- Create summaries (using chosen summarization method)
- Save all metadata to a local SQLite database
Audio files are stored in ~/Podcasts/ by default
Transcripts are stored in ~/Podcasts/Transcripts/
To access the content through a web interface:
python web_server.pyThen open your browser to http://localhost:8000
This project uses OpenLIT to track observability metrics. To enable, set the OTEL_EXPORTER_OTLP_ENDPOINT in your .env file or set the environment variable in your shell.
export OTEL_EXPORTER_OTLP_ENDPOINT=http://127.0.0.1:4318Clone the OpenLIT repo:
git clone git@github.com:openlit/openlit.gitStart Docker Compose:
docker compose up -d