Voice-controlled academic paper reader - fetches from arXiv, summarizes with AI, reads aloud with TTS.
- arXiv Integration: Fetches papers based on your topics
- AI Summarization: Multiple summary levels (brief, standard, detailed, technical)
- Voice Control: Hands-free navigation with Vosk
- Text-to-Speech: Natural voice output with VibeVoice
- SQLite Storage: Persistent queue, history, and saved papers
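The arXiv integration above can be sketched against arXiv's public Atom API. The helper below is illustrative only (it is not PaperCast's actual scraper code); it builds a query URL that OR-combines topics and sorts by submission date, which is how the arXiv API expresses "recent papers on my topics":

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_arxiv_query(topics, max_results=50):
    """Build an arXiv API query URL for a list of topic strings.

    Topics are OR-combined as all-field searches, newest first.
    """
    search = " OR ".join(f'all:"{t}"' for t in topics)
    params = {
        "search_query": search,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

Fetching the resulting URL returns an Atom feed that can be parsed with any XML library.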
```bash
# Clone the repository
git clone <repository-url>
cd Papercast

# Create virtual environment
python -m venv .venv

# Activate virtual environment
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"
```

PaperCast supports real AI backends for summarization and TTS. These are optional; you can use mock backends for testing.
Requires a GitHub Copilot subscription:

- Install Copilot CLI: Follow the official guide
- Verify installation: Run `copilot --version` to confirm it's in your PATH
- The `github-copilot-sdk` Python package is included in PaperCast dependencies
Requires manual installation from source (GPU recommended):

```bash
# Clone VibeVoice repository
git clone https://github.com/microsoft/VibeVoice.git

# Install with TTS dependencies
pip install -e "./VibeVoice[tts]"

# Note: Flash Attention 2 is NOT supported on Windows.
# On Linux with CUDA, you can optionally install flash-attention for better performance:
# pip install flash-attn --no-build-isolation
# On Windows, PaperCast automatically uses SDPA (Scaled Dot Product Attention) instead.
```

VibeVoice voice presets are in `VibeVoice/demo/voices/streaming_model/`. Available English voices include:

- `en-Carter_man`, `en-Davis_man`, `en-Frank_man`, `en-Mike_man` (male)
- `en-Emma_woman`, `en-Grace_woman` (female)

You can use short names like `carter` or `emma`; they will be automatically mapped to the full preset name.
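The short-name mapping can be sketched as follows. This is a hypothetical illustration of the behavior described above, not PaperCast's actual resolver:

```python
# Hypothetical sketch of how short voice names ("carter", "emma") could map
# to full VibeVoice preset names; PaperCast's real mapping may differ.
VOICE_PRESETS = [
    "en-Carter_man", "en-Davis_man", "en-Frank_man",
    "en-Mike_man", "en-Emma_woman", "en-Grace_woman",
]

def resolve_voice(name: str) -> str:
    """Resolve a short voice name to its full preset name."""
    if name in VOICE_PRESETS:
        return name
    for preset in VOICE_PRESETS:
        # "en-Carter_man" -> "carter"
        short = preset.split("-", 1)[1].split("_", 1)[0].lower()
        if name.lower() == short:
            return preset
    raise ValueError(f"Unknown voice: {name}")
```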
PaperCast uses Vosk for offline voice recognition. You need to download a model:

```bash
# Create models directory
mkdir -p models

# Download a small English model (~50MB)
# Option 1: Using curl
curl -L -o models/vosk-model-small-en-us-0.15.zip \
  https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip models/vosk-model-small-en-us-0.15.zip -d models/
rm models/vosk-model-small-en-us-0.15.zip
```

```powershell
# Option 2: Using PowerShell (Windows)
Invoke-WebRequest -Uri "https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip" -OutFile "models/vosk-model-small-en-us-0.15.zip"
Expand-Archive -Path "models/vosk-model-small-en-us-0.15.zip" -DestinationPath "models/"
Remove-Item "models/vosk-model-small-en-us-0.15.zip"
```

For better accuracy, you can use a larger model from: https://alphacephei.com/vosk/models
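After unzipping, a valid model directory contains subfolders such as `am/` and `conf/`. A minimal sketch (illustrative, not PaperCast's actual startup code) for locating a usable model under `models/`:

```python
from pathlib import Path
from typing import Optional

def find_vosk_model(models_dir: str = "models") -> Optional[Path]:
    """Return the first subdirectory that looks like an unzipped Vosk model.

    A valid model directory contains at least 'am' and 'conf' subfolders.
    """
    root = Path(models_dir)
    if not root.is_dir():
        return None
    for candidate in sorted(root.iterdir()):
        if (candidate / "am").is_dir() and (candidate / "conf").is_dir():
            return candidate
    return None
```

A check like this catches the common mistake of pointing `PAPERCAST_VOSK_MODEL_PATH` at the zip file or at `models/` itself rather than the extracted model folder.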
Copy the example config and customize:

```bash
cp .env.example .env
```

Required settings for first run:

```bash
# Set the Vosk model path (required)
PAPERCAST_VOSK_MODEL_PATH=models/vosk-model-small-en-us-0.15

# For testing without real AI summarization, use mock backend:
PAPERCAST_SUMMARIZER_BACKEND=mock

# For testing without real TTS, use mock backend:
PAPERCAST_VOICE_MODEL=mock

# Your research topics
PAPERCAST_TOPICS=["machine learning", "natural language processing"]
```

```bash
# Start the voice-controlled reader
papercast run

# With debug output
papercast run --debug

# Override topics from command line
papercast run --topic "transformers" --topic "large language models"

# Start in daily briefing mode
papercast run --briefing
```

```bash
# Show current configuration
papercast config

# Start voice-controlled reader
papercast run

# Show version
papercast version

# Export saved papers (coming soon)
papercast export --format markdown
```

All settings can be set via environment variables or a `.env` file:
| Variable | Default | Description |
|---|---|---|
| `PAPERCAST_TOPICS` | `["machine learning", "nlp"]` | arXiv search topics (JSON array) |
| `PAPERCAST_FETCH_DAYS` | `7` | Days to look back for papers (1-30) |
| `PAPERCAST_MAX_PAPERS_PER_FETCH` | `50` | Max papers per fetch (1-200) |
| `PAPERCAST_SUMMARIZER_BACKEND` | `copilot` | `copilot` or `mock` |
| `PAPERCAST_DEFAULT_SUMMARY_LEVEL` | `standard` | `brief`, `standard`, `detailed`, `technical` |
| `PAPERCAST_VOICE_MODEL` | `vibevoice` | `vibevoice` or `mock` |
| `PAPERCAST_SPEECH_RATE` | `1.0` | Speech rate (0.5-2.0) |
| `PAPERCAST_WAKE_WORD` | (none) | Optional wake word (e.g., "hey paper") |
| `PAPERCAST_VOSK_MODEL_PATH` | (none) | Path to Vosk model directory |
| `PAPERCAST_DATABASE_PATH` | `papercast.db` | SQLite database location |
| `PAPERCAST_LOG_LEVEL` | `INFO` | `DEBUG`, `INFO`, `WARNING`, `ERROR` |
| `PAPERCAST_DEBUG` | `false` | Enable debug mode |
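To illustrate how such variables resolve, here is a hedged sketch of parsing two of them with their documented defaults and ranges. These helpers are hypothetical; PaperCast's actual loader lives in `src/papercast/config.py` and may use a settings library:

```python
import json
import os

def load_topics(env=None):
    """Parse PAPERCAST_TOPICS as a JSON array, with the documented default."""
    env = os.environ if env is None else env
    raw = env.get("PAPERCAST_TOPICS")
    if raw is None:
        return ["machine learning", "nlp"]
    topics = json.loads(raw)
    if not isinstance(topics, list):
        raise ValueError("PAPERCAST_TOPICS must be a JSON array")
    return topics

def load_speech_rate(env=None):
    """Parse PAPERCAST_SPEECH_RATE, clamped to the documented 0.5-2.0 range."""
    env = os.environ if env is None else env
    rate = float(env.get("PAPERCAST_SPEECH_RATE", "1.0"))
    return min(2.0, max(0.5, rate))
```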
During playback, you can say:
| Command | Aliases | Action |
|---|---|---|
| next | skip, next paper | Go to next paper |
| back | previous, go back | Go to previous paper |
| pause | wait, hold on | Pause playback |
| resume | continue, go on | Resume playback |
| stop | - | Stop playback |
| save | bookmark, save this | Save paper to reading list |
| details | more details, tell me more | Re-read with detailed summary |
| brief | short, summary | Re-read with brief summary |
| repeat | again, read again | Repeat current paper |
| faster | speed up | Increase speech rate |
| slower | slow down | Decrease speech rate |
| search [topic] | find papers about... | Search for papers on topic |
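Resolution of the aliases in the table above might look like the following sketch. The function and table names are hypothetical, not PaperCast's actual voice-command parser:

```python
# Hypothetical alias table mirroring the voice command table above.
ALIASES = {
    "next": ["skip", "next paper"],
    "back": ["previous", "go back"],
    "pause": ["wait", "hold on"],
    "resume": ["continue", "go on"],
    "stop": [],
    "save": ["bookmark", "save this"],
    "details": ["more details", "tell me more"],
    "brief": ["short", "summary"],
    "repeat": ["again", "read again"],
    "faster": ["speed up"],
    "slower": ["slow down"],
}

def parse_command(utterance: str):
    """Map a recognized utterance to a (command, argument) pair."""
    text = utterance.strip().lower()
    # "search [topic]" carries an argument, so handle it as a prefix match.
    if text.startswith("search ") or text.startswith("find papers about "):
        topic = text.removeprefix("find papers about ").removeprefix("search ")
        return ("search", topic)
    for command, aliases in ALIASES.items():
        if text == command or text in aliases:
            return (command, None)
    return (None, None)
```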
To test PaperCast without Vosk, VibeVoice, or Copilot:

```bash
# Set all backends to mock in .env
PAPERCAST_SUMMARIZER_BACKEND=mock
PAPERCAST_VOICE_MODEL=mock
```

This uses mock implementations that simulate the real components.

```bash
# Run tests
pytest

# Run tests with coverage
pytest --cov=papercast

# Type checking
mypy src/papercast

# Linting
ruff check .

# Auto-fix linting issues
ruff check . --fix
```

```text
Papercast/
├── src/papercast/
│   ├── orchestrator/   # Central coordinator
│   ├── scraper/        # arXiv paper fetching
│   ├── queue/          # Paper queue management
│   ├── summarizer/     # AI summarization
│   ├── tts/            # Text-to-speech
│   ├── voice/          # Voice command recognition
│   ├── storage/        # SQLite persistence
│   ├── config.py       # Settings management
│   ├── models.py       # Data models
│   └── main.py         # CLI entry point
├── tests/              # Test suite
├── models/             # Vosk models (download separately)
├── .env                # Your configuration
└── .env.example        # Configuration template
```
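One common way to wire in swappable mock backends is a small protocol plus a factory keyed by the backend name. This is an illustrative sketch only; PaperCast's actual interfaces live under `src/papercast/summarizer/` and may differ:

```python
from typing import Protocol

class Summarizer(Protocol):
    def summarize(self, abstract: str, level: str) -> str: ...

class MockSummarizer:
    """Stand-in summarizer: returns canned text instead of calling an AI backend."""
    def summarize(self, abstract: str, level: str) -> str:
        return f"[mock {level} summary] {abstract[:60]}"

def make_summarizer(backend: str) -> Summarizer:
    """Select a backend by name, as PAPERCAST_SUMMARIZER_BACKEND does."""
    if backend == "mock":
        return MockSummarizer()
    raise ValueError(f"Unsupported backend in this sketch: {backend}")
```

Because the rest of the pipeline depends only on the `Summarizer` protocol, tests can run end to end with the mock while the real backend is swapped in via configuration.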
You need to download a Vosk model. See Download Vosk Model above.

Check that `PAPERCAST_VOSK_MODEL_PATH` points to the correct model directory (the folder containing `am/`, `conf/`, etc.).

Make sure you have audio output devices configured. For testing, use `PAPERCAST_VOICE_MODEL=mock`.

The Copilot backend requires the GitHub Copilot CLI to be installed and in your PATH. Install it from: https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli

For testing without Copilot, use `PAPERCAST_SUMMARIZER_BACKEND=mock`.

VibeVoice must be installed from source:

```bash
git clone https://github.com/microsoft/VibeVoice.git
pip install -e "./VibeVoice[tts]"
```

For testing without VibeVoice, use `PAPERCAST_VOICE_MODEL=mock`.
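A quick programmatic way to confirm an executable is on your PATH (a sketch; the name `copilot` comes from the troubleshooting note above):

```python
import shutil

def cli_available(executable: str = "copilot") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None
```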
MIT