ars-ppi/Papercast

PaperCast

Voice-controlled academic paper reader - fetches from arXiv, summarizes with AI, reads aloud with TTS.

Features

  • arXiv Integration: Fetches papers based on your topics
  • AI Summarization: Multiple summary levels (brief, standard, detailed, technical)
  • Voice Control: Hands-free navigation with Vosk
  • Text-to-Speech: Natural voice output with VibeVoice
  • SQLite Storage: Persistent queue, history, and saved papers

Quick Start

1. Installation

# Clone the repository
git clone <repository-url>
cd Papercast

# Create virtual environment
python -m venv .venv

# Activate virtual environment
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

2. Install AI Backends (Optional)

PaperCast supports real AI backends for summarization and TTS. These are optional - you can use mock backends for testing.

GitHub Copilot (for AI Summarization)

Requires a GitHub Copilot subscription:

  1. Install Copilot CLI: Follow the official guide at https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli
  2. Verify installation: Run copilot --version to confirm it's in your PATH
  3. The github-copilot-sdk Python package is included in PaperCast dependencies

VibeVoice (for Text-to-Speech)

Requires manual installation from source (GPU recommended):

# Clone VibeVoice repository
git clone https://github.com/microsoft/VibeVoice.git

# Install with TTS dependencies
pip install -e "./VibeVoice[tts]"

# Note: Flash Attention 2 is NOT supported on Windows.
# On Linux with CUDA, you can optionally install flash-attention for better performance:
# pip install flash-attn --no-build-isolation
# On Windows, PaperCast automatically uses SDPA (Scaled Dot Product Attention) instead.
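The platform note above can be sketched as a small startup helper that picks an attention implementation. The function name and the `flash_attn` import probe are illustrative, not PaperCast's actual API:

```python
import platform

def pick_attention_impl() -> str:
    """Choose an attention backend: Flash Attention 2 when the optional
    flash-attn package is importable, SDPA otherwise."""
    if platform.system() == "Windows":
        return "sdpa"  # Flash Attention 2 is not supported on Windows
    try:
        import flash_attn  # noqa: F401 -- optional, Linux/CUDA only
        return "flash_attention_2"
    except ImportError:
        return "sdpa"
```

The returned string matches the values that Hugging Face Transformers accepts for its `attn_implementation` argument, which is a common way to wire this kind of selection into model loading.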

VibeVoice voice presets are in VibeVoice/demo/voices/streaming_model/. Available English voices include:

  • en-Carter_man, en-Davis_man, en-Frank_man, en-Mike_man (male)
  • en-Emma_woman, en-Grace_woman (female)

You can use short names like carter or emma - they will be automatically mapped to the full preset name.
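One plausible way the short-name mapping could work, using the presets listed above (PaperCast's actual lookup may differ):

```python
# Short voice names mapped to the full VibeVoice preset names above.
VOICE_PRESETS = {
    "carter": "en-Carter_man",
    "davis": "en-Davis_man",
    "frank": "en-Frank_man",
    "mike": "en-Mike_man",
    "emma": "en-Emma_woman",
    "grace": "en-Grace_woman",
}

def resolve_voice(name: str) -> str:
    """Accept either a short name ("emma") or a full preset name."""
    key = name.strip().lower()
    if key in VOICE_PRESETS:
        return VOICE_PRESETS[key]
    return name  # assume it is already a full preset name
```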

3. Download Vosk Model (Required for Voice Recognition)

PaperCast uses Vosk for offline voice recognition. You need to download a model:

# Create models directory
mkdir -p models

# Download a small English model (~50MB)
# Option 1: Using curl
curl -L -o models/vosk-model-small-en-us-0.15.zip \
  https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip models/vosk-model-small-en-us-0.15.zip -d models/
rm models/vosk-model-small-en-us-0.15.zip

# Option 2: Using PowerShell (Windows)
Invoke-WebRequest -Uri "https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip" -OutFile "models/vosk-model-small-en-us-0.15.zip"
Expand-Archive -Path "models/vosk-model-small-en-us-0.15.zip" -DestinationPath "models/"
Remove-Item "models/vosk-model-small-en-us-0.15.zip"

For better accuracy, you can use a larger model from: https://alphacephei.com/vosk/models
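An unpacked Vosk model directory contains `am/` and `conf/` subdirectories (see Troubleshooting below). A rough sanity check, as an illustrative sketch rather than PaperCast's actual validation:

```python
from pathlib import Path

def looks_like_vosk_model(model_dir: str) -> bool:
    """Return True if model_dir looks like an unpacked Vosk model
    (i.e. it contains am/ and conf/ subdirectories)."""
    p = Path(model_dir)
    return p.is_dir() and (p / "am").is_dir() and (p / "conf").is_dir()
```

Running this against your `PAPERCAST_VOSK_MODEL_PATH` before starting the app can catch the common mistake of pointing at the zip file or the parent `models/` folder instead of the model directory itself.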

4. Configuration

Copy the example config and customize:

cp .env.example .env

Required settings for first run:

# Set the Vosk model path (required)
PAPERCAST_VOSK_MODEL_PATH=models/vosk-model-small-en-us-0.15

# For testing without real AI summarization, use mock backend:
PAPERCAST_SUMMARIZER_BACKEND=mock

# For testing without real TTS, use mock backend:
PAPERCAST_VOICE_MODEL=mock

# Your research topics
PAPERCAST_TOPICS=["machine learning", "natural language processing"]
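Note that PAPERCAST_TOPICS is a JSON array, not a comma-separated list. A sketch of how such a value can be read from the environment, with the documented default as fallback (the helper is illustrative; PaperCast's settings layer may use a library such as pydantic instead):

```python
import json
import os

def read_topics(default=("machine learning", "nlp")) -> list:
    """Read PAPERCAST_TOPICS as a JSON array of strings, falling back
    to the documented default when unset or not valid JSON."""
    raw = os.environ.get("PAPERCAST_TOPICS")
    if not raw:
        return list(default)
    try:
        topics = json.loads(raw)
    except json.JSONDecodeError:
        return list(default)
    return [str(t) for t in topics]
```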

5. Run PaperCast

# Start the voice-controlled reader
papercast run

# With debug output
papercast run --debug

# Override topics from command line
papercast run --topic "transformers" --topic "large language models"

# Start in daily briefing mode
papercast run --briefing

All Commands

# Show current configuration
papercast config

# Start voice-controlled reader
papercast run

# Show version
papercast version

# Export saved papers (coming soon)
papercast export --format markdown

Configuration Options

All settings can be set via environment variables or .env file:

| Variable | Default | Description |
|---|---|---|
| PAPERCAST_TOPICS | ["machine learning", "nlp"] | arXiv search topics (JSON array) |
| PAPERCAST_FETCH_DAYS | 7 | Days to look back for papers (1-30) |
| PAPERCAST_MAX_PAPERS_PER_FETCH | 50 | Max papers per fetch (1-200) |
| PAPERCAST_SUMMARIZER_BACKEND | copilot | copilot or mock |
| PAPERCAST_DEFAULT_SUMMARY_LEVEL | standard | brief, standard, detailed, technical |
| PAPERCAST_VOICE_MODEL | vibevoice | vibevoice or mock |
| PAPERCAST_SPEECH_RATE | 1.0 | Speech rate (0.5-2.0) |
| PAPERCAST_WAKE_WORD | (none) | Optional wake word (e.g., "hey paper") |
| PAPERCAST_VOSK_MODEL_PATH | (none) | Path to Vosk model directory |
| PAPERCAST_DATABASE_PATH | papercast.db | SQLite database location |
| PAPERCAST_LOG_LEVEL | INFO | DEBUG, INFO, WARNING, ERROR |
| PAPERCAST_DEBUG | false | Enable debug mode |
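Several of these settings have documented numeric ranges. One simple way to enforce them is clamping; a minimal sketch with hypothetical names, not PaperCast's actual validation (which may instead reject out-of-range values):

```python
def clamp(value: float, lo: float, hi: float) -> float:
    """Clamp a numeric setting into its documented range."""
    return max(lo, min(hi, value))

# Ranges from the table above.
speech_rate = clamp(2.5, 0.5, 2.0)   # speech rate capped at 2.0
fetch_days = int(clamp(0, 1, 30))    # at least 1 day of lookback
max_papers = int(clamp(50, 1, 200))  # 50 is already in range
```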

Voice Commands

During playback, you can say:

| Command | Aliases | Action |
|---|---|---|
| next | skip, next paper | Go to next paper |
| back | previous, go back | Go to previous paper |
| pause | wait, hold on | Pause playback |
| resume | continue, go on | Resume playback |
| stop | - | Stop playback |
| save | bookmark, save this | Save paper to reading list |
| details | more details, tell me more | Re-read with detailed summary |
| brief | short, summary | Re-read with brief summary |
| repeat | again, read again | Repeat current paper |
| faster | speed up | Increase speech rate |
| slower | slow down | Decrease speech rate |
| search [topic] | find papers about... | Search for papers on topic |
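The alias table suggests a normalization step before dispatch. A sketch covering a subset of the commands above (the function and dictionary are illustrative, not PaperCast's internals):

```python
# Spoken phrases (including aliases) mapped to canonical commands,
# following the table above.
ALIASES = {
    "next": "next", "skip": "next", "next paper": "next",
    "back": "back", "previous": "back", "go back": "back",
    "pause": "pause", "wait": "pause", "hold on": "pause",
    "resume": "resume", "continue": "resume", "go on": "resume",
    "stop": "stop",
    "save": "save", "bookmark": "save", "save this": "save",
    "repeat": "repeat", "again": "repeat", "read again": "repeat",
    "faster": "faster", "speed up": "faster",
    "slower": "slower", "slow down": "slower",
}

def parse_command(utterance: str):
    """Return (command, argument). Search phrases carry the topic as
    the argument; unknown phrases yield (None, None)."""
    text = utterance.strip().lower()
    for prefix in ("search ", "find papers about "):
        if text.startswith(prefix):
            return "search", text[len(prefix):]
    return ALIASES.get(text), None
```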

Testing Mode (No External Dependencies)

To test PaperCast without Vosk, VibeVoice, or Copilot:

# Set all backends to mock in .env
PAPERCAST_SUMMARIZER_BACKEND=mock
PAPERCAST_VOICE_MODEL=mock

This uses mock implementations that simulate the real components.

Development

# Run tests
pytest

# Run tests with coverage
pytest --cov=papercast

# Type checking
mypy src/papercast

# Linting
ruff check .

# Auto-fix linting issues
ruff check . --fix

Project Structure

Papercast/
├── src/papercast/
│   ├── orchestrator/    # Central coordinator
│   ├── scraper/         # arXiv paper fetching
│   ├── queue/           # Paper queue management
│   ├── summarizer/      # AI summarization
│   ├── tts/             # Text-to-speech
│   ├── voice/           # Voice command recognition
│   ├── storage/         # SQLite persistence
│   ├── config.py        # Settings management
│   ├── models.py        # Data models
│   └── main.py          # CLI entry point
├── tests/               # Test suite
├── models/              # Vosk models (download separately)
├── .env                 # Your configuration
└── .env.example         # Configuration template

Troubleshooting

"Folder '' does not contain model files"

You need to download a Vosk model. See Download Vosk Model above.

"Failed to initialize Vosk"

Check that PAPERCAST_VOSK_MODEL_PATH points to the correct model directory (the folder containing am/, conf/, etc.).

No audio output

Make sure you have audio output devices configured. For testing, use PAPERCAST_VOICE_MODEL=mock.

Copilot CLI not found

The Copilot backend requires the GitHub Copilot CLI to be installed and in your PATH. Install it from: https://docs.github.com/en/copilot/how-tos/set-up/install-copilot-cli

For testing without Copilot, use PAPERCAST_SUMMARIZER_BACKEND=mock.

VibeVoice not installed

VibeVoice must be installed from source:

git clone https://github.com/microsoft/VibeVoice.git
pip install -e "./VibeVoice[tts]"

For testing without VibeVoice, use PAPERCAST_VOICE_MODEL=mock.

License

MIT
