Transform academic papers into clear video explainers with AI-powered narration, automated slide generation, and professional video production.
- PDF Analysis: Extract structured content from research papers using MinerU API
- Slide Generation: AI-powered content summarization and slide layout design
- Multi-language Support: Generate slides and narration in English or Chinese
- Voice Cloning: Custom TTS with voice sample cloning capabilities
- Video Production: Automated video generation from slides with synchronized narration
- Real-time Progress: Track pipeline stages from parsing to final video rendering
- Bun runtime
- System fonts for slide rendering (e.g. fonts-noto-cjk for Chinese)
- MinerU API key
git clone <repository-url>
cd Paper2Video
bun install
cp .env.example .env

Configure MinerU in .env:
- MINERU_API_KEY: MinerU key (required for real parsing)
- MINERU_API_URL: MinerU API host (default https://mineru.net)
- MINERU_UPLOAD_PATH: single-file upload endpoint path
- MINERU_STATUS_PATH_TEMPLATE: task status endpoint template ({taskId} placeholder)
- MINERU_RESULT_PATH_TEMPLATE: task result endpoint template ({taskId} placeholder)
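
As a rough illustration of how these variables could fit together, the sketch below reads them from an environment object and expands the {taskId} placeholder into a full URL. It is not taken from the project's source: `loadMineruConfig`, `fillTaskPath`, and `taskUrl` are hypothetical helper names, and only the `MINERU_*` variable names and the default host come from the list above.

```typescript
// Hypothetical helpers; only the MINERU_* names and default host come from this README.
interface MineruConfig {
  apiKey: string;
  apiUrl: string;
  uploadPath?: string;
  statusPathTemplate?: string;
  resultPathTemplate?: string;
}

type Env = Record<string, string | undefined>;

// Read the MinerU settings, failing early when the required key is missing.
function loadMineruConfig(env: Env): MineruConfig {
  const apiKey = env.MINERU_API_KEY;
  if (!apiKey) {
    throw new Error("MINERU_API_KEY is required for real parsing");
  }
  return {
    apiKey,
    // Fall back to the documented default host and trim trailing slashes.
    apiUrl: (env.MINERU_API_URL || "https://mineru.net").replace(/\/+$/, ""),
    uploadPath: env.MINERU_UPLOAD_PATH,
    statusPathTemplate: env.MINERU_STATUS_PATH_TEMPLATE,
    resultPathTemplate: env.MINERU_RESULT_PATH_TEMPLATE,
  };
}

// Substitute every {taskId} placeholder with the URL-encoded task id.
function fillTaskPath(template: string, taskId: string): string {
  return template.split("{taskId}").join(encodeURIComponent(taskId));
}

// Join the host and an expanded path template into one URL.
function taskUrl(
  config: MineruConfig,
  template: string | undefined,
  taskId: string,
): string {
  if (!template) {
    throw new Error("path template is not configured");
  }
  const path = fillTaskPath(template, taskId);
  return config.apiUrl + (path.startsWith("/") ? path : "/" + path);
}
```

In an application this would typically be called as `loadMineruConfig(process.env)`, then `taskUrl(config, config.statusPathTemplate, taskId)` when polling a task. Failing early on a missing key keeps a misconfigured .env from surfacing later as an opaque request error.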
bun run dev

The application will be available at:
- Next.js UI: http://localhost:3000
bun run build
bun run start

License: MIT
