A Model Context Protocol (MCP) server that integrates with AssemblyAI's transcription API, enabling AI assistants to upload audio/video files and retrieve transcriptions.
- File Upload: Upload audio/video files up to 2.2GB to AssemblyAI
- Transcription: Submit files for transcription with automatic language detection
- Speaker Diarization: Identify and label different speakers in audio
- Status Checking: Poll for transcription status and retrieve results
- Railway Deployment: Easy deployment to Railway for remote access
- Python 3.10 or higher
- AssemblyAI API key (get one at assemblyai.com/dashboard)
- Clone the repository:
git clone <repository-url>
cd assemblyai-mcp
- Install the package:
pip install -e .
- Set up your API key:
cp .env.example .env
# Edit .env and add your API key
export ASSEMBLYAI_API_KEY=your_api_key_here

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
  "mcpServers": {
    "assemblyai": {
      "command": "python",
      "args": ["-m", "assemblyai_mcp.server"],
      "env": {
        "ASSEMBLYAI_API_KEY": "your_api_key_here"
      }
    }
  }
}

upload_file: Upload an audio or video file to AssemblyAI.
Parameters:
- file_path (string, required): Path to the audio/video file
Returns:
{
  "upload_url": "https://cdn.assemblyai.com/upload/...",
  "file_name": "audio.mp3",
  "file_size_mb": 5.23
}
Example:
upload_file("/path/to/audio.mp3")
Supported formats: MP3, WAV, FLAC, MP4, WebM, OGG, and many others
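
The checks the server runs before uploading (existence, non-empty content, size limit) could look roughly like the sketch below. `validate_upload` and `MAX_UPLOAD_BYTES` are hypothetical names, not part of the package; reading "2.2GB" as 2,200,000,000 bytes and reporting megabytes as bytes / 1024² are both assumptions.

```python
from pathlib import Path

# Assumption: the 2.2GB limit is read as 2.2 * 1000^3 bytes.
MAX_UPLOAD_BYTES = 2_200_000_000


def validate_upload(file_path: str) -> dict:
    """Hypothetical pre-upload check returning the locally known result fields."""
    path = Path(file_path)
    if not path.is_file():
        raise FileNotFoundError(f"File not found: {file_path}")
    size = path.stat().st_size
    if size == 0:
        raise ValueError(f"File is empty: {file_path}")
    if size > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds 2.2GB limit")
    # Assumption: file_size_mb is bytes / 1024^2, rounded to two decimals.
    return {
        "file_name": path.name,
        "file_size_mb": round(size / (1024 * 1024), 2),
    }
```

Running the checks locally means a bad path or an oversized file fails fast, before any bytes are sent over the network.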
transcribe: Submit a file for transcription and wait for completion.
Parameters:
- upload_url (string, required): The URL returned from upload_file()
- language_code (string, optional): Language code ('en', 'es', 'fr', etc.). Auto-detected if not specified.
- speaker_labels (boolean, optional): Enable speaker diarization (default: false)
Returns:
{
  "status": "completed",
  "transcript_id": "abc123",
  "text": "Hello, how are you?",
  "confidence": 0.95,
  "audio_duration": 120.5,
  "language_code": "en",
  "word_count": 150
}
With speaker labels:
{
  "status": "completed",
  "transcript_id": "abc123",
  "text": "Speaker A: Hello. Speaker B: Hi there.",
  "utterances": [
    {
      "speaker": "A",
      "text": "Hello.",
      "start": 0,
      "end": 1000
    },
    {
      "speaker": "B",
      "text": "Hi there.",
      "start": 1500,
      "end": 2500
    }
  ]
}
On timeout (>10 minutes):
{
  "status": "timeout",
  "transcript_id": "abc123",
  "message": "Transcription still processing after 600s. Use get_transcript_status('abc123') to check later."
}
Example:
# Basic transcription
transcribe("https://cdn.assemblyai.com/upload/...")
# With language and speakers
transcribe(
    "https://cdn.assemblyai.com/upload/...",
    language_code="es",
    speaker_labels=True
)

get_transcript_status: Get the status and results of a transcript by ID.
Parameters:
- transcript_id (string, required): The transcript ID
Returns:
{
  "status": "completed",
  "transcript_id": "abc123",
  "text": "Full transcript text...",
  "confidence": 0.95,
  "audio_duration": 120.5,
  "language_code": "en",
  "word_count": 150
}
Possible statuses:
- queued: Waiting in queue
- processing: Transcription in progress
- completed: Transcription finished successfully
- error: Transcription failed
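
Since `queued` and `processing` are transient while `completed` and `error` are final, a caller can poll until a final status appears. The following is a minimal sketch under that assumption: `wait_for_transcript` is a hypothetical helper, `get_status` stands in for whatever callable wraps get_transcript_status, and a stub replaces the real API so the block runs offline.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "error"}


def wait_for_transcript(
    get_status: Callable[[str], dict],
    transcript_id: str,
    timeout_s: float = 600.0,
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_status until a terminal status or until the time budget is used up."""
    waited = 0.0  # elapsed time approximated by summing the sleep intervals
    while True:
        result = get_status(transcript_id)
        if result["status"] in TERMINAL_STATUSES:
            return result
        if waited >= timeout_s:
            return {"status": "timeout", "transcript_id": transcript_id}
        sleep(interval_s)
        waited += interval_s


def fake_status_source() -> Callable[[str], dict]:
    """Stub that walks through queued -> processing -> completed."""
    statuses = iter(["queued", "processing", "completed"])

    def get_status(transcript_id: str) -> dict:
        return {"status": next(statuses), "transcript_id": transcript_id}

    return get_status


result = wait_for_transcript(fake_status_source(), "abc123", sleep=lambda s: None)
print(result["status"])  # prints "completed"
```

Passing `sleep` as a parameter keeps the loop testable without real delays; the 3-second interval is an arbitrary illustrative default.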
Example:
get_transcript_status("abc123")

# 1. Upload the file
result = upload_file("/Users/john/podcast.mp3")
upload_url = result["upload_url"]
# 2. Transcribe with speaker labels
transcript = transcribe(
    upload_url,
    speaker_labels=True
)
# 3. If timeout occurs, check status later
if transcript["status"] == "timeout":
    transcript_id = transcript["transcript_id"]
    # Wait a bit, then check
    transcript = get_transcript_status(transcript_id)

# Spanish transcription
transcribe(upload_url, language_code="es")
# French transcription
transcribe(upload_url, language_code="fr")
# Auto-detect (default)
transcribe(upload_url)

Deploy your MCP server to Railway for remote access:
- Ensure your code is in a Git repository:
git init
git add .
git commit -m "Initial commit"
- Install Railway CLI:
npm install -g @railway/cli
- Log in and deploy:
railway login
railway init
railway up
- Set environment variables in the Railway dashboard:
ASSEMBLYAI_API_KEY=your_api_key_here
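
Whether the key comes from a local shell, a `.env` file, or the Railway dashboard, the server reads it from the environment. A minimal sketch of such a lookup follows; `load_api_key` and its error message are hypothetical, not the package's actual code. It strips surrounding whitespace and rejects the placeholder value.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: read ASSEMBLYAI_API_KEY and reject empty or placeholder values."""
    env = os.environ if env is None else env
    # Stray spaces from copy-paste are a common cause of "invalid key" errors.
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("ASSEMBLYAI_API_KEY is not set")
    return key
```

Accepting an optional mapping instead of always reading `os.environ` makes the function easy to exercise in tests without touching the real environment.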
Update your Claude Desktop configuration to use the Railway server:
{
  "mcpServers": {
    "assemblyai": {
      "url": "https://your-app.up.railway.app",
      "transport": "sse"
    }
  }
}

The server provides clear error messages for common issues:
- File not found: /path/to/file - Check the file path
- File is empty: /path/to/file - File has no content
- File exceeds 2.2GB limit - Split or compress the file
- Invalid API key - Check your ASSEMBLYAI_API_KEY environment variable
- Rate limit exceeded - Wait or upgrade your plan
- AssemblyAI service error - Temporary service issue, try again
- Network error: ... - Check your internet connection
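
One plausible way to produce the API-related messages above is to map HTTP status codes to them. The sketch below assumes 401 means an invalid key, 429 means rate limiting, and any 5xx means a service error; that mapping, the function name `describe_http_error`, and the fallback message are illustrative assumptions rather than the server's confirmed behavior.

```python
def describe_http_error(status_code: int) -> str:
    """Hypothetical mapping from an HTTP status code to a user-facing message."""
    if status_code == 401:  # assumption: unauthorized means a bad API key
        return "Invalid API key"
    if status_code == 429:  # assumption: too many requests
        return "Rate limit exceeded"
    if 500 <= status_code < 600:  # assumption: server-side failure
        return "AssemblyAI service error"
    # Fallback wording is made up for this sketch.
    return f"Unexpected HTTP status: {status_code}"
```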
pip install -e ".[dev]"
pytest

Test the server locally using the MCP Inspector:
npx @modelcontextprotocol/inspector python -m assemblyai_mcp.server

AssemblyAI has the following rate limits:
- Free tier: 100 hours/month
- Pro tier: Unlimited (subject to fair use)
See AssemblyAI pricing for details.
The server supports all formats that AssemblyAI accepts:
- Audio: MP3, WAV, FLAC, OGG, AAC, M4A, WMA, OPUS
- Video: MP4, WebM, MOV, AVI, FLV, MKV
Video files will have their audio extracted automatically.
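
For callers that want to tell audio from video before uploading, the two lists above can be turned into a simple extension lookup. This is only a sketch: `media_kind` is a hypothetical helper, and the extension spellings are inferred from the format names listed here. A file reported as "unknown" may still be accepted by AssemblyAI.

```python
from pathlib import Path

# Extensions inferred from the format names listed above.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".aac", ".m4a", ".wma", ".opus"}
VIDEO_EXTENSIONS = {".mp4", ".webm", ".mov", ".avi", ".flv", ".mkv"}


def media_kind(file_path: str) -> str:
    """Classify a path as 'audio', 'video', or 'unknown' by its file extension."""
    suffix = Path(file_path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return "unknown"
```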
- API keys are loaded from environment variables, never hardcoded
- Files are validated before upload (size, existence, readability)
- All HTTP requests use HTTPS
- No file data is stored on the server (ephemeral processing only)
Set your API key:
export ASSEMBLYAI_API_KEY=your_key_here

- Verify your API key at https://www.assemblyai.com/dashboard
- Check for typos or extra spaces
- Ensure the key is active and not expired
Split your file or compress it:
# Split with ffmpeg
ffmpeg -i large_file.mp3 -f segment -segment_time 3600 -c copy output_%03d.mp3
# Compress with ffmpeg
ffmpeg -i input.mp3 -b:a 64k output.mp3

Long files may time out. Use the returned transcript_id to check the status:
get_transcript_status("your_transcript_id")

Contributions are welcome! Please open an issue or submit a pull request.
MIT License - see LICENSE file for details.
- Issues: Open a GitHub issue
- Email: support@assemblyai.com (for API-related questions)
- Docs: https://www.assemblyai.com/docs