AssemblyAI-Community/assemblyai-mcp

AssemblyAI MCP Server

A Model Context Protocol (MCP) server that integrates with AssemblyAI's transcription API, enabling AI assistants to upload audio/video files and retrieve transcriptions.

Features

  • File Upload: Upload audio/video files up to 2.2GB to AssemblyAI
  • Transcription: Submit files for transcription with automatic language detection
  • Speaker Diarization: Identify and label different speakers in audio
  • Status Checking: Poll for transcription status and retrieve results
  • Railway Deployment: Easy deployment to Railway for remote access

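Prerequisites

  • Python with pip (the package is installed with pip below)
  • An AssemblyAI API key, supplied as ASSEMBLYAI_API_KEY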

Installation

Local Installation

  1. Clone the repository:
git clone <repository-url>
cd assemblyai-mcp
  2. Install the package:
pip install -e .
  3. Set up your API key:
cp .env.example .env
# Edit .env and add your API key
export ASSEMBLYAI_API_KEY=your_api_key_here

Using with Claude Desktop

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "assemblyai": {
      "command": "python",
      "args": ["-m", "assemblyai_mcp.server"],
      "env": {
        "ASSEMBLYAI_API_KEY": "your_api_key_here"
      }
    }
  }
}
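
If the configuration file already lists other MCP servers, the entry above has to be merged into it rather than pasted over it. A minimal sketch of that merge, assuming the file is plain JSON; add_assemblyai_server and update_config_file are hypothetical helper names and are not part of this package:

```python
import json
from pathlib import Path


def add_assemblyai_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the "assemblyai" entry shown above added.

    Hypothetical helper for illustration; other configured servers are kept.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["assemblyai"] = {
        "command": "python",
        "args": ["-m", "assemblyai_mcp.server"],
        "env": {"ASSEMBLYAI_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the JSON config at `path` (if present), merge the entry, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_assemblyai_server(config, api_key), indent=2))
```

On macOS this could be called as update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "your_api_key_here"). Restart Claude Desktop afterwards so it rereads the file.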

Tools

1. upload_file

Upload an audio or video file to AssemblyAI.

Parameters:

  • file_path (string, required): Path to the audio/video file

Returns:

{
  "upload_url": "https://cdn.assemblyai.com/upload/...",
  "file_name": "audio.mp3",
  "file_size_mb": 5.23
}

Example:

upload_file("/path/to/audio.mp3")

Supported formats: MP3, WAV, FLAC, MP4, WebM, OGG, and many others

2. transcribe

Submit a file for transcription and wait for completion.

Parameters:

  • upload_url (string, required): The URL returned from upload_file()
  • language_code (string, optional): Language code ('en', 'es', 'fr', etc.). Auto-detected if not specified.
  • speaker_labels (boolean, optional): Enable speaker diarization (default: false)

Returns:

{
  "status": "completed",
  "transcript_id": "abc123",
  "text": "Hello, how are you?",
  "confidence": 0.95,
  "audio_duration": 120.5,
  "language_code": "en",
  "word_count": 150
}

With speaker labels:

{
  "status": "completed",
  "transcript_id": "abc123",
  "text": "Speaker A: Hello. Speaker B: Hi there.",
  "utterances": [
    {
      "speaker": "A",
      "text": "Hello.",
      "start": 0,
      "end": 1000
    },
    {
      "speaker": "B",
      "text": "Hi there.",
      "start": 1500,
      "end": 2500
    }
  ]
}
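
A diarized result is easier to read as one line per utterance. The sketch below turns the response above into a timestamped script. It assumes that start and end are millisecond offsets, which the sample values suggest but this README does not state; format_timestamp and format_utterances are hypothetical names:

```python
def format_timestamp(ms: int) -> str:
    """Render a millisecond offset as MM:SS.mmm."""
    minutes, rem = divmod(ms, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def format_utterances(result: dict) -> str:
    """Return one "[time] Speaker X: text" line per utterance.

    Assumes `result` has the shape shown above; a result without
    "utterances" (speaker_labels disabled) yields an empty string.
    """
    lines = []
    for utterance in result.get("utterances", []):
        stamp = format_timestamp(utterance["start"])  # assumed to be milliseconds
        lines.append(f"[{stamp}] Speaker {utterance['speaker']}: {utterance['text']}")
    return "\n".join(lines)
```

For the sample response this produces two lines, one for speaker A at 00:00.000 and one for speaker B at 00:01.500.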

On timeout (>10 minutes):

{
  "status": "timeout",
  "transcript_id": "abc123",
  "message": "Transcription still processing after 600s. Use get_transcript_status('abc123') to check later."
}

Example:

# Basic transcription
transcribe("https://cdn.assemblyai.com/upload/...")

# With language and speakers
transcribe(
  "https://cdn.assemblyai.com/upload/...",
  language_code="es",
  speaker_labels=True
)

3. get_transcript_status

Get the status and results of a transcript by ID.

Parameters:

  • transcript_id (string, required): The transcript ID

Returns:

{
  "status": "completed",
  "transcript_id": "abc123",
  "text": "Full transcript text...",
  "confidence": 0.95,
  "audio_duration": 120.5,
  "language_code": "en",
  "word_count": 150
}

Possible statuses:

  • queued: Waiting in queue
  • processing: Transcription in progress
  • completed: Transcription finished successfully
  • error: Transcription failed

Example:

get_transcript_status("abc123")

Usage Examples

End-to-End Transcription

# 1. Upload the file
result = upload_file("/Users/john/podcast.mp3")
upload_url = result["upload_url"]

# 2. Transcribe with speaker labels
transcript = transcribe(
  upload_url,
  speaker_labels=True
)

# 3. If a timeout occurs, check the status later
if transcript["status"] == "timeout":
  transcript_id = transcript["transcript_id"]
  # Wait a bit, then check
  transcript = get_transcript_status(transcript_id)
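
The "wait a bit, then check" step above can be sketched as a polling loop. This is an illustration, not part of the server: wait_for_transcript is a hypothetical helper, get_status stands in for however you invoke the get_transcript_status tool, and the interval and maximum wait are arbitrary defaults.

```python
import time
from typing import Callable

# Statuses after which polling is pointless (see "Possible statuses").
TERMINAL_STATUSES = {"completed", "error"}


def wait_for_transcript(
    transcript_id: str,
    get_status: Callable[[str], dict],
    interval_s: float = 15.0,
    max_wait_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the transcript is completed or failed, or max_wait_s elapses.

    Returns the last response received, so the caller must still check
    its "status" field.
    """
    waited = 0.0
    result = get_status(transcript_id)
    while result["status"] not in TERMINAL_STATUSES and waited < max_wait_s:
        sleep(interval_s)
        waited += interval_s
        result = get_status(transcript_id)
    return result
```

Passing sleep as a parameter keeps the loop testable without real delays. In normal use, call wait_for_transcript(transcript_id, get_transcript_status) and inspect the returned status.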

Multi-Language Support

# Spanish transcription
transcribe(upload_url, language_code="es")

# French transcription
transcribe(upload_url, language_code="fr")

# Auto-detect (default)
transcribe(upload_url)

Railway Deployment

Deploy your MCP server to Railway for remote access:

1. Prepare Your Repository

Ensure your code is in a Git repository:

git init
git add .
git commit -m "Initial commit"

2. Deploy to Railway

  1. Install the Railway CLI:
npm install -g @railway/cli
  2. Log in and deploy:
railway login
railway init
railway up
  3. Set environment variables in the Railway dashboard:
ASSEMBLYAI_API_KEY=your_api_key_here

3. Connect to Your Railway Server

Update your Claude Desktop configuration to use the Railway server:

{
  "mcpServers": {
    "assemblyai": {
      "url": "https://your-app.up.railway.app",
      "transport": "sse"
    }
  }
}

Error Handling

The server provides clear error messages for common issues:

File Errors

  • File not found: /path/to/file - Check the file path
  • File is empty: /path/to/file - File has no content
  • File exceeds 2.2GB limit - Split or compress the file
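
The three file checks can be run locally before calling upload_file. A minimal sketch, not the server's actual implementation: validate_upload is a hypothetical name, and reading "2.2GB" as 2.2 GiB is an assumption, since this README does not say which unit is meant.

```python
import os

# Assumption: "2.2GB" interpreted as 2.2 * 1024**3 bytes.
MAX_UPLOAD_BYTES = int(2.2 * 1024**3)


def validate_upload(file_path: str, max_bytes: int = MAX_UPLOAD_BYTES) -> int:
    """Return the file size in bytes, or raise with a message like those above."""
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"File not found: {file_path}")
    size = os.path.getsize(file_path)
    if size == 0:
        raise ValueError(f"File is empty: {file_path}")
    if size > max_bytes:
        raise ValueError("File exceeds 2.2GB limit")
    return size
```

Checking up front avoids starting a long upload that is certain to be rejected.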

API Errors

  • Invalid API key - Check your ASSEMBLYAI_API_KEY environment variable
  • Rate limit exceeded - Wait or upgrade your plan
  • AssemblyAI service error - Temporary service issue, try again
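
For the two errors where the advice is to wait and try again, a caller can retry with exponential backoff. The sketch below is generic and makes assumptions: TransientError is a hypothetical exception type that you would raise when a tool result reports a rate limit or service error, and the attempt count and delays are arbitrary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (rate limit, service error)."""


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on TransientError with delays of 1x, 2x, 4x, ..."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay_s * 2**attempt)
    raise AssertionError("unreachable")
```

An invalid API key is not transient, so it should be raised as a different exception type and fail immediately.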

Network Errors

  • Network error: ... - Check your internet connection

Development

Running Tests

pip install -e ".[dev]"
pytest

Local Testing

Test the server locally using the MCP Inspector:

npx @modelcontextprotocol/inspector python -m assemblyai_mcp.server

API Rate Limits

AssemblyAI has the following rate limits:

  • Free tier: 100 hours/month
  • Pro tier: Unlimited (subject to fair use)

See AssemblyAI pricing for details.

Supported File Formats

The server supports all formats that AssemblyAI accepts:

  • Audio: MP3, WAV, FLAC, OGG, AAC, M4A, WMA, OPUS
  • Video: MP4, WebM, MOV, AVI, FLV, MKV

Video files will have their audio extracted automatically.
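
A quick local check against the extensions listed above can catch obvious mistakes, such as passing a text file. This is a sketch only: media_kind is a hypothetical helper, and because AssemblyAI accepts formats beyond this list, a None result means "not in the list above", not "rejected".

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the two lists above.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".aac", ".m4a", ".wma", ".opus"}
VIDEO_EXTENSIONS = {".mp4", ".webm", ".mov", ".avi", ".flv", ".mkv"}


def media_kind(file_path: str) -> Optional[str]:
    """Return "audio", "video", or None, based on the file extension only."""
    suffix = Path(file_path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

The check looks only at the file name; it does not inspect the file contents.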

Security

  • API keys are loaded from environment variables, never hardcoded
  • Files are validated before upload (size, existence, readability)
  • All HTTP requests use HTTPS
  • No file data is stored on the server (ephemeral processing only)
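
As an illustration of the first point, here is a minimal sketch of reading the key from the environment and failing early when it is missing. load_api_key is a hypothetical name and not necessarily how this server is written; the error text mirrors the message listed under Troubleshooting.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ASSEMBLYAI_API_KEY from `env` (default: os.environ) or raise."""
    source = os.environ if env is None else env
    key = source.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ASSEMBLYAI_API_KEY environment variable is required")
    return key
```

Accepting the environment as a parameter keeps the function testable without touching the real process environment.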

Troubleshooting

"ASSEMBLYAI_API_KEY environment variable is required"

Set your API key:

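export ASSEMBLYAI_API_KEY=your_key_here

"Invalid API key"

Check that ASSEMBLYAI_API_KEY is set to a valid AssemblyAI key wherever the server reads it: your shell, the .env file, or the "env" block of the Claude Desktop configuration.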

"File exceeds 2.2GB limit"

Split your file or compress it:

# Split with ffmpeg
ffmpeg -i large_file.mp3 -f segment -segment_time 3600 -c copy output_%03d.mp3

# Compress with ffmpeg
ffmpeg -i input.mp3 -b:a 64k output.mp3
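
To run the split from a script, the same arguments can be assembled as a list. A small sketch: build_split_command is a hypothetical helper that mirrors the split command above and reuses the input file's extension for the output pattern.

```python
from pathlib import Path
from typing import List


def build_split_command(input_path: str, segment_seconds: int = 3600) -> List[str]:
    """Return the ffmpeg argument list for splitting without re-encoding."""
    suffix = Path(input_path).suffix  # e.g. ".mp3"
    return [
        "ffmpeg",
        "-i", input_path,
        "-f", "segment",
        "-segment_time", str(segment_seconds),
        "-c", "copy",
        f"output_%03d{suffix}",
    ]
```

Pass the result to subprocess.run(cmd, check=True). This requires ffmpeg to be installed and on your PATH.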

Timeout After 10 Minutes

Long files may time out. Use the returned transcript_id to check the status later:

get_transcript_status("your_transcript_id")

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

License

MIT License - see LICENSE file for details.
