Multiply your LLM rate limits by routing across multiple providers
Quick Start • Integrations • API Reference • Configuration
| Without Conductor | With Conductor |
|---|---|
| 1 provider, 1 rate limit | Combined rate limits from ALL providers |
| Provider down = you're down | Automatic failover to other providers |
| Different APIs per provider | One unified API for everything |
Example: Cerebras (30 RPM) + NVIDIA (40 RPM) = 70 requests/minute 🚀
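The failover behavior from the table above boils down to a try-next-provider loop. Here is an illustrative Python sketch; `send_request` is a hypothetical placeholder for a provider-specific call, not part of this project's API:

```python
# Illustrative failover: try each provider in order until one succeeds.
# `send_request` is a hypothetical stand-in for a provider-specific call.
def chat_with_failover(prompt, providers, send_request):
    last_error = None
    for provider in providers:
        try:
            return send_request(provider, prompt)
        except Exception as exc:  # provider down or rate-limited
            last_error = exc
    raise RuntimeError("All providers failed") from last_error
```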
```bash
git clone https://github.com/yourusername/TrainForgeConductor.git
cd TrainForgeConductor
python -m venv .venv
source .venv/bin/activate
pip install -e .
cp config/config.example.yaml config/config.yaml
```

Add your API keys to `config/config.yaml`:
```yaml
providers:
  cerebras:
    enabled: true
    keys:
      - name: my-cerebras
        api_key: YOUR_CEREBRAS_KEY  # Get from https://cloud.cerebras.ai
        requests_per_minute: 30
        tokens_per_minute: 60000
  nvidia:
    enabled: true
    keys:
      - name: my-nvidia
        api_key: YOUR_NVIDIA_KEY  # Get from https://build.nvidia.com
        requests_per_minute: 40
        tokens_per_minute: 100000
```

Start the server:

```bash
trainforge-conductor
```

Then use the bundled Python client:

```python
from examples.client import ConductorClient
client = ConductorClient()

# Simple chat - automatically routes to an available provider
response = client.chat("Hello!")
print(response)

# With a system prompt
response = client.chat("Write a haiku", system="You are a poet")

# Batch multiple questions at once
answers = client.batch([
    "What is Python?",
    "What is JavaScript?",
    "What is Rust?",
])
```

That's it! The conductor automatically distributes requests across Cerebras and NVIDIA.
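Because the conductor exposes an OpenAI-compatible `/v1/chat/completions` endpoint (see the diagram below), you can also point the official OpenAI SDK at it. A sketch, assuming the server listens on `http://localhost:8000` (check your config for the actual host and port):

```python
from openai import OpenAI

# base_url and api_key are assumptions: the conductor holds the real
# provider keys in config.yaml, so the SDK key is likely a placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="llama-70b",  # the conductor resolves this alias per provider
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```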
Use simple names that work on all providers:
| Model | Description |
|---|---|
| `llama-70b` | Llama 3.3 70B (default, best quality) |
| `llama-8b` | Llama 3.1 8B (faster) |
Configure custom model names in `config/config.yaml`:

```yaml
models:
  my-model:
    cerebras: "llama-3.3-70b"
    nvidia: "meta/llama-3.3-70b-instruct"
```

See the Configuration docs for more details.
```
Your App                        TrainForgeConductor                Providers
   │                                    │
   │   POST /v1/chat/completions        │
   │   model: "llama-70b"               │
   │───────────────────────────────────▶│
   │                                    │   Round-robin scheduling
   │                                    │──────────────────────────▶  Cerebras (30 RPM)
   │                                    │
   │                                    │──────────────────────────▶  NVIDIA (40 RPM)
   │                                    │
   │◀───────────────────────────────────│   Combined: 70 RPM!
```
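A minimal sketch of the rate-limit-aware round-robin idea in the diagram, assuming each provider tracks a sliding 60-second request window; class and method names are illustrative, not the conductor's internals:

```python
import time
from collections import deque

class ProviderSlot:
    """Tracks one provider's requests-per-minute budget (illustrative)."""
    def __init__(self, name: str, rpm: int):
        self.name = name
        self.rpm = rpm
        self.sent = deque()  # timestamps of recent requests

    def has_capacity(self, now: float) -> bool:
        while self.sent and now - self.sent[0] > 60:  # drop stale entries
            self.sent.popleft()
        return len(self.sent) < self.rpm

class RoundRobinRouter:
    """Cycles through providers, skipping any at their rate limit."""
    def __init__(self, providers):
        self.providers = providers
        self._next = 0

    def pick(self):
        now = time.time()
        for offset in range(len(self.providers)):
            i = (self._next + offset) % len(self.providers)
            slot = self.providers[i]
            if slot.has_capacity(now):
                self._next = (i + 1) % len(self.providers)
                slot.sent.append(now)
                return slot
        return None  # everyone is throttled; caller should queue or retry

router = RoundRobinRouter([ProviderSlot("cerebras", 30), ProviderSlot("nvidia", 40)])
```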
| Provider | Requests/min | Tokens/min | Get API Key |
|---|---|---|---|
| Cerebras | 30 | 60,000 | cloud.cerebras.ai |
| NVIDIA NIM | 40 | 100,000 | build.nvidia.com |
Tip: Add multiple API keys to multiply your limits!
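For example, adding a second Cerebras key in `config/config.yaml` roughly doubles that provider's share of the pool to ~60 RPM (key names here are placeholders):

```yaml
providers:
  cerebras:
    enabled: true
    keys:
      - name: cerebras-key-1
        api_key: FIRST_CEREBRAS_KEY
        requests_per_minute: 30
        tokens_per_minute: 60000
      - name: cerebras-key-2
        api_key: SECOND_CEREBRAS_KEY
        requests_per_minute: 30
        tokens_per_minute: 60000
```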
| Document | Description |
|---|---|
| Integrations | Python, JavaScript, curl, OpenAI SDK |
| API Reference | All endpoints and parameters |
| Configuration | Config options, model mapping, strategies |
| Docker | Container deployment |
License: MIT
