
🚂 TrainForgeConductor

Multiply your LLM rate limits by routing across multiple providers

Quick Start · Integrations · API Reference · Configuration


Why TrainForgeConductor?

| Without Conductor | With Conductor |
| --- | --- |
| 1 provider, 1 rate limit | Combined rate limits from ALL providers |
| Provider down = you're down | Automatic failover to other providers |
| Different APIs per provider | One unified API for everything |

Example: Cerebras (30 RPM) + NVIDIA (40 RPM) = 70 requests/minute 🚀


Quick Start

1. Install

git clone https://github.com/Alcray/TrainForgeConductor.git
cd TrainForgeConductor

python -m venv .venv
source .venv/bin/activate
pip install -e .

2. Configure

cp config/config.example.yaml config/config.yaml

Add your API keys to config/config.yaml:

providers:
  cerebras:
    enabled: true
    keys:
      - name: my-cerebras
        api_key: YOUR_CEREBRAS_KEY    # Get from https://cloud.cerebras.ai
        requests_per_minute: 30
        tokens_per_minute: 60000

  nvidia:
    enabled: true
    keys:
      - name: my-nvidia
        api_key: YOUR_NVIDIA_KEY      # Get from https://build.nvidia.com
        requests_per_minute: 40
        tokens_per_minute: 100000

3. Run

trainforge-conductor
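
To verify the gateway is up, you can hit its OpenAI-compatible chat endpoint directly (see How It Works below). A minimal smoke test; the localhost:8000 address and the OpenAI-style response shape are assumptions, so adjust to match your config:

# Smoke test against the conductor's /v1/chat/completions endpoint.
# The localhost:8000 address is an assumption; check your config.yaml.
import json
import urllib.request

payload = {
    "model": "llama-70b",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
print(body["choices"][0]["message"]["content"])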

4. Use

from examples.client import ConductorClient

client = ConductorClient()

# Simple chat - automatically routes to available provider
response = client.chat("Hello!")
print(response)

# With system prompt
response = client.chat("Write a haiku", system="You are a poet")

# Batch multiple questions at once
answers = client.batch([
    "What is Python?",
    "What is JavaScript?",
    "What is Rust?"
])

That's it! The conductor automatically distributes requests across Cerebras and NVIDIA.
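
Because the endpoint is OpenAI-compatible (POST /v1/chat/completions, as shown under How It Works), the official OpenAI Python SDK should also work when pointed at the conductor. A sketch, assuming the gateway runs on localhost:8000 and ignores the client-side API key (see the Integrations doc for the supported setup):

# Hypothetical OpenAI SDK usage against the conductor; base_url and
# the placeholder api_key are assumptions, not documented values.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="llama-70b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)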


Unified Model Names

Use simple names that work on all providers:

| Model | Description |
| --- | --- |
| llama-70b | Llama 3.3 70B (default, best quality) |
| llama-8b | Llama 3.1 8B (faster) |

Configure custom model names in config/config.yaml:

models:
  my-model:
    cerebras: "llama-3.3-70b"
    nvidia: "meta/llama-3.3-70b-instruct"

See Configuration docs for more details.


How It Works

Your App                    TrainForgeConductor                 Providers
   │                              │                                │
   │  POST /v1/chat/completions   │                                │
   │  model: "llama-70b"          │                                │
   │─────────────────────────────▶│                                │
   │                              │   Round-robin scheduling       │
   │                              │──────────────────────────────▶ │ Cerebras (30 RPM)
   │                              │                                │
   │                              │──────────────────────────────▶ │ NVIDIA (40 RPM)
   │                              │                                │
   │◀─────────────────────────────│   Combined: 70 RPM!            │
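
The scheduling idea is simple: rotate through providers, skipping any that have exhausted their per-minute budget. A minimal sketch of that round-robin loop (illustrative only, not the project's actual scheduler):

# Illustrative round-robin over rate-limited providers.
import time

class Provider:
    def __init__(self, name: str, rpm: int):
        self.name, self.rpm = name, rpm
        self.window_start, self.used = time.monotonic(), 0

    def try_acquire(self) -> bool:
        """Consume one request from this minute's budget if available."""
        now = time.monotonic()
        if now - self.window_start >= 60:   # new minute: reset the budget
            self.window_start, self.used = now, 0
        if self.used < self.rpm:
            self.used += 1
            return True
        return False

providers = [Provider("cerebras", 30), Provider("nvidia", 40)]
_next = 0

def pick_provider() -> Provider | None:
    """Try each provider once, starting where the last pick left off."""
    global _next
    for i in range(len(providers)):
        p = providers[(_next + i) % len(providers)]
        if p.try_acquire():
            _next = (_next + i + 1) % len(providers)
            return p
    return None  # every provider is rate-limited right now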

Rate Limits

| Provider | Requests/min | Tokens/min | Get API Key |
| --- | --- | --- | --- |
| Cerebras | 30 | 60,000 | cloud.cerebras.ai |
| NVIDIA NIM | 40 | 100,000 | build.nvidia.com |

Tip: Add multiple API keys to multiply your limits!
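
For instance, a second Cerebras key gives that provider two independent 30 RPM budgets. The key names below are placeholders:

providers:
  cerebras:
    enabled: true
    keys:
      - name: cerebras-key-1
        api_key: FIRST_CEREBRAS_KEY
        requests_per_minute: 30
        tokens_per_minute: 60000
      - name: cerebras-key-2
        api_key: SECOND_CEREBRAS_KEY
        requests_per_minute: 30
        tokens_per_minute: 60000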


Documentation

| Document | Description |
| --- | --- |
| Integrations | Python, JavaScript, curl, OpenAI SDK |
| API Reference | All endpoints and parameters |
| Configuration | Config options, model mapping, strategies |
| Docker | Container deployment |

License

MIT
