Unified AI operations β governance, routing, cost management, and deployment across any AI provider.
Bonito gives engineering and platform teams a single operational layer to manage AI workloads across AWS Bedrock, Azure AI Foundry, Google Vertex AI, and more. Connect providers, enforce governance policies, track costs in real time, and manage team access β all from one platform with an AI copilot that helps you move faster.
AI adoption is accelerating, but operational tooling hasn't kept up. Teams juggle separate consoles for each cloud provider, have no unified view of costs, and struggle to enforce consistent governance across providers.
Bonito solves this with:
- Operational control β One dashboard for all your AI providers. Manage models, deployments, and routing policies without switching between cloud consoles.
- Governance & compliance β Built-in policy engine for SOC-2, HIPAA, and GDPR compliance checks. Audit logging across every action.
- Cost visibility β Real-time cost aggregation, forecasting, and optimization recommendations across all providers.
- AI Context (Knowledge Base) β Cross-cloud RAG pipeline. Upload company docs, embed with any provider's model, and inject context into any LLM query β vendor-neutral knowledge that works with every model on every cloud.
- Team management β Role-based access control, team seats, and SSO/SAML for enterprise identity management.
- SAML SSO β Enterprise single sign-on with SAML 2.0. Supports Okta, Azure AD, Google Workspace, and custom SAML providers. SSO enforcement, break-glass admin, JIT user provisioning.
- AI copilot β An intelligent assistant that helps with onboarding, configuration, troubleshooting, and infrastructure-as-code generation.
- Multi-cloud gateway β OpenAI-compatible API proxy with intelligent routing, automatic cross-region inference profiles (AWS Bedrock
us.prefix handled transparently), and multi-provider failover that catches rate limits, timeouts, 5xx errors, and model unavailability to automatically route to equivalent models on other providers. - Bonobot β AI Agents β Enterprise AI agent framework with visual canvas (React Flow), project-based organization, built-in tools (KB search, HTTP requests, agent-to-agent invocation), and enterprise security (default deny, budget enforcement, rate limiting, SSRF protection, full audit trail). All agent inference routes through the Bonito gateway for cost tracking and governance.
- Persistent Agent Memory β Long-term memory system with pgvector similarity search. Agents store and retrieve facts, patterns, interactions, preferences, and contextual information across sessions. AI-powered memory extraction from conversations.
- Scheduled Autonomous Execution β Cron-based task scheduling with timezone support. Agents run tasks automatically, deliver results via webhook/email/Slack, with full execution history and retry logic.
- Approval Queue / Human-in-the-Loop β Configurable approval workflows for sensitive agent actions. Risk assessment, timeout handling, auto-approval conditions, and comprehensive audit trails for enterprise compliance.
We're not the only platform in this space. Here's an honest look at how we fit:
| Capability | Bonito | Portkey | LiteLLM | Helicone | Guild.ai |
|---|---|---|---|---|---|
| Multi-cloud gateway | β | β | β | β | β |
| Auto cross-region inference | β Built-in | β | β | β | β |
| Intelligent multi-provider failover | β Built-in | Basic | Basic | β | β |
| Cross-cloud Knowledge Base (RAG) | β Built-in | β | β | β | β |
| AI Agent Framework | β Built-in | β | β | β | Planned |
| SAML SSO | β Built-in | β | β | β | β |
| Governance & compliance checks | β Built-in | β | β | β | Planned |
| Infrastructure-as-Code (Terraform) | β Built-in | β | β | β | β |
| AI copilot for operations | β Built-in | β | β | β | β |
| Cost management & forecasting | β | β | Basic | β | β |
| Provider count | 6 | 200+ | 100+ | 30+ | N/A |
| Open source | No | Partial | Yes | Yes | No |
| SOC-2 certified | Roadmap | Yes | No | Yes | No |
| Self-hosted option | Yes (Docker) | Yes | Yes | Yes | No |
Where Bonito shines:
- Auto cross-region inference profiles -- Customers register canonical model IDs (e.g.
anthropic.claude-sonnet-4-20250514-v1:0) and the gateway transparently routes via AWS cross-region inference profiles (us.prefix) when required. No competitor handles this automatically. When AWS changes their inference profile scheme, you update one function on the platform -- zero customer impact. - Intelligent multi-provider failover -- Not just rate-limit retries. Bonito detects rate limits, timeouts, server errors (5xx), model unavailability, and overloaded providers, then automatically routes to equivalent models on other providers. A Claude Sonnet request that gets throttled on Anthropic Direct automatically retries on AWS Bedrock. No client code changes needed. Portkey and LiteLLM offer basic retry/fallback, but not cross-provider model equivalence mapping with transparent re-routing.
- Cross-cloud RAG (no competitor has this), integrated governance, IaC generation, and an AI copilot that ties it all together.
Where others lead: Provider breadth (Portkey/LiteLLM support far more providers today), open-source community (LiteLLM), and compliance certifications (Portkey and Helicone have SOC-2 today).
# Clone the repo
git clone <repo-url> && cd bonito
# Copy env file
cp .env.example .env
# Start everything
docker compose up --build -d
# Run database migrations
docker compose exec backend env PYTHONPATH=/app alembic upgrade head
# Open the app
open http://localhost:3001βββββββββββββββββββββββββββββββββββββββββββββββββββ
β Frontend β
β Next.js 14 Β· TypeScript Β· Tailwind β
β shadcn/ui Β· Framer Motion β
β localhost:3001 β
ββββββββββββββββββββββββ¬βββββββββββββββββββββββββββ
β
ββββββββββββββββββββββββΌβββββββββββββββββββββββββββ
β Backend β
β FastAPI Β· Python 3.12 Β· Async β
β localhost:8001 β
ββββββββββββ¬ββββββββββββ¬ββββββββββββ¬βββββββββββββββ€
βPostgreSQLβ Redis β Vault β Cloud APIs β
β pgvector β :6380 β :8200 β Bedrock etc β
β :5433 β β β β
ββββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββββββββββ
| Layer | Tech |
|---|---|
| Frontend | Next.js 14, TypeScript, Tailwind CSS, shadcn/ui, Framer Motion |
| Backend | Python FastAPI, async/await, uvicorn |
| Database | PostgreSQL 18.2 + pgvector (HNSW), SQLAlchemy, Alembic |
| Vector Store | pgvector with 768-dim embeddings (GCP text-embedding-005) |
| Cache | Redis 7 |
| Secrets | HashiCorp Vault (prod), SOPS + age (dev) |
| Infra | Docker Compose (local), Vercel + Railway (prod) |
bonito/
βββ frontend/ # Next.js app
β βββ src/
β βββ app/ # App Router pages
β βββ components/ # UI components
βββ backend/ # FastAPI app
β βββ app/
β β βββ api/ # Route handlers
β β βββ core/ # Config, DB, Vault client
β β βββ models/ # SQLAlchemy models
β β βββ schemas/ # Pydantic schemas
β β βββ services/ # Business logic
β βββ alembic/ # DB migrations
βββ vault/ # Vault init scripts
βββ secrets/ # SOPS encrypted secrets
βββ docker-compose.yml
βββ README.md
| Service | Port | Description |
|---|---|---|
| Frontend | 3001 | Next.js web app |
| Backend | 8001 | FastAPI REST API |
| PostgreSQL + pgvector | 5433 | Primary database + vector store |
| Redis | 6380 | Cache & sessions |
| Vault | 8200 | Secrets management (UI available) |
Local dev: SOPS + age for encrypted secrets in git.
# Decrypt secrets
SOPS_AGE_KEY_FILE=secrets/age-key.txt sops decrypt secrets/dev.enc.yaml
# Edit secrets
SOPS_AGE_KEY_FILE=secrets/age-key.txt sops edit secrets/dev.enc.yamlVault UI: http://localhost:8200 (token: bonito-dev-token)
Production: HashiCorp Vault with AppRole/Kubernetes auth, HA mode.
With the backend running: http://localhost:8001/docs (Swagger UI)
All 18 core phases are complete. Bonito is live at getbonito.com with 12 active deployments across 3 clouds and 171+ gateway requests tracked.
- β Core platform (auth, RBAC, multi-cloud connections)
- β Cloud integrations (AWS Bedrock, Azure AI Foundry, GCP Vertex AI)
- β AI-powered chat & intelligent routing
- β Compliance & governance engine (SOC-2, HIPAA, GDPR policy checks)
- β Cost intelligence (aggregation, optimization, forecasting)
- β Production deployment (Docker, CI/CD, deployment configs)
- β Onboarding wizard with IaC template generation
- β API Gateway (OpenAI-compatible proxy via LiteLLM)
- β AI Copilot (Groq-powered operations assistant)
- β Engagement & retention (notifications, analytics, digests)
- β Model details & playground (live testing, parameter tuning)
- β Visual routing policy builder (A/B testing, load balancing)
- β Deployment provisioning (cloud endpoints, Terraform, auto-scaling)
- β AI Context (Knowledge Base) β Cross-cloud RAG pipeline with pgvector, document upload/parse/chunk/embed, HNSW vector search, gateway context injection, and source citations
- β Database migration to pgvector PG18.2
- β AI Context onboarding integration (optional KB toggle, storage provider picker)
- β IaC templates updated with KB storage permissions (S3, Azure Blob, GCS)
- β One-click model activation across all 3 clouds
- β
Auto Cross-Region Inference Profiles β Gateway automatically detects newer Bedrock models (Claude Sonnet 4, Opus 4, Llama 3.3/4, Mistral Large 2) and routes via
us.cross-region inference profiles. Customers register canonical model IDs; the platform handles routing transparently. - β Intelligent Multi-Provider Failover β Gateway detects rate limits, timeouts, server errors (500/502/503), model unavailability, and capacity issues, then automatically retries on equivalent models across different providers (e.g. Anthropic Direct -> AWS Bedrock). Model equivalence mapping covers Claude, Llama, Mixtral, and Gemma families.
- β SAML SSO β Enterprise SSO with SAML 2.0 (Okta, Azure AD, Google Workspace, Custom SAML), SSO enforcement, break-glass admin, JIT provisioning
- β Bonobot v1 β AI Agents β Enterprise agent framework with visual canvas, OpenClaw-inspired execution engine, built-in tools, enterprise security (default deny, budget stops, rate limiting, SSRF protection, audit trail)
- β Bonobot Enterprise Features β Persistent Agent Memory (pgvector, 5 memory types, AI extraction), Scheduled Autonomous Execution (cron, timezone, multi-channel delivery), and Approval Queue / Human-in-the-Loop (risk assessment, auto-approve, timeout handling)
π SSO/SAML integrationβ Shipped- π SOC-2 Type II certification β Roadmap
- π Smart routing (complexity-aware model selection)
- π VPC Gateway Agent (enterprise self-hosted data plane)
- π Additional provider integrations (Cohere, Mistral, custom endpoints)
- π Advanced audit log export & SIEM integration
Connect Claude Desktop, Cowork, or any MCP-compatible client to Bonito with the MCP server:
pip install bonito-mcpExposes 18 tools covering providers, models, gateway, agents, knowledge bases, and cost monitoring. See mcp-server/ for full documentation and configuration.
- AI Context / Knowledge Base β Architecture, API design, and RAG pipeline details
- Known Issues β Tracking document for known issues and fixes
- Pricing β Plans and pricing structure
- SOC-2 Roadmap β Path to SOC-2 Type II certification
- SSO Scoping β SSO/SAML implementation plan
- Vault Production β Vault hardening guide
Built with π by the Bonito team.