AI-powered test automation and benchmarking platform running on Kubernetes.
OpenOperator is a cloud-native platform for testing AI agents against real operating systems (Windows, macOS, Linux). It provisions virtual computers on Kubernetes, deploys AI agents, records their interactions as video, and provides real-time monitoring through a web dashboard. It is designed for benchmarking AI agent performance at scale, with Azure cloud integration (AKS, Blob Storage, Managed Identity).
- Multi-OS Virtual Computers — Spin up Windows, macOS, and Linux VMs as Kubernetes pods with VNC/RDP access
- AI Agent Orchestration — Deploy and manage AI agents that interact with virtual desktops
- Test & Benchmark Execution — Run tests individually or as benchmark suites with configurable repetition
- Video Recording — Capture and store agent interactions as recordings in Azure Blob Storage
- Real-time Monitoring — Live desktop streaming via Guacamole/noVNC with WebSocket communication
- LLM Observability — Trace and analyze LLM calls with Langfuse integration
- Result Analytics — Track cost, time, and validation results per test step and test run
- Auto-scaling Infrastructure — Node pool auto-scaling for agents, benchmarks, and user workloads
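As a rough illustration of how per-step results (cost, time, validation) could roll up into a test run and then into a repeated benchmark, here is a minimal Python sketch. The `StepResult` and `RunResult` classes, their field names, and the `summarize` helper are hypothetical and are not taken from the OpenOperator codebase; the actual data model may differ.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class StepResult:
    """Hypothetical record for one test step."""
    name: str
    cost_usd: float
    seconds: float
    passed: bool


@dataclass
class RunResult:
    """Hypothetical record for one test run (an ordered list of steps)."""
    steps: list[StepResult] = field(default_factory=list)

    @property
    def cost_usd(self) -> float:
        return sum(s.cost_usd for s in self.steps)

    @property
    def seconds(self) -> float:
        return sum(s.seconds for s in self.steps)

    @property
    def passed(self) -> bool:
        # A run passes only if it has steps and every step validated.
        return bool(self.steps) and all(s.passed for s in self.steps)


def summarize(runs: list[RunResult]) -> dict[str, float]:
    """Aggregate repeated runs of the same test into benchmark-level numbers."""
    if not runs:
        raise ValueError("at least one run is required")
    return {
        "runs": len(runs),
        "pass_rate": sum(r.passed for r in runs) / len(runs),
        "mean_cost_usd": mean(r.cost_usd for r in runs),
        "mean_seconds": mean(r.seconds for r in runs),
    }
```

Calling `summarize` on the runs produced by one test with a repetition count of N would give one row of a benchmark report under these assumptions.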
graph TD
subgraph Azure Cloud
AKS[Azure Kubernetes Service]
BLOB[Azure Blob Storage]
UAMI[Managed Identity]
end
subgraph AKS Cluster
subgraph OOAdmin[Admin Namespace]
API[FastAPI API]
UI[React Dashboard]
WSB[WebSocket Broker]
DB[(MongoDB)]
end
subgraph OOUser[User Namespace]
WIN[Windows VM]
MAC[macOS VM]
LNX[Linux VM]
AGT[AI Agents]
end
subgraph OOAnalytics[Analytics Namespace]
ES[Elasticsearch]
GF[Grafana]
LF[Langfuse]
end
subgraph Infra
TFK[Traefik Ingress]
LH[Longhorn Storage]
end
end
USER((User)) -->|HTTPS| TFK
TFK --> UI
TFK --> API
TFK --> WSB
UI -->|REST + WS| API
API --> DB
API -->|Manage| WIN & MAC & LNX
API -->|Deploy| AGT
AGT -->|Interact| WIN & MAC & LNX
API -->|Upload recordings| BLOB
API -->|Workload Identity| UAMI
UAMI --> BLOB
| Layer | Technology | Purpose |
|---|---|---|
| Cloud | Azure (AKS, Blob Storage, UAMI) | Cloud infrastructure & identity |
| IaC | Terraform (azurerm ~> 3.0) | Infrastructure provisioning |
| Orchestration | Kubernetes + Helm | Container scheduling & packaging |
| Backend | Python, FastAPI, Uvicorn | REST API & job execution |
| Frontend | React, TypeScript, Vite, Fluent UI | Web dashboard |
| Database | MongoDB | Configuration & instance storage |
| Ingress | Traefik v3.3 | Routing, TLS termination |
| Remote Desktop | Guacamole, noVNC, VNC/RDP proxies | Live desktop access |
| Storage | Longhorn | Distributed persistent volumes |
| Logging | Elasticsearch + Kibana | Log aggregation |
| Monitoring | Grafana | Metrics dashboards |
| LLM Tracing | Langfuse | LLM observability |
├── Infra-Terraform/ # Terraform modules for Azure infrastructure
│ ├── k8s/ # AKS cluster with multi-node-pool setup
│ ├── storage/ # Azure Storage account
│ ├── security/ # Managed identity & RBAC
│ ├── LLMs/ # LLM model infrastructure
│ └── omniparser-server/ # Vision/UI parsing service
│
├── k8s-Helm/ # Helm charts for Kubernetes deployments
│ ├── OOAdmin/ # Admin services (API, UI, DB, WS-Broker, Guacamole, Proxy)
│ ├── OOUser/ # User workloads (Computers: Win/Mac/Linux, Agents)
│ ├── OOAnalytics/ # Observability (Elasticsearch, Grafana, Langfuse)
│ ├── OOBenchmark/ # Benchmark & test job runners
│ └── OOStorage/ # Longhorn distributed storage
│
├── Projects/ # Application source code
│ ├── OOAdmin/ # Admin control plane
│ │ ├── api/ # FastAPI REST API
│ │ ├── ui/ # React/TypeScript dashboard
│ │ ├── ws-broker/ # WebSocket relay broker
│ │ ├── core/ # Shared library (DB, K8s, storage, WS)
│ │ └── jobs/ # Test & benchmark job runners
│ └── OOUser/ # User-facing workloads
│ └── computers/ # VM init containers, VNC/RDP proxies
│
└── docs/ # Documentation & architecture diagrams
├── *.drawio # Infrastructure & benchmarking diagrams
└── img/ # Reference screenshots
- Terraform >= 1.0
- Helm >= 3.x
- kubectl configured for AKS
- Azure subscription with AKS, Storage, and Managed Identity access
- uv (Python package manager)
- Node.js >= 18 (for UI development)
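Before provisioning anything, it can help to confirm that the command-line tools listed above are installed. The following is a minimal sketch, assuming the tools are exposed on `PATH` under the command names `terraform`, `helm`, `kubectl`, `uv`, and `node`; the `missing_tools` helper is hypothetical, and it only checks presence, not versions.

```python
import shutil

# Command names assumed to be on PATH for the tools listed above.
REQUIRED_TOOLS = ("terraform", "helm", "kubectl", "uv", "node")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is sufficient.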
cd Infra-Terraform/k8s
terraform init
terraform plan
terraform apply

# Install storage layer
helm upgrade --install longhorn longhorn/longhorn -f k8s-Helm/OOStorage/longhorn/values.yaml
# Deploy admin services
helm upgrade --install oo-api ./k8s-Helm/OOAdmin/api
helm upgrade --install oo-ui ./k8s-Helm/OOAdmin/ui
helm upgrade --install oo-db ./k8s-Helm/OOAdmin/db
helm upgrade --install oo-ws-broker ./k8s-Helm/OOAdmin/ws-broker
# Deploy analytics
helm upgrade --install elasticsearch ./k8s-Helm/OOAnalytics/elasticsearch
helm upgrade --install grafana ./k8s-Helm/OOAnalytics/grafana
helm upgrade --install langfuse ./k8s-Helm/OOAnalytics/langfuse

cd Projects/OOAdmin/api
uv sync
uv run uvicorn server:app --reload

cd Projects/OOAdmin/ui
npm install
npm run dev

| Variable | Description | Default |
|---|---|---|
| `DB_URL` | MongoDB server hostname | (none) |
| `PORT` | API service port | 8000 |
| `LOG_PATH` | Log directory path | (none) |
| `MY_BASE_URL` | Root path for API routing | (none) |
| `TEST_JOB_VERSION` | Test job container image version | (none) |
| `BENCHMARK_JOB_VERSION` | Benchmark job container image version | (none) |
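To show how a service might consume the variables in the table above, here is a minimal Python sketch that reads them from the environment and applies the one documented default (`PORT` = 8000). The `ApiSettings` class and `load_settings` function are hypothetical names used for illustration; the project's own configuration loading may work differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ApiSettings:
    """Hypothetical container for the documented environment variables."""
    db_url: Optional[str]
    port: int
    log_path: Optional[str]
    base_url: Optional[str]
    test_job_version: Optional[str]
    benchmark_job_version: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> ApiSettings:
    """Read settings from a mapping; variables without a default become None."""
    return ApiSettings(
        db_url=env.get("DB_URL"),
        port=int(env.get("PORT", "8000")),  # only PORT has a documented default
        log_path=env.get("LOG_PATH"),
        base_url=env.get("MY_BASE_URL"),
        test_job_version=env.get("TEST_JOB_VERSION"),
        benchmark_job_version=env.get("BENCHMARK_JOB_VERSION"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise without touching the real environment, for example `load_settings({"PORT": "9000"})`.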
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request