Bee2Bee is a peer-to-peer network that allows you to easily deploy, route, and access AI models across any infrastructure (Local, Cloud, Colab) without complex networking configuration.
The ecosystem consists of four main components:
- Main Point (Tracker/API): The central supervisor that tracks active peers and exposes the HTTP API.
- Worker Nodes (Providers): Machines that run `deploy-hf` to host AI models and serve requests.
- Desktop App (Frontend): An Electron + React UI for managing the network and chatting with models.
- Bee2Bee Cloud (Colab): A Google Colab notebook acting as a Cloud Node using Hybrid Tunneling.
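For orientation, a minimal single-machine setup chains the commands covered in the sections below: start a Main Point, point a Worker Node at it, and deploy a model.

```bash
# Terminal 1: start the Main Point (API on 4002, P2P on 4003)
python -m bee2bee api

# Terminal 2: point a Worker Node at the local Main Point and deploy a model
python -m bee2bee config bootstrap_url ws://127.0.0.1:4003
python -m bee2bee deploy-hf --model distilgpt2
```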
The Main Point runs the core API server. Every network needs at least one Main Point.
Run Locally:
```bash
# Starts the API on Port 4002 and P2P Server on Port 4003
python -m bee2bee api
```

Output:
- API: `http://127.0.0.1:4002` (Docs: `/docs`)
- P2P: `ws://127.0.0.1:4003`
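To confirm the Main Point is up, you can hit the API from any HTTP client; `curl` is shown here as one example:

```bash
# Expect an HTTP 200 response if the API server is running
curl -i http://127.0.0.1:4002/docs
```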
The Desktop App is a modern UI for visualizing the network and chatting with models.
Prerequisites: Node.js 20+
Run Locally:
```bash
cd electron-app
npm install   # First time only
npm run dev
```

Usage:
- Open the App.
- It connects to `http://localhost:4002` by default.
- Go to "Chat" to talk to available providers.
- See MANUAL_TESTING.md for detailed testing steps.
Run a Worker Node on any machine (or the same machine as the Main Point) to share an AI model.
Step A: Configure (Tell the node where the Main Point is)
```bash
# If running on the SAME machine as the Main Point:
python -m bee2bee config bootstrap_url ws://127.0.0.1:4003

# If running on a DIFFERENT machine (LAN/WAN):
python -m bee2bee config bootstrap_url ws://<MAIN_POINT_IP>:4003
```
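Before deploying, it can help to verify that the Main Point is actually reachable from the worker machine. A quick sketch using standard tools (substitute your Main Point's address; `nc` may not be installed everywhere):

```bash
# Basic host reachability
ping -c 3 <MAIN_POINT_IP>

# TCP check against the P2P port (4003 by default)
nc -zv <MAIN_POINT_IP> 4003
```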
Step B: Deploy Model

Option 1: Hugging Face (Default)
Uses transformers to run models like GPT-2, Llama, etc. on CPU/GPU.
```bash
# Deploys distilgpt2 (CPU friendly)
python -m bee2bee deploy-hf --model distilgpt2
```

Option 2: Ollama (Universal)

Uses your local Ollama instance to serve models like Llama3, Mistral, Gemma, etc.

Prerequisite: Install and run Ollama.
```bash
# Serve a model (e.g., llama3)
python -m bee2bee serve-ollama --model llama3
```

Note: This creates a separate peer node on your machine.
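If you go the Ollama route, it is worth confirming that the local Ollama service is running and the model has been pulled before calling `serve-ollama`. The check below assumes Ollama's default API port of 11434:

```bash
# List models available to the local Ollama instance
ollama list

# Pull the model if it is not there yet
ollama pull llama3

# Optional: confirm the Ollama HTTP API responds
curl http://localhost:11434/api/tags
```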
Run a powerful node on Google's free GPU infrastructure using our Hybrid Tunneling setup.
Notebook Location: notebook/ConnectIT_Cloud_Node.ipynb
How it Works (Hybrid Tunneling): To bypass Colab's network restrictions, we use two tunnels:
- API Tunnel (Cloudflare): Provides a stable HTTPS URL (`trycloudflare.com`) for the Desktop App to connect to.
- P2P Tunnel (Bore): Provides a raw WebSocket URL (`bore.pub`) for other Worker Nodes to connect to.
Instructions:
- Open the Notebook in Google Colab.
- Run "Install Dependencies".
- Run "Configure Hybrid Tunnels" (installs `cloudflared` & `bore`).
- Wait for it to output the URLs.
- Run "Run Bee2Bee Node".
- It automatically configures itself to announce the Bore address.
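The notebook automates all of this. As a rough sketch only, the same hybrid setup could be reproduced on any host along these lines; the exact `bore.pub` port is assigned at runtime, and the announce variables are described in the configuration table below:

```bash
# Expose the HTTP API via a Cloudflare quick tunnel (prints a trycloudflare.com URL)
cloudflared tunnel --url http://localhost:4002 &

# Expose the P2P WebSocket port via bore (prints an assigned bore.pub port)
bore local 4003 --to bore.pub &

# Announce the tunneled P2P address instead of the local one, then start the node
export BEE2BEE_ANNOUNCE_HOST=bore.pub
export BEE2BEE_ANNOUNCE_PORT=<port printed by bore>
python -m bee2bee api
```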
Connecting your Desktop App to Colab:
- Copy the Cloudflare URL (e.g.,
https://funny-remote-check.trycloudflare.com). - Open Desktop App -> Settings.
- Paste into "Main Point URL".
You can override settings using ENV vars:
| Variable | Description | Default |
|---|---|---|
| `BEE2BEE_PORT` | Port for P2P Server | 4003 (Worker) / 4003 (API) |
| `BEE2BEE_HOST` | Bind Interface | 0.0.0.0 |
| `BEE2BEE_ANNOUNCE_HOST` | Public Hostname (for NAT/Tunnel) | Auto-detected |
| `BEE2BEE_ANNOUNCE_PORT` | Public Port (for NAT/Tunnel) | Auto-detected |
| `BEE2BEE_BOOTSTRAP` | URL of Main Point | None |
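For example, the variables can be set inline when launching a node; the values below are placeholders for your own network:

```bash
# Run a worker's P2P server on a non-default port and point it at a remote Main Point
BEE2BEE_PORT=5003 \
BEE2BEE_BOOTSTRAP=ws://<MAIN_POINT_IP>:4003 \
python -m bee2bee deploy-hf --model distilgpt2
```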
- "Connection Refused": Ensure the `bootstrap_url` is correct and reachable (try `ping`).
- "0 Nodes Connected": Check that the Worker Node can reach the Main Point's P2P address (WSS).
- Colab Disconnects: Ensure the Colab tab stays open. Tunnels change if you restart the notebook.
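For the "0 Nodes Connected" case, one way to test WebSocket reachability from the worker machine is the interactive client that ships with the Python `websockets` package (assuming it is installed in your environment):

```bash
# Opens an interactive WebSocket session; an immediate connection error usually
# means the Main Point's P2P address is not reachable from this machine.
python -m websockets ws://<MAIN_POINT_IP>:4003
```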
Contributions are welcome! Please open an issue or PR on GitHub.