Chat, API, and fine-tuning in your browser. One command to start. Automatic multi-node clustering. Apache 2.0.
```sh
curl -fsSL https://ainode.dev/install | bash
```

That's it. The installer pulls the unified container image, registers a systemd service, and opens the chat UI at http://localhost:3000.
Distributed (multi-node) install:
```sh
AINODE_PEERS="10.0.0.2,10.0.0.3" curl -fsSL https://ainode.dev/install | bash
```

Two DGX Sparks, one sharded model, 244 GB of aggregated VRAM. NCCL over RoCE at 200 Gbps on ConnectX-7.
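When RoCE is in play, NCCL is usually steered with environment variables. A hedged sketch of the kind of settings involved: the variable names are real NCCL options, but the interface and HCA names below are illustrative assumptions, not AINode's actual configuration.

```python
import os

# Sketch only: NCCL_SOCKET_IFNAME, NCCL_IB_HCA, and NCCL_IB_GID_INDEX are
# real NCCL environment variables; the device names here are assumed.
roce_env = {
    "NCCL_SOCKET_IFNAME": "enp1s0f0",  # control-plane NIC (assumed name)
    "NCCL_IB_HCA": "mlx5_0",           # ConnectX-7 device (assumed index)
    "NCCL_IB_GID_INDEX": "3",          # RoCE v2 GID entry (a common default)
}
os.environ.update(roce_env)
print(os.environ["NCCL_IB_HCA"])
```

In practice the installer would detect these values rather than hard-code them.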
| Repo | What it is |
|---|---|
| ainode | Product source — CLI, API, web UI, engines, training |
| ainode.dev | Marketing site at ainode.dev |
Public on both GHCR (canonical) and Docker Hub (mirror):
```sh
# GHCR — canonical, used by the installer, no rate limits
docker pull ghcr.io/getainode/ainode:latest

# Docker Hub — mirror, for discoverability
docker pull argentaios/ainode:latest
```

Most "local AI" tooling bails out the moment your model is bigger than one GPU. The moment you need two, you're writing Ray configs, debugging NCCL, patching vLLM, and wiring up SSH bootstrap — by hand, at 2 AM.
AINode bundles that entire stack into a single container and turns multi-node inference into a UI checkbox:
- Auto-discovery over UDP on your cluster subnet
- Tensor-parallel sharding across every GPU the cluster sees
- Ray head + worker formation via eugr's launcher
- NCCL over RoCE when ConnectX-7 + RDMA are present
- Graceful fallback to single-node when the cluster shrinks
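The auto-discovery step above can be sketched as a broadcast announce/listen pair. Everything below is illustrative: the port, the message shape, and the `vram_gb` field are assumptions, and this sends over loopback where a real deployment would use the cluster subnet's broadcast address.

```python
import json
import socket
import threading
import time

DISCOVERY_PORT = 47800  # hypothetical port; AINode's real port may differ

def announce(stop, port=DISCOVERY_PORT):
    """Periodically broadcast this node's presence."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    msg = json.dumps({"node": socket.gethostname(), "vram_gb": 122}).encode()
    while not stop.is_set():
        # Loopback for the sketch; real code targets the subnet broadcast address.
        s.sendto(msg, ("127.0.0.1", port))
        time.sleep(0.2)

def listen_once(port=DISCOVERY_PORT, timeout=2.0):
    """Wait for one announcement and return (peer_record, peer_ip)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    s.bind(("", port))
    data, addr = s.recvfrom(4096)
    return json.loads(data), addr[0]

stop = threading.Event()
threading.Thread(target=announce, args=(stop,), daemon=True).start()
peer, ip = listen_once()
stop.set()
print(peer["node"], peer["vram_gb"])
```

The real discoverer would keep listening, age out silent peers, and re-plan the cluster when membership changes.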
Whether you have one GB10 or ten, you run the same install command, and the cluster just pools the VRAM.
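The "pool the VRAM, fall back when the cluster shrinks" logic reduces to a small planning function. A minimal sketch, assuming one record per discovered node; the field names and the ~122 GB-per-node figure (two nodes pooling to the 244 GB quoted above) are assumptions.

```python
def plan_cluster(nodes):
    """Decide a launch plan from the currently visible nodes.

    nodes: list of {"gpus": int, "vram_gb": int} records, one per peer.
    Returns (mode, tensor_parallel_size, total_vram_gb); a one-node
    cluster falls back to a plain single-node launch.
    """
    if not nodes:
        raise ValueError("no nodes discovered")
    tp = sum(n["gpus"] for n in nodes)
    vram = sum(n["vram_gb"] for n in nodes)
    mode = "single-node" if len(nodes) == 1 else "multi-node"
    return mode, tp, vram

# Two GB10 nodes at ~122 GB usable each pool to 244 GB.
print(plan_cluster([{"gpus": 1, "vram_gb": 122}, {"gpus": 1, "vram_gb": 122}]))
# → ('multi-node', 2, 244)
```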
AINode doesn't reinvent inference — it composes the best OSS runtimes and makes them boring to operate:
- vLLM — the inference engine
- Ray — cross-node orchestration
- NCCL — patched for 3-node ring on GB10
- eugr/spark-vllm-docker — the blessed GB10 base image
- Hugging Face — model catalog
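How the pieces above compose can be illustrated by the server invocation such a stack would assemble. `--tensor-parallel-size` and `--distributed-executor-backend ray` are real vLLM flags; the model name is a placeholder, and AINode's actual wiring may differ.

```python
def vllm_serve_command(model, tp_size, multi_node):
    """Assemble a vLLM server invocation (sketch, not AINode's exact code)."""
    cmd = ["vllm", "serve", model, "--tensor-parallel-size", str(tp_size)]
    if multi_node:
        # Ray places the sharded workers across nodes.
        cmd += ["--distributed-executor-backend", "ray"]
    return cmd

print(" ".join(vllm_serve_command("org/model-70b", 2, True)))
# → vllm serve org/model-70b --tensor-parallel-size 2 --distributed-executor-backend ray
```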
v0.4.0 shipped (April 2026). Distributed TP=2 verified on real GB10 hardware. See the full state of play in the product README — including what works, what doesn't, and the lessons learned getting to "it just runs."
Powered by argentos.ai · Apache 2.0 · Made with NVIDIA GB10
If this saved you a weekend, consider sponsoring the work.





