Experimental Raspberry Pi Linux mod with optimised local LLM inference. Runs quantised models on-device with a browser chat UI — no cloud, no GPU, just a Pi.
Potato sold separately.
Supports Raspberry Pi 5 (8 GB / 16 GB) and Raspberry Pi 4 (8 GB). Pi 5 is the recommended target (~8 tok/sec). Pi 4 works but expect roughly 1/4 of the speed.
Runtime: ik_llama.cpp (IQK-optimised, Pi 5) + upstream llama.cpp (Pi 4 / Pi 5) | Default model: Qwen3.5-2B (Pi 5), Qwen3.5-0.8B (Pi 4)
- Download the latest SD card image from Releases
- Flash it to a microSD card with Raspberry Pi Imager
- Insert the card, power on the Pi, and wait for first boot to complete
- Open
http://potato.localin a browser
A starter model downloads automatically on first boot (~1.8 GB on Pi 5, ~0.5 GB on Pi 4). Chat is ready once the download finishes and the status shows CONNECTED.
See Flashing Guide for detailed step-by-step instructions, including how to flash directly from Raspberry Pi Imager without a manual download.
- Raspberry Pi 5 (8 GB or 16 GB) or Raspberry Pi 4 (8 GB)
- microSD card (16 GB minimum)
- Power supply (20W USB-C minimum, 27W recommended if using a USB SSD)
- Ethernet or Wi-Fi connection (for first-boot model download)
| Device | Model | Speed | Notes |
|---|---|---|---|
| Pi 5 (8 GB) | Qwen3-30B-A3B Q3_K_S (2.66 bpw, ~10 GB) | ~8-9 tok/sec | 30B MoE with SSD offload, ik_llama |
| Pi 5 (8 GB) | Qwen3.5-2B Q4_K_M | ~7.8 tok/sec | ik_llama runtime |
| Pi 4 (8 GB) | Qwen3.5-0.8B IQ4_NL | ~5 tok/sec | llama.cpp universal |
| Pi 4 (8 GB) | Qwen3-30B-A3B Q3_K_S | ~2 tok/sec | 30B on 8 GB RAM |
The standout: 30B-parameter Qwen3-30B-A3B (mixture-of-experts, only 3B active) runs at full conversational speed on Pi 5 with a USB SSD. That's a frontier-class model on a $80 board.
Potato OS is an early release meant for testing and tinkering, not production use.
- Chat with streaming responses
- Vision — attach a photo and ask about it
- Multi-chat sessions (persisted in your browser)
- Model management — download by URL, upload, delete, switch active model
- System monitoring — CPU, GPU, temperature, memory, storage, power draw
- Dual inference runtime — ik_llama (Pi 5 default) and upstream llama.cpp (Pi 4 default, Pi 5 fallback)
Updates are reflash-only for now — there is no OTA or in-place upgrade path yet.
Need to back out or recover from a failed setup? See docs/recovery.md for the practical rollback paths for:
- restoring a previous Raspberry Pi OS or other system image
- reflashing to a newer Potato OS image
Everything below is for contributors and developers.
uv sync
POTATO_ENABLE_ORCHESTRATOR=0 uv run uvicorn app.main:app --host 0.0.0.0 --port 1983uv run python -m pytest tests/unit tests/api -q -n auto
npx playwright test --reporter=dot --timeout=15000 --workers=75%Runtimes are built on the Pi (aarch64, no cross-compilation). From your Mac:
# Build both ik_llama + llama_cpp from latest source on Pi, sync back
./bin/build_and_publish_remote.sh
# Build + publish to GitHub Releases
./bin/build_and_publish_remote.sh --publish
# Just one family
./bin/build_and_publish_remote.sh --family ik_llamaThis SSHs to potato.local (default: pi/raspberry), clones latest source, builds, and syncs the binaries back. Use --publish to upload tarballs to GitHub Releases.
Or build directly on the Pi:
./bin/build_llama_runtime.sh --family both --fetch --cleanPublished runtimes are available at GitHub Releases. Fresh installs auto-download when no local build is present:
POTATO_LLAMA_RELEASE_AUTO=1 ./bin/install_dev.shBuild a flashable Potato OS image from macOS:
./bin/build_local_image.sh --setup-dockerSee Building Images for prerequisites, variants, flashing, and publishing releases.
- Active board: github.com/orgs/potato-os/projects/1
- Defaults: hostname
potato, SSHpi/raspberry
Apache License 2.0. See LICENSE for details. Third-party component attribution is in THIRD_PARTY_NOTICES.md.

