kenlasko/omni

Introduction

This repository is used to install a Talos Kubernetes cluster in a declarative manner using on-prem Omni. Most of these steps should work without modification; adjust paths and domain names to suit your environment.

In my situation, I have six NUCs/mini-PCs that I use for my cluster. Three are old Intel NUCs used as control planes; the other three are Beelink mini-PCs. My goal was to set up an easily reproducible Talos Kubernetes cluster while maintaining my creative node-naming strategy of NUC1 through NUC6.

Once complete, you will have a Kubernetes cluster running the latest Kubernetes flavour, but without a CNI. This means your cluster won't actually be usable until a CNI is installed. I used Cilium for my cluster, following these steps.

Related Repositories

Links to my other repositories mentioned or used in this repo:

  • NetbootXYZ: Simplified PXE boot setup for Omni-managed Talos nodes.
  • K8s Cluster Configuration: Manages Kubernetes cluster manifests and workloads.
  • NixOS: A declarative OS configuration modified to support my Kubernetes cluster.

Omni On-Prem Installation

My initial installation closely followed the Omni on-prem install instructions provided by SideroLabs. This worked well, but required an additional container to make the certificate files generated by Traefik available to Omni. This sometimes caused issues when the helper container stopped running without me noticing. Eventually, the Omni certificate expired and caused all sorts of hard-to-diagnose issues with the cluster, especially as nodes were rebooted.

My updated installation is based on the example configuration provided by SideroLabs, but differs in that I'm using Traefik and I'm not proxying everything through port 443, which means each service needs its own unique URL. I wanted to ensure my existing Omni-managed clusters would not be affected (which they might have been had I followed the example config closely). This also has the added benefit that I don't require a gRPC tunnel, which is discouraged unless absolutely necessary.

In my homelab, Omni is installed on a Raspberry Pi I'm using for other Docker-related stuff.

  1. Follow the Omni on-prem install instructions. This will get the basics running, including setting up an OIDC provider for authentication. I use Auth0, as per the Omni documentation.
  2. Configure the docker-compose.yaml file (a sketch follows this list):
    • The most important app is obviously omni
    • traefik handles HTTPS traffic management and generates certificates via LetsEncrypt
  3. Add an A record for omni.ucdialplans.com to your DNS server, pointing to the IP of the host Omni runs on.
  4. If you plan on using workload proxying, also add a wildcard DNS record for *.omni.ucdialplans.com, pointing to the same IP.
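
For illustration, here is an abbreviated docker-compose.yaml sketch of the Traefik-to-Omni routing described in step 2. The image tags, ports and volume paths are placeholders of mine; see the actual docker-compose.yaml in this repo for the real configuration.

services:
  traefik:
    image: traefik:v3.1
    command:
      - --providers.docker=true
      - --entrypoints.websecure.address=:443
    ports:
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

  omni:
    image: ghcr.io/siderolabs/omni:latest
    labels:
      # Route https://omni.ucdialplans.com to the Omni container
      - traefik.enable=true
      - traefik.http.routers.omni.rule=Host(`omni.ucdialplans.com`)
      - traefik.http.routers.omni.entrypoints=websecure
      - traefik.http.routers.omni.tls.certresolver=letsencrypt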

Certificate Management

Omni requires a public certificate for nodes to connect to. It is very important to keep this certificate up-to-date: when nodes reboot while the Omni certificate is expired, Kubernetes pods will react in strange ways that are hard to diagnose. Omni uses certificates automatically issued by LetsEncrypt via the Traefik section of my docker-compose file.
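
For reference, the LetsEncrypt piece is a Traefik ACME certificate resolver, declared as command flags on the traefik service. A minimal sketch, assuming an HTTP-01 challenge (the email and storage path are placeholders; the wildcard *.omni record used for workload proxying would need a DNS-01 challenge instead):

command:
  - --entrypoints.web.address=:80
  - --certificatesresolvers.letsencrypt.acme.email=you@example.com
  - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
  - --certificatesresolvers.letsencrypt.acme.httpchallenge=true
  - --certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web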

PXEBoot Configuration

Talos can be installed on nodes via ISO, but doing it via PXE boot is much nicer. Follow the NetBootXYZ Configuration instructions to set it up.

Omnictl/Talosctl installation

The cluster is managed from the CLI using omnictl and talosctl. While cluster operations can be performed from the Omni UI, I think it's better to do so from the CLI for a declarative approach and to get away from "click-ops". For my deployment, the installation/configuration of these tools is managed via NixOS, but the manual steps are included here for those who haven't discovered NixOS yet.

  1. Download omnictl from https://omni.ucdialplans.com and put it in the proper location on your workstation:
# Remove old version of talosctl, if present, then install latest version
sudo rm /usr/local/bin/talosctl
curl -sL https://talos.dev/install | sh

# Assumes the latest omnictl has already been downloaded and placed in current directory
ARCH_TYPE=$(dpkg --print-architecture)
sudo mv omnictl-linux-${ARCH_TYPE} /usr/local/bin/omnictl
sudo chmod u+x /usr/local/bin/omnictl
  2. Download omniconfig.yaml and talosconfig.yaml from https://omni.ucdialplans.com and put them in the proper locations on your workstation:
mkdir -p ~/.config/omni/
mkdir -p ~/.talos/
cp ~/omni/omniconfig.yaml ~/.config/omni/config
cp ~/omni/talosconfig.yaml ~/.talos/config
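
With both configs in place, a quick sanity check from the workstation might look like this (machines will only appear once they've registered with Omni):

# Verify the Omni API is reachable with the new omniconfig
omnictl get machines
# Verify the talosctl client is installed and on the PATH
talosctl version --client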

Omni/Kubectl Integration

The oidc-login plugin, installed via the Krew plugin manager, is required for connecting to the cluster via Omni. These steps assume kubectl is already installed.

  1. Install Krew
(
  set -x; cd "$(mktemp -d)" &&
  OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
  ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
  KREW="krew-${OS}_${ARCH}" &&
  curl -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
  tar zxvf "${KREW}.tar.gz" &&
  ./"${KREW}" install krew
)

# Add Krew path to ~/.bashrc
echo 'export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Install OIDC-Login in Kubectl
kubectl krew install oidc-login
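
With the plugin installed, you can fetch an Omni-integrated kubeconfig and trigger the OIDC flow on first use. A minimal sketch, assuming a cluster named home; verify the exact flags with omnictl kubeconfig --help:

# Fetch an OIDC-enabled kubeconfig for the 'home' cluster
omnictl kubeconfig --cluster home
# The first kubectl call opens a browser window for OIDC login
kubectl get nodes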

Browser Redirection for Windows Subsystem for Linux

If you're using WSL, install wslu. This allows external browser redirection from the WSL session to your main browser in Windows:

sudo apt install wslu -y
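
If the browser still doesn't open automatically, pointing the BROWSER variable at wslu's wslview is a common workaround (this environment tweak is my assumption, not a step from the original instructions):

echo 'export BROWSER=wslview' >> ~/.bashrc
source ~/.bashrc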

Omni cluster creation/update

Make sure all nodes are up and running in maintenance mode and are visible in https://omni.ucdialplans.com. I did this via a NetbootXYZ installation on the same Raspberry Pi node as my Omni installation.

You will need to modify the machine GUIDs in cluster-template-home.yaml to suit your needs. I have multiple cluster templates for home, cloud and lab to test various things. You may not need all this.
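
For orientation, an Omni cluster template has this general shape; the UUIDs and versions below are placeholders, so refer to cluster-template-home.yaml for the real values:

kind: Cluster
name: home
kubernetes:
  version: v1.31.0   # placeholder version
talos:
  version: v1.8.0    # placeholder version
---
kind: ControlPlane
machines:
  - 00000000-0000-0000-0000-000000000001   # control-plane machine GUID
---
kind: Workers
machines:
  - 00000000-0000-0000-0000-000000000004   # worker machine GUID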

I set up a pass-through container cache in Docker on my NAS, which is defined in machine-registries.yaml. You probably won't be using this.
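
For those curious, the registry-mirror portion of such a patch looks roughly like this Talos machine-config snippet (the mirror endpoint is a placeholder standing in for my NAS cache, not a value from this repo):

machine:
  registries:
    mirrors:
      docker.io:
        endpoints:
          - http://nas.ucdialplans.com:5000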

If any of your machine GUIDs are not randomly assigned and the BIOS is American Megatrends (AMI)-based, you may be able to create a bootable USB from the files in uuid-gen to set a random machine GUID.

Once you're ready to create your cluster, run the command below from your workstation, substituting the template that matches your environment. Yep, that's it.

omnictl cluster template sync -f ~/omni/cluster-template-(home|cloud|lab).yaml
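
After a sync, the matching status subcommand shows rollout progress against the same template (flag spelling may vary by omnictl version, so check omnictl cluster template status --help):

omnictl cluster template status -f ~/omni/cluster-template-home.yaml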

Then install Cilium using whatever method you desire. In my case, I use Terraform/OpenTofu to install the core apps that would allow me to log into ArgoCD and install everything else:

  • Cilium
  • External Secrets
  • Cert Manager

Using remote SSH shell for kubectl

If you're using a remote SSH shell to connect to the cluster, add the following to ~/.ssh/config on the local machine you use to connect to myhost. Ports 8000 and 18000 are the local callback ports that the oidc-login plugin listens on by default.

Host myhost
  LocalForward 8000 127.0.0.1:8000
  LocalForward 18000 127.0.0.1:18000

Alternatively, connect directly via SSH with the forwards on the command line:

ssh -i ~/.ssh/id_rsa ken@rpi1 -L 8000:localhost:8000 -L 18000:localhost:18000

Add - --skip-open-browser to the Omni user entry in the users: section of your ~/.kube/config, as in the example below:

users:
- name: onprem-omni-home-ken.lasko@gmail.com
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://omni.ucdialplans.com/oidc
      - --oidc-client-id=native
      - --oidc-extra-scope=cluster:home
      - --skip-open-browser
      command: kubectl
      env: null
      interactiveMode: IfAvailable
      provideClusterInfo: false

Omni Backup/Restore

It is important to back up the Omni etcd database as well as the omni.asc key in case of disaster. Below is a simple script to do this; it requires the etcdctl client to be installed.

Installing etcdctl client on Ubuntu/Raspbian

sudo apt install etcd-client

Sample Backup Script

This script takes a snapshot of the etcd database as well as the entire contents of the Omni folder, and keeps daily, weekly and monthly backups. This example writes to a NAS folder mounted at /mnt/omni-backup. To run it daily, add it to crontab by running crontab -e and inserting an appropriate schedule (a sample entry follows the script).

#!/bin/sh

# Snapshot the Omni etcd database, then archive the entire Omni folder
ETCDCTL_API=3 etcdctl snapshot save /docker/omni/snapshot.db
day=$(date +%A)
dayofmonth=$(date +%-d)
echo "$(date +%F_%T) Backing up omni.asc..."
sudo cp -f /docker/omni/omni.asc /mnt/omni-backup/
echo "$(date +%F_%T) Backing up Omni etcd database..."
sudo zip -r /mnt/omni-backup/etcdbackup-$day.zip /docker/omni/
# Keep a monthly copy on the 1st of each month
if [ "$dayofmonth" -eq 1 ]; then
  echo "Creating monthly backup..."
  cp /mnt/omni-backup/etcdbackup-$day.zip /mnt/omni-backup/etcdbackup-monthly-$(date +%m).zip
fi
# Keep a weekly copy on the 7th, 14th, 21st and 28th
case $dayofmonth in
  7|14|21|28)
    echo "Creating weekly backup..."
    cp /mnt/omni-backup/etcdbackup-$day.zip /mnt/omni-backup/etcdbackup-weekly-$dayofmonth.zip
    ;;
esac
echo "$(date +%F_%T) Omni etcd database has been backed up."

Restoring Omni

  1. Copy omni.asc to the omni folder on your Docker host (or wherever the Omni Docker folder resides)
  2. Copy snapshot.db to the omni folder on your Docker host
  3. Run the following commands to restore the Omni database:
cd /docker/omni
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db
mv default.etcd etcd
  4. Start the Omni container (see below).
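
Assuming Omni is defined as a service named omni in docker-compose.yaml (as in the sketch earlier), the final step amounts to:

cd /docker/omni
# Use docker-compose instead if you're on the older standalone binary
docker compose up -d omni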
