This repository contains the Infrastructure as Code (IaC) for my home Kubernetes clusters, managed using GitOps principles with Flux. The infrastructure runs on Talos Linux, a modern OS designed specifically for Kubernetes.
I run two Kubernetes clusters:
- Chongus (Primary) - Dell R730 servers with NVIDIA GPUs
- Bitty (Secondary) - Intel NUC cluster

Core components:

- cilium - eBPF-based CNI with native routing and Gateway API support
- cert-manager - Automated TLS certificate management
- external-dns - Automatic DNS management via Cloudflare
- external-secrets - Kubernetes External Secrets Operator integrated with Doppler
- envoy-gateway - Gateway API implementation for HTTP routing
- rook-ceph - Distributed storage with Ceph
- volsync - PVC backup and restore to B2 and MinIO
- cloudnative-pg - PostgreSQL operator with HA and backup
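
As an illustration of the external-secrets and Doppler integration listed above, a minimal `ExternalSecret` might look like the sketch below. The store name `doppler`, the app name, and the key names are hypothetical placeholders, and the `apiVersion` depends on the installed operator release (older releases serve `external-secrets.io/v1beta1`).

```yaml
# Sketch only: store, app, and key names are assumptions, not taken from this repository.
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: example-app
  namespace: default
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: doppler                 # assumed name of the Doppler-backed store
  target:
    name: example-app-secret      # Kubernetes Secret that the operator creates
  data:
    - secretKey: API_KEY          # key inside the resulting Secret
      remoteRef:
        key: EXAMPLE_APP_API_KEY  # hypothetical secret name in Doppler
```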
Flux watches the cluster-apps/ directory and reconciles the cluster state automatically. The workflow:
```mermaid
graph LR
    A[Git Push] --> B[Flux Detects Change]
    B --> C[Reconcile Kustomization]
    C --> D[Deploy HelmRelease]
    D --> E[Update Cluster State]
```
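
For orientation, the "watch and reconcile" step could be wired up with a Flux source plus a cluster-level Kustomization roughly as sketched here. The repository URL, branch, resource names, and intervals are placeholders chosen for illustration; the real definitions would be expected under clusters/chongus/flux/.

```yaml
# Sketch only: URL, branch, names, and intervals are assumptions.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 30m
  url: https://github.com/example/home-ops   # placeholder URL
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-apps
  namespace: flux-system
spec:
  interval: 30m
  path: ./cluster-apps/chongus   # directory Flux reconciles for this cluster
  prune: true                    # remove resources that were deleted from Git
  sourceRef:
    kind: GitRepository
    name: flux-system
```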
This repository follows a structured GitOps layout:
```
📁 cluster-apps/              # Application definitions (Flux source)
├── 📁 base/                  # Shared applications across clusters
├── 📁 chongus/               # Chongus cluster applications (NEW PATTERN)
│   └── 📁 [namespace]/
│       └── 📁 [app]/
│           ├── 📁 app/       # HelmRelease + configs
│           └── ks.yaml       # Flux Kustomization
├── 📁 bitty/                 # Bitty cluster (deprecated pattern)
└── 📁 components/            # Reusable Kustomize components

📁 clusters/                  # Cluster bootstrap configurations
├── 📁 chongus/
│   ├── 📁 bootstrap/         # Helmfile-based bootstrap
│   ├── 📁 flux/              # Flux Kustomizations
│   └── 📁 talos/             # Talos configuration
└── 📁 bitty/

📁 .taskfiles/                # Operational automation
```
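
A per-app `ks.yaml` in the `[namespace]/[app]/` layout could look something like the following. This is a sketch under assumptions: the app name `example-app`, the `default` namespace, the source name, and the timing values are all hypothetical.

```yaml
# Sketch only: app name, namespace, and source name are hypothetical.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: example-app
  namespace: flux-system
spec:
  interval: 30m
  path: ./cluster-apps/chongus/default/example-app/app   # the app/ directory with the HelmRelease
  prune: true
  targetNamespace: default
  sourceRef:
    kind: GitRepository
    name: flux-system       # assumed source name
  wait: true                # block until the deployed resources report ready
  timeout: 5m
```

Keeping the Flux Kustomization next to the `app/` directory lets each application be reconciled, suspended, or given dependencies independently of its neighbours.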
Container Networking:
- Cilium CNI with eBPF
- Native routing mode (10.244.0.0/16)
- KubeProxy replacement enabled
- Hubble for observability
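
The settings above map onto Cilium Helm chart values roughly as follows. This fragment is a sketch, not this repository's actual values file; key names follow recent Cilium charts (for example, `routingMode` replaced the older `tunnel` key), so check them against the chart version in use.

```yaml
# Sketch of Cilium Helm values; not copied from this repository.
routingMode: native                    # native routing instead of an overlay tunnel
ipv4NativeRoutingCIDR: 10.244.0.0/16   # CIDR from the list above
autoDirectNodeRoutes: true             # assumption: nodes share an L2 segment
kubeProxyReplacement: true             # eBPF replaces kube-proxy
hubble:
  enabled: true
  relay:
    enabled: true                      # assumption: relay enabled for cluster-wide flow queries
```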
Gateway API:
- envoy-external (172.22.12.2) - Internet-accessible via Cloudflare Tunnel
- envoy-internal (172.22.12.1) - Local network only (Tailscale)
Load Balancing:
- Cilium LBIPAM (172.22.12.0/24)
- Maglev algorithm with DSR mode
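
An LB IPAM pool covering that range might be declared as in the sketch below. The pool name is hypothetical, and the schema varies by Cilium version: older releases use `spec.cidrs` instead of `spec.blocks`, and newer ones also serve `cilium.io/v2`.

```yaml
# Sketch only: pool name is hypothetical; schema depends on the Cilium version.
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: main-pool
spec:
  blocks:
    - cidr: 172.22.12.0/24   # range from which LoadBalancer Services receive addresses
```

The Maglev and DSR behaviour is normally set through the Cilium Helm values `loadBalancer.algorithm` and `loadBalancer.mode` rather than in this resource.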
DNS:
- External-DNS with Cloudflare provider
- Automatic record creation from Gateway API HTTPRoutes
Certificates:
- Let's Encrypt via cert-manager
- Automatic TLS for all HTTPRoutes
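
To show how these pieces meet, an application could be exposed with an `HTTPRoute` along these lines. The hostname, the app and Service names, the gateway namespace, and the listener name are assumptions made for illustration.

```yaml
# Sketch only: hostname, namespaces, and listener name are assumptions.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: example-app
  namespace: default
spec:
  parentRefs:
    - name: envoy-external   # or envoy-internal for local-only access
      namespace: network     # assumed namespace of the Gateway
      sectionName: https     # assumed listener name
  hostnames:
    - app.example.com        # placeholder; external-dns would derive the DNS record from this
  rules:
    - backendRefs:
        - name: example-app  # hypothetical Service
          port: 80
```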
While most infrastructure runs on-premises, some cloud services are used:
| Service | Purpose | Cost |
|---|---|---|
| Cloudflare | DNS, Tunnel, CDN | ~$0/month (free tier) |
| Doppler | Secret management | ~$0/month (free tier) |
| Backblaze B2 | Backup storage | ~$5/month |
| GitHub | Git hosting, CI/CD | ~$0/month (free tier) |
Chongus (primary) hardware:

| Device | CPU | RAM | Storage | Purpose |
|---|---|---|---|---|
| Dell R730 x3 | Intel Xeon | 256GB+ | 2x Samsung 870 EVO 2TB | Kubernetes nodes with NVIDIA GPUs |
Storage:
- Rook-Ceph: 6x 2TB SSDs (2 per node)
- Storage Class: ceph-block (default)
- Replication: 3 replicas
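
The three-replica setting corresponds to a Rook `CephBlockPool` along the lines of this sketch. The pool name and the `host` failure domain are assumptions; the matching StorageClass is omitted because its CSI parameters depend on how Rook was installed.

```yaml
# Sketch only: pool name and failure domain are assumptions.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ceph-blockpool
  namespace: rook-ceph
spec:
  failureDomain: host   # place each replica on a different node
  replicated:
    size: 3             # three copies of every object
```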
Bitty (secondary) hardware:

| Device | CPU | RAM | Storage | Purpose |
|---|---|---|---|---|
| Intel NUC x3 | Intel i5/i7 | 32GB+ | NVMe | Kubernetes nodes with QuickSync |
Storage:
- Rook-Ceph: 3x 512GB SSDs (1 per node)
- Storage Class: ceph-block (default)
- Replication: 3 replicas
Other devices:

| Device | Purpose |
|---|---|
| TrueNAS | NFS storage for media and shared files |
| Raspberry Pi | Ansible-managed DNS, Tailscale, and mDNS repeater |
Tools are managed via mise:
```bash
# Install mise
curl https://mise.run | sh
# Install all tools
mise install
```

To bootstrap a cluster:

```bash
# 1. Generate Talos configuration
task talos:generate-clusterconfig
# 2. Apply to nodes
task talos:apply-clusterconfig
# 3. Bootstrap cluster
task k8s-bootstrap:talos-cluster
# 4. Deploy core apps and CRDs
task k8s-bootstrap:apps
```

Flux will then automatically sync applications from cluster-apps/.
```bash
# Validate Flux resources locally
task flux:validate
# Force reconcile an application
flux reconcile helmrelease [app-name] -n [namespace]
# Check cluster status
kubectl get kustomization -A
kubectl get helmrelease -A
# View Ceph storage health
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
```

For detailed information about the repository structure, patterns, and best practices, see:
- CLAUDE.md - Comprehensive architectural context and patterns
- Taskfile Reference - Task automation commands
This repository is inspired by the k8s-at-home community and draws patterns from:
- onedr0p/home-ops - Excellent reference implementation
- k8s-at-home - Amazing community and support
- kubesearch.dev - Discovery of Helm charts and deployment examples
Special thanks to the maintainers of all the open-source projects used in this cluster.