A GitOps-managed Kubernetes infrastructure running on bare metal with Talos Linux, featuring automated deployments via Flux v2 and comprehensive application hosting for home lab services.
This repository manages two Kubernetes clusters using a GitOps approach:
- atlantis-k8s01: 5-node cluster (3 control plane, 2 workers) with high-availability networking, running in a colocation facility
- fairy-k8s01: 3-node cluster (all control plane) running at home
- OS: Talos Linux - Immutable, secure Kubernetes OS
- GitOps: Flux v2 - Automated deployment and reconciliation
- CNI: Cilium - eBPF-based networking with Gateway API support
- Storage: Rook Ceph - Distributed storage cluster
- Secrets: External Secrets Operator with 1Password integration
- Monitoring: VictoriaMetrics stack with Grafana
- Load Balancing: MetalLB in BGP mode
kubernetes/
├── apps/                    # Application definitions (shared across clusters)
│   ├── auth/                # Authentication services (Authentik, LLDAP)
│   ├── cert-manager/        # Certificate management
│   ├── flux/                # Flux operator and instance configs
│   ├── home-automation/     # Home Assistant, ESPHome, Zigbee2MQTT
│   ├── media/               # Media stack
│   ├── networking/          # Network services (Cilium, MetalLB, DNS)
│   ├── observability/       # Monitoring and alerting stack
│   ├── secrets/             # Secret management
│   ├── storage/             # Storage solutions
│   └── ...
├── clusters/                # Cluster-specific configurations
│   ├── atlantis-k8s01/      # atlantis cluster configuration
│   │   ├── apps/            # Cluster-specific app deployments
│   │   ├── flux/            # Flux bootstrap configuration
│   │   └── talos/           # Talos machine configurations
│   └── fairy-k8s01/         # fairy cluster configuration
└── components/              # Reusable Kustomize components
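Given the layout above, a small helper can show which app groups a cluster deploys. This is a minimal sketch, assuming it runs from the repository root and that each entry under a cluster's apps/ directory corresponds to one app group. `list_cluster_apps` is a hypothetical name, not something this repository ships.

```shell
# Hypothetical helper: list the entries directly under a cluster's apps/ directory.
# Assumes the kubernetes/clusters/<cluster>/apps layout and the repository root as cwd.
list_cluster_apps() {
  cluster="$1"
  apps_dir="kubernetes/clusters/${cluster}/apps"
  if [ ! -d "$apps_dir" ]; then
    echo "no apps directory for cluster: ${cluster}" >&2
    return 1
  fi
  # Print one entry name per line, sorted for stable output.
  ls -1 "$apps_dir" | sort
}
```

For example, `list_cluster_apps atlantis-k8s01` would print the app groups enabled for the atlantis cluster.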
- Flux v2 continuously monitors this repository and applies changes automatically
- Renovate keeps dependencies updated with automated PRs
- GitHub Actions provides a CI/CD pipeline for validation and deployment
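Flux reconciles on an interval, so a freshly merged change may take a moment to land. The sketch below wraps the Flux CLI to trigger a sync immediately and to summarize reconciliation state. The function names are hypothetical, and `flux-system` is only the conventional default name and namespace; this repository may use different ones.

```shell
# Hypothetical helper: fetch the latest commit and re-apply a Kustomization now,
# instead of waiting for the next sync interval.
# Defaults assume the conventional "flux-system" name and namespace.
flux_sync_now() {
  name="${1:-flux-system}"
  namespace="${2:-flux-system}"
  flux reconcile kustomization "$name" --namespace "$namespace" --with-source
}

# Hypothetical helper: summarize every Flux Kustomization and HelmRelease.
flux_status() {
  flux get kustomizations --all-namespaces
  flux get helmreleases --all-namespaces
}
```

Running `flux_sync_now` after merging a PR, followed by `flux_status`, gives a quick view of whether the change has been applied.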
- 1Password Connect integration for secure secret management
- External Secrets Operator syncs secrets from 1Password to Kubernetes
- Cert-Manager with Let's Encrypt for automatic TLS certificate provisioning
- Authentik provides SSO and identity management
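When a secret fails to sync from 1Password, the ExternalSecret resource reports it through its Ready condition. The following sketch lists every ExternalSecret that is not currently Ready. `unsynced_external_secrets` is a hypothetical helper name, and the sketch assumes `kubectl` is already pointed at the target cluster.

```shell
# Hypothetical helper: print namespace/name of every ExternalSecret whose
# Ready condition is anything other than "True" (including not yet set).
unsynced_external_secrets() {
  kubectl get externalsecrets.external-secrets.io --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' |
    awk '$2 != "True" { print $1 }'
}
```

Empty output means every ExternalSecret in the cluster reports Ready.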
- Rook Ceph cluster provides distributed, replicated storage
- Spegel for distributed container image caching
- VictoriaMetrics for metrics collection and storage
- Victoria Logs for log aggregation and analysis
- Grafana for visualization and dashboards
- Gatus for uptime monitoring and status pages
- Silence Operator for managing Alertmanager silences declaratively
- Prometheus Operator for metrics collection and alerting
- Cilium with eBPF for high-performance networking
- Gateway API for modern ingress management
- MetalLB in BGP mode for LoadBalancer services
- Tailscale integration for secure remote access
- Multus for multi-network interface support
- Cloudflare Tunnel for secure external connectivity
- Emby/Jellyfin: Media streaming servers
- Plex: Media streaming server
- Sonarr/Radarr/Lidarr: Media acquisition and management
- Bazarr: Subtitle management
- SABnzbd: Usenet downloader
- Prowlarr: Indexer management
- Recyclarr: Quality profile management
- Webhook: Automation webhook handler
- Home Assistant: Home automation platform
- ESPHome: ESP device management
- Zigbee2MQTT: Zigbee device integration
- Scrypted: Camera and NVR management
- Mosquitto: MQTT message broker
- rtl_433: 433MHz radio receiver for IoT devices
- GitHub Actions Runners: Self-hosted CI/CD runners
- IT Tools: Collection of useful web tools
- Golink: Internal URL shortener
- Netbox: Infrastructure documentation
- Homebox: Home inventory management
- Mealie: Recipe and meal planning management
- Authentik: Identity provider and SSO
- LLDAP: Lightweight LDAP server
- Pocket ID: Identity management platform
- External DNS: Automatic DNS record management
- Cloudflare Tunnel: Secure tunnel for external access
- System Upgrade Controller: Automated node updates (Kubernetes and Talos)
- CloudNativePG: PostgreSQL operator for database management
- 5 nodes with Intel hardware and 10Gb networking
- Bonded network interfaces with LACP for redundancy
- NVMe boot storage for fast boot times
- SSD-backed Ceph storage for high-availability cluster storage
- Intel integrated graphics support for hardware transcoding
- 3 nodes (all control plane) with advanced security features
- Secure Boot and UKI enabled for enhanced security
- NVMe storage for both the boot device and Ceph storage
- Intel integrated graphics support for media workloads
- Talos Linux knowledge for cluster management
- Flux CLI for GitOps operations
- 1Password account for secrets management
- Task for automation scripts
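Before starting, it can help to confirm the command-line tools are installed. This is a minimal sketch with a hypothetical `require_tools` helper. Which binaries to check is an assumption: `flux` and `task` follow from the list above, while `talosctl` and `kubectl` are inferred from the Talos and Kubernetes tooling this setup relies on.

```shell
# Hypothetical helper: report every named CLI that is not on PATH.
# Returns non-zero if at least one tool is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical call would be `require_tools talosctl flux task kubectl`.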
- Prepare hardware with Talos Linux installation
- Configure Talos using the provided talconfig.yaml files
- Bootstrap Flux using the cluster-specific configurations
- Set up secrets in 1Password and configure External Secrets
- Deploy applications by committing changes to this repository
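The Talos configuration step can be scripted across several nodes. The sketch below chains the `talos:generate` and `talos:apply-config` Task targets documented in this README. `talos_apply_all` is a hypothetical wrapper, and stopping at the first failing node is a design choice made here, not something the repository prescribes.

```shell
# Hypothetical wrapper: generate configs for a cluster, then apply them node by node.
# Stops at the first failure so a bad config is not rolled out to further nodes.
talos_apply_all() {
  cluster="$1"
  shift
  task talos:generate CLUSTER="$cluster" || return 1
  for node in "$@"; do
    echo "applying Talos config to ${node}"
    task talos:apply-config CLUSTER="$cluster" node="$node" || return 1
  done
}
```

For example, `talos_apply_all atlantis-k8s01 atlantis-compute01` generates the configuration and applies it to a single node; further node names can be appended as extra arguments.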
This repository uses Task for automation:
# Generate Talos configurations
task talos:generate CLUSTER=atlantis-k8s01
# Apply Talos configuration to a node
task talos:apply-config CLUSTER=atlantis-k8s01 node=atlantis-compute01
# Update Talos configuration
task talos:talosconfig CLUSTER=atlantis-k8s01
- Renovate automatically creates PRs for dependency updates
- Flux applies approved changes within minutes
- System Upgrade Controller handles node OS updates
- Reloader restarts applications when configurations change
- VictoriaMetrics collects metrics from all cluster components
- Victoria Logs aggregates and analyzes logs from all services
- Grafana provides comprehensive dashboards
- Gatus monitors service availability
- Alert routing via various notification channels
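Metrics stored in VictoriaMetrics can also be queried from the command line, since it serves the Prometheus-compatible query API. This is a minimal sketch: `vm_query` is a hypothetical helper, and the base URL is an assumption that depends on how the service is exposed (a cluster-mode vmselect endpoint uses a longer path prefix).

```shell
# Hypothetical helper: run an instant query against a Prometheus-compatible endpoint.
# $1 is the base URL (assumed reachable), $2 is the query expression.
vm_query() {
  base_url="$1"
  query="$2"
  curl -fsS --get --data-urlencode "query=${query}" "${base_url}/api/v1/query"
}
```

With a port-forward to a single-node VictoriaMetrics instance, `vm_query http://localhost:8428 'up'` would return the current `up` series as JSON.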
This repository is tailored for personal use but serves as a reference implementation. Feel free to:
- Fork and adapt for your own infrastructure
- Open issues for questions or suggestions
- Submit PRs for improvements or bug fixes
- Secrets: All secrets are managed via 1Password and External Secrets Operator
- Networking: BGP configuration required for MetalLB LoadBalancer services
- Storage: Rook Ceph requires dedicated storage devices on cluster nodes
- Updates: Automated updates are enabled - monitor the deployment pipeline
This infrastructure powers a comprehensive home lab environment with production-grade reliability and security.