philipjohn05/homelab

🏠 Kubernetes Homelab

📖 Introduction

This repository contains all of the configuration and documentation of my Kubernetes homelab.

The purpose of my homelab is to learn production patterns at scale and to have fun experimenting with cloud-native technologies. As someone working with Kubernetes professionally, my homelab is where I can break things, learn from mistakes, and understand infrastructure deeply - without the pressure of production incidents.

Self-hosting applications forces me to think about the entire lifecycle: backup strategies, disaster recovery, security hardening, GitOps workflows, and operational excellence. It's one thing to deploy an app - it's another to maintain it reliably and recover from failures.

Key Principles

  • Everything is code - Infrastructure, applications, and configurations
  • Zero plaintext secrets in Git - Sensitive data lives in Azure Key Vault; the few static configs kept in Git are SOPS-encrypted
  • Immutable infrastructure - Talos Linux enforces declarative configuration
  • Full automation - FluxCD handles deployments from Git
  • 10-minute recovery - Entire cluster can be rebuilt from this repo

🏗️ Cluster Provisioning & Architecture

I use Talos Linux to provision my Kubernetes nodes. Talos is minimal, immutable, and API-driven, with no SSH access. It provides production-grade security out of the box, and because every change goes through the declarative machine configuration, configuration drift is largely eliminated.

Current Architecture

I run a single 6-node cluster with high availability:

| Cluster | Description |
|---|---|
| lothbrok | Production cluster. 3 control plane nodes (HA with kube-vip) + 3 worker nodes. Runs all infrastructure components and applications. Fully provisioned from code via FluxCD GitOps. Can be destroyed and rebuilt in 10 minutes. |

Infrastructure Components

Control Plane:

  • 3 nodes for high availability
  • kube-vip provides virtual IP for API server
  • etcd cluster tolerates the loss of one member (quorum is 2 of 3)
  • Any single control plane node can fail and the cluster stays operational
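The single-failure tolerance follows from etcd's majority quorum rule. A minimal sketch of that arithmetic (the function names are illustrative, not part of any etcd tooling):

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

For the 3 control plane nodes here, the quorum is 2 and one failure is tolerated. A fourth member would raise the quorum to 3 without tolerating any additional failure, which is why odd member counts are the usual choice.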

Workers:

  • 3 nodes for application workloads
  • Dedicated to running pods
  • Resource isolation from control plane

Network:

  • Flannel CNI (the Talos default)
  • MetalLB for LoadBalancer services (L2 mode)
  • 4 segmented IP pools for different workload types
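Segmented pools only help if their ranges never overlap. The sketch below shows one way to sanity-check a pool plan before applying it. The pool names match the ones used for MetalLB in this README, but the address ranges are hypothetical placeholders, not the real ones.

```python
from ipaddress import IPv4Address
from itertools import combinations

# Pool names come from this README; the ranges are made-up examples.
POOLS = {
    "ingress": ("192.168.1.200", "192.168.1.209"),
    "services": ("192.168.1.210", "192.168.1.229"),
    "database": ("192.168.1.230", "192.168.1.239"),
    "reserved": ("192.168.1.240", "192.168.1.250"),
}


def as_range(start: str, end: str) -> tuple[int, int]:
    """Convert an inclusive start/end address pair to integers."""
    return int(IPv4Address(start)), int(IPv4Address(end))


def pool_size(start: str, end: str) -> int:
    """Number of addresses in an inclusive range."""
    lo, hi = as_range(start, end)
    return hi - lo + 1


def overlapping_pools(pools: dict[str, tuple[str, str]]) -> list[tuple[str, str]]:
    """Return every pair of pool names whose ranges share an address."""
    clashes = []
    for (name_a, a), (name_b, b) in combinations(pools.items(), 2):
        a_lo, a_hi = as_range(*a)
        b_lo, b_hi = as_range(*b)
        if a_lo <= b_hi and b_lo <= a_hi:
            clashes.append((name_a, name_b))
    return clashes


if __name__ == "__main__":
    for name, bounds in POOLS.items():
        print(f"{name}: {pool_size(*bounds)} addresses")
    print("overlaps:", overlapping_pools(POOLS))
```

A check like this could run in CI next to the MetalLB manifests, so a typo in a range is caught before it reaches the cluster.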

GitOps Architecture

3-Phase Deployment Pipeline:

Phase 1: infrastructure
    ↓ (deploys controllers)
Phase 2: infrastructure-secrets
    ↓ (syncs from Azure Key Vault)
Phase 3: infrastructure-config
    ↓ (applies configs with substitution)
Result: Fully automated, zero manual intervention

Why 3 phases?

  • Prevents race conditions (configs needing secrets that don't exist yet)
  • Proper dependency ordering (operators and their CRDs before the custom resources that use them)
  • Clean separation of concerns (controllers vs. secrets vs. configs)
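The ordering argument can be illustrated as a small dependency graph. This is only a sketch of the idea: the phase names come from the pipeline above, while the dictionary layout and the `apply_order` helper are illustrative. In Flux, this kind of ordering is typically expressed with `dependsOn` on Kustomization resources; whether this repo wires it exactly that way is an assumption.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases that must be ready before it.
DEPENDS_ON = {
    "infrastructure": set(),
    "infrastructure-secrets": {"infrastructure"},
    "infrastructure-config": {"infrastructure-secrets"},
}


def apply_order(graph: dict[str, set[str]]) -> list[str]:
    """Return an order in which every phase comes after its dependencies.

    Raises graphlib.CycleError if the phases depend on each other in a loop.
    """
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(" -> ".join(apply_order(DEPENDS_ON)))
```

Because the graph is a simple chain, there is exactly one valid order: controllers first, then secrets, then the configs that consume both.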

💻 Hardware

Nodes

All nodes run as virtual machines on Proxmox VE 8.x:

| Node | Role | vCPU | RAM | Disk | Purpose |
|---|---|---|---|---|---|
| talos-cp-01 | Control Plane | 4 | 6 GB | 32 GB | etcd, kube-apiserver, scheduler, controller-manager |
| talos-cp-02 | Control Plane | 4 | 6 GB | 32 GB | etcd, kube-apiserver, scheduler, controller-manager |
| talos-cp-03 | Control Plane | 4 | 6 GB | 32 GB | etcd, kube-apiserver, scheduler, controller-manager |
| talos-worker-01 | Worker | 4 | 8 GB | 100 GB | Application workloads |
| talos-worker-02 | Worker | 4 | 8 GB | 100 GB | Application workloads |
| talos-worker-03 | Worker | 4 | 8 GB | 100 GB | Application workloads |
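As a quick capacity check, the allocations in the table can be totalled and compared with the 48 GB of RAM on the Proxmox host listed under Physical Hardware. This is plain arithmetic over the figures in this README; the host's own memory overhead is not accounted for.

```python
# (name, vCPU, RAM in GB, disk in GB), copied from the node table.
NODES = [
    ("talos-cp-01", 4, 6, 32),
    ("talos-cp-02", 4, 6, 32),
    ("talos-cp-03", 4, 6, 32),
    ("talos-worker-01", 4, 8, 100),
    ("talos-worker-02", 4, 8, 100),
    ("talos-worker-03", 4, 8, 100),
]
HOST_RAM_GB = 48  # Proxmox host RAM from the Physical Hardware section

total_vcpu = sum(vcpu for _, vcpu, _, _ in NODES)
total_ram_gb = sum(ram for _, _, ram, _ in NODES)
total_disk_gb = sum(disk for _, _, _, disk in NODES)
ram_headroom_gb = HOST_RAM_GB - total_ram_gb

if __name__ == "__main__":
    print(f"vCPU allocated: {total_vcpu}")          # 24
    print(f"RAM allocated:  {total_ram_gb} GB")     # 42 GB
    print(f"Disk allocated: {total_disk_gb} GB")    # 396 GB
    print(f"RAM left for the host: {ram_headroom_gb} GB")  # 6 GB
```

The VMs claim 42 GB of the host's 48 GB, which leaves about 6 GB for Proxmox itself, so there is little room to grow node memory without resizing something else.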

Physical Hardware:

  • Proxmox Host: Lenovo ThinkCentre M90t, 48 GB RAM, NVIDIA GeForce GTX 1660 Super GPU
  • Storage: local storage

🚀 Installed Apps & Tools

📱 End User Applications

| Name | Description |
|---|---|
| Linkding | Self-hosted bookmark manager with tagging and full-text search |
| pgAdmin | Web-based PostgreSQL database management tool |

🔧 Infrastructure

Everything needed to run the cluster and deploy applications:

| Name | Description |
|---|---|
| Talos Linux | Immutable, API-driven Kubernetes OS. No SSH, minimal attack surface, production-grade security |
| FluxCD | GitOps operator. Watches Git and applies changes automatically. 3-phase pipeline for dependency management |
| External Secrets Operator | Syncs secrets from Azure Key Vault to Kubernetes, so no plaintext secrets live in Git. About 2,880 API calls/month, which costs roughly $0.01 |
| cert-manager | Automated certificate management. Let's Encrypt integration via HTTP-01 and DNS-01 challenges. Certificates renew automatically about 30 days before expiry |
| Renovate | Automated dependency updates. Tracks 9 Helm releases and more than 15 container images. Weekly scans with GitOps validation |
| MetalLB | LoadBalancer implementation for bare metal, running in L2 mode. 4 IP pools: ingress, services, database, reserved |
| Traefik | Cloud-native ingress controller. Routes traffic automatically based on Ingress resources. Gets its IP from MetalLB |
| Cloudflare Tunnel | Zero-trust access without port forwarding. Secure tunnel from the cluster to the Cloudflare edge |
| SOPS | Encrypted secrets in Git. Used for non-dynamic sensitive configs |
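The Key Vault cost figure quoted for External Secrets Operator can be reproduced with simple arithmetic. The sketch below rests on two assumptions that are not stated in this README: a price of $0.03 per 10,000 Key Vault operations, and a workload of four secret reads per hour (for example, four secrets refreshed hourly), which is one combination that yields 2,880 calls in a 30-day month.

```python
def monthly_calls(secrets: int, refreshes_per_hour: float, days: int = 30) -> float:
    """Key Vault reads per month if every secret is refreshed on a fixed schedule."""
    return secrets * refreshes_per_hour * 24 * days


def monthly_cost_usd(calls: float, price_per_10k: float = 0.03) -> float:
    """Cost of `calls` operations; the default price is an assumption, check current pricing."""
    return calls / 10_000 * price_per_10k


if __name__ == "__main__":
    calls = monthly_calls(secrets=4, refreshes_per_hour=1)  # hypothetical workload
    print(f"{calls:.0f} calls/month, about ${monthly_cost_usd(calls):.4f}")
```

Under these assumptions the bill is a little under one cent per month, so the refresh interval can be tuned for freshness rather than cost.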

📊 Monitoring

| Name | Description |
|---|---|
| Prometheus | Metrics collection and alerting |
| Grafana | Dashboards and visualization |

💾 Data

| Name | Description |
|---|---|
| CloudNativePG | PostgreSQL operator for Kubernetes. Production-grade database management |
