Kubeforge builds production-style Kubernetes clusters on Proxmox using kubeadm, OpenTofu, cloud-init, and Ansible.
It is designed for environments ranging from homelabs to lab, edge, and enterprise-style deployments where you want repeatable cluster lifecycle management without giving up control over the underlying Kubernetes stack.
Out of the box it handles:
- VM provisioning on Proxmox with OpenTofu
- cloud image bootstrapping with cloud-init
- Kubernetes node preparation with Ansible
- Kubernetes bootstrapping with kubeadm + containerd
- Cilium with native LoadBalancer IPAM and L2 announcements
- Traefik
- kube-vip for multi-control-plane API high availability
- optional Proxmox CSI
Ubuntu and Rocky Linux cloud images are both supported.
- Interactive `configure` flow with curated Ubuntu and Rocky presets
- VM/IP/VMID planning with subnet validation and safer config guardrails
- Proxmox-managed NIC MAC addresses by default
- kube-vip for lightweight multi-control-plane API high availability without a separate HAProxy VM
- Cilium-native service load balancing without MetalLB
- Automatic kubeconfig fetch, install, and merge into `~/.kube/config`
- Cluster-aware `destroy` and `upgrade` workflows for multi-cluster workstations
- Built-in `health` command for fast post-bootstrap validation
- Upstream Helm-based Proxmox CSI support, with `pvecsictl` available for manual local-storage PV moves
Run the deployer from macOS, Ubuntu/Debian, or Rocky/RHEL/Fedora.
Required tools:
- `tofu`
- `ansible-playbook`
- `ansible-inventory`
- `python3`
- `ssh`
- `ssh-keygen`
- `kubectl`
- `jq`
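As a quick manual preflight, a sketch like the following reports which of the required commands above are present (the `check_cmd` helper is hypothetical; the deployer's own check is more thorough):

```sh
# Hypothetical preflight helper: prints ok/missing per command
check_cmd() {
  command -v "$1" >/dev/null 2>&1 && echo "ok: $1" || echo "missing: $1"
}

for cmd in tofu ansible-playbook ansible-inventory python3 ssh ssh-keygen kubectl jq; do
  check_cmd "$cmd"
done
```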
Optional but recommended local tools:
- `cilium` CLI for inspecting Cilium networking, policies, and LoadBalancer state
- `ssh-copy-id` for `./deploy.sh proxmox-ssh-setup`
- `kubectx` for faster multi-cluster context switching
- `Freelens` for a local Kubernetes GUI
- `k9s` for a terminal Kubernetes UI
- `pvecsictl` for moving local Proxmox CSI volumes between Proxmox nodes
  - `pvecsictl` requires Go when installed via `go install`
The deployer checks for missing required commands and:
- prompts to install supported ones automatically when it knows how
- otherwise prints install hints for the missing tool
Supported auto-install prompts use:
- macOS via Homebrew
- Debian/Ubuntu via `apt`
- Rocky/RHEL/Fedora via `dnf`
```sh
./deploy.sh configure
./deploy.sh apply
./deploy.sh bootstrap
./deploy.sh health
```

That flow gives you:
- rendered Terraform and Ansible inputs
- provisioned Proxmox VMs
- a bootstrapped Kubernetes cluster
- merged kubeconfig in `~/.kube/config`
- a quick post-bootstrap health check
`configure`

- Writes `terraform.tfvars.json`
- Validates Proxmox connectivity and VMID choices
- Lets you choose Ubuntu or Rocky image presets, plus custom image URL/file overrides
`apply`

- Applies infrastructure with OpenTofu
- Downloads the selected cloud image
- Creates VMs and renders fresh inventory / Ansible vars
- Refreshes your local kubeconfig from `out/kubeconfig` only when one already exists from a previous successful bootstrap
`bootstrap`

- Waits for SSH reachability
- Prepares nodes
- Bootstraps kubeadm
- Installs Cilium, Cilium LoadBalancer IPAM / L2 announcements, Traefik, kube-vip for HA clusters, and optional Proxmox CSI
- Uses the upstream Helm-based Proxmox CSI install path and creates a `proxmox` StorageClass for the selected Proxmox datastore
- Fetches `out/kubeconfig`
- Backs up and merges the cluster kubeconfig into `~/.kube/config` as soon as `out/kubeconfig` is available during bootstrap
- Prints optional local tool suggestions at the end of a successful bootstrap
`health`

- Uses the installed kubeconfig if available, otherwise `out/kubeconfig`
- Validates nodes, core cluster services, cluster API reachability, Cilium LoadBalancer IP pools, and assigned LoadBalancer service IPs
`upgrade`

- Prompts for the target tracked cluster when more than one cluster exists
- Checks your configured Kubernetes and chart versions against newer available upstream versions
- Prompts you to choose which discovered upgrades to apply before running the upgrade playbook
- Updates `terraform.tfvars.json` and the rendered Ansible vars for the selected target versions
```sh
./deploy.sh configure
./deploy.sh plan
./deploy.sh apply
./deploy.sh bootstrap
./deploy.sh upgrade
./deploy.sh destroy
./deploy.sh output
./deploy.sh install-kubeconfig
./deploy.sh proxmox-ssh-setup
./deploy.sh health
```

Useful outputs:

- `out/inventory.yml`
- `out/ansible-vars.yml`
- `out/kubeconfig`
- `out/ssh/id_cluster_ed25519`
- `out/deployment-history/<workspace>/last-applied.tfvars.json`
- `out/deployment-history/<workspace>/ssh/id_cluster_ed25519`
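These outputs can be used directly; for example (the node IP and bootstrap user below are hypothetical):

```sh
# SSH to a node using the generated bootstrap key
ssh -i out/ssh/id_cluster_ed25519 ubuntu@192.168.1.101

# Talk to the cluster without touching ~/.kube/config
KUBECONFIG=out/kubeconfig kubectl get nodes
```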
After a successful bootstrap:
- the cluster kubeconfig is fetched to `out/kubeconfig`
- the deployer installs it into `~/.kube/config`
- if `~/.kube/config` already exists, it is:
  - backed up with a timestamped `.bak.*` suffix
  - merged with the new cluster config while preserving cluster-specific authinfo names
If bootstrap later fails after the kubeconfig has already been fetched, the deployer still refreshes `~/.kube/config` from that fetched file instead of losing it.
You do not need to edit your shell profile when using `~/.kube/config`.
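If you ever need to reproduce the merge by hand, the standard kubectl flatten technique looks roughly like this (a sketch, not the deployer's exact code):

```sh
# Back up first, then flatten both configs into one file
cp ~/.kube/config ~/.kube/config.bak.$(date +%Y%m%d%H%M%S)
KUBECONFIG=~/.kube/config:out/kubeconfig \
  kubectl config view --flatten > ~/.kube/config.merged
mv ~/.kube/config.merged ~/.kube/config
```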
If you want to reinstall or re-merge it later:
```sh
./deploy.sh install-kubeconfig
```

After apply:

- if `out/kubeconfig` already exists from an earlier successful bootstrap, the deployer refreshes `~/.kube/config`
- if no kubeconfig has been fetched yet, the deployer leaves kubeconfig alone and tells you that installation will happen after bootstrap
Ubuntu nodes:

- Uses `apt`
- Uses the configured bootstrap user, default `ubuntu`
- Applies node updates before Kubernetes prep

Rocky Linux nodes:

- Uses `dnf`
- Uses the configured bootstrap user, default `rocky`
- Can optionally install and enable Cockpit
- Writes explicit DNS settings before package work to avoid cloud-init / NetworkManager DNS surprises
- Enables `sshd` correctly in cloud-init
- Single control plane:
  - the Kubernetes API endpoint is the control-plane node IP
- Two or more control planes:
  - kube-vip provides a floating virtual IP for the Kubernetes API
  - the Kubernetes API endpoint points at the kube-vip IP
  - no separate HAProxy VM is required
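To confirm which endpoint the cluster is actually using, assuming a hypothetical kube-vip address of 192.168.1.100:

```sh
# The API server's /healthz endpoint is readable anonymously by default
curl -k https://192.168.1.100:6443/healthz

# Shows the API endpoint your kubeconfig points at
kubectl cluster-info
```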
The deployer keeps a history of the last applied config per cluster workspace.
That allows destroy to:
- target the correct cluster even if your current `terraform.tfvars.json` has changed
- prompt you when multiple tracked clusters exist
- fall back to legacy root state when cleaning up older deployments created before workspace-aware state handling
- prune only the destroyed cluster from `~/.kube/config`, while leaving other contexts intact
- remove shared `out/` artifacts only when they belong to the cluster being destroyed
- preserve the shared bootstrap SSH key in `out/ssh/` by restoring it from another remaining cluster history when appropriate
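The kubeconfig pruning corresponds to the standard kubectl cleanup commands, which you can also run by hand (the cluster and user names below are hypothetical):

```sh
# Remove one cluster's entries while leaving other contexts intact
kubectl config delete-context my-cluster
kubectl config delete-cluster my-cluster
kubectl config delete-user my-cluster-admin
```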
Run:
```sh
./deploy.sh health
```

It validates:
- node readiness
- cluster API access
- `coredns`
- `cilium-operator`
- `traefik`
- Cilium LoadBalancer IP pool ranges and usage
- LoadBalancer services and assigned external IPs
- all pods across namespaces
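Roughly the same checks can be run by hand with kubectl, e.g.:

```sh
kubectl get nodes                      # node readiness
kubectl -n kube-system get pods        # coredns, cilium-operator, etc.
kubectl get pods -A                    # all pods across namespaces
kubectl get svc -A                     # look for assigned LoadBalancer IPs
kubectl get ciliumloadbalancerippools  # Cilium LB IPAM pool resources
```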
If you changed `terraform.tfvars.json`, run:

```sh
./deploy.sh apply
```

`bootstrap` uses rendered outputs from `out/inventory.yml` and `out/ansible-vars.yml`. It should not run against old rendered data.
If Proxmox is using the local snippets datastore, the host needs:

```sh
sudo install -d -m 0755 /var/lib/vz/snippets
```

The deployer can create that automatically if Proxmox SSH access has been set up with:

```sh
./deploy.sh proxmox-ssh-setup
```

The guest package alone is not enough. The Proxmox VM option must also enable the QEMU guest agent device. If you changed that option on an existing VM, a full stop/start may be required.
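On the Proxmox host, the VM option can be enabled with `qm`, followed by a full stop/start (VMID 100 is hypothetical):

```sh
qm set 100 --agent enabled=1
qm stop 100
qm start 100
```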
The deployer writes explicit DNS configuration and bounds package-manager timeouts, but mirror issues can still happen. Re-running bootstrap after connectivity stabilizes is usually enough.
Use full IPv4 range syntax in the rendered config. The deployer now normalizes shorthand like `192.168.1.80-89` into `192.168.1.80-192.168.1.89`.
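The normalization is equivalent to something like this sketch (not the deployer's actual code):

```sh
# Expand shorthand like 192.168.1.80-89 to full start-end IPv4 form
normalize_range() {
  range=$1
  start=${range%-*}        # everything before the last '-'
  end=${range##*-}         # everything after the last '-'
  case $end in
    *.*) echo "$range" ;;  # already a full start-end range
    *)   echo "$start-${start%.*}.$end" ;;
  esac
}

normalize_range "192.168.1.80-89"            # 192.168.1.80-192.168.1.89
normalize_range "192.168.1.80-192.168.1.89"  # unchanged
```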
If you changed the tfvars manually, rerun:
```sh
./deploy.sh apply
```

before bootstrap.
The deployer installs Proxmox CSI using the upstream Helm chart flow and creates a `proxmox` StorageClass that points at your selected Proxmox datastore.
For local Proxmox storage like `lvm`, `lvm-thin`, `zfs`, `ext4`, or `xfs`, cross-node PV moves are still a manual workflow. The upstream project documents `pvecsictl` for those offline PV migrations.
If you changed configuration and want to rerun bootstrap cleanly, refresh the rendered outputs first:
```sh
./deploy.sh apply
./deploy.sh bootstrap
```

If you want a smoother day-to-day operator experience, these are the most useful optional tools to add:

- `cilium` CLI for Cilium networking and service IP troubleshooting
- `Freelens` for a desktop Kubernetes UI
- `kubectx` for fast context switching between clusters
- `k9s` for terminal-based Kubernetes inspection
- `pvecsictl` for manual local Proxmox CSI volume moves between Proxmox nodes
  - `pvecsictl` requires Go when installed with `go install`
On macOS:
```sh
# Follow the official Cilium CLI install instructions:
# https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/
brew install --cask freelens
brew install kubectx
brew install k9s
GOBIN="$HOME/.local/bin" go install github.com/sergelogvinov/proxmox-csi-plugin/cmd/pvecsictl@latest
```

On Linux:
- install the `cilium` CLI using the official Cilium install instructions
- install `kubectx` and `k9s` from your distro package manager when available
- install `Freelens` from the official DEB, RPM, Flatpak, or Snap packages
- install `pvecsictl` with Go:

```sh
GOBIN="$HOME/.local/bin" go install github.com/sergelogvinov/proxmox-csi-plugin/cmd/pvecsictl@latest
```

Official project links:
- Freelens: https://freelensapp.github.io/
- kubectx: https://github.com/ahmetb/kubectx
- k9s: https://k9scli.io/
- pvecsictl: https://github.com/sergelogvinov/proxmox-csi-plugin
See `terraform.tfvars.example` for the generated config structure.
Kubeforge is licensed under the Apache License 2.0. See LICENSE.