HA K8s cluster using Keepalived and HAProxy
Source: Dev.to
Overview
A stacked HA cluster is a topology where the distributed data storage cluster provided by etcd is stacked on top of the cluster formed by the nodes managed by kubeadm that run control‑plane components.
Each control‑plane node runs an instance of the kube‑apiserver, kube‑scheduler, and kube‑controller‑manager. The kube‑apiserver is exposed to worker nodes using a load balancer.
Each control‑plane node creates a local etcd member and this etcd member communicates only with the kube‑apiserver of the same node. The same applies to the local kube‑controller‑manager and kube‑scheduler instances.
This topology couples the control planes and etcd members on the same nodes. It is simpler to set up than a cluster with external etcd nodes, and simpler to manage for replication.
What happens in a 3‑node stacked cluster
Each control‑plane node runs:
- an etcd member
- kube‑apiserver, scheduler, controller‑manager
So you have:
- 3 etcd members → quorum = 2
- 3 API servers → load‑balanced (can handle 1 down)
If one node fails you still have:
- 2 etcd members → quorum maintained
- 2 control‑plane instances → still available
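The quorum arithmetic above generalizes to any odd cluster size. A quick shell check (plain integer arithmetic, nothing cluster-specific):

```shell
# etcd quorum for n members is floor(n/2) + 1; the cluster then tolerates
# n - quorum simultaneous member failures.
for n in 1 3 5; do
  quorum=$(( n / 2 + 1 ))
  echo "$n members: quorum=$quorum, tolerates $(( n - quorum )) failure(s)"
done
# 1 members: quorum=1, tolerates 0 failure(s)
# 3 members: quorum=2, tolerates 1 failure(s)
# 5 members: quorum=3, tolerates 2 failure(s)
```

This is why control planes are sized at 3 or 5 nodes: a fourth member raises quorum to 3 without improving fault tolerance.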
This is the default topology deployed by kubeadm. A local etcd member is created automatically on control‑plane nodes when using kubeadm init and kubeadm join --control-plane.
Assumptions: You have performed cluster bootstrapping with kubeadm before, as this document does not cover those steps in detail.
Setting up the machines
To set up HAProxy + Keepalived for Kubernetes High Availability (HA) with 3 master nodes and a Virtual IP (VIP), follow this structured approach.
Masters
- 10.238.40.162
- 10.238.40.163
- 10.238.40.164
VIP: 10.238.40.166
Install HAProxy and Keepalived on all masters
sudo apt update
sudo apt install -y haproxy keepalived
HAProxy configuration
Edit /etc/haproxy/haproxy.cfg on all three master nodes. HAProxy listens on port 8443 because the local kube-apiserver already occupies 6443 on the same machine:
global
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option httplog
    option dontlognull

frontend kubernetes-apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend kubernetes-apiserver

backend kubernetes-apiserver
    mode tcp
    balance roundrobin
    option tcp-check
    server master1 10.238.40.162:6443 check fall 3 rise 2
    server master2 10.238.40.163:6443 check fall 3 rise 2
    server master3 10.238.40.164:6443 check fall 3 rise 2
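Before restarting HAProxy it is worth confirming that all three masters appear in the backend (a full syntax check is `haproxy -c -f /etc/haproxy/haproxy.cfg`). The sketch below is self-contained for illustration: it greps a stand-in copy of the backend section; on a real node you would point CFG at /etc/haproxy/haproxy.cfg instead.

```shell
# Stand-in copy of the backend section, so this snippet runs anywhere;
# replace with CFG=/etc/haproxy/haproxy.cfg on an actual master.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
backend kubernetes-apiserver
    mode tcp
    balance roundrobin
    option tcp-check
    server master1 10.238.40.162:6443 check fall 3 rise 2
    server master2 10.238.40.163:6443 check fall 3 rise 2
    server master3 10.238.40.164:6443 check fall 3 rise 2
EOF
# Confirm every master is registered as a 6443 backend with health checks.
for ip in 10.238.40.162 10.238.40.163 10.238.40.164; do
  grep -q "$ip:6443 check" "$CFG" && echo "$ip: backend present"
done
```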
Keepalived configuration
Only one node at a time will “own” the VIP (managed by Keepalived), but the configuration is present on all nodes.
Edit /etc/keepalived/keepalived.conf on each master node.
Note: Adjust the priority value for each node:
| Node | Priority | Role |
|---|---|---|
| Master1 | 110 | MASTER |
| Master2 | 100 | BACKUP |
| Master3 | 90 | BACKUP |
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}

vrrp_script chk_haproxy {
    # kube-apiserver serves only HTTPS on 6443 (self-signed CA), so use https:// and -k
    script "/usr/bin/curl -sfk https://localhost:6443/healthz"
    interval 2
    weight -2
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state MASTER            # MASTER on master1, BACKUP on the other two
    interface enp19s0       # adjust to the NIC name on your machines
    virtual_router_id 51
    priority 110            # change per node as described above
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass k8s-ha-cluster
    }
    virtual_ipaddress {
        10.238.40.166/24
    }
    track_script {
        chk_haproxy
    }
}
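Rather than hand-editing the priority on each node, the values from the table can be templated. A minimal sketch (the node names and the 10-point spacing are assumptions; any descending values work, since the highest priority wins the VRRP election and holds the VIP while healthy):

```shell
# Emit the priority each master should carry: 110, 100, 90.
i=0
for node in master1 master2 master3; do
  echo "$node: priority $(( 110 - i * 10 ))"
  i=$(( i + 1 ))
done
# master1: priority 110
# master2: priority 100
# master3: priority 90
```

In practice you would render the full keepalived.conf the same way, substituting the priority (and state) per node.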
Restart services
sudo systemctl restart haproxy keepalived
sudo systemctl enable haproxy keepalived
Verify the VIP
ip addr show | grep 10.238.40.166

Check service status
sudo systemctl status haproxy
sudo systemctl status keepalived
Bootstrap the cluster
Create a kubeadm-config.yaml file on the first master node.
Use the VIP as the control‑plane endpoint and include it in apiServer.certSANs.
Important: Change the advertiseAddress field in InitConfiguration to match each master node's IP address.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.32.6
apiServer:
  certSANs:
  - "10.238.40.166"   # VIP
  - "127.0.0.1"       # localhost
  - "0.0.0.0"         # wildcard
  - "10.96.0.1"       # Kubernetes service IP
  - "10.238.40.162"
  - "10.238.40.163"
  - "10.238.40.164"
  extraArgs:
    authorization-mode: "Node,RBAC"
certificatesDir: /etc/kubernetes/pki
clusterName: pcai
controlPlaneEndpoint: "10.238.40.166:8443"
controllerManager:
  extraArgs:
    bind-address: 0.0.0.0
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
networking:
  dnsDomain: cluster.local
  podSubnet: "172.20.0.0/16"
  serviceSubnet: "172.30.0.0/16"
scheduler:
  extraArgs:
    bind-address: 0.0.0.0
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "10.238.40.162"
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd

Continue with the usual kubeadm init / kubeadm join steps, referencing the VIP (10.238.40.166) as the control‑plane endpoint.
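One mistake that surfaces only later as x509 certificate errors is a VIP present in controlPlaneEndpoint but missing from certSANs. A self-contained sketch of the check, run here against a stand-in file (point CFG at your real kubeadm-config.yaml instead):

```shell
# The VIP must appear both as the control-plane endpoint (with the HAProxy
# port) and as a certificate SAN, or clients connecting through the VIP
# will fail TLS verification.
VIP=10.238.40.166
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
controlPlaneEndpoint: "10.238.40.166:8443"
apiServer:
  certSANs:
  - "10.238.40.166"
EOF
grep -q "controlPlaneEndpoint: \"$VIP:8443\"" "$CFG" && echo "endpoint: ok"
grep -q "\"$VIP\"$" "$CFG" && echo "certSAN: ok"
```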
Initialize the cluster
kubeadm init --upload-certs --config kubeadm-config.yaml -v=5
Note: Save the output! It contains the join commands for control‑plane and worker nodes.
Configure kubectl access
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install a networking solution (Calico)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
Wait for networking pods to be ready
kubectl wait --for=condition=ready pod -l k8s-app=calico-node -n kube-system --timeout=300s
Join additional control‑plane nodes
Run the control‑plane join command (output of kubeadm init) on the other master nodes:
kubeadm join 10.238.40.166:8443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <certificate-key>
Note: The certificate key is only valid for 2 hours. If it expires, generate a new one:
kubeadm init phase upload-certs --upload-certs
Verification and Health Checks
Check that all nodes are ready
kubectl get nodes -o wide
Verify control‑plane components
kubectl get pods -n kube-system
Check etcd cluster health
kubectl exec -n kube-system etcd-<node-name> -- etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
member list
Test VIP failover
# Stop keepalived on the master node that owns the VIP
sudo systemctl stop keepalived
# Verify VIP moves to another node
ip addr show | grep 10.238.40.166
# Test API access via VIP
curl -k https://10.238.40.166:8443/healthz
# Restart keepalived
sudo systemctl start keepalived
Conclusion
Congratulations! You have successfully deployed a highly available Kubernetes cluster using a stacked etcd topology with HAProxy and Keepalived. This setup provides:
Key Benefits
- High Availability: Automatic failover with no single point of failure
- Load Distribution: Traffic distributed across all API servers via HAProxy
- Automatic Recovery: Keepalived handles VIP failover in seconds
- Simplified Architecture: Stacked topology reduces complexity compared to external etcd
Cluster Capabilities
With this 3‑master node configuration:
- Tolerates 1 node failure while maintaining full cluster functionality
- Maintains etcd quorum with 2 out of 3 members
- Continues serving API requests through the remaining healthy masters
- Automatically fails over VIP to operational nodes
