Misadventures in Kubernetes: Autoscaling Workers

Published: May 9, 2026 at 09:31 PM EDT
3 min read
Source: Dev.to

Overview

Our initial cluster was built manually: the control plane was configured by hand and each worker node was joined individually. While this approach is great for learning, it isn’t scalable or resilient for production. To move beyond a static, single‑node worker pool we need three capabilities:

  • Automatic joining – new VMs should join the cluster as soon as they boot.
  • Self‑healing – if a node fails, a replacement should be provisioned automatically.
  • Smart scaling – the cluster should grow when load increases and shrink when it drops, saving cost.

The key is to have a startup script on each VM that runs kubeadm join automatically.

Create a Permanent Join Token

By default, kubeadm bootstrap tokens expire after 24 hours, which isn’t suitable for an autoscaling group that may run for months. On the control plane, create a token that never expires by setting the TTL to 0:

kubeadm token create --print-join-command --ttl 0

Copy the full command that is printed; you’ll embed it in the startup script later.
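
A truncated copy of the join command is an easy mistake to make, and it only shows up later when a new VM fails to join. As a minimal sketch, the check below verifies that a copied command has the expected shape. check_join_command is a hypothetical helper, not part of kubeadm; it assumes only the bootstrap token format (six characters, a dot, sixteen characters, all lowercase letters or digits) and that the CA hash is a SHA-256 digest.

```shell
#!/bin/sh
# check_join_command: hypothetical helper, not part of kubeadm.
# $1: the full command printed by "kubeadm token create --print-join-command"
check_join_command() {
  # The command must start with "kubeadm join ".
  case "$1" in
    "kubeadm join "*) ;;
    *) echo "not a kubeadm join command"; return 1 ;;
  esac
  # Bootstrap tokens look like abcdef.0123456789abcdef.
  printf '%s\n' "$1" | grep -Eq -e '--token [a-z0-9]{6}\.[a-z0-9]{16}( |$)' \
    || { echo "missing or malformed --token"; return 1; }
  # The CA certificate hash is sha256: followed by 64 hex characters.
  printf '%s\n' "$1" | grep -Eq -e '--discovery-token-ca-cert-hash sha256:[a-f0-9]{64}( |$)' \
    || { echo "missing or malformed CA cert hash"; return 1; }
  echo "ok"
}
```

Calling check_join_command with the copied command in double quotes prints ok when the command looks complete, or a short reason when it does not.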

Build an Instance Template

An Instance Template tells GCP how to build a VM (image, machine type, metadata, etc.). Put the join command you copied above in the startup script, on the line after the shebang.

gcloud compute instance-templates create k8s-worker-template \
  --image-family=k8s-node-family \
  --machine-type=e2-standard-2 \
  --tags=k8s-worker \
  --metadata startup-script='#! /bin/bash
# Replace this comment line with the kubeadm join command copied from the control plane.
'

Note: k8s-node-family is the custom image created in the earlier part of this series.
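
GCE runs the startup script on every boot, not only the first one, and kubeadm join fails its preflight checks on a node where /etc/kubernetes/kubelet.conf already exists. A minimal sketch of a guard for the startup script, assuming that file is an acceptable marker for "already joined"; join_if_needed is a hypothetical helper name:

```shell
#!/bin/sh
# join_if_needed: hypothetical helper for the startup script.
# $1: marker file that exists once the node has joined
# remaining arguments: the command to run when the marker is absent
join_if_needed() {
  marker=$1
  shift
  if [ -f "$marker" ]; then
    # The node joined on an earlier boot; do not run the join again.
    echo "already joined; skipping"
    return 0
  fi
  "$@"
}

# In the startup script (assumption: kubelet.conf is the marker), the call would be:
# join_if_needed /etc/kubernetes/kubelet.conf <the kubeadm join command copied earlier>
```

Passing the command as arguments, rather than hard-coding it in the function, keeps the guard reusable and lets it be tried out with a harmless command first.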

Create a Regional Managed Instance Group (MIG)

A regional MIG spreads nodes across multiple zones for high availability.

gcloud compute instance-groups managed create k8s-worker-mig \
  --template=k8s-worker-template \
  --size=1 \
  --region=us-central1

GCP will immediately spin up one node, which will boot, execute the startup script, and join the cluster automatically.
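
The join is not instantaneous: the VM has to boot and run the startup script before the node appears. As a sketch, a small polling helper can block until a check succeeds. wait_for is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# wait_for: hypothetical polling helper.
# $1: maximum number of attempts
# $2: seconds to sleep between attempts
# remaining arguments: the command to retry until it succeeds
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # Every attempt failed.
  return 1
}
```

For example, assuming the MIG names its instances with the default k8s-worker-mig- prefix, something like wait_for 30 10 sh -c 'kubectl get nodes --no-headers | grep -Eq "^k8s-worker-mig-[^ ]* +Ready"' would wait up to about five minutes for a worker to report Ready.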

Enable Autoscaling

Configure the MIG to scale based on CPU utilization. The group keeps at least one node and adds nodes, up to a maximum of five, when average CPU utilization across the group exceeds 60%.

gcloud compute instance-groups managed set-autoscaling k8s-worker-mig \
  --max-num-replicas=5 \
  --min-num-replicas=1 \
  --target-cpu-utilization=0.60 \
  --region=us-central1
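
To build intuition for what the target means, here is a simplified proportional model of the sizing decision. This is an assumption for illustration only, not GCP's actual algorithm: it estimates the group size that would bring average utilization back to the target and clamps the result to the configured minimum and maximum. recommended_size is a hypothetical name.

```shell
#!/bin/sh
# recommended_size: simplified model, NOT the real GCP autoscaler algorithm.
# $1: current node count
# $2: current average CPU utilization in percent
# $3: target CPU utilization in percent
# $4: minimum replicas
# $5: maximum replicas
recommended_size() {
  # Integer ceiling of (nodes * current / target).
  want=$(( ($1 * $2 + $3 - 1) / $3 ))
  if [ "$want" -lt "$4" ]; then
    want=$4
  fi
  if [ "$want" -gt "$5" ]; then
    want=$5
  fi
  echo "$want"
}
```

Under this model, recommended_size 1 95 60 1 5 prints 2: one node at 95% needs a second node to get back under 60%. With four nodes at 100%, the model asks for seven and is capped at the maximum of 5.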

Test the Autoscaler

  1. Create a load generator – a busybox pod that burns CPU in an infinite loop.

    kubectl create deployment load-generator --image=busybox -- /bin/sh -c "while true; do :; done"
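  2. Request CPU – tell Kubernetes how much CPU each pod needs, so the scheduler can decide how many pods fit on a node. Note that the MIG autoscaler reacts to measured CPU utilization on the VMs, not to pod requests; no Kubernetes Cluster Autoscaler is involved in this setup.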

    kubectl set resources deployment load-generator --requests=cpu=200m
  3. Scale the load – increase the replica count to generate enough demand.

    kubectl scale deployment load-generator --replicas=20
  4. Watch the scaling in action

    • In one terminal, watch the node list:

      kubectl get nodes -w
    • In another terminal, watch the MIG instances:

      gcloud compute instance-groups managed list-instances k8s-worker-mig --region=us-central1

    You should see the initial node fill up, the remaining pods enter the Pending state, and the GCP autoscaler provision additional VMs as CPU utilization in the group stays above the target.
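
The numbers in this test can be checked with a little arithmetic. The sketch below estimates how many nodes the scheduler needs to place every replica, based on CPU requests alone. nodes_needed is a hypothetical helper, and it assumes the full 2 vCPUs (2000m) of an e2-standard-2 are schedulable, which is optimistic because system pods also hold CPU requests.

```shell
#!/bin/sh
# nodes_needed: hypothetical helper, counts nodes by CPU requests only.
# $1: replica count
# $2: CPU request per pod in millicores
# $3: schedulable CPU per node in millicores
nodes_needed() {
  per_node=$(( $3 / $2 ))
  if [ "$per_node" -eq 0 ]; then
    echo "a single pod does not fit on a node"
    return 1
  fi
  # Integer ceiling of (replicas / pods per node).
  echo $(( ($1 + per_node - 1) / per_node ))
}
```

For the values used here, nodes_needed 20 200 2000 prints 2: at most ten pods fit per node, so twenty replicas need at least two workers. Until the second worker joins, the extra pods stay Pending.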

Comparison with GKE

If you were using Google Kubernetes Engine, the entire setup could be replaced by a single command:

gcloud container clusters create k8s-easy-cluster \
  --zone us-central1-a \
  --num-nodes 3 \
  --machine-type e2-medium

While GKE abstracts away the underlying plumbing, building the cluster “the hard way” gives you deeper insight into each component, making you a better operator when troubleshooting.

Next Steps

  • Upgrade the control plane without downtime – continue the series to learn rolling upgrades.
  • Explore additional automation (e.g., monitoring, logging, network policies).
