Kubernetes HPA Not Scaling: Debugging Guide
Source: Dev.to
Introduction
In production, the ability to scale on demand is essential for performance and reliability. HPA is a core component for achieving that, but when it doesn’t work, diagnosing the problem can be challenging. By the end of this article you will:
- Understand why HPA may not scale.
- Follow a clear, step‑by‑step troubleshooting workflow.
- Apply best practices to prevent future scaling issues.
Understanding the Problem
Typical symptoms of a non‑scaling HPA include:
- Pods not scaling up or down as expected.
- HPA not reacting to changes in CPU, memory, or custom metrics.
- Errors in the HPA controller logs.
Real‑world example: A marketing campaign drives a traffic spike, but the pods stay at the original replica count, causing latency spikes and possible downtime. To resolve this, you need to understand the HPA internals and the components that influence its decisions.
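Conceptually, the HPA controller compares the observed metric against the target and scales by the ratio. A minimal sketch of that rule (illustrative only, not the actual controller code; the 10% tolerance band is the controller's default and is simplified here):

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Sketch of the HPA rule: desired = ceil(current * current/target),
    clamped to [minReplicas, maxReplicas]. Ratios within the tolerance
    band produce no change, which avoids flapping."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(desired, max_replicas))

# Traffic spike: 90% average CPU against a 50% target with 3 replicas
print(desired_replicas(3, 90, 50, 3, 10))  # → 6
```

This also makes the failure modes below concrete: if the metric can't be fetched, `ratio` is unknown and no decision is made; if `maxReplicas` is too low, the clamp wins.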
Prerequisites
To follow this guide you need:
- Basic knowledge of Kubernetes and HPA.
- A Kubernetes cluster with the Metrics Server installed (the HPA relies on it for CPU and memory metrics).
- `kubectl` installed and configured to talk to the cluster.
- A text editor or IDE for editing YAML manifests.
- Access to a terminal/command prompt.
Step‑by‑Step Solution
Step 1: Diagnosis
1. Check HPA objects:

   ```bash
   kubectl get hpa -A
   ```

2. Inspect pod health:

   ```bash
   kubectl get pods -A
   ```

3. Find non‑Running pods:

   ```bash
   kubectl get pods -A | grep -v Running
   ```

4. Examine HPA events and controller logs:

   ```bash
   kubectl describe hpa -n <namespace> <hpa-name>
   # The HPA controller runs inside the controller manager
   kubectl logs -n kube-system -l component=kube-controller-manager
   ```

   Look for warnings such as "failed to get metrics" or "unable to scale".
Step 2: Implementation (Creating a Correct HPA)
Below is a minimal example of a Deployment together with a matching HPA. Adjust the resource requests/limits and metric targets to suit your workload.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
  labels:
    app: example
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: example
          image: example/image:latest
          resources:
            requests:
              cpu: 100m
            limits:
              cpu: 200m
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50  # Adjust as needed
```
Key points
- Use `scaleTargetRef` (not `selector`) to point the HPA at the Deployment.
- Ensure the Deployment's pod template contains the same labels (`app: example`).
- Set realistic `requests`/`limits` so the HPA can calculate utilization correctly.
- Choose an appropriate `averageUtilization` (or use custom metrics if needed).
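To see why the requests matter: resource utilization is computed per pod as usage divided by the pod's request, then averaged across pods. A rough sketch of that calculation (illustrative only; values in millicores):

```python
def average_cpu_utilization(usages_m, request_m):
    """Average CPU utilization across pods: each pod's usage (millicores)
    divided by the per-pod request. Without a request the ratio is
    undefined, which is why the HPA reports <unknown> for such targets."""
    if not request_m:
        raise ValueError("resources.requests.cpu must be set for the HPA")
    per_pod = [100 * usage / request_m for usage in usages_m]
    return round(sum(per_pod) / len(per_pod))

# Three pods using 120m, 80m, and 160m against a 100m request
print(average_cpu_utilization([120, 80, 160], 100))  # → 120
```

With a 50% target, an average of 120% would drive a scale‑up; with no request set, no decision can be made at all.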
Step 3: Verify the HPA Works
1. Apply the manifests:

   ```bash
   kubectl apply -f deployment-and-hpa.yaml
   ```

2. Generate load (e.g., with `hey` or `wrk`) to push CPU usage above the target.

3. Watch the HPA status:

   ```bash
   kubectl get hpa example-hpa -w
   ```

   You should see the REPLICAS count increase as the metric crosses the threshold.
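If `hey` or `wrk` isn't at hand, a throwaway load generator along these lines also works (a stdlib‑only sketch; `generate_load` is a hypothetical helper — point it at your Service's address):

```python
import concurrent.futures
import urllib.request

def generate_load(url, total_requests=500, workers=20):
    """Fire concurrent GET requests at the service to drive CPU usage up.
    Returns the number of HTTP 200 responses received."""
    def hit(_):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.status == 200
        except OSError:
            return False

    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(hit, range(total_requests)))

# Example: generate_load("http://<service-ip>/", total_requests=1000)
```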
Best Practices & Common Pitfalls
| Pitfall | Why it Happens | Fix |
|---|---|---|
| Missing resource requests | HPA can’t compute utilization without a request value. | Define resources.requests.cpu (and memory if needed). |
| Incorrect scaleTargetRef | HPA points to the wrong object, so no scaling occurs. | Verify apiVersion, kind, and name match the target workload. |
| Metrics Server not installed | HPA can’t fetch CPU/memory metrics. | Deploy the Metrics Server (kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml). |
| Too low maxReplicas | HPA hits the ceiling before meeting demand. | Set maxReplicas high enough for expected spikes. |
| Pod Disruption Budgets blocking scale‑down | PDB prevents pods from terminating, causing HPA to think it can’t scale down. | Adjust PDB minAvailable or maxUnavailable as appropriate. |
| Custom metrics not exposed | HPA using custom metrics fails silently. | Ensure the custom metrics API (e.g., Prometheus Adapter) is correctly configured and metrics are exposed. |
Verification
To verify that the HPA setup is working correctly, run:

```bash
kubectl get hpa example-hpa -o yaml
```

This displays the current HPA configuration, including the number of replicas.
You can also check the pod status:

```bash
kubectl get pods -A
```
Complete Manifest Example
A full Kubernetes manifest with HPA enabled:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: example
          image: example/image
          resources:
            requests:
              cpu: 100m
            limits:
              cpu: 200m
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
---
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  selector:
    app: example
  ports:
    - name: http
      port: 80
      targetPort: 8080
  type: LoadBalancer
```
This manifest creates a Deployment with three replicas, an HPA that targets 50% average CPU utilization, and a LoadBalancer Service exposing the pods.
Common Pitfalls and How to Avoid Them
- Insufficient resources – Ensure the cluster has enough capacity to scale.
- Incorrect metrics – Verify the HPA uses the correct metric (CPU, memory, or custom).
- Inadequate monitoring – Set up alerts to detect HPA issues early.
- Inconsistent labels – Keep deployment and HPA labels in sync so the controller can match them.
- Inadequate testing – Simulate load to confirm the HPA behaves as expected.
Best Practices Summary
- Combine resource‑based and custom metrics for responsive scaling.
- Monitor HPA status and pod performance continuously.
- Use consistent labels and annotations across resources.
- Test scaling behavior under realistic workloads.
- Deploy a load balancer or ingress controller to distribute traffic evenly.
Conclusion
We examined common reasons why an HPA might not scale and provided a step‑by‑step troubleshooting guide. By following the best‑practice recommendations, you can ensure your Kubernetes cluster scales efficiently, delivering a reliable experience for users.
Further Reading
- Kubernetes Deployment Strategies – Rolling updates, blue‑green, canary, etc.
- Kubernetes Networking – Pods, Services, Ingress controllers.
- Kubernetes Security – Network policies, secrets, RBAC.
🚀 Level Up Your DevOps Skills
Want to master Kubernetes troubleshooting? Check out these resources:
📚 Recommended Tools
- Lens – The Kubernetes IDE that makes debugging 10× faster.
- k9s – Terminal‑based Kubernetes dashboard.
- Stern – Multi‑pod log tailing.
📖 Courses & Books
- Kubernetes Troubleshooting in 7 Days – Step‑by‑step email course ($7).
- Kubernetes in Action – Definitive guide (Amazon).
- Cloud Native DevOps with Kubernetes – Production best practices.
📬 Stay Updated
Subscribe to DevOps Daily Newsletter for:
- 3 curated articles per week
- Production incident case studies
- Exclusive troubleshooting tips
Found this helpful? Share it with your team!
Originally published at aicontentlab.xyz.
