Kubernetes Networking — Broken Labs & Incident Response
Source: Dev.to
When traffic fails, never guess.
Always follow this order:
Ingress
↓
Service
↓
Endpoints
↓
Pod
↓
Container
If one layer fails, everything above it fails.
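The layered order can be sketched in a few lines of Python. This is only an illustration: it assumes each layer can be reduced to a boolean health check, and the `checks` callables are hypothetical stand-ins for whatever kubectl commands or probes you actually run.

```python
from typing import Callable, Dict, List, Optional

# Layers in the order the article checks them, outermost first.
LAYERS: List[str] = ["Ingress", "Service", "Endpoints", "Pod", "Container"]


def affected_layers(broken: str) -> List[str]:
    """Layers that look unhealthy when `broken` fails: that layer and everything above it."""
    return LAYERS[: LAYERS.index(broken) + 1]


def first_failing_layer(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the per-layer checks in order and return the first layer whose check fails.

    Layers without a check are skipped; None means every supplied check passed.
    """
    for layer in LAYERS:
        check = checks.get(layer)
        if check is not None and not check():
            return layer
    return None
```

For example, `affected_layers("Endpoints")` lists Ingress, Service, and Endpoints, which is why a broken Endpoints object shows up as a failure at the Ingress too.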
LAB 1 — ClusterIP Service (Most Common Production Failure)


Scenario
- Pods are Running
- Service exists
- Browser / curl returns nothing
Broken Setup
Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-v1   # must equal the template labels, or the API server rejects the Deployment
  template:
    metadata:
      labels:
        app: api-v1   # ❌ wrong label: the Service selects app: api
    spec:
      containers:
        - name: app
          image: hashicorp/http-echo:0.2.3
          args:
            - "-listen=:8080"
            - "-text=API OK"
          ports:
            - containerPort: 8080
Service
apiVersion: v1
kind: Service
metadata:
  name: api-svc
spec:
  selector:
    app: api   # ❌ mismatch
  ports:
    - port: 80
      targetPort: 8080
Symptoms
kubectl get pods
kubectl get svc
kubectl get endpoints api-svc
Output
ENDPOINTS: <none>
Root Cause
Service selector does not match the Pod labels.
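The matching rule behind this root cause is simple: a Pod backs a Service only when every key/value pair in the Service selector appears in the Pod's labels. A minimal Python sketch of that rule follows; the function names and the Pod names in the usage note are made up for illustration, and Services without a selector are not modeled.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True when every key/value pair in the selector is present in the labels.

    Extra labels on the Pod are fine; a missing or different value is not.
    """
    return all(labels.get(key) == value for key, value in selector.items())


def selected_pods(selector: Dict[str, str], pods: Dict[str, Dict[str, str]]) -> List[str]:
    """Names of the Pods (name -> labels) that the selector would pick up."""
    return sorted(name for name, labels in pods.items() if selector_matches(selector, labels))
```

With the lab's values, `selected_pods({"app": "api"}, {"api-1": {"app": "api-v1"}})` returns an empty list, which corresponds to the empty Endpoints output.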
Fix
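kubectl edit svc api-svc
The Pods carry app: api-v1, so change the Service selector to match that label:
selector:
  app: api-v1
Alternatively, relabel the Pods to app: api. A Deployment's selector is immutable and must match its template labels, so that route means deleting and recreating the Deployment with both set to app: api.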
Verify:
kubectl get endpoints api-svc
DevOps Interview Answer
Q: Service exists but no traffic, pods running. What do you check?
A: Check the Endpoints. Empty endpoints indicate a selector mismatch or readiness problem.
When to Use ClusterIP
- Internal APIs
- Backend services
- Microservices
Pros / Cons
Pros
- Secure (not exposed externally)
- Stable IP within the cluster
- Scales with the number of pods
Cons
- Accessible only inside the cluster
LAB 2 — NodePort (Why It’s Dangerous)


Scenario
- NodePort is exposed
- Works sometimes
- Fails after a node change
Setup
apiVersion: v1
kind: Service
metadata:
  name: node-svc
spec:
  type: NodePort
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080
Symptoms
- Works when accessing the service via one node’s IP
- Fails when using another node’s IP
- Security team flags the open ports
Root Cause
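NodePort opens the same port on every node in the cluster, so clients end up targeting individual node IPs. That gives no control over routing and ties the service to whichever node you hit: the address disappears when that node is replaced, per-node firewall rules can differ, and with externalTrafficPolicy: Local a node without a local Pod drops the traffic.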
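One way the "works on one node, fails on another" symptom can arise is the Service's externalTrafficPolicy field. With the default, Cluster, any node forwards to a ready Pod anywhere in the cluster; with Local, a node only serves traffic if a ready Pod runs on it. The sketch below models just that rule. The node names and the `pods_per_node` mapping are hypothetical inputs, and firewalls and node replacement are out of scope.

```python
from typing import Dict, List


def nodes_serving_traffic(pods_per_node: Dict[str, int],
                          external_traffic_policy: str = "Cluster") -> List[str]:
    """Nodes whose NodePort would answer, given ready-Pod counts per node."""
    if external_traffic_policy == "Cluster":
        # Every node forwards, as long as at least one ready Pod exists somewhere.
        has_backend = any(count > 0 for count in pods_per_node.values())
        return sorted(pods_per_node) if has_backend else []
    if external_traffic_policy == "Local":
        # Only nodes that host a ready Pod themselves accept the traffic.
        return sorted(node for node, count in pods_per_node.items() if count > 0)
    raise ValueError(f"unknown externalTrafficPolicy: {external_traffic_policy}")
```

For a two-node cluster where only node-a hosts Pods, the Cluster policy reports both nodes and the Local policy reports only node-a.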
DevOps Fix
Replace NodePort with one of the following:
- ClusterIP + Ingress
- LoadBalancer (if the cloud provider supports it)
Interview Answer
Q: Why is NodePort rarely used in production?
A: It exposes every node, lacks fine‑grained security and routing, and doesn’t scale well.
When NodePort Is Acceptable
- Debugging / quick tests
- Temporary external access
- Learning or sandbox environments
LAB 3 — LoadBalancer Service (Cloud Reality)

Scenario
- A LoadBalancer service creates an external IP
- The application remains unreachable
Setup
apiVersion: v1
kind: Service
metadata:
  name: lb-svc
spec:
  type: LoadBalancer
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8080
Symptoms
kubectl get svc
Typical output shows an external IP assigned, but curl/browser cannot reach the service.
Common Root Causes
- Cloud provider delay – the external load balancer may still be provisioning.
- Missing firewall rules – the cloud firewall blocks traffic to the allocated port.
- Pod readiness – pods are not ready, so the load balancer has no healthy endpoints.
Fix Checklist
- Wait for the EXTERNAL-IP column to show a real IP (not <pending>).
- Verify the cloud firewall / security group allows inbound traffic on the service port (usually 80 or 443).
- Ensure pods are Ready (kubectl get pods) and that the service has endpoints (kubectl get endpoints lb-svc).
- If using a private VPC, confirm you have a way to reach the IP (VPN, bastion host, etc.).
DevOps Interview Answer
Q: A LoadBalancer service shows an external IP but the app is unreachable. What do you check?
A:
- Cloud provider’s load‑balancer provisioning status.
- Firewall / security‑group rules.
- Pod readiness and endpoint population.
Additional LoadBalancer Troubleshooting
Symptoms
- External IP exists
- Browser times out
Troubleshooting
kubectl describe svc lb-svc
kubectl get endpoints lb-svc
Check Cloud
- Health checks
- Security groups
- Target port mismatch
Root Cause
Cloud LB health check fails because:
- Wrong port
- App not listening
- Readiness probe failing
DevOps Fix
- Align ports
- Add readiness probe
- Validate security groups
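"Align ports" can be made concrete with a small check. The sketch below assumes you already know three numbers: the Service's targetPort, the port the application actually listens on, and the containerPort values declared in the Pod spec. The function name is hypothetical. Note that containerPort is largely informational, so the first check is the one that actually breaks traffic.

```python
from typing import List


def check_port_alignment(target_port: int, listen_port: int,
                         declared_ports: List[int]) -> List[str]:
    """Return human-readable problems; an empty list means the ports line up."""
    problems: List[str] = []
    if target_port != listen_port:
        # Traffic is forwarded to a port nothing is listening on.
        problems.append(
            f"Service targetPort {target_port} does not match app listen port {listen_port}"
        )
    if target_port not in declared_ports:
        # Not fatal by itself, but usually a sign the manifests drifted apart.
        problems.append(
            f"targetPort {target_port} is not among declared containerPorts {declared_ports}"
        )
    return problems
```

With the lab's values (targetPort 8080, http-echo listening on 8080, containerPort 8080) the function returns an empty list.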
Interview Answer
Q: Why not use LoadBalancer for every service?
A: Cost (each LoadBalancer Service typically provisions its own cloud load balancer), no host- or path-based routing, and limited flexibility compared to Ingress.
LAB 4 — Ingress (Most Interviewed Topic)


Scenario
- Ingress created
- 404 error returned
Broken Ingress Manifest
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  rules:
    - http:
        paths:
          - path: /app
            pathType: Prefix
            backend:
              service:
                name: wrong-svc   # ❌ wrong name
                port:
                  number: 80
Symptoms
- Ingress IP works
- Always returns 404
Troubleshooting
kubectl describe ingress
kubectl get svc
kubectl get pods -n ingress-nginx
Root Cause
Ingress routes to a non‑existent service.
Fix
Correct the backend service name in the Ingress spec.
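The same check can be expressed as a small lint: every Ingress backend should name a Service that exists in the Ingress's namespace and a port that Service exposes. The sketch below is a minimal illustration; it assumes the backends and Services have already been extracted into plain Python values, and the function name is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def dangling_backends(backends: List[Tuple[str, str, int]],
                      services: Dict[str, Set[int]]) -> List[str]:
    """Report Ingress backends that point at a missing Service or port.

    backends: (path, service_name, port_number) tuples from the Ingress rules.
    services: Service name -> set of ports it exposes, same namespace as the Ingress.
    """
    problems: List[str] = []
    for path, name, port in backends:
        if name not in services:
            problems.append(f"{path}: Service '{name}' not found")
        elif port not in services[name]:
            problems.append(f"{path}: Service '{name}' has no port {port}")
    return problems
```

For the broken manifest, `dangling_backends([("/app", "wrong-svc", 80)], {"api-svc": {80}})` reports that wrong-svc was not found.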
Interview Answer
Q: Ingress returns 404. Where do you check first?
A: Ingress rules, service name, service port, and controller logs.
LAB 5 — DNS Failure (Hidden Killer)


Scenario
- Services exist
- DNS name fails
Test
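kubectl run test --rm -it --restart=Never --image=busybox:1.28 -- sh
# inside the test pod (busybox 1.28 is pinned because nslookup in some newer busybox images is unreliable):
nslookup api-svc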
Root Cause
- CoreDNS not running
- Wrong namespace
- Service deleted
Fix
kubectl get pods -n kube-system | grep dns
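Restart the DNS pods if needed, for example with kubectl rollout restart deployment coredns -n kube-system on clusters where CoreDNS runs as the coredns Deployment.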
Interview Answer
Q: How do Pods discover services?
A: Via Kubernetes DNS, which resolves Service names to their ClusterIP.
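The "wrong namespace" cause is easier to spot once the full name is written out. A short name such as api-svc only resolves from Pods in the Service's own namespace; from anywhere else you need at least the service.namespace form. The sketch below builds the fully qualified name. It assumes the common default cluster domain, cluster.local, which your cluster may override, and the function names are hypothetical.

```python
def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Fully qualified DNS name of a Service: <service>.<namespace>.svc.<cluster domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def short_name_resolves(client_namespace: str, service_namespace: str) -> bool:
    """A bare Service name resolves only from a Pod in the same namespace,
    because the Pod's DNS search path starts with its own namespace."""
    return client_namespace == service_namespace
```

From a test Pod in another namespace, running nslookup against the output of `service_fqdn("api-svc", "default")` is a quick way to separate a namespace mistake from a CoreDNS outage.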
INCIDENT RESPONSE PLAYBOOK (Real DevOps)
Step‑by‑Step
- Check Ingress
- Check Service
- Check Endpoints
- Check Pod readiness
- Check container logs
Never skip steps.
FINAL DECISION MATRIX (Very Important)
| Requirement | Use |
|---|---|
| Internal traffic | ClusterIP |
| External production traffic | Ingress |
| Cloud simple exposure | LoadBalancer |
| Debug only | NodePort |
INTERVIEW RAPID FIRE (Must Memorize)
- Q: Empty endpoints means?
  A: Selector mismatch or readiness failure.
- Q: Most used service in prod?
  A: ClusterIP.
- Q: Why Ingress?
  A: Routing, TLS, cost efficiency.
- Q: NodePort in prod?
  A: Avoid.
REAL DEVOPS TRUTH
Networking issues are:
- Predictable
- Layered
- Always observable
The difference between struggling and solving fast is methodical thinking.