Implementing a Cost-Efficient Microservices Platform on Azure Kubernetes Service

Published: January 18, 2026 at 11:39 PM EST
4 min read
Source: Dev.to

Scope and Assumptions

This post assumes:

  • Familiarity with Kubernetes fundamentals
  • Comfort reading Terraform and Helm
  • An interest in running systems in production, not just deploying them

The platform runs on Azure Kubernetes Service, provisioned with Terraform, and deployed using Helm.

1. AKS Baseline: Start Small, Scale on Demand

The most common AKS cost mistake is provisioning for peak load.
We instead:

  • Start with minimal baseline capacity
  • Enable the cluster autoscaler
  • Let demand drive node count

AKS Cluster with Autoscaling

resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster_name
  location            = var.location
  resource_group_name = var.resource_group_name
  dns_prefix          = var.cluster_name

  default_node_pool {
    name                 = "default"
    vm_size              = "Standard_D2s_v5"
    auto_scaling_enabled = true
    min_count            = 1
    max_count            = 10
  }

  identity {
    type = "SystemAssigned"
  }
}

Why this works

  • Idle cost remains low
  • Nodes are added only when pods are pending
  • Capacity matches actual demand, not estimates

2. Horizontal Pod Autoscaling with Predictable Behavior

Autoscaling defaults are aggressive and often unstable.
We explicitly tune scale behavior to reduce churn and latency spikes.

HPA with Stabilization

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
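
The same behavior block can also cap scale‑up velocity, which matters just as much for churn. A minimal sketch, added under the same behavior key (the policy values are illustrative assumptions, not tuned recommendations):

scaleUp:
  policies:
  - type: Pods
    value: 4            # add at most four pods
    periodSeconds: 60   # per 60-second window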

Key outcomes

  • Prevents rapid scale‑down during brief traffic dips
  • Improves tail latency
  • Reduces unnecessary pod restarts

3. Spot Node Pools for Fault‑Tolerant Workloads

Spot capacity is one of the highest‑leverage cost optimizations—when isolated properly.

Terraform: Spot Node Pool

resource "azurerm_kubernetes_cluster_node_pool" "spot" {
  name                  = "spot"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.aks.id
  vm_size               = "Standard_D2as_v5"

  priority        = "Spot"
  eviction_policy = "Delete"
  spot_max_price  = -1 # pay up to the current on-demand price

  auto_scaling_enabled = true
  min_count            = 0
  max_count            = 10

  node_taints = [
    "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
  ]
}

Scheduling Workers on Spot Nodes

tolerations:
- key: "kubernetes.azure.com/scalesetpriority"
  operator: "Equal"
  value: "spot"
  effect: "NoSchedule"

nodeSelector:
  kubernetes.azure.com/scalesetpriority: spot

Rules we followed

  • APIs never run on Spot
  • Workers handle SIGTERM cleanly (see the sketch after this list)
  • All state lives outside the worker
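
On the SIGTERM rule: draining cleanly is as much pod‑spec work as application code. A minimal sketch for a worker Deployment (the image and drain command are placeholders; Azure gives Spot VMs roughly a 30‑second eviction notice):

spec:
  terminationGracePeriodSeconds: 30   # aligned with Azure's ~30s Spot eviction notice
  containers:
  - name: worker
    image: example.azurecr.io/worker:latest   # placeholder image
    lifecycle:
      preStop:
        exec:
          # Stop pulling new jobs before SIGTERM reaches the process
          command: ["/bin/sh", "-c", "sleep 5"]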

Used this way, Spot capacity delivered substantial savings without user‑visible impact.

4. Managed PostgreSQL with Private Networking

Databases are not where cost experiments belong.
PostgreSQL runs as a managed service with:

  • Subnet delegation
  • Private DNS
  • No public access

Delegated Subnet for PostgreSQL

resource "azurerm_subnet" "postgres" {
  name                 = "postgres-subnet"
  virtual_network_name = azurerm_virtual_network.main.name
  resource_group_name  = var.resource_group_name
  address_prefixes     = ["10.0.2.0/24"]

  delegation {
    name = "postgres"
    service_delegation {
      name = "Microsoft.DBforPostgreSQL/flexibleServers"
    }
  }
}
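
The subnet is only half of the wiring; the server itself is injected into it alongside a private DNS zone. A minimal sketch using azurerm's Flexible Server resource (server name, SKU, and credential variables are illustrative assumptions):

resource "azurerm_private_dns_zone" "postgres" {
  name                = "privatelink.postgres.database.azure.com"
  resource_group_name = var.resource_group_name
}

resource "azurerm_private_dns_zone_virtual_network_link" "postgres" {
  name                  = "postgres-dns-link"
  resource_group_name   = var.resource_group_name
  private_dns_zone_name = azurerm_private_dns_zone.postgres.name
  virtual_network_id    = azurerm_virtual_network.main.id
}

resource "azurerm_postgresql_flexible_server" "main" {
  name                   = "pg-platform"                        # placeholder name
  location               = var.location
  resource_group_name    = var.resource_group_name
  version                = "16"
  sku_name               = "B_Standard_B1ms"                    # illustrative SKU
  storage_mb             = 32768
  delegated_subnet_id    = azurerm_subnet.postgres.id           # VNet injection: no public endpoint
  private_dns_zone_id    = azurerm_private_dns_zone.postgres.id
  administrator_login    = var.pg_admin_login
  administrator_password = var.pg_admin_password

  depends_on = [azurerm_private_dns_zone_virtual_network_link.postgres]
}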

Why this matters

  • Database is unreachable from the public internet
  • Access is restricted at the network layer
  • Operational risk is significantly reduced

5. Secure Deployments with Helm Defaults

Helm charts were written with secure‑by‑default assumptions.

Pod Security Context

securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true

This immediately:

  • Shrinks the attack surface
  • Prevents runtime mutation
  • Surfaces insecure images early
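
One practical consequence: with a read‑only root filesystem, any path the app legitimately writes to has to be mounted explicitly. A sketch assuming the app only needs /tmp:

containers:
- name: api
  volumeMounts:
  - name: tmp
    mountPath: /tmp   # writable scratch space despite the read-only root
volumes:
- name: tmp
  emptyDir: {}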

Health Probes That Matter

livenessProbe:
  httpGet:
    path: /health/db-cache
    port: 8080
  initialDelaySeconds: 60

We intentionally check dependencies, not just process health.
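
A readiness probe on the same endpoint complements this: pods drop out of rotation while a dependency is degraded instead of being restarted (the thresholds are illustrative):

readinessProbe:
  httpGet:
    path: /health/db-cache
    port: 8080
  periodSeconds: 10
  failureThreshold: 3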

6. Workload Identity: No Secrets in Kubernetes

Storing cloud credentials in Kubernetes secrets is unnecessary.
We use Workload Identity for pod‑to‑Azure authentication.

Federated Identity Credential

resource "azurerm_federated_identity_credential" "api" {
  name       = "api-federated"
  parent_id  = azurerm_user_assigned_identity.api.id
  issuer     = azurerm_kubernetes_cluster.aks.oidc_issuer_url
  subject    = "system:serviceaccount:default:api"
  audiences  = ["api://AzureADTokenExchange"]
}
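
On the Kubernetes side, the credential pairs with an annotated ServiceAccount. A sketch matching the subject above (the client ID value is a placeholder):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: api          # must match the federated subject
  namespace: default
  annotations:
    azure.workload.identity/client-id: <identity-client-id>   # placeholder

Pods using this ServiceAccount also need the azure.workload.identity/use: "true" label on their template for the token to be injected.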

7. Observability Without Ingestion‑Based Pricing

Instead of managed log ingestion, we use:

  • Prometheus for metrics
  • Loki for logs
  • Object storage for retention

Loki Storage Configuration

storage_config:
  azure:
    container_name: logs
    account_name: ${ACCOUNT_NAME}
    # The Cool access tier is a property of the storage account itself,
    # not a Loki option (see the Terraform sketch below)
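
Configuring the tier where Azure expects it, a minimal Terraform sketch (account and container names are placeholders):

resource "azurerm_storage_account" "logs" {
  name                     = "platformlogs"   # placeholder; must be globally unique
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  access_tier              = "Cool"           # cheap blobs for rarely queried logs
}

resource "azurerm_storage_container" "logs" {
  name                  = "logs"
  storage_account_name  = azurerm_storage_account.logs.name
  container_access_type = "private"
}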

Why this works

  • Logs are queried infrequently
  • Storage is inexpensive
  • Ingestion costs dominate managed observability pricing

8. Infrastructure Access Patterns (Secure and Practical)

Cluster Access

Azure AD‑backed kubectl

  • Auditable
  • No shared credentials

In practice, cluster‑admin access is restricted to a small bootstrap group; most teams use namespace‑scoped roles.
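
Those namespace‑scoped roles bind Azure AD groups directly. A sketch (the namespace and group object ID are placeholders):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-edit
  namespace: team-a               # placeholder namespace
subjects:
- kind: Group
  name: "<aad-group-object-id>"   # Azure AD group object ID
  apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit                      # built-in ClusterRole, scoped here by the RoleBinding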

Database Access (Occasional Admin Tasks)

kubectl run psql \
  --image=postgres:16 \
  --restart=Never \
  --rm -it -- \
  psql -h <private-endpoint> -U admin

The database remains private; access is authenticated and auditable.

Observability Dashboards

Grafana is protected behind OAuth using Azure AD.
OAuth2‑Proxy runs with multiple replicas and rotated cookie secrets to avoid becoming a single point of failure.

  • No VPNs
  • No bastion hosts
  • No additional managed services required
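
A sketch of the OAuth2‑Proxy arguments for Azure AD over OIDC (the tenant ID, client ID, and Grafana service address are placeholders; the client and cookie secrets are injected via OAUTH2_PROXY_* environment variables from a Kubernetes Secret):

args:
- --provider=oidc
- --oidc-issuer-url=https://login.microsoftonline.com/<tenant-id>/v2.0
- --client-id=<app-client-id>
- --email-domain=*
- --upstream=http://grafana.monitoring.svc.cluster.local
- --cookie-secure=true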

Operational Lessons (The Real Ones)

  • Defaults Are Rarely Production‑Safe
    Autoscaling, probes, and security contexts need explicit tuning.

  • Spot Capacity Requires Discipline
    It works extremely well—but only when isolated.

  • Identity Scales Better Than Secrets
    It reduces operational load and security risk.

  • Kubernetes Is a Tool, Not a Destination
    Use it where it adds leverage—not as a dumping ground.

Closing Thoughts

This implementation isn’t about clever tricks—it’s about intentional trade‑offs.

By:

  • Scaling only when needed
  • Using spot capacity responsibly
  • Keeping critical state managed
  • Avoiding ingestion‑based observability costs
  • Treating security as a default

we ended up with a system that is:

  • Predictable to operate
  • Cost‑efficient at idle
  • Resilient under load
  • Easy to evolve

Architecture sets direction.
Implementation determines whether it survives contact with reality.
