Implementing a Cost-Efficient Microservices Platform on Azure Kubernetes Service

Published: January 18, 2026 at 11:39 PM EST
4 min read
Source: Dev.to

Scope and Assumptions

This post assumes:

  • Familiarity with Kubernetes fundamentals
  • Comfort reading Terraform and Helm
  • An interest in running systems in production, not just deploying them

The platform runs on Azure Kubernetes Service, provisioned with Terraform, and deployed using Helm.

1. AKS Baseline: Start Small, Scale on Demand

The most common AKS cost mistake is provisioning for peak load.
We instead:

  • Start with minimal baseline capacity
  • Enable the cluster autoscaler
  • Let demand drive node count

AKS Cluster with Autoscaling

resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster_name
  location            = var.location
  resource_group_name = var.resource_group_name
  dns_prefix          = var.cluster_name

  default_node_pool {
    name                 = "default"
    vm_size              = "Standard_D2s_v5"
    auto_scaling_enabled = true
    min_count            = 1
    max_count            = 10
  }

  identity {
    type = "SystemAssigned"
  }
}

Why this works

  • Idle cost remains low
  • Nodes are added only when pods are pending
  • Capacity matches actual demand, not estimates

2. Horizontal Pod Autoscaling with Predictable Behavior

Autoscaling defaults are aggressive and often unstable.
We explicitly tune scale behavior to reduce churn and latency spikes.

HPA with Stabilization

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
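
The same behavior block can also cap scale‑up velocity, which matters just as much for churn. A minimal sketch, added under the same behavior key (the policy values are illustrative assumptions, not tuned recommendations):

scaleUp:
  policies:
  - type: Pods
    value: 4            # add at most four pods
    periodSeconds: 60   # per 60-second window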

Key outcomes

  • Prevents rapid scale‑down during brief traffic dips
  • Improves tail latency
  • Reduces unnecessary pod restarts

3. Spot Node Pools for Fault‑Tolerant Workloads

Spot capacity is one of the highest‑leverage cost optimizations—when isolated properly.

Terraform: Spot Node Pool

resource "azurerm_kubernetes_cluster_node_pool" "spot" {
  name                  = "spot"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.aks.id
  vm_size               = "Standard_D2as_v5"

  priority        = "Spot"
  eviction_policy = "Delete"
  spot_max_price  = -1 # pay up to the current on-demand price

  auto_scaling_enabled = true
  min_count            = 0
  max_count            = 10

  node_taints = [
    "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
  ]
}

Scheduling Workers on Spot Nodes

tolerations:
- key: "kubernetes.azure.com/scalesetpriority"
  operator: "Equal"
  value: "spot"
  effect: "NoSchedule"

nodeSelector:
  kubernetes.azure.com/scalesetpriority: spot

Rules we followed

  • APIs never run on Spot
  • Workers handle SIGTERM cleanly (see the sketch after this list)
  • All state lives outside the worker
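
On the SIGTERM rule: draining cleanly is as much pod‑spec work as application code. A minimal sketch for a worker Deployment (the image and drain command are placeholders; Azure gives Spot VMs roughly a 30‑second eviction notice):

spec:
  terminationGracePeriodSeconds: 30   # aligned with Azure's ~30s Spot eviction notice
  containers:
  - name: worker
    image: example.azurecr.io/worker:latest   # placeholder image
    lifecycle:
      preStop:
        exec:
          # Stop pulling new jobs before SIGTERM reaches the process
          command: ["/bin/sh", "-c", "sleep 5"]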

Used this way, Spot capacity delivered substantial savings without user‑visible impact.

4. Managed PostgreSQL with Private Networking

Databases are not where cost experiments belong.
PostgreSQL runs as a managed service with:

  • Subnet delegation
  • Private DNS
  • No public access

Delegated Subnet for PostgreSQL

resource "azurerm_subnet" "postgres" {
  name                 = "postgres-subnet"
  virtual_network_name = azurerm_virtual_network.main.name
  resource_group_name  = var.resource_group_name
  address_prefixes     = ["10.0.2.0/24"]

  delegation {
    name = "postgres"
    service_delegation {
      name = "Microsoft.DBforPostgreSQL/flexibleServers"
    }
  }
}
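
The subnet is only half of the wiring; the server itself is injected into it alongside a private DNS zone. A minimal sketch using azurerm's Flexible Server resource (server name, SKU, and credential variables are illustrative assumptions):

resource "azurerm_private_dns_zone" "postgres" {
  name                = "privatelink.postgres.database.azure.com"
  resource_group_name = var.resource_group_name
}

resource "azurerm_private_dns_zone_virtual_network_link" "postgres" {
  name                  = "postgres-dns-link"
  resource_group_name   = var.resource_group_name
  private_dns_zone_name = azurerm_private_dns_zone.postgres.name
  virtual_network_id    = azurerm_virtual_network.main.id
}

resource "azurerm_postgresql_flexible_server" "main" {
  name                   = "pg-platform"                        # placeholder name
  location               = var.location
  resource_group_name    = var.resource_group_name
  version                = "16"
  sku_name               = "B_Standard_B1ms"                    # illustrative SKU
  storage_mb             = 32768
  delegated_subnet_id    = azurerm_subnet.postgres.id           # VNet injection: no public endpoint
  private_dns_zone_id    = azurerm_private_dns_zone.postgres.id
  administrator_login    = var.pg_admin_login
  administrator_password = var.pg_admin_password

  depends_on = [azurerm_private_dns_zone_virtual_network_link.postgres]
}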

Why this matters

  • Database is unreachable from the public internet
  • Access is restricted at the network layer
  • Operational risk is significantly reduced

5. Secure Deployments with Helm Defaults

Helm charts were written with secure‑by‑default assumptions.

Pod Security Context

securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true

This immediately:

  • Shrinks the attack surface
  • Prevents runtime mutation
  • Surfaces insecure images early
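
One practical consequence: with a read‑only root filesystem, any path the app legitimately writes to has to be mounted explicitly. A sketch assuming the app only needs /tmp:

containers:
- name: api
  volumeMounts:
  - name: tmp
    mountPath: /tmp   # writable scratch space despite the read-only root
volumes:
- name: tmp
  emptyDir: {}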

Health Probes That Matter

livenessProbe:
  httpGet:
    path: /health/db-cache
    port: 8080
  initialDelaySeconds: 60

We intentionally check dependencies, not just process health.
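
A readiness probe on the same endpoint complements this: pods drop out of rotation while a dependency is degraded instead of being restarted (the thresholds are illustrative):

readinessProbe:
  httpGet:
    path: /health/db-cache
    port: 8080
  periodSeconds: 10
  failureThreshold: 3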

6. Workload Identity: No Secrets in Kubernetes

Storing cloud credentials in Kubernetes secrets is unnecessary.
We use Workload Identity for pod‑to‑Azure authentication.

Federated Identity Credential

resource "azurerm_federated_identity_credential" "api" {
  name       = "api-federated"
  parent_id  = azurerm_user_assigned_identity.api.id
  issuer     = azurerm_kubernetes_cluster.aks.oidc_issuer_url
  subject    = "system:serviceaccount:default:api"
  audiences  = ["api://AzureADTokenExchange"]
}
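
On the Kubernetes side, the credential pairs with an annotated ServiceAccount. A sketch matching the subject above (the client ID value is a placeholder):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: api          # must match the federated subject
  namespace: default
  annotations:
    azure.workload.identity/client-id: <identity-client-id>   # placeholder

Pods using this ServiceAccount also need the azure.workload.identity/use: "true" label on their template for the token to be injected.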

7. Observability Without Ingestion‑Based Pricing

Instead of managed log ingestion, we use:

  • Prometheus for metrics
  • Loki for logs
  • Object storage for retention

Loki Storage Configuration

storage_config:
  azure:
    container_name: logs
    account_name: ${ACCOUNT_NAME}
    # The Cool access tier is a property of the storage account itself,
    # not a Loki option (see the Terraform sketch below)
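
Configuring the tier where Azure expects it, a minimal Terraform sketch (account and container names are placeholders):

resource "azurerm_storage_account" "logs" {
  name                     = "platformlogs"   # placeholder; must be globally unique
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  access_tier              = "Cool"           # cheap blobs for rarely queried logs
}

resource "azurerm_storage_container" "logs" {
  name                  = "logs"
  storage_account_name  = azurerm_storage_account.logs.name
  container_access_type = "private"
}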

Why this works

  • Logs are queried infrequently
  • Storage is inexpensive
  • Ingestion costs dominate managed observability pricing

8. Infrastructure Access Patterns (Secure and Practical)

Cluster Access

Azure AD‑backed kubectl

  • Auditable
  • No shared credentials

In practice, cluster‑admin access is restricted to a small bootstrap group; most teams use namespace‑scoped roles.
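
Those namespace‑scoped roles bind Azure AD groups directly. A sketch (the namespace and group object ID are placeholders):

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-edit
  namespace: team-a               # placeholder namespace
subjects:
- kind: Group
  name: "<aad-group-object-id>"   # Azure AD group object ID
  apiGroup: rbac.authorization.k8s.io
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit                      # built-in ClusterRole, scoped here by the RoleBinding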

Database Access (Occasional Admin Tasks)

kubectl run psql \
  --image=postgres:16 \
  --restart=Never \
  --rm -it -- \
  psql -h <private-endpoint> -U admin

The database remains private; access is authenticated and auditable.

Observability Dashboards

Grafana is protected behind OAuth using Azure AD.
OAuth2‑Proxy runs with multiple replicas and rotated cookie secrets to avoid becoming a single point of failure.

  • No VPNs
  • No bastion hosts
  • No additional managed services required
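
A sketch of the OAuth2‑Proxy arguments for Azure AD over OIDC (the tenant ID, client ID, and Grafana service address are placeholders; the client and cookie secrets are injected via OAUTH2_PROXY_* environment variables from a Kubernetes Secret):

args:
- --provider=oidc
- --oidc-issuer-url=https://login.microsoftonline.com/<tenant-id>/v2.0
- --client-id=<app-client-id>
- --email-domain=*
- --upstream=http://grafana.monitoring.svc.cluster.local
- --cookie-secure=true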

Operational Lessons (The Real Ones)

  • Defaults Are Rarely Production‑Safe
    Autoscaling, probes, and security contexts need explicit tuning.

  • Spot Capacity Requires Discipline
    It works extremely well—but only when isolated.

  • Identity Scales Better Than Secrets
    It reduces operational load and security risk.

  • Kubernetes Is a Tool, Not a Destination
    Use it where it adds leverage—not as a dumping ground.

Closing Thoughts

This implementation isn’t about clever tricks—it’s about intentional trade‑offs.

By:

  • Scaling only when needed
  • Using spot capacity responsibly
  • Keeping critical state managed
  • Avoiding ingestion‑based observability costs
  • Treating security as a default

we ended up with a system that is:

  • Predictable to operate
  • Cost‑efficient at idle
  • Resilient under load
  • Easy to evolve

Architecture sets direction.
Implementation determines whether it survives contact with reality.
