Ephemeral Storage in AKS — A Practical Hands-On Lab
Source: Dev.to
Why use Ephemeral Storage?
Ephemeral storage is ideal for workloads that do not require data to survive beyond the lifetime of a Pod, such as:
- CI/CD pipelines
- Batch jobs
- Video transcoding
- ETL pipelines
- AI preprocessing
Using persistent volumes for these cases can:
- Increase cost
- Add lifecycle complexity
- Leak unused PVCs
- Slow down I/O
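Ephemeral storage primitives are cleaned up automatically and are usually fast: emptyDir volumes live on node‑local disk or in memory, while generic ephemeral volumes are backed by whatever the StorageClass provisions.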
Ephemeral storage primitives
Kubernetes provides several built‑in mechanisms for temporary storage:
| Primitive | Backing | Characteristics |
|---|---|---|
| emptyDir (disk) | Node‑local disk | Fast local storage, deleted with the Pod |
| emptyDir (memory) | RAM (tmpfs) | Extremely fast, zero disk I/O; usage counts against container memory, so exceeding the limit can get the container OOM-killed or the Pod evicted |
| Generic Ephemeral PVC | Dynamic provisioner (via StorageClass) | Scoped to a Pod, automatically deleted when the Pod is removed |
emptyDir (disk‑backed)
The simplest form of ephemeral storage. It is created when the Pod is assigned to a node, survives container restarts within that Pod, and is removed when the Pod is deleted.
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-disk
spec:
  containers:
    - name: writer
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          dd if=/dev/zero of=/scratch/test.img bs=1M count=100
          sleep 600
      volumeMounts:
        - mountPath: /scratch
          name: scratch
  volumes:
    - name: scratch
      emptyDir: {}
Result: A 100 MiB file is created under /scratch. The data resides on node‑local disk and disappears when the Pod is deleted—perfect for CI build artifacts or temporary data transforms.
emptyDir (memory‑backed)
Mount emptyDir into RAM by setting medium: Memory. This creates a tmpfs mount.
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-memory
spec:
  containers:
    - name: writer
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          dd if=/dev/zero of=/scratch/test.img bs=1M count=100
          sleep 600
      volumeMounts:
        - mountPath: /scratch
          name: memvol
  volumes:
    - name: memvol
      emptyDir:
        medium: Memory
Characteristics:
- Extremely fast (RAM access)
- Zero disk I/O
- Ideal for caching or scratch processing
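⚠️ Note: Files written to a memory‑backed emptyDir count against the memory limit of the container that wrote them. If that limit is exceeded, the container can be OOM-killed; if the volume's sizeLimit is exceeded, the Pod may be evicted.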
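To keep a memory-backed volume from consuming more RAM than intended, the volume and the container can both be capped. The manifest below is a minimal sketch rather than a tuned configuration: the Pod name and the 256Mi and 128Mi values are illustrative assumptions, chosen only so that the 100 MiB test file fits.

```yaml
# Sketch: memory-backed emptyDir with explicit caps.
# The name "emptydir-memory-capped" and all sizes are assumptions for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-memory-capped
spec:
  containers:
    - name: writer
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          dd if=/dev/zero of=/scratch/test.img bs=1M count=100
          sleep 600
      resources:
        requests:
          memory: 256Mi
        limits:
          # Files written to the tmpfs volume count toward this limit.
          memory: 256Mi
      volumeMounts:
        - mountPath: /scratch
          name: memvol
  volumes:
    - name: memvol
      emptyDir:
        medium: Memory
        # Upper bound for the tmpfs volume; kept below the memory limit
        # so the process itself still has headroom.
        sizeLimit: 128Mi
```

Keeping sizeLimit below the container memory limit leaves room for the application's own working set.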
Generic Ephemeral PVC (dynamic)
Kubernetes can create a temporary PersistentVolumeClaim (PVC) scoped to a Pod. The Pod owns the PVC, so the PVC is automatically deleted when the Pod is deleted.
apiVersion: v1
kind: Pod
metadata:
  name: eph-pvc
spec:
  containers:
    - name: app
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "using temporary storage"
          sleep 600
      volumeMounts:
        - mountPath: /data
          name: eph
  volumes:
    - name: eph
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 1Gi
Workflow:
- When the Pod starts, Kubernetes creates a PVC and provisions a backing volume using the default StorageClass (the manifest above does not set storageClassName).
- When the Pod is deleted, the PVC is removed automatically, and so is its volume when the StorageClass reclaim policy is Delete. No leaks, no manual cleanup.
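If the default StorageClass is not the one you want, the claim template can name a class explicitly. The variant below is a sketch under assumptions: the Pod name and the label are hypothetical, and managed-csi is assumed to exist in the cluster (AKS clusters typically ship a class with that name, but confirm with kubectl get storageclass before relying on it).

```yaml
# Sketch: generic ephemeral volume pinned to a named StorageClass.
# "eph-pvc-explicit", the label, and "managed-csi" are assumptions for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: eph-pvc-explicit
spec:
  containers:
    - name: app
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          # Show the size of the mounted scratch volume, then idle.
          df -h /data
          sleep 600
      volumeMounts:
        - mountPath: /data
          name: eph
  volumes:
    - name: eph
      ephemeral:
        volumeClaimTemplate:
          metadata:
            labels:
              purpose: scratch
          spec:
            accessModes: ["ReadWriteOnce"]
            # Assumed class name; replace with one that exists in your cluster.
            storageClassName: managed-csi
            resources:
              requests:
                storage: 1Gi
```

Kubernetes names the generated claim after the Pod and the volume, so this Pod would own a PVC called eph-pvc-explicit-eph, which makes it easy to spot in kubectl get pvc.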
Typical use cases include:
- Data‑processing jobs
- Machine‑learning training scratch space
- Large temporary workloads
Comparison of Ephemeral Options
| Feature | emptyDir (disk) | emptyDir (memory) | Generic Ephemeral PVC |
|---|---|---|---|
| Backing storage | Node‑local disk | RAM (tmpfs) | Dynamically provisioned volume (e.g., Azure Disk, AWS EBS) |
| Performance | Fast local I/O | Ultra‑fast (memory) | Depends on provisioner (often network‑attached) |
| Size limit | Node disk capacity, or sizeLimit if set | Node memory, or sizeLimit if set; usage counts against the container memory limit | Defined in PVC spec |
| Automatic cleanup | Yes (Pod deletion) | Yes (Pod deletion) | Yes (Pod deletion) |
| Use case | Build artifacts, temporary files | Caching, scratch processing | Jobs needing larger, possibly network‑backed storage |
Typical Workflows
CI/CD pipeline
- Job start – Clone repository into an emptyDir.
- Build – Generate artifacts inside the same emptyDir.
- Publish – Push Docker image or upload artifacts.
- Job end – Pod exits; emptyDir is removed automatically.
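The pipeline steps above could be sketched as a Kubernetes Job that shares one emptyDir between a clone step and a build step. Everything specific here is an assumption for illustration: the Job name, the alpine/git image (assumed to use git as its entrypoint), the placeholder repository URL, and the tar command standing in for a real build and publish.

```yaml
# Sketch: CI-style Job using a shared emptyDir as the workspace.
# Job name, images, and the repository URL are hypothetical placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: ci-build
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      initContainers:
        # Step 1: clone the repository into the shared workspace.
        - name: clone
          image: alpine/git
          args: ["clone", "--depth=1", "https://example.com/org/repo.git", "/workspace/src"]
          volumeMounts:
            - mountPath: /workspace
              name: workspace
      containers:
        # Steps 2 and 3: build inside the same emptyDir, then publish.
        - name: build
          image: busybox
          command: ["/bin/sh", "-c"]
          args:
            - |
              cd /workspace/src
              # Stand-in for the real build and publish commands.
              tar -czf /workspace/artifact.tgz .
              ls -lh /workspace/artifact.tgz
          volumeMounts:
            - mountPath: /workspace
              name: workspace
      volumes:
        - name: workspace
          emptyDir: {}
```

An init container is used for the clone so the build container only starts once the source is in place; both see the same files because they mount the same volume.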
Data processing job
- Download dataset into an Ephemeral PVC.
- Transform data locally.
- Upload results to external storage.
- Terminate – PVC is deleted, leaving no residual cost.
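A data-processing run like this might be expressed as a Job with a generic ephemeral volume. Because the PVC is removed only when its Pod is deleted, and a finished Job keeps its Pod around, the sketch sets ttlSecondsAfterFinished so the Job, its Pod, and the claim are cleaned up shortly after completion. The Job name, the sizes, and the seq and sort commands (stand-ins for the download and transform steps) are assumptions for illustration.

```yaml
# Sketch: batch Job with scratch space on a generic ephemeral volume.
# Names, sizes, and the shell commands are illustrative placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: etl-scratch
spec:
  # Delete the finished Job (and with it the Pod and its PVC) after 5 minutes.
  ttlSecondsAfterFinished: 300
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: transform
          image: busybox
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Stand-in for downloading a dataset.
              seq 1 100000 > /data/input.txt
              # Stand-in for the transform step.
              sort -rn /data/input.txt > /data/output.txt
              wc -l /data/output.txt
              # A real job would upload /data/output.txt to external storage here.
          volumeMounts:
            - mountPath: /data
              name: scratch
      volumes:
        - name: scratch
          ephemeral:
            volumeClaimTemplate:
              spec:
                accessModes: ["ReadWriteOnce"]
                resources:
                  requests:
                    storage: 10Gi
```

Without a TTL, the volume keeps accruing cost until someone deletes the Job manually.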
Best Practices
- Always set ephemeral-storage resource requests and limits for Pods.
- Prefer emptyDir for small, fast scratch space; use memory‑backed emptyDir when I/O latency must be minimized.
- Use Generic Ephemeral PVC when you need larger volumes or a specific storage class (e.g., Azure Managed Disk).
- Combine ephemeral volumes with local SSD nodes, ephemeral OS disks, or autoscaling node pools for maximum performance and cost efficiency.
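As a rough illustration of the first practice, the manifest below sets ephemeral-storage requests and limits on the container and a sizeLimit on the disk-backed emptyDir. The Pod name and the 1Gi and 2Gi values are assumptions, not recommendations; size them to the workload.

```yaml
# Sketch: disk-backed scratch space with explicit ephemeral-storage bounds.
# "scratch-limited" and all sizes are assumptions for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-limited
spec:
  containers:
    - name: worker
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          dd if=/dev/zero of=/scratch/test.img bs=1M count=100
          sleep 600
      resources:
        requests:
          # Used by the scheduler to pick a node with enough local disk.
          ephemeral-storage: 1Gi
        limits:
          # Covers the writable layer, logs, and disk-backed emptyDir usage.
          ephemeral-storage: 2Gi
      volumeMounts:
        - mountPath: /scratch
          name: scratch
  volumes:
    - name: scratch
      emptyDir:
        # Per-volume cap; exceeding it can get the Pod evicted.
        sizeLimit: 1Gi
```

With these bounds in place, a runaway job is evicted by the kubelet instead of filling the node's disk and affecting other Pods.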
When to Use Persistent Volumes
Avoid ephemeral storage if any of the following apply:
- Data must survive Pod restarts or node failures.
- Multiple Pods need shared access to the same data.
- You require backups, snapshots, or long‑term retention.
In those scenarios, provision a regular PersistentVolume and PersistentVolumeClaim.
Conclusion
Ephemeral storage is a highly underutilized optimization in Kubernetes. For temporary workloads—CI/CD pipelines, batch processing, media transcoding, AI preprocessing—it delivers:
- Speed (local or memory‑backed I/O)
- Simplicity (no manual cleanup)
- Cost efficiency (no lingering storage charges)
Adopting ephemeral storage as the default for short‑lived workloads can improve I/O performance, reduce operational overhead, and lower cloud spend.