Kubernetes v1.36: Pod-Level Resource Managers (Alpha)
Source: Kubernetes Blog
Why do we need pod‑level resource managers?
When running performance‑critical workloads such as:
- Machine‑learning (ML) training
- High‑frequency trading applications
- Low‑latency databases
you often need exclusive, NUMA‑aligned resources for your primary application containers to ensure predictable performance.
Modern Kubernetes pods, however, rarely consist of a single container. They frequently include sidecars for:
- Logging
- Monitoring
- Service meshes
- Data ingestion
Before this feature, achieving NUMA-aligned, exclusive resources for the main container required allocating exclusive, integer-based CPU resources to every container in the pod. That is wasteful for lightweight sidecars, but skipping it forfeits the pod's Guaranteed QoS class and, with it, the performance benefits.
Introducing pod‑level resource managers
Enabling pod-level resource managers (via the PodLevelResourceManagers and PodLevelResources feature gates) allows the kubelet to create hybrid allocation models, in which some containers in a pod receive exclusive resources while the others run from a shared pool. This brings flexibility and efficiency to high-performance workloads without sacrificing NUMA alignment.
Real‑world use cases
1. Tightly‑coupled database (Topology Manager’s pod scope)
A latency‑sensitive database pod includes:
- Main database container
- Local metrics exporter sidecar
- Backup‑agent sidecar
With the pod scope, the kubelet performs a single NUMA alignment based on the entire pod’s budget. The database container receives exclusive CPU and memory slices from that NUMA node, while the remaining resources form a pod‑shared pool for the sidecars.
apiVersion: v1
kind: Pod
metadata:
  name: tightly-coupled-database
spec:
  # Pod-level resources establish the overall budget and NUMA alignment size.
  resources:
    requests:
      cpu: "8"
      memory: "16Gi"
    limits:
      cpu: "8"
      memory: "16Gi"
  initContainers:
  - name: metrics-exporter
    image: metrics-exporter:v1
    restartPolicy: Always
  - name: backup-agent
    image: backup-agent:v1
    restartPolicy: Always
  containers:
  - name: database
    image: database:v1
    # This Guaranteed container gets an exclusive 6 CPU slice from the pod's budget.
    # The remaining 2 CPUs and 4 Gi memory form the pod-shared pool for the sidecars.
    resources:
      requests:
        cpu: "6"
        memory: "12Gi"
      limits:
        cpu: "6"
        memory: "12Gi"
Result:
Auxiliary containers can co‑locate on the same NUMA node as the primary workload without wasting dedicated cores.
2. ML workload with infrastructure sidecars (Topology Manager’s container scope)
A pod runs a GPU‑accelerated ML training container alongside a generic service‑mesh sidecar.
With the container scope, the kubelet evaluates each container individually. The ML container receives exclusive, NUMA‑aligned CPUs and memory, while the sidecar runs in the node‑wide shared pool.
apiVersion: v1
kind: Pod
metadata:
  name: ml-workload
spec:
  # Pod-level resources establish the overall budget constraint.
  resources:
    requests:
      cpu: "4"
      memory: "8Gi"
    limits:
      cpu: "4"
      memory: "8Gi"
  initContainers:
  - name: service-mesh-sidecar
    image: service-mesh:v1
    restartPolicy: Always
  containers:
  - name: ml-training
    image: ml-training:v1
    # Under the 'container' scope, this Guaranteed container receives exclusive,
    # NUMA-aligned resources, while the sidecar runs in the node's shared pool.
    resources:
      requests:
        cpu: "3"
        memory: "6Gi"
      limits:
        cpu: "3"
        memory: "6Gi"
Result:
Only containers that truly need NUMA‑aligned exclusivity consume those resources, while others share the node’s general pool.
CPU quotas (CFS) and isolation
| Allocation type | CFS quota handling |
|---|---|
| Exclusive containers | CFS quota enforcement disabled at the container level → no throttling. |
| Pod‑shared‑pool containers | CFS quotas enforced at the pod level → containers cannot exceed the leftover pod budget. |
How to enable Pod‑Level Resource Managers
Prerequisite: Kubernetes v1.36 or newer.
- Enable feature gates in the kubelet configuration:
    featureGates:
      PodLevelResources: true
      PodLevelResourceManagers: true
- Configure the Topology Manager with a policy other than none (best-effort, restricted, or single-numa-node).
- Set the Topology Manager scope (pod or container) via the topologyManagerScope field in KubeletConfiguration.
- Configure the CPU Manager with the static policy.
- Configure the Memory Manager with the Static policy.
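The steps above can be combined into a single kubelet configuration file. The following is a minimal sketch, not a complete production configuration: the choice of single-numa-node and the pod scope is an assumption made for illustration, and the node-specific reservations that the static CPU Manager and Static Memory Manager policies require on their own are deliberately left out.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Both gates are required for pod-level resource managers.
featureGates:
  PodLevelResources: true
  PodLevelResourceManagers: true
# Any policy other than "none" works; single-numa-node is an illustrative choice.
topologyManagerPolicy: single-numa-node
# "pod" matches the tightly coupled database example; use "container"
# for the ML workload example.
topologyManagerScope: pod
cpuManagerPolicy: static
memoryManagerPolicy: Static
# Assumption: CPU and memory reservations (for example reservedSystemCPUs and
# reservedMemory) must also be set for the static policies. They depend on the
# node's hardware and are omitted from this sketch.
```

Changes to these settings generally take effect only after the kubelet is restarted, so it is worth trying them on a single test node first.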
Observability
When the feature gate is enabled, the kubelet exposes additional metrics to help administrators monitor and debug the new allocation models, e.g.:
resource_manager_allocations_total – counts the total number of exclusive resource allocations performed.
(Additional metrics are documented in the official kubelet metrics reference.)
Resource Manager Metrics
- resource_manager_allocation_total – counts the total number of exclusive resource allocations. The source label ("pod" or "node") distinguishes between allocations drawn from the node-level pool versus a pre-allocated pod-level pool.
- resource_manager_allocation_errors_total – counts errors encountered during exclusive resource allocation, distinguished by the intended allocation source ("pod" or "node").
- resource_manager_container_assignments – tracks the cumulative number of containers running with specific assignment types. The assignment_type label ("node_exclusive", "pod_exclusive", "pod_shared") provides visibility into how workloads are distributed.
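As one way to put these metrics to use, the sketch below shows a Prometheus alerting rule that fires when exclusive allocations keep failing. It assumes that kubelet metrics are already scraped by Prometheus and that the metric is exposed under exactly the name listed above. The name as scraped may carry a prefix, so check the kubelet's metrics endpoint first. The group name, alert name, thresholds, and severity are hypothetical.

```yaml
groups:
  - name: pod-level-resource-managers   # hypothetical group name
    rules:
      - alert: ExclusiveAllocationErrors   # hypothetical alert name
        # Per-source error rate over the last 5 minutes; any non-zero rate
        # means exclusive allocations are failing.
        # Assumption: the metric name matches the list above without a prefix.
        expr: sum by (source) (rate(resource_manager_allocation_errors_total[5m])) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Exclusive resource allocation errors (source={{ $labels.source }})"
```

Grouping by the source label keeps failures against the node-level pool separate from failures against a pod-level pool, which usually point to different causes.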
Current Limitations and Caveats
While this feature opens up new possibilities, there are a few things to keep in mind during its Alpha phase. Be sure to review the Limitations and Caveats section in the official documentation for full details on:
- Compatibility
- Requirements
- Downgrade instructions
Getting Started and Providing Feedback
Technical Details & Configuration
For a deep dive into the technical details and configuration of this feature, check out the official concept documentation:
- Pod‑level Resource Managers – Overview of the pod‑level resources feature and how to assign resources to pods.
- Assign Pod‑level CPU and Memory Resources – Step‑by‑step guide for assigning CPU and memory at the pod level.
Feedback Channels
As this feature moves through Alpha, your feedback is invaluable. Please report any issues or share your experiences via the standard Kubernetes communication channels:
- Slack: #sig-node
- Mailing List: (link to the relevant mailing list)
- Open Community Issues / PRs: (link to the GitHub repository or issue tracker)