Red Hat OpenShift Service on AWS supports Capacity Reservations and Capacity Blocks for Machine Learning
Source: Red Hat Blog
Why guaranteed capacity matters
Guaranteed, uninterrupted access to a specific infrastructure type in a particular Availability Zone (AZ) matters in several scenarios:
- GPU‑based accelerated computing workloads – Uninterrupted access to GPU instances is vital for AI/ML teams conducting training, fine‑tuning, or inference. Capacity reservations eliminate the risk of compute unavailability for these time‑sensitive, resource‑intensive tasks.
- Planned scaling events – Confidently support peak traffic seasons, major product launches, or scheduled batch processing without provisioning delays.
- High availability and disaster recovery – Enhance resiliency by guaranteeing capacity when deploying workloads across multiple AZs or executing disaster‑recovery protocols across regions.
Capacity Reservations and Capacity Blocks for ML
- Amazon EC2 Capacity Reservations let you reserve compute capacity for EC2 instances in a specific AZ for any duration.
- Capacity Blocks for ML let you reserve GPU‑based accelerated computing instances for a future date to support short‑duration ML workloads.
Because ROSA clusters with hosted control planes (HCP) support Capacity Reservations, platform administrators can create ROSA machine pools that directly consume capacity already reserved with AWS.
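As a rough illustration of what such a reservation involves, the sketch below assembles the parameters that the EC2 CreateCapacityReservation API accepts. It is a minimal example under stated assumptions, not a Red Hat or AWS reference implementation: the helper name build_reservation_request, the instance type, and the AZ are illustrative, and nothing is sent to AWS. The InstanceMatchCriteria field selects the open or targeted matching described in the best practices that follow.

```python
from datetime import datetime, timedelta, timezone


def build_reservation_request(instance_type, availability_zone, instance_count,
                              duration_hours=None, targeted=True):
    """Build keyword arguments for the EC2 CreateCapacityReservation call.

    Hypothetical helper for illustration; it only assembles a dictionary.
    """
    params = {
        "InstanceType": instance_type,
        "InstancePlatform": "Linux/UNIX",
        "AvailabilityZone": availability_zone,
        "InstanceCount": instance_count,
        "InstanceMatchCriteria": "targeted" if targeted else "open",
    }
    if duration_hours is None:
        # No end date: the reservation stays in place until it is cancelled.
        params["EndDateType"] = "unlimited"
    else:
        # Fixed end date: the reservation expires after the given duration.
        params["EndDateType"] = "limited"
        params["EndDate"] = datetime.now(timezone.utc) + timedelta(hours=duration_hours)
    return params


# Illustrative values only; choose the instance type and AZ your machine pool needs.
request = build_reservation_request("m5.xlarge", "us-east-1a", 3)
print(request)

# With AWS credentials configured, the request could be submitted with boto3:
#   import boto3
#   boto3.client("ec2").create_capacity_reservation(**request)
```

Keeping request construction separate from the API call makes it easy to review the instance type, AZ, and count against the planned machine pool before any capacity is billed.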
Key best practices for leveraging Capacity Reservations with ROSA
- Pre‑plan AZs, instance types, and capacity – Ensure a precise match between the reserved capacity and the ROSA machine‑pool attributes (VPC subnets, node replica count, instance type). Wait until the AWS Capacity Reservation status is active before provisioning ROSA machine pools that use it.
- Choose the appropriate instance‑matching criteria – AWS provides two matching criteria for On‑Demand Capacity Reservations (ODCRs): “Open” and “Targeted.” When reserved capacity should be consumed only by ROSA clusters, the targeted criterion is strongly recommended, because only instances that explicitly target the reservation can use it. Remember that ODCRs operate on a “use it or lose it” principle and are billed at on‑demand rates regardless of utilization.
- Control how reserved capacity is consumed – ROSA allows you to define whether a machine pool should fall back to on‑demand instances when the reservation is exhausted, or fail outright.
- Centralize purchase and allocation – Organizations with multiple AWS accounts can centralize ODCR purchases and allocate them across member accounts using AWS Resource Access Manager. ROSA fully supports Capacity Reservations shared with the AWS account where the cluster is created, which simplifies financial management.
- Monitor reservation utilization proactively – Because reservations may be shared across workloads or accounts, continuously monitor utilization. Planning for potential exhaustion helps prevent ROSA cluster nodes from becoming unavailable for critical workloads.
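Two of these practices, matching the machine pool to the reservation and monitoring utilization, lend themselves to a small pre‑flight check. The sketch below is one possible approach, not an official ROSA tool. It assumes a reservation record shaped like an entry returned by the EC2 DescribeCapacityReservations API; the pool dictionary, its keys, and the function names are hypothetical, and the sample values are made up for illustration.

```python
def preflight_problems(reservation, pool):
    """Return the reasons a planned machine pool does not fit a reservation."""
    problems = []
    if reservation["State"] != "active":
        problems.append(f"reservation state is {reservation['State']}, not active")
    if reservation["InstanceType"] != pool["instance_type"]:
        problems.append("instance type does not match the reservation")
    if reservation["AvailabilityZone"] != pool["availability_zone"]:
        problems.append("Availability Zone does not match the reservation")
    available = reservation["AvailableInstanceCount"]
    if available < pool["replicas"]:
        problems.append(
            f"only {available} of {pool['replicas']} requested instances are available"
        )
    return problems


def utilization(reservation):
    """Return the fraction of reserved instances currently in use (0.0 to 1.0)."""
    total = reservation["TotalInstanceCount"]
    if total == 0:
        return 0.0
    return (total - reservation["AvailableInstanceCount"]) / total


# Sample record; field names follow the DescribeCapacityReservations response.
reservation = {
    "CapacityReservationId": "cr-0123456789abcdef0",  # placeholder ID
    "InstanceType": "m5.xlarge",
    "AvailabilityZone": "us-east-1a",
    "State": "active",
    "TotalInstanceCount": 4,
    "AvailableInstanceCount": 1,
}

# Hypothetical description of the machine pool you plan to create.
pool = {"instance_type": "m5.xlarge", "availability_zone": "us-east-1a", "replicas": 3}

problems = preflight_problems(reservation, pool)
print(problems)                  # ['only 1 of 3 requested instances are available']
print(utilization(reservation))  # 0.75
```

In practice the reservation record would come from a live API call, and the utilization figure could feed an alert so that exhaustion is noticed before a machine pool needs to scale.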
Further reading
- Learn how to purchase Capacity Reservations and Capacity Blocks for ML in the AWS documentation.
- Manage machine pools and set capacity preferences in your ROSA cluster in the Managing Nodes chapter of the ROSA documentation.
- Get started with ROSA on the ROSA product page.