A faster way to estimate AI power consumption
Source: MIT News - AI
AI‑Driven Energy Estimation for Data Centers
Due to the explosive growth of artificial intelligence, the Lawrence Berkeley National Laboratory estimates that data centers will consume up to 12 percent of total U.S. electricity by 2028. Improving data‑center energy efficiency is one way scientists are striving to make AI more sustainable.
Rapid Power‑Prediction Tool
Researchers from MIT and the MIT‑IBM Watson AI Lab have developed a rapid prediction tool that tells data‑center operators how much power will be consumed by running a particular AI workload on a given processor or AI‑accelerator chip.
- Speed: Produces reliable power estimates in a few seconds, whereas traditional modeling techniques can take hours or days.
- Flexibility: Applicable to a wide range of hardware configurations, including emerging designs that haven’t been deployed yet.
Data‑center operators could use these estimates to allocate limited resources across multiple AI models and processors, improving energy efficiency. Algorithm developers and model providers could also assess the potential energy consumption of a new model before deployment.
“The AI sustainability challenge is a pressing question we have to answer. Because our estimation method is fast, convenient, and provides direct feedback, we hope it makes algorithm developers and data‑center operators more likely to think about reducing energy consumption,” says Kyungmi Lee, an MIT postdoc and lead author of a paper on this technique.
Co‑authors
- Zhiye Song – EECS graduate student
- Eun Kyung Lee & Xin Zhang – Research managers at IBM Research and the MIT‑IBM Watson AI Lab
- Tamar Eilam – IBM Fellow, chief scientist of sustainable computing at IBM Research, and a member of the MIT‑IBM Watson AI Lab
- Anantha P. Chandrakasan – MIT provost, Vannevar Bush Professor of Electrical Engineering and Computer Science, and a member of the MIT‑IBM Watson AI Lab
The research is being presented this week at the IEEE International Symposium on Performance Analysis of Systems and Software.
Expediting Energy Estimation
Inside a data center, thousands of powerful graphics processing units (GPUs) train and deploy AI models. Power consumption varies with GPU configuration and workload.
Traditional prediction methods break a workload into individual steps and emulate each module inside the GPU—an approach that can take hours or days for large AI workloads (e.g., model training, data preprocessing).
“As an operator, if I want to compare different algorithms or configurations to find the most energy‑efficient manner to proceed, a single emulation that takes days is impractical,” Lee explains.
Leveraging Repeating Patterns
MIT researchers sought a faster approach by using less‑detailed, quickly estimable information. They observed that AI workloads often contain many repeatable patterns arising from software optimizations (e.g., parallel‑core distribution, efficient data movement). These regular structures can be leveraged for rapid power estimation.
The resulting lightweight model, EnergAIzer, captures the power‑usage pattern of a GPU from those optimizations.
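The core idea can be sketched as pattern counting: rather than emulating every step of a workload, tally how often each repeating operation pattern occurs and multiply by a per‑pattern cost measured once. This is an illustrative sketch only; the pattern names and energy costs below are made‑up assumptions, not values from the paper.

```python
from collections import Counter

# A workload trace dominated by repeating patterns (e.g., the same
# matmul/softmax/layernorm sequence repeated across layers and batches).
trace = ["matmul_4096", "softmax", "matmul_4096", "layernorm",
         "matmul_4096", "softmax", "layernorm"] * 1000

# Per-pattern energy cost in joules, measured once per pattern
# (illustrative numbers, not real measurements).
PATTERN_COST_J = {"matmul_4096": 0.50, "softmax": 0.02, "layernorm": 0.01}

# Count pattern occurrences and sum their costs instead of
# emulating each of the 7,000 steps individually.
counts = Counter(trace)
total_j = sum(PATTERN_COST_J[op] * n for op, n in counts.items())

print(f"{total_j:.1f} J from {len(counts)} unique patterns")
```

Because the trace collapses to a handful of unique patterns, the estimate takes time proportional to the number of distinct patterns rather than the number of executed steps, which is what makes the approach fast.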
An Accurate Assessment
While fast, the initial estimation omitted certain energy costs:
- Fixed overhead for program setup and configuration.
- Per‑operation energy for each data chunk processed.
- Variances due to hardware fluctuations or data‑access conflicts that reduce effective bandwidth and increase energy draw.
To account for these factors, the team gathered real‑world GPU measurements and derived correction terms applied to the model.
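The correction step above can be pictured as fitting a simple linear energy model to real measurements: a fixed setup overhead plus a per‑chunk cost. The model form, function names, and numbers below are illustrative assumptions for the sake of the sketch, not the actual correction terms used in the paper.

```python
import numpy as np

# Assumed model (illustrative): E(n) = E_fixed + n * e_op, where
# E_fixed captures program setup overhead and e_op the per-chunk cost.
# Measured (chunks processed, joules) pairs from hypothetical profiling runs.
measurements = np.array([
    [100, 1.8],
    [200, 3.1],
    [400, 5.9],
    [800, 11.4],
])

n = measurements[:, 0]
energy = measurements[:, 1]

# Least-squares fit of the two correction terms from real measurements.
A = np.column_stack([np.ones_like(n), n])
(e_fixed, e_op), *_ = np.linalg.lstsq(A, energy, rcond=None)

def estimate_energy(num_chunks: float) -> float:
    """Fast estimate: fixed overhead plus per-chunk cost."""
    return e_fixed + e_op * num_chunks

print(f"fixed overhead ~{e_fixed:.2f} J, per-chunk ~{e_op:.4f} J")
print(f"estimate for 1000 chunks: {estimate_energy(1000):.1f} J")
```

Once fitted, evaluating the model is a couple of arithmetic operations, which is consistent with producing estimates in seconds rather than hours.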
“This way, we can get a fast estimation that is also very accurate,” Lee says.
How EnergAIzer Works
- Input: Workload details (AI model, number/length of inputs, etc.).
- Optional tweaks: GPU configuration, operating speed, or other design choices.
- Output: Energy‑consumption estimate in seconds.
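The input/output flow above might look roughly like the following when used to compare configurations. This is a hypothetical interface, not the real EnergAIzer API; the configuration names and per‑token energy figures are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    """Workload details: model, number of inputs, input length."""
    model_name: str
    num_inputs: int
    input_length: int

# Illustrative per-token energy (joules) for a few hypothetical
# GPU configurations; real estimates would come from the tool.
PER_TOKEN_ENERGY_J = {
    "gpu_a_default": 2.0e-4,
    "gpu_a_lowclock": 1.6e-4,
    "gpu_b_default": 1.2e-4,
}

def estimate_energy_j(workload: Workload, config: str) -> float:
    """Fast energy estimate for running the workload on a configuration."""
    tokens = workload.num_inputs * workload.input_length
    return tokens * PER_TOKEN_ENERGY_J[config]

def pick_most_efficient(workload: Workload, configs: list[str]) -> str:
    """Choose the configuration with the lowest estimated energy."""
    return min(configs, key=lambda c: estimate_energy_j(workload, c))

wl = Workload("example-llm", num_inputs=10_000, input_length=512)
best = pick_most_efficient(wl, list(PER_TOKEN_ENERGY_J))
print(best, f"{estimate_energy_j(wl, best):.1f} J")
```

Because each estimate is cheap, an operator could sweep many candidate configurations this way before committing hardware, which is the use case the article describes.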
When tested with actual GPU workloads, EnergAIzer achieved roughly 8 percent error, comparable to traditional methods that take hours. The model can also predict power consumption for future GPUs and emerging device configurations, provided the hardware does not change drastically over a short time frame.
Future Directions
- Test EnergAIzer on the newest GPU configurations.
- Scale the model to handle many GPUs collaborating on a workload.
- Provide a fast, cross‑stack energy‑estimation solution that supports hardware design decisions and sustainability goals.
“… designers, data center operators, and algorithm developers, so they can all be more aware of power consumption. With this tool, we’ve taken one step toward that goal,” Lee says.
Funding Acknowledgment
This research was funded, in part, by the MIT‑IBM Watson AI Lab.