As AI Grows More Complex, Model Builders Rely on NVIDIA
Source: NVIDIA AI Blog
Pretraining: The Bedrock of Intelligence
AI models are getting more capable thanks to three scaling laws: pretraining, post‑training and test‑time scaling. Reasoning models, which apply compute during inference to tackle complex queries using multiple networks working together, are now everywhere.
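The idea behind test-time scaling can be sketched with a toy "best-of-n" loop: spend extra inference compute by sampling several candidate answers and keeping the one a verifier scores highest. This is a generic illustration only, not NVIDIA's or any specific lab's implementation; `generate_candidate` and `score` are hypothetical stand-ins for a model call and a reward model.

```python
import random

def generate_candidate(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a model call; returns a mock answer
    whose trailing digit represents its (random) quality."""
    return f"answer-{rng.randint(0, 9)}"

def score(candidate: str) -> int:
    """Hypothetical stand-in for a verifier/reward model;
    here, a higher trailing digit means a better answer."""
    return int(candidate.rsplit("-", 1)[1])

def best_of_n(prompt: str, n: int, seed: int = 0) -> str:
    """Test-time scaling in miniature: more samples (more inference
    compute) raises the expected quality of the best candidate."""
    rng = random.Random(seed)
    candidates = [generate_candidate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n("What is 2 + 2?", n=8))
```

With a fixed seed, the single sample drawn at `n=1` is also the first of the `n=8` samples, so the best-of-8 answer can never score worse than the best-of-1 answer, which is the whole point of spending more compute at inference time.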
But pretraining and post‑training remain the bedrock of intelligence. They’re core to making reasoning models smarter and more useful. Getting there takes scale—training frontier models from scratch isn’t a small job. It requires tens of thousands, even hundreds of thousands, of GPUs working together effectively.
That level of scale demands excellence across many dimensions: world‑class accelerators, advanced networking across scale‑up, scale‑out and increasingly scale‑across architectures, plus a fully optimized software stack. In short, a purpose‑built infrastructure platform that delivers performance at scale.
Compared with the NVIDIA Hopper architecture, NVIDIA GB200 NVL72 systems delivered 3× faster training performance on the largest model tested in the latest MLPerf Training industry benchmarks, and nearly 2× better performance per dollar.
NVIDIA GB300 NVL72 delivers a more than 4× speedup over NVIDIA Hopper in the same MLPerf benchmarks.
These performance gains help AI developers shorten development cycles and deploy new models more quickly.
Proof in the Models Across Every Modality
The majority of today’s leading large language models were trained on NVIDIA platforms. AI isn’t just about text—NVIDIA supports development across speech, image, video generation, and emerging areas like biology and robotics.
- Evo 2 decodes genetic sequences.
- OpenFold 3 predicts 3D protein structures.
- Boltz‑2 simulates drug interactions, helping researchers identify promising candidates faster.
On the clinical side, NVIDIA Clara synthesis models generate realistic medical images to advance screening and diagnosis without exposing patient data.
Companies such as Runway and Inworld train on NVIDIA infrastructure. Runway recently announced Gen‑4.5, a frontier video‑generation model that tops the Artificial Analysis leaderboard. Optimized for NVIDIA Blackwell, Gen‑4.5 was developed entirely on NVIDIA GPUs across research, pre‑training, post‑training and inference.
Runway also introduced GWM‑1, a state‑of‑the‑art general world model trained on NVIDIA Blackwell, built to simulate reality in real time. It’s interactive, controllable and general‑purpose, with applications in video games, education, science, entertainment and robotics.
Benchmarks illustrate why. MLPerf, the industry‑standard benchmark for training performance, shows NVIDIA’s breadth: in the latest round, NVIDIA submitted results across all seven MLPerf Training 5.1 benchmarks, demonstrating strong performance and versatility. It was the only platform to submit in every category.
NVIDIA’s ability to support diverse AI workloads helps data centers use resources more efficiently. AI labs such as Black Forest Labs, Cohere, Mistral, OpenAI, Reflection, and Thinking Machines Lab are all training on the NVIDIA Blackwell platform.
NVIDIA Blackwell Across Clouds and Data Centers
NVIDIA Blackwell is widely available from leading cloud service providers, neo‑clouds, and server makers. NVIDIA Blackwell Ultra, offering additional compute, memory, and architectural improvements, is now rolling out from server makers and cloud providers.
Major cloud service providers and NVIDIA Cloud Partners, including Amazon Web Services, CoreWeave, Google Cloud, Lambda, Microsoft Azure, Nebius, Oracle Cloud Infrastructure, and Together AI, already offer instances powered by NVIDIA Blackwell, ensuring scalable performance as pretraining scaling continues.
From frontier models to everyday AI, the future is being built on NVIDIA.