NVIDIA Partners With Mistral AI to Accelerate New Family of Open Models
Source: NVIDIA AI Blog
Announcement
Today, Mistral AI announced the Mistral 3 family of open multilingual, multimodal models, optimized across NVIDIA supercomputing and edge platforms. The models will be available everywhere, from the cloud to the data center to the edge, starting Tuesday, Dec. 2.
Model Overview
- Mistral Large 3 is a mixture‑of‑experts (MoE) model. Instead of activating every parameter for each token, it routes each token to only the most relevant expert subnetworks, delivering efficiency without sacrificing accuracy.
- It features 41 B active parameters, 675 B total parameters, and a 256 K context window, providing scalability and adaptability for enterprise AI workloads.
- Mistral AI also released nine smaller language models in the Mistral 3 suite, optimized for running AI on edge devices.
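The routing idea behind the MoE design above can be sketched in a few lines. This is an illustrative toy, not Mistral's implementation: expert count, dimensions, and the simple dot-product router are assumptions chosen for clarity.

```python
# Toy sketch of mixture-of-experts (MoE) top-k routing: only the top_k
# highest-scoring experts run for each token, so most parameters stay idle.
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, router_weights, top_k=2):
    """Route one token vector to its top_k experts and mix their outputs."""
    # Router scores: one logit per expert (dot product with router weights).
    logits = [sum(t * w for t, w in zip(token, wr)) for wr in router_weights]
    probs = softmax(logits)
    # Keep only the top_k experts; the rest stay inactive for this token.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(token)
    for i in top:
        expert_out = experts[i](token)   # only top_k expert networks execute
        gate = probs[i] / norm           # renormalized gate weight
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out, top

random.seed(0)
dim, n_experts = 4, 8
# Stand-in "experts": each scales the token by a different factor.
experts = [(lambda s: (lambda t: [s * x for x in t]))(i + 1) for i in range(n_experts)]
router_weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
out, active = moe_forward([0.5, -0.2, 0.1, 0.9], experts, router_weights, top_k=2)
```

With 8 experts and `top_k=2`, only a quarter of the expert parameters run per token, which is the same active-versus-total distinction as Mistral Large 3's 41 B active out of 675 B total parameters.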
Hardware Integration
- By combining NVIDIA GB200 NVL72 systems with Mistral AI’s MoE architecture, enterprises can efficiently deploy and scale massive AI models, benefiting from advanced parallelism and hardware optimizations.
- The granular MoE design leverages NVIDIA NVLink’s coherent memory domain and wide expert parallelism optimizations.
- Accuracy‑preserving, low‑precision NVFP4 and NVIDIA Dynamo disaggregated inference optimizations further boost performance for both training and inference.
- On the GB200 NVL72, Mistral Large 3 delivered higher inference throughput than on the prior‑generation NVIDIA H200, translating into lower per‑token cost, higher energy efficiency, and a better user experience.
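To make the NVFP4 point above concrete, here is a toy sketch of block-scaled 4-bit quantization. Real NVFP4 packs E2M1 values with hardware FP8 per-block scales; this pure-Python version, with an assumed block and grid layout, only illustrates why a per-block scale helps preserve accuracy at low precision.

```python
# Toy block-scaled 4-bit quantization in the spirit of NVFP4.
# E2M1 (2 exponent bits, 1 mantissa bit) can represent these magnitudes:
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(values):
    """Quantize one block of floats to signed E2M1 values plus one shared scale."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 6.0                      # map the block's range onto [0, 6]
    codes = []
    for v in values:
        # Nearest representable magnitude after dividing out the block scale.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        codes.append(-mag if v < 0 else mag)
    return codes, scale

def dequantize_block(codes, scale):
    return [c * scale for c in codes]

block = [0.01, -0.8, 0.33, 2.5, -1.9, 0.0, 0.6, -0.05]
codes, scale = quantize_block(block)
restored = dequantize_block(codes, scale)
```

Because the scale is chosen per block rather than per tensor, a block of small weights keeps fine resolution even when another block contains large outliers.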
Edge Deployment
- The compact Mistral 3 suite runs across NVIDIA’s edge platforms, including NVIDIA DGX Spark, RTX PCs and laptops, and NVIDIA Jetson devices.
- NVIDIA collaborates with the teams behind popular open‑source frameworks such as llama.cpp and Ollama to deliver peak performance on edge GPUs.
- Developers can try the Mistral 3 suite today through llama.cpp and Ollama for fast, efficient AI on the edge.
Open‑Source Ecosystem
- Mistral 3 models are openly available, enabling researchers and developers to experiment, customize, and accelerate AI innovation.
- Integration with NVIDIA NeMo tools—Data Designer, Customizer, Guardrails, and NeMo Agent Toolkit—allows enterprises to tailor models for specific use cases, speeding the move from prototype to production.
- NVIDIA has optimized inference frameworks for the Mistral 3 family, including TensorRT‑LLM, SGLang, and vLLM.
Availability
- Mistral 3 is available today on leading open‑source platforms and cloud service providers.
- The models are expected to be available soon as NVIDIA NIM microservices.