MiniMax's new open M2.5 and M2.5 Lightning near state-of-the-art while costing 1/20th of Claude Opus 4.6
Source: VentureBeat
MiniMax M2.5 – A Low‑Cost, High‑Performance Language Model
Company: MiniMax (Shanghai, China) – Website | Press Release
Model: M2.5 (two variants) – Announcement Tweet
Why M2.5 Matters
| Aspect | Details |
|---|---|
| Cost reduction | Claims up to 95 % cheaper than current frontier models (Google, Anthropic). |
| Performance | Benchmarks suggest parity with top‑tier models, especially in agentic tool use (e.g., generating Word, Excel, PowerPoint files). |
| Open‑source claim | Described as “open source,” but weights, code, and license terms have not yet been released. |
| Enterprise focus | Tested with senior professionals in finance, law, and social sciences to meet real‑world standards. |
| Productivity impact | MiniMax reports 30 % of internal tasks and 80 % of newly committed code are now generated by M2.5. |
Key Takeaways
- From chatbot to worker: When AI becomes “too cheap to meter,” developers shift from simple Q&A bots to autonomous agents that can code, research, and manage projects for long periods without prohibitive costs.
- Economic implication: Lower inference costs could accelerate the adoption of AI‑driven agents across enterprises, reshaping how software is built and operated.
- Open‑source status: The promise of openness remains unverified until the model weights, source code, and licensing details are publicly available.
Quote from MiniMax
“We believe that M2.5 provides virtually limitless possibilities for the development and operation of agents in the economy.” – MiniMax release blog
Technology: Sparse Power and the CISPO Breakthrough
The secret to M2.5’s efficiency lies in its Mixture‑of‑Experts (MoE) architecture. Instead of activating all 230 billion parameters for every token, the model only “turns on” about 10 billion. This gives M2.5 the reasoning depth of a massive model while retaining the agility of a much smaller one.
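MiniMax has not published M2.5's routing details, but the general top-k MoE mechanism can be sketched as follows. Everything here (the gating matrix, the expert callables, `k=2`) is illustrative, not M2.5's actual configuration:

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through only the top-k experts (sparse activation).

    x:       token embedding, shape (d,)
    experts: list of callables, each a small feed-forward "expert"
    gate_w:  gating matrix, shape (d, num_experts)
    """
    logits = x @ gate_w                      # one relevance score per expert
    topk = np.argsort(logits)[-k:]           # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                 # softmax over selected experts only
    # Only k expert networks actually run; the rest stay idle, which is why
    # a model can hold hundreds of billions of parameters but activate few.
    return sum(w * experts[i](x) for w, i in zip(weights, topk))
```

With 230B total parameters split across experts but only ~10B activated per token, the forward-pass cost tracks the small activated slice, not the full parameter count.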
Training Framework: Forge
MiniMax built a proprietary reinforcement‑learning framework called Forge to train this complex system. According to MiniMax engineer Olive Song:
“What we realized is that there’s a lot of potential with a small model like this if we train reinforcement learning on it with a large amount of environments and agents. But it’s not a very easy thing to do.”
Song discussed Forge on the ThursdAI podcast (YouTube), noting that the technique was instrumental in scaling performance out of relatively few active parameters; the training run itself took two months.
Forge enables the model to learn from “real‑world environments” by letting the AI practice coding and using tools across thousands of simulated workspaces.
Stabilizing Training: CISPO
To keep the model stable during this intense training, MiniMax employed a mathematical approach called CISPO (Clipping Importance Sampling Policy Optimization). The team shared the full formula on their blog, emphasizing that CISPO:
- Prevents over‑correction during training.
- Allows the model to develop what MiniMax calls an “Architect Mindset.”
Instead of jumping straight into writing code, M2.5 first plans the structure, features, and interface of a project, then proceeds to implementation.
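MiniMax shared the full CISPO formula on its blog; the core idea, as published for the earlier MiniMax-M1 work, is to clip the importance-sampling weight rather than the policy update, then treat the clipped weight as a constant so every token still contributes a gradient. The sketch below illustrates that idea with plain NumPy; the parameter names and clipping bounds are illustrative, not MiniMax's actual settings:

```python
import numpy as np

def cispo_token_loss(logp_new, logp_old, advantage, eps_low=0.2, eps_high=0.2):
    """Per-token CISPO-style loss (illustrative sketch).

    PPO clips the policy update itself, which can zero out gradients for
    tokens whose ratio moves outside the clip range. CISPO instead clips the
    importance-sampling weight and holds it constant (stop-gradient), so the
    gradient always flows through logp_new -- preventing over-correction
    without discarding tokens.
    """
    ratio = np.exp(logp_new - logp_old)                    # importance weight
    clipped = np.clip(ratio, 1 - eps_low, 1 + eps_high)    # bounded correction
    # In an autodiff framework `clipped` would be wrapped in stop_gradient;
    # gradients then flow only through logp_new.
    return -(clipped * advantage * logp_new)
```

The bounded weight is what keeps updates stable during long reinforcement-learning runs: a wildly off-policy token can shift the loss by at most `1 + eps_high` times its advantage.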
References
- Olive Song’s profile: Olive Song on X
- ThursdAI podcast (YouTube): Watch here
State‑of‑the‑Art (and Near) Benchmarks
The results of this architecture are reflected in the latest industry leaderboards. M2.5 hasn't just improved; it has vaulted into the top tier of coding models, approaching Anthropic's latest model, Claude Opus 4.6, released only a week earlier. The gap between Chinese labs and the far better GPU-resourced U.S. labs is now measured in days, not months.
MiniMax M2.5 – Model Performance Over Time

Highlighted Benchmarks for MiniMax M2.5
| Benchmark | Score | Comment |
|---|---|---|
| SWE‑Bench Verified | 80.2 % | Near parity with Claude Opus 4.6 |
| BrowseComp | 76.3 % | Industry‑leading search & tool use |
| Multi‑SWE‑Bench | 51.3 % | State‑of‑the‑art in multi‑language coding |
| BFCL (Tool Calling) | 76.8 % | High‑precision agentic workflows |
Additional Benchmark Comparisons

Cost Efficiency
On the ThursdAI podcast, host Alex Volkov noted that MiniMax M2.5 runs extremely quickly and uses far fewer tokens per task, working out to roughly $0.15 per task versus about $3.00 for Claude Opus 4.6.
Breaking the cost barrier
MiniMax offers two API‑driven versions of its M2.5 model, both aimed at high‑volume production use:
| Model | Speed | Input‑token price* | Output‑token price* |
|---|---|---|---|
| M2.5‑Lightning | 100 tokens / s | $0.30 per 1 M tokens | $2.40 per 1 M tokens |
| Standard M2.5 | 50 tokens / s | $0.15 per 1 M tokens | $1.20 per 1 M tokens |
*Prices are quoted per million input or output tokens.
In plain language: MiniMax claims you can run four “agents” (AI workers) continuously for an entire year for roughly $10 k.
For enterprise customers, this pricing is ≈ 1/10 – 1/20 of the cost of competing proprietary models such as GPT‑5 or Claude 4.6 Opus.
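The "four agents for a year for ~$10k" claim can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes each agent streams output around the clock at Standard M2.5's quoted 50 tokens/s; real agents would also pay input-token costs and idle between tasks, so this is an upper-bound estimate on the output side only:

```python
def yearly_agent_cost(output_tok_per_s, output_price_per_m, n_agents=4):
    """Back-of-envelope yearly output-token cost for agents running 24/7."""
    tokens_per_year = output_tok_per_s * 86_400 * 365        # seconds in a year
    per_agent = tokens_per_year / 1_000_000 * output_price_per_m
    return n_agents * per_agent

# Standard M2.5: 50 tokens/s at $1.20 per 1M output tokens.
# Four agents come to roughly $7,600/yr in output tokens alone,
# which is consistent with MiniMax's ~$10k figure once input tokens are added.
```

Running the same calculation with M2.5-Lightning's numbers (100 tokens/s at $2.40 per 1M) doubles the per-second throughput but quadruples nothing: each Lightning agent costs about 4x a Standard one per token-second, so the $10k figure only holds for the Standard tier.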
Pricing comparison (per 1 M tokens)
| Model | Input $ | Output $ | Total $ | Source |
|---|---|---|---|---|
| Qwen 3 Turbo | 0.05 | 0.20 | 0.25 | Alibaba Cloud |
| deepseek‑chat (V3.2‑Exp) | 0.28 | 0.42 | 0.70 | DeepSeek |
| deepseek‑reasoner (V3.2‑Exp) | 0.28 | 0.42 | 0.70 | DeepSeek |
| Grok 4.1 Fast (reasoning) | 0.20 | 0.50 | 0.70 | xAI |
| Grok 4.1 Fast (non‑reasoning) | 0.20 | 0.50 | 0.70 | xAI |
| MiniMax M2.5 | 0.15 | 1.20 | 1.35 | MiniMax |
| MiniMax M2.5‑Lightning | 0.30 | 2.40 | 2.70 | MiniMax |
| Gemini 3 Flash Preview | 0.50 | 3.00 | 3.50 | Google |
| Kimi‑k2.5 | 0.60 | 3.00 | 3.60 | Moonshot |
| GLM‑5 | 1.00 | 3.20 | 4.20 | Z.ai |
| ERNIE 5.0 | 0.85 | 3.40 | 4.25 | Baidu |
| Claude Haiku 4.5 | 1.00 | 5.00 | 6.00 | Anthropic |
| Qwen3‑Max (2026‑01‑23) | 1.20 | 6.00 | 7.20 | Alibaba Cloud |
| Gemini 3 Pro (≤ 200 K) | 2.00 | 12.00 | 14.00 | Google |
| GPT‑5.2 | 1.75 | 14.00 | 15.75 | OpenAI |
| Claude Sonnet 4.5 | 3.00 | 15.00 | 18.00 | Anthropic |
| Gemini 3 Pro (> 200 K) | 4.00 | 18.00 | 22.00 | Google |
| Claude Opus 4.6 | 5.00 | 25.00 | 30.00 | Anthropic |
| GPT‑5.2 Pro | 21.00 | 168.00 | 189.00 | OpenAI |
Takeaway
MiniMax’s Standard M2.5 and M2.5‑Lightning provide a dramatically lower cost per token than most leading commercial LLMs, making them especially attractive for enterprises that need to run many AI agents at scale.
Strategic Implications for Enterprises and Leaders
For technical leaders, M2.5 is more than a cheaper API: it reshapes the operational playbook for running AI in the enterprise.
Why the Prompt-Optimization Pressure Disappears
- The cost advantage removes the need to constantly trim prompts for savings.
- High‑context, high‑reasoning models can now be deployed for routine tasks that were previously cost‑prohibitive.
Speed Gains Enable Real‑Time Agentic Pipelines
- 37 % faster end‑to‑end task completion.
- AI orchestrators that chain models together (the “agentic” pipelines) finally move quickly enough for real‑time user applications.
Domain‑Specific Strengths
- Financial modeling: 74.4 % on MEWC, indicating strong capability with the tacit knowledge required in law, finance, and other specialized sectors.
- Minimal oversight is needed for complex, domain‑specific tasks.
Open‑Source Positioning: Opportunities & Caveats
- Scalable automated code audits: Organizations can run intensive audits at a scale that previously required massive human effort, while retaining tighter data‑privacy controls.
- Licensing & weight availability: Until the model’s license terms and weights are publicly released, the promised benefits remain theoretical.
The Bigger Picture
MiniMax M2.5 signals a shift in the AI frontier: success is no longer measured solely by model size, but by how useful and affordable the model is as a worker in the enterprise.
Key takeaway: Enterprises that adopt M2.5 can expect lower costs, faster agentic workflows, and stronger domain performance—provided the open‑source release materializes as promised.