Build with Gemini 3 Flash, frontier intelligence that scales with you
Source: Dev.to

Introducing Gemini 3 Flash
Today we’re introducing Gemini 3 Flash, our latest model with frontier intelligence built for speed at a fraction of the cost. Building on 3 Pro’s strong multimodal, coding, and agentic features, 3 Flash offers powerful performance at less than a quarter the cost of 3 Pro, along with higher rate limits. The new 3 Flash model surpasses 2.5 Pro across many benchmarks while delivering faster speeds. It also features our most advanced visual and spatial reasoning (details) and now offers code execution (docs) to zoom, count, and edit visual inputs.
Gemini 3 Flash is rolling out to developers via:
- Google AI Studio – Start a new chat
- Google Antigravity – Read the announcement
- Gemini CLI – Blog post
- Android Studio – Build smarter apps
- Vertex AI – Enterprise offering
Smarter, faster and ready for production at scale
Gemini 3 Flash delivers frontier‑class performance on PhD‑level reasoning and knowledge benchmarks like GPQA Diamond (90.4 %) and Humanity’s Last Exam (33.7 % without tools), rivaling much larger frontier models.
Gemini 3 Flash is highly efficient without sacrificing intelligence, pushing the Pareto frontier of performance vs. cost. It outperforms 2.5 Pro while being 3× faster (based on Artificial Analysis benchmarking) at a fraction of the cost. Even with the lowest thinking level (docs), 3 Flash often outperforms previous versions set to “high” thinking levels.
Pricing & Cost‑Saving Features
- Pricing (Gemini API & Vertex AI)
- $0.50 / 1 M input tokens
- $3 / 1 M output tokens
- Audio input remains $1 / 1 M input tokens
- Context caching – up to 90 % cost reduction for repeated token use over certain thresholds.
- Batch API – up to 50 % cost savings and higher rate limits for asynchronous processing.
- Production‑ready rate limits – available for paid API customers in synchronous and near‑real‑time use cases.
Gemini 3 Flash in action
Gemini 3 Flash is now integrated into many of our products, and early customers are enthusiastic about the new possibilities.
For coding
Gemini 3 Flash offers improved coding and agent capabilities over previous versions, enabling rapid, iterative development. It outperforms 3 Pro’s agentic coding skill (78 % on SWE‑bench Verified) while operating faster for quick iterations. Today, 3 Flash is rolling out to users in Google Antigravity, our new agentic development platform, to provide intelligent coding assistance that keeps pace with your train of thought.
For gaming
Gemini 3 Flash brings powerful performance to game developers, delivering superior video analysis and near‑real‑time reasoning that outperforms the 2.5 series.
- Astrocade uses 3 Flash for its agentic game‑creation engine, generating full game plans and executable code from a single prompt, turning concepts into playable games in seconds.
- Latitude leverages 3 Flash to generate smarter characters and more realistic worlds, elevating gameplay while keeping costs low. The engine can now handle complex tasks that previously required pro‑level models like Sonn.
Gemini 3 Flash – the sweet spot of speed, intelligence, and cost‑efficiency for developers and enterprises alike.
Gemini 3 Flash Highlights
Nick Walton
CEO, Latitude
For deep‑fake detection
Resemble AI is using Gemini 3 Flash to provide near‑real‑time deep‑fake intelligence by instantly transforming complex forensic data into simple explanations. They discovered that Gemini 3 Flash offered 4× faster multimodal analysis compared with Gemini 2.5 Pro, processing raw technical outputs without hindering crucial workflows. Learn more in their case study.
For document analysis
Performance gains often come with a latency trade‑off, but Gemini 3 Flash proves that fast models can still meet the rigorous accuracy demands of the legal industry. With strong reasoning capabilities without sacrificing speed, it enables new levels of efficiency for complex document analysis for Harvey, an AI company for law firms and professional service providers.
Gemini 3 Flash has achieved a meaningful step‑up in reasoning, improving over 7 % on Harvey’s BigLaw Bench compared with its predecessor, Gemini 2.5 Flash. These quality improvements, combined with Flash’s low latency, are impactful for high‑volume legal tasks such as extracting defined terms and cross‑references from contracts.
Niko Grupen
Head of Applied Research, Harvey
Get started with Gemini 3 Flash
Gemini 3 Flash is available across many of our products, APIs, and throughout the ecosystem. As you explore the Gemini 3 family, you can:
- Use our new built‑in API logs visualization dashboard.
- Send model feedback directly through Google AI Studio.
- Since 3 Flash is a reasoning model, be sure to circulate thoughts in the API or use the new Interactions API.
Where to access Gemini 3 Flash
We are excited to put this model in your hands and can’t wait to see what you create with Gemini 3 Flash.

