Gemini API Cheatsheet 2026 — Free Tier Limits, Models, and Endpoints in One Place

Published: May 3, 2026 at 10:18 AM EDT

Source: Dev.to

Model Overview

Model	Context	Best for
gemini-2.5-flash-preview	1 M tokens	General use, thinking, fast
gemini-2.5-pro-preview	1 M tokens	Complex reasoning, best quality
gemini-1.5-flash	1 M tokens	Stable, production‑ready
gemini-1.5-pro	2 M tokens	Longest context
gemini-2.0-flash-lite	1 M tokens	Lowest latency, highest volume

Recommended Model

For most use cases: gemini-2.5-flash-preview
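
The recommendation above can be combined with the context sizes in the Model Overview table. The sketch below is one possible way to do that, not an official selection rule: it prefers the recommended model and falls back to a larger-context model only when the prompt does not fit. The names `MODELS`, `DEFAULT_MODEL`, and `pick_model` are hypothetical, and the context sizes are copied from the table as-is.

```rust
/// Context window sizes in tokens, copied from the Model Overview table.
static MODELS: [(&str, u64); 5] = [
    ("gemini-2.5-flash-preview", 1_000_000),
    ("gemini-2.5-pro-preview", 1_000_000),
    ("gemini-1.5-flash", 1_000_000),
    ("gemini-1.5-pro", 2_000_000),
    ("gemini-2.0-flash-lite", 1_000_000),
];

/// The model this cheatsheet recommends for most use cases.
const DEFAULT_MODEL: &str = "gemini-2.5-flash-preview";

/// Hypothetical helper: returns the recommended model when the prompt fits
/// its context window, otherwise the first listed model that is large enough.
/// Returns `None` when no listed model can hold the prompt.
pub fn pick_model(prompt_tokens: u64) -> Option<&'static str> {
    let fits = |name: &str| {
        MODELS
            .iter()
            .any(|(n, ctx)| *n == name && prompt_tokens <= *ctx)
    };
    if fits(DEFAULT_MODEL) {
        return Some(DEFAULT_MODEL);
    }
    MODELS
        .iter()
        .find(|(_, ctx)| prompt_tokens <= *ctx)
        .map(|(n, _)| *n)
}
```

With these numbers, a 1.5 M token prompt is routed to gemini-1.5-pro, the only listed model with a 2 M token context.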

Rate Limits

Model	RPM	TPM	RPD
Gemini 2.5 Flash Preview	10	250 000	500
Gemini 1.5 Flash	15	1 000 000	1 500
Gemini 1.5 Pro	2	32 000	50
Gemini 2.0 Flash Lite	30	1 000 000	1 500

RPM = requests per minute, TPM = tokens per minute, RPD = requests per day.
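
To make the three limits concrete, here is a minimal sketch that turns one row of the table into pacing numbers. The `Limits` struct and the three helper functions are hypothetical names introduced for illustration; the arithmetic assumes requests are sent at a steady rate and that `rpm` is nonzero.

```rust
use std::time::Duration;

/// Free-tier limits for one model, as listed in the Rate Limits table.
pub struct Limits {
    pub rpm: u32, // requests per minute (must be nonzero)
    pub tpm: u64, // tokens per minute
    pub rpd: u32, // requests per day
}

/// Minimum spacing between requests that keeps a steady stream under RPM.
pub fn min_interval(limits: &Limits) -> Duration {
    Duration::from_secs_f64(60.0 / f64::from(limits.rpm))
}

/// Largest average request size, in tokens, that stays under TPM
/// when sending at the full RPM rate.
pub fn max_tokens_per_request(limits: &Limits) -> u64 {
    limits.tpm / u64::from(limits.rpm)
}

/// Whole minutes of sending at full RPM before the daily quota runs out.
pub fn minutes_until_daily_cap(limits: &Limits) -> u32 {
    limits.rpd / limits.rpm
}
```

For the Gemini 2.5 Flash Preview row (10 RPM, 250 000 TPM, 500 RPD) this gives one request every 6 seconds, an average of 25 000 tokens per request, and 50 minutes of full-rate sending before the daily cap is reached.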

API Examples

Generate Content (cURL)

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview:generateContent \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: YOUR_API_KEY" \
  -d '{
    "contents": [{"parts": [{"text": "Your prompt here"}]}]
  }'

Stream Generate Content (cURL)

curl https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview:streamGenerateContent \
  -H "Content-Type: application/json" \
  -H "x-goog-api-key: YOUR_API_KEY" \
  -d '{"contents": [{"parts": [{"text": "Tell me a story"}]}]}'
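
The two cURL calls above differ only in the method name after the colon. A small sketch of building either URL from a model ID follows; `BASE_URL` and `endpoint_url` are hypothetical names, and the URL shape is taken directly from the commands above.

```rust
/// Base path shared by both cURL examples.
const BASE_URL: &str = "https://generativelanguage.googleapis.com/v1beta/models";

/// Hypothetical helper: builds the endpoint for a model, choosing
/// `streamGenerateContent` when `stream` is true and `generateContent` otherwise.
pub fn endpoint_url(model: &str, stream: bool) -> String {
    let method = if stream {
        "streamGenerateContent"
    } else {
        "generateContent"
    };
    format!("{}/{}:{}", BASE_URL, model, method)
}
```

Keeping the model ID as a parameter makes it easy to switch models when a preview name changes.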

Rust Example (reqwest)

use reqwest::Client;
use serde_json::json;

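/// Sends one prompt to `generateContent` and returns the first candidate's text.
pub async fn call_gemini(prompt: &str, api_key: &str) -> Result<String, reqwest::Error> {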
    let client = Client::new();
    let url = format!(
        "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview:generateContent?key={}",
        api_key
    );

    let body = json!({
        "contents": [{ "parts": [{ "text": prompt }] }]
    });

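    // Turn 4xx/5xx responses into errors instead of parsing an error body as a result.
    let res = client
        .post(&url)
        .json(&body)
        .send()
        .await?
        .error_for_status()?;
    let data: serde_json::Value = res.json().await?;

    // Fall back to an empty string if the response carries no text part.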

    let text = data["candidates"][0]["content"]["parts"][0]["text"]
        .as_str()
        .unwrap_or("")
        .to_string();

    Ok(text)
}

Error Codes

Code	Meaning	Fix
400	Bad request / token limit	Shorten prompt
403	API key rejected or lacks permission	Check key
429	Rate limit hit	Wait and retry
500	Internal error	Retry
503	Overloaded	Wait 2 s, retry once
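
The "Fix" column can be expressed as a small retry policy. The sketch below follows the table for which codes are retried, but the backoff schedule (1 s, 2 s, 4 s, 8 s) and the cap of four retries for 429 and 500 are assumptions, since the table only says "wait and retry". `Action` and `next_action` are hypothetical names.

```rust
use std::time::Duration;

/// What to do after a failed request, following the Error Codes table.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Retrying will not help; fix the request or the key first.
    Fail,
    /// Wait this long, then send the same request again.
    RetryAfter(Duration),
}

/// `attempt` is the number of retries already made (0 after the first failure).
pub fn next_action(status: u16, attempt: u32) -> Action {
    const MAX_RETRIES: u32 = 4; // assumption, not from the table
    match status {
        // Bad request or key problem: the same request would fail again.
        400 | 403 => Action::Fail,
        // Overloaded: wait 2 s and retry once, as the table suggests.
        503 if attempt == 0 => Action::RetryAfter(Duration::from_secs(2)),
        503 => Action::Fail,
        // Rate limit or internal error: exponential backoff (assumed schedule).
        429 | 500 if attempt < MAX_RETRIES => {
            Action::RetryAfter(Duration::from_secs(1u64 << attempt))
        }
        _ => Action::Fail,
    }
}
```

A caller would sleep for the returned duration before resending, and surface the error to the user on `Action::Fail`.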

Token Estimates

1 token ≈ 4 characters in English
1 token ≈ 2–3 characters in Japanese
100 lines of Logcat ≈ 3 000–5 000 tokens
1 page of PDF text ≈ 500–800 tokens
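
These ratios are enough for a rough pre-flight estimate before sending a prompt. The sketch below is a heuristic, not a tokenizer: `estimate_tokens` and the two constants are hypothetical names, and the Japanese value of 2.5 is simply the midpoint of the 2–3 range above.

```rust
/// Approximate characters per token for English text (from the estimates above).
pub const ENGLISH: f64 = 4.0;
/// Midpoint of the 2-3 characters-per-token range given for Japanese (assumption).
pub const JAPANESE: f64 = 2.5;

/// Rough token estimate: counts Unicode scalar values (not bytes) and
/// divides by the given characters-per-token ratio, rounding up.
pub fn estimate_tokens(text: &str, chars_per_token: f64) -> usize {
    let chars = text.chars().count() as f64;
    (chars / chars_per_token).ceil() as usize
}
```

For example, `estimate_tokens("Hello, world!", ENGLISH)` counts 13 characters and rounds 3.25 up to 4 tokens. For exact counts, rely on the token usage reported by the API rather than this estimate.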

Getting an API Key

Go to aistudio.google.com.
Sign in with Google.
Click “Get API Key.”
No credit card is required.

Hiyoko PDF Vault → (by @hiyoyok)
