Tokens are now more expensive than juniors, and less predictable
Public Pricing (as of 2024)
| Provider / Model | Input price (per 1 M tokens) | Output price (per 1 M tokens) |
|---|---|---|
| OpenAI GPT‑5.4 | $2.50 | $15 |
| Anthropic Claude Sonnet 4.6 | $3.00 | $15 |
| Google Gemini 2.5 Pro | $1.25 (≤ 200 k tokens) / $2.50 (> 200 k tokens) | $10 (≤ 200 k tokens) / $15 (> 200 k tokens) |
These numbers look cheap if you only run a few prompts in a playground.
A Rough Cost Sketch for a 10‑Person Team
Assumptions, per seat per workday:
- 5 M input tokens
- 2 M output tokens
22 workdays ≈ 1 month.
| Provider / Model | Approx. monthly cost for 10 seats |
|---|---|
| OpenAI GPT‑5.4 | $9,350 |
| Anthropic Claude Sonnet 4.6 | $9,900 |
| Google Gemini 2.5 Pro | $5,775 – $9,350 (range reflects the two‑tier pricing) |
The Gemini range shows how a single model can swing between “cheap” and “expensive” depending on token usage patterns.
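To make the arithmetic reproducible, here is a minimal sketch of the calculation behind the table. The prices are the public rates quoted above; the seat count, workdays, and token volumes are this section's assumptions, not measured usage.

```python
# Minimal sketch of the table's arithmetic. Prices are the public
# per-1M-token rates quoted above; seats, workdays, and token volumes
# are this section's assumptions, not measured usage.

SEATS = 10
WORKDAYS = 22          # ~1 month
INPUT_M_PER_DAY = 5    # millions of input tokens, per seat per workday
OUTPUT_M_PER_DAY = 2   # millions of output tokens, per seat per workday

def monthly_cost(input_price: float, output_price: float) -> float:
    """Team-wide monthly cost at flat per-1M-token prices."""
    per_seat_day = INPUT_M_PER_DAY * input_price + OUTPUT_M_PER_DAY * output_price
    return per_seat_day * WORKDAYS * SEATS

print(f"GPT-5.4:           ${monthly_cost(2.50, 15):8,.0f}")   # 9,350
print(f"Claude Sonnet 4.6: ${monthly_cost(3.00, 15):8,.0f}")   # 9,900
# Gemini 2.5 Pro depends on which side of the 200k tier requests land;
# the two extremes bound the monthly bill.
print(f"Gemini, all <=200k: ${monthly_cost(1.25, 10):8,.0f}")  # 5,775
print(f"Gemini, all >200k:  ${monthly_cost(2.50, 15):8,.0f}")  # 9,350
```

Swapping in your own token volumes is the whole exercise; the structure of the bill matters more than any single rate.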
How This Stacks Up Against Salaries
| Role (BLS, 2024) | Annual wage | Monthly equivalent |
|---|---|---|
| Administrative assistant (secretary) | $47,460 | $3,955 |
| Software developer (median) | $133,080 | $11,090 |
| Software developer (10th percentile) | $79,850 | $6,654 |
Takeaway: A single engineer casually using a model is still cheaper than a junior developer, but company‑wide AI workflows can outpace junior labor costs very quickly. Five heavy AI seats at GPT‑5.4 rates (~$935/month each, ~$4,675 total) already exceed the $3,955 monthly cost of a median administrative assistant.
Why Token Spend Can Be a Hidden Cost
- Output is often the expensive half – OpenAI GPT‑5.4 charges 6× more for output than input. Teams that focus only on “sending a lot of context” miss the bulk of the bill.
- Tokenizer changes matter – Anthropic notes that Claude Opus 4.7’s new tokenizer can consume up to 35 % more tokens for the same text, causing a sudden cost jump with no workload change.
- Tiered pricing creates surprise – Gemini 2.5 Pro switches rates after 200 k tokens. Longer prompts, lower cache hit rates, or added features (e.g., grounding, search) can dramatically reshape the bill.
- Agents multiply the line items – with an AI agent you pay for (tallied in the sketch below):
  - The original prompt
  - Tool schemas & results
  - Chain‑of‑thought reasoning budgets (platform‑dependent)
  - Retries, file context, prior‑turn summaries, review passes, self‑correction loops, etc.
“The agent did the task in 8 minutes” often hides a blurrier marginal cost than the dashboard suggests.
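To make that multiplication concrete, here is a sketch of one hypothetical agent run's token ledger. Every event label and token count below is invented for illustration; the prices are the GPT‑5.4 rates from the table above.

```python
# Hypothetical token ledger for one agent run. Every event below is a
# separately billed item; the counts are made up for illustration.
from dataclasses import dataclass

@dataclass
class Event:
    label: str
    input_tokens: int
    output_tokens: int

# One "8-minute task" as the meter sees it (illustrative numbers).
run = [
    Event("original prompt",            2_000,      0),
    Event("tool schemas + results",     6_000,    500),
    Event("reasoning budget",               0,  4_000),
    Event("retry after tool error",     8_000,  1_200),
    Event("file context re-sent",      12_000,      0),
    Event("prior-turn summary",         1_500,    300),
    Event("review + self-correction",   9_000,  2_500),
]

INPUT_PRICE, OUTPUT_PRICE = 2.50, 15.00  # $/1M tokens (GPT-5.4 rates above)

total_in = sum(e.input_tokens for e in run)
total_out = sum(e.output_tokens for e in run)
cost = (total_in * INPUT_PRICE + total_out * OUTPUT_PRICE) / 1_000_000
print(f"{total_in:,} in / {total_out:,} out -> ${cost:.4f} per run")
# The prompt itself is only 2,000 of 38,500 input tokens here;
# everything else is agent overhead.
```

Pennies per run, until you multiply by retries, seats, and workdays.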
Recommendations (Do the Boring Things Early)
| Action | Why |
|---|---|
| Don’t benchmark on a single cute demo | One‑off tests hide long‑term cost patterns. |
| Match model capability to task | Not every task needs the frontier model. |
| Avoid using expensive models as a management substitute | Human oversight still adds value. |
| Tag and monitor token usage (see the sketch below) | Treat token spend like any other budget line item. |
| Apply the “AI vs human” framing carefully | Careful comparisons lead to better architecture and honest economics. |
| Use AI to amplify good people, not replace them | Humans remain owners of correctness, cost, and consequences. |
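In the same spirit, a minimal sketch of what “tag and monitor token usage” can look like. The ledger format, tag names, and price table are placeholders; the token counts are assumed to come from whatever per-request usage figures your provider's responses report.

```python
# Minimal usage ledger: attribute every request's tokens to a team,
# feature, and model so token spend shows up like any other budget line.
import csv
import time
from pathlib import Path

LEDGER = Path("token_usage.csv")
PRICES = {  # $/1M tokens; placeholder table, fill in your contracted rates
    "gpt-5.4": (2.50, 15.00),
    "claude-sonnet-4.6": (3.00, 15.00),
}

def record_usage(team: str, feature: str, model: str,
                 input_tokens: int, output_tokens: int) -> float:
    """Append one request's usage to the ledger and return its cost."""
    in_price, out_price = PRICES[model]
    cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    new_file = not LEDGER.exists()
    with LEDGER.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["ts", "team", "feature", "model",
                             "input_tokens", "output_tokens", "cost_usd"])
        writer.writerow([int(time.time()), team, feature, model,
                         input_tokens, output_tokens, f"{cost:.6f}"])
    return cost

# Call it right after each model response, using the token counts the
# provider reports back (e.g. a "usage" object on the response):
record_usage("payments", "pr-review-bot", "gpt-5.4",
             input_tokens=42_000, output_tokens=3_100)
```

Once every request lands in a ledger like this, token spend can be reviewed the same way as any other recurring invoice.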
Bottom Line
- Tokens are still useful, but they are no longer a cute rounding error.
- For many teams, token spend is becoming a real labor‑adjacent budget category.
- Don’t pretend tokens are magically cheaper than people – they come with a billing model that can change under your feet, a cost profile that can explode with usage patterns, and a nasty habit of looking cheap right until they aren’t.
My default stance:
Use AI aggressively, but never let the token budget operate without adult supervision.
References
- OpenAI, API Pricing – https://openai.com/pricing
- Anthropic, Claude Pricing – https://www.anthropic.com/pricing
- Google, Vertex AI Generative AI Pricing – https://cloud.google.com/vertex-ai/generative-ai/pricing
- U.S. Bureau of Labor Statistics, Software Developers, Quality Assurance Analysts, and Testers – https://www.bls.gov/oes/current/oes151132.htm
- U.S. Bureau of Labor Statistics, Secretaries and Administrative Assistants – https://www.bls.gov/oes/current/oes43-3000.htm