tokens are now more expensive than juniors, and less predictable

Published: May 1, 2026 at 05:05 AM EDT
4 min read
Source: Dev.to

Public Pricing (as of 2024)

| Provider / Model | Input price (per 1 M tokens) | Output price (per 1 M tokens) |
| --- | --- | --- |
| OpenAI GPT‑5.4 | $2.50 | $15 |
| Anthropic Claude Sonnet 4.6 | $3.00 | $15 |
| Google Gemini 2.5 Pro | $1.25 (≤ 200 k tokens) / $2.50 (> 200 k tokens) | $10 (≤ 200 k tokens) / $15 (> 200 k tokens) |

These numbers look cheap if you only run a few prompts in a playground.

A Rough Cost Sketch for a 10‑Person Team

Assumptions per seat per workday

  • 5 M input tokens
  • 2 M output tokens

22 workdays ≈ 1 month.
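These assumptions multiply out directly. A quick sketch, using only the per‑million rates from the pricing table above (figures in USD):

```python
# Back-of-the-envelope monthly token spend per the article's assumptions:
# 5M input + 2M output tokens per seat per workday, 22 workdays, 10 seats.
SEATS, WORKDAYS = 10, 22
INPUT_M, OUTPUT_M = 5, 2  # millions of tokens per seat per day

def monthly_cost(in_price, out_price, seats=SEATS, days=WORKDAYS):
    """Monthly cost in USD, given per-1M-token input/output prices."""
    daily_per_seat = INPUT_M * in_price + OUTPUT_M * out_price
    return daily_per_seat * days * seats

print(monthly_cost(2.50, 15))  # GPT-5.4            -> 9350.0
print(monthly_cost(3.00, 15))  # Claude Sonnet 4.6  -> 9900.0
# Gemini 2.5 Pro spans two tiers; bracketing with both rate pairs
# reproduces the article's range:
print(monthly_cost(2.50, 10), monthly_cost(2.50, 15))  # 7150.0 9350.0
```

Note that the Gemini lower bound assumes output billed at the cheaper tier; a different tier mix shifts the number.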

| Provider / Model | Approx. monthly cost for 10 seats |
| --- | --- |
| OpenAI GPT‑5.4 | $9,350 |
| Anthropic Claude Sonnet 4.6 | $9,900 |
| Google Gemini 2.5 Pro | $7,150 – $9,350 (range reflects the two‑tier pricing) |

The Gemini range shows how a single model can swing between “cheap” and “expensive” depending on token usage patterns.

How This Stacks Up Against Salaries

| Role (2024 median) | Annual salary | Monthly salary |
| --- | --- | --- |
| Administrative assistant (secretary) | $47,460 | $3,955 |
| Software developer (median) | $133,080 | $11,090 |
| Software developer (10th percentile) | $79,850 | $6,654 |

Takeaway: A single engineer casually using a model is still cheaper than a junior developer, but company‑wide AI workflows can outpace junior labor costs very quickly. Five heavy AI seats can already exceed the monthly cost of a median administrative assistant.
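The takeaway is easy to sanity‑check with the figures already on the page (all in USD; the per‑seat cost uses the Claude Sonnet 4.6 estimate from the table above):

```python
# Compare heavy AI seats against the salary table's monthly figures.
seat_cost = 9_900 / 10          # Claude Sonnet 4.6, per heavy seat/month
admin_monthly = 47_460 / 12     # administrative assistant
junior_monthly = 79_850 / 12    # 10th-percentile developer

print(round(5 * seat_cost))        # 4950 -> five seats beat the admin's 3955
print(round(junior_monthly))       # 6654
# Seats needed to match a 10th-percentile developer's monthly cost:
print(junior_monthly / seat_cost)  # ~6.7
```

So roughly seven heavy seats at these rates cost as much as a 10th‑percentile developer.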

Why Token Spend Can Be a Hidden Cost

  1. Output is often the expensive half – OpenAI GPT‑5.4 charges 6× as much for output as for input ($15 vs. $2.50 per 1 M tokens). Teams that focus only on “sending a lot of context” miss the bulk of the bill.

  2. Tokenizer changes matter – Anthropic notes that Claude Opus 4.7’s new tokenizer can consume up to 35 % more tokens for the same text, causing a sudden cost jump with no workload change.

  3. Tiered pricing creates surprise – Gemini 2.5 Pro switches rates after 200 k tokens. Longer prompts, lower cache hit rates, or added features (e.g., grounding, search) can dramatically reshape the bill.

  4. Agents multiply the line‑items – With an AI agent you pay for:

    • The original prompt
    • Tool schemas & results
    • Chain‑of‑thought reasoning budgets (platform‑dependent)
    • Retries, file context, prior‑turn summaries, review passes, self‑correction loops, etc.

    “The agent did the task in 8 minutes” often hides a marginal cost far blurrier than the dashboard suggests.
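The tier surprise in point 3 is easy to see in code. A minimal sketch, assuming (as the quoted thresholds imply) that the whole request is billed at the higher rate once the prompt crosses 200 k tokens:

```python
# Sketch of Gemini 2.5 Pro's two-tier pricing. Assumption: the entire
# request is billed at the higher-tier rates once input exceeds 200k
# tokens, per the thresholds quoted in the pricing table.
def gemini_cost(input_tokens, output_tokens):
    """Cost in USD for a single request."""
    over = input_tokens > 200_000
    in_rate = 2.50 if over else 1.25    # per 1M input tokens
    out_rate = 15.0 if over else 10.0   # per 1M output tokens
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# One extra prompt token nearly doubles the price of the request:
print(gemini_cost(200_000, 10_000))  # 0.35
print(gemini_cost(200_001, 10_000))  # ~0.65
```

Under this reading, a marginal token at the boundary is the most expensive token you can buy.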

Recommendations (Do the Boring Things Early)

| Action | Why |
| --- | --- |
| Don’t benchmark on a single cute demo | One‑off tests hide long‑term cost patterns. |
| Match model capability to task | Not every task needs the frontier model. |
| Avoid using expensive models as a management substitute | Human oversight still adds value. |
| Tag and monitor token usage | Treat token spend like any other budget line‑item. |
| Apply the “AI vs human” framing carefully | Leads to better architecture and honest economics. |
| Use AI to amplify good people, not replace them | Humans remain owners of correctness, cost, and consequences. |
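“Tag and monitor token usage” can start as something very small. A minimal sketch – the tag names, model name, and rates are illustrative, not a real SDK:

```python
# Accumulate token spend per team/feature tag so the bill has line
# items instead of one opaque total. Rates are per 1M tokens.
from collections import defaultdict

PRICES = {"gpt-5.4": (2.50, 15.0)}  # (input, output) per 1M tokens
spend = defaultdict(float)

def record(tag, model, input_tokens, output_tokens):
    """Attribute one request's cost (USD) to a budget tag."""
    in_price, out_price = PRICES[model]
    spend[tag] += (input_tokens * in_price + output_tokens * out_price) / 1e6

record("code-review", "gpt-5.4", 400_000, 50_000)
record("support-bot", "gpt-5.4", 1_200_000, 300_000)
print(dict(spend))  # {'code-review': 1.75, 'support-bot': 7.5}
```

Once every request carries a tag, the “which workflow is eating the budget” question becomes a one‑line query instead of a forensic exercise.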

Bottom Line

  • Tokens are still useful, but they are no longer a cute rounding error.
  • For many teams, token spend is becoming a real labor‑adjacent budget category.
  • Don’t pretend tokens are magically cheaper than people – they come with a billing model that can change under your feet, a cost profile that can explode with usage patterns, and a nasty habit of looking cheap right until they aren’t.

My default stance:

Use AI aggressively, but never let the token budget operate without adult supervision.
