LLM API Pricing Comparison (CloudZero, 2026)
A ranked cost comparison of the major LLM APIs, and the core source for llm-api-pricing. Headline: a ~600× cost spread across the market. All prices are in USD per 1M tokens, written as input/output.
Snapshot tiers (mid-2026; volatile)
- Budget ($0.10–$0.75 in): Mistral Small 3.2 $0.10/$0.30 · GPT-4.1 Nano $0.10/$0.40 · DeepSeek V3.2 $0.27/$1.10.
- Mid / production ($2–$5 in): GPT-5.4 $2.50/$15 · Claude Sonnet 4.6 $3/$15 · Gemini 3.1 Pro (≤200K) $2/$12.
- Frontier / reasoning ($15–$30 in): GPT-5.4 Pro $30/$180 · o3 $15/$60 · Claude Opus 4.7 $5/$25.
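A minimal sketch of how the snapshot figures above turn into per-request costs and market spreads. The prices are copied from the tier list; the function names (`request_cost`, `spread`, `output_input_ratios`) and the example token counts are illustrative assumptions, not part of the source. On these figures the output-price spread is 180 / 0.30 = 600 and the input-price spread is 30 / 0.10 = 300, which suggests (the source does not confirm it) that the ~600× headline is an output-price ratio.

```python
# Snapshot prices from the tier list, USD per 1M tokens: (input, output).
PRICES = {
    "Mistral Small 3.2": (0.10, 0.30),
    "GPT-4.1 Nano": (0.10, 0.40),
    "DeepSeek V3.2": (0.27, 1.10),
    "GPT-5.4": (2.50, 15.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 3.1 Pro (<=200K)": (2.00, 12.00),
    "GPT-5.4 Pro": (30.00, 180.00),
    "o3": (15.00, 60.00),
    "Claude Opus 4.7": (5.00, 25.00),
}


def request_cost(model, input_tokens, output_tokens):
    """List-price cost in USD of one request (no caching or batch discount)."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def spread(index):
    """Most expensive price divided by the cheapest; index 0 = input, 1 = output."""
    values = [price[index] for price in PRICES.values()]
    return max(values) / min(values)


def output_input_ratios():
    """Output price divided by input price, per model."""
    return {model: out / inp for model, (inp, out) in PRICES.items()}


if __name__ == "__main__":
    # Hypothetical request shape: 10,000 input tokens, 1,000 output tokens.
    for model in PRICES:
        print(f"{model:26s} ${request_cost(model, 10_000, 1_000):.5f}")
    print("input spread: ", round(spread(0)))
    print("output spread:", round(spread(1)))
```

Because the snapshot is volatile, keeping the prices in one table like `PRICES` makes it cheap to refresh the numbers and rerun the comparison.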
Provider positioning
- OpenAI: widest range (~150× spread), so pick the model by workload.
- Anthropic (anthropic): consistency plus aggressive prompt caching (90% off cached input).
- Google: undercuts on sticker price (Gemini 2.5 Flash at $0.15/$0.60 is the “cheapest high-quality option from a major U.S. provider”).
Cost levers (stackable; combined, spend can drop to ~25% of list price)
- Prompt caching: 50–90% off repeated input.
- Batch: a flat 50% off.
- Model routing: sending qualifying calls to budget models cuts their cost by 70–90%.
Structural fact: output tokens cost 2–6× as much as input tokens, so context optimization is the biggest lever.
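The stacking arithmetic behind these levers can be sketched as below. This is an illustration under stated assumptions, not a provider's billing formula: it assumes the cache and batch discounts combine multiplicatively, ignores any surcharge for writing to the cache, and uses a hypothetical workload (10,000 input tokens with 80% served from cache, 1,000 output tokens) at the Claude Sonnet 4.6 snapshot price of $3/$15. Whether a given provider lets the two discounts stack should be checked against its own pricing page.

```python
def effective_cost(input_tokens, output_tokens, in_price, out_price,
                   cached_fraction=0.0, cache_discount=0.0, batch_discount=0.0):
    """Cost in USD of one request after discounts.

    Prices are USD per 1M tokens. Assumptions: the cache discount applies
    only to the cached share of input tokens, and the batch discount
    applies to the whole request on top of it.
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = (fresh + cached * (1 - cache_discount)) * in_price
    output_cost = output_tokens * out_price
    return (input_cost + output_cost) * (1 - batch_discount) / 1_000_000


# Hypothetical workload at the Claude Sonnet 4.6 snapshot price ($3/$15).
list_cost = effective_cost(10_000, 1_000, 3.00, 15.00)
stacked = effective_cost(10_000, 1_000, 3.00, 15.00,
                         cached_fraction=0.8,   # assumed share of repeated input
                         cache_discount=0.9,    # 90% off cached input
                         batch_discount=0.5)    # flat 50% off
fraction_of_list = stacked / list_cost

if __name__ == "__main__":
    print(f"list: ${list_cost:.4f}  stacked: ${stacked:.4f}  "
          f"fraction of list: {fraction_of_list:.0%}")
```

With these assumed inputs the stacked cost lands at about a quarter of list price, in line with the ~25% figure. The result is sensitive to the cached share and to the input/output mix, so it is worth rerunning with a workload's real token counts.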
Related
llm-api-pricing · deepseek · anthropic · llm-provider