LLM API pricing
How llm-provider access is priced — per-token, split input vs output, with output 2–6× more expensive than input across providers llm-api-pricing-comparison. The 2026 market spans a ~600× cost spread.
Three tiers (mid-2026 snapshot — volatile, cite the source/date)
- Budget — Mistral Small 3.2 $0.10/$0.30 (in/out per 1M); GPT-4.1 Nano $0.10/$0.40; deepseek V3.2 $0.27/$1.10.
- Mid / production — GPT-5.4 $2.50/$15; Claude Sonnet 4.6 $3/$15; Gemini 3.1 Pro $2/$12.
- Frontier / reasoning — GPT-5.4 Pro $30/$180; o3 $15/$60; Claude Opus 4.7 $5/$25.
Cost levers (stack to ~25% of list)
- Prompt caching — 50–90% off repeated input prefixes (aggressive at anthropic/OpenAI).
- Batch processing — flat 50% off for non-real-time work.
- Model routing — send simple tasks to budget models; cuts 70–90% of qualifying calls.
Strategic reads
deepseek disrupted the bottom (OpenAI-compatible + ultra-low cost), pulling the floor down; google undercuts on sticker price; OpenAI offers the widest range; Anthropic competes on consistency + caching. Because output dominates cost, context/output optimization is the highest-leverage spend control. Pricing is the most volatile fact in this wiki — always date it.
Related
llm-api-pricing-comparison · deepseek · anthropic · llm-provider · llm-benchmarks