Defined Term · concept · updated 2026-06-04 (UTC)

LLM API pricing

How llm-provider access is priced: per token, with separate input and output rates, and output 2–6× more expensive than input across providers (see llm-api-pricing-comparison). The 2026 market spans a ~600× cost spread, from $0.30 to $180 per 1M output tokens in the snapshot below.

Three tiers (mid-2026 snapshot — volatile, cite the source/date)

Budget: Mistral Small 3.2 $0.10/$0.30 (input/output, USD per 1M tokens); GPT-4.1 Nano $0.10/$0.40; DeepSeek V3.2 $0.27/$1.10.
Mid / production: GPT-5.4 $2.50/$15; Claude Sonnet 4.6 $3/$15; Gemini 3.1 Pro $2/$12.
Frontier / reasoning: GPT-5.4 Pro $30/$180; o3 $15/$60; Claude Opus 4.7 $5/$25.
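
To make the split input/output pricing concrete, here is a minimal Python sketch that estimates the list-price cost of a single request. The `PRICES` table copies one row from each tier above; the function name `request_cost` and the 2,000-in / 500-out token counts are illustrative assumptions, not part of any provider's API.

```python
# Snapshot list prices in USD per 1M tokens, as (input, output).
# Copied from the tiers above; volatile, so re-check source and date.
PRICES = {
    "Mistral Small 3.2": (0.10, 0.30),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.4 Pro": (30.00, 180.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD of one request (hypothetical helper)."""
    in_price, out_price = PRICES[model]
    # Input and output tokens are billed at different per-million rates.
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 2,000 input tokens and 500 output tokens per request.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 2_000, 500):.6f}")
```

Multiplying the per-request figure by expected call volume gives a rough monthly estimate, which is usually enough to decide which tier a workload belongs in.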

Cost levers (stacked, they can cut spend to ~25% of list price)

Prompt caching: 50–90% off repeated input prefixes (most aggressive at Anthropic and OpenAI).
Batch processing: flat 50% off for non-real-time work.
Model routing: send simple tasks to budget models; cuts cost by 70–90% on the calls that qualify.
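
The three levers above can be sketched as simple arithmetic. This is a rough model under stated assumptions, not any provider's billing logic: it assumes the caching and batch discounts multiply, ignores any surcharge for writing to the cache, and uses hypothetical names (`discounted_cost`, `routed_cost`) and an assumed workload of 10,000 input / 1,000 output tokens at $3/$15 per 1M.

```python
def discounted_cost(
    input_tokens: int,
    output_tokens: int,
    in_price: float,               # list price, USD per 1M input tokens
    out_price: float,              # list price, USD per 1M output tokens
    cached_fraction: float = 0.0,  # share of input served from a cached prefix
    cache_discount: float = 0.9,   # assumed; the text gives a 50-90% range
    batch: bool = False,           # apply the flat 50% batch discount
) -> float:
    """Cost in USD of one request after caching and batch discounts (sketch)."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cached input tokens are billed at (1 - cache_discount) of list price.
    input_cost = (fresh + cached * (1 - cache_discount)) * in_price
    output_cost = output_tokens * out_price
    total = (input_cost + output_cost) / 1_000_000
    return total * 0.5 if batch else total


def routed_cost(calls: int, simple_share: float,
                budget_cost: float, premium_cost: float) -> float:
    """Total cost when `simple_share` of calls go to a budget model (sketch)."""
    simple = calls * simple_share
    return simple * budget_cost + (calls - simple) * premium_cost


if __name__ == "__main__":
    list_price = discounted_cost(10_000, 1_000, 3.0, 15.0)
    stacked = discounted_cost(10_000, 1_000, 3.0, 15.0,
                              cached_fraction=0.8, batch=True)
    print(f"list ${list_price:.4f}, stacked ${stacked:.4f}, "
          f"ratio {stacked / list_price:.2f}")
    # Assumed per-call costs: $0.001 on a budget model, $0.01 on a premium one.
    print(f"routed: ${routed_cost(1_000, 0.6, 0.001, 0.01):.2f} vs "
          f"unrouted: ${routed_cost(1_000, 0.0, 0.001, 0.01):.2f}")
```

Under these assumptions the cached, batched request costs roughly a quarter of list price. Whether the discounts actually combine this way depends on the provider, so check the current pricing page before budgeting on it.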

Strategic reads

DeepSeek disrupted the bottom of the market with an OpenAI-compatible API at ultra-low cost, pulling the price floor down. Google undercuts on sticker price, OpenAI offers the widest range, and Anthropic competes on consistency and caching. Because output dominates cost, context/output optimization is the highest-leverage spend control. Pricing is the most volatile fact in this wiki, so always date it.
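
How much of a bill comes from output can be checked per workload. The sketch below is illustrative only: `output_share` is a hypothetical helper, the $3/$15 rates are the Claude Sonnet 4.6 figures from the snapshot, and the token counts are assumed.

```python
def output_share(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """Fraction of a request's list-price cost that comes from output tokens."""
    in_cost = input_tokens * in_price
    out_cost = output_tokens * out_price
    return out_cost / (in_cost + out_cost)


if __name__ == "__main__":
    # Equal input and output volume: the 5x price ratio makes output dominate.
    print(f"{output_share(1_000, 1_000, 3.0, 15.0):.2f}")
    # Input-heavy request (5:1): input and output cost about the same.
    print(f"{output_share(5_000, 1_000, 3.0, 15.0):.2f}")
```

The share shifts with the token mix, so trimming output helps most on generation-heavy calls, while trimming or caching context helps most on long-prompt calls.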

Related

llm-api-pricing-comparison · deepseek · anthropic · llm-provider · llm-benchmarks
