Blog Posting · updated Fri Jun 05 2026

Speech-to-Text APIs in 2026: Benchmarks, Pricing (Future AGI)

A comparison of the leading commercial speech-to-text APIs: the closed, managed counterpart to the open-source models in open-source-stt-models.

The APIs (WER / latency / price)

| API | WER | Latency | Languages | Price |
| --- | --- | --- | --- | --- |
| Deepgram Nova-3 | 5.26% batch / 6.84% stream | <300ms | 36+ | $0.0043–0.0077/min |
| ElevenLabs Scribe v2 Realtime | ~3.3% (EN) | ~150ms | 90+ | $0.22–0.48/hr |
| OpenAI GPT-4o Transcribe | ~8.9% | not real-time | 57+ | $6/1K min |
| Google Cloud Chirp 3 | ~11.6% | variable | 125+ | $4–16/1K min |
| AssemblyAI Universal-2 | ~14.5% | ~760ms | 99+ | ~$0.0062/min |

(WER numbers come from differing test sets — treat as indicative snapshots, cf. speech-to-text.)
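The prices in the table use three different units (per minute, per hour, per 1K minutes), which makes them hard to compare at a glance. The following is a minimal sketch that converts each listed price to dollars per hour of audio. It uses only the figures quoted in the table; the names `PRICES`, `MINUTES_PER_UNIT`, `per_hour`, and `normalized` are illustrative, and the result ignores volume discounts, free tiers, and feature add-ons that the source does not describe.

```python
# Prices copied from the table above as (low, high, unit).
# Single-price rows repeat the same value for low and high.
PRICES = {
    "Deepgram Nova-3": (0.0043, 0.0077, "min"),
    "ElevenLabs Scribe v2 Realtime": (0.22, 0.48, "hr"),
    "OpenAI GPT-4o Transcribe": (6.0, 6.0, "1k_min"),
    "Google Cloud Chirp 3": (4.0, 16.0, "1k_min"),
    "AssemblyAI Universal-2": (0.0062, 0.0062, "min"),
}

# How many minutes of audio one billing unit covers.
MINUTES_PER_UNIT = {"min": 1, "hr": 60, "1k_min": 1000}


def per_hour(price: float, unit: str) -> float:
    """Convert a price quoted per `unit` into dollars per hour of audio."""
    return price / MINUTES_PER_UNIT[unit] * 60


def normalized() -> dict[str, tuple[float, float]]:
    """Return {api: (low, high)} with every price in dollars per hour."""
    return {
        api: (per_hour(low, unit), per_hour(high, unit))
        for api, (low, high, unit) in PRICES.items()
    }


if __name__ == "__main__":
    # Print cheapest-first by the low end of each range.
    for api, (low, high) in sorted(normalized().items(), key=lambda kv: kv[1][0]):
        if low == high:
            print(f"{api}: ${low:.3f}/hr")
        else:
            print(f"{api}: ${low:.3f}-{high:.3f}/hr")
```

On a common unit the listed prices overlap far more than the raw column suggests: for example, GPT-4o Transcribe's $6/1K min works out to $0.36 per hour, which falls inside the ElevenLabs range of $0.22–0.48/hr.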

Reading it

speech-to-text · speech-audio-ai · open-source-stt-models · whisper