Software Application updated Fri Jun 05 2026 00:00:00 GMT+0000 (Coordinated Universal Time)

Kokoro

An 82M-parameter open-weight-tts model from Hexgrad (v1.0, Jan 2025), positioned as the field’s efficiency leader. Apache-2.0 licensed.

Why it matters

Tiny and cheap: at 82M parameters it runs on minimal hardware for under $1 per 1M characters self-hosted ($0.65/1M on tts-arena-leaderboard), which makes it the default choice for budget or on-device use.
Punches above its size: ~4.5 MOS naturalness and a competitive Elo of ~1064, despite being ~60× smaller than fish-audio-s2-pro (tts-models-2026-benchmark).
Architecture: StyleTTS2 + ISTFTNet, with no diffusion step, which is part of why it’s fast.
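The cost and size figures above reduce to simple arithmetic. The sketch below is only illustrative: the rate ($0.65 per 1M characters) and the ~60× ratio come from this note, while the function names and the 500,000-character workload are hypothetical. The parameter count it derives for the larger model is merely what the ~60× ratio implies, not a published figure.

```python
# Figures taken from the note above; all names are hypothetical.
ARENA_RATE_USD_PER_M_CHARS = 0.65  # $0.65 per 1M characters (tts-arena-leaderboard)
KOKORO_PARAMS = 82_000_000         # 82M parameters
SIZE_RATIO = 60                    # "~60x smaller" than fish-audio-s2-pro


def synthesis_cost_usd(num_chars: int,
                       rate_per_million: float = ARENA_RATE_USD_PER_M_CHARS) -> float:
    """Estimated cost of synthesizing num_chars characters at a per-1M-character rate."""
    return num_chars / 1_000_000 * rate_per_million


def implied_larger_model_params(small_params: int = KOKORO_PARAMS,
                                ratio: int = SIZE_RATIO) -> int:
    """Parameter count implied for the larger model by the approximate size ratio."""
    return small_params * ratio


if __name__ == "__main__":
    workload_chars = 500_000  # assumed workload, e.g. a long document
    print(f"Cost for {workload_chars:,} chars: ${synthesis_cost_usd(workload_chars):.3f}")
    print(f"Implied larger-model size: ~{implied_larger_model_params() / 1e9:.1f}B parameters")
```

Because the ratio is approximate, the implied size should be read as an order-of-magnitude estimate only.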

The tradeoff

No voice-cloning: it ships ~54 preset voices and supports ~15 languages. That capability was dropped to hit the footprint, which makes Kokoro the clean example of the efficiency-vs-controllability split in open-weight-tts. Its CER of ~17% (Trelis) is higher than that of the heavyweight models; the pitch is quality-per-byte, not absolute accuracy.
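For readers unfamiliar with the metric, character error rate (CER) is the character-level edit distance between a reference text and a hypothesis, divided by the reference length. In TTS evaluation the hypothesis is typically an ASR transcript of the synthesized audio. The sketch below shows that standard definition only. How Trelis normalizes text or which ASR system it uses is not stated here, so this should not be read as a reproduction of the ~17% figure.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical strings: the input text and an ASR transcript of the audio.
    print(cer("the quick brown fox", "the quick brown fax"))
```

Note that CER can exceed 1.0 when the hypothesis contains many inserted characters, because the denominator is the reference length.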

open-weight-tts · text-to-speech · tts-benchmarks · voice-cloning · fish-audio-s2-pro · open-source-tts-models
