Software Application · updated 2026-06-09

Gemini 3.5 Live Translate

Google’s end-to-end speech-to-speech translation audio model (announced 2026-06), part of the gemini family (llm-providers-wiki, cross-wiki). It listens to spoken input, translates it, and generates natural speech in the target language; the output is speech, not translated text. A proprietary, closed-frontier entrant that fuses the spoke’s STT and TTS branches into one real-time pipeline.

Capabilities

70+ languages, “2000+ language combinations.”
Voice preservation: maintains the speaker’s intonation, pacing, and pitch across languages.
Near real-time, simultaneous: processes streamed speech continuously (not turn-based), staying “just a few seconds behind the speaker.” It balances waiting for more context against translating immediately, i.e. the latency/quality trade-off of simultaneous interpretation (cf. tts-benchmarks latency axis).
Multilingual & robust: handles multilingual input without manual config; noise-robust.
Safety/provenance: all generated audio carries imperceptible SynthID watermarking (a detectability/anti-misinformation measure — a provenance mechanism new to this wiki).
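
The announcement does not say how the model decides when to wait and when to speak. As a toy illustration of the waiting-for-context vs. translate-immediately trade-off only, the sketch below uses the wait-k policy from the simultaneous-translation literature: read k source tokens first, then alternate one write per read. It is not Google's method; the function names and the token-level framing are assumptions made for the example.

```python
from typing import List, Tuple

Action = Tuple[str, int]  # ("READ", source_index) or ("WRITE", target_index)


def wait_k_schedule(source_len: int, target_len: int, k: int) -> List[Action]:
    """Return the READ/WRITE action sequence of a wait-k policy.

    Hypothetical helper: target token t (0-based) is emitted once
    min(k + t, source_len) source tokens have been read.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    actions: List[Action] = []
    read = 0
    for written in range(target_len):
        needed = min(k + written, source_len)
        while read < needed:
            actions.append(("READ", read))
            read += 1
        actions.append(("WRITE", written))
    return actions


def source_seen_at_each_write(actions: List[Action]) -> List[int]:
    """For every WRITE, count how many source tokens had been read by then."""
    seen, read = [], 0
    for kind, _ in actions:
        if kind == "READ":
            read += 1
        else:
            seen.append(read)
    return seen


if __name__ == "__main__":
    for k in (1, 2, 4):
        schedule = wait_k_schedule(source_len=4, target_len=4, k=k)
        print(k, source_seen_at_each_write(schedule))
```

A small k starts speaking almost immediately but commits to each output with little context; a k at or above the source length degenerates into waiting for the whole utterance, which is the turn-based behaviour the model is said to avoid.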

Availability (2026-06)

Developers: public preview via the Gemini Live API + Google AI Studio.
Enterprise: private preview in Google Meet.
Consumers: global rollout in Google Translate (Android/iOS); Android “listening mode” delivers translation through the phone earpiece.

Significance

The clearest instance yet of the spoke’s “LLM eating audio from both ends” thesis happening simultaneously: STT + machine translation + TTS in one streamed model. It puts Google/gemini — already a TTS leader (Gemini 3.x Flash TTS) and STT player (Chirp) — at the closed frontier of speech-to-speech-translation too.

Caveat

A product announcement (vendor framing); language counts, latency (“few seconds”), and voice-preservation quality are Google-stated, not independently benchmarked. Dated snapshot, 2026-06.
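
The two headline counts can at least be checked against each other. This is a minimal sketch, assuming “language combinations” means pairs of distinct languages. The announcement does not define the term, so both the unordered and the directed (source to target) readings are computed, and the helper names are illustrative.

```python
from math import comb


def unordered_pairs(n: int) -> int:
    """Number of distinct language pairs, ignoring direction."""
    return comb(n, 2)


def directed_pairs(n: int) -> int:
    """Number of (source, target) pairs with source != target."""
    return n * (n - 1)


def min_languages(target: int, directed: bool) -> int:
    """Smallest language count whose pair count reaches `target`."""
    count = directed_pairs if directed else unordered_pairs
    n = 2
    while count(n) < target:
        n += 1
    return n


if __name__ == "__main__":
    print(unordered_pairs(70), directed_pairs(70))
    print(min_languages(2000, directed=False), min_languages(2000, directed=True))
```

With exactly 70 languages there are 2415 unordered and 4830 directed pairs, so “2000+” is consistent with “70+” under either reading. The check says nothing about which pairs are actually supported or how well.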

speech-to-speech-translation · speech-to-text · text-to-speech · speech-audio-ai · gemini
