Google Launches Gemini 3.5 Live Translate for Real-Time Speech Translation in 70+ Languages
Key Takeaways
- Automatically detects and translates 70+ languages, generating continuous, natural-sounding speech that preserves the speaker's tone, pacing, and pitch
- Available across consumer (Google Translate apps), enterprise (Google Meet private preview), and developer (Gemini Live API) channels, with a broader rollout planned
- Enables 2,000+ language combinations in a single meeting, up from the previous limit of five languages
- Early partnerships with Grab, LiveKit, Agora, Fishjam, and others point to real-world viability; Grab is testing the technology for driver-rider communication across 10+ million monthly voice calls
- Processes speech as it streams, with noise robustness and a latency of only a few seconds, enabling multilingual conversation without awkward pauses
Summary
Google announced Gemini 3.5 Live Translate, its latest audio model, which delivers near real-time speech-to-speech translation in more than 70 languages. The model automatically detects input languages and generates smooth, natural-sounding translated speech while preserving speakers' intonation, pacing, and pitch. Unlike traditional turn-by-turn translation systems, which wait for a speaker to finish before translating, Gemini 3.5 Live Translate generates speech continuously, lagging the speaker by only a few seconds and staying synchronized throughout the session.
The model is rolling out across multiple channels starting today: developers can access it via the Gemini Live API and Google AI Studio in public preview, enterprises can test it in Google Meet (private preview starting this month), and consumers can use it globally on Google Translate for Android and iOS. It supports more than 2,000 language combinations in a single meeting, a significant expansion from the previous five-language limit, and is built to be robust to noise in unpredictable real-world environments. Early partners including Grab, LiveKit, and CJ ENM have given positive feedback on translation quality, accuracy, and latency.
Editorial Opinion
This release represents a watershed moment in real-time translation technology, moving from the awkward, delayed back-and-forth of traditional systems to fluid, continuous conversations that finally feel natural. Google's demonstration of stable performance across 70+ languages while preserving speaker characteristics suggests the company has solved a genuinely hard technical problem in real-time audio processing. The aggressive multi-platform rollout and early traction with partners like Grab indicate this technology could reshape how globalized teams, international customer service, and cross-border commerce operate. However, real-world performance across diverse language pairs and in consistently noisy environments will be the true test of whether it delivers on its ambitious promise.



