TL;DR: Google released Gemini 3.5 Live Translate on June 9, 2026 — a speech-to-speech AI model that translates spoken conversations in near real-time across 70+ languages. It preserves the speaker's voice characteristics and is available today via the Gemini Live API, Google Translate (iOS/Android), and Google Meet (enterprise private preview).
70+ Supported Languages
2,000+ Language Pairs in Google Meet
128K Input Token Context Limit

The Language Barrier Just Got a Lot Smaller

On June 9, 2026, Google officially launched Gemini 3.5 Live Translate, a new member of the Gemini 3.5 model family purpose-built for real-time, bidirectional speech translation. Unlike conventional translation tools that wait for a speaker to finish before generating output, Gemini 3.5 Live Translate processes audio continuously — staying just a few seconds behind the speaker with no awkward pauses or choppy audio.

The model automatically detects the input language and generates smooth, natural-sounding speech in the target language, faithfully preserving the speaker's intonation, pacing, and pitch. This is a significant step beyond robotic-sounding machine translation outputs.

How It Works

Gemini 3.5 Live Translate is based on Gemini 3 Pro and operates through the Gemini Live API. It accepts a raw 16-bit PCM audio stream at 16 kHz and outputs translated audio at 24 kHz. Developers configure a translationConfig object in the session setup and set its targetLanguageCode to a BCP-47 code (e.g., "ko" for Korean, "ja" for Japanese).
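As a rough illustration of the configuration described above, here is a minimal Python sketch that builds a session-setup payload. Only the model ID and the translationConfig and targetLanguageCode field names come from this article. The outer "setup" wrapper, the "model" key, and the build_setup_message helper are assumptions made for illustration, so check the Live API developer guide for the actual message shape.

```python
import json

# Model ID as given in this article.
MODEL_ID = "gemini-3.5-live-translate-preview"


def build_setup_message(target_language_code: str) -> dict:
    """Build a session-setup payload for live translation (hypothetical shape)."""
    if not target_language_code:
        raise ValueError("target_language_code must be a non-empty BCP-47 code")
    return {
        # ASSUMPTION: the "setup" wrapper and "model" key are illustrative only.
        "setup": {
            "model": MODEL_ID,
            # Field names below are the ones named in the article.
            "translationConfig": {
                "targetLanguageCode": target_language_code,
            },
        }
    }


if __name__ == "__main__":
    # Example: translate incoming speech into Korean.
    print(json.dumps(build_setup_message("ko"), indent=2))
```

Keeping the payload as a plain dictionary makes it easy to adapt once the real schema is confirmed, since only the wrapper keys would need to change.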

The model handles multilingual inputs seamlessly — if a speaker switches languages mid-conversation, Gemini 3.5 Live Translate detects the switch automatically without any manual reconfiguration. It also includes noise robustness for real-world environments like busy conference rooms or outdoor settings.

💡 Developer Tip
Set echoTargetLanguage: true in your translationConfig if you want the model to echo back audio that is already in the target language. This is useful for bilingual sessions where some speakers already speak the target language. Send audio in 100 ms chunks for the best latency.
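To make the 100 ms guidance concrete: at 16 kHz, 16-bit mono, a 100 ms chunk is 1,600 samples, or 3,200 bytes. The sketch below computes that size and slices a raw PCM buffer accordingly. The helper names (chunk_size_bytes, iter_chunks) are hypothetical, and the TRANSLATION_CONFIG dictionary only restates the two field names mentioned in this article; how the chunks are actually sent depends on the client library you use.

```python
from typing import Iterator, Optional

INPUT_SAMPLE_RATE_HZ = 16_000  # input format stated in the article
BYTES_PER_SAMPLE = 2           # 16-bit PCM, mono
CHUNK_MS = 100                 # interval suggested in the tip

# Field names from the article; the values are examples.
TRANSLATION_CONFIG = {
    "targetLanguageCode": "ko",
    "echoTargetLanguage": True,
}


def chunk_size_bytes(sample_rate_hz: int = INPUT_SAMPLE_RATE_HZ,
                     chunk_ms: int = CHUNK_MS) -> int:
    """Number of bytes of 16-bit mono PCM covering chunk_ms milliseconds."""
    samples = sample_rate_hz * chunk_ms // 1000
    return samples * BYTES_PER_SAMPLE


def iter_chunks(pcm: bytes, size: Optional[int] = None) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM; the last one may be shorter."""
    size = size or chunk_size_bytes()
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


if __name__ == "__main__":
    one_second_of_silence = bytes(INPUT_SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    chunks = list(iter_chunks(one_second_of_silence))
    print(chunk_size_bytes(), len(chunks))  # 3200 10
```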

Model Specifications

Property              Value
Model ID              gemini-3.5-live-translate-preview
Input                 Audio (speech stream)
Output                Translated audio + text transcript
Languages             70+ (BCP-47 codes)
Input audio format    16-bit PCM, 16 kHz, mono
Output audio format   16-bit PCM, 24 kHz, mono
Input token limit     131,072 tokens
Output token limit    65,536 tokens
Based on              Gemini 3 Pro
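Because the output is headerless PCM, most players will not open it directly. One way to inspect translated audio during development is to wrap it in a WAV container using the output parameters from the specifications above (24 kHz, 16-bit, mono). This sketch uses only Python's standard wave module; the pcm_to_wav_bytes helper is a hypothetical name, and it assumes you have already collected the raw output bytes from your session.

```python
import io
import wave

OUTPUT_SAMPLE_RATE_HZ = 24_000  # output format from the specifications above
SAMPLE_WIDTH_BYTES = 2          # 16-bit PCM
CHANNELS = 1                    # mono


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(OUTPUT_SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buf.getvalue()


if __name__ == "__main__":
    # One second of silence stands in for real translated audio.
    silence = bytes(OUTPUT_SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES)
    with open("translated.wav", "wb") as f:
        f.write(pcm_to_wav_bytes(silence))
```

Note the asymmetry between the two streams: one second of input is 32,000 bytes, while one second of output is 48,000 bytes, so buffers sized for the microphone stream will be too small for playback.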

Where You Can Use It Right Now

For Developers — Gemini Live API (Public Preview): Integrate Gemini 3.5 Live Translate into your apps today via gemini-3.5-live-translate-preview in the Gemini Live API or Google AI Studio. This is the most flexible option for building custom translation experiences, from live event captioning to real-time dubbing pipelines.

For Everyone — Google Translate (Android & iOS): The model is rolling out globally on the Google Translate app. Connect any pair of headphones for the best experience. Android users also get a new "listening mode" — just hold your phone to your ear like a regular call to hear translations through the earpiece, no headphones needed.

For Enterprises — Google Meet (Private Preview): Google Meet's speech translation is upgrading from 5 languages to 70+, and from English-only translation to 2,000+ language pair combinations in a single meeting. Enterprise Workspace customers can apply for private preview now, with broader rollout planned for later in 2026.
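The "2,000+" figure is consistent with simple counting, although the article does not say how Google tallies pairs. Assuming exactly 70 languages (the stated minimum), the short calculation below counts pairs both ways: unordered (Korean and Japanese counted once) and directed (Korean to Japanese counted separately from Japanese to Korean). The function names are illustrative only.

```python
from math import comb


def unordered_pairs(n_languages: int) -> int:
    """Pairs where A-B and B-A count as one combination."""
    return comb(n_languages, 2)


def directed_pairs(n_languages: int) -> int:
    """Pairs where A-to-B and B-to-A count separately."""
    return n_languages * (n_languages - 1)


if __name__ == "__main__":
    print(unordered_pairs(70))  # 2415
    print(directed_pairs(70))   # 4830
```

Under the unordered interpretation, 70 languages give 2,415 combinations, which matches the "2,000+" claim; any languages beyond 70 would push the count higher.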

ℹ️ Current Limitations
As a preview model, Gemini 3.5 Live Translate has known limitations: voice characteristics may shift after long pauses or during rapid multi-speaker exchanges, and language detection can struggle with strong non-native accents or closely related languages. Google DeepMind's model card documents these issues.

Real-World Use Cases

From a product perspective, the potential applications are broad:

  • Multilingual remote meetings: Teams across Korea, Japan, France, and Brazil can each speak in their native language and hear real-time translated audio.
  • International education: Lectures, training sessions, and online courses can be delivered simultaneously in dozens of languages.
  • Live event interpretation: Conference organizers can offer attendees a QR-code-accessible translated audio stream without hiring on-site interpreters.
  • Content dubbing: Developers can build automated dubbing pipelines for video content.

Key Takeaways

  • Gemini 3.5 Live Translate provides real-time, continuous speech-to-speech translation across 70+ languages
  • Preserves speaker intonation, pacing, and pitch for natural-sounding output
  • Available via Gemini Live API in public preview — developers can integrate today
  • Google Meet gains 2,000+ language pair support (enterprise private preview)
  • Android "listening mode" enables headphone-free real-time translation through the phone's earpiece

🔗 Official Resources & Documentation

  • Google Blog: Gemini 3.5 Live Translate Announcement
  • Gemini Live API: Live Translation Developer Guide (with code examples)
  • Model Specifications: gemini-3.5-live-translate-preview
  • Google AI Studio: try the model in the browser