Google

Gemini 3.1 Flash TTS

Google's Gemini 3.1 Flash TTS — expressive text-to-speech with 30 voices and style control.

No subscription
Credits never expire

Pay once for credits and use them with any model on ZOOOP. Top up when you need to, with no monthly burn.

Powered by Google's API on ZOOOP

Key features

30 voices

A library of 30 named voices — from Kore and Puck to Zephyr and Achernar — covering a wide range of tones and characters.

Style instructions

Add a separate style instruction to steer delivery — pace, tone, and emotion — beyond the words themselves.

Google Gemini lineage

Built on Google's Gemini speech models for natural, expressive output.

Per-1,000-character pricing

Priced by text length, so cost scales cleanly with script size.

Use cases

Narration and voiceover

Generate clear, expressive narration for videos, explainers, and presentations.

Style-directed delivery

Use style instructions to set an upbeat, calm, or dramatic read from the same text.

Character voices

Pick from 30 voices to give different characters distinct deliveries.

Drive a talking avatar

Generate the voice, then drive an avatar model like Kling Avatar V2 with it.

E-learning audio

Produce consistent course narration across many lessons.

Podcast and audio content

Generate spoken segments and intros with a chosen voice and style.
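
For the character-voices use case above, one way to keep a cast consistent is to map each speaker to a voice before generating any audio. The sketch below is only an illustration: the `Line` type, the `assign_voices` helper, and the "Speaker: text" script format are hypothetical and not part of ZOOOP or Google's API. The voice names are ones mentioned on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """One spoken line, tagged with the voice that should read it (hypothetical type)."""
    speaker: str
    voice: str
    text: str


def assign_voices(script: str, cast: dict[str, str]) -> list[Line]:
    """Split 'Speaker: text' lines and attach each speaker's chosen voice."""
    lines = []
    for raw in script.strip().splitlines():
        if not raw.strip():
            continue  # skip blank lines between speeches
        speaker, sep, text = raw.partition(":")
        if not sep:
            raise ValueError(f"expected 'Speaker: text', got {raw!r}")
        speaker = speaker.strip()
        if speaker not in cast:
            raise KeyError(f"no voice assigned to {speaker!r}")
        lines.append(Line(speaker, cast[speaker], text.strip()))
    return lines


# Assumed casting; any of the 30 voices could be used instead.
cast = {"Narrator": "Kore", "Robot": "Puck"}
script = """
Narrator: The lab was silent.
Robot: Not for long.
"""

for line in assign_voices(script, cast):
    print(f"{line.voice}: {line.text}")
```

Each resulting line can then be generated separately with its assigned voice, so a character never changes voice partway through a script.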

Choose the right model

Pick the right voice model. Your credits work everywhere on ZOOOP.

Expressive TTS with style control: Gemini 3.1 Flash TTS
ElevenLabs flagship voiceover: Multilingual V3
Multilingual TTS, Qwen: Qwen3-TTS
Cheap, many-voice TTS: Inworld TTS
Drive a talking avatar: Kling Avatar V2
Sound effects and ambience: Sound Effects V2

How to use

01. Open Gemini 3.1 Flash TTS from this page or pick it in the Audio tools.

02. Paste your text and pick a voice.

03. Add a style instruction to steer delivery if needed.

04. Generate, then download or send the audio to your canvas.
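
The inputs collected in the steps above (text, voice, optional style instruction) can be pictured as a small request object. This is a minimal sketch for planning scripts offline. `SpeechRequest` and `style_variants` are hypothetical names, not ZOOOP's or Google's actual API, and nothing here sends a request.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class SpeechRequest:
    """Hypothetical container for one generation: text, voice, optional style."""
    text: str                    # the script to read
    voice: str                   # a named voice, e.g. "Kore"
    style: Optional[str] = None  # optional delivery instruction

    def __post_init__(self):
        if not self.text.strip():
            raise ValueError("text must not be empty")
        if not self.voice.strip():
            raise ValueError("voice must not be empty")


def style_variants(text: str, voice: str, styles: list[str]) -> list[SpeechRequest]:
    """Build one request per style so the same text can be compared across reads."""
    return [SpeechRequest(text, voice, style) for style in styles]


requests = style_variants(
    "Welcome back to the show.",
    "Zephyr",
    ["upbeat", "calm", "dramatic"],
)
for request in requests:
    print(asdict(request))
```

Keeping the style separate from the text mirrors the separate style field: the script stays identical while only the delivery instruction changes.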

Deep dive

What Gemini 3.1 Flash TTS is good at — and what it's not

Gemini 3.1 Flash TTS is Google's expressive text-to-speech model, built on the Gemini speech lineage. Its two defining strengths are a library of 30 named voices — Kore, Puck, Zephyr, Achernar, and more, spanning a wide range of tones and characters — and a separate style instruction field that lets you direct the delivery. The same script can be read upbeat, calm, or dramatic depending on the instruction, which gives finer control than picking a voice alone.

Pricing is per 1,000 characters, so cost scales cleanly with script length — predictable for everything from a short voiceover to a full narration. It's a natural pairing for talking-avatar work: generate the voice here, then drive a model like Kling Avatar V2 with it.
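
Because pricing is tied to character count, a script's cost can be estimated before generating anything. The sketch below assumes billing is proportional to length (no rounding up to the next 1,000 characters) and that every character, including spaces, counts. Neither assumption is confirmed on this page, and the rate shown is a made-up placeholder, not ZOOOP's actual price.

```python
from decimal import Decimal


def estimate_cost(text: str, rate_per_1000: Decimal) -> Decimal:
    """Estimate cost, assuming proportional billing by character count."""
    return (Decimal(len(text)) / Decimal(1000)) * rate_per_1000


# Placeholder rate in credits per 1,000 characters; replace with the real figure.
PLACEHOLDER_RATE = Decimal("1.00")

short_voiceover = "x" * 400
full_narration = "x" * 12_000

print("voiceover:", estimate_cost(short_voiceover, PLACEHOLDER_RATE))
print("narration:", estimate_cost(full_narration, PLACEHOLDER_RATE))
```

`Decimal` is used instead of floats so that repeated estimates across many lessons or segments add up without rounding drift.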

Where it sits among ZOOOP's voice models: Multilingual V3 is ElevenLabs' flagship with deep voice control, Qwen3-TTS leads on multilingual coverage, and Inworld TTS leads on value with many voices at low cost. Gemini 3.1 Flash TTS's sweet spot is expressive, style-directed narration with Google's voices.

A reasonable mental model: default to Gemini 3.1 Flash TTS when you want expressive narration with explicit style control. Switch to Multilingual V3 for ElevenLabs' voice library, to Qwen3-TTS for broad multilingual coverage, or to Inworld TTS for cheap, many-voice output.

Frequently asked questions

How many voices does Gemini 3.1 Flash TTS have?

30 named voices spanning a range of tones and characters.

What are style instructions?

A separate field to direct delivery — pace, tone, emotion — so the same text can be read upbeat, calm, or dramatic.

How is it priced?

Per 1,000 characters of text, so cost scales with script length.

How does it compare to ElevenLabs Multilingual V3?

Both are high-quality TTS. Gemini 3.1 Flash TTS offers Google's voices with style instructions; Multilingual V3 is ElevenLabs' flagship with deep voice control. Pick by voice preference and workflow.
