
High-volume narration
The lower cost makes Turbo a fit for long scripts and large batches.
The faster, cheaper MiniMax Speech 2.8 — emotion-controlled multilingual TTS at lower cost.
Pay once for credits and use them for any model on ZOOOP. · Top up when you need to, with no monthly burn.
Powered by MiniMax's API on ZOOOP
The faster tier of MiniMax's Speech 2.8 line — the same voices and controls as HD, built for quick turnaround at volume.
Set neutral, happy, sad, angry, fearful, disgusted, or surprised to shape the read.
The same broad language-boost list as the HD tier — Chinese, English, Spanish, Japanese, Korean, Arabic, and many more.
Pick from named voices and adjust speaking speed.


Set the emotion to match the script's tone, at a lower price than HD.

Voice scripts across a broad set of languages with language boost.

Generate the voice, then drive an avatar model like Kling Avatar V2 with it.
Pick the right voice model. Your credits work everywhere on ZOOOP.
Open Speech-2.8-Turbo from this page or pick it in the Audio tools.
Paste your text and pick a voice.
Set emotion, speed, and language boost as needed.
Generate, then download or send the audio to your canvas.
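The steps above can be sketched as a small settings check before generating. This is only an illustration: the field names mirror the form labels on this page (Prompt, Voice, Speed, Emotion, Language Boost), the emotion list is the one given above, and everything else (the `SpeechSettings` class, the default speed of 1.0, the voice name `example-voice`) is a hypothetical placeholder, not ZOOOP's or MiniMax's actual API.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Emotion options as listed on this page.
EMOTIONS = ("neutral", "happy", "sad", "angry", "fearful", "disgusted", "surprised")


@dataclass
class SpeechSettings:
    """Hypothetical container mirroring the page's form fields."""
    prompt: str                            # the text to be spoken
    voice: str                             # a named voice (placeholder values here)
    speed: float = 1.0                     # assumed default, not taken from the page
    emotion: str = "neutral"
    language_boost: Optional[str] = None   # e.g. "English"; optional

    def validate(self) -> None:
        """Raise ValueError if a setting is missing or not a listed option."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not self.voice.strip():
            raise ValueError("a voice must be chosen")
        if self.emotion not in EMOTIONS:
            raise ValueError(f"emotion must be one of {EMOTIONS}")
        if self.speed <= 0:
            raise ValueError("speed must be positive")


settings = SpeechSettings(
    prompt="Welcome back to the show.",
    voice="example-voice",  # hypothetical voice name
    emotion="happy",
    language_boost="English",
)
settings.validate()
print(asdict(settings))
```

Checking settings locally like this is mainly useful for large batches, where a typo in an emotion name is cheaper to catch before any credits are spent.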
Speech-2.8-Turbo is the faster, cheaper tier of MiniMax's Speech 2.8 line, with the same controls as the HD tier. You keep emotion control (neutral, happy, sad, angry, fearful, disgusted, surprised), the same broad language support via language boost, named voices, and adjustable speed, all at a lower cost. The trade-off is somewhat less audio refinement than HD in exchange for lower price and higher speed.
Those economics make Turbo the tier for high-volume narration, long scripts, and quick iteration: generate voice takes freely, then re-run a keeper on Speech-2.8-HD if the final needs the extra polish.
Where it sits among ZOOOP's voice models: Speech-2.8-HD is the higher-quality sibling; Inworld TTS is another cheap multilingual option; Qwen3-TTS is strong on Chinese/English. Speech-2.8-Turbo's sweet spot is affordable, emotion-aware multilingual voiceover at volume.
A reasonable mental model: default to Speech-2.8-Turbo for high-volume, emotion-controlled narration at lower cost, and step up to Speech-2.8-HD when a final needs top quality.
Speech-2.8-Turbo is the faster, cheaper tier, with the same voices, emotion control, and language support. Pick Turbo for cost and volume, HD for top quality.
Yes, emotion control is supported: neutral, happy, sad, angry, fearful, disgusted, and surprised, the same as the HD tier.
The same broad language-boost list as Speech-2.8-HD.
Yes — speed is adjustable to fit your content's pacing.
Prompt*
Voice*
Speed*
Emotion*
Language Boost*