
Clone a brand voice
Clone a consistent brand or presenter voice from a sample and reuse it across content.
Resemble AI's Chatterbox — clone a voice from a sample and speak text in 20+ languages.
Pay once for credits and use them on every model on ZOOOP · Top up when you need to, with no monthly burn.
Powered by Resemble AI's API on ZOOOP
Provide a reference audio sample and Chatterbox speaks your text in that cloned voice.
Speak across more than 20 languages — English, Chinese, Japanese, Korean, Spanish, French, German, Arabic, Hindi, and more.
Tune exaggeration and temperature to shape how expressive and varied the delivery is.
Built on Resemble AI's voice-cloning stack.

Speak the same cloned voice across 20+ languages for localized content.

Raise exaggeration for a livelier read or keep it low for a steady tone.

Generate the cloned voice, then drive an avatar model like Kling Avatar V2 with it.
Pick the right voice model. Your credits work everywhere on ZOOOP.
Open Chatterbox TTS Multilingual from this page or pick it in the Audio tools.
Upload a reference voice sample and paste your text.
Pick the language; tune exaggeration and temperature as needed.
Generate, then download or send the audio to your canvas.
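For readers who think in code, the four steps above amount to collecting five inputs before generating. The sketch below only models and checks those inputs; it does not call any ZOOOP or Resemble AI endpoint. The class name, field names, and example values are assumptions made for illustration and are not taken from a published API.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ChatterboxRequest:
    """Hypothetical container for the five inputs named in the steps."""
    audio_reference: str  # path or URL of the reference voice sample
    text: str             # the script to speak
    language: str         # the language picked in step three
    exaggeration: float   # expressiveness control
    temperature: float    # variation control

    def validate(self) -> None:
        # Every input is treated as required in this sketch.
        for name in ("audio_reference", "text", "language"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is required")
        for name in ("exaggeration", "temperature"):
            value = getattr(self, name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"{name} must be a number")


def build_payload(request: ChatterboxRequest) -> dict:
    """Validate the request and return it as a plain dictionary."""
    request.validate()
    return asdict(request)


# Illustrative values only; the page does not state defaults or valid ranges.
payload = build_payload(ChatterboxRequest(
    audio_reference="presenter_sample.wav",
    text="Welcome back to the show.",
    language="en",
    exaggeration=0.5,
    temperature=0.8,
))
print(payload)
```

Keeping the inputs in one immutable object makes it easy to reuse the same reference sample and settings across many scripts, which is the point of cloning a consistent voice.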
Chatterbox TTS Multilingual is a voice-cloning model from Resemble AI: provide a reference audio sample and it speaks your text in that cloned voice. Unlike a preset-voice TTS model, it reproduces a specific voice (a brand presenter, a character, or your own), which is what you want when staying consistent with one particular voice matters more than picking from a library.
Its standout feature is range: the cloned voice can speak more than 20 languages, including English, Chinese, Japanese, Korean, Spanish, French, German, Arabic, and Hindi, so one cloned voice can carry localized content. Two controls, exaggeration and temperature, shape how expressive and how varied the delivery is.
Where it sits among ZOOOP's voice models: Index TTS 2 is a voice-cloning model with fine emotion control; LUX TTS is the cheapest cloning option; for preset voices, use Multilingual V3 or another TTS model. Chatterbox's sweet spot is multilingual voice cloning from a sample.
A reasonable mental model: default to Chatterbox when you need a specific cloned voice across many languages, and switch to Index TTS 2 for emotion control or a preset-voice TTS when you don't need cloning.
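That mental model can be written down as a small decision function. This is a sketch of the guidance on this page, not a ZOOOP feature: the function name and flags are hypothetical, and the order in which emotion control and cost are checked is an assumption, since the page does not rank them.

```python
def pick_voice_model(needs_cloning: bool,
                     needs_emotion_control: bool = False,
                     lowest_cost: bool = False) -> str:
    """Return a model name following the page's rule of thumb."""
    if not needs_cloning:
        # Preset voices; the page also allows "another TTS model" here.
        return "Multilingual V3"
    if needs_emotion_control:
        return "Index TTS 2"
    if lowest_cost:
        return "LUX TTS"
    # Default for a specific cloned voice across many languages.
    return "Chatterbox TTS Multilingual"


print(pick_voice_model(needs_cloning=True))
print(pick_voice_model(needs_cloning=True, needs_emotion_control=True))
print(pick_voice_model(needs_cloning=False))
```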
Chatterbox needs a reference audio sample of the voice. It then speaks your text in that cloned voice.
It supports more than 20 languages, including English, Chinese, Japanese, Korean, Spanish, French, German, Arabic, and Hindi.
Exaggeration shapes how expressive the delivery is; temperature controls how varied versus consistent it is.
Cloning reproduces a specific voice from your sample, rather than choosing from a fixed library. For preset voices, use a TTS model like Multilingual V3.
Audio Reference*
Text*
Language of Audio Reference*
Exaggeration*
Temperature*
Billed per 1000-character block
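As a rough way to budget, the per-block billing above can be estimated from the length of your script. The sketch below assumes that a partial block is rounded up to a full one and that characters are counted the way Python's `len()` counts them; neither detail is confirmed on this page. The `credits_per_block` argument is a placeholder, because the page does not state a rate.

```python
import math

BLOCK_SIZE = 1000  # characters per billed block, as stated on this page


def billable_blocks(text: str) -> int:
    """Number of 1000-character blocks, assuming partial blocks round up."""
    if not text:
        return 0
    return math.ceil(len(text) / BLOCK_SIZE)


def estimate_credits(text: str, credits_per_block: float) -> float:
    """Estimated cost; credits_per_block is a rate you supply yourself."""
    return billable_blocks(text) * credits_per_block


script = "Welcome back to the show. " * 60  # 1560 characters
print(billable_blocks(script))
```

Under these assumptions, a 1560-character script counts as two blocks, so trimming a script to just under a block boundary could lower the estimate.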