
Chinese-language voiceover
Natural Chinese narration for video, e-learning, and presentations.
Alibaba's Qwen3-TTS — multilingual text-to-speech with strong Chinese and English voices.
Pay once for credits and use them across every model on ZOOOP. · Top up when you need to, with no monthly expiry.
Powered by Qwen's API on ZOOOP
Generate speech across English, Chinese, Spanish, French, German, Italian, Japanese, Korean, Portuguese, and Russian, with auto language detection.
A set of distinct voices — Serena, Aiden, Ryan, Eric, and more — to match tone and character.
Tune the temperature to make delivery more consistent or more varied.
From Alibaba's Qwen line, with particularly strong Chinese and English output.

Voice the same script across 10 languages with auto detection.

Pick a named voice to give different characters distinct deliveries.

Generate the voice, then drive an avatar model like Kling Avatar V2 with it.
Pick the right voice model. Your credits work everywhere on ZOOOP.
Open Qwen3-TTS from this page or pick it in the Audio tools.
Paste your text and pick a voice; set the language or leave it on auto.
Adjust temperature for consistency or variety if needed.
Generate, then download or send the audio to your canvas.
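For readers who think in code, the four steps above can be sketched as assembling one request from the same inputs the page asks for. This is an illustrative sketch only: the function name `build_tts_request`, the field names, and the `"auto"` language value are assumptions made for this example, not ZOOOP's or Qwen's documented API. The language list is taken from this page.

```python
# Hypothetical sketch: the field names and "auto" value are assumptions,
# not a documented ZOOOP or Qwen API.

# Languages listed on this page, plus the auto-detect option.
SUPPORTED_LANGUAGES = (
    "auto", "English", "Chinese", "Spanish", "French", "German",
    "Italian", "Japanese", "Korean", "Portuguese", "Russian",
)


def build_tts_request(text, voice, language="auto", temperature=None):
    """Collect the inputs from the steps above into one dictionary."""
    if not text.strip():
        raise ValueError("text is required")
    if not voice.strip():
        raise ValueError("voice is required")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")

    request = {"text": text, "voice": voice, "language": language}
    # Temperature is only included when the caller sets it, so that the
    # service default (whatever it is) applies otherwise.
    if temperature is not None:
        request["temperature"] = float(temperature)
    return request


if __name__ == "__main__":
    # Leave the language on auto and skip temperature, as in the steps above.
    print(build_tts_request("欢迎观看本期视频。", voice="Serena"))
```

Voices are not validated against a fixed list here because the page names only a few of them (Serena, Aiden, Ryan, Eric) and says there are more.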
Qwen3-TTS is Alibaba's multilingual text-to-speech model from the Qwen line, with particularly strong Chinese and English output. It covers ten languages — English, Chinese, Spanish, French, German, Italian, Japanese, Korean, Portuguese, and Russian — and includes an auto-detect option, so the same workflow handles a mixed-language script or a localized set without manual switching.
Beyond language coverage, it offers a set of named voices to match tone and character, and a temperature control to tune how consistent or varied the delivery is — steadier for narration, looser for expressive reads. Pricing is per 1,000 characters.
Where it sits among ZOOOP's voice models: Inworld TTS is cheaper with a broader voice library; Multilingual V3 is ElevenLabs' flagship with deeper voice control; Gemini 3.1 Flash TTS adds explicit style instructions. Qwen3-TTS's sweet spot is multilingual narration where Chinese and English quality matters.
A reasonable mental model: default to Qwen3-TTS for Chinese/English-led multilingual voiceover, and switch to Inworld for the lowest cost or Multilingual V3 / Gemini for deeper voice and style control.
Supported languages: English, Chinese, Spanish, French, German, Italian, Japanese, Korean, Portuguese, and Russian, with an auto-detect option.
Chinese quality: as part of Alibaba's Qwen line, Qwen3-TTS produces particularly strong Chinese and English speech.
Temperature: tunes how consistent or varied the delivery is. Set it lower for steady reads and higher for more variation.
Qwen3-TTS vs. Inworld TTS: both are multilingual. Qwen3-TTS is strong on Chinese and English and offers temperature control; Inworld TTS is cheaper and has a broader voice library. Pick by language priority and budget.
Text*
Voice*
Language*
Temperature*
Billed per 1000-character block
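To estimate usage before generating, the block billing can be worked through as a small calculation. The sketch below assumes that a partial block is rounded up to a full block and that one character means one Unicode code point. Neither detail is stated on this page, and the helper name `billed_blocks` is made up for illustration.

```python
import math

BLOCK_SIZE = 1000  # characters per billed block, as stated above


def billed_blocks(text):
    """Estimate the number of 1000-character blocks for a script.

    Assumption: any partial block counts as a full block, and length is
    measured in Unicode code points (Python's len on a str).
    """
    return math.ceil(len(text) / BLOCK_SIZE)


if __name__ == "__main__":
    script = "a" * 2500
    print(billed_blocks(script))  # 2500 characters -> 3 blocks under the round-up assumption
```

Under these assumptions, a 1,000-character script costs one block and a 1,001-character script costs two, so trimming a script that sits just over a block boundary lowers the cost.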