Voxtral TTS by Mistral AI Can Copy Any Voice in Just 5 Seconds

Mistral AI has launched Voxtral TTS, its first text-to-speech model that clones voices in seconds and speaks nine languages with natural emotion.

By Haneem (2 min read)
March 27, 2026 at 04:56 AM
Mistral AI has released its first text-to-speech model, called Voxtral TTS. The model turns written text into natural, lifelike speech and is now available for developers and businesses to use.

Voxtral TTS supports nine languages, including English, French, Spanish, Arabic, Hindi, and German. It can handle different dialects within those languages and produce speech that sounds natural rather than robotic. The model can also express emotion, adjusting tone based on the context of the text.

One of its standout features is voice cloning. Users only need three to five seconds of reference audio to copy a voice. The model preserves the accent, tone, and speaking style of the original voice. It can also apply a voice from one language to speech in another language, which makes it useful for translation tools.

The model also responds quickly. In tests it delivers the first audio output in around 70 milliseconds, which is fast enough for live conversations, customer support systems, and voice assistants.
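For readers who want to check a figure like this themselves, the sketch below shows one way to measure time to first audio on any streaming source. It is a minimal illustration, not Mistral's benchmark method: `time_to_first_chunk` and `fake_tts_stream` are hypothetical names, and the fake stream only simulates a 70 ms delay rather than calling a real API.

```python
import time
from typing import Callable, Iterable, Iterator


def time_to_first_chunk(
    stream: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return milliseconds elapsed until the first non-empty audio chunk."""
    start = clock()
    for chunk in stream:
        if chunk:  # skip empty keep-alive chunks, if any
            return (clock() - start) * 1000.0
    raise ValueError("stream ended without producing audio")


def fake_tts_stream(first_chunk_delay_s: float = 0.07) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response (assumption)."""
    time.sleep(first_chunk_delay_s)  # simulated model and network latency
    yield b"\x00" * 320              # first audio chunk (silence)
    yield b"\x00" * 320


if __name__ == "__main__":
    latency_ms = time_to_first_chunk(fake_tts_stream())
    print(f"time to first audio: {latency_ms:.1f} ms")
```

With a real service, the timer should start before the request is sent so that network time is included in the measurement.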

Mistral has made the model weights available on Hugging Face under a non-commercial license. Developers who want to use it for business purposes can access it through an API at $0.016 per 1,000 characters. Anyone can also try it directly through Mistral Studio without any setup.
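To put the API price in perspective, the short calculation below estimates the cost of synthesizing a given amount of text at $0.016 per 1,000 characters. It is a rough sketch: `estimate_cost` is a hypothetical helper, and it assumes billing scales linearly with character count, with no minimum charge or rounding, which the article does not confirm.

```python
from decimal import Decimal

# Price quoted in the article: $0.016 per 1,000 characters.
PRICE_PER_1K_CHARS = Decimal("0.016")


def estimate_cost(text: str) -> Decimal:
    """Estimate the API cost in USD for synthesizing `text`.

    Assumption: cost is strictly proportional to the number of characters.
    """
    return PRICE_PER_1K_CHARS * Decimal(len(text)) / Decimal(1000)


if __name__ == "__main__":
    script = "a" * 250_000  # placeholder for a 250,000-character script
    print(f"${estimate_cost(script):.2f}")  # prints "$4.00"
```

Under that assumption, one million characters would come to about $16.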

The model has about 4 billion parameters and runs on a single graphics card with at least 16 GB of memory. In human evaluations, it performed at a similar level to ElevenLabs v3, one of the leading voice AI tools on the market.
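A quick back-of-the-envelope check shows why a 16 GB card is a plausible fit for a model of this size. The sketch below computes the memory needed for the weights alone at a few common numeric precisions. The precisions are assumptions (the article does not say which format the released weights use), and the figures ignore activations and other runtime overhead.

```python
PARAMS = 4_000_000_000  # "about 4 billion parameters", per the article
GPU_MEMORY_GB = 16      # minimum GPU memory, per the article

# Assumed bytes per parameter for common precisions (not from the article).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(params: int, bytes_per_param: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, nbytes)
        headroom = GPU_MEMORY_GB - gb
        print(f"{precision}: {gb:.0f} GB weights, {headroom:.0f} GB headroom")
```

At 16-bit precision the weights would take roughly 8 GB, leaving about half of a 16 GB card for everything else, while 32-bit weights would fill the card on their own.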

Voxtral TTS is designed to work alongside Mistral's existing speech-to-text tools, allowing developers to build complete voice pipelines. Mistral positions it as a practical, lower-cost alternative to proprietary voice AI systems, especially for companies that want to run AI locally or keep data private.
