Gradio

This demo runs a fine-tuned XTTS model. XTTS is a multilingual text-to-speech and voice-cloning model; the demo performs zero-shot voice cloning from the reference audio you provide.

Supported languages: Finnish (fi), English (en), Estonian (et), German (de), Russian (ru)

Text Prompt

One or two sentences at a time works best. The limit is 200 characters.

Language

Select the output language for the synthesised speech.
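
The input limits described above (a 200-character prompt and one of five language codes) can be checked before a request is sent. The following is a minimal sketch only: `validate_request`, `SUPPORTED_LANGUAGES`, and `MAX_PROMPT_CHARS` are hypothetical names chosen for illustration, and the demo's own validation logic is not shown in this page, so details such as how whitespace is counted are assumptions.

```python
# Hypothetical pre-flight check; not the demo's actual code.
# Language codes and the 200-character limit are taken from the page text.
SUPPORTED_LANGUAGES = {
    "fi": "Finnish",
    "en": "English",
    "et": "Estonian",
    "de": "German",
    "ru": "Russian",
}
MAX_PROMPT_CHARS = 200


def validate_request(text: str, language: str) -> list[str]:
    """Return a list of problems; an empty list means the input looks acceptable."""
    problems = []
    if not text.strip():
        problems.append("Text prompt is empty.")
    elif len(text) > MAX_PROMPT_CHARS:
        # Assumption: the limit counts every character, including spaces.
        problems.append(
            f"Text prompt has {len(text)} characters; the limit is {MAX_PROMPT_CHARS}."
        )
    if language not in SUPPORTED_LANGUAGES:
        codes = ", ".join(sorted(SUPPORTED_LANGUAGES))
        problems.append(f"Unsupported language '{language}'; expected one of: {codes}.")
    return problems


if __name__ == "__main__":
    print(validate_request("Hei maailma.", "fi"))
    print(validate_request("x" * 201, "sv"))
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once.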

Reference Audio

Enabling this option can improve the output if your microphone or reference recording is noisy.

Cleanup Reference Voice

I agree to the terms of the CPML: https://coqui.ai/cpml

Agree

Synthesised Audio

Metrics

Reference Audio Used
