What are some issues I might encounter and how can I avoid them?

Generative AI can be unpredictable at times, because the output depends both on the input text and on how the model interprets it. We have tried to minimize this unpredictability and keep adding features and improvements that make the output more predictable and controllable. However, there are still a few things you need to be mindful of, and they apply to all generative AI.

You can read more in our guide to troubleshooting.

Multilingual v2 Model: This model represents a significant improvement in predictability and consistency compared to the experimental multilingual v1 model. It has resolved many of the issues associated with the v1 model, although some minor issues still exist, such as inconsistency and language switching.

Inconsistency: Users have reported occasional inconsistencies between AI generations, where the output does not fit together perfectly. This issue is being worked on and is less prominent in the multilingual v2 model. Cloning the voice with consistent samples is recommended to address this.
Language Switching: A common problem is the AI switching languages or accents within a single generation, especially in longer texts. This issue is being addressed, but using a properly cloned voice with Studio (previously Projects) can help mitigate it.
Corrupt Speech: A rare issue where the AI produces muffled and strange-sounding speech. There are no specific solutions, but regenerating the section usually resolves it.

Studio (previously Projects): Studio is a workflow for creating long-form content using AI. It generally works well with a proper voice choice and model.

Import Function: Source files vary widely in formatting, so the import function may not reproduce every file exactly. Double-check imported content for accuracy; some issues may require manual adjustments.
Glitches between Paragraphs: Occasionally, glitches or abrupt transitions between paragraphs may happen. This issue is rare and is being actively worked on. Regenerating the last paragraph can often resolve it.

Multilingual v1 (Experimental): This experimental model may exhibit tone and quality issues across longer segments, introduce noise, and switch between male and female voices. It is less stable for longer generations than the monolingual model and is recommended for shorter texts in English.

Factors Affecting Issues: Several factors affect AI performance, including text chunk length, monolingual vs. multilingual models, voice type (pre-made, voice-designed, or cloned), and settings like stability and similarity.
Future Developments: The team is actively working on improving AI performance and developing new technologies like Studio (previously Projects).
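
Since text chunk length is listed among the factors above, one way to keep generations short is to split long text into smaller pieces before generating each one. The following is only a minimal sketch in plain Python, not part of the product: the function name split_into_chunks and the 800-character default are assumptions chosen for illustration, and the right limit will depend on the voice and model you use.

```python
import re


def split_into_chunks(text: str, max_chars: int = 800) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    Hypothetical helper: the 800-character default is an illustrative
    assumption, not a documented limit.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        # A single sentence longer than max_chars is kept whole, not cut.
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk ends on a sentence boundary, so a section that comes out badly can be regenerated on its own without redoing the whole text.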

Here are a few issues you might encounter: audio corruption, audio degradation, whispering, volume fluctuation, inconsistency, language switching, glitches, and a few others. They are generally very rare and voice-dependent.