Our Flash and Turbo models have been specially developed for low-latency applications.
Flash v2 and Flash v2.5 are our ultra-low-latency models, generating audio in less than 75ms. Flash v2 is English only, while Flash v2.5 supports 32 languages. You can see a full list of all supported languages here. The Flash models have slightly lower quality and emotional depth than the Turbo models, but offer significantly lower latency.
Our Turbo models are also low-latency. Turbo v2 is English only, while Turbo v2.5 supports 32 languages and is 25% faster than Turbo v2, generating audio in around 300ms. These are highly optimized models, specifically tailored for low-latency applications without sacrificing vocal performance, in line with the quality standard people have come to expect from our models.
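In practice, you choose between these models by setting a model ID on the text-to-speech request. The sketch below illustrates this; the endpoint path, field names, and model IDs (`eleven_turbo_v2_5`, `eleven_flash_v2_5`) are assumptions based on common conventions, so check the API reference for the exact values.

```python
import json

# Hypothetical sketch: selecting a low-latency model for a TTS request.
# The URL template, body fields, and model IDs are assumptions --
# verify them against the official API reference before use.
API_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

def build_tts_request(text: str, model_id: str = "eleven_turbo_v2_5") -> dict:
    """Build the JSON body for a TTS request using a low-latency model."""
    return {
        "text": text,
        # Swap in "eleven_flash_v2_5" when latency matters most.
        "model_id": model_id,
    }

body = build_tts_request("Hello!", model_id="eleven_flash_v2_5")
print(json.dumps(body))
```

The only decision the request encodes is the model ID, so switching between Flash and Turbo is a one-field change.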
Both the Flash and Turbo models cost 1 credit for every 2 characters generated.
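The 1-credit-per-2-characters rate makes cost estimates straightforward. A minimal sketch of that arithmetic is below; note that rounding up for odd character counts is an assumption on our part, not documented billing behavior.

```python
import math

def estimated_credits(text: str) -> int:
    """Estimate credit cost for Flash/Turbo models: 1 credit per 2 characters.

    Rounding up for odd-length text is an assumption, not documented behavior.
    """
    return math.ceil(len(text) / 2)

# "Hello, world!" is 13 characters, so roughly 7 credits.
print(estimated_credits("Hello, world!"))
```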
We also offer Conversational AI, our platform for deploying customized, interactive voice agents. Visit our Conversational AI documentation to learn more.