Can I reduce API latency?

For the most comprehensive and up-to-date information on this topic, we recommend reading our documentation.

Through the API, you can also tune the AI's generation process to reduce latency, though this may affect accuracy. We offer five optimization levels:

0 = Default mode (no latency optimizations)
1 = Normal latency optimizations (about 50% of possible latency improvement of option 3)
2 = Strong latency optimizations (about 75% of possible latency improvement of option 3)
3 = Max latency optimizations
4 = Max latency optimizations, with the text normalizer also turned off for even more latency savings (best latency, but can mispronounce e.g. numbers and dates)

Going from 0 to 4, you can expect to reduce latency by 300-400 ms. To apply this optimization, add the query parameter `optimize_streaming_latency=[OPTIMIZATION_LEVEL]` to the streaming TTS endpoint.

Example:

/text-to-speech/{voice_id}/stream?optimize_streaming_latency=3
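
As a rough illustration, the request URL could be assembled in Python as sketched below. This is not an official client: the base URL, the helper name `build_stream_url`, and the placeholder voice ID are assumptions made for the example, and the accepted range of 0 through 4 reflects the five levels described above. Check the documentation for the exact values your account supports.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for illustration; confirm it against the API documentation.
BASE_URL = "https://api.elevenlabs.io/v1"

# Assumption: the five optimization levels are numbered 0 through 4.
VALID_LEVELS = range(0, 5)


def build_stream_url(voice_id: str, level: int = 0, base_url: str = BASE_URL) -> str:
    """Return the streaming TTS URL with the latency optimization level applied.

    Hypothetical helper, not part of any official SDK.
    """
    if level not in VALID_LEVELS:
        raise ValueError(f"optimization level must be 0-4, got {level!r}")
    query = urlencode({"optimize_streaming_latency": level})
    # Percent-encode the voice ID so it is safe to place in the URL path.
    return f"{base_url}/text-to-speech/{quote(voice_id, safe='')}/stream?{query}"


# "my-voice-id" is a placeholder, not a real voice.
print(build_stream_url("my-voice-id", 3))
```

Validating the level before sending the request keeps an out-of-range value from reaching the API. Starting at a lower level and raising it only if latency is still too high limits the impact on accuracy.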