Pauses:
You can use the break tag when generating audio via the API. It produces a natural pause of the length you specify. Rather than simply inserting silence between words, the AI understands this syntax and pauses the way a speaker would.
The syntax for the break tag is <break time="1.5s" />, and the AI can handle pauses of up to 3 seconds in length.
All of our models except Eleven V3 support SSML break tags.
If you are using Eleven V3, you can instead incorporate expressive pause tags such as [pause], [short pause], and [long pause]. These tags are exclusive to Eleven V3 and are not supported by other models.
For more information, please see the Pause section of our guide to Prompting.
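As a rough illustration of the break tag syntax described above, here is a minimal Python sketch that builds the text you would send for speech generation. The helper names (break_tag, join_with_pause) and the sample sentences are hypothetical and are not part of any SDK; the sketch only assembles the string and does not call the API.

```python
MAX_BREAK_SECONDS = 3.0  # upper limit for a single pause, as stated above


def break_tag(seconds: float) -> str:
    """Return an SSML break tag for a pause of the given length (hypothetical helper)."""
    if not 0 < seconds <= MAX_BREAK_SECONDS:
        raise ValueError(
            f"pause must be greater than 0 and at most {MAX_BREAK_SECONDS:g} seconds"
        )
    # ':g' drops a trailing '.0', so 2.0 is written as "2s" and 1.5 as "1.5s"
    return f'<break time="{seconds:g}s" />'


def join_with_pause(first: str, second: str, seconds: float) -> str:
    """Place a break tag between two pieces of text (hypothetical helper)."""
    return f"{first} {break_tag(seconds)} {second}"


# Text that could be passed as the input to a speech generation request
text = join_with_pause(
    "Give me one second to think about it.",
    "Yes, that would work.",
    1.5,
)
print(text)
```

Validating the pause length before sending the request is a design choice in this sketch. It simply mirrors the 3-second limit mentioned above.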
Phonemes:
Our Eleven English V1, Eleven Flash V2, and Eleven Turbo V2 models support SSML phoneme tags, which you can use when generating audio via the API with any of these models.
Please note that phonemes are available only for English-language models and are not currently supported for other languages.
For full details on how to use phoneme tags, please see the Pronunciation section of our guide to Prompting.
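For orientation, the sketch below shows how a phoneme tag could be assembled in Python. It assumes the standard SSML form of the element, with alphabet and ph attributes. The helper name phoneme_tag, the "ipa" alphabet value, and the sample transcription are illustrative assumptions, so please check the Pronunciation section for the alphabets the models actually accept.

```python
from html import escape


def phoneme_tag(word: str, pronunciation: str, alphabet: str) -> str:
    """Wrap a word in an SSML phoneme tag (hypothetical helper).

    The alphabet value is passed through unchanged; which alphabets are
    accepted is an assumption to verify against the Pronunciation guide.
    """
    # Escape characters that would otherwise break the attribute or element
    return (
        f'<phoneme alphabet="{escape(alphabet, quote=True)}" '
        f'ph="{escape(pronunciation, quote=True)}">{escape(word)}</phoneme>'
    )


# Illustrative only: an IPA transcription for the word "tomato"
tagged = phoneme_tag("tomato", "təˈmeɪtoʊ", alphabet="ipa")
text = f"I would like a {tagged} sandwich."
print(text)
```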