You can use the break tag when generating audio via the API to create a precise, natural-sounding pause in the speech. The tag does not simply insert silence between words: the AI understands the syntax and pauses the way a speaker would.
The syntax for the break tag is <break time="1.5s" />, where the time attribute gives the pause length in seconds. The AI can handle pauses of up to 3 seconds.
For more information, please see the Pause section of our guide to Prompting.
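As a rough illustration of the syntax described above, the following Python sketch builds a break tag and places it in the text to be synthesized. The helper name break_tag and the constant MAX_BREAK_SECONDS are hypothetical names chosen for this example and are not part of the API; the sketch only prepares the text string and does not call the API.

```python
# Hypothetical helper for illustration; not part of any official SDK.
MAX_BREAK_SECONDS = 3.0  # upper limit on pause length stated above


def break_tag(seconds: float) -> str:
    """Return a break tag such as <break time="1.5s" />."""
    if not 0 < seconds <= MAX_BREAK_SECONDS:
        raise ValueError(
            f"pause must be above 0 and at most {MAX_BREAK_SECONDS:g} seconds"
        )
    # The :g format drops trailing zeros, so 1.5 -> "1.5" and 2.0 -> "2".
    return f'<break time="{seconds:g}s" />'


# Text that could be sent as the input when generating audio.
text = f"Hold on, let me think. {break_tag(1.5)} Alright, I have it."
print(text)
```

Checking the length before sending the text keeps a request within the 3-second limit instead of relying on the model to cope with a longer pause.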
Our Eleven English V1 and Eleven Turbo V2 models support SSML phoneme tags, which you can use when generating audio via the API with either of these models.
For full details on how to use phoneme tags, please see the Pronunciation section of our guide to Prompting.
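For orientation, the sketch below assembles a phoneme tag in the general SSML form, with an alphabet attribute and a ph attribute wrapped around the word. The helper name phoneme_tag is hypothetical, and the use of "ipa" as the alphabet value is an assumption taken from the SSML standard rather than from this section; the Pronunciation section of the guide is the authority on which alphabets the models accept.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_tag(word: str, ph: str, alphabet: str = "ipa") -> str:
    """Wrap a word in an SSML phoneme tag (hypothetical helper).

    The default alphabet "ipa" is an assumption based on standard SSML.
    """
    # quoteattr adds the surrounding quotes and escapes the attribute value;
    # escape protects characters such as & and < in the word itself.
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )


# Text that could be sent to one of the models listed above.
text = f"I would like a {phoneme_tag('tomato', 'təˈmeɪtoʊ')} salad."
print(text)
```

Escaping the word and the attribute values keeps the tag well-formed even when the input contains characters that are special in XML.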