Eleven v3 (Alpha) is our latest and most expressive Text to Speech model, offering:
- More human-like generations with higher quality overall
- Support for audio tags
  - emotions: [sad] [angry] [happily]
  - delivery direction: [whispers] [shouts]
  - non-verbal reactions: [laughs] [clears throat] [sighs]
- Dialogue mode to support natural-sounding audio with multiple speakers
- Support for 70+ languages
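
To make the audio tag syntax concrete, here is a minimal sketch of composing tagged input text. The helper `with_tags` is hypothetical and not part of any ElevenLabs SDK. It only shows that a tag is a bracketed word written inline in the text sent to the model. Placing tags directly before the line they affect is an assumption about typical usage.

```python
def with_tags(text: str, *tags: str) -> str:
    """Prefix `text` with bracketed audio tags, e.g. [whispers] or [sighs].

    Hypothetical helper for illustration only; the model just receives
    the final string with the tags embedded in it.
    """
    prefix = " ".join(f"[{tag}]" for tag in tags)
    return f"{prefix} {text}" if prefix else text


# One emotion tag, one delivery direction, one non-verbal reaction.
script = " ".join([
    with_tags("I really thought we had more time.", "sad"),
    with_tags("Don't tell anyone I said that.", "whispers"),
    with_tags("Anyway, let's move on.", "sighs"),
])
```

The resulting `script` string is what you would pass as the text of a generation request.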
Eleven v3 is in public alpha and is automatically available to all ElevenLabs users. It's a research preview and requires more prompt engineering than our previous models. When it works, the output is breathtaking, but its lower reliability and higher latency mean it's not suitable for real-time or conversational use cases. For these, we recommend v2.5 Turbo or Flash. We are working on a real-time version of Eleven v3.
You can generate speech with v3 (Alpha) via the API using our Create speech and Stream speech endpoints by specifying the model ID eleven_v3.
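
As a rough illustration, the sketch below builds a Create speech or Stream speech request using only the Python standard library. Only the model ID eleven_v3 comes from this article. The base URL, the `/text-to-speech/{voice_id}` path, the `/stream` suffix, the `xi-api-key` header, and the `text` and `model_id` fields are assumptions based on the public API reference and should be checked against the current documentation. The function names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"  # assumed base URL


def build_speech_request(api_key: str, voice_id: str, text: str,
                         stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request for eleven_v3."""
    url = f"{API_BASE}/text-to-speech/{voice_id}"  # assumed path
    if stream:
        url += "/stream"  # assumed Stream speech path
    body = json.dumps({"text": text, "model_id": "eleven_v3"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )


def fetch_audio(request: urllib.request.Request) -> bytes:
    """Send the request and return the raw audio bytes from the response."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

A call such as `fetch_audio(build_speech_request(key, voice_id, "[whispers] Hello"))` would return audio bytes that you can write to a file. Building the request separately from sending it keeps the payload easy to inspect before any credits are spent.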
You can also use our Create dialogue and Stream dialogue endpoints to create natural-sounding dialogue with multiple speakers.
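
A multi-speaker request might be assembled along the following lines. This is a sketch under assumptions, not a confirmed contract: the `/text-to-dialogue` path, its `/stream` variant, the `inputs` list of `text` and `voice_id` pairs, and the `xi-api-key` header are taken from the public API reference as best understood here. The function name and the voice IDs in the usage note are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"  # assumed base URL


def build_dialogue_request(api_key: str, turns: list[tuple[str, str]],
                           stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) a dialogue request for eleven_v3.

    `turns` is an ordered list of (voice_id, text) pairs, one per line
    of dialogue; each text may contain audio tags such as [laughs].
    """
    url = f"{API_BASE}/text-to-dialogue"  # assumed path
    if stream:
        url += "/stream"  # assumed Stream dialogue path
    inputs = [{"voice_id": voice_id, "text": text} for voice_id, text in turns]
    body = json.dumps({"inputs": inputs, "model_id": "eleven_v3"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )
```

For example, `build_dialogue_request(key, [("VOICE_A", "[laughs] You did what?"), ("VOICE_B", "[sighs] It's a long story.")])` describes a two-speaker exchange. Sending it with `urllib.request.urlopen` would return the generated audio.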
Visit the following resources for more information:
Please note that under our Beta Services Addendum, content generated using Beta Services cannot be used for any commercial purpose or in any production environment. This restriction applies to content generated using v3 (Alpha).