What is Eleven v3 (Alpha)?

Eleven v3 (Alpha) is our latest and most expressive Text to Speech model, offering:

  • More human-like generations with higher quality overall
  • Support for audio tags
    • emotions: [sad] [angry] [happily]
    • delivery direction: [whispers] [shouts]
    • non-verbal reactions: [laughs] [clears throat] [sighs]
  • Dialogue mode to support natural-sounding audio with multiple speakers
  • Support for 70+ languages
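
As an illustration of how audio tags sit in a prompt, here is a minimal Python sketch that adds bracketed tags to a line of text and lists the tags a prompt contains. It only builds and inspects strings and does not call any ElevenLabs API. The helper names (`tag`, `find_tags`, `unknown_tags`) are hypothetical, `KNOWN_TAGS` holds only the examples listed above (the model may accept others), and placing tags before the text is an assumption.

```python
import re

# Only the example tags listed in this article; an assumption, not a full list.
KNOWN_TAGS = {
    "sad", "angry", "happily",            # emotions
    "whispers", "shouts",                 # delivery direction
    "laughs", "clears throat", "sighs",   # non-verbal reactions
}

# Matches text inside a single pair of square brackets, e.g. [clears throat].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def tag(text: str, *tags: str) -> str:
    """Prefix text with audio tags written in square brackets."""
    prefix = " ".join(f"[{t}]" for t in tags)
    return f"{prefix} {text}" if prefix else text


def find_tags(prompt: str) -> list[str]:
    """Return every bracketed tag in the prompt, in order of appearance."""
    return TAG_PATTERN.findall(prompt)


def unknown_tags(prompt: str) -> list[str]:
    """Return tags that are not among the examples listed in this article."""
    return [t for t in find_tags(prompt) if t.lower() not in KNOWN_TAGS]


if __name__ == "__main__":
    line = tag("I can't believe you forgot again.", "sighs", "angry")
    print(line)
    print(unknown_tags("[giggles] That was close. [whispers] Stay quiet."))
```

A check like `unknown_tags` can help catch typos in a long script before generating, though a tag missing from this short list is not necessarily unsupported.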

Eleven v3 is in public alpha and is automatically available to all ElevenLabs users. It is a research preview and requires more prompt engineering than our previous models. When it works, the output is breathtaking, but its lower reliability and higher latency mean it is not suitable for real-time and conversational use cases. For these, we recommend v2.5 Turbo or Flash. We are working on a real-time version of Eleven v3.

During June, we’re offering an 80% discount on generations made via the website for self-serve users. API access is not yet publicly available. For early access, please contact sales.

Visit the following resources for more information: