Nari Labs has launched Dia, an open-source text-to-speech model with advanced features such as emotional tone control, speaker tagging, and nonverbal audio cues. Built on PyTorch 2.0+ with CUDA 12.6, Dia outperforms competitors like ElevenLabs and Sesame in handling natural timing, emotional range, and nonverbal expressions. The Apache 2.0-licensed model requires about 10GB of VRAM and generates roughly 40 tokens per second on an NVIDIA A4000 GPU. Developed by a two-person team with support from the Google TPU Research Cloud and Hugging Face, Dia is available via GitHub and Hugging Face, with a consumer version in development.
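
To illustrate how speaker tagging and nonverbal cues work in practice, here is a minimal sketch based on the project's published quickstart. The package name (`dia`), the `nari-labs/Dia-1.6B` checkpoint ID, and the `generate` call reflect the repo's README, but treat the exact signatures as assumptions and verify them against the current GitHub documentation.

```python
# Minimal sketch of Dia's scripting format (assumes the `dia` package from
# the nari-labs/dia GitHub repo and the nari-labs/Dia-1.6B checkpoint).
import soundfile as sf
from dia.model import Dia

# Load the weights from Hugging Face (needs roughly 10GB of VRAM).
model = Dia.from_pretrained("nari-labs/Dia-1.6B")

# [S1]/[S2] tags switch speakers; parenthesized cues such as (laughs)
# are rendered as nonverbal audio rather than read aloud.
script = (
    "[S1] Dia generates dialogue straight from a script. "
    "[S2] Wait, including the laughing? (laughs) "
    "[S1] Exactly, nonverbal cues are part of the text."
)

audio = model.generate(script)          # returns a raw audio array
sf.write("dialogue.wav", audio, 44100)  # write out 44.1 kHz audio
```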