A two-person team at Nari Labs has just dropped Dia, a powerful new open source text-to-speech model, and it is turning heads across the AI landscape. Built with zero funding and trained on Google TPUs through the TPU Research Cloud program, Dia aims to beat industry leaders like ElevenLabs, OpenAI’s gpt-4o-mini-tts, and Google’s NotebookLM podcast tool. What sets it apart? Natural dialogue delivery, emotional tone, and the ability to interpret nonverbal cues like laughter and coughing, all from plain text.
Openly available on GitHub and Hugging Face under an Apache 2.0 license, Dia is already making waves in audio generation communities with example demos showing it outperforming proprietary models. With advanced voice control, voice cloning, and expressive performance baked in, this model isn’t just for coders—it’s built for anyone who wants to generate high-quality spoken content.
Key Points
Dia is a 1.6B parameter text-to-speech (TTS) model that rivals and, in many cases, outperforms competitors like ElevenLabs Studio, Sesame CSM-1B, and NotebookLM’s podcast generator.
Built by just two engineers at Nari Labs, the model was created with no external funding and is fully open source.
Dia supports emotional tones, speaker tagging, and nonverbal audio cues (like laughs or coughs), giving it a lifelike, conversational edge.
Side-by-side comparisons show Dia’s superior ability to handle rhythmic complexity, emotional transitions, and expressive delivery.
Voice cloning and prompt-based tone matching are included, expanding creative and commercial applications.
Available under Apache 2.0 license, making it legally viable for commercial use.
Nari Labs explicitly bans misuse, including impersonation, misinformation, and illegal activity.
Developers can try it with a Gradio-based demo, and a consumer-friendly remix tool is on the way.
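To make the speaker-tagging and nonverbal-cue features above concrete, here is a minimal sketch of how a Dia-style dialogue script might be assembled in Python. The `[S1]`/`[S2]` speaker tags and parenthesized cues such as `(laughs)` follow the format shown in Nari Labs’ published examples; the `Dia` class, `from_pretrained`, and `generate` names in the commented usage are assumptions based on the project’s README and may differ between releases.

```python
def build_script(turns):
    """Join (speaker, line) pairs into a single Dia-style prompt string.

    Nonverbal cues can be embedded directly in a line, e.g. "(laughs)"
    or "(coughs)"; Dia renders these as actual audio rather than
    spoken words.
    """
    return " ".join(f"[{speaker}] {line}" for speaker, line in turns)


script = build_script([
    ("S1", "Have you tried the new open source TTS model? (laughs)"),
    ("S2", "I have, and the dialogue delivery sounds surprisingly natural."),
])

# Hypothetical usage, assuming the loading interface sketched in the
# project's GitHub README (not verified here; the model weights must
# be downloaded from Hugging Face first):
#
#   from dia.model import Dia
#   model = Dia.from_pretrained("nari-labs/Dia-1.6B")
#   audio = model.generate(script)  # waveform, ready to save as .wav

print(script)
```

The same script string can be pasted directly into the Gradio demo mentioned below, which accepts tagged multi-speaker text in its input box.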
Key Quotes
“Dia rivals NotebookLM’s podcast feature while surpassing ElevenLabs Studio and Sesame’s open model in quality.” – Toby Kim, Co-creator of Dia
“We were not AI experts from the beginning… None of them sounded like real human conversation.” – Kim, recounting their motivation
“Dia interprets and delivers actual laughter, whereas ElevenLabs and Sesame output textual substitutions like ‘haha’.” – From Nari Labs’ audio comparison tests
“We wanted more—more control over the voices, more freedom in the script.” – Toby Kim on why they built Dia
Implications
Dia represents a significant moment for open source voice tech. It’s a rare example of a small, grassroots AI team challenging the dominance of major players with a model that combines accessibility with expressiveness. By making it commercially usable out of the box and prioritizing responsible development, Nari Labs is positioning Dia as a go-to solution for developers, educators, content creators, and more.
This isn’t just about better voice synthesis—it’s about democratizing the ability to create digital voices that feel human. And if this is what two people can build without funding, the ripple effects across AI development could be massive.