ElevenLabs Crosses $500M ARR, Adds BlackRock and NVIDIA to $550M+ Series D
ElevenLabs has surpassed $500 million in annual recurring revenue within the first four months of 2026, up from $350M at year-end 2025. The AI voice company announced a third close of its Series D round — now topping $550 million — with new institutional investors including BlackRock, Wellington, NVIDIA, and D.E. Shaw, plus a roster of celebrity backers that includes Jamie Foxx and Eva Longoria.
ElevenLabs, the AI voice platform that turned text-to-speech from a novelty into an industry-grade capability, has crossed $500 million in annual recurring revenue — a milestone the company hit in the first four months of 2026, up from $350 million at the end of last year. To mark the occasion, the company announced a third close of its Series D funding round, bringing the total to more than $550 million at an $11 billion valuation and adding a striking mix of institutional heavyweights and celebrity names to its cap table.
New institutional investors include BlackRock, Wellington Management, D.E. Shaw, and Schroders. On the strategic side, NVIDIA’s investment arm NVentures and Santander joined the round, alongside enterprise technology partners. The celebrity cohort — more than 30 actors, musicians, athletes, and entertainment executives — includes Jamie Foxx, Eva Longoria, and Squid Game creator Hwang Dong-hyuk.
The headline number is the revenue growth. Going from $350 million to $500 million ARR in roughly four months is about 43 percent growth in a third of a year, which compounds to an annualized rate of roughly 190 percent, extraordinary for a company already operating at this scale. Most AI voice startups that reached a few million in ARR struggled to find enterprise adoption beyond novelty use cases. ElevenLabs, by contrast, has systematically moved upmarket.
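As a rough check on the growth arithmetic above, the short Python sketch below annualizes the move from $350 million to $500 million. It assumes the growth took exactly four months and that the same pace holds for a full year; the function name annualized_growth is illustrative and not taken from any company disclosure.

```python
def annualized_growth(start_arr: float, end_arr: float, months: float) -> float:
    """Compound annualized growth rate implied by growth over `months` months."""
    period_multiple = end_arr / start_arr
    # Raise the period multiple to the number of such periods in a year.
    return period_multiple ** (12 / months) - 1


START_ARR = 350.0  # $M, year-end 2025 (from the article)
END_ARR = 500.0    # $M, roughly four months later (from the article)
MONTHS = 4         # assumption: exactly four months elapsed

period_growth = END_ARR / START_ARR - 1            # growth over the four months
simple_annual = period_growth * (12 / MONTHS)      # no compounding
compound_annual = annualized_growth(START_ARR, END_ARR, MONTHS)

print(f"Four-month growth:   {period_growth:.1%}")    # 42.9%
print(f"Simple annualized:   {simple_annual:.1%}")    # 128.6%
print(f"Compound annualized: {compound_annual:.1%}")  # 191.5%
```

Either way of annualizing lands well above 100 percent. The compound figure is the more conventional reading of "annualized," though it is an extrapolation and not a forecast.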
How ElevenLabs Became Infrastructure
Founded in 2022 by Mati Staniszewski and Piotr Dabkowski, two Polish engineers who previously worked at Palantir and in machine learning at Google, respectively, ElevenLabs launched with a single product: a text-to-speech API whose voices sounded markedly more natural than those of earlier commercial systems. Early adopters included audiobook creators, YouTube content producers, and podcasters looking to localize content without expensive re-recording.
What happened next was less the result of a grand strategy than product-market pull. Enterprises started using the API to build customer service voice agents; media companies used it to dub content at scale; video game studios integrated it for dynamic NPC dialogue. Each of these use cases has a completely different buyer, a different pricing model, and a different competitive dynamic. ElevenLabs leaned into all of them.
The platform today encompasses far more than text-to-speech. ElevenLabs has built a full audio AI suite: voice cloning, voice design, a speech-to-speech model for real-time voice conversion, a conversational AI framework for building voice agents, and — most recently — ElevenMusic, a music generation tool that has put it in direct competition with Suno and Udio in the generative music market.
Why BlackRock and Wellington Are Paying Attention
The involvement of BlackRock and Wellington — two of the world’s largest asset managers, not traditional venture investors — is the most strategically interesting detail in today’s announcement.
Both firms manage enormous fixed-income and equity portfolios that depend on real-time information synthesis and client communication. Voice AI sits at the intersection of those workflows in ways that text-based AI does not. A voice-first earnings call summary, a personalized portfolio update delivered as spoken audio, an AI relationship manager that can converse naturally with retail investors — these are products that asset managers have been trying to build for years. Having a direct stake in ElevenLabs’s development gives BlackRock and Wellington a front-row seat to shape how those products get built.
There is also a content licensing dimension. As AI-generated voice content proliferates, rights management for synthetic voices is becoming a genuinely complicated legal landscape. Having sophisticated institutional investors with deep legal and compliance teams is an asset when navigating the pushback now coming from the music and entertainment industries.
The Celebrity Angle
The entertainment industry names on ElevenLabs’s cap table are not purely cosmetic. Jamie Foxx, Eva Longoria, and Hwang Dong-hyuk each have a professional stake in understanding how AI voice and audio technology will evolve — and in ensuring it does not simply commoditize the human creative work they represent.
Hwang Dong-hyuk’s participation is especially pointed. Squid Game is one of the most successfully localized pieces of video content in history, and ElevenLabs’s dubbing technology targets exactly that commercial use case at industrial scale. His investment reads as both strategic positioning and a bet on the company’s ability to preserve voice authenticity through AI dubbing rather than flatten it.
ElevenLabs has been aggressive in signing talent licensing agreements, which pay voice actors for training data and ongoing synthetic usage. The company frames these deals as both a responsible AI approach and a competitive differentiator against less scrupulous alternatives.
Platform War Risk
ElevenLabs does not operate without competitive risk. Every major AI platform company is building voice capabilities. OpenAI’s Advanced Voice Mode, Google’s Gemini Live, and Anthropic’s audio features all encroach on the same territory. Meta is building voice AI for WhatsApp at a scale that dwarfs ElevenLabs’s user base by orders of magnitude.
The bear case is that voice AI becomes a commodity feature built into every LLM provider’s stack, and the standalone voice API market collapses as customers simply call OpenAI or Google for speech synthesis as part of their existing contract. ElevenLabs would need to move up the value chain — toward the end-to-end conversational agent platform, the content production toolchain, or the talent rights infrastructure — to avoid being squeezed.
The bull case, which the company appears to be executing against, is that voice is sufficiently technically nuanced — voice cloning fidelity, emotional expressiveness, real-time latency, multilingual naturalness — that there remains a durable place for a focused specialist. The company’s breadth of audio products (speech, music, voice design, agents) gives it more surface area than a pure API play.
At $500 million ARR and growing, ElevenLabs has earned the right to test that thesis.