ElevenLabs has officially unveiled Music v2, its next-generation foundational AI music model. The release marks a major evolution for the voice synthesis leader, transforming its audio ecosystem from a simple “prompt-and-generate” tool into a highly controllable, block-based composition suite tailored for professional creators, musicians, and enterprise marketing teams.
The launch puts ElevenLabs in direct competition with standalone platforms like Suno and Udio, as well as Google’s expanding suite of Flow Music creation tools.
The Core Upgrade: Studio-Grade Compositional Control
While ElevenLabs’ first-generation music engine introduced baseline text-to-track capabilities, Music v2 centers entirely on a single theme: granular architectural control. Rather than generating a rigid, uneditable 3-minute audio file, the new model allows creators to interact with the generation timeline like a modern digital audio workstation (DAW).
1. Seamless Mid-Track Genre Shifting
One of the most notable technical advances in Music v2 is its ability to execute radical genre and tempo transitions in the middle of a single track without breaking vocal continuity or letting the arrangement fall apart. Users can instruct a song to start as a classical opera, transition directly into heavy metal, drop into a rapid-fire rap verse, and pivot back to an orchestral finish while the underlying key and vocal character stay consistent.
2. High-Precision Audio Inpainting
The upgrade introduces an advanced inpainting engine. If a creator generates a track where the first 30 seconds are flawless but the bridge fails to land, they no longer have to scrap the file and start over. Users can isolate a specific block on the timeline, modify the prompt or lyrics for just that section, and regenerate it. The model smoothly blends the newly generated audio into the existing track without altering the surrounding composition.
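The block-level regeneration workflow described above can be sketched conceptually in a few lines of Python. This is a hedged illustration only, not ElevenLabs' API: the `Section` type, the `inpaint` function, and the `fake_generate` stand-in for a model call are all hypothetical names, and the sketch does not model the audio blending that the model performs at section boundaries. It shows only the core idea that one section is replaced while every other section is left untouched.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Section:
    """One block on a hypothetical timeline (not an ElevenLabs type)."""
    name: str
    prompt: str
    audio: Tuple[float, ...]  # placeholder for rendered audio samples


def fake_generate(prompt: str) -> Tuple[float, ...]:
    """Stand-in for a model call: returns deterministic dummy 'samples'."""
    return tuple(float(ord(ch) % 7) for ch in prompt[:4])


def inpaint(
    timeline: List[Section],
    target: str,
    new_prompt: str,
    generate: Callable[[str], Tuple[float, ...]] = fake_generate,
) -> List[Section]:
    """Return a new timeline in which only the `target` section is regenerated."""
    if not any(section.name == target for section in timeline):
        raise KeyError(f"no section named {target!r}")
    return [
        replace(section, prompt=new_prompt, audio=generate(new_prompt))
        if section.name == target
        else section  # untouched sections are reused as-is
        for section in timeline
    ]


timeline = [
    Section("Intro", "soft piano", fake_generate("soft piano")),
    Section("Bridge", "loud brass stabs", fake_generate("loud brass stabs")),
    Section("Outro", "fading strings", fake_generate("fading strings")),
]
edited = inpaint(timeline, "Bridge", "sparse strings, half-time feel")
```

Because the original list is never mutated, the earlier version of the track remains available if the regenerated section turns out worse than the one it replaced.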
3. Section-by-Section Custom Sequencing
Creators can build songs from scratch piece by piece. The interface allows users to explicitly define and sequence individual structural blocks:
[ Intro ] ----> [ Verse 1 ] ----> [ Chorus ] ----> [ Instrumental Solo ] ----> [ Outro ]
Each block features its own dedicated text window for independent lyric editing and instrumentation prompting. Within these blocks, users can deploy bracketed commands—such as [drum fill] or [energetic guitar solo]—to inject precise musical cues.
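To make the block structure concrete, here is a minimal Python sketch of how a sequence of sections with per-block prompts, lyrics, and bracketed cues might be represented. The `Block` class, the `cues` method, `render_sequence`, and the sample prompts and lyrics are assumptions made for illustration; they do not reflect ElevenLabs' actual data model or API. Only the block names and the two cue examples come from the description above.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches bracketed cues such as [drum fill]; nested brackets are not handled.
CUE_PATTERN = re.compile(r"\[([^\[\]]+)\]")


@dataclass
class Block:
    """One structural section with its own prompt and lyrics (hypothetical)."""
    name: str
    prompt: str = ""
    lyrics: str = ""

    def cues(self) -> List[str]:
        """Return the bracketed cues found in the lyrics, in order."""
        return [cue.strip() for cue in CUE_PATTERN.findall(self.lyrics)]


def render_sequence(blocks: List[Block]) -> str:
    """Render the block order in the same arrow notation used above."""
    return " ----> ".join(f"[ {block.name} ]" for block in blocks)


song = [
    Block("Intro", prompt="soft piano"),
    Block("Verse 1", prompt="warm acoustic guitar",
          lyrics="Placeholder verse line [drum fill]"),
    Block("Chorus", prompt="full band", lyrics="Placeholder chorus line"),
    Block("Instrumental Solo", lyrics="[energetic guitar solo]"),
    Block("Outro", prompt="fading strings"),
]

print(render_sequence(song))
for block in song:
    print(block.name, block.cues())
```

Keeping each block's prompt and lyrics separate mirrors the per-block text windows described above, so one section can be edited without touching the others.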
4. Advanced Multilingual Vocal Performance
Vocal performance has been significantly improved to handle complex, fast-paced delivery across multiple languages. Out of the box, Music v2 renders clear vocals in English, Spanish, German, French, and Japanese, matching each language's cadence and cultural performance nuances.
The Tri-Platform Deployment and Price Cuts
To support the rollout, ElevenLabs has structured its music ecosystem across three distinct layers while cutting prices to attract high-volume developers:
- ElevenMusic: A consumer-facing web and mobile interface designed for social listening, rapid remixing, original track creation, and public publishing.
- ElevenCreative: A corporate-facing content pipeline platform that allows marketing teams to deploy Music Finetunes—training custom, copyright-compliant mini-models on proprietary brand audio assets to ensure a consistent sonic identity. Self-serve pricing for ElevenCreative has been reduced by up to 40%.
- ElevenAPI: Built specifically for external developers looking to programmatically bake music generation and reference-matching into applications. API pricing has been cut by up to 50%, with Music v2 access scheduled to roll out to the public endpoint over the coming days.
Insulating Creators from Copyright and Legal Friction
As lawsuits over AI training data continue to disrupt the generative media space, ElevenLabs is taking a cautious, licensing-first approach to monetization. The company confirmed that Music v2 was trained entirely on a licensed, pre-screened dataset built in direct collaboration with independent labels, publishers, and emerging musical artists.
Consequently, all tracks generated under commercial subscription plans are fully cleared for mainstream business applications, including television broadcasting, film scoring, video game soundtracks, podcast backgrounds, and social media advertising channels.
