Sarvam AI releases ‘Saaras V3’ model

In a major push for “Sovereign AI,” Bengaluru-based startup Sarvam AI has officially launched Saaras V3, its latest-generation Automatic Speech Recognition (ASR) model. Released on February 11, 2026, Saaras V3 is specifically engineered to handle “code-mixed” speech (Hinglish, Tanglish, etc.) and noisy environments, both of which typically cause global AI models to struggle.

The launch is part of Sarvam’s 14-day “launch blitz” leading up to the India-AI Impact Summit 2026.

Benchmarking Excellence: Beating the Giants

The most striking claim from the Saaras V3 release is its performance against global frontier models. According to Sarvam, Saaras V3 recorded a significantly lower Word Error Rate (WER) on the IndicVoices and Svarah datasets than Western competitors.

Model	IndicVoices WER (Lower is Better)
Sarvam Saaras V3	~19.3%
OpenAI GPT-4o Transcribe	~24.5%
Google Gemini 3 Pro	~26.1%
Deepgram Nova-3	~23.8%
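
For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that calculation, not Sarvam's evaluation code: the function name `word_error_rate` and the sample sentences are illustrative, and it assumes plain whitespace tokenization with no text normalization, which real benchmarks usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative code-mixed sentence; not taken from any benchmark.
    reference = "mera phone number nine eight seven hai"
    hypothesis = "mera phone number nine seven hai"
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

Because the denominator is the reference length, a WER of roughly 19% means about one word in five needs correcting, and WER can exceed 100% when a transcript contains many inserted words.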

Key Features of Saaras V3

Saaras V3 isn’t just a basic update; it is built on a new architecture and trained on over one million hours of multilingual Indian audio.

22-Language Support: Natively understands all 22 scheduled Indian languages plus English.
Real-Time Streaming: Unlike batch-only models, Saaras V3 can transcribe audio as it is being spoken with a “Time-to-First-Token” (TTFT) of under 150ms.
Advanced Diarization: Automatically identifies and labels different speakers in a single recording, ideal for meeting transcripts and call center audits.
Numeric Fidelity: High precision in capturing dates, currency, and phone numbers, even when spoken in mixed languages.
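
To make the TTFT figure concrete, here is a minimal sketch of how a developer might measure it on the client side. It does not call Sarvam's API: `fake_transcript_stream` is a hypothetical stand-in for any streaming client that yields partial transcripts, and the 50 ms delay is an arbitrary value chosen for the demonstration.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def measure_ttft(stream: Iterable[str]) -> Tuple[float, List[str]]:
    """Return (time to first token in ms, all tokens) for a token stream."""
    start = time.perf_counter()
    iterator = iter(stream)
    try:
        first = next(iterator)  # blocks until the first token arrives
    except StopIteration:
        raise ValueError("stream produced no tokens") from None
    ttft_ms = (time.perf_counter() - start) * 1000.0
    return ttft_ms, [first] + list(iterator)


def fake_transcript_stream() -> Iterator[str]:
    """Hypothetical stand-in for a streaming ASR client (assumption)."""
    time.sleep(0.05)  # simulated network and model latency
    for token in ["namaste", "aap", "kaise", "hain"]:
        yield token


if __name__ == "__main__":
    ttft_ms, tokens = measure_ttft(fake_transcript_stream())
    print(f"TTFT: {ttft_ms:.0f} ms, transcript: {' '.join(tokens)}")
```

A measurement taken this way includes network round-trip time, so it will generally be higher than a latency figure quoted for the model alone.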

Pricing and Developer Access

Sarvam is positioning Saaras V3 as a cost-effective alternative for Indian enterprises, offering a “pay-per-use” model that is significantly cheaper than global APIs.

Service Type	Price (INR)	Unit
Speech to Text	₹30	Per Hour
Speech to Text + Diarization	₹45	Per Hour
Speech to Text + Translation	₹30	Per Hour
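
As a rough illustration of what these rates mean for a workload, the sketch below estimates the cost of a batch of audio. It assumes billing is strictly proportional to audio duration, with no minimum charge, rounding rules, or taxes; the article does not confirm any of that, and the service keys and function name are made up for this example.

```python
from decimal import Decimal

# Per-hour rates in INR, taken from the pricing table above.
# The dictionary keys are illustrative labels, not official API names.
PRICE_PER_HOUR_INR = {
    "stt": Decimal("30"),
    "stt_diarization": Decimal("45"),
    "stt_translation": Decimal("30"),
}


def estimate_cost_inr(service: str, audio_seconds: int) -> Decimal:
    """Estimate cost in INR, assuming simple pro-rata per-hour billing."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    rate = PRICE_PER_HOUR_INR[service]
    cost = rate * Decimal(audio_seconds) / Decimal(3600)
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Example workload: 1,000 call recordings of 6 minutes each.
    total_seconds = 1000 * 6 * 60
    for service in PRICE_PER_HOUR_INR:
        print(service, estimate_cost_inr(service, total_seconds))
```

Under these assumptions, 100 hours of call recordings would cost ₹3,000 for plain transcription and ₹4,500 with diarization.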

Beyond Speech: The Sarvam Ecosystem

Saaras V3 joins a growing suite of specialized models released by the startup in February 2026, including:

Bulbul V3: A state-of-the-art Text-to-Speech (TTS) model with 35+ professional-quality Indian voices.
Sarvam Vision: A 3-billion parameter model optimized for OCR and document intelligence in Indic scripts.
Sarvam Dub: An AI-powered dubbing tool capable of zero-shot voice cloning.
