In a significant stride for India’s artificial intelligence landscape, Bengaluru-based startup Sarvam AI has unveiled its flagship large language model (LLM), Sarvam-M. Boasting 24 billion parameters, this open-source model is designed to excel in Indian languages, mathematics, and programming tasks.
What is Sarvam-M?
Sarvam-M is a hybrid language model built on Mistral Small, an open-source model. It has been fine-tuned using a combination of supervised learning and reinforcement learning with verifiable rewards (RLVR) to enhance its performance in specific domains. The model is optimized for applications such as conversational AI, machine translation, and educational tools.
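To make the idea of a "verifiable reward" concrete, here is a minimal sketch of a reward function for a math question whose answer can be checked automatically. It is an illustration only, not Sarvam AI's training code: the function names, the last-number extraction rule, and the 0/1 reward scale are all assumptions.

```python
import re
from typing import Optional


def extract_final_number(text: str) -> Optional[str]:
    """Return the last number in the text, or None if there is none.

    Thousands separators are dropped first so "1,250" is read as 1250.
    (Hypothetical extraction rule chosen for this sketch.)
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the completion's final number equals the reference, else 0.0."""
    predicted = extract_final_number(completion)
    if predicted is None:
        return 0.0
    return 1.0 if float(predicted) == float(reference) else 0.0


if __name__ == "__main__":
    print(verifiable_reward("12 + 30 gives a total of 42.", "42"))
    print(verifiable_reward("I believe the answer is 41", "42"))
```

The point of such a reward is that a program can check it, so no human rater or learned reward model is needed to score each response during reinforcement learning.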
Performance Benchmarks
Sarvam-M demonstrates impressive capabilities across various benchmarks:
- Mathematics: Achieves a 94% accuracy on the GSM-8K benchmark.
- Programming: Scores 88% on HumanEval, indicating strong code generation abilities.
- Indian Languages: Outperforms several existing models in tasks involving Indian languages, showcasing its proficiency in multilingual understanding.
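For readers unfamiliar with how such percentages arise, the sketch below shows the simplest form of benchmark scoring: the fraction of questions where the model's final answer matches the reference. The evaluation harness behind the figures above is not described here, so the `accuracy` helper and its exact-match rule are assumptions for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that exactly match their reference answer.

    Hypothetical helper: surrounding whitespace is ignored, nothing else.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    # Three of four answers match, so this prints 75%.
    score = accuracy(["18", "5", "7", "3"], ["18", "5", "7", "4"])
    print(f"{score:.0%}")
```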
Technical Innovations
To enhance efficiency and scalability, Sarvam AI implemented several technical optimizations:
- Inference Optimization: Utilized post-training quantization (PTQ) and lookahead decoding to improve throughput without compromising accuracy.
- Deployment: The model is deployed on NVIDIA H100 GPUs, ensuring high-performance inference capabilities.
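As a rough illustration of what post-training quantization does, the sketch below maps a list of floating-point weights to 8-bit integer codes plus a single scale factor, then reconstructs approximate weights from them. The number format and scheme Sarvam AI actually used are not stated here; symmetric per-tensor int8 quantization and the function names are assumptions chosen for simplicity.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to integer codes in [-127, 127].

    Returns the codes and the scale needed to map them back to floats.
    (Assumed scheme for illustration, not the one used for Sarvam-M.)
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Reconstruct approximate float weights from integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(codes, scale, worst)
```

Each weight is stored in one byte instead of two or four, and the reconstruction error per weight is bounded by half the scale. That trade of a small rounding error for lower memory use and bandwidth is what lets a quantized model serve more requests on the same hardware.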
Community and Industry Response
The launch of Sarvam-M has sparked discussions within the AI community. While some critics questioned the model’s reliance on existing frameworks, industry leaders like Zoho CEO Sridhar Vembu defended Sarvam AI’s approach, emphasizing the importance of iterative development and long-term vision in AI innovation.
Looking Ahead
Sarvam AI’s release of Sarvam-M marks a pivotal moment in India’s journey toward developing sovereign AI capabilities. By focusing on Indian languages and culturally relevant applications, Sarvam AI aims to bridge the gap in AI accessibility and foster innovation tailored to the Indian context.