IIT Bombay’s BharatGen has been selected to lead one of India’s most ambitious AI projects: building a large language model (LLM) with one trillion parameters, under the IndiaAI Mission.
What is BharatGen and What Has It Done So Far
- BharatGen is a consortium led by IIT Bombay, including several top Indian institutes like IIT Kanpur, IIT Madras, IIIT Hyderabad, IIT Hyderabad, IIT Mandi, IIM Indore and others.
- It has already released Param-1, a bilingual LLM with 2.9 billion parameters, trained on ~5 trillion tokens (English and Hindi).
- It also developed speech models across multiple Indian languages and is focused on multimodal AI (text, speech, images).
The Trillion-Parameter LLM: Ambitions & Funding
- In the second phase of the IndiaAI Mission, eight entities were selected; among them, BharatGen will build the 1-trillion-parameter foundational model.
- To support this effort, BharatGen has been allocated about Rs 988.6 crore (roughly Rs 9,886 million, or about Rs 9.9 billion) in government assistance from the Ministry of Electronics & Information Technology (MeitY).
- The model aims to be open source (open weights), with intellectual property held by IIT Bombay & the IndiaAI Mission.
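For readers less familiar with the Indian numbering system, the funding figure can be converted with a few lines of Python. This is only an illustrative sketch: the helper name `crore_to_rupees` is made up for this example, and the only input taken from the article is the Rs 988.6 crore figure.

```python
# 1 crore = 10,000,000 (10^7) in the Indian numbering system.
CRORE = 10_000_000


def crore_to_rupees(crore: float) -> int:
    """Convert an amount in crore to whole rupees (hypothetical helper)."""
    return round(crore * CRORE)


# Funding figure reported in the article: about Rs 988.6 crore.
funding_rupees = crore_to_rupees(988.6)
funding_millions = funding_rupees / 1_000_000
funding_billions = funding_rupees / 1_000_000_000

print(f"Rs {funding_rupees:,}")                # full rupee amount
print(f"Rs {funding_millions:,.0f} million")   # in millions
print(f"Rs {funding_billions:.3f} billion")    # in billions
```

The result is just under Rs 10 billion, which is where the "nearly 10,000 million rupees" description comes from.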
Significance: Why a Trillion Parameters Matters
- A model of this scale would place India among the countries developing flagship LLMs at global scale. Larger parameter counts generally give a model more capacity, which can translate into better handling of nuance, broader multilingual coverage, and stronger complex reasoning.
- In India’s context, with 22 scheduled languages, many dialects and accents, and low representation of Indic languages in most datasets, such a model can help bridge these gaps.
Challenges & Considerations Ahead
- Compute & Infrastructure: Training and deploying trillion-parameter models require massive computational resources (GPUs/TPUs, storage, energy). Ensuring sufficient infrastructure is critical.
- Data: High-quality annotated data across many Indian languages, dialects, and modalities will be needed, along with attention to diversity, ethical sourcing, and privacy.
- Cost & Efficiency: The project must balance cost against performance, including optimizations for latency, serving, and fine-tuning.
- Bias & Fairness: Ensuring the model behaves fairly across different languages and socio-cultural backgrounds.
- Regulation & Sovereignty: Open weights help transparency, but concerns around misuse and safety will require governance frameworks.
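To give a rough sense of the compute and infrastructure challenge listed above, the sketch below estimates how much memory is needed just to hold the weights of a one-trillion-parameter model. Everything beyond the parameter count is an assumption made for illustration: the article does not say whether the model will be dense or sparse, which numeric precision will be used, or which hardware it will run on. The 80 GB per-device figure and the function names are hypothetical.

```python
import math

# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_tb(num_params: int, dtype: str) -> float:
    """Memory for the weights alone, in terabytes (1 TB = 1e12 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e12


def min_devices(num_params: int, dtype: str, memory_gb_per_device: int = 80) -> int:
    """Lower bound on devices needed just to hold the weights.

    Assumes a hypothetical accelerator with `memory_gb_per_device` GB of
    memory and ignores activations, optimizer state, and caches, so real
    training or serving needs more than this.
    """
    total_gb = num_params * BYTES_PER_PARAM[dtype] / 1e9
    return math.ceil(total_gb / memory_gb_per_device)


ONE_TRILLION = 10**12

for dtype in BYTES_PER_PARAM:
    tb = weight_memory_tb(ONE_TRILLION, dtype)
    devices = min_devices(ONE_TRILLION, dtype)
    print(f"{dtype}: {tb:.0f} TB of weights, at least {devices} devices")
```

Under these assumptions the weights alone occupy about 2 TB at 16-bit precision, far more than any single accelerator holds, so the model must be sharded across many devices even before training overheads are counted.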
What This Means for India & the Global AI Ecosystem
- India is pushing toward AI sovereignty, reducing dependency on foreign models and infrastructure (Moneycontrol).
- The project is likely to spur growth in related domains: startups building applications on BharatGen, improved local datasets, and additional investment in compute infrastructure.
- A massive, openly accessible LLM could enable researchers, smaller companies, and the public sector to build better tools for education, health, agriculture, governance, and more.
Conclusion
With BharatGen’s trillion-parameter model, IIT Bombay is set to take India’s AI ambitions to a new scale. The approval of about Rs 988.6 crore in funding under the IndiaAI Mission signals the government’s commitment. If successful, this model could become a foundational pillar for inclusive, multilingual, and sovereign AI solutions in India. The coming months will be critical in showing how well the project handles the challenges of scale, fairness, and usability.