India’s Sovereign AI Race: Is the Country Missing Easy Wins?
India’s sovereign AI race is heating up. The big dream is to build India’s own large AI models from scratch. But some experts now ask a sharp question: is India chasing the hardest goal while ignoring quick, simple wins? “Sovereign AI” means AI that India controls and runs on its own terms. There may be faster ways to get there than building giant models from zero.
This debate matters because money and time are limited. India already has more than 100 million weekly ChatGPT users, which makes it the second-largest market in the world. The country clearly loves AI. The real question is how to turn that demand into AI products that India owns. Let us break down both sides in plain words.
The case for “easy wins”
One side says India is overthinking it. Pranesh Prakash, a tech policy researcher linked to Yale Law School, argues that sovereignty is about control, not where a model was born. In his view, it means the right to “use, modify, and deploy” a model freely. If you can take a free, open model and shape it for India, that counts as sovereign.
Here is his key point in simple terms: building a model is not the hard part. Turning it into a product people use every day is. As he puts it, “the hardest part of the AI stack is not model creation but productisation.” He also notes India already has free tools available — “low-hanging fruits that just haven’t been plucked.” In other words, the quick wins are sitting right there.
What are these free tools? Open-weight models such as Meta’s Llama and Mistral AI’s models can be downloaded and adapted at no cost, subject to each model’s licence terms. Indian companies are also already building on paid foreign AI services. Firms like ixigo, Bajaj Finserv, and MakeMyTrip use OpenAI. Meesho and PhysicsWallah use ElevenLabs for voice in Indian languages. The point: you can build great India-first products today without training a model from scratch.
The case for building from scratch
The other side pushes back hard. Vivek Raghavan, CEO of Sarvam AI, says some things simply cannot be fixed later. The best example is the “tokenizer” — the part of a model that breaks language into small pieces it can read. For Indian languages, a tokenizer built from scratch works far better. You cannot bolt that on after the fact.
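One way to see why the tokenizer matters is to look at how text is stored before any tokenizer touches it. The sketch below is an illustration only, not Sarvam’s method. It assumes a worst-case “byte fallback” tokenizer that spends one token per UTF-8 byte, which is roughly what happens when a vocabulary has few learned pieces for a script. The function names are made up for this example.

```python
def byte_fallback_token_count(text: str) -> int:
    """Tokens needed if every UTF-8 byte becomes its own token (assumed worst case)."""
    return len(text.encode("utf-8"))


def bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode character in the text."""
    return byte_fallback_token_count(text) / len(text)


english = "India"
hindi = "भारत"  # "Bharat", the Hindi word for India (4 Devanagari code points)

for label, word in [("English", english), ("Hindi", hindi)]:
    print(
        f"{label}: {len(word)} characters, "
        f"{byte_fallback_token_count(word)} byte-level tokens, "
        f"{bytes_per_char(word):.1f} bytes per character"
    )
```

Under that assumption, the Hindi word costs 12 tokens against 5 for the English one, even though it has fewer characters. Real tokenizers merge common byte sequences into single tokens, so actual counts depend on how much Indian-language text shaped the vocabulary.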
Raghavan also points to national security. For sensitive uses, the government needs full “audit visibility” into the data a model was trained on. With a foreign model, that training data usually cannot be inspected. He argues India should aim for “frontier minus one” systems: models with about 1 trillion parameters, just behind the global best. Parameters are the internal settings a model learns; more of them usually means a more capable model, but also one that costs more to train and run.
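To give a rough sense of what 1 trillion parameters means in hardware terms, the sketch below estimates the memory needed just to hold the model’s weights. It is back-of-the-envelope arithmetic, not a figure from the debate. It assumes every parameter is stored at a fixed size (4, 2, or 1 bytes are common choices) and ignores everything else a running model needs, such as working memory during use or training. The function name is hypothetical.

```python
def weight_memory_tb(num_params: float, bytes_per_param: int) -> float:
    """Terabytes (10**12 bytes) needed to store the weights alone."""
    return num_params * bytes_per_param / 1e12


PARAMS = 1e12  # "about 1 trillion parameters", as cited in the debate

# Assumed storage sizes per parameter for common numeric formats.
formats = [("32-bit float", 4), ("16-bit float", 2), ("8-bit integer", 1)]

for name, size in formats:
    print(f"{name}: about {weight_memory_tb(PARAMS, size):.0f} TB of weights")
```

Even at 16 bits per parameter, the weights alone come to about 2 TB, far more than a single accelerator chip holds today, so a model of this size has to be spread across many chips.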
Key facts from the debate
| Data point | Figure |
|---|---|
| Weekly ChatGPT users in India | 100 million-plus |
| Share of ChatGPT activity from users under 30 | Nearly 80% |
| Share of Indian user messages about work tasks | ~35% |
| Languages supported by BHASHINI | 36 Indian text languages |
| Downloads of Sarvam’s Indus chatbot | 1 lakh-plus (100,000+) on Google Play |
| IndiaAI Mission budget | ₹10,372 crore (about ₹103.7 billion) |
| Poisoned documents that can backdoor a model | As few as 250 |
Where the government’s money goes
The government has put money behind the sovereign AI push. It has approved the IndiaAI Mission, with a budget of ₹10,372 crore. Much of that funds startups building India’s own foundation models. Critics ask whether that is the right first priority. They argue more funds could go toward products and adoption, where the gap is wider.
Both sides agree on one risk: silicon. Prakash notes there is “no open-source silicon.” India still depends on foreign chips to run any AI, home-grown or not. Local chip-making is years away. So even a fully Indian model would run on imported hardware for now.
FAQ
What does “sovereign AI” mean here?
It means AI that India controls — the freedom to use, change, and run it on India’s own terms. Experts disagree on whether that requires building models from scratch or adapting free open models.
What is an open-weight model?
It is an AI model whose trained weights (the internal settings it learned) are shared publicly. Anyone can download it for free and adapt it, within the terms of its licence. Llama and Mistral are popular examples. This lets Indian teams build on top without starting from zero.
Why does a tokenizer matter for Indian languages?
A tokenizer splits text into small chunks a model can read. A tokenizer built for Indian languages handles them more efficiently. You cannot fix this well after a model is trained, which is why some argue for building from scratch.
Why it matters for India and its founders
For founders, this is the key takeaway: you do not need to build a giant model to win. The bigger gap is in products. India has huge demand but few breakout AI apps built on existing models. That is the “easy win” space — and it is wide open for new startups.
At the same time, deep-tech founders have a role too. Building strong tokenizers, safe data pipelines, and secure systems for government use is hard and valuable. Both paths can create real businesses.
The takeaway
India does not have to pick just one path. Adapting free open models can deliver quick, useful products today. Building from scratch can solve deeper needs like Indian-language tokenizers and national security. The smart move may be to do both — pluck the easy wins now while investing in the hard parts for later. The demand is clearly there. The challenge is spending time and money where they count most.