Coinbase Switches to Cheaper Chinese AI Models, Halving Its AI Bill
Coinbase is a big US company. People use it to buy and sell cryptocurrency, which is a kind of digital money. Now Coinbase runs on cheap AI made in China. And this change cut its AI bill in half. The boss of Coinbase is Brian Armstrong. (A boss who runs a whole company is called the CEO, short for chief executive officer.) He says the company uses more AI than ever, but pays much less.
A news site called The Decoder reported this. It says many companies are now rushing to use cheap Chinese AI. An AI model is a computer program that has been trained to “think.” It can read and write text, and it can also write computer code. This big switch is hurting Western AI makers like OpenAI and Anthropic. And it is happening just as some of them get ready to sell shares to the public.
What Coinbase changed
Coinbase now uses AI models called GLM 5.2 and Kimi 2.7. Both were made by labs in China. Armstrong said the company now uses more “tokens” than ever. But it pays about half of what it used to pay. A token is a tiny piece of text. It is about one word, or part of a word. AI companies charge you based on how many tokens you use. So more tokens normally means a bigger bill. But Coinbase did the opposite. It used more AI and still paid less.
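Here is a small sketch of how token billing works. It shows how a company can use more tokens and still pay half as much. All the prices and token counts here are made up for the example. They are not Coinbase's real numbers.

```python
def bill_usd(tokens: int, price_per_million_usd: float) -> float:
    """Return the bill for a number of tokens at a flat price per million tokens."""
    return tokens / 1_000_000 * price_per_million_usd


# Hypothetical numbers, chosen only to show the effect.
# Before: 1 billion tokens on a model that costs $10 per million tokens.
old_bill = bill_usd(tokens=1_000_000_000, price_per_million_usd=10.0)

# After: 2.5 billion tokens on a model that costs $2 per million tokens.
new_bill = bill_usd(tokens=2_500_000_000, price_per_million_usd=2.0)

print(old_bill)  # → 10000.0
print(new_bill)  # → 5000.0
```

In this example, usage went up two and a half times. But the price per token fell even faster, so the bill still dropped by half.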
Coinbase workers can still pick any AI model they like. But 91% of them never even reach their old usage limits, Armstrong said. So for most jobs, the cheaper Chinese models work just fine.

The clever part: smart routing and caching
Coinbase also built an automatic “routing” system. Routing means it sends each job to the best AI model. It picks the model based on the task, the price, and one more thing. It checks if an old answer can be used again. Using an old answer again is called caching. Caching means you save a result so you do not have to pay to work it out a second time. Better caching alone lifted Coinbase’s “hit rate” from 5% to 60%. A higher hit rate means more answers come from cheap saved storage. Fewer answers need fresh, costly computing.
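The idea can be shown with a short sketch. This is not Coinbase's real system, which the report does not describe in detail. The model names, prices, and task types below are made up. The sketch also saves whole answers, which is a simple picture of caching. It checks the saved answers first, and only then sends the job to the cheapest model that can do the task.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    name: str                      # hypothetical model name
    price_per_million_usd: float   # hypothetical price
    skills: set[str]               # task types this model can handle


@dataclass
class Router:
    models: list[Model]
    cache: dict[str, str] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def pick(self, task: str) -> Model:
        """Pick the cheapest model that can handle this task type."""
        able = [m for m in self.models if task in m.skills]
        if not able:
            raise ValueError(f"no model can handle task type {task!r}")
        return min(able, key=lambda m: m.price_per_million_usd)

    def ask(self, task: str, prompt: str) -> str:
        key = f"{task}:{prompt}"
        if key in self.cache:
            # Cache hit: reuse the saved answer, no model call needed.
            self.hits += 1
            return self.cache[key]
        # Cache miss: route the job to a model and save the result.
        self.misses += 1
        model = self.pick(task)
        answer = f"[{model.name}] answer to: {prompt}"  # stand-in for a real model call
        self.cache[key] = answer
        return answer


router = Router(models=[
    Model("big-model", 10.0, {"code", "chat"}),
    Model("small-model", 1.0, {"chat"}),
])

router.ask("chat", "hello")        # miss, goes to small-model
router.ask("chat", "hello")        # hit, served from the cache
router.ask("code", "sort a list")  # miss, only big-model can do code

print(router.hits, router.misses)  # → 1 2
```

The hit rate is the number of hits divided by all requests. Here that is 1 out of 3.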
Workers are told to keep their instructions short. They are also told to start a fresh session for each new task. Experts call this habit “context engineering.” It keeps every job small and cheap. All these steps cut Coinbase’s AI spending in half, even though usage kept going up.
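Why does a fresh session help? In a long chat, each new request usually carries the earlier messages along with it. So the same text is paid for again and again. The sketch below shows this with made-up task sizes. It is a simple model only. It leaves out the AI's replies, which would make the long session even bigger.

```python
def tokens_sent_one_long_session(task_sizes: list[int]) -> int:
    """Total tokens sent when every request resends all earlier text."""
    total = 0
    history = 0
    for size in task_sizes:
        history += size   # the session keeps growing
        total += history  # each request sends the whole history so far
    return total


def tokens_sent_fresh_sessions(task_sizes: list[int]) -> int:
    """Total tokens sent when each task starts in a new, empty session."""
    return sum(task_sizes)


# Hypothetical: four tasks, each with a 1,000-token instruction.
tasks = [1000, 1000, 1000, 1000]

print(tokens_sent_one_long_session(tasks))  # → 10000
print(tokens_sent_fresh_sessions(tasks))    # → 4000
```

With these numbers, the long session sends two and a half times as many tokens for the same four tasks.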
Key facts at a glance
| Detail | What the report says |
|---|---|
| Company | Coinbase (US crypto exchange) |
| New models used | Chinese models like GLM 5.2 and Kimi 2.7 |
| AI bill | Cut roughly in half |
| Usage | Higher than ever (more tokens) |
| Cache hit rate | Rose from 5% to 60% |
| Devs hitting old limits | Only 9% (91% never did) |
| Others doing this | Lindy (DeepSeek v4), Snowflake testing Chinese models |
One smart rule on spending
Coinbase shows each worker how much AI they use. But it does not put a limit on it. This fits a trend some people call “tokenmaxxing.” It means using lots and lots of AI. At companies like Amazon and Meta, staff were praised for using a lot of AI. But Coinbase adds one twist. “The more you spend on AI, the more impact we expect,” Armstrong said. In simple words: use all the AI you want, but it must give real results.
A stress test for Western AI labs
Coinbase is not the only one. The CEO of a startup called Lindy just switched to DeepSeek v4. That is another Chinese model. Snowflake, a big data company, is also testing Chinese models to save money. The Decoder calls this a “stress test” for Western labs. A stress test is a hard check. It asks if their big growth numbers still hold up when customers can pay far less somewhere else.
Why does this matter? Some Western labs want to do IPOs soon. An IPO is the first time a company sells its shares to the public. These labs have raised billions of dollars. To prove they are worth that much, they need to earn lots of money. But if customers keep moving to cheaper rivals, those numbers get harder to reach. A price war may be starting too. OpenAI’s GPT-5.6-Sol costs the same as the older GPT-5.5. But it is said to use tokens more efficiently than Anthropic’s Claude Fable and Mythos. OpenAI is also selling two weaker 5.6 versions at much lower prices.
Why it matters (especially for India and founders)
For people who start companies, this is a money lesson, not just a tech one. AI costs can grow very fast as your company grows. But Coinbase showed a smart plan. Route jobs smartly. Cache answers often. Keep instructions short. This plan lets you use more AI while paying less. It works for a tiny startup just as well as for a giant company.
For Indian builders, cheap models make it easier to build AI products. You no longer need the most expensive model to make something useful. But there is a warning. If you depend on foreign models, Chinese or American, your product is tied to their prices and rules. This wider price war is shaking up markets too. You can see this in our look at the Kospi crash and AI-stock fears. And the idea that small, cheap models can compete is exactly what Sina’s tiny VibeThinker-3B is trying to prove.
FAQ
Which AI models is Coinbase using?
Coinbase now mostly uses cheaper Chinese models. These include GLM 5.2 and Kimi 2.7, says CEO Brian Armstrong. But workers can still choose other models if they want.
How did Coinbase cut its AI bill?
It did three things. It switched to cheaper models. It used smart routing to pick the best model for each task. And it improved caching, so its hit rate rose from 5% to 60%. This cut its spending in half, even though it used more AI.
What is a token in AI?
A token is a small piece of text. It is about one word, or part of a word. AI companies charge you by how many tokens you use. So using fewer tokens helps lower your cost.
Why is this a problem for OpenAI and Anthropic?
Customers are moving to cheaper Chinese models. So Western labs feel pressure to lower their prices. This is hard, because some are getting ready for IPOs (selling shares to the public for the first time). They need to earn lots of money to prove they are worth their high value.
The takeaway
Coinbase proved one big point. Smart engineering plus cheaper models can cut AI costs a lot, without slowing you down. More and more companies are doing the same. So Western AI labs now face a real test. They can keep prices high and lose customers. Or they can cut prices and earn less money. For anyone building with AI, one old habit is ending: just picking the most expensive model. Now, saving money on AI is the new way to win.
Source: The Decoder (June 28, 2026).