OpenAI and Broadcom Unveil Jalapeño, a Custom Chip Built for LLM Inference
OpenAI just took a big step into hardware. On June 24, 2026, OpenAI and chipmaker Broadcom unveiled Jalapeño, a custom chip built for LLM inference. A chip is the tiny brain inside a computer. “LLM inference” simply means running an AI model to answer your questions after it has already been trained. OpenAI calls Jalapeño its first “Intelligence Processor.” This is the company’s first try at building its own silicon instead of only renting other people’s.
Why does this matter? Today, almost every AI company buys expensive chips from Nvidia. By making its own chip, OpenAI hopes to cut costs and rely less on one supplier. Let us break down what was announced in plain words.
What is Jalapeño and who built it?
Jalapeño is an ASIC. An ASIC is a chip made for one job only, not for everything. In this case, the one job is running AI models fast and cheaply. OpenAI says it is not a training chip and not a general-purpose chip. It is built purely to serve answers to users.
Three companies share the work, and a fourth fabricates the silicon. OpenAI designed the chip. Broadcom handles the manufacturing side and supplies the networking technology that links many chips together, using its Tomahawk networking chips. Celestica builds the boards, racks, and full systems. The chip itself will be fabricated by TSMC, the world’s largest contract chipmaker, using its advanced 3-nanometer process. The “3-nanometer” name is a label for a manufacturing generation rather than a literal measurement, but the idea holds: the chip’s parts are extremely tiny, which makes the chip faster and more power-efficient.
Inside, Jalapeño uses a “systolic array.” That is a grid of small math units that pass numbers to each other in a steady flow. It suits the repeated multiplication that AI models do all day. Reports describe the chip as “reticle-sized,” meaning it is about as large as a single chip can be printed in one pass of the factory’s lithography equipment.
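To make the “steady flow” idea concrete, here is a minimal Python sketch of a textbook systolic array multiplying two small matrices. It is an illustration only: Jalapeño’s real grid size, data path, and number formats have not been published, and the function name `systolic_matmul` and the output-stationary layout are assumptions made for this example.

```python
def systolic_matmul(a, b):
    """Multiply a (n x k) by b (k x m) on a simulated n x m grid of cells.

    Assumed layout (not Jalapeño's published design): each cell (i, j) keeps
    its own running sum, rows of `a` stream in from the left, and columns of
    `b` stream in from the top.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]    # running sum held in each cell
    a_reg = [[0] * m for _ in range(n)]  # value moving right through each cell
    b_reg = [[0] * m for _ in range(n)]  # value moving down through each cell

    # The last pair of inputs reaches the bottom-right cell at step n + m + k - 3.
    for t in range(n + m + k - 2):
        # Shift phase. Walking from the bottom-right corner means every cell
        # reads the value its neighbour held on the previous step.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j > 0:
                    a_reg[i][j] = a_reg[i][j - 1]
                else:
                    s = t - i  # row i enters the grid i steps late (the skew)
                    a_reg[i][0] = a[i][s] if 0 <= s < k else 0
                if i > 0:
                    b_reg[i][j] = b_reg[i - 1][j]
                else:
                    s = t - j  # column j enters the grid j steps late
                    b_reg[0][j] = b[s][j] if 0 <= s < k else 0
        # Compute phase: every cell does one multiply-accumulate per step.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


result = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(result)  # [[19, 22], [43, 50]]
```

Each cell only ever talks to its left and upper neighbours, so no cell has to fetch data from far-away memory. That locality is the usual reason systolic designs are considered a good fit for matrix-heavy AI workloads.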
How fast was it built, and how cheap is it?
The speed of the project stands out. OpenAI and Broadcom say they went from first design to “tape-out” (the moment the design is finished and sent to the factory) in just nine months. They call this the fastest ASIC development cycle ever for high-performance chips. Interestingly, OpenAI says its own AI models helped speed up the design work.
On cost, Broadcom CEO Hock Tan said early tests point to roughly 50% cheaper inference compared with conventional AI GPUs (the chips most AI runs on today). OpenAI’s own wording is more careful. It says performance per watt is “substantially better than current state-of-the-art.” Performance per watt means how much work you get for each unit of electricity. Higher is better, because power bills are a huge cost for AI.
One honest note: these numbers are self-reported. OpenAI ran the tests on workloads of its own choosing and has not shared the exact comparison chips or independent results. A full technical report is promised but not out yet. So treat the figures as early claims, not proven facts.
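Performance per watt is easy to compute once you have measurements. The sketch below shows the arithmetic for two hypothetical chips. Every number in it is a made-up placeholder, not a Jalapeño or Nvidia figure, and the function names are invented for this example.

```python
def tokens_per_joule(tokens_per_second, watts):
    """Performance per watt. A watt is one joule per second,
    so tokens/second divided by watts gives tokens per joule."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts


def energy_cost_per_million_tokens(tokens_per_second, watts, usd_per_kwh):
    """Electricity cost (USD) to generate one million tokens."""
    joules = 1_000_000 / tokens_per_joule(tokens_per_second, watts)
    kwh = joules / 3_600_000  # 1 kWh = 3.6 million joules
    return kwh * usd_per_kwh


# Hypothetical placeholder chips, NOT real measurements.
baseline = tokens_per_joule(tokens_per_second=10_000, watts=1_000)
candidate = tokens_per_joule(tokens_per_second=12_000, watts=800)

print(baseline)              # 10.0 tokens per joule
print(candidate)             # 15.0 tokens per joule
print(candidate / baseline)  # 1.5, i.e. 50% more work per unit of energy
```

Note that the result depends heavily on which workload produced the tokens-per-second figure, which is why the choice of test workloads matters so much when reading vendor claims.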
Benchmarks & specs
| Detail | Jalapeño (reported) | Compared with |
|---|---|---|
| Inference cost | ~50% cheaper (per Broadcom CEO) | Conventional AI GPUs (e.g. Nvidia) |
| Performance per watt | “Substantially better” (self-reported) | Current state-of-the-art hardware |
| Chip type | Inference-only ASIC | General-purpose / training GPUs |
| Architecture | Systolic array | — |
What it means: if the 50% claim holds, OpenAI could serve the same AI workloads for about half the inference cost. Wait for the technical report before trusting the exact number.
Key facts
| Item | Detail |
|---|---|
| Chip name | Jalapeño (OpenAI’s first “Intelligence Processor”) |
| Announced | June 24, 2026 |
| Designer | OpenAI |
| Manufacturing partner / networking | Broadcom (with Tomahawk networking chips) |
| System integration | Celestica |
| Fabricated by | TSMC, 3 nm process |
| Development time | 9 months (design to tape-out) |
| First deployment | Late 2026, at gigawatt scale |
| Microsoft commitment | 40% of initial chip production |
| Current testing | Engineering samples running GPT-5.3-Codex-Spark |
When will it be used, and at what scale?
OpenAI plans to start deploying Jalapeño in late 2026 at “gigawatt scale.” A gigawatt is a huge amount of power, roughly what several hundred thousand homes draw. Broadcom says it is helping build gigawatt-scale data centers with Microsoft and other partners starting in 2026. Microsoft has committed to taking 40% of the first batch of chips.
Demand looks strong. Hock Tan said his earlier forecast of 1.3 gigawatts of chip deployments next year “may prove conservative,” meaning the real number could be bigger. Right now, OpenAI is running early engineering samples of the chip with a model called GPT-5.3-Codex-Spark. At the launch, Broadcom CEO Hock Tan and President Charlie Kawwas handed the first wafer (the disc the chips are cut from) to OpenAI CEO Sam Altman and President Greg Brockman.
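To get a feel for what “gigawatt scale” could mean in chip counts, here is a rough back-of-the-envelope sketch. OpenAI has not published Jalapeño’s power draw, so the 1,000 watts per chip and the 1.5x overhead for cooling, networking, and host servers are pure placeholders, and the function name is invented for this example.

```python
def accelerators_for_power(facility_watts, watts_per_accelerator, overhead_factor=1.0):
    """Estimate how many accelerators a power budget can feed.

    overhead_factor above 1.0 stands in for cooling, networking, and host
    servers that also draw power. All inputs here are assumptions.
    """
    if watts_per_accelerator <= 0 or overhead_factor <= 0:
        raise ValueError("inputs must be positive")
    return int(facility_watts // (watts_per_accelerator * overhead_factor))


ONE_GIGAWATT = 1_000_000_000  # watts

# Placeholder inputs: 1,000 W per chip and 1.5x overhead are NOT published figures.
print(accelerators_for_power(ONE_GIGAWATT, 1_000, 1.5))   # 666666
print(accelerators_for_power(1_300_000_000, 1_000, 1.5))  # 866666
```

Under these placeholder inputs, one gigawatt works out to several hundred thousand chips. The real count could differ a lot once actual power figures are known.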
Why it matters (especially for India / founders)
For founders everywhere, the cost of running AI is a make-or-break number. If your app calls an AI model thousands of times a day, your bill grows fast. Cheaper inference chips could push down the price you pay to OpenAI over time. That makes AI features more affordable to add to your product.
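A quick way to see how the bill grows is to model it. The sketch below is an illustration with invented numbers: the call volume, token counts, and price per million tokens are placeholders, not OpenAI’s actual pricing. Nothing in the announcement says how much of any chip-level saving will reach customers, so `pass_through` is an explicit assumption.

```python
def monthly_inference_bill(calls_per_day, tokens_per_call, usd_per_million_tokens, days=30):
    """Estimated monthly API bill in USD for a steady workload."""
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


def bill_after_price_cut(bill, chip_saving, pass_through):
    """Bill if a fraction of the hardware saving reaches customers.

    chip_saving:  fraction cheaper at the chip level (0.5 for the claimed 50%).
    pass_through: assumed share of that saving passed on in prices (0 to 1).
    """
    return bill * (1 - chip_saving * pass_through)


# Placeholder workload: 5,000 calls a day, 2,000 tokens each, $2 per million tokens.
bill = monthly_inference_bill(5_000, 2_000, 2.0)
print(bill)                                  # 600.0
print(bill_after_price_cut(bill, 0.5, 0.5))  # 450.0 if half the saving is passed on
```

Because the bill scales linearly with calls and tokens, even a partial price cut compounds into real money for a product with heavy usage.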
For India, this is also a signal. The race for custom AI chips shows that owning your own hardware is becoming a real advantage. Indian startups and policymakers watching the “AI compute” story should note how big players are cutting reliance on one chip vendor. The lesson is simple: in AI, controlling your costs and your supply chain matters as much as a clever model.
FAQ
What does Jalapeño actually do?
It runs AI models to give answers to users. That step is called inference. It does not train new models; it just serves the ones already built.
Is Jalapeño faster than Nvidia chips?
The claims so far are about efficiency and cost rather than raw speed. OpenAI claims better performance per watt, and Broadcom claims about 50% cheaper inference. Both are self-reported. No independent test or full technical report is out yet, so the claims are not proven.
When can people use it?
OpenAI plans to deploy it in late 2026. Microsoft has agreed to take 40% of the first chips, and early samples are already running an OpenAI model.
Why is OpenAI making its own chip?
To cut costs and depend less on Nvidia. A custom chip built only for inference can be cheaper and more power-efficient than a general-purpose GPU.
The takeaway
Jalapeño marks OpenAI’s first move from AI software into custom AI hardware. The pitch is cheaper, more efficient inference, built fast with Broadcom, TSMC, and Celestica. The cost and power claims look exciting, but they are early and self-reported. The real test comes with the technical report and the late-2026 rollout. For founders and watchers in India, it is a clear sign that the next AI battle is being fought over chips and cost, not just over models.
Sources
- The Decoder — OpenAI and Broadcom unveil ‘Jalapeño,’ a custom chip built for LLM inference
- OpenAI — OpenAI and Broadcom unveil LLM-optimized inference chip
- Broadcom — OpenAI and Broadcom Unveil LLM-Optimized Intelligence Processor
- Tom’s Hardware — Broadcom and OpenAI unveil custom-built Jalapeño inference processor