In a monumental shift toward hardware independence, OpenAI has officially unveiled its first custom-built processor, named Jalapeño.

Co-developed in a tight partnership with Broadcom, the hardware represents OpenAI’s first tangible step away from its near-total reliance on Nvidia’s off-the-shelf graphics processing units (GPUs). Officially announced on Wednesday, June 24, 2026, the chip marks OpenAI’s transition from a pure software lab into a full-stack AI infrastructure provider.

1. Built for “Inference at Scale”

While training advanced models grabs the public headlines, serving them to millions of daily users creates an incredibly expensive infrastructure bill. Jalapeño is architected as a specialized Application-Specific Integrated Circuit (ASIC) built to tackle that exact bottleneck:

  • A Pure Inference Engine: Jalapeño is a “blank-slate” design engineered solely to generate responses from already-trained models. It omits the general-purpose components required for training, so it can focus entirely on running ChatGPT, Codex, OpenAI’s APIs, and future autonomous agents.
  • Wringing Out Efficiency: OpenAI hardware chief Richard Ho stated that the chip is optimized specifically around kernels, memory loops, and networking patterns fundamental to large language models. Early lab tests show it operating at production target frequency with a performance-per-watt profile that cuts inference costs by an estimated 50%.
  • The Physical Footprint: The processor is a massive, reticle-sized chip (roughly 840 mm²) packed with six High Bandwidth Memory (HBM) modules, allowing it to combine high data throughput with low latency.

[Traditional GPUs] ──► General Purpose (Built for both Training & Inference) ──► High Cost / High Power
[Jalapeño ASIC]    ──► Blank-Slate Design (Optimized solely for LLM Inference) ──► Est. 50% Cost Cut / Higher Efficiency

2. The Blistering Nine-Month Tape-Out

Standard development cycles for cutting-edge semiconductors typically take years of iterative testing. Jalapeño moved from initial blueprints to factory manufacturing readiness (tape-out) in just nine months—one of the fastest turnarounds recorded for advanced silicon.

To pull off this sprint, engineers set up an unusual feedback loop: they used OpenAI’s existing AI models to accelerate and optimize parts of the chip’s design process. The very models that consumers interact with daily helped lay out the infrastructure required to run their successors.

3. Joining the Silicon Arms Race

By deploying its own silicon, OpenAI joins a well-established club of tech giants that have built custom hardware to reduce their reliance on third-party suppliers:

Company      | Proprietary AI Silicon Arm        | Core Deployment Mandate
-------------+-----------------------------------+------------------------------------------------------------------------
Google       | TPU (Tensor Processing Units)     | Powering the Gemini ecosystem; currently on its 8th generation.
Amazon (AWS) | Trainium & Inferentia             | High-capacity cloud instances; heavily consumed by partners like Anthropic.
Microsoft    | Maia                              | Core infrastructure driving Azure-based AI workloads.
OpenAI       | Jalapeño (Intelligence Processor) | Powering ChatGPT and agentic pipelines; scaling to 10 gigawatts by 2029.

4. Rollout Strategy and the Nvidia Impact

While engineering samples are already running workloads in the lab, specifically powering internal versions of GPT-5.3-Codex-Spark, the broader rollout will move in phases.

Broadcom and its system assembly partner, Celestica, are currently manufacturing the first server racks, with gigawatt-scale data center deployment alongside Microsoft slated to begin by the end of 2026. Volume production and scaling will stretch into 2027.

Crucially, OpenAI noted that Jalapeño is flexible enough to run non-OpenAI models, leaving the door open for the company to eventually sell or lease the hardware to third-party enterprises.