MML Studio

OpenAI buys 750MW compute from Cerebras in $10B+ deal

Read the full article: OpenAI-Cerebras $10B Compute Deal on MML Studio

What Happened

OpenAI signed a multi-year agreement with Cerebras Systems to purchase up to 750 megawatts of computing capacity, in a deal valued at over $10 billion over three years. The contract marks a deliberate move by OpenAI to reduce its dependence on NVIDIA for inference infrastructure at scale. Cerebras builds wafer-scale processors that consolidate compute onto a single wafer-sized piece of silicon, potentially reducing the inter-chip communication overhead inherent in traditional GPU clusters.

Our Take

750MW. That's not a compute purchase, that's a power grid commitment. OpenAI is essentially becoming a utility customer at nation-state scale.

Here's the thing about Cerebras — they're not building better GPUs, they're building one enormous chip: the WSE-3 spans nearly an entire 300mm silicon wafer, roughly 46,000mm², which sidesteps the interconnect bottleneck entirely. NVIDIA's H100 clusters spend ungodly amounts of time shuffling data between chips. Cerebras just... doesn't have that problem.

This tells me two things. First, OpenAI's NVIDIA bills are clearly astronomical enough to justify this level of supplier diversification. Second, wafer-scale inference might actually be production-ready — which is the bigger story, honestly.

For a small team like ours? If Cerebras delivers on throughput-per-watt (they've been quietly solid on benchmarks), we could see per-token pricing fall another 60-70% over 18 months. That changes the viability math on a lot of things we've been told 'aren't economical yet.'
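To make that viability math concrete, here's a back-of-envelope sketch. The token volume is hypothetical and the 65% drop is the midpoint of our own 60-70% projection above, not a quoted price:

```python
# Back-of-envelope monthly inference budget, before and after a projected
# price drop. VOLUME is a made-up example figure, not our actual usage.
def monthly_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Dollars per month at a given $/M-token rate."""
    return tokens_per_month / 1_000_000 * price_per_million

VOLUME = 5_000_000_000                         # hypothetical: 5B tokens/month
CURRENT_PRICE = 2.70                           # $/M tokens today (comparable endpoint)
PROJECTED_PRICE = CURRENT_PRICE * (1 - 0.65)   # assumed 65% drop over ~18 months

now = monthly_cost(VOLUME, CURRENT_PRICE)      # $13,500/mo
later = monthly_cost(VOLUME, PROJECTED_PRICE)  # $4,725/mo
print(f"now=${now:,.0f}/mo  projected=${later:,.0f}/mo")
```

A feature that burns $13.5K/month today costing under $5K is the difference between "not economical" and "ship it."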

Stop designing for today's inference costs. They're a temporary ceiling, not a permanent constraint.

What To Do

Test Cerebras Cloud's public API (cloud.cerebras.ai) on your highest-volume inference calls this week — Llama 3.1 70B runs at $0.85/M tokens there vs. $2.70/M on comparable OpenAI endpoints.
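A minimal sketch of what that test looks like, assuming Cerebras Cloud exposes an OpenAI-compatible chat-completions endpoint — verify the exact URL and model id (`llama3.1-70b` here is an assumption) against their current docs before relying on this:

```python
# Sketch: pointing a standard OpenAI-style chat-completions request at
# Cerebras Cloud. Endpoint URL and model id are assumptions to verify.
import json
import os
import urllib.request

CEREBRAS_URL = "https://api.cerebras.ai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str, model: str = "llama3.1-70b") -> dict:
    """Standard OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

payload = build_request("Summarize our top support ticket categories.")

# Only fire the real request when a key is configured in the environment.
api_key = os.environ.get("CEREBRAS_API_KEY")
if api_key:
    req = urllib.request.Request(
        CEREBRAS_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the payload shape matches the OpenAI format, swapping a high-volume call over for an A/B test should mostly be a base-URL and model-name change.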
