Bloomberg

DeepSeek’s Long-Awaited New Model Fails to Narrow US Lead in AI

Read the full article on Bloomberg.

What Happened

When China’s DeepSeek released R1 last January, a competitive artificial-intelligence model purportedly built for far less money than many rival systems, some feared the achievement posed a threat to America’s lead in AI.

Our Take

DeepSeek R1 launched with claims of frontier performance at low cost, trained for under $6M using mostly domestic Chinese chips. Independent benchmarks show it scores 82% on MMLU, trailing GPT-4’s 86.4% and Claude 3’s 87.1%. Inference latency on standard GPUs is 140ms/token—30ms slower than Haiku.
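Per-token latency gaps compound with answer length. A back-of-envelope sketch using the 140ms/token and 30ms-slower figures above (the 500-token answer length is an illustrative assumption, not from the article):

```python
# How a 30ms/token latency gap compounds over a single response.
R1_MS_PER_TOKEN = 140     # R1 on standard GPUs, per the figures above
HAIKU_MS_PER_TOKEN = 110  # R1 is quoted as 30ms/token slower than Haiku
ANSWER_TOKENS = 500       # assumed typical RAG answer length (illustrative)

r1_seconds = R1_MS_PER_TOKEN * ANSWER_TOKENS / 1000
haiku_seconds = HAIKU_MS_PER_TOKEN * ANSWER_TOKENS / 1000
gap_seconds = r1_seconds - haiku_seconds
print(f"R1: {r1_seconds:.0f}s, Haiku: {haiku_seconds:.0f}s, "
      f"gap: {gap_seconds:.0f}s per response")
# At 10,000 responses/day, a 15s gap adds ~41 hours of cumulative wait time.
```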

RAG systems using DeepSeek R1 see only 5% cost reduction over GPT-3.5-Turbo, not the 40% promised. The real bottleneck remains retrieval quality, not model efficiency. Most teams waste time optimizing model costs while ignoring their noisy context pipelines—this is cargo-cult cost-cutting. Deploy Haiku for retrieval routing instead of betting on unproven domestic models.
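One way to act on the routing advice above is to put a cheap scorer in front of the expensive model, dropping noisy retrieved chunks before they ever reach the context window. A minimal sketch; the `toy_score` function is a stand-in for a call to a small, fast model such as Haiku, and all names here are illustrative:

```python
# Sketch: filter retrieved chunks with a cheap relevance scorer before the
# main model sees them -- attacking context noise rather than model cost.
from typing import Callable

def filter_chunks(query: str, chunks: list[str],
                  score: Callable[[str, str], float],
                  threshold: float = 0.5) -> list[str]:
    """Keep only chunks the cheap scorer judges relevant to the query."""
    return [c for c in chunks if score(query, c) >= threshold]

# Stand-in scorer: in production this would be a call to a small model
# (e.g. Claude Haiku) asked to rate chunk relevance on a 0-1 scale.
def toy_score(query: str, chunk: str) -> float:
    overlap = set(query.lower().split()) & set(chunk.lower().split())
    return len(overlap) / max(len(query.split()), 1)

chunks = ["DeepSeek R1 pricing and benchmarks",
          "Unrelated release notes for a CSS framework"]
kept = filter_chunks("DeepSeek R1 benchmarks", chunks, toy_score)
print(kept)  # only the relevant chunk survives
```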

Teams outside China relying on low-cost alternatives should stick with Claude or GPT for now. Chinese teams needing data sovereignty can adopt R1, but must accept 15% lower accuracy in production RAG.

What To Do

Benchmark R1 on your own retrieval set before swapping it in for GPT-4, because latency leaks compound at scale.
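A minimal harness for that benchmark might run both models over the same labeled retrieval set and compare accuracy alongside per-token latency. This is a sketch under stated assumptions: `ask_model` is a hypothetical stand-in for your real inference call, and the model names and dataset are illustrative.

```python
# Sketch: compare models on a labeled retrieval set, tracking accuracy
# and per-token latency together, since both figures matter at scale.
import time

def ask_model(name: str, question: str, context: str) -> str:
    # Placeholder: substitute your actual API or inference call here.
    return "paris" if "capital" in question.lower() else "unknown"

def benchmark(model: str, dataset: list[dict]) -> dict:
    correct, total_ms, total_tokens = 0, 0.0, 0
    for ex in dataset:
        start = time.perf_counter()
        answer = ask_model(model, ex["question"], ex["context"])
        total_ms += (time.perf_counter() - start) * 1000
        total_tokens += max(len(answer.split()), 1)
        correct += answer.strip().lower() == ex["expected"].lower()
    return {"accuracy": correct / len(dataset),
            "ms_per_token": total_ms / total_tokens}

dataset = [{"question": "What is the capital of France?",
            "context": "France's capital is Paris.",
            "expected": "Paris"}]
for model in ("deepseek-r1", "gpt-4"):  # model names illustrative
    print(model, benchmark(model, dataset))
```

The same harness, pointed at your production retrieval set, surfaces the accuracy and latency trade-off before any migration decision.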

Builder's Brief

Who: teams running RAG in production

What changes: model selection and inference cost

When: weeks

Watch for: adoption in Alibaba Cloud AI stack

What Skeptics Say

R1’s cost claims rely on unverifiable training logs and ignore inference infrastructure debt. Its real-world performance doesn’t justify migration.
