MIT Tech Review

Three reasons why DeepSeek’s new model V4 matters


What Happened

On Friday, Chinese AI firm DeepSeek released a preview of V4, its long-awaited new flagship model. Notably, the model can process much longer prompts than the previous generation, thanks to a new design that lets it handle large amounts of text more efficiently. Like DeepSeek’s previous models, V4 is open-weight.

Our Take

DeepSeek V4 handles 128k context by default, doubling the effective input length of V3, and shows measurable gains in code and math tasks at comparable FLOPs. The model uses grouped-query attention and a revamped tokenizer, reducing memory overhead during long-context inference.
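Grouped-query attention is the key to that reduced memory overhead: several query heads share a single key/value head, so the KV cache that dominates memory at long context shrinks proportionally. Here is a minimal NumPy sketch of the mechanism; the head counts and dimensions are illustrative and not V4's actual configuration.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d), with fewer K/V heads."""
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads  # query heads sharing each K/V head
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                               # shared K/V head index
        scores = q[h] @ k[kv].T / np.sqrt(d)          # (seq, seq)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)            # softmax over keys
        out[h] = w @ v[kv]
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16, 32))   # 8 query heads
k = rng.normal(size=(2, 16, 32))   # only 2 K/V heads: 4x smaller KV cache
v = rng.normal(size=(2, 16, 32))
print(grouped_query_attention(q, k, v).shape)
```

With 8 query heads and 2 K/V heads, the cache stores a quarter of the key/value tensors that standard multi-head attention would, which is what makes 128k-token inference tractable.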

Long-context models now clear a practical threshold: at roughly 60% lower cost per 100K tokens, V4 can replace RAG pipelines built on Haiku or GPT-4o, provided you skip retrieval entirely and inject the full document as context. Most teams still default to retrieval-augmented generation for long documents; that extra retrieval step is now often redundant and slower.
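A back-of-envelope check of that economics claim: the per-100K-token prices below are placeholders, not published rates, but the 60% discount is the figure cited above.

```python
# Assumed incumbent price, $ per 100K tokens (placeholder, not a real rate).
BASELINE_PER_100K = 1.00
# 60% lower, per the claim above.
V4_PER_100K = BASELINE_PER_100K * (1 - 0.60)

def cost(tokens, per_100k):
    """Dollar cost of processing `tokens` at a given per-100K-token price."""
    return tokens / 100_000 * per_100k

doc_tokens = 120_000                 # one long document, single V4 pass
rag_tokens = 5 * 4_000 * 6           # RAG: 5 retrieved 4K chunks over 6 calls

print(cost(doc_tokens, V4_PER_100K))        # single-pass V4
print(cost(rag_tokens, BASELINE_PER_100K))  # chunked baseline
```

Under these illustrative numbers the single full-context pass is cheaper than the chunked pipeline even though it processes a comparable token volume, because every token is billed at the discounted rate.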

Teams building document-intensive agents on Claude or GPT-4 should test V4 in their stack now. If you’re on a tight latency budget or working below 32k of context, keep using Haiku.

What To Do

Migrate high-context workflows to V4 instead of chaining chunks through GPT-4: 128k of context at lower cost beats fragmented retrieval.
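The recommendation distills to a simple routing rule. This is an illustrative sketch, not an API: the model names are labels, and the 32k cutoff mirrors the threshold suggested above.

```python
def pick_model(context_tokens: int, latency_sensitive: bool) -> str:
    """Route a request: small or latency-sensitive calls stay on Haiku,
    high-context work goes to V4 in a single 128k-context pass."""
    if latency_sensitive or context_tokens < 32_000:
        return "haiku"
    return "deepseek-v4"

print(pick_model(8_000, False))    # small context -> haiku
print(pick_model(90_000, False))   # high context  -> deepseek-v4
print(pick_model(90_000, True))    # tight latency -> haiku
```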

Builder's Brief

Who: Teams running RAG in production.

What changes: Inference cost and document-processing workflow.

When: Now.

Watch for: Adoption of single-pass 128k context in enterprise document platforms.

What Skeptics Say

V4’s gains rely on prompt engineering tricks and synthetic data; real-world accuracy on niche domains still lags behind GPT-4.
