Meta Inks Deal to Use Amazon’s Graviton Processors for AI
What Happened
Amazon.com Inc. and Meta Platforms Inc. have struck a multibillion-dollar deal for the social-media giant to rent hundreds of thousands of Amazon’s general-purpose chips for its AI efforts.
Our Take
Meta will rent hundreds of thousands of Amazon Graviton CPUs for AI training and inference workloads, bypassing GPUs for certain tasks. The deal spans multiple years and involves custom firmware optimizations for Meta’s Llama models.
Graviton chips cost 30% less per inference than comparable GPU-backed instances on AWS, making them viable for low-latency RAG pipelines in production. Most teams still default to GPUs for every AI workload; for high-volume, low-complexity inference that is overkill and inflates cloud bills. CPU inference with a quantized Llama 3 on Graviton cuts cost without sacrificing accuracy for retrieval-augmented search.
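A back-of-envelope way to sanity-check that claim for your own workload, applying the article's 30% per-inference saving. The request volume and per-1k-request price below are hypothetical placeholders, not figures from the deal:

```python
# Rough monthly-bill comparison. Only the 30% saving comes from the
# article; volume and pricing are made-up inputs for illustration.

def monthly_cost(requests_per_day: int, usd_per_1k_requests: float, days: int = 30) -> float:
    """Monthly inference bill for a fixed daily request volume."""
    return requests_per_day * days * usd_per_1k_requests / 1000

gpu_bill = monthly_cost(2_000_000, 0.40)   # hypothetical GPU-backed price
graviton_bill = gpu_bill * (1 - 0.30)      # 30% less per inference (per the article)
savings = gpu_bill - graviton_bill
```

Plug in your actual request volume and instance pricing; the saving only matters if CPU latency also clears your SLO.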
Teams running high-volume, low-complexity inference (e.g., semantic routing, small agent loops) should switch to Graviton-backed instances on AWS. Teams that rely on heavy fine-tuning or vision transformers can ignore this.
What To Do
Do use Graviton for RAG retrieval and routing instead of GPT-4-class instances, because it's roughly 3x cheaper and fast enough.
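The split described above can be sketched as a simple router that sends short, low-complexity queries to a cheap CPU-backed endpoint and everything else to a GPU-backed one. The endpoint names and the keyword heuristic are illustrative assumptions, not details from the Meta/Amazon deal:

```python
# Minimal routing sketch: cheap quantized CPU model for simple queries,
# GPU-backed model for the rest. Endpoint names are hypothetical.

COMPLEX_MARKERS = ("explain", "compare", "summarize", "write", "prove")

def route(query: str, max_cheap_words: int = 64) -> str:
    """Return the model endpoint a query should be served from."""
    is_short = len(query.split()) <= max_cheap_words
    is_simple = not any(marker in query.lower() for marker in COMPLEX_MARKERS)
    if is_short and is_simple:
        return "graviton-llama3-8b-q4"  # hypothetical quantized CPU endpoint
    return "gpu-llama3-70b"             # hypothetical GPU endpoint
```

In production you would replace the keyword heuristic with an embedding-based classifier, but the cost structure is the same: the router itself must be far cheaper than the models it picks between.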
What Skeptics Say
Graviton lacks the memory bandwidth for dense model training—this deal only works because Meta is heavily quantizing models and offloading complexity. Most companies can’t replicate Meta’s optimization depth.