Hugging Face · Apr 18, 2024

Welcome Llama 3 - Meta’s new open LLM

What Happened

Fordel's Take

Llama-3-70B-Instruct is now free to download and outscores GPT-3.5 on MMLU at one-tenth the per-token cost on Together’s API ($0.0009 vs $0.009).

Most teams still default to GPT-3.5-Turbo for eval pipelines and bulk classification jobs, even though Llama-3-70B runs at 120 tok/s on a single A100 and gives you full weight access. Running Opus for simple classification is just burning money.

If you’re a seed-stage startup burning $2k/mo on OpenAI for RAG reranking, switch now; enterprises locked into Azure contracts can ignore this until the SLA gap closes.
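To make the savings concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the whole bill scales linearly with the per-token price and that token volume stays the same after the switch, neither of which is guaranteed in practice. The function name is hypothetical, and because the text does not state the unit behind the two prices, only their ratio is used.

```python
# Prices quoted in the text. The unit is not stated there, so only the ratio matters here.
LLAMA_3_70B_PRICE = 0.0009
GPT_35_TURBO_PRICE = 0.009


def projected_monthly_spend(current_spend: float,
                            old_price: float = GPT_35_TURBO_PRICE,
                            new_price: float = LLAMA_3_70B_PRICE) -> float:
    """Scale a monthly bill by the price ratio (assumes unchanged token volume)."""
    return current_spend * (new_price / old_price)


if __name__ == "__main__":
    current = 2000.0  # the $2k/mo example from the text
    projected = projected_monthly_spend(current)
    print(f"projected: ${projected:,.2f}/mo, saved: ${current - projected:,.2f}/mo")
```

Real bills rarely move in lockstep with list prices, since prompts often need retuning and token counts differ between tokenizers, so treat the result as an upper bound on savings.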

What To Do

Swap GPT-3.5-Turbo calls for Llama-3-70B on Together endpoints: the 10× price cut leaves your eval budget intact, and p99 latency drops under 600 ms.

Cited By

Hugging Face: Welcome Llama 3 - Meta’s new open LLM
