Welcome Llama 3 - Meta’s new open LLM
What Happened
Welcome Llama 3 - Meta’s new open LLM
Fordel's Take
Llama-3-70B-Instruct is now free to download and outscores GPT-3.5 on MMLU at one-tenth the per-token cost on Together’s API ($0.0009 vs $0.009).
Most teams still default to GPT-3.5-Turbo for eval pipelines and bulk classification jobs even though Llama-3-70B runs at 120 tok/s on a single A100 and gives you full weight access. Running Opus for simple classification is just burning money.
If you’re a seed-stage startup burning $2k/mo on OpenAI for RAG reranking, switch now; enterprises locked into Azure contracts can ignore until the SLA gap closes.
What To Do
Swap GPT-3.5-Turbo calls for Llama-3-70B on Together endpoints because the 10× price cut leaves your eval budget intact and drops p99 latency under 600 ms
Cited By
React
Get the weekly AI digest
The stories that matter, with a builder's perspective. Every Thursday.