TechCrunch

Google’s new Gemini Pro model has record benchmark scores — again

Read the full article on TechCrunch

What Happened

Gemini 3.1 Pro promises a Google LLM capable of handling more complex forms of work.

Our Take

Benchmarks are a trap. Google's already best-in-class on evals. Clients care about (1) cost per token, (2) latency, (3) reliability, and (4) whether the model works for their specific use case. Gemini 3.1 Pro being "better" on benchmarks means almost nothing unless it's also cheaper or faster.

The real story is that Google is throwing everything at LLM performance while its pricing stays confusing. They're winning on paper, losing in the sales call.

What To Do

Run your own benchmarks on YOUR use case before switching. Published evals don't reliably predict how a model performs on your workload.
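
One way to run such a benchmark is sketched below: a minimal harness that measures the things clients care about (cost, latency, reliability, and task fit) on your own prompts. Everything in it is an assumption for illustration: `ModelFn`, `Case`, `Report`, `run_benchmark`, and `stub_model` are hypothetical names, the per-1k-token prices are placeholders you would replace with your provider's actual rates, and `stub_model` stands in for a real API client.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical interface: a model call returns (text, input_tokens, output_tokens).
ModelFn = Callable[[str], Tuple[str, int, int]]


@dataclass
class Case:
    prompt: str
    check: Callable[[str], bool]  # your own pass/fail rule for this task


@dataclass
class Report:
    pass_rate: float         # share of cases whose output passed its check
    error_rate: float        # share of cases where the call raised
    median_latency_s: float  # median wall-clock time of successful calls
    cost_usd: float          # total cost at the prices you supplied


def run_benchmark(model: ModelFn, cases: List[Case],
                  usd_per_1k_in: float, usd_per_1k_out: float) -> Report:
    if not cases:
        raise ValueError("need at least one test case")
    passed = errors = 0
    latencies: List[float] = []
    cost = 0.0
    for case in cases:
        start = time.perf_counter()
        try:
            text, tok_in, tok_out = model(case.prompt)
        except Exception:
            errors += 1  # count failures as a reliability signal
            continue
        latencies.append(time.perf_counter() - start)
        cost += tok_in / 1000 * usd_per_1k_in + tok_out / 1000 * usd_per_1k_out
        if case.check(text):
            passed += 1
    n = len(cases)
    return Report(
        pass_rate=passed / n,
        error_rate=errors / n,
        median_latency_s=statistics.median(latencies) if latencies else float("nan"),
        cost_usd=cost,
    )


def stub_model(prompt: str) -> Tuple[str, int, int]:
    # Stand-in for a real client; swap in your provider's SDK call here.
    if "fail" in prompt:
        raise RuntimeError("simulated outage")
    return prompt.upper(), len(prompt.split()), 5


cases = [
    Case("hello world", lambda t: t == "HELLO WORLD"),
    Case("please fail", lambda t: True),
    Case("abc", lambda t: t == "nope"),
]
# Placeholder prices, not any vendor's real rates.
report = run_benchmark(stub_model, cases, usd_per_1k_in=1.0, usd_per_1k_out=2.0)
print(report)
```

Running the same cases against each candidate model, with one `ModelFn` wrapper per provider, gives a side-by-side comparison on the four criteria above instead of on published evals.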
