Google releases Gemma 4 under Apache 2.0
What Happened
Google released Gemma 4 on April 2, 2026 under the Apache 2.0 license with no commercial restrictions. The model shares its research base with Gemini 3.1 Pro and runs on a single 80GB H100 GPU, delivering performance comparable to models roughly 20 times its size. It is the most permissively licensed Gemini-class model released to date.
Our Take
Single H100. That's what catches my attention — not the Apache license, not the benchmark charts Google will inevitably wave around. One 80GB GPU running Gemini-class inference.
We've been paying cloud API bills for this quality level for years. Gemini Pro isn't cheap at scale, and now here's the same underlying research baked into something you can self-host. No commercial restrictions means no legal gymnastics for product use.
Look, every 'open' model release comes with an asterisk: Llama 2's usage restrictions, 'research only' fine print, someone always buries something. Apache 2.0 is genuinely clean (not MIT-level minimal, but clean enough, with an explicit patent grant besides). Fine-tune it, wrap it in your own UI, resell it. Nobody's stopping you.
Honestly, this kills the API-only argument for lower-volume internal tools. If you've got dedicated GPU access, the cost math flips completely. We're already benchmarking this against one client project where the API spend was getting uncomfortable.
There's a catch: you still need the H100. That's roughly $2.49/hr on Lambda Labs. But if you're running constant inference load, break-even against Gemini API costs comes sooner than you'd expect.
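That break-even claim is easy to check yourself. A back-of-envelope sketch, using the $2.49/hr rate above; the $300/month API bill is a hypothetical placeholder, plug in your own numbers:

```python
H100_HOURLY = 2.49         # Lambda Labs on-demand rate cited above
API_SPEND_MONTHLY = 300.0  # hypothetical current Gemini API bill

def self_host_monthly(hours_per_day: float, hourly_rate: float = H100_HOURLY) -> float:
    """Monthly GPU cost for one rented H100, assuming a 30-day month."""
    return hours_per_day * 30 * hourly_rate

def break_even_hours_per_day(api_monthly: float, hourly_rate: float = H100_HOURLY) -> float:
    """Daily GPU hours at which self-hosting costs match the API bill."""
    return api_monthly / (30 * hourly_rate)

if __name__ == "__main__":
    print(f"24/7 H100: ${self_host_monthly(24):,.2f}/month")
    print(f"Break-even vs ${API_SPEND_MONTHLY:,.0f}/mo API: "
          f"{break_even_hours_per_day(API_SPEND_MONTHLY):.1f} GPU-hours/day")
```

Running the numbers: a 24/7 rented H100 is about $1,793/month, so a $300/month API bill only justifies roughly four GPU-hours a day of batched inference. The math flips hard in favor of self-hosting once you have steady load or already-paid-for hardware, not before.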
What To Do
Spin up a Lambda Labs H100 instance (~$2.49/hr), run Gemma 4, and benchmark it against your current Gemini API spend — if you're over $300/month on inference, run the break-even math before your next billing cycle.