Google's Gemma 4 Makes Frontier AI Free and Open — Here's What That Means for Your Business

On April 2, Google DeepMind released Gemma 4 — a family of four AI models that represent, byte for byte, the most capable open models available. The 31-billion-parameter flagship currently ranks #3 on the Arena AI text leaderboard, outperforming models twenty times its size. The entire family ships under an Apache 2.0 licence — meaning any business can use, modify, and deploy them commercially, with zero licensing fees and no usage restrictions.

This matters because it shifts the fundamental economics of AI adoption. The question for business owners is no longer "which AI subscription should we pay for?" It's increasingly "do we need to pay at all?"

What Google actually released

Gemma 4 arrives in four sizes, each targeting different hardware and use cases. Clément Farabet, VP of Research at Google DeepMind, and Olivier Lacombe, Group Product Manager, described the release as purpose-built for "advanced reasoning and agentic workflows."

The 31B Dense model is the quality leader — scoring 85.2% on MMMLU (a broad knowledge benchmark), 89.2% on AIME 2026 (a competitive mathematics test), and 80.0% on LiveCodeBench for code generation. It runs unquantised on a single 80GB NVIDIA H100. More relevantly for most businesses: at 4-bit precision, it fits on a consumer RTX 4090 GPU — a $2,500 graphics card.

The 26B Mixture of Experts (MoE) model activates only 3.8 billion of its parameters during inference, delivering faster responses at slightly lower quality. The E4B and E2B edge models run on smartphones, Raspberry Pis, and IoT devices with under 1.5GB of RAM — bringing genuine AI capability to hardware that costs less than a month of ChatGPT Plus.

All models support 140+ languages, native function calling for agentic workflows, and context windows up to 256,000 tokens. Since the first Gemma generation launched, developers have downloaded Gemma over 400 million times, building more than 100,000 variants in what Google calls the "Gemmaverse."

The open-source arms race

Gemma 4 doesn't exist in isolation. April 2026 is what Digital Applied called "the most competitive moment in open-source AI history."

Meta released Llama 4 Scout around the same time — a 109-billion-parameter model with a staggering 10-million-token context window. Chinese labs including Moonshot AI, Alibaba, and Z.AI have shipped open-weight models that, as The Register noted, "now rival OpenAI's GPT-5 or Anthropic's Claude." Mistral continues to punch above its weight from France.

The critical difference is licensing. Meta's Llama uses a custom licence that restricts commercial use above 700 million monthly active users and retains certain controls. Google's move to Apache 2.0 is a deliberate statement: no restrictions, no termination clauses, no rug-pulling. As The Register put it, enterprises can now "deploy the models without fear of Google pulling the rug out from under them."

For Google, the strategy is clear. Gemma models aren't competing with Google's own proprietary Gemini — they're competing with everyone else's open models. By making Gemma the most permissively licensed frontier-class open model, Google positions its cloud infrastructure (Vertex AI, Cloud Run, GKE) as the natural place to scale when a business outgrows local deployment.

Why this matters for your business

The practical implications are significant, especially for the 10-50 person companies that make up our readership.

The cost equation has flipped. A business paying $20 per user per month for ChatGPT Plus or Claude Pro spends $1,200 per year for just five users. A self-hosted model running on local hardware costs nothing per query after the initial setup. One developer documented migrating from $3/day in cloud API costs to $0/day using a local 32B open-source model — which is now essentially what Gemma 4's 31B model offers.

The performance gap has closed. Open-source models now deliver approximately 90% of closed-model performance on most knowledge benchmarks, according to analysis from WorthvieW. For the majority of business tasks — drafting emails, summarising documents, answering questions about internal data, generating first-draft content — that gap is functionally irrelevant.

Data stays on your machine. This matters enormously for Australian businesses facing the Privacy Act's automated decision-making requirements coming in December 2026. A locally hosted model processes customer data without it ever leaving your network. No third-party data processing agreements. No questions about where your data is being stored or whether it's training someone else's model.

No vendor lock-in. Apache 2.0 means you own your deployment. You can switch models, fine-tune on your data, or run multiple models for different tasks. According to AI-CTO.IO's business decision guide, 41% of enterprises now plan to increase their use of open-source models — precisely because of this flexibility.

The honest caveats

Open-source AI isn't a free lunch. The costs shift, they don't disappear.

You need someone technical. Tools like Ollama have dramatically simplified local deployment — most models are a single command away from running. But maintaining a production system, handling updates, monitoring performance, and troubleshooting issues still requires technical competence. If you don't have that in-house, you're trading a subscription fee for a consulting bill.

Frontier reasoning still favours closed models. For the most demanding tasks — complex multi-step reasoning, nuanced analysis of ambiguous scenarios, advanced creative work — GPT-5, Claude, and Gemini's proprietary models still hold an edge. The gap is narrow and closing, but it exists.

The hybrid approach wins. The smartest deployment strategy in 2026 isn't all-open or all-closed. It's using local open-source models for routine, high-volume tasks (email triage, document summarisation, data extraction) and reserving expensive closed-model API calls for the complex work that justifies the cost. The compute breakeven for self-hosting occurs at roughly 1-2 billion tokens per month, according to Kael Research — but even below that threshold, privacy and control benefits may justify local deployment.

What to watch

Three developments will determine how quickly open-source AI reshapes business adoption:

Fine-tuning accessibility. Gemma 4's architecture is explicitly designed for efficient fine-tuning on consumer hardware. When a 20-person accounting firm can fine-tune a model on its own correspondence, templates, and regulatory knowledge — running entirely on a desktop workstation — that's a different category of AI adoption than "we subscribed to ChatGPT."

Edge deployment maturity. Gemma 4's E2B model running on a Raspberry Pi is a proof of concept today. But the trajectory points toward AI embedded in point-of-sale systems, customer kiosks, field equipment, and industrial sensors — all processing data locally, offline, with no cloud dependency.

The regulatory tailwind. Australia's incoming Privacy Act requirements around automated decision-making create a structural advantage for locally deployed models. Businesses that can demonstrate their AI processes data entirely on-premises will have a simpler compliance story than those routing customer data through external APIs.

Google has made a calculated bet that the best way to win the AI infrastructure war is to give the models away. For business owners, that bet translates directly into leverage: more capable tools, lower costs, and fewer strings attached. The era of paying a premium just to access AI capability is ending. What you do with that access is the competitive question now.

Sources

Gemma 4: Byte for byte, the most capable open models — Google Blog
Gemma 4 Model Card and Benchmarks — Google DeepMind
Google battles Chinese open-weights models with Gemma 4 — The Register
Gemma 4 vs Llama 4 vs Mistral Small 4: Full Comparison — Digital Applied
Open Source vs Closed LLMs 2026: A Business Decision Guide — AI-CTO.IO
Open Source vs Proprietary LLMs: The Real Cost Breakdown — Kael Research
How to Build Free Local AI with Ollama for Small Businesses — Dev.to
From Cloud-First to Local-First: Migrating My AI Agent to a 32B Open-Source Model — Dev.to
Open Source AI vs Closed AI: The Complete Guide 2026 — WorthvieW