TL;DR — Amazon Bedrock added Google DeepMind's Gemma 4 family on June 15, 2026. Three instruction-tuned variants (31B Dense, 26B-A4B MoE, E2B) are available under Apache 2.0. All support 256K token context windows, multimodal text+image input, and native function calling. Developers can access them via a fully managed AWS service — no GPU provisioning required.

What Makes This Integration Significant

Google DeepMind launched Gemma 4 in April 2026, marking the first time an open-weight model family from Google was released under the Apache 2.0 license — meaning commercial use, fine-tuning, and redistribution are all permitted without royalties.

The Bedrock integration takes that openness and wraps it in AWS's managed infrastructure. Developers and enterprises get the performance of Gemma 4 without operating inference stacks, provisioning model weights, or managing GPU clusters. AWS's Model Deployment Account isolation architecture — with zero operator access to model internals — means regulated industries (finance, healthcare, government) can use Gemma 4 without sacrificing compliance posture.

Model Comparison

Model Architecture Parameters Context Modalities
Gemma 4 31B Dense 30.7B total 256K Text, Image
Gemma 4 26B-A4B Mixture-of-Experts 25.2B / 3.8B active 256K Text, Image
Gemma 4 E2B Dense (PLE) 5.1B / 2.3B effective 128K Text, Image

All three variants include built-in reasoning mode, native function calling for agentic workflows, and out-of-the-box support for 35+ languages with pre-training across 140+.

256K Max context window (tokens)
140+ Pre-training languages
89.2% AIME 2026 (Gemma 4 31B Thinking)
💡
The MoE Efficiency Argument
Gemma 4 26B-A4B activates only 3.8B parameters per request despite having 25.2B total parameters. This Mixture-of-Experts design delivers 27B-class intelligence at a fraction of the inference cost. For high-throughput agentic pipelines where cost-per-call matters, this model hits a strong price/performance point.

Getting Started on Bedrock

AWS supports Gemma 4 through the bedrock-mantle endpoint with OpenAI Python SDK compatibility — meaning existing OpenAI-style codebases can switch to Gemma 4 with minimal changes.

Model IDs:

  • google.gemma-4-31b — 30.7B Dense
  • google.gemma-4-26b-a4b — 26B Mixture-of-Experts
  • google.gemma-4-e2b — Compact 5.1B Dense

Service tiers available: Standard (on-demand shared throughput), Priority (reserved capacity), and Flex (cost-optimized for batch workloads).

Launch regions: US East (N. Virginia, Ohio), US West (Oregon), Europe (Frankfurt). Additional regions will expand over time.

Benchmark Performance

Benchmark Gemma 4 31B Thinking Gemma 4 26B-A4B Thinking
AIME 2026 (Math) 89.2% 88.3%
LiveCodeBench v6 80.0% 77.1%
GPQA Diamond (Science) 84.3% 82.3%
MMMLU Multilingual 85.2% 82.6%
τ2-bench Agentic (Retail) 86.4% 85.5%
Arena AI Elo (text) 1452 1441

The 26B-A4B nearly matches the 31B on all benchmarks while activating a fraction of the parameters per call — the practical efficiency gain is substantial in production.

ℹ️
Open Weights, Your Way
Because Gemma 4 is Apache 2.0, you're not locked into Bedrock. Download the weights from Hugging Face for local deployment, fine-tune on proprietary data with your preferred framework, and audit the architecture independently. Bedrock provides the fastest path to production; the open weights provide the option to go deeper.

Why Open-Weight Enterprise AI Is Having a Moment

The Bedrock launch illustrates a broader trend: open-weight models are reaching the capability threshold where enterprises no longer need to default to closed APIs for frontier performance. Gemma 4 31B now competes with models that were considered closed-model territory six months ago, while giving organizations the auditability and deployment flexibility that Apache 2.0 enables.

With AWS handling the infrastructure and Google DeepMind supplying the intelligence-per-parameter efficiency, the barrier for enterprise teams to deploy, fine-tune, and customize frontier AI has dropped significantly.

Key Takeaways

  • Gemma 4 family (31B, 26B-A4B MoE, E2B) is live on Amazon Bedrock as of June 15, 2026
  • Apache 2.0 license — commercial use, fine-tuning, and redistribution all permitted
  • MoE architecture activates only 3.8B of 25.2B parameters per call — significant cost efficiency
  • 256K context window enables full codebase analysis and multi-turn agents
  • OpenAI SDK-compatible API makes migration low-friction
  • Model Deployment Account isolation meets enterprise and regulatory security requirements
🔗
Official Sources & Getting Started
AWS Blog — Introducing Gemma 4 on Amazon Bedrock
Google DeepMind Gemma 4 — Model Cards & Downloads
Google AI for Developers — Gemma Getting Started Guide