Blog post · updated 2026-06-05

Gemma 4 with Quantization-Aware Training (Google)

Google's announcement of QAT-optimized checkpoints for the Gemma 4 family: the next move in the family's footprint-first strategy, which uses quantization to push open-weight models into ever-smaller memory envelopes.

What’s announced

Numbers

Tooling & availability

Why it matters

Confirms memory footprint as a competitive axis for Gemma 4: the 12B made the quality case at 16 GB (see gemma-4-12b-announcement); this release makes the floor case, with capable open models under 1 GB that can run on phones. Quantization, not just architecture, is now an explicit lever in the open-weight market (synthesis). There is no API pricing, since these are self-hosted weights (see llm-api-pricing).
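
As a rough illustration of the footprint arithmetic behind these claims, the sketch below estimates weight storage as parameter count times bits per weight. The function names are hypothetical and the bit widths are illustrative assumptions; the source does not state parameter counts or precisions for the sub-1 GB checkpoints. It counts weights only and ignores KV cache, activations, and runtime overhead.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in decimal gigabytes.

    Hypothetical helper for illustration: params * bits / 8 bytes, / 1e9 for GB.
    """
    return num_params * bits_per_weight / 8 / 1e9


def max_params_for_budget(budget_gb: float, bits_per_weight: float) -> float:
    """Largest parameter count whose weights fit in budget_gb at this bit width."""
    return budget_gb * 1e9 * 8 / bits_per_weight


if __name__ == "__main__":
    # Illustrative bit widths only; not taken from the announcement.
    for bits in (16, 8, 4):
        gb = weight_footprint_gb(12e9, bits)
        print(f"12B params at {bits}-bit: ~{gb:.1f} GB of weights")

    # How many parameters fit in a 1 GB weight budget at 4 bits per weight?
    params = max_params_for_budget(1.0, 4)
    print(f"1 GB at 4-bit: ~{params / 1e9:.1f}B params")
```

Under these assumptions a 12B model needs about 24 GB at 16 bits, 12 GB at 8 bits, and 6 GB at 4 bits for weights alone, and a 1 GB weight budget at 4 bits holds roughly 2 billion parameters.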

gemma-4 · quantization · google · open-weight-models · gemma-4-12b-announcement · llm-inference