Google has announced a major update to its Gemini AI lineup: Gemini 3.1 Flash-Lite begins rolling out in preview today (March 3, 2026). The new model is available to developers through the Gemini API in Google AI Studio and to enterprises via Vertex AI.
This lightweight variant builds on the Gemini 3 series architecture, delivering frontier-class performance at significantly lower cost and latency. It’s positioned as Google’s most cost-efficient Gemini model yet, optimized for high-volume, cost-sensitive applications without compromising on key capabilities.
Gemini 3.1 Flash-Lite is the first “Flash-Lite” model in the Gemini 3 family. It inherits advanced intelligence from higher-tier models like Gemini 3.1 Pro but focuses on efficiency for scaled production use cases.
Key highlights, according to developer feedback and early announcements, include faster time-to-first-token, better accuracy in targeted domains, and substantial savings compared with larger models.