Gemini 2.5 Flash-Lite
Gemini 2.5 Flash-Lite is Google's latest addition to the Gemini 2.5 family, designed to be the fastest and most cost-efficient model in the lineup. Currently available in preview, Flash-Lite delivers higher quality and lower latency than earlier Lite versions, making it well suited to high-volume, latency-sensitive tasks such as translation and classification. It retains the hallmark features of the Gemini 2.5 models, including a 1 million-token context window, multimodal input, and integration with tools like Google Search and code execution.
The model shows significant improvements across benchmarks in coding, math, science, reasoning, and multimodal tasks, outperforming predecessors such as 2.0 Flash-Lite and 2.0 Flash. Flash-Lite is aimed at developers and organizations that need a balance of speed and cost-efficiency, enabling applications that require rapid processing at scale.
Gemini 2.5 Flash-Lite is available in preview via Google AI Studio and Vertex AI, alongside the stable versions of Gemini 2.5 Flash and Pro, which are now generally available. Custom versions of Flash-Lite and Flash have also been integrated into Google Search, extending the family's reach across Google's platforms.
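For developers trying the preview through the Gemini API, a minimal request can be sketched with nothing but the Python standard library. This is an illustrative sketch, not official sample code: the model identifier `gemini-2.5-flash-lite` and the `generateContent` endpoint follow the Gemini API's public REST conventions, but preview model IDs can change, so check Google AI Studio for the current name.

```python
# Hedged sketch: a single-turn text request to the Gemini API's
# generateContent REST endpoint, using only the Python stdlib.
# Model ID and endpoint are assumptions based on public API conventions.
import json
import os
import urllib.request

MODEL = "gemini-2.5-flash-lite"  # assumed preview model identifier
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent"
)

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request for a single-turn text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    req = build_request("Translate 'good morning' to French.", key or "DEMO")
    if key:
        # Only send the request when a real key is configured.
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
            print(data["candidates"][0]["content"]["parts"][0]["text"])
    else:
        print("Request prepared for", MODEL, "- set GEMINI_API_KEY to send it.")
```

The short, single-shot prompt here mirrors the translation and classification workloads the model targets; the same request shape works unchanged if a different model ID is substituted.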