Google logo

Gemini 2.5 Flash Lite

by Google

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.

Chat with Gemini 2.5 Flash Lite

Capabilities

Vision
Audio Input

Pricing

Input Tokens
Per 1M tokens
Free
Output Tokens
Per 1M tokens
Free
Image Processing
Per 1M tokens
$0.00/1M tokens

Supported Modalities

Input

text
image
file
audio
video

Output

text

Performance Benchmarks

Intelligence Index
Overall intelligence score
47.9
Coding Index
Programming capability
36.5
Math Index
Mathematical reasoning
68.7
GPQA
Graduate-level questions
70.9%
MMLU Pro
Multitask language understanding
80.8%
HLE
Human-like evaluation
6.6%
LiveCodeBench
Real-world coding tasks
68.8%
AIME 2025
Advanced mathematics
68.7%

Specifications

Context Length
1.0M tokens
Provider
Google
Throughput
5.34 tokens/s
Released
Jul 22, 2025
Model ID
google/gemini-2.5-flash-lite

Ready to try it?

Start chatting with Gemini 2.5 Flash Lite right now. No credit card required.

Start Chatting

More from Google

View all models