Gemini 2.5 Flash Lite
by Google
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.
Capabilities
Pricing
Input Tokens
Per 1M tokens
Free
Output Tokens
Per 1M tokens
Free
Image Processing
Per 1M tokens
$0.00/1M tokens
Supported Modalities
Input
text
image
file
audio
video
Output
text
Performance Benchmarks
Intelligence Index
Overall intelligence score
47.9
Coding Index
Programming capability
36.5
Math Index
Mathematical reasoning
68.7
GPQA
Graduate-level questions
70.9%
MMLU Pro
Multitask language understanding
80.8%
HLE
Human-like evaluation
6.6%
LiveCodeBench
Real-world coding tasks
68.8%
AIME 2025
Advanced mathematics
68.7%
Specifications
- Context Length
- 1.0M tokens
- Provider
- Throughput
- 5.34 tokens/s
- Released
- Jul 22, 2025
- Model ID
- google/gemini-2.5-flash-lite
Ready to try it?
Start chatting with Gemini 2.5 Flash Lite right now. No credit card required.
Start ChattingMore from Google
View all modelsCompare Models
Select a model to compare with Gemini 2.5 Flash Lite