
Gemini 2.5 Flash Lite: Pricing, Context Window & Benchmarks

by Google

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.
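Since thinking is off by default, enabling it is a per-request choice. A minimal sketch of what that request body might look like, following the linked OpenRouter reasoning-tokens docs (the exact `reasoning` field shape and the model ID are assumptions; check the current API reference):

```python
import json

# Request body for OpenRouter's chat completions endpoint
# (POST https://openrouter.ai/api/v1/chat/completions).
# The "reasoning" field follows OpenRouter's reasoning-tokens docs;
# treat its exact shape as an assumption.
payload = {
    "model": "google/gemini-2.5-flash-lite",
    "messages": [
        {"role": "user", "content": "Outline a 3-step plan to debug a flaky test."}
    ],
    # Thinking is disabled by default on Flash-Lite; enable it explicitly
    # to trade extra cost and latency for deeper reasoning on this request.
    "reasoning": {"enabled": True},
}

print(json.dumps(payload, indent=2))

# To actually send it (requires an API key):
# import urllib.request
# req = urllib.request.Request(
#     "https://openrouter.ai/api/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={
#         "Authorization": "Bearer <OPENROUTER_API_KEY>",
#         "Content-Type": "application/json",
#     },
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```

Leaving `reasoning` out keeps the default fast, low-cost behavior; including it opts that single request into thinking.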

| Metric | Value |
|---|---|
| Input Price | $0.10/1M tokens |
| Output Price | $0.40/1M tokens |
| Intelligence | 12.5 |
| Coding | 7.4 |

What you can do with Gemini 2.5 Flash Lite

Everyday Q&A and clear explanations

Writing help (emails, posts, summaries)

Idea generation and brainstorming

Learning support with step-by-step guidance

Scores below cover composite indices (Intelligence, Coding, Math) and standard academic and industry benchmarks.

Benchmark Highlights (6 tests)

| Benchmark | Score |
|---|---|
| GPQA | 47.4% |
| MMLU Pro | 72.4% |
| LiveCodeBench | 40.0% |
| Math 500 | 92.6% |
| AIME 2025 | 35.3% |
| HLE | 3.7% |
| Metric | Value |
|---|---|
| Provider | Google |
| Context Window | 1,048,576 tokens |
| Input Price | $0.10/1M tokens |
| Output Price | $0.40/1M tokens |
| Release Date | Jun 17, 2025 |
| Modalities | text, image, file, audio, video |
| Capabilities | Vision, Audio Input |

Compare Gemini 2.5 Flash Lite to other models

See how it stacks up on price, quality, and overall performance.

Frequently asked questions

What is Gemini 2.5 Flash Lite good for?

Use Gemini 2.5 Flash Lite for everyday tasks like writing, summarizing, brainstorming, and getting clear explanations.

How much does Gemini 2.5 Flash Lite cost?

Pricing is based on usage. Current rates are $0.10/1M tokens for input and $0.40/1M tokens for output.
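At these rates, the cost of a request is simple arithmetic over token counts. A quick sketch (the helper name is illustrative):

```python
# Published per-token rates for Gemini 2.5 Flash Lite (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Example: a 10,000-token prompt with a 2,000-token reply.
print(round(request_cost(10_000, 2_000), 6))  # → 0.0018
```

So even a fairly large prompt with a long reply costs a fraction of a cent, which is the point of the Lite tier.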

Can I try Gemini 2.5 Flash Lite for free?

Yes. You can start a chat instantly and test the model before deciding on a plan.

Does Gemini 2.5 Flash Lite support images or audio?

Yes. Gemini 2.5 Flash Lite accepts text, image, file, audio, and video inputs, with vision and audio understanding.

Benchmarks and pricing are sourced from Artificial Analysis where available. OpenRouter specs are used as a fallback.