Top AI Models by Category

Compare the latest models across open source, proprietary, uncensored, coding, math, speed, and release freshness.

Most Used AI Models

Popular picks across OpenRouter.

MiniMax M2.5

MiniMax

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity.

Context 197K
Speed 41 tok/s
Input Text
Output Text
Reasoning Yes

Gemini 3 Flash Preview

Google

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance.

Context 1.0M
Speed 165 tok/s
Input Text, Image, File, Audio, Video
Output Text
Reasoning Yes

DeepSeek V3.2

DeepSeek

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.

Context 164K
Speed 33 tok/s
Input Text
Output Text
Reasoning Yes

Claude Opus 4.6

Anthropic

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.

Context 1.0M
Speed 43 tok/s
Input Text, Image
Output Text
Reasoning Yes

Claude Sonnet 4.6

Anthropic

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.

Context 1.0M
Speed 49 tok/s
Input Text, Image
Output Text
Reasoning Yes

Kimi K2.5

MoonshotAI

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm.

Context 262K
Speed 34 tok/s
Input Text, Image
Output Text
Reasoning Yes

Grok 4.1 Fast

xAI

Grok 4.1 Fast is xAI's best agentic tool-calling model. It shines in real-world use cases like customer support and deep research, and has a 2M context window.

Context 2.0M
Speed 111 tok/s
Input Text, Image
Output Text
Reasoning Yes

gpt-oss-120b

OpenAI

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases.

Context 131K
Speed 59 tok/s
Input Text
Output Text
Reasoning Yes

Gemini 2.5 Flash

Google

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks.

Context 1.0M
Speed 184 tok/s
Input File, Image, Text, Audio, Video
Output Text
Reasoning Yes

Claude Sonnet 4.5

Anthropic

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows.

Context 1.0M
Speed 52 tok/s
Input Text, Image, File
Output Text
Reasoning Yes

Top Open Source AI Models

Community-driven, inspectable weights.

GLM 5

Z.ai

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows.

Context 203K
Speed 46 tok/s
Input Text
Output Text
Reasoning Yes

Kimi K2.5

MoonshotAI

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm.

Context 262K
Speed 34 tok/s
Input Text, Image
Output Text
Reasoning Yes

Qwen3.5 397B A17B

Qwen

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Context 262K
Speed 55 tok/s
Input Text, Image, Video
Output Text
Reasoning Yes

Qwen3.5-27B

Qwen

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance.

Context 262K
Speed 89 tok/s
Input Text, Image, Video
Output Text
Reasoning Yes

GLM 4.7

Z.ai

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.

Context 203K
Speed 85 tok/s
Input Text
Output Text
Reasoning Yes

MiniMax M2.5

MiniMax

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity.

Context 197K
Speed 41 tok/s
Input Text
Output Text
Reasoning Yes

DeepSeek V3.2

DeepSeek

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.

Context 164K
Speed 33 tok/s
Input Text
Output Text
Reasoning Yes

Qwen3.5-122B-A10B

Qwen

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Context 262K
Speed 141 tok/s
Input Text, Image, Video
Output Text
Reasoning Yes

MiMo-V2-Flash

Xiaomi

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi.

Context 262K
Speed 133 tok/s
Input Text
Output Text
Reasoning Yes

Kimi K2 Thinking

MoonshotAI

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning.

Context 131K
Speed 40 tok/s
Input Text
Output Text
Reasoning Yes

Top Proprietary AI Models

Frontier closed models.

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.

Context 1.0M
Speed 106 tok/s
Input Audio, File, Image, Text, Video
Output Text
Reasoning Yes

GPT-5.4

OpenAI

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system.

Context 1.1M
Speed 77 tok/s
Input Text, Image, File
Output Text
Reasoning Yes

GPT-5.3-Codex

OpenAI

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2.

Context 400K
Speed 63 tok/s
Input Text, Image
Output Text
Reasoning Yes

Claude Opus 4.6

Anthropic

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.

Context 1.0M
Speed 43 tok/s
Input Text, Image
Output Text
Reasoning Yes

Claude Sonnet 4.6

Anthropic

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.

Context 1.0M
Speed 49 tok/s
Input Text, Image
Output Text
Reasoning Yes

GPT-5.2

OpenAI

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1.

Context 400K
Speed 64 tok/s
Input File, Image, Text
Output Text
Reasoning Yes

Claude Opus 4.5

Anthropic

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use.

Context 200K
Speed 55 tok/s
Input File, Image, Text
Output Text
Reasoning Yes

GPT-5.2-Codex

OpenAI

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows.

Context 400K
Speed 73 tok/s
Input Text, Image
Output Text
Reasoning Yes

Nano Banana Pro (Gemini 3 Pro Image Preview)

Google

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro.

Context 66K
Speed 110 tok/s
Input Image, Text
Output Image, Text
Reasoning Yes

GPT-5.1

OpenAI

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5.

Context 400K
Speed 77 tok/s
Input Image, Text, File
Output Text
Reasoning Yes

Top Coding AI Models

Models tuned for code and developer workflows.

GPT-5.4

OpenAI

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system.

Context 1.1M
Speed 77 tok/s
Input Text, Image, File
Output Text
Reasoning Yes

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.

Context 1.0M
Speed 106 tok/s
Input Audio, File, Image, Text, Video
Output Text
Reasoning Yes

GPT-5.3-Codex

OpenAI

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilities of GPT-5.2.

Context 400K
Speed 63 tok/s
Input Text, Image
Output Text
Reasoning Yes

Claude Sonnet 4.6

Anthropic

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work.

Context 1.0M
Speed 49 tok/s
Input Text, Image
Output Text
Reasoning Yes

GPT-5.2

OpenAI

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1.

Context 400K
Speed 64 tok/s
Input File, Image, Text
Output Text
Reasoning Yes

Claude Opus 4.6

Anthropic

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.

Context 1.0M
Speed 43 tok/s
Input Text, Image
Output Text
Reasoning Yes

Claude Opus 4.5

Anthropic

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use.

Context 200K
Speed 55 tok/s
Input File, Image, Text
Output Text
Reasoning Yes

Claude Opus 4.1

Anthropic

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks.

Context 200K
Speed 50 tok/s
Input Image, Text, File
Output Text
Reasoning Yes

Gemini 2.5 Pro Preview 05-06

Google

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks.

Context 1.0M
Speed N/A
Input Text, Image, File, Audio, Video
Output Text
Reasoning Yes

Nano Banana Pro (Gemini 3 Pro Image Preview)

Google

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro.

Context 66K
Speed 110 tok/s
Input Image, Text
Output Image, Text
Reasoning Yes

Top OCR AI Models

Models specialised in optical character recognition and document extraction.

PaddleOCR-VL-0.9B

PaddlePaddle

Baidu's 0.9B vision-language OCR model combining a NaViT-style dynamic-resolution encoder with ERNIE-4.5-0.3B. Handles multilingual text, tables, charts, and formulas across 16K context — optimized for efficient on-device document parsing.

Context 16K
Speed N/A
Input Text, Image
Output Text
Reasoning No

olmOCR-2-7B

AllenAI

Allen AI's 7B OCR model fine-tuned from Qwen2.5-VL-7B on curated academic papers and technical documentation. Supports 128K context and extracts structured text from PDFs and scanned documents with high fidelity.

Context 128K
Speed N/A
Input Text, Image
Output Text
Reasoning No

DeepSeek-OCR

DeepSeek

DeepSeek's ~3B MoE OCR model using optical context compression to encode full pages into compact token sequences. Outputs structured Markdown preserving text layout, tables, and mathematical formulas from images and PDFs.

Context N/A
Speed N/A
Input Text, Image
Output Text
Reasoning No

Mistral OCR

Mistral AI

Mistral's dedicated document understanding model (December 2025). Processes PDFs and images page-by-page via API, returning structured Markdown with preserved tables, equations, image bounding boxes, and rich layout metadata.

Context N/A
Speed N/A
Input Image, PDF
Output Text
Reasoning No

Top Math AI Models

Math and reasoning specialists.

GPT-5.2

OpenAI

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1.

Context 400K
Speed 64 tok/s
Input File, Image, Text
Output Text
Reasoning Yes

GPT-5 Codex

OpenAI

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows.

Context 400K
Speed 181 tok/s
Input Text, Image
Output Text
Reasoning Yes

Gemini 3 Flash Preview

Google

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance.

Context 1.0M
Speed 165 tok/s
Input Text, Image, File, Audio, Video
Output Text
Reasoning Yes

DeepSeek V3.2 Speciale

DeepSeek

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance.

Context 164K
Speed N/A
Input Text
Output Text
Reasoning Yes

MiMo-V2-Flash

Xiaomi

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi.

Context 262K
Speed 133 tok/s
Input Text
Output Text
Reasoning Yes

GPT-5.1-Codex

OpenAI

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows.

Context 400K
Speed 120 tok/s
Input Text, Image
Output Text
Reasoning Yes

Nano Banana Pro (Gemini 3 Pro Image Preview)

Google

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro.

Context 66K
Speed 110 tok/s
Input Image, Text
Output Image, Text
Reasoning Yes

GLM 4.7

Z.ai

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.

Context 203K
Speed 85 tok/s
Input Text
Output Text
Reasoning Yes

Kimi K2 Thinking

MoonshotAI

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning.

Context 131K
Speed 40 tok/s
Input Text
Output Text
Reasoning Yes

GPT-5

OpenAI

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience.

Context 400K
Speed 65 tok/s
Input Text, Image, File
Output Text
Reasoning Yes

Fast AI Models

Lowest cost + latency options.

Mercury 2

Inception

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM).

Context 128K
Speed 842 tok/s
Input Text
Output Text
Reasoning Yes

Granite 4.0 Micro

IBM

Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family.

Context 131K
Speed 223 tok/s
Input Text
Output Text
Reasoning No

Gemini 2.5 Flash Lite Preview 09-2025

Google

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.

Context 1.0M
Speed 348 tok/s
Input Text, Image, File, Audio, Video
Output Text
Reasoning Yes

Nova Micro 1.0

Amazon

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost.

Context 128K
Speed 330 tok/s
Input Text
Output Text
Reasoning No

gpt-oss-20b

OpenAI

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license.

Context 131K
Speed 291 tok/s
Input Text
Output Text
Reasoning Yes

Devstral Small 1.1

Mistral

Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI.

Context 131K
Speed 202 tok/s
Input Text
Output Text
Reasoning No

gpt-oss-120b

OpenAI

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases.

Context 131K
Speed 59 tok/s
Input Text
Output Text
Reasoning Yes

Gemini 3.1 Flash Lite Preview

Google

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases.

Context 1.0M
Speed 283 tok/s
Input Text, Image, Video, File, Audio
Output Text
Reasoning Yes

LFM2-24B-A2B

LiquidAI

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment.

Context 33K
Speed 250 tok/s
Input Text
Output Text
Reasoning No

Nova 2 Lite

Amazon

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text.

Context 1.0M
Speed 228 tok/s
Input Text, Image, Video, File
Output Text
Reasoning Yes

Large Context Window AI Models

Models with 200K+ context windows.

Grok 4.1 Fast

xAI

Grok 4.1 Fast is xAI's best agentic tool-calling model. It shines in real-world use cases like customer support and deep research, and has a 2M context window.

Context 2.0M
Speed 111 tok/s
Input Text, Image
Output Text
Reasoning Yes

Grok 4 Fast

xAI

Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window.

Context 2.0M
Speed 154 tok/s
Input Text, Image
Output Text
Reasoning Yes

GPT-5.4

OpenAI

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system.

Context 1.1M
Speed 77 tok/s
Input Text, Image, File
Output Text
Reasoning Yes

GPT-5.4 Pro

OpenAI

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks.

Context 1.1M
Speed N/A
Input Text, Image, File
Output Text
Reasoning Yes

Gemini 3 Flash Preview

Google

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance.

Context 1.0M
Speed 165 tok/s
Input Text, Image, File, Audio, Video
Output Text
Reasoning Yes

Gemini 2.5 Flash

Google

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks.

Context 1.0M
Speed 184 tok/s
Input File, Image, Text, Audio, Video
Output Text
Reasoning Yes

Gemini 2.5 Flash Lite

Google

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency.

Context 1.0M
Speed 197 tok/s
Input Text, Image, File, Audio, Video
Output Text
Reasoning Yes

Gemini 3.1 Pro Preview

Google

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows.

Context 1.0M
Speed 106 tok/s
Input Audio, File, Image, Text, Video
Output Text
Reasoning Yes

Gemini 2.0 Flash

Google

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to Gemini Flash 1.5, while maintaining quality on par with larger models like Gemini Pro 1.5.

Context 1.0M
Speed N/A
Input Text, Image, File, Audio, Video
Output Text
Reasoning No

Gemini 3.1 Flash Lite Preview

Google

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases.

Context 1.0M
Speed 283 tok/s
Input Text, Image, Video, File, Audio
Output Text
Reasoning Yes

Top Uncensored AI Models

Lightly filtered, high-flexibility models.

Llama 3 8B Lunaris

Sao10K

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3.

Context 8K
Speed N/A
Input Text
Output Text
Reasoning No

MythoMax 13B

gryphe

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay.

Context 4K
Speed N/A
Input Text
Output Text
Reasoning No

Skyfall 36B V2

TheDrummer

Skyfall 36B v2 is an enhanced iteration of Mistral Small 2501, specifically fine-tuned for improved creativity, nuanced writing, role-playing, and coherent storytelling.

Context 33K
Speed N/A
Input Text
Output Text
Reasoning No

UnslopNemo 12B

TheDrummer

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.

Context 33K
Speed N/A
Input Text
Output Text
Reasoning No

Cydonia 24B V4.1

TheDrummer

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligence.

Context 131K
Speed N/A
Input Text
Output Text
Reasoning No

Hermes 4 70B

Nous

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B.

Context 131K
Speed N/A
Input Text
Output Text
Reasoning Yes

Hermes 3 405B Instruct

Nous

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board.

Context 131K
Speed 35 tok/s
Input Text
Output Text
Reasoning No

Rocinante 12B

TheDrummer

Rocinante 12B is designed for engaging storytelling and rich prose.

Context 33K
Speed N/A
Input Text
Output Text
Reasoning No

Llama 3.3 Euryale 70B

Sao10K

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k).

Context 131K
Speed N/A
Input Text
Output Text
Reasoning No

Hermes 3 70B Instruct

Nous

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board.

Context 131K
Speed 44 tok/s
Input Text
Output Text
Reasoning No

Newest AI Models

Fresh releases from OpenRouter.

Seed-2.0-Lite

ByteDance Seed

Seed-2.0-Lite is a balanced model designed for high-frequency enterprise workloads, optimizing for both capability and cost.

Context 262K
Speed N/A
Input Text, Image, Video
Output Text
Reasoning Yes

Qwen3.5-9B

Qwen

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture.

Context 262K
Speed 118 tok/s
Input Text, Image, Video
Output Text
Reasoning Yes

GPT-5.4 Pro

OpenAI

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks.

Context 1.1M
Speed N/A
Input Text, Image, File
Output Text
Reasoning Yes

GPT-5.4

OpenAI

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system.

Context 1.1M
Speed 77 tok/s
Input Text, Image, File
Output Text
Reasoning Yes

Mercury 2

Inception

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM).

Context 128K
Speed 842 tok/s
Input Text
Output Text
Reasoning Yes

GPT-5.3 Chat

OpenAI

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful.

Context 128K
Speed N/A
Input Text, Image, File
Output Text
Reasoning No

Gemini 3.1 Flash Lite Preview

Google

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases.

Context 1.0M
Speed 283 tok/s
Input Text, Image, Video, File, Audio
Output Text
Reasoning Yes

Seed-2.0-Mini

ByteDance Seed

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment.

Context 262K
Speed N/A
Input Text, Image, Video
Output Text
Reasoning Yes

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Google

Gemini 3.1 Flash Image Preview, a.k.a.

Context 66K
Speed N/A
Input Image, Text
Output Image, Text
Reasoning Yes

Qwen3.5-35B-A3B

Qwen

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency.

Context 262K
Speed 128 tok/s
Input Text, Image, Video
Output Text
Reasoning Yes

How to Choose the Right AI Model

A practical guide to picking the best LLM for your use case.

Match the model to the task

General-purpose models like GPT-4o and Claude Sonnet handle most tasks well. For specialized work, coding models (DeepSeek Coder, Codestral) and math models (QwQ, DeepSeek R1) outperform generalists on their respective benchmarks while often costing less per token.

Consider context window size

If you work with long documents, codebases, or multi-turn conversations, context window matters. The models listed on this page range from 4K to 2M tokens. Larger windows let you process entire books or repositories in a single prompt, but they increase cost and latency.
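As a rough illustration of the context-window check, the sketch below estimates whether a document fits in a given model's window. It assumes about 4 characters per token, which is only a rule of thumb (real tokenizers differ by model and language), and it reads the "K" and "M" figures from the listings above as multiples of 1,000 and 1,000,000. The function names and the output reserve of 2,000 tokens are illustrative choices, not part of any API.

```python
# Assumption: roughly 4 characters per token for English prose. Real
# tokenizers vary by model and language, so treat this as an estimate only.
CHARS_PER_TOKEN = 4

# Context sizes copied from the listings above, reading "K" as 1,000 tokens.
CONTEXT_TOKENS = {
    "MythoMax 13B": 4_000,
    "Kimi K2 Thinking": 131_000,
    "Claude Opus 4.5": 200_000,
    "Grok 4.1 Fast": 2_000_000,
}


def estimate_tokens(num_chars: int) -> int:
    """Rough token estimate from a character count, rounded up."""
    return -(-num_chars // CHARS_PER_TOKEN)


def models_that_fit(num_chars: int, reserve_for_output: int = 2_000) -> list[str]:
    """Return the models whose window holds the prompt plus an output reserve."""
    needed = estimate_tokens(num_chars) + reserve_for_output
    return [name for name, ctx in CONTEXT_TOKENS.items() if ctx >= needed]


if __name__ == "__main__":
    # A hypothetical 600,000-character document (about 150,000 estimated tokens).
    print(models_that_fit(600_000))
```

For an exact answer, count tokens with the tokenizer of the model you plan to use instead of the character heuristic.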

Balance cost, speed, and quality

Frontier models deliver the highest benchmark scores but cost more per token and respond more slowly. Fast models like Gemini Flash, Llama 3 (8B), and Mistral Small can handle routine tasks at a fraction of the cost with sub-second latency, which makes them ideal for high-volume applications.
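To make the speed trade-off concrete, the following sketch converts the tok/s figures from the listings above into an approximate generation time for a reply of a given length. It is a simplification: it ignores time to first token, network latency, and any hidden reasoning tokens, so real response times will be longer. The function name and the 500-token reply length are illustrative assumptions.

```python
# Throughput figures (tokens per second) copied from the listings above.
SPEED_TOK_PER_S = {
    "Mercury 2": 842,
    "Gemini 3 Flash Preview": 165,
    "Claude Opus 4.6": 43,
}


def generation_seconds(output_tokens: int, tok_per_s: float) -> float:
    """Approximate time to stream `output_tokens` at a steady throughput."""
    if tok_per_s <= 0:
        raise ValueError("throughput must be positive")
    return output_tokens / tok_per_s


if __name__ == "__main__":
    reply_length = 500  # hypothetical reply length in tokens
    for name, speed in SPEED_TOK_PER_S.items():
        print(f"{name}: {generation_seconds(reply_length, speed):.1f} s")
```

At these listed speeds, the same 500-token reply takes well under a second on the fastest model and more than ten seconds on the slowest, which is the gap that matters for high-volume or interactive use.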

Open source vs. proprietary

Open-source models (Llama, Mistral, Qwen, DeepSeek) let you self-host, fine-tune, and inspect weights. Proprietary models (GPT-4o, Claude, Gemini) often lead on benchmarks and offer managed APIs with built-in safety features. Many teams use both: proprietary for peak performance, open source for cost control and customization.

Check for multimodal capabilities

Some models accept images, audio, or files alongside text. If your workflow involves analyzing screenshots, diagrams, or audio transcriptions, filter for models with vision or audio input support. Models with structured output and function calling are essential for building agents and tool-using applications.
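Filtering by input modality can be sketched as a simple subset test. The example below uses a handful of entries copied from the listings above; the `Model` class and `supporting` function are hypothetical names introduced for illustration and do not correspond to any OpenRouter API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    inputs: frozenset[str]
    reasoning: bool


# A few entries copied from the listings above.
MODELS = [
    Model("DeepSeek V3.2", frozenset({"Text"}), True),
    Model("Claude Opus 4.6", frozenset({"Text", "Image"}), True),
    Model("Gemini 3 Flash Preview",
          frozenset({"Text", "Image", "File", "Audio", "Video"}), True),
    Model("Qwen3.5-27B", frozenset({"Text", "Image", "Video"}), True),
]


def supporting(required: set[str]) -> list[str]:
    """Names of models whose inputs include every required modality."""
    return [m.name for m in MODELS if required <= m.inputs]


if __name__ == "__main__":
    print(supporting({"Image"}))
    print(supporting({"Audio", "Image"}))
```

The same pattern extends to other card fields, for example adding `and m.reasoning` to keep only models listed with "Reasoning Yes".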

Use benchmarks as a starting point

Scores like GPQA, MMLU Pro, and HLE measure academic knowledge and reasoning. LiveCodeBench and SciCode test practical coding ability. MATH 500 and AIME evaluate mathematical problem-solving. No single benchmark tells the full story, so compare scores across the categories relevant to your use case, then test with your own prompts.
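One way to compare scores across categories is a weighted average that emphasizes the benchmarks closest to your use case. The sketch below is only an illustration: `model_a` and `model_b` are hypothetical, their scores are made-up placeholders and not real results, and the weights are an arbitrary example of a coding-heavy workload. It also assumes all scores are on the same 0 to 100 scale.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of benchmark scores (all assumed to be on 0-100)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[bench] * w for bench, w in weights.items()) / total_weight


# Placeholder numbers for two hypothetical models. These are NOT real results.
PLACEHOLDER_SCORES = {
    "model_a": {"GPQA": 80.0, "LiveCodeBench": 60.0, "AIME": 90.0},
    "model_b": {"GPQA": 70.0, "LiveCodeBench": 85.0, "AIME": 60.0},
}

# Example weighting for a coding-heavy workload (an arbitrary assumption).
CODING_WEIGHTS = {"GPQA": 1.0, "LiveCodeBench": 3.0, "AIME": 1.0}


def rank(weights: dict[str, float]) -> list[str]:
    """Model names ordered from highest to lowest weighted score."""
    return sorted(
        PLACEHOLDER_SCORES,
        key=lambda name: weighted_score(PLACEHOLDER_SCORES[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    print(rank(CODING_WEIGHTS))
```

A ranking like this only narrows the shortlist; the final choice should still come from testing the candidates on your own prompts.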

Data on this page is sourced from OpenRouter and Artificial Analysis. Pricing, speed, and benchmark scores are updated regularly.