AI Model Ranking (LLM Leaderboard)

Cheapest AI Models

Most affordable language models sorted by price per token

Column guide

  • Model: AI model name and provider organization
  • Input/1M: Cost per 1 million input tokens (text you send to the model)
  • Output/1M: Cost per 1 million output tokens (text the model generates for you)
  • MMLU-Pro: Massive Multitask Language Understanding (Professional) - tests broad knowledge across 14 subjects including STEM, humanities, and social sciences
  • GPQA: Graduate-level Google-Proof Q&A benchmark - tests PhD-level reasoning and advanced intelligence
  • AIME 2025: American Invitational Mathematics Examination 2025 - tests advanced mathematical problem-solving ability
  • Release: When the model was released - newer models may have more capabilities
| # | Model | Provider | Input/1M | Output/1M | MMLU-Pro | GPQA | AIME 2025 | Release |
|---|-------|----------|----------|-----------|----------|------|-----------|---------|
| 1 | Grok-1 | xAI | N/A | N/A | - | - | - | Mar 17, 2024 |
| 2 | Gemma 3 27B Instruct | Google | N/A | N/A | 66.9% | 42.8% | 20.7% | Mar 12, 2025 |
| 3 | Gemma 3 270M | Google | N/A | N/A | 5.5% | 22.4% | 2.3% | Aug 14, 2025 |
| 4 | Gemma 3n E2B Instruct | Google | N/A | N/A | 37.8% | 22.9% | 10.3% | Jun 26, 2025 |
| 5 | Gemma 3 12B Instruct | Google | N/A | N/A | 59.5% | 34.9% | 18.3% | Mar 12, 2025 |
| 6 | Gemma 3 4B Instruct | Google | N/A | N/A | 41.7% | 29.1% | 12.7% | Mar 12, 2025 |
| 7 | Gemma 3 1B Instruct | Google | N/A | N/A | 13.5% | 23.7% | 3.3% | Mar 13, 2025 |
| 8 | Devstral 2 | Mistral | N/A | N/A | 76.2% | 59.4% | 36.7% | Dec 9, 2025 |
| 9 | Devstral Small 2 | Mistral | N/A | N/A | 67.8% | 53.2% | 34.3% | Dec 9, 2025 |
| 10 | DeepSeek V3.2 Speciale | DeepSeek | N/A | N/A | 86.3% | 87.1% | 96.7% | Dec 1, 2025 |
| 11 | DeepSeek R1 0528 Qwen3 8B | DeepSeek | N/A | N/A | 73.9% | 61.2% | 63.7% | May 29, 2025 |
| 12 | R1 1776 | Perplexity | N/A | N/A | - | - | - | Feb 18, 2025 |
| 13 | Falcon-H1R-7B | TII UAE | N/A | N/A | 72.5% | 66.1% | 80.0% | Jan 4, 2026 |
| 14 | Grok Voice Agent | xAI | N/A | N/A | - | - | - | Dec 17, 2025 |
| 15 | Phi-4 Mini Instruct | Microsoft Azure | N/A | N/A | 46.5% | 33.1% | 6.7% | Feb 26, 2025 |
| 16 | Phi-4 Multimodal Instruct | Microsoft Azure | N/A | N/A | 48.5% | 31.5% | - | Feb 26, 2025 |
| 17 | LFM2.5-VL-1.6B | Liquid AI | N/A | N/A | - | 28.9% | - | Jan 5, 2026 |
| 18 | LFM2.5-1.2B-Thinking | Liquid AI | N/A | N/A | - | 33.9% | - | Jan 20, 2026 |
| 19 | LFM2 8B A1B | Liquid AI | N/A | N/A | 50.5% | 34.4% | 25.3% | Oct 7, 2025 |
| 20 | LFM2 2.6B | Liquid AI | N/A | N/A | 29.8% | 30.6% | 8.3% | Sep 23, 2025 |
| 21 | LFM2.5-1.2B-Instruct | Liquid AI | N/A | N/A | - | 32.6% | - | Jan 5, 2026 |
| 22 | Solar Pro 2 (Non-reasoning) | Upstage | N/A | N/A | 75.0% | 56.1% | 30.0% | Jul 9, 2025 |
| 23 | Solar Pro 2 (Reasoning) | Upstage | N/A | N/A | 80.5% | 68.7% | 61.3% | Jul 9, 2025 |
| 24 | Solar Open 100B (Reasoning) | Upstage | N/A | N/A | - | 65.7% | - | Dec 17, 2025 |
| 25 | Llama 3.3 Nemotron Super 49B v1 (Reasoning) | NVIDIA | N/A | N/A | 78.5% | 64.3% | 54.7% | Mar 18, 2025 |
| 26 | Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | NVIDIA | N/A | N/A | 55.6% | 40.8% | 50.0% | May 20, 2025 |
| 27 | Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | NVIDIA | N/A | N/A | 69.8% | 51.7% | 7.7% | Mar 18, 2025 |
| 28 | Kimi Linear 48B A3B Instruct | Kimi | N/A | N/A | 58.5% | 41.2% | 36.3% | Oct 30, 2025 |
| 29 | Step3 VL 10B | StepFun | N/A | N/A | - | 69.0% | - | Jan 20, 2026 |
| 30 | Molmo 7B-D | Allen Institute for AI | N/A | N/A | 37.1% | 24.0% | - | Sep 25, 2024 |
| 31 | Molmo2-8B | Allen Institute for AI | N/A | N/A | - | 42.5% | - | Dec 11, 2025 |
| 32 | Olmo 3.1 32B Think | Allen Institute for AI | N/A | N/A | 76.3% | 59.1% | 77.3% | Dec 12, 2025 |
| 33 | Granite 4.0 1B | IBM | N/A | N/A | 32.5% | 28.1% | 6.3% | Oct 28, 2025 |
| 34 | Granite 4.0 Micro | IBM | N/A | N/A | 44.7% | 33.6% | 6.0% | Sep 22, 2025 |
| 35 | Granite 4.0 350M | IBM | N/A | N/A | 12.4% | 26.1% | - | Oct 28, 2025 |
| 36 | Granite 4.0 H 350M | IBM | N/A | N/A | 12.7% | 25.7% | 1.3% | Oct 28, 2025 |
| 37 | Granite 4.0 H 1B | IBM | N/A | N/A | 27.7% | 26.3% | 6.3% | Oct 28, 2025 |
| 38 | DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | Nous Research | N/A | N/A | 36.5% | 27.0% | - | Feb 13, 2025 |
| 39 | DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | Nous Research | N/A | N/A | 58.0% | 38.2% | - | Mar 13, 2025 |
| 40 | K-EXAONE (Reasoning) | LG AI Research | N/A | N/A | 83.8% | 78.3% | 90.3% | Dec 31, 2025 |
| 41 | K-EXAONE (Non-reasoning) | LG AI Research | N/A | N/A | 81.0% | 69.5% | 44.0% | Dec 31, 2025 |
| 42 | Exaone 4.0 1.2B (Non-reasoning) | LG AI Research | N/A | N/A | 50.0% | 42.4% | 24.0% | Jul 15, 2025 |
| 43 | Exaone 4.0 1.2B (Reasoning) | LG AI Research | N/A | N/A | 58.8% | 51.5% | 50.3% | Jul 15, 2025 |
| 44 | ERNIE 5.0 Thinking Preview | Baidu | N/A | N/A | 83.0% | 77.7% | 85.0% | Nov 13, 2025 |
| 45 | Llama 65B | Meta | N/A | N/A | - | - | - | Feb 24, 2023 |
| 46 | INTELLECT-3 | Prime Intellect | N/A | N/A | 82.2% | 76.1% | 88.0% | Nov 27, 2025 |
| 47 | Motif-2-12.7B-Reasoning | Motif Technologies | N/A | N/A | 79.6% | 69.5% | 80.3% | Dec 4, 2025 |
| 48 | K2-V2 (medium) | MBZUAI Institute of Foundation Models | N/A | N/A | 76.1% | 59.8% | 64.7% | Dec 5, 2025 |
| 49 | K2-V2 (low) | MBZUAI Institute of Foundation Models | N/A | N/A | 71.3% | 54.1% | 35.3% | Dec 5, 2025 |
| 50 | K2-V2 (high) | MBZUAI Institute of Foundation Models | N/A | N/A | 78.6% | 68.1% | 78.3% | Dec 5, 2025 |
| 51 | K2 Think V2 | MBZUAI Institute of Foundation Models | N/A | N/A | - | 71.3% | - | Dec 15, 2025 |
| 52 | Mi:dm K 2.5 Pro | Korea Telecom | N/A | N/A | 80.9% | 70.1% | 76.7% | Dec 11, 2025 |
| 53 | Mi:dm K 2.5 Pro Preview | Korea Telecom | N/A | N/A | 81.3% | 72.2% | 78.7% | Dec 11, 2025 |
| 54 | HyperCLOVA X SEED Think (32B) | Naver | N/A | N/A | 78.5% | 61.5% | 59.0% | Dec 26, 2025 |
| 55 | Tri-21B-Think | Trillion Labs | N/A | N/A | - | 60.1% | - | Feb 10, 2026 |
| 56 | Tri-21B-think Preview | Trillion Labs | N/A | N/A | - | 53.8% | - | Feb 10, 2026 |
| 57 | Tiny Aya Global | Cohere | N/A | N/A | - | 30.5% | - | Feb 17, 2026 |
| 58 | Apriel-v1.6-15B-Thinker | ServiceNow | N/A | N/A | 79.0% | 73.3% | 88.0% | Nov 25, 2025 |
| 59 | Jamba Reasoning 3B | AI21 Labs | N/A | N/A | 57.7% | 33.3% | 10.7% | Oct 8, 2025 |
| 60 | Jamba 1.7 Mini | AI21 Labs | N/A | N/A | 38.8% | 32.2% | 0.3% | Jul 7, 2025 |
| 61 | Qwen Chat 14B | Alibaba | N/A | N/A | - | - | - | Sep 25, 2023 |
| 62 | Qwen3 4B 2507 (Reasoning) | Alibaba | N/A | N/A | 74.3% | 66.7% | 82.7% | Aug 6, 2025 |
| 63 | Qwen3 VL 4B Instruct | Alibaba | N/A | N/A | 63.4% | 37.1% | 37.0% | Oct 14, 2025 |
| 64 | Qwen3 VL 4B (Reasoning) | Alibaba | N/A | N/A | 70.0% | 49.4% | 25.7% | Oct 14, 2025 |
| 65 | Qwen3 4B 2507 Instruct | Alibaba | N/A | N/A | 67.2% | 51.7% | 52.3% | Aug 6, 2025 |
| 66 | Ring-1T | InclusionAI | N/A | N/A | 80.6% | 77.4% | 89.3% | Oct 13, 2025 |
| 67 | Ling-1T | InclusionAI | N/A | N/A | 82.2% | 71.9% | 71.3% | Oct 8, 2025 |
| 68 | Doubao Seed 2.0 lite (Reasoning) | ByteDance Seed | N/A | N/A | - | 65.6% | - | Feb 15, 2026 |
| 69 | Doubao Seed Code | ByteDance Seed | N/A | N/A | 85.4% | 76.4% | 79.3% | Nov 11, 2025 |
| 70 | o1-mini | OpenAI | N/A | N/A | 74.2% | 60.3% | - | Sep 12, 2024 |
| 71 | GPT-4o (ChatGPT) | OpenAI | N/A | N/A | 77.3% | 51.1% | - | Feb 15, 2025 |
| 72 | GPT-4o mini Realtime (Dec '24) | OpenAI | N/A | N/A | - | - | - | Dec 17, 2024 |
| 73 | GPT-4o (March 2025, chatgpt-4o-latest) | OpenAI | N/A | N/A | 80.3% | 65.5% | 25.7% | Mar 27, 2025 |
| 74 | GPT-3.5 Turbo (0613) | OpenAI | N/A | N/A | - | - | - | Jun 13, 2023 |
| 75 | GPT-4o Realtime (Dec '24) | OpenAI | N/A | N/A | - | - | - | Dec 17, 2024 |
| 76 | GPT-4.5 (Preview) | OpenAI | N/A | N/A | - | - | - | Feb 27, 2025 |
| 77 | Llama 2 Chat 70B | Meta | N/A | N/A | 40.6% | 32.7% | - | Jul 18, 2023 |
| 78 | Llama 2 Chat 13B | Meta | N/A | N/A | 40.6% | 32.1% | - | Jul 18, 2023 |
| 79 | Gemini 2.0 Pro Experimental (Feb '25) | Google | N/A | N/A | 80.5% | 62.2% | - | Feb 5, 2025 |
| 80 | Gemini 2.0 Flash (experimental) | Google | N/A | N/A | 78.2% | 63.6% | - | Dec 11, 2024 |
| 81 | Gemini 1.5 Pro (Sep '24) | Google | N/A | N/A | 75.0% | 58.9% | - | Sep 24, 2024 |
| 82 | Gemini 2.0 Flash-Lite (Preview) | Google | N/A | N/A | - | 54.2% | - | Feb 5, 2025 |
| 83 | Gemini 1.5 Flash (Sep '24) | Google | N/A | N/A | 68.0% | 46.3% | - | Sep 24, 2024 |
| 84 | Gemini 1.5 Flash-8B | Google | N/A | N/A | 56.9% | 35.9% | - | Oct 3, 2024 |
| 85 | PALM-2 | Google | N/A | N/A | - | - | - | May 10, 2023 |
| 86 | Gemini 2.0 Flash-Lite (Feb '25) | Google | N/A | N/A | 72.4% | 53.5% | - | Feb 25, 2025 |
| 87 | Gemini 1.0 Ultra | Google | N/A | N/A | - | - | - | Dec 6, 2023 |
| 88 | Gemini 1.0 Pro | Google | N/A | N/A | 43.1% | 27.7% | - | Dec 6, 2023 |
| 89 | Gemini 1.5 Flash (May '24) | Google | N/A | N/A | 57.4% | 32.4% | - | May 14, 2024 |
| 90 | Gemma 3n E4B Instruct Preview (May '25) | Google | N/A | N/A | 48.3% | 27.8% | - | May 20, 2025 |
| 91 | Gemini 2.0 Flash Thinking Experimental (Dec '24) | Google | N/A | N/A | - | - | - | Dec 19, 2024 |
| 92 | Gemini 2.0 Flash Thinking Experimental (Jan '25) | Google | N/A | N/A | 79.8% | 70.1% | - | Jan 21, 2025 |
| 93 | Gemini 2.5 Flash Preview (Non-reasoning) | Google | N/A | N/A | 78.3% | 59.4% | - | Apr 17, 2025 |
| 94 | Gemini 2.5 Flash Preview (Reasoning) | Google | N/A | N/A | 80.0% | 69.8% | - | Apr 17, 2025 |
| 95 | Gemini 2.5 Pro Preview (Mar '25) | Google | N/A | N/A | 85.8% | 83.6% | - | Mar 25, 2025 |
| 96 | Gemini 1.5 Pro (May '24) | Google | N/A | N/A | 65.7% | 37.1% | - | May 15, 2024 |
| 97 | Claude Instant | Anthropic | N/A | N/A | 43.4% | 33.0% | - | Mar 14, 2023 |
| 98 | Claude 2.1 | Anthropic | N/A | N/A | 49.5% | 31.9% | - | Nov 21, 2023 |
| 99 | Claude 2.0 | Anthropic | N/A | N/A | 48.6% | 34.4% | - | Jul 11, 2023 |
| 100 | Mixtral 8x22B Instruct | Mistral | N/A | N/A | 53.7% | 33.2% | - | Apr 17, 2024 |
Showing 100 of 408 models

Understanding the AI Model Leaderboard

This comprehensive AI model leaderboard helps you compare and choose the best large language models (LLMs) for your needs. We track standardized AI benchmarks, token pricing, inference speed, and model capabilities across major AI providers, including OpenAI, Anthropic, Google, Meta, and DeepSeek.

Core AI Benchmarks Explained

  • MMLU-Pro: Tests broad knowledge across 14 academic subjects including STEM, humanities, and social sciences - the foundational intelligence benchmark
  • GPQA: Graduate-level Google-Proof Q&A benchmark - measures PhD-level reasoning and advanced problem-solving capabilities
  • AIME 2025: American Invitational Mathematics Examination - evaluates elite mathematical reasoning and competition-level problem solving
  • Coding Index: Composite score of LiveCodeBench, SciCode, and other coding benchmarks - measures programming ability
  • Math Index: Composite score of AIME, MATH-500, and other mathematical reasoning tests - a sketch of how such a composite can be computed follows this list
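
Since the two indices above are just aggregations of per-benchmark scores, a minimal Python sketch can make the idea concrete. The equal weighting and the example scores below are illustrative assumptions; the leaderboard's actual index weighting may differ.

```python
# Minimal sketch of deriving a composite index from per-benchmark scores.
# Equal weighting is an assumption; the real Coding/Math Index may weight
# benchmarks differently.

def composite_index(scores: dict[str, float | None]) -> float | None:
    """Average the available benchmark scores (0-100), skipping missing ones."""
    available = [s for s in scores.values() if s is not None]
    return round(sum(available) / len(available), 1) if available else None

# Illustrative (made-up) per-benchmark scores:
print(composite_index({"LiveCodeBench": 57.0, "SciCode": 40.0}))  # 48.5
```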

Key Metrics to Consider

  • Token Pricing: Compare input vs output token costs per million - crucial for estimating API expenses and optimizing usage patterns (a cost sketch follows this list)
  • Inference Speed: Measured in tokens/second - determines response time for chatbots, streaming, and real-time applications
  • Release Date: Newer models often incorporate latest training techniques and updated knowledge cutoffs
  • Benchmark Scores: Percentage scores (0-100%) make it easy to compare model capabilities at a glance
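
To make the token-pricing math concrete, here is a minimal sketch of a per-call cost estimate, as referenced in the Token Pricing bullet above. The call sizes and per-million rates are illustrative placeholders, not quotes from any provider.

```python
# Estimate the cost of one API call from token counts and per-million-token
# rates. The rates below are illustrative placeholders, not real prices.

def call_cost(input_tokens: int, output_tokens: int,
              input_per_m: float, output_per_m: float) -> float:
    """Cost in dollars = tokens / 1,000,000 * price-per-million, summed."""
    return (input_tokens / 1e6) * input_per_m + (output_tokens / 1e6) * output_per_m

# e.g. a 2,000-token prompt with a 500-token reply at $0.05 in / $0.40 out per 1M:
print(f"${call_cost(2_000, 500, 0.05, 0.40):.6f}")  # $0.000300
```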

How to Choose the Right AI Model for Your Use Case

For Research & Analysis

Prioritize models with high MMLU-Pro (70%+) and GPQA (60%+) scores for complex reasoning tasks, academic research, and technical documentation - a filtering sketch using these cutoffs follows below

For Cost Optimization

Sort by input/output pricing - smaller models often deliver 80% of flagship performance at 10% of the cost for simple tasks

For Math & STEM

Filter by Math Index or AIME 2025 scores (50%+) for quantitative analysis, engineering calculations, and scientific applications
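
As a quick illustration of these threshold-based picks, the following sketch filters a handful of rows hand-copied from the table above by the Research & Analysis cutoffs. Holding the leaderboard as a list of dicts is an assumption for the example, not how the site stores its data.

```python
# Filter leaderboard rows by benchmark thresholds, mirroring the
# "For Research & Analysis" guidance above (MMLU-Pro >= 70, GPQA >= 60).
# Records are a hand-copied subset of the table.

models = [
    {"name": "DeepSeek V3.2 Speciale", "mmlu_pro": 86.3, "gpqa": 87.1},
    {"name": "Gemma 3 27B Instruct",   "mmlu_pro": 66.9, "gpqa": 42.8},
    {"name": "Olmo 3.1 32B Think",     "mmlu_pro": 76.3, "gpqa": 59.1},
    {"name": "Ring-1T",                "mmlu_pro": 80.6, "gpqa": 77.4},
]

research_ready = [m["name"] for m in models
                  if m["mmlu_pro"] >= 70 and m["gpqa"] >= 60]
print(research_ready)  # ['DeepSeek V3.2 Speciale', 'Ring-1T']
```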

All benchmark scores and pricing data are updated daily from Artificial Analysis to reflect the latest model versions and capabilities. Use the sort filters above to find AI models by intelligence, cost, coding ability, math performance, speed, or release date.

Frequently Asked Questions

What is MMLU-Pro and why is it the standard AI intelligence benchmark?

MMLU-Pro (Massive Multitask Language Understanding - Professional) is one of the most comprehensive AI benchmarks, testing models across 14 academic subjects including mathematics, science, history, law, and ethics. On this leaderboard, scores span from single digits for tiny on-device models to the high 80s for frontier reasoning models. Models scoring above 75% demonstrate strong general intelligence suitable for professional applications, while scores below 60% indicate limitations in complex reasoning tasks.

What does GPQA measure and which models score highest?

GPQA (Graduate-level Google-Proof Q&A) tests PhD-level reasoning with questions designed to be "Google-proof" - requiring deep understanding rather than simple fact retrieval. Top models like GPT-5.1 (87.3%), GPT-5 mini (82.8%), and o3 (82.7%) excel at GPQA, making them ideal for research, technical analysis, and complex problem-solving. Models below 50% GPQA struggle with advanced reasoning and may provide superficial answers to complex questions.

What is AIME 2025 and how does it evaluate AI mathematical ability?

AIME 2025 (American Invitational Mathematics Examination) is an elite math competition benchmark that tests advanced problem-solving, algebra, geometry, and number theory. Scores above 80% (like GPT-5 Codex at 98.7% or GPT-5.1 at 94%) indicate exceptional mathematical reasoning suitable for engineering, scientific computing, and quantitative analysis. Models scoring below 50% may struggle with multi-step mathematical problems or require explicit problem breakdown.

How is AI model pricing calculated and what's considered cost-effective?

AI model pricing is measured per 1 million tokens (approximately 750,000 words). Input pricing covers text you send, while output pricing covers generated responses. Budget models like GPT-5 nano cost $0.05/$0.40 per million tokens, mid-tier models like Llama 3.3 70B cost $0.54/$0.71, while premium models like GPT-5 cost $1.25/$10. For typical applications with a 3:1 input-to-output ratio, budget models can be 10-20x cheaper than flagship models while maintaining 70-80% of their performance.
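
Here is a worked sketch of that 3:1 blended-cost arithmetic, using the GPT-5 nano and GPT-5 rates quoted above; the 100M-token monthly workload is an arbitrary assumption for illustration.

```python
# Compare monthly spend for a budget vs. premium price point at the
# assumed 3:1 input-to-output token ratio. The 100M-token monthly
# workload is an arbitrary illustration.

def monthly_cost(total_tokens: float, input_per_m: float, output_per_m: float,
                 input_ratio: float = 0.75) -> float:
    """Blended cost assuming input_ratio of tokens are input (3:1 => 0.75)."""
    inp = total_tokens * input_ratio / 1e6
    out = total_tokens * (1 - input_ratio) / 1e6
    return inp * input_per_m + out * output_per_m

budget = monthly_cost(100e6, 0.05, 0.40)    # GPT-5 nano rates from above
premium = monthly_cost(100e6, 1.25, 10.00)  # GPT-5 rates from above
print(f"budget ${budget:.2f} vs premium ${premium:.2f}")  # budget $13.75 vs premium $343.75
print(f"premium / budget = {premium / budget:.0f}x")      # 25x
```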

Which AI models are best for coding and programming tasks?

Sort by Coding Index to see top programming models. Our Coding Index combines LiveCodeBench, SciCode, and other coding benchmarks. Top performers include GPT-5.1 (57.5 index), GPT-5 Codex (53.5), and GPT-5 mini (51.4). These models excel at code generation, debugging, refactoring, and explaining complex algorithms. For budget-conscious developers, models with 40+ Coding Index scores offer excellent value for routine programming tasks.

How often are AI model benchmarks and rankings updated?

Our leaderboard syncs daily with Artificial Analysis API to ensure benchmark scores (MMLU-Pro, GPQA, AIME 2025), pricing, and inference speed data reflect the latest model versions. New model releases appear immediately under the "Newest" sort option. Benchmark scores can change when providers release updated versions - for example, GPT-5.1 released in November 2025 achieved 69.7 intelligence compared to GPT-5's 68.5 from August 2025.
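
For readers curious what such a sync might look like, here is a rough sketch of fetching and trimming model records over HTTP. The endpoint URL, JSON field names, and authentication scheme are hypothetical placeholders, not the actual Artificial Analysis API schema; consult their documentation for the real interface.

```python
# Hypothetical daily-sync sketch: fetch leaderboard rows over HTTP and keep
# the fields this page displays. The URL and JSON field names below are
# placeholders, NOT the real Artificial Analysis API schema.
import requests

def fetch_models(api_url: str, api_key: str) -> list[dict]:
    resp = requests.get(api_url,
                        headers={"Authorization": f"Bearer {api_key}"},
                        timeout=30)
    resp.raise_for_status()
    return [
        {"name": m.get("name"), "mmlu_pro": m.get("mmlu_pro"),
         "gpqa": m.get("gpqa"), "aime_2025": m.get("aime_2025")}
        for m in resp.json()  # assumes the response body is a JSON list
    ]

# models = fetch_models("https://example.com/api/models", "YOUR_KEY")  # placeholder URL
```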

What inference speed (tokens/second) do I need for my application?

Inference speed determines how fast models generate responses. For real-time chatbots and interactive applications, target 100+ tokens/second (models like gpt-oss-120B at 340 tok/s). For background processing and batch jobs, 50-100 tok/s is sufficient. Premium reasoning models like GPT-5 (103 tok/s) balance speed and capability. Note that higher inference speed doesn't always mean better quality - slower models often deliver more thoughtful, detailed responses.
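
As a rough rule of thumb, generation time is output tokens divided by tokens/second. A minimal sketch using the speeds quoted above and an assumed 500-token reply (network latency and time-to-first-token overhead ignored):

```python
# Rough generation-time estimate: time ~= output_tokens / tokens_per_second.
# Ignores network and time-to-first-token overhead; numbers are illustrative.

def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    return output_tokens / tokens_per_second

for name, speed in [("gpt-oss-120B", 340), ("GPT-5", 103)]:
    t = generation_seconds(500, speed)  # a 500-token reply
    print(f"{name}: ~{t:.1f}s for 500 tokens")
# gpt-oss-120B: ~1.5s for 500 tokens
# GPT-5: ~4.9s for 500 tokens
```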

Can I test these AI models for free before committing?

Yes! Try our free AI chat interface to test different models instantly without creating an account. Many providers also offer free tiers: OpenAI (ChatGPT with daily limits), Anthropic (Claude with usage caps), Google (Gemini free tier), and open-source models like Llama 3.3. Compare performance on your specific use case before upgrading to paid plans.