AI Model Ranking (LLM Leaderboard)

Newest AI Models

Latest language model releases sorted by date

Model
AI model name and provider organization
Price/1M
Cost per 1 million tokens - input (text you send) / output (text the model generates)
MMLU-Pro
Massive Multitask Language Understanding (Professional) - tests broad knowledge across 14 subjects including STEM, humanities, and social sciences
GPQA
Graduate-level Google-Proof Q&A benchmark - tests PhD-level reasoning and advanced intelligence
AIME 2025
American Invitational Mathematics Examination 2025 - tests advanced mathematical problem-solving ability
Release
When the model was released - newer models may have more capabilities
| # | Model | Provider | Price/1M (input / output) | MMLU-Pro | GPQA | AIME 2025 | Release |
|---|---|---|---|---|---|---|---|
| 1 | Muse Spark | Meta | N/A / N/A | - | 88.4% | - | Apr 8, 2026 |
| 2 | Grok 4.20 0309 v2 (Reasoning) | xAI | $2.00 / $6.00 | - | 91.1% | - | Apr 7, 2026 |
| 3 | GLM-5.1 (Non-reasoning) | Z AI | $1.40 / $4.40 | - | 83.9% | - | Apr 7, 2026 |
| 4 | GLM-5.1 (Reasoning) | Z AI | $1.40 / $4.40 | - | 86.8% | - | Apr 7, 2026 |
| 5 | Solar Pro 3 | Upstage | N/A / N/A | - | 72.4% | - | Apr 6, 2026 |
| 6 | Gemma 4 E4B (Reasoning) | Google | N/A / N/A | - | 57.6% | - | Apr 3, 2026 |
| 7 | Gemma 4 E4B (Non-reasoning) | Google | N/A / N/A | - | 54.9% | - | Apr 3, 2026 |
| 8 | Gemma 4 31B (Reasoning) | Google | N/A / N/A | - | 85.7% | - | Apr 2, 2026 |
| 9 | Gemma 4 26B A4B (Reasoning) | Google | $0.13 / $0.40 | - | 79.2% | - | Apr 2, 2026 |
| 10 | Gemma 4 E2B (Non-reasoning) | Google | N/A / N/A | - | 40.5% | - | Apr 2, 2026 |
| 11 | Gemma 4 E2B (Reasoning) | Google | N/A / N/A | - | 43.3% | - | Apr 2, 2026 |
| 12 | Gemma 4 31B (Non-reasoning) | Google | N/A / N/A | - | 76.3% | - | Apr 2, 2026 |
| 13 | Gemma 4 26B A4B (Non-reasoning) | Google | N/A / N/A | - | 71.4% | - | Apr 2, 2026 |
| 14 | Qwen3.6 Plus | Alibaba | $0.50 / $3.00 | - | 88.2% | - | Apr 2, 2026 |
| 15 | Trinity Large Thinking | Arcee AI | $0.23 / $0.88 | - | 75.2% | - | Apr 1, 2026 |
| 16 | GLM 5V Turbo (Reasoning) | Z AI | N/A / N/A | - | 80.9% | - | Apr 1, 2026 |
| 17 | Qwen3.5 Omni Plus | Alibaba | $0.40 / $4.80 | - | 82.6% | - | Mar 30, 2026 |
| 18 | Qwen3.5 Omni Flash | Alibaba | $0.10 / $0.80 | - | 74.2% | - | Mar 30, 2026 |
| 19 | KAT Coder Pro V2 | KwaiKAT | $0.30 / $1.20 | - | 85.5% | - | Mar 27, 2026 |
| 20 | Nemotron Cascade 2 30B A3B | NVIDIA | N/A / N/A | - | 76.3% | - | Mar 19, 2026 |
| 21 | MiMo-V2-Omni | Xiaomi | N/A / N/A | - | 82.8% | - | Mar 19, 2026 |
| 22 | MiniMax-M2.7 | MiniMax | $0.30 / $1.20 | - | 87.4% | - | Mar 18, 2026 |
| 23 | MiMo-V2-Pro | Xiaomi | $1.00 / $3.00 | - | 87.0% | - | Mar 18, 2026 |
| 24 | GPT-5.4 mini (medium) | OpenAI | $0.75 / $4.50 | - | 82.3% | - | Mar 17, 2026 |
| 25 | GPT-5.4 nano (xhigh) | OpenAI | $0.20 / $1.25 | - | 81.7% | - | Mar 17, 2026 |
| 26 | GPT-5.4 mini (xhigh) | OpenAI | $0.75 / $4.50 | - | 87.5% | - | Mar 17, 2026 |
| 27 | GPT-5.4 mini (Non-Reasoning) | OpenAI | $0.75 / $4.50 | - | 60.6% | - | Mar 17, 2026 |
| 28 | GPT-5.4 nano (medium) | OpenAI | $0.20 / $1.25 | - | 76.1% | - | Mar 17, 2026 |
| 29 | GPT-5.4 nano (Non-Reasoning) | OpenAI | $0.20 / $1.25 | - | 55.8% | - | Mar 17, 2026 |
| 30 | Mistral Small 4 (Non-reasoning) | Mistral | $0.15 / $0.60 | - | 57.1% | - | Mar 16, 2026 |
| 31 | Mistral Small 4 (Reasoning) | Mistral | $0.15 / $0.60 | - | 76.9% | - | Mar 16, 2026 |
| 32 | NVIDIA Nemotron 3 Nano 4B | NVIDIA | N/A / N/A | - | 51.3% | - | Mar 16, 2026 |
| 33 | GLM-5-Turbo | Z AI | N/A / N/A | - | 84.7% | - | Mar 15, 2026 |
| 34 | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | NVIDIA | $0.30 / $0.75 | - | 80.0% | - | Mar 11, 2026 |
| 35 | Grok 4.20 0309 (Reasoning) | xAI | $2.00 / $6.00 | - | 88.5% | - | Mar 10, 2026 |
| 36 | Grok 4.20 0309 (Non-reasoning) | xAI | $2.00 / $6.00 | - | 78.5% | - | Mar 10, 2026 |
| 37 | Sarvam 105B (high) | Sarvam | N/A / N/A | - | 73.8% | - | Mar 6, 2026 |
| 38 | Sarvam 30B (high) | Sarvam | N/A / N/A | - | 63.3% | - | Mar 6, 2026 |
| 39 | GPT-5.4 (xhigh) | OpenAI | $2.50 / $15.00 | - | 92.0% | - | Mar 5, 2026 |
| 40 | GPT-5.4 (Non-reasoning) | OpenAI | $2.50 / $15.00 | - | 74.8% | - | Mar 5, 2026 |
| 41 | GPT-5.4 Pro (xhigh) | OpenAI | $30.00 / $180.00 | - | - | - | Mar 5, 2026 |
| 42 | Gemini 3.1 Flash-Lite Preview | Google | $0.25 / $1.50 | - | 82.2% | - | Mar 3, 2026 |
| 43 | Qwen3.5 2B (Reasoning) | Alibaba | $0.02 / $0.10 | - | 45.6% | - | Mar 2, 2026 |
| 44 | Qwen3.5 9B (Reasoning) | Alibaba | $0.07 / $0.17 | - | 80.6% | - | Mar 2, 2026 |
| 45 | Qwen3.5 9B (Non-reasoning) | Alibaba | $0.04 / $0.20 | - | 78.6% | - | Mar 2, 2026 |
| 46 | Qwen3.5 2B (Non-reasoning) | Alibaba | $0.02 / $0.10 | - | 43.8% | - | Mar 2, 2026 |
| 47 | Qwen3.5 0.8B (Reasoning) | Alibaba | $0.01 / $0.05 | - | 11.1% | - | Mar 2, 2026 |
| 48 | Qwen3.5 4B (Reasoning) | Alibaba | $0.03 / $0.15 | - | 77.1% | - | Mar 2, 2026 |
| 49 | Qwen3.5 0.8B (Non-reasoning) | Alibaba | $0.01 / $0.05 | - | 23.6% | - | Mar 2, 2026 |
| 50 | Qwen3.5 4B (Non-reasoning) | Alibaba | $0.03 / $0.15 | - | 71.2% | - | Mar 2, 2026 |
| 51 | LFM2 24B A2B | Liquid AI | $0.03 / $0.12 | - | 47.4% | - | Feb 25, 2026 |
| 52 | Qwen3.5 27B (Reasoning) | Alibaba | $0.30 / $2.40 | - | 85.8% | - | Feb 24, 2026 |
| 53 | Qwen3.5 35B A3B (Reasoning) | Alibaba | $0.25 / $2.00 | - | 84.5% | - | Feb 24, 2026 |
| 54 | Qwen3.5 122B A10B (Non-reasoning) | Alibaba | $0.40 / $3.20 | - | 82.7% | - | Feb 24, 2026 |
| 55 | Qwen3.5 122B A10B (Reasoning) | Alibaba | $0.40 / $3.20 | - | 85.7% | - | Feb 24, 2026 |
| 56 | Qwen3.5 35B A3B (Non-reasoning) | Alibaba | $0.25 / $2.00 | - | 81.9% | - | Feb 24, 2026 |
| 57 | Qwen3.5 27B (Non-reasoning) | Alibaba | $0.30 / $2.40 | - | 84.2% | - | Feb 24, 2026 |
| 58 | Mercury 2 | Inception | $0.25 / $0.75 | - | 77.0% | - | Feb 20, 2026 |
| 59 | Gemini 3.1 Pro Preview | Google | $2.00 / $12.00 | - | 94.1% | - | Feb 19, 2026 |
| 60 | Claude Sonnet 4.6 (Non-reasoning, High Effort) | Anthropic | $3.00 / $15.00 | - | 79.9% | - | Feb 17, 2026 |
| 61 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | $3.00 / $15.00 | - | 87.5% | - | Feb 17, 2026 |
| 62 | Claude Sonnet 4.6 (Non-reasoning, Low Effort) | Anthropic | $3.00 / $15.00 | - | 79.7% | - | Feb 17, 2026 |
| 63 | Tiny Aya Global | Cohere | N/A / N/A | - | 30.5% | - | Feb 17, 2026 |
| 64 | Qwen3.5 397B A17B (Reasoning) | Alibaba | $0.60 / $3.60 | - | 89.3% | - | Feb 16, 2026 |
| 65 | Qwen3.5 397B A17B (Non-reasoning) | Alibaba | $0.60 / $3.60 | - | 86.1% | - | Feb 16, 2026 |
| 66 | MiniMax-M2.5 | MiniMax | $0.30 / $1.20 | - | 84.8% | - | Feb 12, 2026 |
| 67 | Nanbeige4.1-3B | Nanbeige | N/A / N/A | - | 84.9% | - | Feb 11, 2026 |
| 68 | GLM-5 (Reasoning) | Z AI | $1.00 / $3.20 | - | 82.0% | - | Feb 11, 2026 |
| 69 | GLM-5 (Non-reasoning) | Z AI | $1.00 / $3.20 | - | 66.6% | - | Feb 11, 2026 |
| 70 | Tri-21B-think Preview | Trillion Labs | N/A / N/A | - | 53.8% | - | Feb 10, 2026 |
| 71 | Tri-21B-Think | Trillion Labs | N/A / N/A | - | 60.1% | - | Feb 10, 2026 |
| 72 | GPT-5.3 Codex (xhigh) | OpenAI | $1.75 / $14.00 | - | 91.5% | - | Feb 5, 2026 |
| 73 | Gemini 3 Deep Think | Google | N/A / N/A | - | - | - | Feb 5, 2026 |
| 74 | Claude Opus 4.6 (Non-reasoning, High Effort) | Anthropic | $5.00 / $25.00 | - | 84.0% | - | Feb 5, 2026 |
| 75 | Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | $5.00 / $25.00 | - | 89.6% | - | Feb 5, 2026 |
| 76 | Qwen3 Coder Next | Alibaba | $0.35 / $1.20 | - | 73.7% | - | Feb 3, 2026 |
| 77 | Step 3.5 Flash | StepFun | $0.10 / $0.30 | - | 83.1% | - | Feb 2, 2026 |
| 78 | LongCat Flash Lite | LongCat | N/A / N/A | - | 63.6% | - | Jan 28, 2026 |
| 79 | Kimi K2.5 (Reasoning) | Kimi | $0.60 / $3.00 | - | 87.9% | - | Jan 27, 2026 |
| 80 | Kimi K2.5 (Non-reasoning) | Kimi | $0.60 / $3.00 | - | 78.9% | - | Jan 27, 2026 |
| 81 | Qwen3 Max Thinking | Alibaba | $1.20 / $6.00 | - | 86.1% | - | Jan 26, 2026 |
| 82 | LFM2.5-1.2B-Thinking | Liquid AI | N/A / N/A | - | 33.9% | - | Jan 20, 2026 |
| 83 | Step3 VL 10B | StepFun | N/A / N/A | - | 69.0% | - | Jan 20, 2026 |
| 84 | GLM-4.7-Flash (Reasoning) | Z AI | $0.07 / $0.40 | - | 58.1% | - | Jan 19, 2026 |
| 85 | GLM-4.7-Flash (Non-reasoning) | Z AI | $0.07 / $0.40 | - | 45.2% | - | Jan 19, 2026 |
| 86 | Olmo 3.1 32B Instruct | Allen Institute for AI | $0.20 / $0.60 | - | 53.9% | - | Jan 13, 2026 |
| 87 | LFM2.5-VL-1.6B | Liquid AI | N/A / N/A | - | 28.9% | - | Jan 5, 2026 |
| 88 | LFM2.5-1.2B-Instruct | Liquid AI | N/A / N/A | - | 32.6% | - | Jan 5, 2026 |
| 89 | Falcon-H1R-7B | TII UAE | N/A / N/A | 72.5% | 66.1% | 80.0% | Jan 4, 2026 |
| 90 | K-EXAONE (Reasoning) | LG AI Research | N/A / N/A | 83.8% | 78.3% | 90.3% | Dec 31, 2025 |
| 91 | K-EXAONE (Non-reasoning) | LG AI Research | N/A / N/A | 81.0% | 69.5% | 44.0% | Dec 31, 2025 |
| 92 | HyperCLOVA X SEED Think (32B) | Naver | N/A / N/A | 78.5% | 61.5% | 59.0% | Dec 26, 2025 |
| 93 | MiniMax-M2.1 | MiniMax | $0.30 / $1.20 | 87.5% | 83.0% | 82.7% | Dec 23, 2025 |
| 94 | GLM-4.7 (Non-reasoning) | Z AI | $0.55 / $2.15 | 79.4% | 66.4% | 48.0% | Dec 22, 2025 |
| 95 | GLM-4.7 (Reasoning) | Z AI | $0.60 / $2.20 | 85.6% | 85.9% | 95.0% | Dec 22, 2025 |
| 96 | Gemini 3 Flash Preview (Reasoning) | Google | $0.50 / $3.00 | 89.0% | 89.8% | 97.0% | Dec 17, 2025 |
| 97 | Gemini 3 Flash Preview (Non-reasoning) | Google | $0.50 / $3.00 | 88.2% | 81.2% | 55.7% | Dec 17, 2025 |
| 98 | Solar Open 100B (Reasoning) | Upstage | N/A / N/A | - | 65.7% | - | Dec 17, 2025 |
| 99 | MiMo-V2-Flash (Feb 2026) | Xiaomi | $0.10 / $0.30 | - | 83.5% | - | Dec 16, 2025 |
| 100 | MiMo-V2-Flash (Non-reasoning) | Xiaomi | $0.10 / $0.30 | 74.4% | 65.6% | 67.7% | Dec 16, 2025 |

Showing 100 of 474 models

Made in Europe

Chat with 100+ AI Models in one App.

Use Claude, ChatGPT, and Gemini alongside EU-hosted models like DeepSeek, GLM-5, Kimi K2.5, and many more.

Understanding the AI Model Leaderboard

This AI model leaderboard helps you compare and choose the best large language models (LLMs) for your needs. We track standardized AI benchmarks, token pricing, inference speed, and model capabilities across major AI providers such as OpenAI, Anthropic, Google, Meta, and DeepSeek.

Core AI Benchmarks Explained

MMLU-Pro: Tests broad knowledge across 14 academic subjects
GPQA: PhD-level reasoning & problem-solving
AIME 2025: Elite mathematical reasoning
Coding Index: LiveCodeBench + SciCode composite
Math Index: AIME + MATH-500 composite

Key Metrics to Consider

Token Pricing: Input vs output cost per 1M tokens
Inference Speed: Tokens/sec for response time
Release Date: Latest techniques & knowledge
Benchmark Scores: 0-100% capability comparison

How to Choose the Right AI Model for Your Use Case

For Research & Analysis

Prioritize models with high MMLU-Pro (70%+) and GPQA (60%+) scores for complex reasoning tasks, academic research, and technical documentation.

For Cost Optimization

Sort by input/output pricing. For simple tasks, smaller models often deliver around 80% of flagship performance at around 10% of the cost.

For Math & STEM

Filter by Math Index or AIME 2025 scores (50%+) for quantitative analysis, engineering calculations, and scientific applications.
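The thresholds above can be applied mechanically once the scores are in a list. The following is a minimal sketch in Python, not the site's own filtering logic: the `Model` class and the `for_research` and `for_math` helpers are hypothetical names introduced for illustration, and the four sample rows are copied from the leaderboard entries that report all three benchmarks.

```python
from dataclasses import dataclass


@dataclass
class Model:
    """One leaderboard row; scores are percentages (0-100)."""
    name: str
    mmlu_pro: float
    gpqa: float
    aime_2025: float


# Sample rows taken from the leaderboard above (illustrative subset only).
models = [
    Model("MiniMax-M2.1", 87.5, 83.0, 82.7),
    Model("GLM-4.7 (Non-reasoning)", 79.4, 66.4, 48.0),
    Model("Falcon-H1R-7B", 72.5, 66.1, 80.0),
    Model("K-EXAONE (Non-reasoning)", 81.0, 69.5, 44.0),
]


def for_research(rows, min_mmlu=70.0, min_gpqa=60.0):
    """Keep models meeting the research thresholds (MMLU-Pro 70%+, GPQA 60%+)."""
    return [m for m in rows if m.mmlu_pro >= min_mmlu and m.gpqa >= min_gpqa]


def for_math(rows, min_aime=50.0):
    """Keep models meeting the math threshold (AIME 2025 50%+)."""
    return [m for m in rows if m.aime_2025 >= min_aime]


print("Research:", [m.name for m in for_research(models)])
print("Math:", [m.name for m in for_math(models)])
```

All four sample rows clear the research thresholds, while only MiniMax-M2.1 and Falcon-H1R-7B clear the 50% AIME 2025 bar. Rows with a missing score ("-") would need separate handling, which this sketch leaves out.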

All benchmark scores and pricing data are updated daily from Artificial Analysis to reflect the latest model versions and capabilities. Use the sort filters above to find AI models by intelligence, cost, coding ability, math performance, speed, or release date.

Frequently Asked Questions

What is MMLU-Pro and why is it the standard AI intelligence benchmark?

MMLU-Pro (Massive Multitask Language Understanding - Professional) is one of the most comprehensive AI benchmarks, testing models across 14 academic subjects including mathematics, science, history, law, and ethics. Scores typically range from 46% (basic competency) to 87% (near-expert level), although some recent models on this leaderboard score higher, such as Gemini 3 Flash Preview (Reasoning) at 89.0%. Models scoring above 75% demonstrate strong general intelligence suitable for professional applications, while scores below 60% indicate limitations in complex reasoning tasks.

What does GPQA measure and which models score highest?

GPQA (Graduate-level Google-Proof Q&A) tests PhD-level reasoning with questions designed to be "Google-proof" - requiring deep understanding rather than simple fact retrieval. Top models like GPT-5.1 (87.3%), GPT-5 mini (82.8%), and o3 (82.7%) excel at GPQA, making them ideal for research, technical analysis, and complex problem-solving. Models below 50% GPQA struggle with advanced reasoning and may provide superficial answers to complex questions.

What is AIME 2025 and how does it evaluate AI mathematical ability?

AIME 2025 (American Invitational Mathematics Examination) is an elite math competition benchmark that tests advanced problem-solving, algebra, geometry, and number theory. Scores above 80% (like GPT-5 Codex at 98.7% or GPT-5.1 at 94%) indicate exceptional mathematical reasoning suitable for engineering, scientific computing, and quantitative analysis. Models scoring below 50% may struggle with multi-step mathematical problems or require explicit problem breakdown.

How is AI model pricing calculated and what's considered cost-effective?

AI model pricing is measured per 1 million tokens (approximately 750,000 words). Input pricing covers the text you send, while output pricing covers generated responses. Prices are quoted as input/output per million tokens: for example, GPT-5 nano costs $0.05/$0.40, Llama 3.3 70B costs $0.54/$0.71, and a premium model like GPT-5 costs $1.25/$10. For typical applications with a 3:1 input-to-output ratio, budget models can be 10-20x cheaper than flagship models while maintaining 70-80% of their performance.
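To make the 3:1 input-to-output ratio concrete, the sketch below computes a blended price per 1M tokens as a weighted average of the input and output prices. This is one reasonable reading of the ratio, not a formula stated on this page; `blended_price` is a hypothetical helper, and the prices are the ones quoted in the answer above.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted-average USD price per 1M tokens for a given input:output mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# (input, output) prices in USD per 1M tokens, as quoted in the answer above.
prices = {
    "GPT-5 nano": (0.05, 0.40),
    "Llama 3.3 70B": (0.54, 0.71),
    "GPT-5": (1.25, 10.00),
}

blended = {name: blended_price(i, o) for name, (i, o) in prices.items()}
for name, price in blended.items():
    print(f"{name}: ${price:.4f} per 1M tokens (3:1 blend)")
```

Changing `input_parts` and `output_parts` lets you model workloads that are more output-heavy, where the output price dominates the total.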

Which AI models are best for coding and programming tasks?

Sort by Coding Index to see top programming models. Our Coding Index combines LiveCodeBench, SciCode, and other coding benchmarks. Top performers include GPT-5.1 (57.5 index), GPT-5 Codex (53.5), and GPT-5 mini (51.4). These models excel at code generation, debugging, refactoring, and explaining complex algorithms. For budget-conscious developers, models with 40+ coding index scores offer excellent value for routine programming tasks.

How often are AI model benchmarks and rankings updated?

Our leaderboard syncs daily with the Artificial Analysis API to ensure benchmark scores (MMLU-Pro, GPQA, AIME 2025), pricing, and inference speed data reflect the latest model versions. New model releases appear immediately under the "Newest" sort option. Benchmark scores can change when providers release updated versions. For example, GPT-5.1, released in November 2025, achieved an intelligence score of 69.7, compared to 68.5 for GPT-5 from August 2025.

What inference speed (tokens/second) do I need for my application?

Inference speed determines how fast models generate responses. For real-time chatbots and interactive applications, target 100+ tokens/second (models like gpt-oss-120B at 340 tok/s). For background processing and batch jobs, 50-100 tok/s is sufficient. Premium reasoning models like GPT-5 (103 tok/s) balance speed and capability. Note that higher inference speed doesn't always mean better quality - slower models often deliver more thoughtful, detailed responses.
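As a rough illustration of what these speeds mean for response time, the sketch below divides the number of output tokens by the generation speed. It is a simplification that ignores time to first token and network latency; `generation_seconds` is a hypothetical helper, the 500-token response length is an assumed example, and the speeds are the ones quoted above.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate generation time, ignoring time to first token and network latency."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Speeds in tokens/second, as quoted in the answer above.
speeds = {"gpt-oss-120B": 340.0, "GPT-5": 103.0}

response_tokens = 500  # assumed length of a medium-sized answer
for name, speed in speeds.items():
    seconds = generation_seconds(response_tokens, speed)
    print(f"{name}: about {seconds:.1f} s for {response_tokens} tokens")
```

Under these assumptions, a 500-token answer takes about 1.5 seconds at 340 tok/s and about 4.9 seconds at 103 tok/s.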

Can I test these AI models for free before committing?

Yes! Try our free AI chat interface to test different models instantly without creating an account. Many providers also offer free tiers: OpenAI (ChatGPT with daily limits), Anthropic (Claude with usage caps), Google (Gemini free tier), and open-source models like Llama 3.3. Compare performance on your specific use case before upgrading to paid plans.
