AI Model Ranking (LLM Leaderboard)
Most Intelligent AI Models
Language models ranked by GPQA reasoning score
Column definitions:
- Model: AI model name and provider organization.
- Price/1M: cost in USD per 1 million tokens, shown as input (text you send) / output (text the model generates).
- MMLU-Pro: Massive Multitask Language Understanding (Professional). Tests broad knowledge across 14 subjects including STEM, humanities, and social sciences.
- GPQA: Graduate-level Google-Proof Q&A benchmark. Tests PhD-level reasoning and advanced intelligence.
- AIME 2025: American Invitational Mathematics Examination 2025. Tests advanced mathematical problem-solving ability.
- Release: when the model was released. Newer models may have more capabilities.
- A dash (-) means no score is listed.

| Model | Price/1M (input / output) | MMLU-Pro | GPQA | AIME 2025 | Release | |
|---|---|---|---|---|---|---|
| #1 Gemini 3.1 Pro Preview by Google | $2.00 / $12.00 | - | 94.1% | - | Feb 19, 2026 | |
| #2 GPT-5.4 (xhigh) by OpenAI | $2.50 / $15.00 | - | 92.0% | - | Mar 5, 2026 | |
| #3 GPT-5.3 Codex (xhigh) by OpenAI | $1.75 / $14.00 | - | 91.5% | - | Feb 5, 2026 | |
| #4 Claude Opus 4.6 (Adaptive Reasoning, Max Effort) by Anthropic | $5.00 / $25.00 | - | 89.6% | - | Feb 5, 2026 | |
| #5 Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) by Anthropic | $3.00 / $15.00 | - | 87.5% | - | Feb 17, 2026 | |
| #6 GPT-5.2 (xhigh) by OpenAI | $1.75 / $14.00 | 87.4% | 90.3% | 99.0% | Dec 11, 2025 | |
| #7 GLM-5 (Reasoning) by Z AI | $1.00 / $3.20 | - | 82.0% | - | Feb 11, 2026 | |
| #8 Claude Opus 4.5 (Reasoning) by Anthropic | $5.00 / $25.00 | 89.5% | 86.6% | 91.3% | Nov 24, 2025 | |
| #9 GPT-5.2 Codex (xhigh) by OpenAI | $1.75 / $14.00 | - | 89.9% | - | Dec 11, 2025 | |
| #10 Gemini 3 Pro Preview (high) by Google | $2.00 / $12.00 | 89.8% | 90.8% | 95.7% | Nov 18, 2025 | |
| #11 GPT-5.1 (high) by OpenAI | $1.25 / $10.00 | 87.0% | 87.3% | 94.0% | Nov 13, 2025 | |
| #12 Kimi K2.5 (Reasoning) by Kimi | $0.60 / $3.00 | - | 87.9% | - | Jan 27, 2026 | |
| #13 GPT-5.2 (medium) by OpenAI | $1.75 / $14.00 | 85.9% | 86.4% | 96.7% | Dec 11, 2025 | |
| #14 Claude Opus 4.6 (Non-reasoning, High Effort) by Anthropic | $5.00 / $25.00 | - | 84.0% | - | Feb 5, 2026 | |
| #15 Gemini 3 Flash Preview (Reasoning) by Google | $0.50 / $3.00 | 89.0% | 89.8% | 97.0% | Dec 17, 2025 | |
| #16 Qwen3.5 397B A17B (Reasoning) by Alibaba | $0.60 / $3.60 | - | 89.3% | - | Feb 16, 2026 | |
| #17 GPT-5 (high) by OpenAI | $1.25 / $10.00 | 87.1% | 85.4% | 94.3% | Aug 7, 2025 | |
| #18 GPT-5 Codex (high) by OpenAI | $1.25 / $10.00 | 86.5% | 83.7% | 98.7% | Sep 23, 2025 | |
| #19 Claude Sonnet 4.6 (Non-reasoning, High Effort) by Anthropic | $3.00 / $15.00 | - | 79.9% | - | Feb 17, 2026 | |
| #20 GPT-5.1 Codex (high) by OpenAI | $1.25 / $10.00 | 86.0% | 86.0% | 95.7% | Nov 13, 2025 | |
| #21 Claude Opus 4.5 (Non-reasoning) by Anthropic | $5.00 / $25.00 | 88.9% | 81.0% | 62.7% | Nov 24, 2025 | |
| #22 Claude 4.5 Sonnet (Reasoning) by Anthropic | $3.00 / $15.00 | 87.5% | 83.4% | 88.0% | Sep 29, 2025 | |
| #23 Claude Sonnet 4.6 (Non-reasoning, Low Effort) by Anthropic | $3.00 / $15.00 | - | 79.7% | - | Feb 17, 2026 | |
| #24 Qwen3.5 27B (Reasoning) by Alibaba | $0.30 / $2.40 | - | 85.8% | - | Feb 24, 2026 | |
| #25 GLM-4.7 (Reasoning) by Z AI | $0.60 / $2.20 | 85.6% | 85.9% | 95.0% | Dec 22, 2025 | |
| #26 GPT-5 (medium) by OpenAI | $1.25 / $10.00 | 86.7% | 84.2% | 91.7% | Aug 7, 2025 | |
| #27 MiniMax-M2.5 by MiniMax | $0.30 / $1.20 | - | 84.8% | - | Feb 12, 2026 | |
| #28 DeepSeek V3.2 (Reasoning) by DeepSeek | $0.28 / $0.42 | 86.2% | 84.0% | 92.0% | Dec 1, 2025 | |
| #29 Qwen3.5 122B A10B (Reasoning) by Alibaba | $0.40 / $3.20 | - | 85.7% | - | Feb 24, 2026 | |
| #30 Grok 4 by xAI | $3.00 / $15.00 | 86.6% | 87.7% | 92.7% | Jul 10, 2025 | |
| #31 MiMo-V2-Flash (Feb 2026) by Xiaomi | $0.10 / $0.30 | - | 83.5% | - | Dec 16, 2025 | |
| #32 Gemini 3 Pro Preview (low) by Google | $2.00 / $12.00 | 89.5% | 88.7% | 86.7% | Nov 18, 2025 | |
| #33 GPT-5 mini (high) by OpenAI | $0.25 / $2.00 | 83.7% | 82.8% | 90.7% | Aug 7, 2025 | |
| #34 Kimi K2 Thinking by Kimi | $0.60 / $2.50 | 84.8% | 83.8% | 94.7% | Nov 6, 2025 | |
| #35 o3-pro by OpenAI | $20.00 / $80.00 | - | 84.5% | - | Jun 10, 2025 | |
| #36 GLM-5 (Non-reasoning) by Z AI | $1.00 / $3.20 | - | 66.6% | - | Feb 11, 2026 | |
| #37 Qwen3.5 397B A17B (Non-reasoning) by Alibaba | $0.60 / $3.60 | - | 86.1% | - | Feb 16, 2026 | |
| #38 Qwen3 Max Thinking by Alibaba | $1.20 / $6.00 | - | 86.1% | - | Jan 26, 2026 | |
| #39 MiniMax-M2.1 by MiniMax | $0.30 / $1.20 | 87.5% | 83.0% | 82.7% | Dec 23, 2025 | |
| #40 GPT-5 (low) by OpenAI | $1.25 / $10.00 | 86.0% | 80.8% | 83.0% | Aug 7, 2025 | |
| #41 MiMo-V2-Flash (Reasoning) by Xiaomi | $0.10 / $0.30 | 84.3% | 84.6% | 96.3% | Dec 16, 2025 | |
| #42 GPT-5 mini (medium) by OpenAI | $0.25 / $2.00 | 82.8% | 80.3% | 85.0% | Aug 7, 2025 | |
| #43 Claude 4 Sonnet (Reasoning) by Anthropic | $3.00 / $15.00 | 84.2% | 77.7% | 74.3% | May 22, 2025 | |
| #44 GPT-5.1 Codex mini (high) by OpenAI | $0.25 / $2.00 | 82.0% | 81.3% | 91.7% | Nov 13, 2025 | |
| #45 Grok 4.1 Fast (Reasoning) by xAI | $0.20 / $0.50 | 85.4% | 85.3% | 89.3% | Nov 19, 2025 | |
| #46 o3 by OpenAI | $2.00 / $8.00 | 85.3% | 82.7% | 88.3% | Apr 16, 2025 | |
| #47 Step 3.5 Flash by StepFun | $0.10 / $0.30 | - | 83.1% | - | Feb 2, 2026 | |
| #48 Kimi K2.5 (Non-reasoning) by Kimi | $0.60 / $3.00 | - | 78.9% | - | Jan 27, 2026 | |
| #49 Qwen3.5 27B (Non-reasoning) by Alibaba | $0.30 / $2.40 | - | 84.2% | - | Feb 24, 2026 | |
| #50 Claude 4.5 Haiku (Reasoning) by Anthropic | $1.00 / $5.00 | 76.0% | 67.2% | 83.7% | Oct 15, 2025 | |
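
As a rough illustration of how the Price/1M column translates into the cost of a single request, the sketch below multiplies token counts by the per-million prices. The function name `request_cost` and the token counts are hypothetical. The prices are copied from the GPT-5.4 (xhigh) and DeepSeek V3.2 (Reasoning) rows. It assumes simple linear billing with no caching discounts or tiered pricing, which the table does not specify.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request, given prices per 1 million tokens.

    Assumption: billing is linear in token count (no discounts or tiers).
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical request size: 10,000 input tokens and 2,000 output tokens.
INPUT_TOKENS = 10_000
OUTPUT_TOKENS = 2_000

# Prices (input, output) taken from the table above.
prices = {
    "GPT-5.4 (xhigh)": (2.50, 15.00),
    "DeepSeek V3.2 (Reasoning)": (0.28, 0.42),
}

for model, (price_in, price_out) in prices.items():
    cost = request_cost(INPUT_TOKENS, OUTPUT_TOKENS, price_in, price_out)
    print(f"{model}: ${cost:.4f} per request")
```

For this hypothetical request size, the GPT-5.4 (xhigh) row works out to roughly $0.055 and the DeepSeek V3.2 (Reasoning) row to roughly $0.0036. Reasoning models may emit many more output tokens than non-reasoning ones, so the output price tends to dominate in practice.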