GPT-5 (medium) vs Grok 4
Comparing 2 AI models · 6 benchmarks · OpenAI, xAI
- Most Affordable: GPT-5 (medium) by OpenAI, $1.25/1M input tokens
- Highest Intelligence: Grok 4 by xAI, 87.7% GPQA
- Best for Coding: Grok 4 by xAI, 55.1 Coding Index
- Price Difference: 2.4x (input cost range)
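
The 2.4x input price difference and the blended costs in the pricing table can be reproduced with simple arithmetic. The sketch below is illustrative only: the page states a "3:1 input/output ratio" but not the exact formula, so the weighted average used here is an assumption, and the names `blended_cost` and `PRICES` are hypothetical. Under that assumption the results match the listed $3.44 and $6.00 figures.

```python
# Listed prices in USD per 1M tokens: (input, output).
PRICES = {
    "GPT-5 (medium)": (1.25, 10.00),
    "Grok 4": (3.00, 15.00),
}


def blended_cost(input_cost: float, output_cost: float,
                 input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: "3:1 input/output ratio" means three input tokens
    for every output token, averaged over all four.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_cost + output_weight * output_cost) / total_weight


# Blended cost per model under the assumed 3:1 weighting.
blended = {name: blended_cost(inp, out) for name, (inp, out) in PRICES.items()}

# Ratio of the more expensive input price to the cheaper one.
input_ratio = PRICES["Grok 4"][0] / PRICES["GPT-5 (medium)"][0]

for name, cost in blended.items():
    print(f"{name}: ${cost:.2f}/1M blended")
print(f"Input price ratio: {input_ratio:.1f}x")
```

Changing `input_weight` and `output_weight` shows how sensitive the comparison is to workload shape: output-heavy workloads narrow the gap, because the output prices differ by 1.5x rather than 2.4x.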
Benchmark Winners (6 tests)
GPT-5 (medium), OpenAI: 2
- MMLU Pro
- MATH 500
Grok 4, xAI: 4
- GPQA
- HLE
- LiveCodeBench
- AIME 2025
| Metric | GPT-5 (medium) | Grok 4 |
|---|---|---|
| Pricing (per 1M tokens) | | |
| Input Cost | $1.25/1M | $3.00/1M |
| Output Cost | $10.00/1M | $15.00/1M |
| Blended Cost (3:1 input/output ratio) | $3.44/1M | $6.00/1M |
| Specifications | | |
| Organization (model creator) | OpenAI | xAI |
| Release Date | Aug 7, 2025 | Jul 10, 2025 |
| Performance & Speed | | |
| Throughput (output speed) | — | 37.2 tok/s |
| Time to First Token, TTFT (initial response delay) | — | 9172 ms |
| Latency (time to first answer token) | — | 9172 ms |
| Composite Indices | | |
| Intelligence Index (overall reasoning capability) | 66.4 | 65.3 |
| Coding Index (programming ability) | 49.2 | 55.1 |
| Math Index (mathematical reasoning) | 91.7 | 92.7 |
| Standard Benchmarks | | |
| GPQA (graduate-level reasoning) | 84.2% | 87.7% |
| MMLU Pro (advanced knowledge) | 86.7% | 86.6% |
| HLE (Humanity's Last Exam) | 23.5% | 23.9% |
| LiveCodeBench (real-world coding tasks) | | |