Model Comparison

MiMo-V2.5 vs. Small 3.1

Comparing 2 AI models · 12 benchmarks · Xiaomi, Mistral

Recommended Pick

MiMo-V2.5 11 metric wins

Strongest on: Value, Reasoning, Intelligence

Best Value

MiMo-V2.5

100.0 value score

50.1 reasoning / $0.18/1M

Lowest Price

Small 3.1

$0.10/1M input price

Best Reasoning

MiMo-V2.5

50.1 reasoning score

Blends available reasoning benchmarks

Best for Coding

MiMo-V2.5

42.1 coding index

[Chart: Composite Indices. Higher is better; speed and price are normalized.]

[Chart: Standard Benchmarks. Only benchmarks with data are shown.]

Differences That Matter

Best value

MiMo-V2.5 has the strongest quality-to-price mix at 100.0 out of 100 value points.

Price gap

MiMo-V2.5 costs 1.4x as much as Small 3.1 on input tokens ($0.14 vs. $0.10 per 1M).

Speed gap

Small 3.1 generates about 2.1x as many tokens per second as MiMo-V2.5.

Reasoning gap

MiMo-V2.5 leads Small 3.1 by 29.2 points on reasoning.

Coding gap

MiMo-V2.5 leads Small 3.1 by 28.2 points on coding.
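
The gaps above are simple ratios and differences of the figures in the Full Comparison table. A minimal sketch that recomputes them, assuming the rounded table values are the inputs (the dictionary keys and helper names are illustrative, not part of the site):

```python
# Figures copied from the Full Comparison table on this page.
mimo = {"input_usd": 0.14, "tok_s": 72.4, "reasoning": 50.1, "coding": 42.1}
small = {"input_usd": 0.10, "tok_s": 152.4, "reasoning": 20.9, "coding": 13.9}


def ratio(a: float, b: float) -> float:
    """Return how many times larger a is than b."""
    return a / b


def gap(a: float, b: float) -> float:
    """Return the point difference a - b, rounded to one decimal."""
    return round(a - b, 1)


# Price: MiMo-V2.5 input price relative to Small 3.1.
input_price_ratio = ratio(mimo["input_usd"], small["input_usd"])
# Speed: Small 3.1 throughput relative to MiMo-V2.5.
speed_ratio = ratio(small["tok_s"], mimo["tok_s"])
# Score gaps: MiMo-V2.5 minus Small 3.1.
reasoning_gap = gap(mimo["reasoning"], small["reasoning"])
coding_gap = gap(mimo["coding"], small["coding"])

print(f"input price ratio: {input_price_ratio:.1f}x")
print(f"speed ratio:       {speed_ratio:.1f}x")
print(f"reasoning gap:     {reasoning_gap} points")
print(f"coding gap:        {coding_gap} points")
```

Ratios describe the price and speed gaps, while the score gaps are plain point differences on the composite indices.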

Full Comparison

Metric	MiMo-V2.5 (Xiaomi, top pick)	Small 3.1 (Mistral)
Pricing per 1M tokens
Input Cost	$0.14/1M	$0.10/1M
Output Cost	$0.28/1M	$0.23/1M
Blended (3:1)	$0.18/1M	$0.14/1M
Specifications
Organization	Xiaomi	Mistral
Release Date	Apr 22, 2026	Mar 17, 2025
Performance & Speed
Throughput	72.4 tok/s	152.4 tok/s
TTFT	1931ms	532ms
Latency	29541ms	532ms
Composite Indices
Value Score	100.0	53.1
Reasoning Score	50.1	20.9
Intelligence	40.1	8.6
Coding	42.1	13.9
Math	—	3.7
Standard Benchmarks
GPQA	84.9%	45.4%
MMLU Pro	—	65.9%
HLE	25.2%	4.8%
LiveCodeBench	—	21.2%
MATH 500	—	70.7%
AIME 2025	—	3.7%
AIME (Original)	—	9.3%
SciCode	43.1%	26.5%
LCR	62.7%	19.7%
IFBench	67.1%	29.9%
TAU-bench v2	90.6%	25.1%
TerminalBench Hard	41.7%	7.6%
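
The page does not define the "Blended (3:1)" price. One common reading is a weighted average that counts three input tokens for every output token. The sketch below assumes that reading; the function name and weights are illustrative. With the rounded prices from the table it gives about $0.175 and $0.133 per 1M tokens, close to the listed $0.18 and $0.14, so the site may be using unrounded prices or a slightly different formula.

```python
def blended_price(input_usd: float, output_usd: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumes "3:1" means three input tokens per output token.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_usd + output_weight * output_usd) / total_weight


# Input and output prices per 1M tokens, as listed in the table.
mimo_blended = blended_price(0.14, 0.28)
small_blended = blended_price(0.10, 0.23)

print(f"MiMo-V2.5 blended: ${mimo_blended:.4f}/1M")
print(f"Small 3.1 blended: ${small_blended:.4f}/1M")
```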