Model Comparison

Sonar Reasoning vs. Step 3.7 Flash

Comparing 2 AI models · 9 benchmarks · Perplexity, StepFun

Recommended Pick

Sonar Reasoning: 5 metric wins

Strongest on: Input price, Output price, TTFT

Lowest Price

Sonar Reasoning

$0.00/1M input price

Best Reasoning

Sonar Reasoning

60.8 reasoning score

Blends available reasoning benchmarks

Best for Coding

Step 3.7 Flash

39.6 coding index

[Charts omitted: Composite Indices (higher is better; speed and price are normalized) and Standard Benchmarks (only benchmarks with data are shown). The values appear under Full Comparison.]

Differences That Matter

Price gap

Sonar Reasoning is listed at $0.00/1M input tokens versus $0.20/1M for Step 3.7 Flash; with a zero listed price, no finite price ratio can be computed.

Reasoning gap

Sonar Reasoning leads Step 3.7 Flash by 17.1 points on reasoning.
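
The two gaps above are simple arithmetic on the listed values. The following is a minimal sketch, assuming the $0.00 input price is taken at face value, which is why the ratio comes out unbounded. The function names `price_ratio` and `score_gap` are hypothetical and are not part of the comparison page.

```python
import math


def price_ratio(cheaper: float, pricier: float) -> float:
    """Return how many times cheaper the first price is than the second.

    A zero price has no finite ratio, so it is reported as infinity
    (or NaN when both prices are zero).
    """
    if cheaper == 0:
        return math.inf if pricier > 0 else math.nan
    return pricier / cheaper


def score_gap(leader: float, trailer: float) -> float:
    """Return the point difference between two scores, rounded to 0.1."""
    return round(leader - trailer, 1)


# Values copied from the comparison table (USD per 1M input tokens, reasoning score).
input_ratio = price_ratio(0.00, 0.20)
reasoning_gap = score_gap(60.8, 43.7)

print(input_ratio)    # inf
print(reasoning_gap)  # 17.1
```

An infinite ratio says only that one listed price is zero, so the absolute prices are more informative than the multiple.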

Top-pick rationale

Sonar Reasoning wins 5 measurable categories, including Input price, Output price, TTFT, and Latency.

Full Comparison

Metric	Sonar Reasoning (Perplexity, Top Pick)	Step 3.7 Flash (StepFun)
Pricing per 1M tokens
Input Cost	$0.00/1M	$0.20/1M
Output Cost	$0.00/1M	$1.15/1M
Blended (3:1)	—	$0.44/1M
Specifications
Organization	Perplexity	StepFun
Release Date	Jan 28, 2025	May 29, 2026
Performance & Speed
Throughput	—	389.7 tok/s
TTFT	—	1053ms
Latency	—	6184ms
Composite Indices
Value Score	—	100.0
Reasoning Score	60.8	43.7
Intelligence	11.7	30.3
Coding	—	39.6
Standard Benchmarks
GPQA	62.3%	80.9%
HLE	—	19.9%
MATH 500	92.1%	—
AIME (Original)	77.0%	—
SciCode	—	40.0%
LCR	—	63.7%
IFBench	—	67.3%
TAU-bench v2	—	98.5%
TerminalBench Hard	—	35.6%
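
The page does not define the "Blended (3:1)" row. One plausible reading, sketched below, is a weighted average of three parts input price to one part output price. Under that assumption the Step 3.7 Flash figures reproduce the listed $0.44/1M. The function name `blended_price` and its parameters are hypothetical.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumes "3:1" means three input tokens for every output token.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_per_m + output_weight * output_per_m) / total_weight


# Step 3.7 Flash prices from the table (USD per 1M tokens).
step_blended = blended_price(0.20, 1.15)
print(f"${step_blended:.2f}/1M")  # $0.44/1M
```

The calculation is (3 × 0.20 + 1 × 1.15) / 4 = 0.4375, which rounds to the $0.44 in the table. Sonar Reasoning shows no blended value on the page.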