Step3
by StepFun
Step3 is a multimodal reasoning model built on a Mixture-of-Experts architecture with 321B total parameters and 38B active. It is designed end-to-end to minimize decoding costs while delivering strong performance in vision–language reasoning. Through the co-design of Multi-Matrix Factorization Attention (MFA) and Attention-FFN Disaggregation (AFD), Step3 stays efficient on both flagship and low-end accelerators.
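As a rough illustration of the parameter counts above, the short sketch below computes what share of the model's parameters is active. It assumes "active" means parameters used per token, which is the usual Mixture-of-Experts reading; the function name `active_fraction` is hypothetical and only serves this example.

```python
# Figures taken from the model description; everything else is illustrative.
TOTAL_PARAMS_B = 321   # total parameters, in billions
ACTIVE_PARAMS_B = 38   # active parameters, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are active (hypothetical helper)."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    # Prints: Active share: 11.8%
    print(f"Active share: {frac:.1%}")
```

Under that assumption, roughly one parameter in eight takes part in each forward step, which is consistent with the stated goal of lower decoding cost.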
Pricing
- Input Tokens: Free (per 1M tokens)
- Output Tokens: Free (per 1M tokens)
- Image Processing: $0.00 per 1M tokens
Supported Modalities
- Input: image, text
- Output: text
Specifications
- Context Length: 66K tokens
- Provider: StepFun
- Released: Aug 28, 2025
- Model ID: stepfun-ai/step3