MiMo-V2-Omni

Name: MiMo-V2-Omni
Brand: Xiaomi

byXiaomi

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step planning, tool use, and code execution - making it well-suited for complex real-world tasks that span modalities, 256K context window.

Chat withMiMo-V2-Omni

Input Price$0.00/1M tokens

Output Price$0.00/1M tokens

Intelligence43.4

Coding35.5

Specifications

Technical details and pricing.

ProviderXiaomi

Context Window262,144 tokens

Release DateMar 19, 2026

ModalitiesText, Audio, Image, Video → Text

CapabilitiesVision, Audio Input

Benchmarks

7 benchmark scores from Artificial Analysis.

GPQA82.8%

HLE19.9%

SciCode36.7%

LCR66.7%

IFBench53.5%

Tau291.2%

TerminalBench Hard34.8%

Composite Indices

Intelligence, Coding, Math

Standard Benchmarks

Academic and industry benchmarks

Frequently Asked Questions

What is MiMo-V2-Omni good for?

Use MiMo-V2-Omni for everyday tasks like writing, summarizing, brainstorming, and getting clear explanations.

How much does MiMo-V2-Omni cost?

Pricing is based on usage. Current rates are $0.00/1M tokens for input and $0.00/1M tokens for output.

Can I try MiMo-V2-Omni for free?

Yes. You can start a chat instantly and test the model before deciding on a plan.

Does MiMo-V2-Omni support images or audio?

MiMo-V2-Omni can understand images.

Similar Models

Other models you might want to explore.

MiMo-V2-Pro

Xiaomi

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios.

Details →

MiMo-V2-Flash

Xiaomi

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi.

Details →

Riverflow V2 Pro

Sourceful

Riverflow V2 Pro is the most powerful variant of Sourceful's Riverflow 2.0 lineup, best for top-tier control and perfect text rendering.

Details →

Benchmarks and pricing are sourced from Artificial Analysis where available. OpenRouter specs are used as a fallback.