Qwen logo

Qwen3 VL 30B A3B Thinking

30B

by Qwen

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks. It excels in perception of real-world/synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, achieving competitive multimodal benchmark results. For agentic use, it handles multi-image multi-turn instructions, video timeline alignments, GUI automation, and visual coding from sketches to debugged UI. Text performance matches flagship Qwen3 models, suiting document AI, OCR, UI assistance, spatial tasks, and agent research.

Chat with Qwen3 VL 30B A3B Thinking

Capabilities

Vision

Pricing

Input Tokens
Per 1M tokens
Free
Output Tokens
Per 1M tokens
Free
Image Processing
Per 1M tokens
$0.00/1M tokens

Supported Modalities

Input

text
image

Output

text

Specifications

Context Length
131K tokens
Provider
Qwen
Released
Oct 6, 2025
Model ID
qwen/qwen3-vl-30b-a3b-thinking

Ready to try it?

Start chatting with Qwen3 VL 30B A3B Thinking right now. No credit card required.

Start Chatting

More from Qwen

View all models