UI-TARS 7B

by ByteDance

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it extends the UI-TARS framework with reinforcement...

Input Price: $0.10/1M tokens
Output Price: $0.20/1M tokens
Context Window: 128,000 tokens
Modalities: image, text

Specifications

Technical details and pricing.

Provider: ByteDance
Context Window: 128,000 tokens
Release Date: Jul 22, 2025
Modalities: Image, Text → Text
Capabilities: Vision

Frequently Asked Questions

What is UI-TARS 7B good for?

Use UI-TARS 7B for GUI automation tasks: interpreting screenshots and operating desktop interfaces, web browsers, mobile systems, and games.

How much does UI-TARS 7B cost?

Pricing is based on usage. Current rates are $0.10/1M tokens for input and $0.20/1M tokens for output.
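
As a rough illustration of the arithmetic, the sketch below estimates the cost of a single request from the listed rates. The function name and the example token counts are hypothetical, and the listing does not say how image inputs are converted into tokens, so the input count here is simply assumed to be known.

```python
# Rates copied from the listing above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Assumed example: 50,000 input tokens and 2,000 output tokens.
print(f"${estimate_cost_usd(50_000, 2_000):.4f}")  # prints $0.0054
```

At these rates, a request that fills the whole 128,000-token context window with input would cost about $0.0128 before any output tokens are counted.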

Can I try UI-TARS 7B for free?

Yes. You can start a chat instantly and test the model before deciding on a plan.

Does UI-TARS 7B support images or audio?

UI-TARS 7B accepts image and text input and produces text output. Audio is not among its listed modalities.

Pricing, context, and capability data are based on the live model catalog.