ByteDance Models
UI-TARS 7B
7B by ByteDance
UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobile systems, and games. Developed by ByteDance, it extends the UI-TARS framework with reinforcement...
Input Price: $0.10/1M tokens
Output Price: $0.20/1M tokens
Context Window: 128,000 tokens
Modalities: image, text
Specifications
Technical details and pricing.
Provider: ByteDance
Context Window: 128,000 tokens
Release Date: Jul 22, 2025
Modalities: Image, Text → Text
Capabilities: Vision
Frequently Asked Questions
What is UI-TARS 7B good for?
Use UI-TARS 7B for tasks that involve reading and acting on graphical interfaces: desktop applications, web browsers, mobile systems, and games. It takes images and text as input and returns text.
How much does UI-TARS 7B cost?
Pricing is based on usage. Current rates are $0.10/1M tokens for input and $0.20/1M tokens for output.
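To make the per-token rates concrete, here is a minimal sketch of the arithmetic, assuming cost scales linearly with token counts at the listed rates. The function name `estimate_cost` is a hypothetical helper for illustration, not part of any ByteDance or catalog API, and how image inputs are converted into tokens is not stated on this page.

```python
# Listed rates in USD per 1,000,000 tokens (taken from the pricing above).
INPUT_PRICE_PER_MILLION = 0.10
OUTPUT_PRICE_PER_MILLION = 0.20


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes simple linear pricing with no minimum charge or discounts.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 50,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost(50_000, 2_000):.4f}")
```

Under these assumptions, a request with 50,000 input tokens and 2,000 output tokens comes to roughly half a cent, with the input side dominating.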
Can I try UI-TARS 7B for free?
Yes. You can start a chat instantly and test the model before deciding on a plan.
Does UI-TARS 7B support images or audio?
UI-TARS 7B accepts image and text input and produces text output. Audio is not among its listed modalities.
Pricing, context, and capability data are based on the live model catalog.