UI-TARS-desktop

Browser

ByteDance

Product overview

Name of Agent: UI-TARS-desktop
Short description of agent: UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model. (link)
Date of release: 03/07/2025: Agent TARS Beta Release (link); 18/03/2025: Agent TARS App (Preview) (link, archived)
Advertised use: It primarily ships local and remote computer operators as well as browser operators (link)
Monetisation/Usage price: The Agent TARS / UI-TARS application layer is free to run; users can query Doubao model APIs from VolcanoEngine (link, archived).
Who is using it?: Open‑source users and researchers
Category: Browser

Company & accountability

Developer: ByteDance
Name of legal entity: Beijing Douyin Information Service Co., Ltd. (link, archived)
Place of legal incorporation: Beijing, China (link, archived)
For profit company?: Yes (link, archived)
Parent company?: Not applicable
Governance documents analysis: No dedicated terms of service or privacy policy for the agent could be found.
AI safety/trust framework: None found
Compliance with existing standards: None found

Technical capabilities & system architecture

Model specifications: "UI-TARS Desktop is a native GUI agent for your local computer, driven by UI-TARS and Seed-1.5-VL/1.6 series models." (link)
Observation space: Screenshots and GUI metadata – the agent receives pixel‑level screenshots and extracts GUI element information (type, text, coordinates) to form observations. (link)
Action space: "Supports common desktop operations: mouse clicks (single, double, right), drag actions, keyboard shortcuts, text input, scrolling, etc."
Memory architecture: Long-term memory is used to retain history across interactions; no evidence of episodic memory across different conversations was found.
User interface and interaction design: CLI and web UI – Agent TARS ships with a command‑line interface and a web UI for launching tasks; UI-TARS ships with a Desktop version. (link)
User roles: Operator (Users can run the agents)
Component accessibility: Model is closed source; Agent TARS code is released under the Apache License 2.0 (link)
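Illustrative sketch: the observation and action spaces described above can be pictured as simple data types. The snippet below is a minimal Python sketch for illustration only, not the UI-TARS-desktop implementation; every name in it (GuiElement, Observation, Action, ActionType, click_element) is a hypothetical assumption, and the action list simply mirrors the operations quoted under "Action space".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ActionType(Enum):
    # Hypothetical names mirroring the operations listed under "Action space".
    CLICK = "click"
    DOUBLE_CLICK = "double_click"
    RIGHT_CLICK = "right_click"
    DRAG = "drag"
    HOTKEY = "hotkey"
    TYPE = "type"
    SCROLL = "scroll"


@dataclass
class GuiElement:
    # GUI metadata as described under "Observation space": type, text, coordinates.
    kind: str
    text: str
    x: int
    y: int


@dataclass
class Observation:
    # Raw screenshot bytes plus the elements extracted from it.
    screenshot: bytes
    elements: List[GuiElement] = field(default_factory=list)


@dataclass
class Action:
    type: ActionType
    x: Optional[int] = None
    y: Optional[int] = None
    text: Optional[str] = None


def click_element(obs: Observation, label: str) -> Optional[Action]:
    """Return a click action on the first element whose text equals label."""
    for element in obs.elements:
        if element.text == label:
            return Action(ActionType.CLICK, x=element.x, y=element.y)
    return None  # No matching element in this observation.
```

In this sketch an agent step maps one Observation to at most one Action; a real agent would derive the action from model output rather than from an exact text match.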

Autonomy & control

Autonomy level and planning depth: L4: UI‑TARS autonomously plans multi‑step tasks using system‑2 reasoning; the user acts as an approver and intervenes only when necessary (link)
User approval requirements for different decision types: Tasks requiring authentication, payments or external API keys require manual user input; otherwise, the agent acts independently.
Execution monitoring, traces, and transparency: Visible CoT and action trace documenting all activity
Emergency stop and shut down mechanisms and user control: User can pause/stop the agent at any time by interrupting the CLI or clicking the "Terminate" button in the UI
Usage monitoring and statistics and patterns: No usage statistics are published; model API usage statistics can be viewed on VolcanoEngine
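Illustrative sketch: the approval rule described above (manual input for authentication, payments or external API keys, independent action otherwise) can be expressed as a simple gate. This is a hypothetical Python sketch, not code from Agent TARS or UI-TARS; the tag names and the function needs_user_input are assumptions made for illustration.

```python
# Hypothetical tags for the task categories that require manual user input.
SENSITIVE_TAGS = frozenset({"authentication", "payment", "external_api_key"})


def needs_user_input(task_tags):
    """Return True if any tag on the task falls in a sensitive category."""
    return bool(SENSITIVE_TAGS & set(task_tags))
```

Under these assumptions, a task tagged {"payment"} would pause for the user, while a task tagged {"scroll", "click"} would proceed without intervention.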

Ecosystem interaction

Identify to humans?: None found
Identifies technically?: None found
Interoperability standards and integrations: MCP support (link, archived)
Web conduct: None found

Safety, evaluation & impact

Technical guardrails and safety measures: None found, although related research on AgentArmor exists (link)
Sandboxing and containment approaches: Agent TARS CLI v0.3.0 features exclusive support for AIO agent Sandbox (link) as an isolated, all-in-one tool execution environment
What types of risks were evaluated?: None found
(Internal) safety evaluations and results: None found
Third-party testing, audits, and red-teaming: None found
Benchmark performance and demonstrated capabilities: GUI benchmarks (computer use, mobile use, browser use); game benchmarks (a 15-game collection) (link)
Bug bounty programmes and vulnerability disclosure: None found
Any known incidents?: None found