UI-TARS-desktop
Browser | ByteDance
Product overview
Name of Agent: UI-TARS-desktop
Short description of agent: UI-TARS Desktop is a desktop application that provides a native GUI Agent based on the UI-TARS model. (link)
Advertised use: It primarily ships local and remote computer operators as well as browser operators. (link)
Who is using it?: Open‑source users and researchers
Website: https://agent-tars.com/ (archived)
Category: Browser
Company & accountability
Developer: ByteDance
Parent company?: Not applicable
Governance documents analysis: No dedicated terms of service or privacy policy for the agent could be found.
AI safety/trust framework: None found
Compliance with existing standards: None found
Technical capabilities & system architecture
Model specifications: "UI-TARS Desktop is a native GUI agent for your local computer, driven by UI-TARS and Seed-1.5-VL/1.6 series models." (link)
Documentation: UI‑TARS GitHub README
Observation space: Screenshots and GUI metadata – the agent receives pixel‑level screenshots and extracts GUI element information (type, text, coordinates) to form observations. (link)
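The observation described above can be pictured as a small data structure. The following is only an illustrative sketch: the class names `GUIElement` and `Observation`, the `bbox` layout, and the helper methods are assumptions made for this example and are not taken from the UI-TARS codebase.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class GUIElement:
    """One GUI element extracted from a screenshot (hypothetical schema)."""
    type: str                         # e.g. "button", "textbox"
    text: str                         # visible label or content
    bbox: Tuple[int, int, int, int]   # assumed layout: (left, top, right, bottom) in pixels

    def center(self) -> Tuple[int, int]:
        """Return the integer midpoint of the bounding box."""
        left, top, right, bottom = self.bbox
        return ((left + right) // 2, (top + bottom) // 2)


@dataclass
class Observation:
    """A screenshot plus the elements detected in it (hypothetical schema)."""
    screenshot_png: bytes
    elements: List[GUIElement] = field(default_factory=list)

    def find_by_text(self, text: str) -> List[GUIElement]:
        """Case-insensitive substring match on element text."""
        needle = text.lower()
        return [e for e in self.elements if needle in e.text.lower()]
```

Under these assumptions, an agent could look up an element by its label and use `center()` as the target point for a pointer action.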
Action space: "Supports common desktop operations: mouse clicks (single, double, right), drag actions, keyboard shortcuts, text input, scrolling, etc."
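One way to model the operations listed above is as a small typed action record with a validation step. This is a hedged sketch rather than the project's actual action schema: `ActionType`, `Action`, `validate`, and all field names are hypothetical and chosen only to mirror the categories in the quoted description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class ActionType(Enum):
    CLICK = "click"
    DOUBLE_CLICK = "double_click"
    RIGHT_CLICK = "right_click"
    DRAG = "drag"
    HOTKEY = "hotkey"
    TYPE = "type"
    SCROLL = "scroll"


@dataclass(frozen=True)
class Action:
    """A single desktop operation (hypothetical schema)."""
    kind: ActionType
    point: Optional[Tuple[int, int]] = None      # click target, drag start, or scroll position
    end_point: Optional[Tuple[int, int]] = None  # drag destination
    text: Optional[str] = None                   # typed text, or a key combo such as "ctrl+c"
    direction: Optional[str] = None              # scroll direction


_NEEDS_POINT = {
    ActionType.CLICK, ActionType.DOUBLE_CLICK, ActionType.RIGHT_CLICK,
    ActionType.DRAG, ActionType.SCROLL,
}
_NEEDS_TEXT = {ActionType.HOTKEY, ActionType.TYPE}
_DIRECTIONS = {"up", "down", "left", "right"}


def validate(action: Action) -> None:
    """Raise ValueError if the action lacks a field its kind requires."""
    if action.kind in _NEEDS_POINT and action.point is None:
        raise ValueError(f"{action.kind.value} requires a point")
    if action.kind is ActionType.DRAG and action.end_point is None:
        raise ValueError("drag requires an end_point")
    if action.kind in _NEEDS_TEXT and not action.text:
        raise ValueError(f"{action.kind.value} requires text")
    if action.kind is ActionType.SCROLL and action.direction not in _DIRECTIONS:
        raise ValueError("scroll requires a direction: up, down, left, or right")
```

Validating an action before it is executed keeps malformed model output from reaching the mouse and keyboard layer.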
Memory architecture:
- Long‑term memory is used to retain history across interactions.
- No evidence of episodic memory across different conversations.
User interface and interaction design: CLI and web UI – Agent TARS ships with a command‑line interface and a web UI for launching tasks; UI-TARS ships with a Desktop version. (link)
User roles: Operator (Users can run the agents)
Component accessibility: Model is closed source
Agent TARS code is released under Apache License 2.0 (link)
Autonomy & control
Autonomy level and planning depth: L4: UI‑TARS autonomously plans multi‑step tasks using system‑2 reasoning; the user acts as an approver and intervenes only when necessary (link)
User approval requirements for different decision types: Tasks requiring authentication, payments, or external API keys need manual user input; otherwise, the agent acts independently.
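The hand-off rule described above can be sketched as a simple gate in front of each step. This is a minimal illustration under stated assumptions: the category labels, `requires_user_input`, and `run_step` are hypothetical names for this example, and the source does not document how the agent actually detects such steps.

```python
from typing import Callable, Iterable

# Categories the profile lists as needing manual user input (labels are assumed).
MANUAL_INPUT_CATEGORIES = frozenset({"authentication", "payment", "api_key"})


def requires_user_input(step_categories: Iterable[str]) -> bool:
    """True if any category of the step is on the manual-input list."""
    return any(c in MANUAL_INPUT_CATEGORIES for c in step_categories)


def run_step(
    step_categories: Iterable[str],
    execute: Callable[[], str],
    hand_off_to_user: Callable[[], str],
) -> str:
    """Hand sensitive steps to the user; run everything else autonomously."""
    if requires_user_input(step_categories):
        return hand_off_to_user()
    return execute()
```

For example, `run_step(["payment"], execute, hand_off_to_user)` would call `hand_off_to_user`, while a step tagged only `"navigation"` would run `execute` directly.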
Execution monitoring, traces, and transparency: Visible CoT and action trace documenting all activity
Emergency stop and shut down mechanisms and user control: The user can pause or stop the agent at any time by interrupting the CLI or clicking the "Terminate" button in the UI.
Usage monitoring and statistics and patterns: No usage statistics published.
Model API usage statistics can be viewed on VolcanoEngine.
Ecosystem interaction
Safety, evaluation & impact
Technical guardrails and safety measures: None found for the agent itself, although related research on AgentArmor exists (link)
Sandboxing and containment approaches: Agent TARS CLI v0.3.0 features exclusive support for AIO agent Sandbox (link) as an isolated, all-in-one tool execution environment.
What types of risks were evaluated?: None found
(Internal) safety evaluations and results: None found
Third-party testing, audits, and red-teaming: None found
Benchmark performance and demonstrated capabilities: GUI benchmarks (computer use, mobile use, browser use) and game benchmarks (a collection of 15 games). (link)
Bug bounty programmes and vulnerability disclosure: None found
Any known incidents?: None found