The development of autonomous agents for graphical user interfaces (GUIs) presents major challenges in artificial intelligence. While recent advances in native agent models have shown promise by unifying perception, reasoning, action, and memory through end-to-end learning, open problems remain in data scalability, multi-turn reinforcement learning (RL), the limitations of GUI-only operation, and environment stability. In this technical report, we present UI-TARS-2, a native GUI-centered agent model that addresses these challenges through a systematic training methodology: a data flywheel for scalable data generation, a stabilized multi-turn RL framework, a hybrid GUI environment that integrates file systems and terminals, and a unified sandbox platform for large-scale rollouts. Empirical evaluation demonstrates that UI-TARS-2 achieves significant improvements over its predecessor UI-TARS-1.5. On GUI benchmarks, it reaches 88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, and 73.3 on AndroidWorld, outperforming strong baselines such as Claude and OpenAI agents. In game environments, it attains a mean normalized score of 59.8 across a 15-game suite (roughly 60% of human-level performance) and remains competitive with frontier proprietary models (e.g., OpenAI o3) on LMGame-Bench. The model also generalizes to long-horizon information-seeking tasks and software engineering benchmarks, highlighting its robustness across diverse agent tasks. Detailed analyses of training dynamics further provide insights into achieving stability and efficiency in large-scale agent RL. These results underscore UI-TARS-2's potential to advance the state of GUI agents and show strong generalization to real-world interactive scenarios.
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
- Year: 2025
- Venue: arXiv 2025
- Authors: 112
- Hosting: Abstract only
Attribution
- Abstract & full text: arxiv.org/abs/2509.02544v2
- TL;DR: Semantic Scholar
Authors
Ge Zhang, Jiazhan Feng, Jie Tang, Yujia Qin, Xin Liu, Chen Li, Qianli Ma, Zhongkai Zhao, Zhi Zhang, PengFei Liu, Hao Yu, Wenhao Huang, Yaohui Wang, Qi Liu, Bo Li, Jiaheng Liu, Hao Chen, Shuyue Guo, Zhenzhu Yang, ZiHao Wang, Bo Zhou, Zhiyong Wu, Yichi Zhang, Jian Chen, Aoyan Li, Siyao Liu, Kai Shen, Jing Su, Shulin Xin, Guoliang Li, Wen Heng, Chengquan Jiang, Daoguang Zan, Junting Lu, Wayne Xin Zhao, Chenxin Li, Qinyu Luo, Yining Ye, Shihao Liang, Zehui Chen, Faming Wu, Fuxing Leng, Guang Shi, Haobin Chen, Jingjia Huang, Qinghao Ye, Wenqian Wang, Xiaobo Qin, Xiaojun Xiao, Yi Lin, Youbin Wu, Haihua Yang, Haoming Wang, Junjie Fang, Li Han, Lin Yan, Longxiang Liu, Renjie Zheng, Songhua Cai, Wanjun Zhong, Xujing Li, Yuwen Xiong, Zhiyuan Zeng, Tianhao Cheng, Huatong Song, Qinghao Zhao, Yuxin Song, Hanbin Wang, Haoyang Zou, Chong Liu, Hangyu Guo, Li Li, Jiajun Shi, Yiwen Wang, Dehua Ma, Shijue Huang, Chen Dun, Hongyi Guo, Kaiyu Shi, Peiyao Zhao, Baoquan Zhong, Xinchun Zhang, Yuanfan Li, Haotian Zhou, Jinlin Pang, Wenqi Fu, Jiale Yang, Qihua Han, Taoran Lu, Woyu Lin, Xiaokang Tong, Xinyao Li, Yu Miao, Zhengxuan Jiang, Zili Li, Ziyuan Zhao, Feng Lin, Hongda Zhu, Junda Du, Kai Cai, Kuanye Li, Lichen Yuan, Meilan Han, Minchao Wang, Xiaobo Ma, Xiaolong Huang, Xinjie Chen, Yidi Du, Yilin Chen, Zhaojian Li, Chaolin Jin, Haoli Chen