Recent advances in multimodal large language models unlock unprecedented opportunities for GUI automation. However, a fundamental challenge remains: how to efficiently acquire high-quality training data while maintaining annotation reliability? We introduce a self-evolving training pipeline powered by the Calibrated Step Reward System, which converts model-generated trajectories into reliable training signals through trajectory-level calibration, achieving >90% annotation accuracy at 10-100x lower cost. Leveraging this pipeline, we introduce Step-GUI, a family of models (4B/8B) that achieves state-of-the-art GUI performance (8B: 80.2% on AndroidWorld, 48.5% on OSWorld, 62.6% on ScreenSpot-Pro) while maintaining robust general capabilities. As GUI agent capabilities improve, practical deployment demands standardized interfaces across heterogeneous devices while protecting user privacy. To this end, we propose GUI-MCP, the first Model Context Protocol for GUI automation, with a hierarchical architecture that combines low-level atomic operations and high-level task delegation to local specialist models, enabling high-privacy execution where sensitive data stays on-device. Finally, to assess whether agents can handle authentic everyday usage, we introduce AndroidDaily, a benchmark grounded in real-world mobile usage patterns with 3,146 static actions and 235 end-to-end tasks across high-frequency daily scenarios (8B: static 89.91%, end-to-end 52.50%). Our work advances the development of practical GUI agents and demonstrates strong potential for real-world deployment in everyday digital interactions.
Step-GUI Technical Report
- Year: 2025
- Venue: arXiv 2025
- Stars: 2.2k
- Authors: 96
- Hosting: Abstract only
Attribution
- Abstract & full text: arxiv.org/abs/2512.15431
- TL;DR: Semantic Scholar
Authors
96 authors: Daxin Jiang, Xin Liu, Jing Li, Hao Wu, Min Xu, Hang Li, Shuang Luo, Na Wang, Xin Huang, Liang Zhao, Guopeng Li, Zheng Ge, Yibo Zhu, Binxing Jiao, Xiangyu Zhang, Xin Zhou, Guodong Liu, Jingyang Zhang, Jia Wang, Dong Li, Hongming Chen, Xu Zhou, Qiong Gao, Lei Lei, Wen Sun, Ning Wang, Brian Li, JingJing Xie, Kaijun Tan, Kang An, Lieyu Shi, Liguo Tan, Mengqiang Ren, Shiliang Yang, Xiaojia Liu, Xuanti Feng, Xuedan Cai, Yeqing Shen, Yingxiu Zhao, Zejia Weng, Zhiguo Huang, Manjiao Liu, Yukang Shi, Nan Wu, Ziyang Meng, Zhonghao Yan, Mei Chen, Shuli Gao, Xuan Wen, Liying Shi, Haolong Yan, Yineng Deng, Chenyang Li, Junhao Huang, Jin Gao, Xingbin Liu, Zhirui Wang, Xiaojie Hou, Zhimin Fan, Mi Yang, Mengmeng Duan, Danxun Liang, Hang Cheng, Jie Dong, Renjie Yu, Shunshan Li, Yiting Dai, Yingdan Liang, Zelin Chen, Chengxu Yan, Chunqin Xu, Fengqiong Xiao, Guanghao Fan, Guozhen Peng, Hongbing Li, Jianyong Li, Jiaju Ren, Jiayu Yuan, Jianpeng Yin, Kai Cao, Mao Luo, Mingxin Wan, Peiyao Ma, Qingzhou Zhang, Qiao Wang, Qinlin Zeng, Qiongyao Li, Shangwu Zhong, Shaofan Liu, Shisi Gao, Xianwei Zhu, Xin Liang, Yunfang Xu, Yuqing Zeng, Yixun Zhang, Zhuoyu Wang