Kanzhi Cheng
- Papers: 15
Authored papers (15)
OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis
arXiv 2026
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent
arXiv 2026
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents
arXiv 2026
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions
arXiv 2026
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
arXiv 2025
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era
arXiv 2025
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
arXiv 2025
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
arXiv 2025
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
arXiv 2025
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
arXiv 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
arXiv 2024
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
arXiv 2024
Vision-Language Models Can Self-Improve Reasoning via Reflection
arXiv 2024
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond
arXiv 2024
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models
arXiv 2024
Frequent co-authors
10 from 15 papers