Attribution
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
arXiv 2026
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
arXiv 2025
from 2 papers
Junxian He
Zhaochen Su
Cheng Wang
Fan Zhou
Graham Neubig
professor
Guanyu Jiang
Hangyu Guo
Haoze Wu
Jian Zhao
Jincheng Gao