Attribution
AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation
arXiv 2026
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
arXiv 2025
from 2 papers
Hui Su
Qi Gu
Xi Su
Xunliang Cai
Chengcheng Han
Dengchang Zhao
Fuli Feng
Hongyan Hao
Kefeng Zhang
Man Gao