Attribution
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions
arXiv 2026
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents
from 2 papers
Fangzhi Xu
Hang Yan
Jian Zhang
researcher
Jun Liu
Kanzhi Cheng
Qika Lin
Qiushi Sun
Zichen Ding
Ben Kao
Haoran Luo