Xin Lan

Attribution

2 papers

Authored papers

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

arXiv 2026

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

arXiv 2026

No known affiliations.

from 2 papers

Steven Dillmann

Xuandong Zhao

Yuanli Wang

Ahson Saiyed

Akshay Anand

Alex Dimakis

Alexander G. Shaw

Andrew Lanpouthakoun

Andy Konwinski

founder

Anurag Kashyap