Attribution
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
arXiv 2026
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
from 2 papers
Steven Dillmann
Xuandong Zhao
Yuanli Wang
Ahson Saiyed
Akshay Anand
Alex Dimakis
Alexander G. Shaw
Andrew Lanpouthakoun
Andy Konwinski (founder)
Anurag Kashyap