benchflow is an org.
SkillsBench is an evaluation framework that measures how well skills work. It is also the first dataset that measures how capable models are at using skills on expert-curated tasks across diverse, high-GDP-value domains.