WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?

ServiceNow + MILA benchmark of 33 enterprise knowledge-work tasks (forms, dashboards, service catalogs) on a real ServiceNow instance.

Year: 2024
Venue: ICML
Authors: 12
Hosting: External source, license unknown

Introduces 3 artifacts - 1 eval, 2 tools

TL;DR

Semantic Scholar

This work proposes WorkArena, a remote-hosted benchmark of 33 tasks based on the widely used ServiceNow platform, and introduces BrowserGym, an environment for designing and evaluating web agents that offers a rich set of actions as well as multimodal observations.
