WorkArena

Active

Browser-based enterprise web tasks on a live ServiceNow instance - list filtering, form filling, knowledge search - covering daily knowledge-worker workflows.

Domain: agentic
Format: Custom
Size: 682 tasks
License: Apache-2.0
Published: Mar 2024
Notable for: Benchmark for evaluating browser use, tool calling, and planning in the agentic domain.

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is WorkArena?
WorkArena is a benchmark of 682 browser-based enterprise web tasks run on a live ServiceNow instance. The tasks, such as list filtering, form filling, and knowledge search, cover daily knowledge-worker workflows.
What capabilities does WorkArena test?
WorkArena evaluates browser use, tool calling, and planning.
How can a model improve its WorkArena score?
Tools linked to WorkArena on Sophon include BrowserGym and Openenv Browsergym RL Env (Hugging Face); the linked tools are RL environments, datasets, and scaffolds that target this eval.
What license is WorkArena under?
WorkArena is available under the Apache-2.0 license.