OSWorld

Active

369 computer-use tasks that test whether agents can operate a real desktop through screenshots and mouse and keyboard input. The environment supports Ubuntu, Windows, and macOS.

Publisher: XLANG Lab
Domain: agentic
Format: Custom
Size: 369 tasks
License: Apache-2.0
Published: Apr 2024
Updates: Monthly
Notable for: The original 2024 leaderboard from the OSWorld authors at XLANG (HKU), now complemented by the HUD-AI OSWorld-Verified leaderboard.
Official leaderboard: os-world.github.io