OSWorld
Active
369 computer-use tasks across Ubuntu, Windows, and macOS environments testing whether agents can operate a real desktop via screenshots and mouse/keyboard.
- Publisher: XLANG Lab
- Capabilities: Computer Use, Planning, Tool Calling
- Domain: agentic
- Format: Custom
- Size: 369 tasks
- License: Apache-2.0
- Published: Apr 2024
- Updates: Monthly
- Notable for: The original 2024 leaderboard from the OSWorld authors at XLANG (HKU), now complemented by the HUD-AI OSWorld-Verified leaderboard.
- Canonical: os-world.github.io
- Official leaderboard: os-world.github.io
FAQ
- What is OSWorld?
- 369 computer-use tasks across Ubuntu, Windows, and macOS environments testing whether agents can operate a real desktop via screenshots and mouse/keyboard.
- What capabilities does OSWorld test?
- OSWorld evaluates computer use, planning, and tool calling.
- What license is OSWorld under?
- OSWorld is available under Apache-2.0.