CyberGym: Evaluating AI Agents' Real-World Cybersecurity Capabilities at Scale
Active
A large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on real-world vulnerability analysis tasks. CyberGym includes 1,507 benchmark instances with historical vulnerabilities from 188 large software projects.
- Publisher: University of California, Berkeley
- Domain: Cybersecurity
- License: MIT
- Published: Feb 2026
- Notable for: Benchmark for evaluating the cybersecurity capabilities of AI agents on real-world vulnerability analysis tasks.
FAQ
- What license is CyberGym: Evaluating AI Agents' Real-World Cybersecurity Capabilities at Scale under?
- CyberGym: Evaluating AI Agents' Real-World Cybersecurity Capabilities at Scale is available under the MIT license.