CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks
Active
A suite of cybersecurity evaluation benchmarks adapted from Meta's PurpleLlama CybersecurityBenchmarks. Includes MITRE ATT&CK compliance testing, false refusal rate measurement, and insecure code detection, among others.
- Publisher: Meta Platforms
- Domain: Cybersecurity
- License: MIT
- Published: Apr 2026
- Notable for: Benchmark for evaluating cybersecurity.
Related tools
Implementations, trainers, datasets and scaffolds linked to this eval (1).
FAQ
- What is CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks?
- A suite of cybersecurity evaluation benchmarks adapted from Meta's PurpleLlama CybersecurityBenchmarks. Includes MITRE ATT&CK compliance testing, false refusal rate measurement, and insecure code detection, among others.
- How can a model improve its CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks score?
- Tools linked to CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks on Sophon include Cybersoceval RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
- What license is CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks under?
- CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks is available under the MIT license.