CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks

Active

A suite of cybersecurity evaluation benchmarks adapted from Meta's PurpleLlama CybersecurityBenchmarks. Includes MITRE ATT&CK compliance testing, false refusal rate measurement, and insecure code detection, among others.

Domain
Cybersecurity
License
MIT
Published
Apr 2026
Notable for
Benchmark for evaluating models on cybersecurity tasks.

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks?
A suite of cybersecurity evaluation benchmarks adapted from Meta's PurpleLlama CybersecurityBenchmarks. Includes MITRE ATT&CK compliance testing, false refusal rate measurement, and insecure code detection, among others.
How can a model improve its CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks score?
Sophon links RL environments, datasets, and scaffolds that target this eval. For CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks, the linked tools include Cybersoceval RL Env (Community).
What license is CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks under?
CyberSecEval 4: Advanced Cybersecurity Evaluation Benchmarks is available under the MIT license.