WMDP: Measuring and Reducing Malicious Use With Unlearning
Active
A dataset of 3,668 multiple-choice questions developed by a consortium of academics and technical consultants that serve as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security.
- Publisher: Center for AI Safety (CAIS)
- Domain: Safeguards
- License: MIT
- Published: Oct 2024
- Notable for: Benchmark for evaluating safeguards.
Related tools (1)
Implementations, trainers, datasets, and scaffolds linked to this eval.
FAQ
- What is WMDP: Measuring and Reducing Malicious Use With Unlearning?
- A dataset of 3,668 multiple-choice questions developed by a consortium of academics and technical consultants that serve as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security.
- How can a model improve its WMDP: Measuring and Reducing Malicious Use With Unlearning score?
- Sophon links tools that target this eval, such as RL environments, datasets, and scaffolds. The tool currently linked to WMDP: Measuring and Reducing Malicious Use With Unlearning is WMDP RL Env (Prime Intellect).
- What license is WMDP: Measuring and Reducing Malicious Use With Unlearning under?
- WMDP: Measuring and Reducing Malicious Use With Unlearning is available under the MIT License.
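
Because the benchmark consists of multiple-choice questions across three domains, a result is typically reported as accuracy per domain. The sketch below is illustrative only: the field names `domain` and `answer`, the function `accuracy_by_domain`, and the placeholder records are assumptions made for this example, not the dataset's actual schema or an official scoring script.

```python
from collections import defaultdict


def accuracy_by_domain(items, predictions):
    """Return {domain: accuracy} for multiple-choice items.

    items: list of dicts with hypothetical keys 'domain' (str) and
           'answer' (index of the correct choice).
    predictions: list of predicted choice indices, in the same order.
    """
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for item, predicted in zip(items, predictions):
        total[item["domain"]] += 1
        if predicted == item["answer"]:
            correct[item["domain"]] += 1

    return {domain: correct[domain] / total[domain] for domain in total}


# Placeholder records only; no real benchmark questions are included.
example_items = [
    {"domain": "bio", "answer": 0},
    {"domain": "bio", "answer": 2},
    {"domain": "cyber", "answer": 1},
    {"domain": "chem", "answer": 3},
]
example_predictions = [0, 1, 1, 3]

scores = accuracy_by_domain(example_items, example_predictions)
for domain, score in scores.items():
    print(f"{domain}: {score:.2f}")
```

Since the questions act as a proxy for hazardous knowledge, a lower accuracy here indicates less of the measured knowledge, which is the direction an unlearning method would aim for.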