WMDP: Measuring and Reducing Malicious Use With Unlearning

Active

A dataset of 3,668 multiple-choice questions, developed by a consortium of academics and technical consultants, that serves as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security.

Domain
Safeguards
License
MIT
Published
Oct 2024
Notable for
Benchmark for evaluating Safeguards.

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is WMDP: Measuring and Reducing Malicious Use With Unlearning?
A dataset of 3,668 multiple-choice questions, developed by a consortium of academics and technical consultants, that serves as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security.
How can a model improve its WMDP: Measuring and Reducing Malicious Use With Unlearning score?
Sophon links RL environments, datasets, and scaffolds that target this eval. The tools linked to WMDP: Measuring and Reducing Malicious Use With Unlearning include WMDP RL Env (Prime Intellect).
What license is WMDP: Measuring and Reducing Malicious Use With Unlearning under?
WMDP: Measuring and Reducing Malicious Use With Unlearning is available under the MIT license.