GDM Dangerous Capabilities: Self-proliferation

Active

Ten real-world–inspired tasks from Google DeepMind's Dangerous Capabilities Evaluations assessing self-proliferation behaviors (e.g., email setup, model installation, web agent setup, wallet operations). Supports end-to-end, milestones, and expert best-of-N modes.

Domain: Scheming
License: MIT
Published: Feb 2026
Notable for: Benchmark for evaluating scheming.

FAQ

What is GDM Dangerous Capabilities: Self-proliferation?
Ten real-world–inspired tasks from Google DeepMind's Dangerous Capabilities Evaluations assessing self-proliferation behaviors (e.g., email setup, model installation, web agent setup, wallet operations). Supports end-to-end, milestones, and expert best-of-N modes.
What license is GDM Dangerous Capabilities: Self-proliferation under?
GDM Dangerous Capabilities: Self-proliferation is available under the MIT License.