SAD: Situational Awareness Dataset
Active
Evaluates situational awareness in LLMs (knowledge of themselves and their circumstances) through behavioral tests including recognizing generated text, predicting behavior, and following self-aware instructions. The current implementation includes SAD-mini, with 5 of the 16 tasks.
- Publisher: Apollo Research
- Domain: Scheming
- License: MIT
- Published: Jan 2026
- Notable for: Benchmark for evaluating scheming.
Related tools
2 implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is SAD: Situational Awareness Dataset?
- Evaluates situational awareness in LLMs (knowledge of themselves and their circumstances) through behavioral tests including recognizing generated text, predicting behavior, and following self-aware instructions. The current implementation includes SAD-mini, with 5 of the 16 tasks.
- How can a model improve its SAD: Situational Awareness Dataset score?
- Tools linked to SAD: Situational Awareness Dataset on Sophon include SAD RL Env (Community) and SAD RL Env (Prime Intellect). These are RL environments, datasets, and scaffolds that target this eval.
- What license is SAD: Situational Awareness Dataset under?
- SAD: Situational Awareness Dataset is available under the MIT license.