The Art of Saying No: Contextual Noncompliance in Language Models
Active
Dataset with 1001 samples to test noncompliance capabilities of language models. Contrast set of 379 samples.
- Publisher: Allen Institute for AI
- Domain: Safeguards
- License: MIT
- Published: Oct 2025
- Notable for: Benchmark for evaluating Safeguards.
Related tools
2 implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is The Art of Saying No: Contextual Noncompliance in Language Models?
- It is a dataset of 1001 samples for testing the noncompliance capabilities of language models, accompanied by a contrast set of 379 samples.
- How can a model improve its The Art of Saying No: Contextual Noncompliance in Language Models score?
- Tools linked to The Art of Saying No: Contextual Noncompliance in Language Models on Sophon include Coconot RL Env (Community) and Coconot RL Env (Prime Intellect). These are RL environments, datasets, and scaffolds that target this eval.
- What license is The Art of Saying No: Contextual Noncompliance in Language Models under?
- The Art of Saying No: Contextual Noncompliance in Language Models is available under the MIT license.