XSTest
Active
250 safe and 200 unsafe prompts crafted to test exaggerated safety: does the model refuse benign requests that superficially resemble unsafe ones?
- Publisher
- University of Oxford
- Capabilities
- Safety, Instruction Following
- Domain
- safety
- Format
- HF Dataset
- Size
- 450 prompts (250 safe, 200 unsafe)
- License
- CC-BY-4.0
- Published
- Jul 2023
- Notable for
- Benchmark for exaggerated safety: it checks whether a model refuses benign prompts that only superficially resemble unsafe ones.
- Canonical
- github.com/paul-rottger/xstest
Related tools
1 tool: implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is XSTest?
- XSTest is a set of 250 safe and 200 unsafe prompts crafted to test exaggerated safety: whether a model refuses benign requests that superficially resemble unsafe ones.
- What capabilities does XSTest test?
- XSTest evaluates safety and instruction following.
- How can a model improve its XSTest score?
- Sophon links tools that target this eval (RL environments, datasets, and scaffolds); for XSTest these include PKU-SafeRLHF.
- What license is XSTest under?
- XSTest is available under CC-BY-4.0.
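
As a rough illustration of what an XSTest score could look like, the sketch below turns the two-sided design described above into two numbers: how often a model refuses safe prompts (the exaggerated-safety failure) and how often it complies with unsafe ones. The function name `xstest_rates`, the label strings, and the record layout are assumptions made for this example; they are not taken from the XSTest repository, and the labelling of each response as a refusal or a compliance is assumed to have been done beforehand.

```python
from collections import Counter

# Hypothetical label names, chosen for this sketch only.
REFUSAL = "refusal"
COMPLIANCE = "compliance"


def xstest_rates(records):
    """Compute two failure rates from labelled responses.

    records: iterable of (prompt_type, response_label) pairs, where
    prompt_type is "safe" or "unsafe" and response_label is one of
    the hypothetical labels above.
    """
    records = list(records)
    pair_counts = Counter(records)
    totals = Counter(prompt_type for prompt_type, _ in records)

    def rate(numerator, denominator):
        # Avoid division by zero when a prompt type is absent.
        return numerator / denominator if denominator else 0.0

    return {
        # Refusing a safe prompt is the exaggerated-safety failure.
        "safe_refusal_rate": rate(pair_counts[("safe", REFUSAL)], totals["safe"]),
        # Complying with an unsafe prompt is the opposite failure.
        "unsafe_compliance_rate": rate(
            pair_counts[("unsafe", COMPLIANCE)], totals["unsafe"]
        ),
    }


if __name__ == "__main__":
    # Toy data, not real model outputs.
    toy = (
        [("safe", COMPLIANCE)] * 3
        + [("safe", REFUSAL)]
        + [("unsafe", REFUSAL)] * 2
    )
    print(xstest_rates(toy))
```

Under this framing a model does well when both rates are low: reporting the two numbers separately keeps a model that refuses everything from looking safe, and a model that complies with everything from looking helpful.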