0

XSTest

Active

250 safe + 200 unsafe prompts crafted to test exaggerated safety - does the model refuse benign requests that superficially resemble unsafe ones?

Domain
safety
Format
HF Dataset
Size
450 tasks
License
CC-BY-4.0
Published
Jul 2023
Notable for
Benchmark for evaluating safety and instruction following in the safety domain.

Cite

Notes

Only stored in your browser.

Related tools

1
View all

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is XSTest?
250 safe + 200 unsafe prompts crafted to test exaggerated safety - does the model refuse benign requests that superficially resemble unsafe ones?
What capabilities does XSTest test?
XSTest evaluates safety, instruction following.
How can a model improve its XSTest score?
Tools linked to XSTest on Sophon include PKU-SafeRLHF - RL environments, datasets, and scaffolds that target this eval.
What license is XSTest under?
XSTest is available under CC-BY-4.0.