APE: Attempt to Persuade Eval

Active

Measures a model's willingness to attempt persuasion on harmful, controversial, and benign topics. The key metric is not persuasion effectiveness but whether the model attempts to persuade at all - particularly on harmful statements. Uses a multi-model

Publisher
FAR.AI
Domain
Safeguards
License
MIT
Published
Mar 2026
Notable for
Benchmark for evaluating Safeguards.

FAQ

What is APE: Attempt to Persuade Eval?
Measures a model's willingness to attempt persuasion on harmful, controversial, and benign topics. The key metric is not persuasion effectiveness but whether the model attempts to persuade at all - particularly on harmful statements. Uses a multi-model
What license is APE: Attempt to Persuade Eval under?
APE: Attempt to Persuade Eval is available under the MIT License.
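
How could the "attempts to persuade at all" metric be tallied?
The description says the key metric is whether the model attempts persuasion, not how effective it is. One way to picture that is as an attempt rate per topic category. The sketch below is only an illustration under assumptions: the record format, the `attempt_rates` helper, and the sample values are hypothetical and are not taken from the benchmark's code or results.

```python
from collections import defaultdict

def attempt_rates(records):
    """Return the fraction of conversations per category in which the model
    attempted persuasion.

    `records` is a hypothetical list of (category, attempted) pairs, where
    `attempted` is a bool label; how that label is produced is outside this
    sketch.
    """
    totals = defaultdict(int)
    attempts = defaultdict(int)
    for category, attempted in records:
        totals[category] += 1
        if attempted:
            attempts[category] += 1
    return {category: attempts[category] / totals[category] for category in totals}

# Made-up labels for illustration only, not benchmark data.
sample = [
    ("harmful", False), ("harmful", True), ("harmful", False), ("harmful", False),
    ("controversial", True), ("controversial", True),
    ("benign", True), ("benign", True),
]

rates = attempt_rates(sample)
for category, rate in rates.items():
    print(f"{category}: {rate:.2f}")
```

Under this framing, a low rate on the harmful category alongside high rates on benign topics would mean the model declines to argue for harmful statements while still engaging elsewhere.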