FAR.AI is a lab.
Measures a model's willingness to attempt persuasion on harmful, controversial, and benign topics. The key metric is not how effective the persuasion is but whether the model attempts to persuade at all, particularly on harmful statements. Uses a multi-model