PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach
arXiv 2025
Alex McAvoy
Furong Huang
Shayan Shabihi
Udari Madhushani Sehwag
Vikash Sehwag
Yuancheng Xu