Attribution
PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach
arXiv 2025
Dalton Towers
Furong Huang
Shayan Shabihi
Udari Madhushani Sehwag
Vikash Sehwag
Yuancheng Xu