Attribution
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
arXiv 2024
Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks
arXiv 2023
from 2 papers
Ashish Hooda
Atul Prakash
Neal Mangaokar
Somesh Jha
Jihye Choi
Ryan Feng
Shreyas Chandrashekaran