Attribution
Safety Alignment of LMs via Non-cooperative Games
arXiv 2025
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
arXiv 2024
from 2 papers
Arman Zharmagambetov
Brandon Amos
Chuan Guo
Ilia Kulikov
Ivan Evtimov
Kamalika Chaudhuri
Rémi Munos
Yuandong Tian