Attribution
When Style Breaks Safety: Defending Language Models Against Superficial Style Alignment
arXiv 2025
Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
from 2 papers
Marzyeh Ghassemi
Narutatsu Ri
Sana Tonekaboni
Vinith Suriyakumar
Walter Gerych
Yik Siu Chan