Yuxin Xiao

Attribution

2 papers

Authored papers

Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions

arXiv 2025

When Style Breaks Safety: Defending Language Models Against Superficial Style Alignment

arXiv 2025

No known affiliations.

Co-authors from 2 papers

Marzyeh Ghassemi

Narutatsu Ri

Sana Tonekaboni

Vinith Suriyakumar

Walter Gerych

Yik Siu Chan