Attribution
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
arXiv 2025
from 1 paper
Bhavya Kailkhura
Hengrui Gu
Jie Peng
Kaixiong Zhou
Shuhang Lin
Tianlong Chen
Wenyue Hua
Yang Ouyang