Attribution
One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue
arXiv 2026
The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search
arXiv 2025
On Speeding Up Language Model Evaluation
arXiv 2024
from 3 papers
Eli Chien
Pan Li
Peizhi Niu
Pin-Yu Chen
Rongzhe Wei
Xinjie Shen
Bo Li
Carla P. Gomes
Christian K. Belardi
Haoyu Wang