Xiangyu Qi

Papers: 7

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

arXiv 2026

On Evaluating the Durability of Safeguards for Open-Weight LLMs

arXiv 2024

Safety Alignment Should Be Made More Than Just a Few Tokens Deep

arXiv 2024

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

arXiv 2024

Visual Adversarial Examples Jailbreak Aligned Large Language Models

arXiv 2023

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!

arXiv 2023

Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks

CVPR 2022 1

2021

Affiliations

No known affiliations.

Frequent co-authors

from 7 papers

Peter Henderson

Prateek Mittal

Tinghao Xie

Ashwinee Panda

Boyi Wei

Kaixuan Huang

Luxi He

Ruoxi Jia

Yangsibo Huang

Yi Zeng