Zhenhong Zhou
8 papers
Authored papers
A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook
arXiv 2026
MemEvolve: Meta-Evolution of Agent Memory Systems
arXiv 2025
DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent
arXiv 2025
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
arXiv 2025
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
arXiv 2024
Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions
arXiv 2024
On the Role of Attention Heads in Large Language Model Safety
arXiv 2024
Course-Correction: Safety Alignment Using Synthetic Preferences
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 8 papers