Sihao Hu
- Papers: 10
Authored papers (10)
Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation
arXiv 2025
Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable
arXiv 2025
Multi-Agent Reinforcement Learning with Focal Diversity Optimization
arXiv 2025
Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey
arXiv 2024
Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning Attack
arXiv 2024
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
arXiv 2024
Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack
arXiv 2024
A Survey on Large Language Model-Based Game Agents
arXiv 2024
PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models
arXiv 2024
Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives
arXiv 2023
Frequent co-authors
8 from 10 papers