Jiaming Ji
Peking University PhD; first author of PKU-SafeRLHF / BeaverTails; core member of PKU-Alignment group.
- Role: researcher
- Currently at: Peking University
- Twitter: twitter.com/jiamingji_
- GitHub: github.com/calico-1226
- Scholar: scholar.google.com/citations
- Papers: 11
Authored papers (11)
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security
arXiv 2026
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
arXiv 2025
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
arXiv 2025
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment
arXiv 2025
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback
arXiv 2024
ProgressGym: Alignment with a Millennium of Moral Progress
arXiv 2024
Language Models Resist Alignment: Evidence From Data Compression
arXiv 2024
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
NeurIPS 2023
Safe RLHF: Safe Reinforcement Learning from Human Feedback
arXiv 2023
SafeDreamer: Safe Reinforcement Learning with World Models
arXiv 2023
Baichuan 2: Open Large-scale Language Models
arXiv 2023
Frequent co-authors
10 from 11 papers
Yaodong Yang (professor)
Boyuan Chen (researcher)
Josef Dai (researcher)
Juntao Dai (researcher)
Mickel Liu (researcher)
Ruiyang Sun (researcher)
Tianyi Qiu
Xuehai Pan (grad-student)
Borong Zhang
Ce Bian (researcher)