PKU-SafeRLHF

Fresh

Peking University's dual-axis safety + helpfulness preference dataset with explicit harm-category labels, designed for Safe RLHF training.
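To illustrate what "dual-axis" means in practice, here is a minimal sketch of how one comparison carrying separate helpfulness and safety labels could be split into two ordinary chosen/rejected pairs. The field names (`prompt`, `response_0`, `response_1`, `better_response_id`, `safer_response_id`), the example record, and the helper `split_axes` are assumptions made for illustration; this page does not document the schema, so check the published columns before relying on them.

```python
from typing import Any, Dict, Mapping


def split_axes(record: Mapping[str, Any]) -> Dict[str, Dict[str, str]]:
    """Turn one dual-axis comparison into one preference pair per axis.

    Assumed schema (hypothetical, not confirmed by this page):
    two responses plus an index (0 or 1) naming the preferred one per axis.
    """
    responses = (record["response_0"], record["response_1"])
    label_fields = {
        "helpfulness": "better_response_id",  # index of the more helpful response
        "safety": "safer_response_id",        # index of the safer response
    }
    pairs: Dict[str, Dict[str, str]] = {}
    for axis, field in label_fields.items():
        chosen = record[field]
        if chosen not in (0, 1):
            raise ValueError(f"{field} must be 0 or 1, got {chosen!r}")
        pairs[axis] = {
            "prompt": record["prompt"],
            "chosen": responses[chosen],
            "rejected": responses[1 - chosen],
        }
    return pairs


# Invented record in which the two axes disagree.
example = {
    "prompt": "Example prompt",
    "response_0": "Detailed answer that includes a risky step.",
    "response_1": "Cautious answer that omits the risky step.",
    "better_response_id": 0,
    "safer_response_id": 1,
}

pairs = split_axes(example)
print(pairs["helpfulness"]["chosen"])
print(pairs["safety"]["chosen"])
```

Keeping the two axes as separate pairs means a record where the more helpful response is not the safer one yields two pairs that point in opposite directions, which is the case a single-label preference dataset cannot represent.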

Type: Preference
Publisher: PKU-Alignment
Runtime: hf_parquet
License: CC-BY-NC-4.0
Size: 330k+ preference pairs
Published: May 2026

Models

Notable models trained on it

Beaver-7B
many academic Safe RLHF reproductions
components of safety mixtures in Tülu 3, Llama-Guard training research
