HelpSteer2: Open-source Dataset for Training Top-Performing Reward Models

NVIDIA-released 10K-sample multi-attribute preference dataset (helpfulness, correctness, coherence, complexity, verbosity) for training reward models.

Publisher: NVIDIA
Year: 2024
Venue: NeurIPS
ArXiv: arxiv.org/abs/2406.08673
Code: huggingface.co/datasets/nvidia/HelpSteer2
Authors: 9
Hosting: External source, license unknown

Attribution

Abstract & full text: arxiv.org/abs/2406.08673
TL;DR: semanticscholar.org/paper/f590d8926dd12345a3bd22253461850f5ca4b3ed
Code: huggingface.co/datasets/nvidia/HelpSteer2

Introduces 1 artifact - 1 tool

TL;DR

Semantic Scholar

This work proposes SteerLM 2.0, a model alignment approach that makes effective use of the rich multi-attribute scores predicted by reward models, and releases HelpSteer2, a permissively licensed (CC-BY-4.0) preference dataset.

Artifacts

Tools

HelpSteer2

Authors

Daniel Egert, Gerald Shen, Jiaqi Zeng, Jimmy Zhang, Makesh Narsimhan Sreedhar, Oleksii Kuchaiev, Olivier Delalleau, Yi Dong, Zhilin Wang