Starling-7B: Improving Helpfulness and Harmlessness with RLAIF

Starling-7B is UC Berkeley's chat model, trained with RLAIF on Nectar, a GPT-4-ranked preference dataset of 3.8M pairwise comparisons. It was one of the strongest 7B chat models of late 2023.

Publisher: University of California, Berkeley
Year: 2024
Venue: ICML
ArXiv: arxiv.org/abs/2403.13780
Code: huggingface.co/datasets/berkeley-nest/Nectar
Authors: 8
Hosting: External source, license unknown

Introduces 1 artifact (1 tool)

Artifacts

Tools: Nectar

Authors

Banghua Zhu, Evan Frick, Hanlin Zhu, Jian Zhang, Jiantao Jiao, Karthik Ganesan, Tianhao Wu, Wei-Lin Chiang