UltraFeedback

Fresh

OpenBMB's 64k-prompt preference dataset, built with GPT-4 critiques along four axes: instruction-following, truthfulness, honesty, and helpfulness. It is the de facto open DPO baseline.

Type: Preference
Publisher: OpenBMB
Runtime: hf_parquet
License: MIT
Size: 64k prompts (~256k responses, ~340k pairs in binarized variant)
Published: May 2026
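
The Size entry distinguishes raw responses from pairs in a binarized variant. As a rough illustration of what binarizing means, the sketch below turns one prompt with several scored responses into (chosen, rejected) pairs suitable for DPO-style training. The field names (`response`, `score`) and both helper functions are hypothetical and are not taken from the dataset's actual schema; the pairing rules shown (best versus worst, or every non-tied pair) are common choices, not necessarily the ones used for the published variant.

```python
from itertools import combinations


def best_vs_worst(prompt, completions):
    """Build a single preference pair: top-scored response vs. lowest-scored.

    `completions` is a list of dicts with hypothetical keys
    'response' (str) and 'score' (float).
    """
    ranked = sorted(completions, key=lambda c: c["score"], reverse=True)
    return {
        "prompt": prompt,
        "chosen": ranked[0]["response"],
        "rejected": ranked[-1]["response"],
    }


def all_pairs(prompt, completions):
    """Build every pair of responses with different scores.

    With four responses per prompt this gives at most six pairs;
    tied scores carry no preference signal and are skipped.
    """
    pairs = []
    for a, b in combinations(completions, 2):
        if a["score"] == b["score"]:
            continue
        chosen, rejected = (a, b) if a["score"] > b["score"] else (b, a)
        pairs.append({
            "prompt": prompt,
            "chosen": chosen["response"],
            "rejected": rejected["response"],
        })
    return pairs


# Illustrative record only; not real dataset content.
example = [
    {"response": "A", "score": 4.5},
    {"response": "B", "score": 3.0},
    {"response": "C", "score": 3.0},
    {"response": "D", "score": 1.0},
]
single = best_vs_worst("Explain DPO briefly.", example)
pairs = all_pairs("Explain DPO briefly.", example)
```

Pairing only best against worst keeps one example per prompt, while enumerating all non-tied pairs yields more training signal at the cost of correlated examples.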

Models

Notable models trained on it

Zephyr-7B-beta, Starling-7B, Notus, many Llama-3 / Mistral DPO fine-tunes