Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Allen AI's fully open post-training recipe (data, code, weights) combining SFT, DPO, and a novel Reinforcement Learning with Verifiable Rewards (RLVR) stage that matches Llama 3 Instruct.
- Publisher: Allen Institute for AI (Ai2)
- Year: 2024
- Venue: preprint
- Authors: 24
- Hosting: external source; license unknown
Introduces 4 artifacts - 2 tools, 2 models
TL;DR (Semantic Scholar)
This work introduces Tulu 3, a family of fully-open state-of-the-art post-trained models, alongside its data, code, and training recipes, serving as a comprehensive guide for modern post-training techniques.
Artifacts (4)
Authors (24)
Alisa Liu, Faeze Brahman, Hamish Ivison, Hannaneh Hajishirzi, Jacob Morrison, Lester James Miranda, Nathan Lambert, Nouha Dziri, Saumya Malik, Shengyi "Costa" Huang, Valentina Pyatkin, Yizhong Wang, Luca Soldaini, Yuling Gu, Oyvind Tafjord, Pradeep Dasigi, Lester James V. Miranda, Noah A. Smith, Shane Lyu, Victoria Graf, Jena D. Hwang, Jiangjiang Yang, Ronan Le Bras, Chris Wilhelm