0

InstructGPT

Closed

OpenAI's 2022 RLHF-fine-tuned GPT-3 family that introduced the SFT → reward-model → PPO recipe and was the immediate precursor to ChatGPT.

Publisher
OpenAI
Family
GPT
Params
1.3B, 6B, 175B (text-davinci-001/002 / text-davinci-003)
Context
2K
Input
Output
Providers
openai
Released
Jan 2022

Cite

Notes

Only stored in your browser.

Context
2K

Reported eval scores

0
No public eval scores tracked.

Introduced in

paperTraining language models to follow instructions with human feedback