InstructGPT

Closed

OpenAI's 2022 RLHF-fine-tuned GPT-3 family that introduced the SFT → reward-model → PPO recipe and was the immediate precursor to ChatGPT.

Publisher: OpenAI
Family: GPT
Params: 1.3B, 6B, 175B (text-davinci-001/002 / text-davinci-003)
Context: 2K
Input
Output
Providers: openai
License: Proprietary
Released: Jan 2022
Canonical: openai.com/index/instruction-following

Cite

Notes

Only stored in your browser.

Context

2K

Reported eval scores

0

No public eval scores tracked.

Introduced in

paperTraining language models to follow instructions with human feedback