Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Google paper isolating the 23 hardest BIG-Bench tasks (BBH) where prior models lagged humans, showing chain-of-thought prompting closes most of the gap.
- Publisher
- Google Research
- Year
- 2022
- Venue
- ACL
- Authors
- 13
- Hosting
- External source (license unknown)
TL;DR
Semantic Scholar
This work finds that applying chain-of-thought (CoT) prompting to BBH tasks enables PaLM to surpass average human-rater performance on 10 of the 23 tasks, and Codex to surpass it on 17 of the 23 tasks.