
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

Google paper isolating the 23 hardest BIG-Bench tasks (BBH) where prior models lagged humans, showing chain-of-thought prompting closes most of the gap.

Year
2022
Venue
ACL
Authors
13
Hosting
External source, license unknown

TL;DR

Semantic Scholar

This work finds that applying chain-of-thought (CoT) prompting to BBH tasks enables PaLM to surpass the average human-rater performance on 10 of the 23 tasks, and Codex to surpass it on 17 of the 23 tasks.
