Program Synthesis with Large Language Models

Google Research paper that introduces MBPP (974 short, crowd-sourced Python problems with unit tests) and MathQA-Python, two benchmarks often reported alongside HumanEval.

Publisher: Google Research
Year: 2021
Venue: preprint
ArXiv: arxiv.org/abs/2108.07732
Authors: 11
Hosting: External source, license unknown

Attribution

Abstract & full text: arxiv.org/abs/2108.07732
TL;DR: semanticscholar.org/paper/a38e0f993e4805ba8a9beae4c275c91ffcec01df

Introduces 1 artifact - 1 eval

TL;DR

Semantic Scholar

The paper explores the limits of the current generation of large language models for program synthesis in general-purpose programming languages, and probes the semantic grounding of these models by fine-tuning them to predict the results of program execution.

Artifacts

Evals

Mostly Basic Python Problems (MBPP)

Authors

Augustus Odena, Carrie Cai, Charles Sutton, David Dohan, Ellen Jiang, Henryk Michalewski, Jacob Austin, Maarten Bosma, Maxwell Nye, Michael Terry, Quoc Le