HumanEval: Python Function Generation from Instructions

Active

Assesses how accurately language models can write correct Python functions based solely on natural-language instructions provided as docstrings.
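
As a rough illustration of what such a task looks like, the sketch below builds a docstring prompt, a candidate completion, and a small set of unit tests, then checks whether the completed function passes. The task `running_max`, the helper `passes`, and the test cases are hypothetical and are not taken from the dataset. A real harness would run untrusted model output in a sandbox with timeouts rather than calling `exec` directly.

```python
# Hypothetical HumanEval-style task; not an actual dataset entry.
# The model sees only the signature and docstring.
PROMPT = '''def running_max(numbers):
    """Return a list where element i is the maximum of numbers[0..i].

    >>> running_max([1, 3, 2, 5, 4])
    [1, 3, 3, 5, 5]
    """
'''

# A candidate function body, standing in for model output.
COMPLETION = '''    result = []
    current = None
    for n in numbers:
        current = n if current is None else max(current, n)
        result.append(current)
    return result
'''

# Unit tests that decide whether the completion counts as correct.
TEST = '''def check(candidate):
    assert candidate([1, 3, 2, 5, 4]) == [1, 3, 3, 5, 5]
    assert candidate([]) == []
    assert candidate([-2, -5, -1]) == [-2, -2, -1]
'''


def passes(prompt, completion, test, entry_point):
    """Return True if prompt + completion defines a function that passes the tests.

    Assumption: this illustrative helper trusts its input. Do not use it
    on untrusted code without isolation.
    """
    namespace = {}
    try:
        # Assemble the full program: function definition followed by the tests.
        exec(prompt + completion + "\n" + test, namespace)
        namespace["check"](namespace[entry_point])
        return True
    except Exception:
        # Any syntax error, runtime error, or failed assertion counts as a failure.
        return False


print(passes(PROMPT, COMPLETION, TEST, "running_max"))
```

Scoring by executing tests, rather than comparing text against a reference solution, means any functionally correct completion counts as a pass.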

Publisher
OpenAI
Domain
Coding
License
MIT
Published
May 2026
Notable for
Benchmark for evaluating Python code generation from docstrings.

Related tools

4

Implementations, trainers, datasets and scaffolds linked to this eval.

Papers

1

FAQ

What is HumanEval: Python Function Generation from Instructions?
Assesses how accurately language models can write correct Python functions based solely on natural-language instructions provided as docstrings.
How can a model improve its HumanEval: Python Function Generation from Instructions score?
Sophon links several community tools to HumanEval: Python Function Generation from Instructions, including Humaneval RL Env, CODE Humaneval RL Env, Humaneval Multiturn RL Env, and Humaneval Tools RL Env. These are RL environments, datasets, and scaffolds that target this eval.
What license is HumanEval: Python Function Generation from Instructions under?
HumanEval: Python Function Generation from Instructions is available under the MIT license.