Cite
Notes
Only stored in your browser.
Attribution
HumanEval Python code generation. Model completes a function, run against unit tests in an isolated subprocess. Pass/fail reward. Lightweight baseline
HumanEval Python code-gen with sandboxed unit-test execution, timeouts, import-safety, partial credit, and a syntactic-validity metric. For RL & eval.