Deveval
DevEval benchmark: comprehensive evaluation of LLMs across software development lifecycle (implementation, unit testing, acceptance testing) for 21 real-world repositories across Python, C++, Java, and JavaScript
- Domain
- agent-eval
- Published
- Nov 2025
Cite
Notes
Only stored in your browser.
FAQ
- What is Deveval?
- DevEval benchmark: comprehensive evaluation of LLMs across software development lifecycle (implementation, unit testing, acceptance testing) for 21 real-world repositories across Python, C++, Java, and JavaScript