
code generation

Slug: code-generation
Evals: 10
Tools: 19
Models: 446
Papers: 8

Evals testing this capability

10

Tools lifting evals here

19

Top models on this capability

446

by avg parsed score across evals here

Code generation: bar chart with 21 bars. Highest value: Qwen3 (family) at 87.6.
21 models

Papers in this area

8